I remember applying for NSF’s Graduate Research Fellowship many years ago and being asked to answer a question describing my experiences “integrating research and education”. At the time, I was baffled by the question, as I hadn’t yet done much teaching. I thought: Aren’t teaching and research orthogonal? I’m told by current students that the question no longer exists in the fellowship application, which I think is unfortunate. That question has stayed with me throughout my career: I regularly re-ask myself questions about integrating research and education.
At least in the United States (and presumably elsewhere, too), university researchers are regularly asked to tie our research back to education: for example, faculty members are asked to describe the “broader impact” of their research, including how its results will be incorporated into the curriculum. I’ve learned that this is no accident; on the contrary, I think it is one of the most important (and under-appreciated) things that researchers should be thinking about.
Although researchers are sometimes asked to think about how research can be integrated into the classroom, I’ve found that efforts in the classroom can also ultimately result in better research. In fact, although many educators are not necessarily researchers, the reverse direction is hard to deny: it is no accident that some of the best researchers are also excellent teachers. And, while some strong researchers who are not good teachers do exist, I believe that purposeful teaching effort does in fact result in much better research.
In this post, I’ll describe my views on the relationship between research and teaching, in both directions. I’ll begin with the more “obvious” notion of how our research ultimately affects education and the curriculum, and continue to what I think is the less apparent (and more interesting) direction: how our work on education can also make us better researchers. Of course, teaching also helps us develop many “general purpose” skills that are useful in research, including mentoring and supervisory skills, learning to analyze others’ understanding, learning to give feedback, and so forth. Below, I’ll eschew these practicalities and instead focus on how the relationship between research and education ultimately results in better research ideas.
How Research Affects Teaching
Research results instill fresh material in the classroom. Although some subjects we learn in the classroom are fairly well-established, many areas of computer science (and I would assume certain other fields, too) are rapidly evolving. With the rise of large content and service providers such as Google, Amazon, and Facebook; the proliferation of mobile devices; and the spread of connectivity to developing regions (to name a few developments), computer networking looks almost nothing like it did twenty years ago, and, while certain principles persist, the constraints of the domain and the applications of the technologies are continually evolving. Students strive for concrete examples and applications of concepts to the world that they know, which is, incidentally, different from the world we knew when we were students. New research results represent prevailing theories, the outcome of our cumulative understanding, and the application of concepts to the most relevant problem domains of our time. I find that there is no better way to keep my course material current than to peruse the latest research and update the material so that it reflects current understanding.
Industry tracks research; students should, too. Our understanding continues to evolve as new research results emerge. In many areas, industry aggressively tracks new technologies and research results, and students will be better poised to make important contributions in industry if they are well-versed in current technologies. Students periodically thank me for covering a certain topic or concept in the classroom because “someone asked me about it in a job interview”. Certainly, there is a balance between educating our students on the big picture and “timeless” concepts (something I discuss more below), but I find that students are often quite grateful for having some exposure to the concepts and problems that industry is thinking about today. Infusing course material with fresh research results is one important way that instructors can help this process.
How Teaching Affects Research
I think the more surprising notion is that investing effort in teaching well can actually make us better researchers. I sometimes find that certain faculty members are too eager to minimize teaching responsibilities in favor of “leaving more time to get research done”. Now, it is worth acknowledging the source of this angst: many of the administrative aspects of teaching (e.g., grading, responding to student emails, organizational logistics) are incredibly time consuming and do not necessarily offer inherent benefits to research. Nevertheless, I find that the intellectual aspects of teaching are an indispensable aspect of my own efforts to become a better researcher. Below, I’ll explain more abstractly why I think teaching makes us better researchers, and, where appropriate, I’ll describe some of my own concrete experiences in this regard.
To create new knowledge, we must first master the existing body of knowledge. Research is the process of creating new knowledge. Making progress in creating knowledge requires a significant amount of background knowledge before one can reach the “frontier” of a topic, where the interesting questions are. Herb Simon once observed that it takes about ten years of experience to reach the point of great accomplishment in any one area, simply because it takes a significant amount of time to accumulate knowledge in an area. This necessarily implies that we can’t become great researchers in a subject area merely by taking a class (or even a few classes); we must embed ourselves in that topic area. I find that teaching a subject is perhaps one of the most efficient ways to become embedded in it, since the process of explaining concepts to students leaves no room for “cutting corners” in my own understanding. The process of building understanding in a particular area allows us to develop a deep understanding of the paradigms and theories that currently exist, and of how those paradigms and the existing knowledge base might be extended (or amended). Teaching Ph.D. students about a particular subject is also a way to bootstrap research, by helping our students get to the frontier of knowledge more quickly than they otherwise would; I sometimes teach seminars on cutting-edge topics (above and beyond my teaching “requirements”) simply because I find the process to be an efficient way of helping students quickly ramp up on a topic where I would like to see more research happening.
On a personal note, I found the process of preparing a Massive Open Online Course (MOOC) on Software Defined Networking over the past summer tremendously helpful in solidifying my own knowledge in this budding topic area. This particular sub-field has seen rapid developments over the past five years, and I had found it difficult to take the time to deeply understand many of the latest developments. Teaching the course was a wonderful “forcing function” to familiarize myself with new technologies and ways of thinking, and to gain hands-on experience with recently developed tools. That hands-on experience helped me in two ways. First, I was able to suggest better tools for my students to use in their own research; in several cases, students who had been “stuck” using older technologies quickly adopted newer ones that I had learned well enough to recommend. By investing time to deeply understand how new techniques and technologies might be applied, I was able to make connections between problems we had been trying to solve in the research lab and tools that could be useful for solving them. Second, I was able to connect recently developed concepts to problems that we had been working on but had not yet solved. In one case, for example, as I taught concepts about composition techniques for network policies, I realized that the techniques could be applied to help some of our own technologies scale to much larger networks, which provided a breakthrough on a problem that we had been thinking about for years.
In the process of explaining an existing phenomenon, you might discover that existing explanations, technologies, or theories don’t actually suffice. According to Thomas Kuhn, research breakthroughs often occur when old paradigms are discarded (or at least amended), thus changing our way of thinking about problems completely. New paradigms begin with the need to explain or treat facts or situations that existing paradigms don’t handle well. As instructors, when we attempt to explain various facts or situations to students, we sometimes find that we can’t explain why things are a certain way—our attempts to explain may reveal instances that are not handled or explained well by current paradigms, thus exposing glaring needs to develop new technologies, theories, and paradigms.
I remember my experiences as a teaching assistant for computer networking, as my advisor and I planned lessons to teach Internet routing. My advisor had long worked on problems where correctness properties and bounds were well-defined (e.g., Internet congestion control). When we came to the topic of Internet routing, however (a topic on which I had some mastery as a result of a summer internship), I found him continually asking me how (or whether) Internet routing offered any guarantees of correct behavior. How could we be certain that Internet routing algorithms would actually send traffic where it was supposed to go, for example? We realized in our attempts to codify this in lecture material that no such guarantees existed! Frustrated by our inability to explain Internet routing correctness, we spent the next several years formally defining correctness properties for Internet routing and developing tools that checked Internet routing configuration for correctness. The work eventually resulted in tools that were used by hundreds of network operators and a best paper award at a top networking conference. When I think about that work, I regularly trace its success to my teaching experience with my advisor, and our initial frustrated attempt to explain some seemingly basic concepts about networking to students. If it weren’t for that teaching experience, I think that research probably would never have happened.
Teaching encourages us to think about the long road, the big picture, and what “really matters” about a particular research contribution. I aim to explain why something is the way it is, beyond simply explaining a concept. As I explained above, efforts to explain why something is the way it is might sometimes fail to produce a good explanation, opening new possibilities for research. In other cases, research may offer solutions to a problem du jour, but sometimes research projects or papers are fairly self-contained, and it takes additional thought to really establish why (or whether) a particular result has broader implications that a student might care about. As an instructor, I strive to think about the big picture, and why a student should care about a particular research result, theory, or concept five or ten years down the road, long after they have left our classroom and received their degree. This exercise of thinking about broader implications can make classroom material more palatable to students, most of whom won’t specialize in the particular field you happen to be teaching. But, it also forces us as researchers to step back and think about why the problems we are working on have broad impact and why they matter to society at large. Explaining to a classroom of students why a particular result matters is perhaps one of the most useful exercises for distilling a research contribution to its essence.
Motivated Students + Inspiring Teachers = Great Research
I admired my university professors and wanted to emulate them; they are one of the main reasons I wanted to become a university professor in the first place. Teachers can influence and affect a large number of students in tremendously positive ways. Indeed, giving students the thirst for knowledge to the point that they want to not just consume existing knowledge but make discoveries themselves is a unique opportunity that we have as educators. And, certainly, developing smart young students into the researchers of current and future generations is yet another way that our efforts in the classroom can pay long-term dividends for research.
In this post, I’ll discuss paper selection—how a program committee considers a set of papers, each with a collection of reviews, and produces a program for a conference.
Paper Reviewing vs. Paper Selection
It’s first important to note that paper reviewing and paper selection are related, but they are actually two separate processes. There is a human process behind paper selection that involves distilling the reviews of a paper and ultimately determining whether the paper should be published, but reviewing and selection have many independent aspects. Sometimes, a paper’s reviews may appear positive, yet the paper is ultimately rejected (or the reviews may appear negative, yet the paper is accepted in spite of them). More commonly, papers tend towards the middle of the review distribution: when reviewers score papers, they tend to regress towards middling scores such as “weak reject” and “weak accept”, rather than taking a strong stance. A significant fraction of papers thus end up with almost the same average “score”, with only a handful of papers that should clearly be accepted. The number of clear-accept papers is never enough to fill the entire program. Suppose, for the sake of example, that a conference can accept 40 papers. The conference may receive several hundred submissions; in most cases I’ve observed, about ten of these submissions receive uniformly glowing reviews. The next best 30–40 papers often fall into a rough equivalence class with the 30–40 papers just below them, leaving anywhere from 60–80 papers that are “good enough to be published” competing for about 30 remaining spots in the conference program. The job of a program committee is to create a smaller program from this larger set of papers.
Whenever a larger number of roughly equivalent candidates competes for a smaller number of positions, some amount of randomness can and will ensue. Much of this randomness is completely beyond the author’s control. As an author, the best way to avoid being subjected to the randomness of a program committee is to write your paper so that it is among the ten best submissions to the conference. Unfortunately, this is far easier said than done, and it’s not really possible to engineer this, although following various research and writing tips can increase the likelihood that your paper ends up in this class. (Speaking from experience, I can say that those tips aren’t fail-proof—even the best researchers I know routinely have papers rejected—but you can at least improve your odds!)
Unfortunately, unless you can guarantee that your paper always falls among the very best papers (a secret that nobody has yet unlocked), much of the randomness of the paper selection process is out of your control as an author. Other factors, however, can help create a sane selection process, and the program committee chair (or chairs) have tremendous power here. The chairs’ responsibilities include setting the right tone for reviewing and the right mindset for paper selection, ensuring that papers are reviewed by the right program committee members, selecting the papers that will be discussed at the program committee meeting, and reducing the effects of program committee psychology (e.g., paper discussion order, the weight of one biased or incorrect review, and so forth). I will describe more details and tips for program committee chairs below, based on my own experiences and observations.
The Intermediate Dispositions of Papers
Before the program committee meeting itself takes place, the program committee chairs must determine whether each paper should be accepted or rejected without any discussion, or whether the paper should be discussed at the meeting itself.
Before the meeting: Selecting papers for discussion. Program committee chairs typically select papers for discussion at the program committee meeting based on their assessment of each paper’s overall quality, drawing on review scores, their interpretation of the paper’s reviews, and any “online discussion” that takes place before the physical program committee meeting (most conference reviewing systems allow reviewers to discuss one another’s reviews in the system itself via asynchronous, email-like messages). Program committee chairs will often ask reviewers to try to reach consensus, before the meeting, about whether a paper should be discussed there.
Best case: Acceptance without discussion. If you are very lucky, your paper will not be discussed at the program committee meeting and will be accepted without discussion based on high review scores. I think this is the best possible outcome, because every paper has flaws, and the more a paper is discussed, the more likely that someone at the meeting will hear something they don’t like about the paper and raise an issue with it. I have seen instances where a paper that receives uniformly high scores is rejected because it is brought up for discussion and someone (often who hasn’t yet read the paper but might read it during the meeting) hears something they don’t like. Such a discussion doesn’t mean that your paper will be rejected—a good PC chair will ensure that these dynamics don’t result in the unjust rejection of an otherwise good paper—but it does mean that your paper may ultimately be subjected to some of the dynamics I discuss below. Unfortunately, as I mentioned, you can’t guarantee that your paper falls into this category, but you can sometimes be lucky, and you can certainly take many steps to increase your odds.
Next best case: Discussion. Otherwise, if you are still reasonably lucky, your paper will be discussed at the program committee meeting. Discussion represents hope for acceptance, and because there are typically a large number of papers of roughly equal score or rank, if your paper is being discussed, it generally has as good a chance of being accepted as most of the other papers under discussion. The program committee chair will set the order in which these papers are discussed, which can sometimes play a surprisingly significant role in whether a paper is accepted or rejected.
The Dynamics of the Program Committee Meeting
The program committee meeting itself involves winnowing a larger set of papers (say, 60–80 papers) down to a final program of, say, 30–40 papers. Typical math I’ve seen at systems and networking conferences: while overall acceptance rates hover between 10% and 20%, about 50% of the papers that are discussed at the meeting are accepted.
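As a rough illustration of this funnel, here is a back-of-the-envelope sketch; the function and its specific numbers are hypothetical, chosen only to fall within the ranges mentioned in this post.

```python
# Back-of-the-envelope model of the paper-selection funnel described above.
# All specific numbers are hypothetical, within the ranges in the text.

def selection_funnel(submissions, accept_rate, discussed_accept_rate):
    """Estimate how many papers are accepted and, working backwards from
    the rule of thumb that about half of discussed papers are accepted,
    how many papers must be discussed at the PC meeting."""
    accepted = submissions * accept_rate
    discussed = accepted / discussed_accept_rate
    return accepted, discussed

# E.g., 250 submissions, a 16% acceptance rate, and a 50% acceptance rate
# among discussed papers imply ~40 acceptances out of ~80 discussed papers.
accepted, discussed = selection_funnel(250, 0.16, 0.5)
print(f"accepted ~ {accepted:.0f}, discussed ~ {discussed:.0f}")
```

With numbers in these ranges, the arithmetic lands squarely in the 60–80 "discussed" and 30–40 "accepted" bands described above.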
Attendance. An in-person meeting is more effective than the alternatives (e.g., conference calls), if and when it is feasible. Most top-ranked conferences—and even some competitive workshops—have in-person program committee meetings. When program committee chairs select the members of the program committee, they typically confirm that each member can attend the program committee meeting; if the meeting is in person, they confirm that the PC member can attend in person. Unless all PC members are present at the meeting, it is nearly impossible to have anything close to a fair paper selection process. Consider that a paper under discussion might have anywhere from 3–6 reviews, and that while most of those reviews might hover around average scores, one or two may be outliers. The fate of a paper can shift dramatically if the PC member who feels strongly (either positively or negatively) about it is not in attendance. Without a champion present, the paper may simply fade into the mix; if a reviewer who raised major concerns about a paper is not present, the rest of the committee may be more likely to discount those concerns, since they cannot hear about them first-hand. Some of these attendance-related quirks can also materialize when program committee members depart early (e.g., to catch a flight).
Calibration. Every reviewer has a different opinion about what constitutes a reasonable threshold for acceptance. These differences in calibration are exacerbated by the fact that every program committee member reviews a unique set of papers: although any given paper is shared by several reviewers, no two PC members review exactly the same set of papers. Calibrating the PC is therefore incredibly important for controlling the meeting’s dynamics. A poorly calibrated PC may pre-emptively reject good papers (as I mentioned above), only to accept weaker papers later in the meeting. A good PC chair will spend considerable effort calibrating the PC to reduce randomness. There are various approaches to calibration, which I discuss below in my advice for PC chairs. One of them is setting an appropriate order for paper discussion, since the order in which papers are discussed affects how the PC calibrates itself at various points in the meeting.
Discussion order. There is no accepted standard for setting the order in which papers are discussed, and every program committee chair seems to have a slightly different approach. That said, discussion order has a significant effect on the ultimate disposition of the large number of papers that sit on the borderline between acceptance and rejection. Perhaps the most striking example of ordering effects is evident when a PC discusses papers in decreasing order of score (i.e., from highest ranked to lowest ranked), which is, in my opinion, the worst possible ordering. People arrive at a meeting highly energized and eager for discussion. Papers that are discussed in the morning are thus subjected to a fine-toothed comb and vigorous discussion, which sometimes results in the premature rejection of a good paper. Later in the day, PC members grow tired and become less likely to pick apart a paper’s flaws, or to insist that a paper be accepted only “over my dead body”. Similarly, excessive aggression in the early parts of the meeting can create a situation where, by the end of the day, there are not enough papers left to fill the conference program (quite an irony, given the starting point of twice as many acceptable papers as slots!). This creates a dynamic where papers that are discussed later in the day sometimes have a better chance of being accepted. If papers are discussed strictly in order from best to worst, there is the potential for “inversion”, where better papers are rejected early in the meeting, making room for weaker papers to be accepted later on, as PC members acquiesce and PC chairs frantically try to fill the program. PC chairs can take various steps to reduce this randomness. One tweak I have seen (and used) is to simply treat all papers under discussion as roughly equivalent, banning all discussion or consideration of “scores”.
In these cases, the discussion order can be set so that papers on related topics are discussed together. Another tweak that I have used is a two-pass approach to discussion, where every paper that is not accepted immediately or upon initial discussion is discussed a second time; in other words, a paper is never rejected on the first pass. A third tweak that is sometimes used is to discuss a few “highly ranked” papers first (sometimes even the papers that are supposed to be “accepted without discussion” serve as quick calibration examples), followed by a few “low ranked” papers, and so forth.
Timing of discussion. A PC meeting does not leave time for extended consideration of any single paper. If there are 80 papers to discuss, and a PC meeting has eight hours of real work (discounting breaks, lunch, and so forth), then each paper sees, on average, six minutes of attention. Discussion must therefore be crisp and focused. A “discussion lead” for a paper (typically one of the reviewers) will summarize the paper and its strengths and weaknesses, effectively condensing all of the reviews. This summary should last no more than 90 seconds, which is not a lot of time; a good PC member who is leading discussion on a paper will therefore prepare that summary ahead of time, so that it is as efficient as possible. The paper’s ultimate fate is typically decided in the remaining time (less than five minutes). Many disputes and disagreements about a paper cannot be resolved in five minutes, and this truth is the most common downfall of a PC meeting: a single contentious paper runs the risk of monopolizing discussion time, leaving less time for the remaining papers and thereby increasing randomness, acquiescence, and so forth for papers that are discussed towards the end of the day. I find that using the “two-pass approach” for paper discussion helps mitigate this problem; I’ll discuss this approach in more detail below.
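The time-budget arithmetic above is worth making explicit; here is a tiny sketch using the numbers from the text (the function name is mine, purely illustrative):

```python
# Per-paper time budget at a PC meeting: hours of real discussion time,
# spread evenly across the papers under discussion.

def minutes_per_paper(papers, meeting_hours):
    return meeting_hours * 60 / papers

# 80 papers and eight hours of real work leave six minutes per paper,
# of which ~90 seconds go to the discussion lead's summary.
budget = minutes_per_paper(papers=80, meeting_hours=8)
summary_minutes = 90 / 60
remaining = budget - summary_minutes
print(f"{budget:.0f} min per paper, {remaining:.1f} min left after the summary")
```

The leftover is about four and a half minutes, which is why the text says a paper’s fate is decided in “less than five minutes”.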
Personalities. A paper that has a single champion tends to fare much better than a paper with middling reviews, even if both papers have the same average score. In some sense, that is exactly the right outcome: a paper that someone really likes is likely to be appreciated by others, whereas an average paper might not excite anyone in particular. However, strong personalities hold tremendous sway over a paper’s outcome as well. A paper stands a very good chance if it has a champion on the PC who is articulate, confident, and strong-willed enough to persistently argue for the paper’s acceptance in spite of its flaws and detractors. Every paper has flaws; a strong-willed PC member can convince the rest of the PC to overlook those flaws in favor of other bright spots. Likewise, a single similarly strong-willed PC member can amplify the weaknesses of a paper and send it to the reject pile. These effects are exacerbated when the strong-willed PC member asserts expertise over the topic area. Sometimes, a PC member’s expertise is as important as (if not more important than) the volume or persistence of the PC member’s argument. A simple statement from a domain expert such as “I learned a lot by reading this paper” can be enough to tilt a paper towards acceptance.
Psychology. The paper selection process is a human process that involves a lot of psychology. Rather than recount the various psychological factors that can play a role in program committee meetings, I refer you to Matt Welsh’s blog post on the topic.
Tips for Program Committee Sanity
Everything I learned about how to run a program committee, I learned from Jeff Mogul, who co-chaired NSDI this past year with me. Jeff is a veteran program committee chair who has chaired pretty much every major systems and networking conference at some point in his career. He is meticulous, thorough, ethical, and fair, which are indispensable qualities for any program committee chair. Working through the process with him, I picked up several tactics and strategies, which I can recommend to others and will undoubtedly use if and when I chair another major conference program committee. Being a program committee chair involves countless tasks; some of these tasks are incredibly important, and which ones are paramount wasn’t obvious to me until after I’d gone through the whole process. Below, I highlight tips for what I think are the most important steps of running a program committee.
- Pick your program committee carefully. Your program committee members need to be thoughtful, reliable, and conscientious. Ultimately, the program committee will write the reviews that authors of submitted papers see, and they will also determine the fate of each submitted paper. Don’t simply pick your friends or people you know well—they may not always make the best reviewers. Take extra time to do homework on program committee members’ performance on past committees. Did they write thorough reviews? Were they active participants in discussion at the PC meeting? Did they turn in reviews on time (or, if not, were they communicative enough to help the chairs plan around timing hiccups?). Are they considered an expert in a particular area that will see a lot of submitted papers for the conference? Beyond selecting individuals, the chairs also need to ensure that every area that may see submitted papers has significant coverage. For example, if the conference will see many submissions on (say) wireless networking, it is incumbent on the chairs to ensure that there are enough reviewers for that particular paper topic.
- Make sure that every program committee member “bids” on papers, and assign reviews manually based on preferences and expertise. Most conference reviewing systems allow program committee members to browse all submitted papers and express preferences for which papers they would like to review. Some conference systems also enable chairs to assign reviews automatically. Do not rely exclusively on auto-assignment. When Jeff and I chaired NSDI, we used the auto-assignment feature in HotCRP (after ensuring that every PC member had filled in review preferences for each submitted paper) and then manually reviewed each assignment to ensure that no reviewer was assigned a paper for which they had expressed a negative preference. This process is painstaking, but it is perhaps the most important step in the entire paper selection process, because it ensures that reviewers are assigned papers that they are capable of and willing to review. A reviewer who knows a paper’s topic can typically write a thoughtful review (and often does so quickly). In contrast, a PC member who reviews a paper for which they lack expertise or enthusiasm will typically invest only minimal effort and write a superficial review (thereby increasing randomness and likely reducing overall conference quality).
- Insist on (and monitor) online discussion in advance of the meeting. Selecting discussion leads and asking each lead to type a summary into the online discussion helps keep the meeting organized; the discussion lead can essentially read this typed summary (or notes) at the meeting itself, ensuring that the summary concludes quickly and on time. Online discussion can also encourage reviewers to identify contentious issues before the meeting (where there is limited time to resolve disputes or disagreements) and potentially resolve them ahead of time, thereby averting protracted and unfocused arguments at the PC meeting itself. Sometimes, consensus on a paper can even be reached in online discussion before the meeting occurs, saving precious time at the meeting.
- Use a two-pass approach to paper discussion. Every other program committee I’ve served on tries to reach a conclusion about a paper’s disposition after a (short) discussion. In the best case, this works OK; in the average case, hasty and incorrect decisions can result; in the worst case, the discussion monopolizes meeting time, the meeting is derailed, and it ends in a frantic rush to accept papers. The two-pass approach we used worked as follows. (1) In the first pass, a discussion lead would summarize the paper and its reviews; in the remaining 3–4 minutes, the reviewers would try to agree on an outcome for the paper, but “reject” was not among the possible outcomes. Rather, the possible outcomes were: accept; incremental/boring (indicating that the paper was technically correct, but not particularly interesting or groundbreaking); risky (indicating that the paper could be groundbreaking, but that reviewers had concerns about correctness or something else); and discuss (indicating that there was absolutely no agreement on the paper, and that more time would be needed to reach an outcome). (2) In the second pass, our original intent was to re-discuss everything, but effectively what happened was that all papers labeled risky were accepted by default unless someone wanted to argue against one of them; similarly, all papers labeled incremental were rejected by default unless someone wanted to advocate for one of them. (We had papers in each of these categories.) Pulling reject off the table on the first pass kept the meeting moving (we could reasonably end discussion after a fixed amount of time because people knew we could come back to it later), and it also reduced the “ordering effects” that I described above, since every paper that wasn’t quickly accepted received two passes, with the second pass occurring after the PC had seen the complete set of papers.
- Read every paper that is being discussed. Although it is not possible to carefully read 80 papers in advance of a meeting, two PC chairs can split the workload and at least perform a quick read of about 40 papers each (and take notes on them) in advance of the meeting. Jeff and I did this, and it proved to be incredibly useful, for several reasons. Having your own opinion about each paper helps a chair moderate the PC meeting discussion by preventing a strong personality (or, sometimes, a PC member who has not even read the paper!) from unfairly swaying the discussion or perception of the paper. In the end game, sometimes PC members will never come to agreement about whether a paper should be accepted or rejected. In these cases, the chair becomes the “tie breaker” and can accept or reject a paper by fiat. This typically happens at least once in every PC meeting. A PC chair who has not read a paper cannot exercise fiat effectively, so having read every paper is critical in these situations.
The Worst Process, Except for Every Other Process
The program committee process for paper selection is far from perfect; it is an inherently human process, and in the final analysis, a small handful of gatekeepers (indeed, sometimes even a single gatekeeper) can determine whether a paper is accepted to a top-tier conference and read by many others, or shelved. I personally would like to see my own community have a long discussion about (and experiment with) ways to improve this process. In the meantime, I trust that this post can help authors (particularly students) understand some of the dynamics of the process and also help chairs and PC members ensure that the process is as fair as possible. Although the process is far from perfect—and it is unlikely that a perfect process exists—understanding the mechanics that go on “behind the closed doors” of a program committee meeting hopefully also indicates that the outcome of the process (either acceptance or rejection) should not be interpreted as universal praise or condemnation, but rather as the result of the opinions of a small number of people and the outcome of a human process. Ultimately, the proof of a research idea lies not in the outcome of a single program committee, but in an idea’s ultimate acceptance, adoption, and impact (which is, in itself, a topic for a future post).
Learning how to review papers not only (obviously) makes you a better reviewer, but it can also help you as an author, since an understanding of the process can help you write your paper submissions for an audience of reviewers. If you know the criteria that a reviewer will use to judge your paper, you are in a much better position to tailor your paper so that it has a higher chance of being accepted.
There are many good resources that describe the paper reviewing process already, including those that explain the process (and its imperfections) and those that provide instructions for writing a good review (as well as techniques to avoid). There are also a few nice summaries of the review process for conferences in different areas of computer science that lend visibility into the process (e.g., here and here). Program committee chairs sometimes provide guidelines for writing reviews, such as these. I will not reiterate or summarize those previous articles here, but they are all definitely worth a read. Instead, I will discuss the importance of the review process and how it differs from simply reading a paper; I’ll also talk about how to prepare (and ultimately write) a review.
I will not talk about the paper selection process (i.e., what determines whether a paper is ultimately accepted or rejected), but will instead focus on the creation of a paper review. Program committee meetings are an important part of the paper selection process—at least in computer science—and I will be devoting a complete post to this topic next week. Meanwhile, I recommend reading Matt Welsh’s post on the psychology of program committees.
The Review Process
Why understanding the review process is important. Whether or not you end up reviewing a lot of papers as a Ph.D. student, your research will definitely be subject to the paper review process. It is imperative as a researcher to understand this process. Knowing the process can help you better write your paper for an audience of reviewers (and a program committee), and it can also help you maintain perspective when your paper is accepted or rejected. The process is far from perfect, and its outcome is neither validation nor condemnation of your work. How you react—and how you adapt your research or follow through on it after the acceptance (or rejection)—is far more important to long-term success.
In the “Introduction to the Ph.D.” class at Georgia Tech, I ask students to create a research idea and write it up; a subsequent set of assignments asks the students to review and evaluate the ideas as part of a “mock” program committee. The process isn’t exactly the same as the review process for a full paper, but it is a lightweight way to have students experience the process first-hand in a low-stakes setting, and see both sides of the process (submission and review) at the same time. In next week’s blog post, I will discuss program committee meetings in general, as well as some observations from this year’s (and previous years’) in-class experiences with the mock PC.
Reviewing vs. reading. There are some significant distinctions between reading papers and reviewing them. When reading a paper for your own enrichment, your goal is to gather information as quickly as possible. In this case, you are a scientist who seeks to understand the context and content of existing work, to (for example) better understand how your own research might fit into the bigger picture or learn about techniques that might apply to your own work. The goal of reviewing is different. A reviewer’s goal is, first and foremost, to determine the suitability of a paper for some conference and, second, to provide feedback to the authors to help them improve the paper in subsequent revisions. Remember that the reviewer’s primary goal trumps all other objectives: A reviewer often has a large number of papers to process and is typically not deeply devoted to improving the content of any particular paper. If you are lucky, you will get a diligent, thoughtful reviewer who provides thorough feedback, but do not be surprised if a review is not as thorough as you would have liked, or if the review “misses” some point you were trying to make. We would all like reviewers to make three passes through a paper submission—and, in an ideal world, these are the instructions I would give, too. Unfortunately, however, in many cases you will be lucky to get two thorough reads. The reviewer’s main goal is to determine the paper’s suitability for publication. As an author, you shouldn’t be surprised if some of the comments seem trivial: there may be underlying issues of taste that drove the reviewer’s opinion of your paper that the reviewer never explicitly states. Whenever I read reviews I receive for a rejected paper, I try to look past specific detailed quibbles (or “excuses” for rejecting the paper) and figure out the big picture: the reviewer couldn’t find a reason to accept the paper.
Calibration: Reviewing one paper vs. reviewing many papers. The paper review process can differ depending on who, exactly, is reviewing the paper. For example, as a Ph.D. student, you may review one or two papers at a time, as an “external reviewer” for a conference or journal. Journal editors and program committee chairs often seek the help of external reviewers if they need a particular subject-matter expert to review a paper. Later in your Ph.D. career, you may have established yourself as an expert on a particular topic and find yourself reviewing a paper here and there on a handful of topics. Sometimes a member of the program committee (e.g., your advisor) might ask you to help review a particular paper. As you progress in your career, you will be asked to serve on program committees yourself, whereupon you’ll find yourself with tens of papers to review over the course of a couple of months. Ironically, it is sometimes easier to review a group of papers than a single (or a few) papers, because seeing a group of papers helps you “calibrate” your scores and rankings of papers according to the general quality of papers that have been submitted to the conference. If you have been asked to review a single paper for a conference, you should either figure out how to calibrate your assessment with respect to other papers that might have been submitted, or simply review the paper on its merits while reserving judgement as to the paper’s ultimate disposition.
Does the Paper Realize a Great Idea?
Look for a reason to accept the paper. Does it realize a great contribution or idea? Every paper is imperfect. The paper may have made an incorrect or imperfect assumption. The experiments may not have been as thorough as you would like. The graphs may be difficult to read. Parts of the paper may be difficult to understand. These types of issues certainly reflect problems with a paper, but they do not necessarily constitute a reason to reject it if they do not affect the correctness or significance of the paper’s main underlying conclusion or contribution. Therefore, the first two questions I ask myself when reviewing a paper are: (1) Does the paper have a great idea?; and (2) Does it realize the great idea? (or, alternatively, to what extent does it realize that great idea, since typically no paper is watertight).
What makes an idea “great”? Judging a paper’s contribution turns out to be highly subjective, which is why the review process remains so uncertain. A paper isn’t judged against a set of fixed checkboxes, a grading “key”, or any notion of absolute correctness. Reviewers often exercise considerable judgment based on “taste”, and reasonable people will disagree as to the merits of the main contribution or idea in a paper. In fact, there is a fair amount of documentation that, as reviewers, we are often quite terrible at predicting the merits of a particular piece of submitted work: There’s a great article on this topic, as well as some parodies that illustrate the subjective nature of the process. Many fields have also introduced a “test of time” award for papers from past decades, to recognize accepted papers that have truly had long-term positive impact (implicitly acknowledging that this is almost impossible to assess when a paper is first published). Due to the subjective nature of this judgment, it is all the more important that your writing is clear and well-matched to what a reviewer is looking for (i.e., the contributions and ideas).
Invariant questions. Different conferences may have different value structures, and the chairs of any given conference may ask the reviewers to focus on different criteria when judging a paper. Regardless, there are some invariant questions that most reviewers would (or at least should) always consider, including:
- Is the problem important? What problem is the paper trying to solve, and is it important? Seek to summarize the paper’s contribution in one sentence. Make this short summary the beginning of your review, as well. Try to convince yourself (by reading the paper or otherwise) that a solution to the problem that the paper is proposing would advance knowledge or significantly improve the state of affairs for some group of people. Note that you may not care about the problem, but also ask yourself whether you can imagine some group of readers who will be interested in the solution to the problem. When asking yourself this question about a paper, try to divorce your own taste about the problem’s importance from the more general question concerning whether there is some group of people who would be interested in the problem the paper is addressing and solving.
- To what extent does the paper solve the problem it describes? A single paper very rarely closes the book on a single problem, but it may take an important step towards solving the problem. It might solve the problem for an important set of operating conditions or under a new set of assumptions. Or, if the problem area is completely new, perhaps the paper doesn’t really solve the problem at all, but simply articulating a new problem area for follow-on work is a significant contribution.
- What is the “intellectual nugget”? As a reviewer, I try to identify whether a paper has a particular intellectual kernel that lies at the heart of the solution. This kernel is often what separates an important research contribution from a simple matter of engineering. This intellectual nugget might be the application (or invention) of a particular technique, a proof of correctness (where one previously did not exist), or an attempt to put the solution into a broader intellectual context. In other words, the intellectual contribution might be to take a general problem and tackle a specific sub-problem (e.g., under certain assumptions or conditions), or to take a specific problem and generalize it (e.g., develop a general theory, proof of correctness, or taxonomy). Looking through the paper for applications of specific research patterns can help identify an intellectual nugget, if one exists.
- What is the main contribution or conclusion? Is it important? As a reviewer, I try to concisely articulate the paper’s main contribution (or small number of contributions). Often, a paper will helpfully summarize those contributions somewhere in the introduction (Jim Kurose’s advice on writing paper introductions advises the writer to explicitly do so). The reviewer’s job is then to assess whether those contributions are significant or important enough to warrant a publication. The significance of those contributions often depends on the perceived increment over previous work. All work is incremental to some degree, as everything builds on past work. The author’s job is to convince the reviewer that the increment is important, and the reviewer’s job is to assess the author’s claims of significance.
- Does the content support the conclusion? An introduction may make broad (or wild) claims, and it is important to dig into the paper to determine whether the content of the paper supports the conclusion. Are the experiments run correctly? Are they based on the correct set of assumptions? If the conclusion involves comparison to previous work, is the comparison performed in a controlled manner, using an equivalent (or at least fair) experimental setup? If applicable, have the authors released their code and data so that you (or others) can check the claims yourself?
Preparing Your Review
Consider the audience. Not every publication venue is the same. Some venues are explicitly geared towards accepting early, incomplete work that is likely to generate discussion (many workshops use this criterion for acceptance). Other venues favor contributions that constitute well-executed, smaller increments. When reviewing a paper, either externally or as a member of a committee, your first task should be to consider the audience for the conference, workshop, or journal, and whether the likely audience for the venue would benefit from reading the paper. The question of audience involves both the “bar” for acceptance (Does the paper meet the audience’s standards for something that is worth reading?) and the “scope” of the venue (Is the paper on-topic for the venue?). Often, scope can be (and is) broadly construed, so the key question really boils down to whether the likely audience for the paper will benefit from reading it.
Consider the standards. Your standards will (and should) vary depending on the venue for which you are reviewing a paper submission. Workshops are typically more permissive than conferences about accepting “vision” papers that outline a new problem or problem area, or papers that “foster discussion”; conferences typically aim to accept more complete pieces of work. Nevertheless, even the standards for a conference review process will vary depending on the conference itself, the program committee chair’s instructions about how permissive to be, and the relative quality of the group of papers that you are reviewing. A good way to get a sense of the standards of a conference for which you are reviewing is to read through the complete set of papers that you have been asked to review and rank them, before writing a single review. This ensures some level of calibration, although it is still biased by the particular set of papers that you are reviewing. Reading past proceedings of the particular journal or conference can also help you determine the appropriate standard to set for acceptance.
Consider the purpose. Different papers serve different purposes. Multiple paper submissions to the same venue might in fact have quite different purposes, and it is important to establish what the paper is contributing (or attempting to contribute) before passing judgement. For example, a paper might be a complete piece of work, but it might also be a survey, a tutorial, or simply a proposal. If the paper is one of the latter types, your first questions as a reviewer should concern whether the audience would benefit from the survey, tutorial, or proposal, and whether such a paper meets the standards for the conference. If the answers to those questions are “yes”, then your evaluation should be tailored to the paper’s purpose. If the paper is a survey, your assessment should be based on the completeness of the survey, with respect to the area that the paper is claiming to summarize. If the paper is a tutorial, is the description correct and clearly described? If the paper is a proposal, does the proposed research agenda make sense, and is the outcome (if the proposal is successful) worthwhile?
Consider the big picture. Every paper can be rejected. It is always easy to find reasons to reject a paper. The reviewer’s goal should not be to identify reasons to reject a paper, but rather to determine whether there are any reasons to accept it. If the answer to that question is negative, then it is always easy to find “excuses” to reject the paper (recall the discussion above). You should be aiming to figure out whether the paper has important contributions that the audience will benefit from knowing about, and whether the paper supports those contributions and conclusions to a level commensurate with the standards of the audience and the venue. One litmus test I use to ensure that a negative aspect of a paper does not condemn it is to ask myself whether the problem (1) affects the main conclusion or contribution of the paper; and (2) can be fixed easily in a revision. If the problem doesn’t affect the main contribution or conclusion, and if it can be easily fixed, then it should not negatively affect a paper’s review.
Writing Your Review
Start with a summary of the paper and its contributions. A short, one-paragraph summary describing the paper’s main contribution(s) demonstrates to the authors (and to you!) that you understand the main point of the paper. This helps you as a reviewer articulate the main contributions and conclusions of the paper for the purposes of your own evaluation. Try to address the type of paper it is (is it a survey paper, for example?), the context for the paper (i.e., how it builds on or relates to previous work), its overall correctness, and its contributions. If you cannot concisely summarize the paper, then the paper is not in good shape, and you can reflect this assessment in the review, as well. These summaries are very helpful to authors, since they may not match the authors’ views of the main contribution! For example, as an author, you can easily figure out if you’ve “missed the mark” or whether the reviewer fundamentally misunderstood the paper by reading a reviewer’s summary of your own work. If the summary of the contribution does not match your own view of the paper’s contribution, then you know that you have some work to do in writing and presentation.
Assess whether the paper delivers on the main claims and contributions. You should provide an assessment, for each of the paper’s main claims and contributions, of whether the paper delivers on that claim. If the main contribution of the paper is flawed, you should indicate whether you think the flaw is “fatal”, or whether the authors could simply fix it in a revision if the paper is accepted. Sometimes flaws (e.g., inconsistent terminology) are fixable. Other flaws (e.g., a questionable experimental setup) may or may not be fixable. While it might seem that a broken experimental setup is “fatal”, ask yourself as a reviewer whether the conclusions from the paper’s experiments, as is, are still meaningful, even if the authors have not interpreted the results correctly. If the conclusions from the experiments can be restated and still turn out to be meaningful contributions—or if a flaw in an experiment doesn’t affect the main contribution or conclusion—then even a flaw in experiments can likely be fixed in revision. Occasionally, however, experiments may need to be completely redesigned because they don’t support any meaningful conclusion. Or, the content of the paper may simply be incorrect; correctness issues are sometimes difficult for a reviewer to spot, so a paper isn’t necessarily “correct” simply because a reviewer has validated it. Regardless, if there are correctness issues that call into question whether the main result or contribution is correct in the first place, the review should reflect these concerns, and the paper likely cannot be accepted.
Discuss positive aspects of the paper; always try to find something positive, even in “bad” papers. It is easy to identify problems with a paper. It can be much trickier (especially with “average” papers) to identify the positive aspects and contributions, but most papers typically have at least some small kernel of goodness. Even for particularly bad papers, there might be one sentence in the introduction, discussion, or future work section that makes an interesting point or highlights a possibility for interesting contributions. In a pinch, if you can’t find anything positive, those are good places to look. As a reviewer, you can remark that those observations are interesting, and that you would really like to see those parts of the work further developed. These positive comments aren’t just for author morale (although that’s important, too): They give the author a direction to move forward. The worst reviews are those that reject a paper but don’t provide any specific action for moving forward. The best reviews are those that highlight the positive aspects of the work, while identifying weaknesses and areas where the work could be further developed to address weaknesses or build on the paper’s existing strengths.
Criticize the paper, not the authors. When writing your review, consider the type of review that you would like to receive. Always be polite, respectful, and positive. Don’t be personal. Choose your language carefully, as it will help convey your message. For example, if you say “the authors don’t consider the related work”, that is a much more personal statement than “the paper doesn’t consider the related work”. (In fact, you don’t know if the authors considered a particular piece of related work anyway; they may have simply chosen not to include it in the writeup!) Talking about “the authors” gets personal, and it will put the authors themselves on the defensive when reading your review. Instead, focus on “the paper” and frame your critique around “suggestions for improvement”. Never, ever insult the authors; don’t accuse the authors of being sloppy or unethical researchers. As a reviewer, you don’t always know the full context, so limit your judgement to what you can directly conclude by reading the paper.
Consider the type of feedback you would like to receive. Receiving reviews for rejected papers is a part of the research process, but it is never fun for the authors (particularly new Ph.D. students). Do your part to contribute positively to the process by suggesting changes that you’d like to see if you had to review the paper again. In all likelihood, you may see the paper again in the form of a revision!