Research Patterns
Posted: September 20, 2013
Research creates new ideas, paradigms, and discoveries. The process of creating new ideas can seem daunting, and somewhat of a mystery. From some apparently unstructured environment, the mythical process of research will produce new discoveries and knowledge. Picking a good problem is one important piece of the puzzle; as I have previously discussed, developing taste in research problems and learning how to identify good problems can take many years of experience (indeed, part of the Ph.D. process itself is gaining experience and research taste).
Before we begin our research careers, we experience structured training: we attend classes, read textbooks, and grind through problem sets in areas that might be called “established science”. In contrast, the research process appears completely unstructured: researchers create new ideas and discoveries, seemingly from nowhere. Research papers convey the outcomes of research, but they do not shed light on the process of how the problem was identified or how the solution was devised. (In particular, most papers don’t discuss the countless problem formulations or solution attempts that failed before the breakthrough occurred.) A paper typically only tells a story about the broader context and contributions of the results after the breakthrough has occurred, with the benefit of hindsight and perspective. Reading a research paper without insight into the process can make research seem even more mystical. “How did they think of that?” you might wonder. In this post, I’ll try to at least partially demystify the process of finding research problems and solving them.
Research patterns. In fact, the research process is a lot more formulaic than it might appear. I find that applying particular formulaic approaches tends to work fairly well for identifying (and solving) important problems. Indeed, I would posit that there are research patterns, similar to the software engineering concept of design patterns. A design pattern is a reusable solution to a commonly occurring problem. Example design patterns include iterators (for lists) and locks and thread pools (for concurrency). In research, we also face common problems: selecting a problem, understanding the problem, and planning and executing a solution to a problem. Fortunately, there are research patterns (reusable solutions to each of these challenges) that can help solve each of these problems. Below, I will propose some research patterns for each of these challenges that we commonly face in research.
Finding a Problem
How does one find a problem to work on? There are certainly many difficult problems to work on, some of which are worth solving as a researcher and others of which are best left to others. Developing research taste can help us determine which problems are worth solving, but where does one find research problems in the first place? Below are a few research patterns that I have applied and found useful.
Hop on a trend; look “upstream” for the trends. Researchers tend to coalesce around certain paradigms as they emerge. Now, you could of course try to create your own paradigm, but doing so involves effectively staging a revolution that changes the way an entire community thinks (including established researchers who may be slow to adopt radical thinking, yet hold considerable clout; the topic of revolutionary science deserves a post of its own, and I’ll write more about scientific revolutions in a subsequent post). Short of setting your own trend, you can very easily look around and identify (and even predict!) the community zeitgeist. Reading conference proceedings to identify the zeitgeist is reasonable, but this might place you as an “also ran” in an area where most of the initial breakthroughs have occurred. A better place to identify trends, I think, is “upstream” of conferences. Funding agencies often set research agendas, for example, through “calls for proposals”, and the research community chases those opportunities (“following the money”). So, for example, if you see a funding call for (say) digital health, it’s reasonable to expect that three or four years down the road, a trend or sub-community may emerge around this topic. Funding agencies even hold workshops to help shape these trends, before calls for proposals are released. Those workshops often provide significant hints about topics that might appear in a call for proposals—and, hence, what might eventually become a “trend”. In other words, trends don’t come from thin air—people define and set trends. If you can’t be a trend-setter yourself (something that’s very tough to do as a Ph.D. student), it’s best to be on the early side of a trend, and looking upstream can be incredibly helpful for doing so.
Develop a secret weapon and look for nails to hit with your hammer. Become a domain expert at something, or develop knowledge or resources that others don’t have. By developing some expertise, you become attractive and valuable to many people, and people will begin approaching you with ideas and want to collaborate with you. People will actually start coming to you with their problems. The expertise you develop can come in many forms, and it effectively becomes your secret weapon. You might, for example, develop a software toolkit or system that other people can build on. By developing a system that others want to use, you place yourself at the epicenter of new ideas that might build on that system. Or, perhaps you might become an expert on a particular topic or area—a particular set of network protocols, a particular piece of hardware, a particular programming language, or a particular set of statistical methods. Once you have developed that secret weapon, you can use it to solve problems that others are not able to solve, or to develop solutions that others might not be able to think of. Once you have developed your secret weapon, you can begin to look for problems where you might be able to apply your unique knowledge, system, dataset, or expertise.
Revisit old problems where assumptions may have changed. Old problems can be a great source of new problems. Previous solutions to old problems may have assumed certain constraints about processing power, storage, the cost of memory, the set of prevailing applications or protocols, and so forth. Yet, underlying technologies are continually advancing. A problem that was difficult to formulate or otherwise intractable five years ago might suddenly become solvable because old assumptions are now invalid, or because the emergence of a new technology (or algorithm) makes previously hard problems suddenly easier to manage. It is worthwhile to periodically revisit problems that have remained difficult and unsolvable for many years and to consider whether recent developments have made the problems any more tractable. Lots of areas of research have unsolved problems (not only theory!). For example, network management problems have remained particularly vexing in computer networking for a long time, but the emergence of recent new technologies has suddenly provided ways to make traction on problems that were previously hard to even formulate. Your area may have similar developments, and you should continually be on the lookout for those underlying phase shifts.
Look for pain points; eliminate them. Automate existing solutions. It is worth looking to industry (and even other researchers) to identify problems that continually recur. As programmers, if we have to do something more than a few times, we eliminate that pain point by writing a script. In research, if the same problem continually recurs and is being solved in the same silly, inefficient, or suboptimal way, certainly there must be a better way. People in industry often face real, important problems, but may be too busy to step back and take a fundamentally new approach to solving an old, recurring problem. Fortunately, as a researcher, you have time to consider whether the problem is being solved in the right way, or whether a “pain point” exists that could be eliminated with a fundamentally new approach. Along these lines, many existing problems remain manual and painful, and automation of an existing solution can often significantly improve the state of the art. Yet, automation can be incredibly difficult because it often requires complex reasoning, which may be the basis of an important research problem.
Dream, and maintain wish lists. Many research solutions are ultimately about making everyone’s lives better. Think about the “grand” research problems: curing cancer, putting a man on the moon, and so forth. These are dreams that seem like science fiction, yet they are achievable. Each of these dreams involves many sub-problems, some of which require the creation of new knowledge. (If they didn’t require new discoveries, the problems would be solved already, because we’d just “turn the crank”.) A good way to identify research problems is to maintain a wish list of your own, as well as a list of questions that you’d like to know the answer to. If you don’t know the answers and can’t easily find them, then the answers may not be known. Answering them is then, by definition, research. These might be large dreams or questions, or they might just be simpler things that bother you and that you wish you had a solution for. Maybe you’re getting too much email spam and want to come up with a better way to deal with it. Maybe you want to know how authoritarian governments block access to certain information or materials. A good place to start for many research problems is to identify which problems you would like solved that could make your world a better place.
Specialize the general. Applying constraints to a general problem in different ways can sometimes yield a new class of problems. For example, in networking, “routing” (the process of computing a path between two points) is a classic general problem. One can create new classes of problems by applying a constraint to the general problem: routing in wireless networks, routing in sensor networks, routing in delay-tolerant networks, and so forth. Of course, it’s best to find constraints that reflect realistic and important real-world scenarios.
Generalize the specific. Some problem areas have been subject to solutions that “chip away” at the problem, offering point solutions, rather than trying to tackle the larger problem area. When reading a collection of papers on a particular problem, it can be helpful to determine whether any particular paper proposes a general approach or algorithm, or whether each merely offers a point solution to some aspect of the problem. If particular papers only offer point solutions, there may be room for generalization.
Look for linchpins. When looking for problem areas that are in need of generalization, I think it is helpful to identify whether a problem area has a linchpin, whereby a collection of papers each tackle a problem in a specific way, based on the assumption that some underlying problem is already solved. That linchpin might represent a significant research problem, since so many specific solutions rely on a solution existing. For example, one (open) problem in censorship circumvention is “bootstrapping”: every paper on censorship circumvention that I have read assumes that there is some way of distributing the software and the initial information about how to configure that software to reach an initial set of trusted parties. All of the existing systems assume a solution to this problem, but no such solution exists. A solution to the problem could crack open an entire area. That’s a good problem to work on. The theoretical computer science and cryptography communities have some famous examples of this (e.g., P vs. NP, discrete log problem, factorization), but I believe most communities have these linchpins, although they may not be quite as commonly accepted or well-known.
Teach. I have found that there is an extremely tight coupling between teaching and research. When teaching concepts, I aim not only to teach the concept, but also explain the rationale behind the concept. For example: How was a particular network protocol designed? What is the rationale behind a particular design choice in a system? In seeking to explain certain concepts or theories, sometimes, we can find that things are difficult to explain. Sometimes, this reflects a gap in our own knowledge or cognition, but sometimes what we are attempting to explain might actually not make sense—the concept, approach, result, or system you are attempting to explain might in fact reflect an incorrect result, a bad design choice, or (perhaps more likely) one that no longer makes sense given today’s constraints, circumstances, or common assumptions. When you find yourself having trouble explaining the rationale behind a concept, you may have identified a new opportunity or problem area.
Solving the Problem
If solving a problem were easy, someone else would have likely already solved it. Therefore, identifying a solution to a problem (much like finding the problem itself) often requires a new perspective or way of thinking. Fortunately, problem solving also has several research patterns that seem to repeatedly arise. I have commonly used and observed the following approaches to solving research problems.
Consider related problems. Try to restate the problem you are trying to solve in a different way. Consider different terminologies and representations of the problem that you are trying to solve. By changing the form of the problem and trying to describe and represent it in different ways, you might find that your problem matches a general problem that is already formalized. For example, the problem of identifying whether an Internet service provider is intentionally degrading performance might be referred to as a “treatment” of a class of customers. That terminology might make you think about “random treatment”, a process by which biologists can determine whether a particular drug or chemical has any positive (or negative) effect on a group of humans. Trying to recreate the conditions of random treatment in a network environment might lead you to a statistical approach to solving the problem of identifying ISP service degradation. This thought process was exactly that of one of my former Ph.D. students, which led to this solution.
Make analogies. Analogies are incredibly powerful. We use analogies to learn all the time, because we learn a new concept best by relating it to a concept that we already understand. Similarly, you can solve a hard problem by relating it to a problem that you already know how to solve. Analogies often create the biggest breakthroughs when they come from outside of your immediate discipline (these are the conceptual blockbusters that other people often aren’t thinking of because they’re typically not looking for solutions outside their immediate discipline). Computational thinking was one highly publicized example of applying analogies to problem solving—the notion that concepts that we learn in computer science (sorting, queuing, etc.) can help us solve problems outside of the discipline. But, these analogies can also be applied in reverse. For example, researchers have applied concepts from epidemiology to understand how computer viruses spread. These analogies—when applied well—can also often point exactly to a solution, since the solution that applies to the analogous problem can sometimes be translated to the problem you are studying.
Change the problem to one you can solve. Make simplifying assumptions that violate some of the problem constraints, or define some approximation of the ideal solution. In some sense, this is the dual to the “finding the linchpin” approach to finding a research problem. Many of the censorship circumvention papers would not have been written if they did not first assume some ability to securely and covertly distribute software and set up an initial configuration. By making some simplifying assumptions, these papers have made some progress. And now, these papers have created a linchpin that’s an important area for new unsolved problems.
Just get started, with anything. Just like writing anything down on the page is one way to overcome writer’s block, starting with any solution to a problem (however seemingly bad) can get you started. The process of iterative refinement is powerful. In algorithms, propose a simple (sub-optimal) algorithm, and check its correctness. Refine until you have something that works. In data analysis, measurement, or modeling, look at simple statistics of a dataset—timeseries, averages, histograms, etc., and begin to look at anomalies. (I like to say that an anomaly in data analysis is either a bug or a paper!) In systems, start with a simple, if imperfect, design. Try to implement it and see where stumbling blocks arise. Those stumbling blocks may represent the hard research problems, the solutions to which might result in new discoveries.
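As an illustration of "just getting started" in data analysis, the sketch below computes a few simple statistics and a coarse histogram, then flags values that sit far from the median. Everything here is an assumption made for the example: the function names, the latency samples, and the choice of a median-absolute-deviation rule with a threshold of 3 as the (deliberately crude) anomaly test.

```python
from collections import Counter
from statistics import mean, median, pstdev


def summarize(samples):
    """Return a few basic summary statistics for a list of numbers."""
    return {
        "n": len(samples),
        "mean": mean(samples),
        "median": median(samples),
        "stdev": pstdev(samples),
    }


def histogram(samples, bin_width):
    """Count samples per bin; each key is the lower edge of a bin."""
    counts = Counter((x // bin_width) * bin_width for x in samples)
    return dict(sorted(counts.items()))


def flag_anomalies(samples, threshold=3.0):
    """Flag values whose distance from the median exceeds `threshold`
    times the median absolute deviation (a simple, assumed rule)."""
    med = median(samples)
    mad = median([abs(x - med) for x in samples])
    if mad == 0:
        return []  # no spread to measure against
    return [x for x in samples if abs(x - med) / mad > threshold]


# Hypothetical latency measurements in milliseconds.
latencies = [20, 22, 21, 19, 23, 20, 250, 21, 22, 20]

print(summarize(latencies))
print(histogram(latencies, 10))
print(flag_anomalies(latencies))
```

On this made-up data the single 250 ms sample is the only value flagged, which is exactly the kind of point worth chasing down: it is either a measurement bug or something interesting.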
Consider nature. When thinking about solutions, ask yourself the questions: How does a human naturally solve this problem? How does nature solve this problem? Certain problems in computer security have been solved by considering, for example, human immunology. Other advances in miniature robotics have come about by studying the behavior of bees. We estimate our distance from a lightning strike by watching the lightning and then counting the time it takes for us to hear the thunder; the time elapsed multiplied by the speed of sound gives us the distance. This same approach has been applied in location systems. Considering nature’s approach to solving problems is a specific way of applying analogies.
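The lightning estimate is simple enough to write down directly. The sketch below assumes a speed of sound of roughly 343 m/s (dry air at about 20 °C) and treats the flash as arriving instantly; the function name `distance_from_delay` is made up for this example.

```python
# Assumed value: speed of sound in dry air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_PER_S = 343.0


def distance_from_delay(delay_seconds, speed=SPEED_OF_SOUND_M_PER_S):
    """Distance in meters to a strike, given the delay between seeing
    the flash and hearing the thunder."""
    if delay_seconds < 0:
        raise ValueError("delay must be non-negative")
    return delay_seconds * speed


# A three-second gap corresponds to roughly one kilometer.
print(distance_from_delay(3.0))  # 1029.0
```

A location system built on the same idea would measure the gap between a fast signal and a slow one in the same way, with the constants chosen for its own signals.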
Work backwards from the goal. In the words of Yogi Berra, “If you don’t know where you’re going, you might not get there.” I find it helps to sometimes have a desired end result in mind, and then to figure out what is needed to get there by breaking the problem of reaching the end state into smaller sub-problems. I sometimes use a specific version of this approach by asking my students to draw the graphs that they would like to see in their final paper—complete with axis labels and trends. Given that graph, table, or result, what data do you need to gather to produce the result? If you can’t get exactly that data, can you approximate it? Do you need to develop (or apply) any special analysis techniques to produce the result? Or, suppose you’re aiming to have a working system that performs some task. What do you want that system to achieve? Now, what are the building blocks that need to be in place for the system to achieve those goals? Those building blocks are either solved problems, problems that you need to solve, or problems that you need to dispatch with some simplifying assumptions (see above on simplifying problems). Working backwards from your goal in this way can often provide a useful roadmap towards the solution. It can also allow you to solve smaller parts of a larger problem separately (starting with the parts that are most tractable); sometimes, finding a solution to a good sub-problem can turn out to be a significant research contribution by itself.
Think in speech or pictures. We have all had the experience of trying to verbalize a problem to a friend or colleague, only to find ourselves saying “never mind, I’ve figured it out” before we even finish describing the problem. Sometimes, the process of thinking in speech can help us arrive at a solution, because verbalizing the problem can help us make the problem concrete and structured in a way that it wasn’t when it existed solely in our heads. Similarly, drawing a picture can help us think about how a problem is structured and how various sub-problems and components relate to one another. While the process of communicating a problem in words or pictures can help us arrive at a solution, it also sometimes helps a lot to have someone who is listening and approaches a problem in a different way. Describing a problem to a colleague from a completely different area might trigger a seemingly naive question that causes you to break out of your way of thinking or discard your current set of operating assumptions, breaking down mental barriers and leading you to the breakthrough. On that note, it is also useful to relentlessly come at a problem from a variety of different angles. When applying one set of algorithms doesn’t work, put it aside and try something else.
Teach (and learn). Just as teaching is useful for identifying problems, it can also help us identify solutions. If we are teaching “well”, we are constantly trying to keep abreast of new technologies and building blocks for solutions. By keeping abreast of the latest technologies, we can stay aware of tools that we can apply to problems where we might otherwise remain stuck. For example, I had been working on a certain network management problem for some time. In the process of teaching new approaches to network control, I realized that new programming languages that were being developed allowed us to solve some longstanding scalability problems with the solutions that we had come up with. Technology and knowledge are never in a steady state; they are continually advancing around us. Someone is probably working on a technology, concept, or theorem that could help you solve your problem. Teaching is a good forcing function to continue learning about these advances and to increase the likelihood that you become aware of them.
Relax, and let your subconscious work. Above all, immerse yourself in the problem, but stay relaxed. Take breaks. Generally, creative insights come when we give our minds a break and let them think without structure. We sometimes create a process for this unstructured thinking, which we call “brainstorming”. But probably better than brainstorming is “brain-resting”, where you take a break, go for a walk, go for a run, take a nap, or just go do something else. Often, the solutions will come when you least expect them.