How to read a CS/EE Research Paper? Some thoughts on what it takes to produce a good Ph.D. thesis
Structure of a CS/EE research paper, Two-phase paper reading, What do you need to retain?, Exercises
If you are not sufficiently self-aware to know where you belong among these three possibilities, you could end up frustrated and with a lot of emotional and other problems down the road.
- Abstract
  - Describes the main idea/proposed solution of the paper in a few words
- Introduction
  - Expands the abstract, also discusses:
    - Limitations of existing work
    - How the proposed solution has been evaluated
- Related Work
  - What has already taken place in this area? And how is this paper different?
- Background
  - Optional: used if concepts from a different domain are used
- System Model
  - The basic system-level model and assumptions
- Contribution
  - One or two sections describing the contributions of the work
- Performance Evaluation
  - Performance comparison with existing work
- Conclusions
  - Salient findings
- References
Before reading the paper, ask yourself, how much do I already know about this area?
If you know a lot, then skip the next slide and proceed to the two-phase paper reading process
Otherwise:
- Read about the area on Wikipedia
- Read other less-technical online articles about the area: news websites, ZDNet, CNET, etc. are good resources
- Download and read tutorial/survey/white papers on the subject: they may or may not be directly relevant to the area that you are interested in
- Finally, proceed to reading more advanced technical articles
Before reading the paper, remember everything written and presented in the paper is connected!
Well, at least if it is published in a decent conference/journal. More specifically:
- Every sentence leads to another sentence
- Every paragraph is connected to the next
- Every section flows into another
- Every figure is there for a reason

But remember:
- The authors are generally constrained for space, so everything is condensed
- At the start, you will have to refer to other sources (references) for details

Here is a guideline for those Ph.D. candidates who want to work in universities:

During the last third of a Ph.D. program, about a third of your mental focus should be on the research world outside.

During this period, a lot of your energy has to go into forming friendships (they will be your future collaborators) with people on the outside.

- Actively participating in conferences and workshops in order to become noticed
- Expressing verbal interest in other people's work at conferences and workshops and following that up with email interaction
- Forming friendships and collaboration with researchers from other institutions
- Volunteering to help out with workshops
- Volunteering to organize workshops
- Volunteering to help out with conferences
- Volunteering to organize conferences
- Volunteering to help out with journal refereeing
- etc.
Ability to express ideas precisely and unambiguously is a key to success in all human endeavors, particularly so in research.
A Ph.D. is, as the degree says, a doctorate in philosophy, a doctorate in ideas, a degree that requires that the chosen ideas be articulated precisely and with rigor.
Moreover, writing imposes a discipline on thinking. Every time you write something down, you are committing yourself to a position. The act of making that commitment forces you to examine with care what it is that you are writing.
There is a world of difference between these three lifestyles. This is not to say that any one particular post-Ph.D. existence is better than the other two.
1. What problem is the paper trying to solve?
   - Highlight three to four lines
2. What are the limitations of prior work?
   - Highlight maximum two to three lines
3. How is the problem solved by this paper?
   - What is the paper's contribution?
   - Highlight three to five lines
4. How is the proposed solution evaluated?
   - What kind of data/experiments were conducted?
   - No need to highlight anything; you can highlight two to three words here

In other words, much technical literature is written with motives that are less than noble. How does one cope?

My strategies for coping with the literature glut

Every engineering contribution is based on assumptions about the real world. When I look at a new paper, my first attempt is to quickly extract those assumptions. If I find those assumptions excessively unrealistic, I do not pay much further attention to the paper.

I read papers to seek out their limitations. But some authors do a great job of hiding the limitations.

Sometimes I discount papers if I have already written the authors off in my mind.

Getting to the bottom of a research paper

In a face-to-face interaction (even by email sometimes), people are more likely to tell you about the limitations of their work, limitations that they did not mention in their written papers. In any case, if your overall research effort is experiment driven (as opposed to literature driven), you are much more likely to spot the limitations in the papers written by other people.
short term recognition, etc. (If you believe that scientists and researchers are a nobler breed than most, you are mistaken).
1. What problem is the paper trying to solve?
2. Read the System Model carefully
   - The whole paper is going to be based on this model
3. Understand the gist of the contribution/proposal
   - Does it make sense?
   - Do you think it will work?
   - How will the proposal be evaluated?
4. Don't try to read all the math in one go
   - Read the assumptions and system model
   - Try to work out a solution to the problem
1. Is the evaluation fair and comprehensive?
   - Try to find the next paper you want to read from this section
2. Understand the results
   - Do not miss a single figure and table
   - Find the corresponding discussions in the paper and read them thoroughly
What to retain?
1. The problem
2. The basic idea of the proposed solution
3. Your personal notes on the paper's mathematics
4. Shortcomings of the proposed approach
- What is the message you take away from this paper?
- Are you convinced that the paper attempted an important problem?
  - If your answer is NO, justify it!
- Are you convinced that the paper proposed a viable solution?
  - If your answer is NO, justify it!
- What questions still remain unanswered?
Exercises
area and pushing that to a higher level of performance bringing together two hitherto disparate threads of engineering knowledge and creating a new thread for study and analysis.
2. What are the limitations of prior work?
3. How is the problem solved by this paper?
4. How is the proposed solution evaluated?
In order to discover a good problem you have to first push yourself to the current state of the art, before you can advance the state of the art.
Are there any strategies for rapidly pushing oneself to the current state of the art?
In engineering research, I believe that the best strategy is to actually try to do a state-of-the-art experiment.
If you want to discover a good problem to work on in any area of engineering, there does not exist a faster way to get to the state of the art than creating your own implementation for a core problem in that area.
Reading research papers effectively is challenging. These papers are written in a very condensed style because of page limitations and the intended audience, which is assumed to already know the area well. Moreover, the reasons for writing the paper may be different from the reasons the paper has been assigned, meaning you have to work harder to find the content that you are interested in. Finally, your time is very limited, so you may not have time to read every word of the paper or read it several times to extract all the nuances. For all these reasons, reading a research paper can require a special approach. To develop an effective reading style for research papers, it can help to know two things: what you should get out of the paper, and where that information is located in the paper. First, I'll describe how a typical research paper is put together.
For example, let's say you want to understand the state of the art in information retrieval from large software libraries.
There are probably a couple of hundred papers now that have been published on the subject of information retrieval from software libraries. These papers use a variety of methods that range from static source code analysis to the construction of statistical models of the source code libraries using techniques developed by folks in information retrieval from large text corpora.
You could spend a couple of years trying to read all these papers, but by the time you are done, there will be another 100 papers to read.
If, instead of chasing at the outset all the papers that are out there, you create your own retrieval engine, you are much more likely to get a good feel for the state of the art (even if your own implementation is rather crude compared to the best out there).
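To give a sense of how crude such a first implementation can be, here is a minimal sketch of a retrieval engine over a toy software library. It is only an illustration under stated assumptions: the four-entry `LIBRARY`, the helper names (`tokenize`, `build_index`, `cosine`, `search`), and the choice of TF-IDF weighting with cosine similarity (one classic technique from text retrieval) are all hypothetical and are not taken from any particular paper.

```python
import math
import re
from collections import Counter

# Hypothetical toy "software library": function name -> short description.
LIBRARY = {
    "read_csv_file": "read rows from a csv file into a list",
    "write_csv_file": "write a list of rows to a csv file",
    "sort_by_key": "sort a list of records by a key",
    "open_socket": "open a network socket connection to a host",
}


def tokenize(text):
    """Lowercase and split on non-letters, so read_csv_file -> read, csv, file."""
    return [tok for tok in re.split(r"[^a-z]+", text.lower()) if tok]


def build_index(library):
    """Return (TF-IDF vector per entry, IDF table) for the given library."""
    counts = {name: Counter(tokenize(name + " " + desc))
              for name, desc in library.items()}
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    n_docs = len(counts)
    # Smoothed IDF: terms found in every entry still get a small weight.
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = {name: {t: c * idf[t] for t, c in term_counts.items()}
               for name, term_counts in counts.items()}
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, vectors, idf, top_k=3):
    """Return up to top_k entry names, ranked by similarity to the query."""
    query_counts = Counter(t for t in tokenize(query) if t in idf)
    query_vec = {t: c * idf[t] for t, c in query_counts.items()}
    scored = [(cosine(query_vec, vec), name) for name, vec in vectors.items()]
    scored = [(score, name) for score, name in scored if score > 0.0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


VECTORS, IDF = build_index(LIBRARY)
print(search("network socket", VECTORS, IDF))  # ['open_socket']
```

Even a toy like this raises the questions the literature argues about: how to split identifiers into words, how to weight rare terms, and what to do when the query shares no vocabulary with the code.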
Despite a paper's condensed form, it is likely repetitive. The introduction will state not only the motivations behind the work, but also outline the solution. Often this may be all the expert requires from the paper. The body of the paper states the authors' solution to the problem in detail, and should also describe a detailed evaluation of the solution in terms of arguments or an experiment. Finally, the paper will conclude with a recap, including a discussion of the primary contributions. A paper will also discuss related work to some degree. Because of the repetition in these papers at different levels of detail and from different perspectives, it may be desirable to read the paper "out of order" or to skip certain sections. More on this below. The questions you want to have answered by reading a paper are the following:
The process of creating your own implementation will give you deep intuitions that would be hard to acquire by just reading the literature.
It is much more efficient if the problem discovery phase is experiment driven as opposed to literature driven. What you read in the literature should be dictated by your current experimental obsession, as opposed to the other way around.
Much technical literature is poorly written, designed more to hide than to reveal, designed more to obfuscate than to clarify, designed to gain
1. What are the motivations for this work? For a research paper, there is an expectation that a problem has been solved that no one else has published in the literature. This problem intrinsically has two parts. The first is often unstated, what I call the people problem. The people problem is the benefits that are desired in the world at large; for example some issue of quality of life, such as saved time or increased safety. The second part is the technical problem, which is: why doesn't the people problem have a trivial solution? There is also an implication that previous solutions to the problem are inadequate. What are the previous solutions and why are they inadequate? Occasionally an author will fail to state either point, making your job much more difficult.

2. What is the proposed solution? This is also called the hypothesis or idea. There should also be an answer to the question: why is the solution to the problem better than previous solutions? There should also be a discussion about how the solution is achieved (designed and implemented) or is at least achievable.

3. What is the evaluation of the proposed solution? An idea alone is usually not adequate for publication of a research paper. What argument and/or experiment makes the case for the value of the ideas? What benefits or problems are identified? Are they convincing? For work that has practical implications, you also want to ask: Is this really going to work, who would want it, what will it take to give it to them, and when might it become a reality?

4. What are the contributions? The contributions in a paper may be many and varied. Ideas, software, experimental techniques, and area surveys are a few key possibilities.

5. What are future directions for this research? Not only what future directions do the authors identify, but what ideas did you come up with while reading the paper? Sometimes these may be identified as shortcomings or other critiques in the current work.

As you read or skim a paper, you should actively attempt to answer the above questions. Presumably, the introduction should provide motivation. The introduction and conclusion may discuss the solutions and evaluation at a high level. Future work is likely in the concluding part of the paper. The details of the solution and the evaluation should be in the body of the paper. You may find it productive to try to answer each question in turn, writing your answer down. I recommend that you keep a notebook on all the papers you read. You should use my standard one-page form that you can fill out for each paper. In practice, you are not done reading a paper until you can answer all the questions. I will be asking you these questions in class.

Also, you should be aware of the context of the paper in relation to the other papers in the class. Often a paper will represent a generalization, new direction, or contradiction to earlier papers.

If you find that filling out this form doesn't work for you, you can try writing a 250-word abstract of the paper: not a rewrite of the abstract at the front of the paper, but your own abstract, capturing the above five issues from your perspective. I often find it useful to write an abstract because it develops the logical connections between the above five issues.

If you are somewhat lost on a particular paper, and sometimes if you are not, it can pay to write down questions you have about the paper. Perhaps the paper was vague on key issues, or ignored issues that you think are important. If you come to class with such questions, you are prepared to counter or preempt my own questions.

Reading a book is somewhat different. Although you want to answer the above questions for a book, it may not do the book justice given the amount of detail in each chapter. You may want to fill out the above questions on a chapter-by-chapter basis, and then produce a summary form for the entire book when you have finished reading it. However, each chapter will have a particular slant that may make certain questions irrelevant. Also, a book is often not oriented towards explaining the solution to a research problem. However, engineering books are invariably oriented towards problem solving of one kind or another. I have a habit of writing on papers directly, less with books simply because they cost so much. A well-annotated paper is worth its weight in gold, as it not only contains the content of the paper, but your assessment of its value to you.