Some reasons why a particular publication might be regarded as important:
- Topic creator – A publication that created a new topic
- Breakthrough – A publication that changed scientific knowledge significantly
- Influence – A publication which has significantly influenced the world or has had a massive impact on the teaching of computer science.
Artificial intelligence
Computing Machinery and Intelligence
Description: This paper asks whether machines can think and proposes the
Turing test as a method for answering that question.
A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence
Description: This summer research proposal inaugurated and defined the field. It contains the first use of the term
artificial intelligence and this succinct description of the philosophical foundation of the field: "every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it." (See
philosophy of AI) The proposal invited researchers to the
Dartmouth conference, which is widely considered the "birth of AI". (See
history of AI.)
Fuzzy sets
Description: This seminal paper, published in 1965, introduced the mathematics of
fuzzy set theory.
Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference
Artificial Intelligence: A Modern Approach
Description: The standard textbook in Artificial Intelligence.
The book's web site lists over 1100 colleges where it is used.
Machine learning
An Inductive Inference Machine
- Ray Solomonoff
- IRE Convention Record, Section on Information Theory, Part 2, pp. 56–62, 1957
- (A longer version of this, a privately circulated report, 1956, is online).
Description: The first paper written on machine learning. It emphasized the importance of training sequences, and of using parts of previous solutions to problems when constructing trial solutions to new problems.
Language identification in the limit
On the uniform convergence of relative frequencies of events to their probabilities
A theory of the learnable
Learning representations by back-propagating errors
Induction of Decision Trees
Description:
Decision Trees are a common learning algorithm and a decision representation tool. Decision trees had been developed by many researchers in many areas even before this paper, but this paper is one of the most influential in the field.
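The splitting criterion behind Quinlan's ID3, the decision-tree learner this paper describes, is information gain: the reduction in label entropy achieved by splitting on an attribute. A minimal sketch (the toy weather-style attributes are invented for illustration):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction from splitting `rows` on attribute index `attr`."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy data: attribute 0 perfectly predicts the label, attribute 1 does not.
rows = [("sunny", "hot"), ("sunny", "cold"), ("rainy", "hot"), ("rainy", "cold")]
labels = ["yes", "yes", "no", "no"]
```

ID3 greedily picks the attribute with the highest gain at each node; here attribute 0 yields a gain of one full bit and attribute 1 yields none.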
Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm
Description: One of the papers that started the field of on-line learning. In this learning setting, a learner receives a sequence of examples, making predictions after each one, and receiving feedback after each prediction. Research in this area is remarkable because (1) the algorithms and proofs tend to be very simple and beautiful, and (2) the model makes no statistical assumptions about the data. In other words, the data need not be random (as in nearly all other learning models), but can be chosen arbitrarily by "nature" or even an adversary. Specifically, this paper introduced the
winnow algorithm.
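The winnow update rule itself fits in a few lines. This is a sketch of the Winnow2 variant (multiplicative promotion on false negatives, demotion on false positives); the threshold choice and the toy disjunction target are illustrative, not from the paper:

```python
import itertools

def winnow(examples, n, threshold=None):
    """Winnow2-style online learner for 0/1 feature vectors of length n.

    Predict 1 when the weighted sum reaches the threshold; on a mistake,
    multiplicatively promote (y=1) or demote (y=0) the active weights.
    Returns the final weights and the number of mistakes made.
    """
    theta = threshold if threshold is not None else float(n)
    w = [1.0] * n
    mistakes = 0
    for x, y in examples:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
        if pred != y:
            mistakes += 1
            factor = 2.0 if y == 1 else 0.5
            w = [wi * factor if xi else wi for wi, xi in zip(w, x)]
    return w, mistakes

# Target concept: x[0] OR x[1]; the other six attributes are irrelevant.
data = [(x, 1 if x[0] or x[1] else 0)
        for x in itertools.product([0, 1], repeat=8)]
w, mistakes = winnow(data * 100, 8)
```

The mistake bound grows only logarithmically with the number of irrelevant attributes, which is the "irrelevant attributes abound" point of the title.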
Learning to predict by the method of Temporal difference
Learnability and the Vapnik–Chervonenkis dimension
Cryptographic limitations on learning boolean formulae and finite automata
The strength of weak learnability
Description: Proved that weak and strong learnability are equivalent in the noise-free
PAC framework. The proof introduced the
boosting method.
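The boosting idea introduced here was later refined into practical algorithms such as AdaBoost. The following sketch (AdaBoost over decision stumps, with an invented 1-D toy dataset; not the construction from the paper itself) shows the core move of reweighting examples so each weak learner focuses on the previous ones' mistakes:

```python
import math

def stumps(xs):
    """All 1-D decision stumps: h(x) = s if x <= t else -s."""
    ts = sorted(set(xs))
    return [(t, s) for t in [min(ts) - 1] + [a + 0.5 for a in ts]
            for s in (1, -1)]

def adaboost(xs, ys, rounds):
    """AdaBoost: keep a distribution over examples, pick the stump with
    minimum weighted error, and up-weight the examples it got wrong."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []  # list of (alpha, t, s)
    for _ in range(rounds):
        best = None
        for t, s in stumps(xs):
            eps = sum(wi for wi, x, y in zip(w, xs, ys)
                      if (s if x <= t else -s) != y)
            if best is None or eps < best[0]:
                best = (eps, t, s)
        eps, t, s = best
        if eps <= 0 or eps >= 0.5:
            break
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, t, s))
        w = [wi * math.exp(-alpha * y * (s if x <= t else -s))
             for wi, x, y in zip(w, xs, ys)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * (s if x <= t else -s) for a, t, s in ensemble)
    return 1 if score >= 0 else -1

# Toy 1-D data no single stump can fit: +1, then -1, then +1 again.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, -1, 1, 1, 1]
model = adaboost(xs, ys, rounds=3)
```

Each stump alone misclassifies at least three of the ten points, yet three boosted stumps classify the training set perfectly.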
Learning in the presence of malicious errors
Description: Proved possibility and impossibility results in the malicious-errors framework.
A training algorithm for optimum margin classifiers
Description: This paper presented
support vector machines, a practical and popular machine learning algorithm. Support vector machines use the
kernel trick, a widely used method.
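The kernel trick can be illustrated without any SVM machinery: a degree-2 polynomial kernel evaluates, in the original input space, exactly the inner product that an explicit quadratic feature map would produce. A minimal sketch (the test vectors are arbitrary):

```python
import math

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed without forming features."""
    return (sum(a * b for a, b in zip(x, z)) + 1) ** 2

def phi(x):
    """The explicit quadratic feature map the kernel implicitly uses."""
    feats = [1.0]
    feats += [math.sqrt(2) * a for a in x]          # linear terms
    feats += [a * a for a in x]                     # squared terms
    feats += [math.sqrt(2) * x[i] * x[j]           # cross terms
              for i in range(len(x)) for j in range(i + 1, len(x))]
    return feats

x, z = [1.0, 2.0, 3.0], [0.5, -1.0, 2.0]
lhs = poly_kernel(x, z)                              # O(d) work
rhs = sum(a * b for a, b in zip(phi(x), phi(z)))     # O(d^2) work
```

The two quantities agree, which is why an SVM can learn a nonlinear decision boundary while only ever computing kernel values.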
Knowledge-based analysis of microarray gene expression data by using support vector machines
Description: The first application of supervised learning to
gene expression data, in particular using
Support Vector Machines. The method is now standard, and the paper is one of the most cited in the area.
Collaborative networks
- Camarinha-Matos, L. M.; Afsarmanesh, H. (2005). Collaborative networks: A new scientific discipline, J. Intelligent Manufacturing, vol. 16, no. 4–5, pp. 439–452.
- Camarinha-Matos, L. M.; Afsarmanesh, H. (2008). Collaborative Networks: Reference Modeling, Springer: New York.
Compilers
On the translation of languages from left to right
Semantics of Context-Free Languages.
A program data flow analysis procedure
Description: From the abstract: "The global data relationships in a program can be exposed and codified by the static analysis methods described in this paper. A procedure is given which determines all the definitions which can possibly reach each node of the control flow graph of the program and all the definitions that are live on each edge of the graph."
A Unified Approach to Global Program Optimization
Description: Formalized the concept of
data-flow analysis as
fixpoint computation over
lattices, and showed that most static analyses used for program optimization can be uniformly expressed within this framework.
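A minimal sketch of data-flow analysis as a fixpoint computation, using reaching definitions (as in the abstract quoted above) on an invented three-block CFG with a loop; the framework of this paper covers many such analyses uniformly:

```python
def reaching_definitions(cfg, gen, kill):
    """Iterate the reaching-definitions equations to a fixpoint.

    cfg maps each node to its successors; gen/kill map each node to the
    definitions it generates/kills.  The equations are
        OUT[n] = gen[n] | (IN[n] - kill[n]),  IN[n] = union of OUT[preds].
    """
    nodes = list(cfg)
    preds = {n: [m for m in nodes if n in cfg[m]] for n in nodes}
    out = {n: set() for n in nodes}
    changed = True
    while changed:  # transfer functions are monotone on a finite lattice,
        changed = False  # so this iteration must terminate
        for n in nodes:
            in_n = set().union(*[out[p] for p in preds[n]]) if preds[n] else set()
            new_out = gen[n] | (in_n - kill[n])
            if new_out != out[n]:
                out[n] = new_out
                changed = True
    return out

# Tiny CFG: B1 -> B2 -> B3, plus a loop edge B3 -> B2.
cfg  = {"B1": ["B2"], "B2": ["B3"], "B3": ["B2"]}
gen  = {"B1": {"d1: x=1"}, "B2": {"d2: x=x+1"}, "B3": set()}
kill = {"B1": {"d2: x=x+1"}, "B2": {"d1: x=1"}, "B3": set()}
out = reaching_definitions(cfg, gen, kill)
```

At the fixpoint, only the loop definition `d2` reaches the exits of B2 and B3, since it kills `d1` on every path through the loop.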
YACC: Yet another compiler-compiler
Description:
Yacc is a tool that made
compiler writing much easier.
gprof: A Call Graph Execution Profiler
Compilers: Principles, Techniques and Tools
Description: This book became a classic in compiler writing. It is also known as the
Dragon book, after the (red) dragon that appears on its cover.
Computer architecture
Colossus computer
Description: The
Colossus machines were early computing devices used by British
codebreakers to break German messages encrypted with the
Lorenz Cipher during
World War II. Colossus was an early
binary electronic digital
computer. The design of Colossus was later described in the referenced paper.
First Draft of a Report on the EDVAC[2]
Description: It contains the first published description of the logical design of a computer using the stored-program concept, which has come to be known as the
von Neumann architecture.
Architecture of the IBM System/360
The case for the reduced instruction set computer
The CRAY-1 Computer System
Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities
A Case for Redundant Arrays of Inexpensive Disks (RAID)
Description: This paper discusses the concept of
RAID disks, outlines the different levels of RAID, and the benefits of each level. It is a good paper for discussing issues of reliability and fault tolerance of computer systems, and the cost of providing such fault-tolerance.
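The redundancy mechanism of the parity-based RAID levels (4 and 5) can be sketched with XOR parity: one parity block lets any single lost data block be rebuilt from the survivors. The disk contents below are invented:

```python
def parity(blocks):
    """XOR parity of equal-length data blocks (as in RAID levels 4/5)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

def reconstruct(surviving, parity_block):
    """Rebuild the single lost block: XOR of survivors and parity."""
    return parity(surviving + [parity_block])

data = [b"disk0AAA", b"disk1BBB", b"disk2CCC"]
p = parity(data)
# Simulate losing disk 1 and rebuilding it from the other disks + parity.
rebuilt = reconstruct([data[0], data[2]], p)
```

This works because XOR is its own inverse: XORing the parity with all surviving blocks cancels them out, leaving the missing block.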
The case for a single-chip multiprocessor
Description: This paper argues that the approach taken to improving the performance of processors by adding multiple instruction issue and out-of-order execution cannot continue to provide speedups indefinitely. It lays out the case for making single chip processors that contain multiple "cores". With the mainstream introduction of multicore processors by
Intel in 2005, and their subsequent domination of the market, this paper was shown to be prescient.
Computer graphics
The Rendering Equation
- J. Kajiya
- SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques, pp. 143–150 [3]
Elastically deformable models
- D. Terzopoulos, J. Platt, A. Barr, K. Fleischer
- Computer Graphics, 21(4), 1987, 205–214, Proc. ACM SIGGRAPH'87 Conference, Anaheim, CA, July 1987.
- Online version (PDF)
Description: The Academy of Motion Picture Arts and Sciences cited this paper as a "milestone in computer graphics".
Computer vision
The Phase Correlation Image Alignment Method
- C.D. Kuglin and D.C. Hines
- IEEE 1975 Conference on Cybernetics and Society, 1975, New York, pp. 163–165, September
Determining Optical Flow
Description: A method for estimating the image motion of world points between two frames of a video sequence.
An Iterative Image Registration Technique with an Application to Stereo Vision
Description: This paper describes an efficient technique for image registration.
The Laplacian Pyramid as a compact image code
Description: A technique for image encoding using local operators of many scales.
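A minimal 1-D sketch of the pyramid construction: each level stores the detail lost when the signal is coarsened, and the pyramid inverts exactly. (The paper's 2-D version uses a smoothing filter; plain pairwise averaging and nearest-neighbour upsampling stand in for it here, and the sample signal is invented.)

```python
def build_pyramid(signal):
    """1-D Laplacian pyramid: detail levels plus one coarsest level."""
    levels = []
    g = list(signal)
    while len(g) > 1:
        coarse = [(g[i] + g[i + 1]) / 2 for i in range(0, len(g), 2)]
        up = [c for c in coarse for _ in (0, 1)]       # nearest-neighbour upsample
        levels.append([a - b for a, b in zip(g, up)])  # Laplacian (detail) level
        g = coarse
    levels.append(g)                                   # coarsest level
    return levels

def reconstruct(levels):
    """Invert the pyramid by adding details back, coarse to fine."""
    g = levels[-1]
    for lap in reversed(levels[:-1]):
        up = [c for c in g for _ in (0, 1)]
        g = [d + u for d, u in zip(lap, up)]
    return g

signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
pyr = build_pyramid(signal)
```

The detail levels are mostly near zero for smooth signals, which is what makes the pyramid a compact code.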
Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images
Description: Introduced (1)
MRFs for image analysis, and (2)
Gibbs sampling, which revolutionized computational
Bayesian statistics and thus had a paramount impact on many other fields in addition to computer vision.
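A minimal Gibbs-sampling sketch on an invented two-variable binary distribution (a stand-in for an MRF over two pixels): each step resamples one variable from its exact conditional given the other, and the empirical marginals approach the true ones:

```python
import random

# Unnormalized joint distribution over two binary variables (x, y);
# the true marginal is p(x=1) = (3 + 4) / 10 = 0.7.
weight = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 4.0}

def gibbs(n_samples, seed=0):
    """Alternately resample x | y and y | x from their exact conditionals."""
    rng = random.Random(seed)
    x, y = 0, 0
    samples = []
    for _ in range(n_samples):
        p1 = weight[(1, y)] / (weight[(0, y)] + weight[(1, y)])  # p(x=1 | y)
        x = 1 if rng.random() < p1 else 0
        p1 = weight[(x, 1)] / (weight[(x, 0)] + weight[(x, 1)])  # p(y=1 | x)
        y = 1 if rng.random() < p1 else 0
        samples.append((x, y))
    return samples

samples = gibbs(100_000)
freq_x1 = sum(x for x, _ in samples) / len(samples)  # should approach 0.7
```

In the paper the same idea is applied to a full image-sized MRF, where each pixel is resampled given its neighbours.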
Snakes: Active contour models
Description: An interactive variational technique for image segmentation and visual tracking.
Condensation – conditional density propagation for visual tracking
Object recognition from local scale-invariant features
Concurrent, parallel, and distributed computing
Databases
A relational model for large shared data banks
Description: This paper introduced the relational model for databases, which became the dominant database model.
Binary B-Trees for Virtual Memory
- Rudolf Bayer
- ACM-SIGFIDET Workshop 1971, San Diego, California, Session 5B, p. 219–235.
Relational Completeness of Data Base Sublanguages
- E. F. Codd
- In: R. Rustin (ed.): Database Systems: 65-98, Prentice Hall and IBM Research Report RJ 987, San Jose, California : (1972)
- Online version (PDF)
Description: Defined relational completeness as a measure of the expressive power of database sublanguages.
The Entity Relationship Model – Towards a Unified View of Data
SEQUEL: A structured English query language
- Donald D. Chamberlin, Raymond F. Boyce
- International Conference on Management of Data, Proceedings of the 1974 ACM SIGFIDET (now SIGMOD) workshop on Data description, access and control, Ann Arbor, Michigan, pp. 249–264
Description: This paper introduced the
SQL language.
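The flavor of the query language that grew out of SEQUEL can be shown end to end with Python's built-in SQLite driver; the table and rows below are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (title TEXT, year INTEGER, area TEXT)")
conn.executemany(
    "INSERT INTO papers VALUES (?, ?, ?)",
    [("SEQUEL: A structured English query language", 1974, "databases"),
     ("A relational model for large shared data banks", 1970, "databases"),
     ("Go To Statement Considered Harmful", 1968, "software engineering")],
)
# A declarative query: say *what* rows are wanted, not how to fetch them.
rows = conn.execute(
    "SELECT title FROM papers WHERE area = ? ORDER BY year", ("databases",)
).fetchall()
```

The SELECT / FROM / WHERE structure, the "structured English" of the title, survives essentially unchanged in modern SQL.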
The notions of consistency and predicate locks in a database system
Description: This paper defined the concepts of
transaction,
consistency and schedule. It also argued that a transaction needs to lock a logical rather than a physical subset of the database.
Federated database systems for managing distributed, heterogeneous, and autonomous databases
- Amit Sheth, J. A. Larson
- ACM Computing Surveys (CSUR), special issue on heterogeneous databases, vol. 22, no. 3, pp. 183–236, Sept. 1990
- ACM source
Description: Introduced the concept of federated database systems, which had a huge impact on data interoperability and the integration of heterogeneous data sources.
Mining association rules between sets of items in large databases
History of computation
The Computer from Pascal to von Neumann
Description: Perhaps the first book on the history of computation.
A History of Computing in the Twentieth Century
edited by:
Description: Several chapters by pioneers of computing.
Information retrieval
A Vector Space Model for Automatic Indexing
Extended Boolean Information Retrieval
Networks and security
Operating systems
An experimental timesharing system.
Description: This paper discusses
time-sharing as a method of sharing computer resources. This idea changed how people interact with computer systems.
The Working Set Model for Program Behavior
Virtual Memory, Processes, and Sharing in MULTICS
Description: The classic paper on
Multics, the most ambitious operating system in the early history of computing. Difficult reading, but it describes the implications of trying to build a system that takes information sharing to its logical extreme. Most operating systems since Multics have incorporated a subset of its facilities.
A note on the confinement problem
Description: This paper addresses issues in constraining the flow of information from untrusted programs. It discusses covert channels, but more importantly it addresses the difficulty in obtaining full confinement without making the program itself effectively unusable. The ideas are important when trying to understand containment of malicious code, as well as aspects of trusted computing.
The UNIX Time-Sharing System
Description: The
Unix operating system and its principles were described in this paper. Its main importance lies not in the paper itself but in the operating system, which had a tremendous effect on operating systems and computer technology.
Weighted voting for replicated data
Description: This paper describes the consistency mechanism known as quorum consensus. It is a good example of algorithms that provide a continuous set of options between two alternatives (in this case, between the read-one write-all, and the write-one read-all consistency methods). There have been many variations and improvements by researchers in the years that followed, and it is one of the consistency algorithms that should be understood by all. The options available by choosing different size quorums provide a useful structure for discussing of the core requirements for consistency in distributed systems.
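The two overlap constraints at the heart of weighted voting, and a quorum read that picks the newest version, fit in a short sketch; the vote counts and replica contents below are invented:

```python
def quorum_ok(votes, r, w):
    """Check the weighted-voting constraints: every read quorum must
    overlap every write quorum (r + w > total votes), and any two write
    quorums must overlap (2w > total votes)."""
    total = sum(votes.values())
    return r + w > total and 2 * w > total

def read_newest(replicas, quorum):
    """Return the value with the highest version number in a read quorum;
    quorum overlap guarantees the latest committed write is represented."""
    version, value = max(replicas[r] for r in quorum)
    return value

votes = {"a": 1, "b": 1, "c": 1}  # three replicas, one vote each
# Replica "b" missed the latest write (version 2).
replicas = {"a": (2, "new"), "b": (1, "old"), "c": (2, "new")}
```

Sliding r and w along the constraint r + w > n is exactly the continuum of options the description mentions: small r favours cheap reads, small w favours cheap writes.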
Experiences with Processes and Monitors in Mesa
Description: This is the classic paper on synchronization techniques, including both alternate approaches and pitfalls.
Scheduling Techniques for Concurrent Systems
Description: Algorithms for
coscheduling of related processes are presented.
A Fast File System for UNIX
Description: The
file system of
UNIX. One of the first papers discussing how to manage disk storage for high-performance file systems. Most file-system research since this paper has been influenced by it, and most high-performance file systems of the last 20 years incorporate techniques from this paper.
The Design and Implementation of a Log-Structured File System
Microkernel operating system architecture and Mach
- David L. Black, David B. Golub, Daniel P. Julin, Richard F. Rashid, Richard P. Draves, Randall W. Dean, Alessandro Forin, Joseph Barrera, Hideyuki Tokuda, Gerald Malan, David Bohman
- Proceedings of the USENIX Workshop on Microkernels and Other Kernel Architectures, pages 11–30, April 1992.
Description: This is a good paper discussing one particular
microkernel architecture and contrasting it with monolithic kernel design. Mach underlies
Mac OS X, and its layered architecture had a significant impact on the design of the
Windows NT kernel and modern microkernels like
L4. In addition, its memory-mapped files feature was added to many monolithic kernels.
An Implementation of a Log-Structured File System for UNIX
Description: This paper describes the first production-quality implementation of the log-structured idea, which spawned much additional discussion of the viability and shortcomings of log-structured file systems. While "The Design and Implementation of a Log-Structured File System" was certainly the first, this one was important in bringing the research idea to a usable system.
Soft Updates: A Solution to the Metadata Update problem in File Systems
Description: A new way of maintaining filesystem consistency.
Programming languages
The FORTRAN Automatic Coding System[4]
Recursive functions of symbolic expressions and their computation by machine, part I[5]
Description: This paper introduced
LISP, the first
functional programming language, which was used heavily in many areas of computer science, especially in
AI. LISP also has powerful features for manipulating LISP programs within the language.
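LISP's code-as-data property can be hinted at in a few lines of Python: a toy reader turns an s-expression into nested lists, and a toy evaluator then walks that structure. (This sketch supports only integers and a few binary operators; it is an illustration of the idea, not of the paper's full eval/apply definition.)

```python
import operator

def parse(src):
    """Read one s-expression into nested Python lists (code as data)."""
    tokens = src.replace("(", " ( ").replace(")", " ) ").split()
    def read(pos):
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        tok = tokens[pos]
        try:
            return int(tok), pos + 1
        except ValueError:
            return tok, pos + 1
    expr, _ = read(0)
    return expr

ENV = {"+": operator.add, "*": operator.mul, "-": operator.sub}

def evaluate(expr, env=ENV):
    """Evaluate the nested-list form: atoms look themselves up,
    lists apply their head to their evaluated arguments."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]
    fn = evaluate(expr[0], env)
    return fn(*(evaluate(arg, env) for arg in expr[1:]))

program = "(+ (* 2 3) (- 10 4))"
```

Because the parsed program is an ordinary list, a LISP program can inspect and transform other programs as easily as any other data, which is the property the description alludes to.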
ALGOL 60
Description: Algol 60 introduced block structure.
Pascal
The next 700 programming languages[5]
Description: This seminal paper proposed an ideal language,
ISWIM, which, although never implemented, influenced the whole later development of programming languages.
Fundamental Concepts in Programming Languages
Lambda Papers
Description: This series of papers and reports first defined the influential
Scheme programming language and questioned the prevailing practices in programming language design, employing
lambda calculus extensively to model programming language concepts and guide efficient implementation without sacrificing
expressive power.
Structure and Interpretation of Computer Programs
Description: This textbook explains core computer programming concepts, and is widely considered a classic text in computer science.
The C Programming Language
Description: Co-authored by the man who designed the
C programming language, the first edition of this book served for many years as the language's de facto standard. As such, the book is regarded by many to be the authoritative reference on C.
The C++ Programming Language
Description: Written by the man who designed the
C++ programming language, the first edition of this book served for many years as the language's de facto standard until the publication of the ISO/IEC 14882:1998: Programming Language C++ standard on 1 September 1998.
The Java Programming Language
Scientific computing
Computational linguistics
- Booth, T. L. (1969). "Probabilistic representation of formal languages". IEEE Conference Record of the 1969 Tenth Annual Symposium on Switching and Automata Theory. pp. 74–81.
- Contains the first presentation of stochastic context-free grammars.
- The first published description of computational morphology using finite state transducers. (Kaplan and Kay had previously done work in this field and presented this at a conference; the linguist Johnson had remarked the possibility in 1972, but not produced any implementation.)
- Rabiner, Lawrence R. (1989). "A tutorial on hidden Markov models and selected applications in speech recognition". Proceedings of the IEEE 77 (2): 257–286.
- An overview of hidden Markov models geared toward speech recognition and other NLP fields, describing the Viterbi and forward-backward algorithms.
- Brill, Eric (1995). "Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging". Computational Linguistics 21 (4): 543–566.
- Describes a now commonly used POS tagger based on transformation-based learning.
- Textbook on statistical and probabilistic methods in NLP.
- This survey documents the relatively under-researched importance of lazy functional programming languages (e.g. Haskell) for constructing natural language processors and for accommodating many linguistic theories.
Software engineering
Description: Report on a conference of leading figures in the software field, c. 1968. The report defined the field of
Software engineering.
Go To Statement Considered Harmful[5]
On the criteria to be used in decomposing systems into modules
Description: The importance of modularization and
information hiding. Note that information hiding was first presented in a different paper of the same author – "Information Distributions Aspects of Design Methodology", Proceedings of IFIP Congress '71, 1971, Booklet TA-3, pp. 26–30
Hierarchical Program Structures
Description: The beginning of
Object-oriented programming. This paper argued that programs should be decomposed into independent components with small and simple interfaces. It also argued that objects should have both data and related methods.
A technique for software module specification with examples
Structured Design
The Emperor's Old Clothes
Description: A lovely story of how large software projects can go right, and then wrong, and then right again, told with humility and humor. Illustrates the "
second-system effect" and the importance of simplicity.
The Mythical Man-Month: Essays on Software Engineering
Description: Throwing more people at the task will not speed its completion...
No Silver Bullet: Essence and Accidents of Software Engineering
Description: We will keep having problems with software...
The Cathedral and the Bazaar
Design Patterns: Elements of Reusable Object Oriented Software
Description: This book was the first to define and list
design patterns in computer science.
Statecharts: A Visual Formalism For Complex Systems
- David Harel
- D. Harel. Statecharts: A visual formalism for complex systems. Science of Computer Programming, 8:231–274, 1987.
- Online version
Description:
Statecharts are a visual modeling method. They are an extension of
state machines that can be exponentially more compact. Statecharts therefore enable formal modeling of applications that were previously too complex. Statecharts are part of the
UML diagrams.
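The hierarchy that makes statecharts more compact than flat state machines can be sketched directly: a transition declared on a superstate applies to all of its substates, so common behaviour is written once rather than once per substate. The media-player example below is invented:

```python
# (state, event) -> next state
TRANSITIONS = {
    ("on", "POWER"): "off",        # declared once on superstate "on";
    ("playing", "PAUSE"): "paused",  # covers both "playing" and "paused"
    ("paused", "PAUSE"): "playing",
    ("off", "POWER"): "playing",
}
# Substate -> superstate (None for top-level states).
PARENT = {"playing": "on", "paused": "on", "on": None, "off": None}

def fire(state, event):
    """Walk from the current state up through its ancestors until some
    state in the hierarchy defines a transition for the event."""
    s = state
    while s is not None:
        if (s, event) in TRANSITIONS:
            return TRANSITIONS[(s, event)]
        s = PARENT[s]
    return state  # no handler anywhere: event is ignored

state = "off"
for ev in ["POWER", "PAUSE", "POWER"]:
    state = fire(state, ev)
```

A flat machine would need a separate POWER transition out of every substate of "on"; with deep nesting that duplication grows quickly, which is the compactness argument above.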
Theoretical computer science
See also
References
- Laplante, Phillip, ed. (1996). Great papers in computer science. New York: IEEE Press. ISBN 0-314-06365-X.
- Randell, Brian (ed). (1982). The Origins of Digital Computers: Selected Papers. 3rd ed. Berlin: Springer-Verlag. ISBN 0-387-11319-3.
- Turning Points in Computing: 1962–1999, Special Issue, IBM Systems Journal, 38 (2/3), 1999.
- Yourdon, Edward (ed.) (1979). Classics in Software Engineering. New York: Yourdon Press. ISBN 0-917072-14-6.
External links
Academic Search Engines