
A compiler takes as input a source program and produces as output an equivalent sequence of machine instructions.

This process is so complex that it is divided into a series of sub-processes called phases. The different phases of the compiler are as follows:

Phase 1: Lexical Analyzer or Scanner

The first phase of the compiler, called the lexical analyzer or scanner, reads the source program one character at a time, carving it into a sequence of atomic units called tokens. The usual tokens are identifiers, keywords, constants, operators, and punctuation symbols such as commas and parentheses. Each token is a substring of the source program that is to be treated as a single unit. The lexical analyzer examines successive characters in the source program, starting from the first character not yet grouped into a token. It may need to look many characters beyond the end of the next token in order to determine what that token actually is.

[Figure: Phases of a Compiler]

Phase 2: Syntax Analyzer or Parser

The second phase of the compiler, called the syntax analyzer or parser, receives the stream of tokens produced by the lexical analyzer. The syntax analyzer groups tokens together into syntactic structures called expressions. Expressions may further be combined to form statements. The syntactic structure can be regarded as a tree whose leaves are the tokens; such trees are called parse trees.

The parser has two functions:

i) Firstly, it checks that the tokens from the lexical analyzer occur in patterns that are permitted by the specification for the source language. It also imposes on the tokens a tree-like structure that is used by the subsequent phases of the compiler.
ii) Secondly, it makes explicit the hierarchical structure of the incoming token stream by identifying which parts of the token stream should be grouped.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
2007-2008, EVEN SEMESTER
PRINCIPLES OF COMPILER DESIGN - CS1352
TWO-MARK QUESTIONS

1. What does translator mean?
A translator is a program that takes as input a program in one programming language (the source language) and produces as output a program in another language (the object language or target language).

2. What are the phases of a compiler?
o Lexical analysis phase or scanning phase
o Syntax analysis phase
o Intermediate code generation
o Code optimization
o Code generation

3. What is the role of the lexical analysis phase?
The lexical analyzer reads the source program one character at a time and groups the characters into a sequence of atomic units called tokens. Identifiers, keywords, constants, operators, and punctuation symbols such as commas and parentheses are typical tokens.

4. Define lexeme.
The character sequence forming a token is called the lexeme for the token.

5. What are the two functions of the parser?
o It checks the tokens appearing in its input, which is the output of the lexical analyzer.
o It groups the tokens of the source program into grammatical phrases that are used by the compiler to synthesize the output. Usually the grammatical phrases of the source program are represented by a tree-like structure called a parse tree.

6. Mention the role of semantic analysis.
Semantic analysis checks the source program for semantic errors and gathers information for the subsequent code-generation phase. It uses the hierarchical structure to identify the operators and operands of expressions and statements. An important component of semantic analysis is type checking. In type checking, the compiler checks that each operator has operands that are permitted by the source language specification. When the operand types do not match, certain programming languages also support operand coercion, or type coercion.

7. Name some varieties of intermediate forms.
o Postfix notation or Polish notation
o Syntax tree
o Three-address code
o Quadruple
o Triple

Programming languages are notations for describing computations to people and to machines. The world as we know it depends on programming languages, because all the software running on all the computers was written in some programming language. But, before a program can be run, it first must be translated into a form in which it can be executed by a computer. The software systems that do this translation are called compilers.

This book is about how to design and implement compilers. We shall discover that a few basic ideas can be used to construct translators for a wide variety of languages and machines. Besides compilers, the principles and techniques for compiler design are applicable to so many other domains that they are likely to be reused many times in the career of a computer scientist. The study of compiler writing touches upon programming languages, machine architecture, language theory, algorithms, and software engineering.

In this preliminary chapter, we introduce the different forms of language translators, give a high-level overview of the structure of a typical compiler, and discuss the trends in programming languages and machine architecture that are shaping compilers. We include some observations on the relationship between compiler design and computer-science theory and an outline of the applications of compiler technology that go beyond compilation. We end with a brief outline of key programming-language concepts that will be needed for our study of compilers.

1.1 Language Processors

Simply stated, a compiler is a program that can read a program in one language, the source language, and translate it into an equivalent program in another language, the target language; see Fig. 1.1. An important role of the compiler is to report any errors in the source program that it detects during the translation process.

source program -> Compiler -> target program

Figure 1.1: A compiler

If the target program is an executable machine-language program, it can then be called by the user to process inputs and produce outputs; see Fig. 1.2.

input -> Target Program -> output

Figure 1.2: Running the target program

Interpreter

An interpreter is another common kind of language processor. Instead of producing a target program as a translation, an interpreter appears to directly execute the operations specified in the source program on inputs supplied by the user, as shown in Fig. 1.3.

(source program + input) -> Interpreter -> output

Figure 1.3: An interpreter

Difference Between Compiler and Interpreter

The machine-language target program produced by a compiler is usually much faster than an interpreter at mapping inputs to outputs. An interpreter, however, can usually give better error diagnostics than a compiler, because it executes the source program statement by statement.

o A compiler translates high-level instructions into machine language, whereas an interpreter typically translates them into an intermediate code.
o A compiler translates the entire program at once, whereas an interpreter processes the program one line at a time.
o A compiler reports a list of the errors it finds during translation, whereas an interpreter stops as soon as it finds an error and continues with the remaining lines only after that error has been corrected.
o A compiler generates a standalone executable file, whereas an interpreted program always needs the interpreter in order to run.

Example 1.1: Java language processors combine compilation and interpretation, as shown in Fig. 1.4. A Java source program may first be compiled into an intermediate form called bytecodes. The bytecodes are then interpreted by a virtual machine. A benefit of this arrangement is that bytecodes compiled on one machine can be interpreted on another machine, perhaps across a network. In order to achieve faster processing of inputs to outputs, some Java compilers, called just-in-time compilers, translate the bytecodes into machine language immediately before they run the intermediate program to process the input.
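The hybrid arrangement of Example 1.1 can be illustrated in miniature. The sketch below is not how Java works internally; it is a toy, written in Python, that assumes a tiny arithmetic-expression language and a made-up stack-machine instruction set (PUSH, LOAD, ADD, SUB, MUL, DIV). All function names (tokenize, compile_expr, run) are hypothetical. A small compiler front end (lexical analysis followed by syntax analysis) translates the source text once into bytecodes, and a separate virtual machine then interprets those bytecodes on whatever inputs the user supplies.

```python
import re


def tokenize(source):
    """Lexical analysis: carve the source string into (kind, text) tokens."""
    tokens = []
    for match in re.finditer(r"(\d+)|([A-Za-z_]\w*)|(\S)", source):
        number, name, symbol = match.groups()
        if number is not None:
            tokens.append(("NUM", number))
        elif name is not None:
            tokens.append(("ID", name))
        elif symbol in "+-*/()":
            tokens.append(("OP", symbol))
        else:
            raise SyntaxError(f"unexpected character {symbol!r} at {match.start()}")
    tokens.append(("EOF", ""))
    return tokens


def compile_expr(source):
    """Syntax analysis plus code generation: source text -> toy bytecodes."""
    tokens = tokenize(source)
    pos = 0
    code = []

    def peek():
        return tokens[pos]

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():  # factor -> NUM | ID | "(" expr ")"
        kind, text = advance()
        if kind == "NUM":
            code.append(("PUSH", int(text)))
        elif kind == "ID":
            code.append(("LOAD", text))
        elif text == "(":
            expr()
            if advance()[1] != ")":
                raise SyntaxError("expected ')'")
        else:
            raise SyntaxError(f"unexpected token {text!r}")

    def term():  # term -> factor (("*" | "/") factor)*
        factor()
        while peek()[1] in ("*", "/"):
            op = advance()[1]
            factor()
            code.append(("MUL",) if op == "*" else ("DIV",))

    def expr():  # expr -> term (("+" | "-") term)*
        term()
        while peek()[1] in ("+", "-"):
            op = advance()[1]
            term()
            code.append(("ADD",) if op == "+" else ("SUB",))

    expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return code


def run(code, inputs):
    """Virtual machine: interpret the toy bytecodes on the given inputs."""
    stack = []
    for instr in code:
        op = instr[0]
        if op == "PUSH":
            stack.append(instr[1])
        elif op == "LOAD":
            stack.append(inputs[instr[1]])
        else:
            right = stack.pop()
            left = stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            elif op == "MUL":
                stack.append(left * right)
            else:  # DIV
                stack.append(left / right)
    return stack.pop()


if __name__ == "__main__":
    bytecodes = compile_expr("a + b * 2")    # translated once
    print(run(bytecodes, {"a": 1, "b": 3}))   # 7
    print(run(bytecodes, {"a": 10, "b": 5}))  # 20
```

The translation cost is paid once in compile_expr, while run can be called repeatedly on different inputs. Syntax errors such as "a + * b" are reported at translation time, before anything is executed.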

In addition to a compiler, several other programs may be required to create an executable target program, as shown in Fig. 1.5. A source program may be divided into modules stored in separate files. The task of collecting the source program is sometimes entrusted to a separate program, called a preprocessor. The preprocessor may also expand shorthands, called macros, into source language statements.

The modified source program is then fed to a compiler. The compiler may produce an assembly-language program as its output, because assembly language is easier to produce as output and is easier to debug. The assembly language is then processed by a program called an assembler that produces relocatable machine code as its output.

Large programs are often compiled in pieces, so the relocatable machine code may have to be linked together with other relocatable object files and library files into the code that actually runs on the machine. The linker resolves external memory addresses, where the code in one file may refer to a location in another file. The loader then puts together all of the executable object files into memory for execution.
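The linker's job of resolving external addresses can be sketched with a toy model. Everything here is an assumption made for illustration: real object-file formats are far richer, and the module layout used below (a "code" list of words, a "defines" map from symbol name to offset, and "@name" strings standing for unresolved references) is invented, as is the function name link. The sketch lays the modules out one after another, records the final address of every defined symbol, and then replaces each symbolic reference with that address.

```python
def link(modules):
    """Combine toy relocatable modules into one image, resolving "@name" references."""
    symbol_table = {}
    base = 0
    # Pass 1: give each module a base address and record where its symbols land.
    for module in modules:
        for name, offset in module["defines"].items():
            if name in symbol_table:
                raise ValueError(f"duplicate symbol {name!r}")
            symbol_table[name] = base + offset
        base += len(module["code"])

    # Pass 2: copy the code, replacing symbolic references by absolute addresses.
    image = []
    for module in modules:
        for word in module["code"]:
            if isinstance(word, str) and word.startswith("@"):
                name = word[1:]
                if name not in symbol_table:
                    raise ValueError(f"undefined symbol {name!r}")
                image.append(symbol_table[name])
            else:
                image.append(word)
    return image, symbol_table


if __name__ == "__main__":
    # Two separately "compiled" pieces: main refers to square, defined elsewhere.
    main_module = {"code": ["CALL", "@square", "HALT"], "defines": {"main": 0}}
    lib_module = {"code": ["DUP", "MUL", "RET"], "defines": {"square": 0}}
    image, symbols = link([main_module, lib_module])
    print(image)    # ['CALL', 3, 'HALT', 'DUP', 'MUL', 'RET']
    print(symbols)  # {'main': 0, 'square': 3}
```

Two passes are used because a module may refer to a symbol that is defined in a module appearing later in the list, so all addresses must be known before any reference is patched.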