Compilers
Compilers were the first sort of translator program to be written. The idea is simple:
You write the program, and then hand it to the compiler which translates it. Then
you run the result.
The compiler takes the file that you have written and produces another file from it.
In the case of Pascal programs, for instance, you might write a program called
myProg.pas and the Pascal compiler would translate it into the file myProg.exe
which you could then run. If you tried to examine the contents of myProg.exe using,
say, a text editor, then it would just appear as gobbledygook. The compiler has
another task apart from translating your program. It also checks it to make sure
that it is grammatically correct. Only when it is sure that there are no grammatical
errors does it do the translation. Any errors that the compiler detects are called
compile-time errors or syntax errors. If it finds so much as one syntax error, it
reports the errors to you and does not produce a program that you can run.
[Figure: the C++ compiler reporting a whole list of errors]
Most "serious" languages are compiled, including Pascal, C++ and Ada.
Interpreters
An interpreter is also a program that translates a high-level language into a low-
level one, but it does it at the moment the program is run. You write the program
using a text editor or something similar, and then instruct the interpreter to run the
program. It takes the program, one line at a time, and translates each line before
running it: It translates the first line and runs it, then translates the second line and
runs it etc. The interpreter has no "memory" for the translated lines, so if it comes
across lines of the program within a loop, it must translate them afresh every time
that particular line runs. Consider this simple Basic program:
10 FOR COUNT = 1 TO 1000
20 PRINT COUNT * COUNT
30 NEXT COUNT
Line 20 of the program displays the square of the value stored in COUNT and this
line has to be carried out 1000 times. The interpreter must also translate that line
1000 times, which is clearly an inefficient process. However, interpreted languages
do have their uses, as we will see in a later section.
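The cost of re-translating a line inside a loop can be sketched in Python. This is only an illustration under stated assumptions: Python's built-in compile() and exec() stand in for the translator, and SOURCE_LINE and the two function names are hypothetical, not part of any BASIC interpreter.

```python
# A line of the loop body, standing in for the line that squares COUNT.
SOURCE_LINE = "square = count * count"


def run_retranslating(iterations):
    """Translate the line afresh on every pass, as a naive interpreter would."""
    translations = 0
    total = 0
    for count in range(1, iterations + 1):
        code = compile(SOURCE_LINE, "<loop body>", "exec")  # translated again
        translations += 1
        variables = {"count": count}
        exec(code, variables)
        total += variables["square"]
    return translations, total


def run_translating_once(iterations):
    """Translate the line a single time, then reuse the translated form."""
    code = compile(SOURCE_LINE, "<loop body>", "exec")  # translated once
    translations = 1
    total = 0
    for count in range(1, iterations + 1):
        variables = {"count": count}
        exec(code, variables)
        total += variables["square"]
    return translations, total


if __name__ == "__main__":
    print(run_retranslating(1000)[0])     # number of translations performed
    print(run_translating_once(1000)[0])  # number of translations performed
```

Both functions compute the same total; only the number of translations differs, which is the inefficiency described above.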
So which is better?
Well, that depends on how you want to write and run your program. The main
advantages of compilers are as follows:
They can spot syntax errors while the program is being compiled (i.e. you are
informed of any grammatical errors before you try to run the program). However,
this does not mean that a program that compiles correctly is error-free!
The main advantages of interpreters are as follows:
There is no lengthy "compile time", i.e. you do not have to wait between writing a
program and running it, for it to compile. As soon as you have written a program,
you can run it.
Interpreted programs tend to be more "portable", which means that they will run on a
greater variety of machines. This is because each machine can have its own
interpreter for that language. For instance, the version of the BASIC interpreter for
the PDP series computers is different from the QBasic program for personal
computers, as they run on different pieces of hardware, but programs written in
BASIC are identical from the user's point of view.
Some computer systems try to get the best of both worlds. For instance, when I was
at Durham, we programmed in Pascal on the old PDP-11 machines. Running a
Pascal program on those machines was a two-stage process. Firstly, we ran a
compiler program (called pc) which compiled the program to a low-level version,
and spotted any grammatical errors in the process. We then ran an interpreter
program which took the output of pc and ran it. The fact that pc produced
something that didn't have to run directly as machine code made the program more
portable. Different versions of the low-level interpreter could be written for different
machines in the PDP range, each taking as its input the same output from pc.
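The two-stage idea (compile once to a low-level form, then interpret that form) can be sketched as follows. This is a minimal illustration, not the actual pc toolchain: the instruction names PUSH, ADD and MUL, the tuple format and both function names are assumptions made for the example.

```python
def compile_expression(expr):
    """Stage 1: translate a nested expression into low-level stack instructions.

    An expression is either a number or a tuple (operation, left, right).
    """
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]
    op, left, right = expr
    if op not in ("ADD", "MUL"):
        raise ValueError(f"unknown operation: {op}")  # error caught while compiling
    return compile_expression(left) + compile_expression(right) + [(op, None)]


def run(code):
    """Stage 2: a small interpreter for the low-level instructions."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
    return stack.pop()


if __name__ == "__main__":
    low_level = compile_expression(("ADD", 2, ("MUL", 3, 4)))
    print(low_level)
    print(run(low_level))
```

Only run() would need rewriting for a different machine; the output of compile_expression() stays the same, which is where the portability comes from.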
How does a compiler work?
Compiling a program takes several stages of
processing, which I have outlined below. The principles which are explained below
also apply to interpreters, with the exception that the interpreters translate each
program one line at a time before running it, and then moving on to the next line.
[Figure: diagram summarising the stages of compilation]
Tokenising
This part of the process is also sometimes called Lexical Analysis. It involves turning
the program from a series of characters into a series of tokens that represent the
building blocks of a program. The tokens are keywords of the language (i.e.
important words such as if, print or repeat), variable names and mathematical
operators (+, *, brackets etc.).
The tokeniser takes each of the characters in turn, and as soon as it recognises a
legitimate token, it reports it to the next stage of the process. Each token has a
label and a type, so a variable "count" in a program would have the label
corresponding to its name ("count") and the type "variable name". The tokeniser is
also responsible for ignoring comments in the program - these are words and
phrases inserted purely for the benefit of any human reading the program, and they
have no function.
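A tokeniser of this kind can be sketched with a regular expression. The keyword set, the // comment syntax and the operator list below are assumptions chosen for the example, not the rules of any particular language.

```python
import re

KEYWORDS = {"if", "print", "repeat"}  # assumed keyword list for this sketch

TOKEN_PATTERN = re.compile(r"""
    (?P<comment>//[^\n]*)          # assumed comment syntax: // to end of line
  | (?P<number>\d+)
  | (?P<word>[A-Za-z_]\w*)
  | (?P<operator>\+\+|[+\-*/=();])
  | (?P<space>\s+)
""", re.VERBOSE)


def tokenise(text):
    """Turn a string of characters into a list of (label, type) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_PATTERN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        pos = match.end()
        kind = match.lastgroup
        if kind in ("comment", "space"):
            continue  # comments and spacing are ignored, not passed on
        label = match.group()
        if kind == "word":
            kind = "keyword" if label in KEYWORDS else "variable name"
        tokens.append((label, kind))
    return tokens


if __name__ == "__main__":
    for token in tokenise("count = count + 1 // add one"):
        print(token)
```

Each token carries a label and a type, and the comment never reaches the next stage.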
Syntax Analysis
Syntax means "grammar" and the syntax analyser in a compiler checks that the
right tokens appear in the right order to make grammatically correct instructions.
For instance, in C++, the instruction xyz++; is syntactically correct, but the
instruction +;xyz+ is not - the order of the tokens is wrong.
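As a rough sketch of checking token order, the function below accepts only one statement shape: a name, then ++, then a semicolon. A real syntax analyser works from the full grammar of the language; the function name and the single rule here are hypothetical simplifications.

```python
import re


def is_increment_statement(text):
    """Check that the tokens appear in the order: name, ++, semicolon."""
    tokens = re.findall(r"\+\+|[A-Za-z_]\w*|\S", text)
    return (
        len(tokens) == 3
        and tokens[0].isidentifier()
        and tokens[1] == "++"
        and tokens[2] == ";"
    )


if __name__ == "__main__":
    print(is_increment_statement("xyz++;"))  # right tokens, right order
    print(is_increment_statement("+;xyz+"))  # tokens in the wrong order
```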
Semantic Analysis
The word "semantics" refers to meaning, and the semantic analyser checks the
meaning of the program. This refers to aspects such as whether the variables have
been declared, e.g. xyz++; may be syntactically correct, but if the variable xyz has
not been declared, then it is semantically incorrect!
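The declaration check can be sketched like this. The statement format, a list of (action, name) pairs, is an assumption made for the example; a real semantic analyser works on the structure produced by the syntax analyser.

```python
def check_declarations(statements):
    """Report every variable that is used before it has been declared."""
    declared = set()
    errors = []
    for action, name in statements:
        if action == "declare":
            declared.add(name)
        elif name not in declared:
            errors.append(f"'{name}' used before it was declared")
    return errors


if __name__ == "__main__":
    print(check_declarations([("declare", "xyz"), ("increment", "xyz")]))
    print(check_declarations([("increment", "xyz")]))
```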
The semantic analyser checks not only variable declarations and scope, but whether
the program has entered or left loops, or subroutines, whether classes are accessed
correctly etc.
Translation
An interpreter translates some form of source code into a target representation that
it can immediately execute and evaluate. The structure of the interpreter is similar
to that of a compiler, but the amount of time it takes to produce the executable
representation will vary, as will the amount of optimization. Compilers and
interpreters typically differ in how much time they spend analyzing the program,
in the form of the code they produce, and in what executes that code.
These characteristics are typical. There are well-known cases that are
somewhere in between, such as Java with its JVM.
An interpreter reads the source code one instruction or line at a time, converts this
line into machine code and executes it. The machine code is then discarded and the
next line is read. The advantage of this is it's simple and you can interrupt it while it
is running, change the program and either continue or start again. The
disadvantage is that every line has to be translated every time it is executed, even
if it is executed many times as the program runs. Because of this interpreters tend
to be slow. Examples of interpreters are Basic on older home computers, and script
interpreters such as JavaScript, and languages such as Lisp and Forth.
A compiler reads the whole source code and translates it into a complete machine
code program to perform the required tasks which is output as a new file. This
completely separates the source code from the executable file. The biggest
advantage of this is that the translation is done once only and as a separate
process. The program that is run is already translated into machine code so is much
faster in execution. The disadvantage is that you cannot change the program
without going back to the original source code, editing that and recompiling (though
for a professional software developer this is more of an advantage because it stops
source code being copied). Current examples of languages that are usually compiled
are Visual Basic, C, C++, C#, Fortran, Cobol, Ada and Pascal.
Interpreters translate code one line at a time, executing each line as it is "translated,"
much the way a foreign language interpreter would translate a book, by translating
one line at a time. Interpreters do generate binary code, but that code is never
compiled into one program entity.
Interpreters therefore can be easier to use and produce more immediate results;
however, the source code of an interpreted language cannot run without the
interpreter.
Compilers produce better optimized code that generally runs faster, and compiled
code is self-sufficient and can be run on its intended platform without the
compiler present.
Linking: many relocatable binaries (modules plus libraries) ==> one relocatable
binary (with all external references satisfied)
Loading: relocatable ==> absolute binary (with all code and data references bound
to the addresses occupied in memory)
At compile time (CT), absolute addresses of variables and statement labels are not
known.
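Linking and loading can be sketched as two small steps. The module layout below (a size, the symbols a module defines as offsets, and the symbols it uses) is a hypothetical simplification of a real object file format.

```python
def link(modules):
    """Combine modules into one relocatable symbol table.

    Each symbol's offset is measured from the start of the combined binary.
    """
    symbols = {}
    start = 0
    for module in modules:
        for name, offset in module["defines"].items():
            symbols[name] = start + offset
        start += module["size"]
    # Every external reference must now be satisfied.
    for module in modules:
        for name in module["uses"]:
            if name not in symbols:
                raise KeyError(f"unresolved external reference: {name}")
    return symbols


def load(symbols, base_address):
    """Bind relocatable offsets to the absolute addresses occupied in memory."""
    return {name: base_address + offset for name, offset in symbols.items()}


if __name__ == "__main__":
    main_module = {"size": 100, "defines": {"main": 0}, "uses": ["sqrt"]}
    library = {"size": 40, "defines": {"sqrt": 8}, "uses": []}
    relocatable = link([main_module, library])
    print(relocatable)
    print(load(relocatable, 0x4000))
```

The absolute addresses only appear in load(), which reflects the point that they are not known at compile time.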
A compiler works in several stages. These are:
1. Preprocessing
In this stage the source file is
processed. Comments are removed from the source file. This greatly simplifies the
later stages.
If the language supports macros, the macros are replaced with the equivalent
text.
For example, C and C++ support macros using the #define directive. So if a
macro were defined for pi as follows:
#define PI 3.1415927
Any time the preprocessor encountered the word PI, it would replace PI with
3.1415927.
The preprocessor may also replace special strings with other characters. In
C and C++, for example, it recognizes an escape sequence beginning with a backslash
and will replace the escape sequence with a special character. For example
\t is the escape code for a tab, so \t would be replaced at this stage with
a tab character.
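Macro replacement of this kind can be sketched in a few lines of Python. This is a deliberate simplification: it handles only object-like #define lines and, unlike a real C preprocessor, it would also replace names inside string literals. The function name preprocess is hypothetical.

```python
import re


def preprocess(source):
    """Replace each macro name with its text and drop the #define lines."""
    macros = {}
    output = []
    for line in source.splitlines():
        match = re.match(r"\s*#define\s+(\w+)\s+(.+)", line)
        if match:
            macros[match.group(1)] = match.group(2).strip()
            continue
        for name, replacement in macros.items():
            # \b ensures that only the whole word PI is replaced, not PIE.
            line = re.sub(rf"\b{re.escape(name)}\b",
                          lambda _m, text=replacement: text, line)
        output.append(line)
    return "\n".join(output)


if __name__ == "__main__":
    print(preprocess("#define PI 3.1415927\narea = PI * r * r;"))
```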
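2. Lexical analysis is the process of breaking down the source files into
tokens such as keywords, identifiers and operators.
3. Syntactical analysis is the process of checking the tokens against the language's
syntax. Just like English grammar, it specifies how things may be put
together. For example, an if statement in C has the form:
if ( expression ) statement
The syntactical analysis checks that the syntax is correct, but doesn't
enforce that it makes sense. In English, a subject could be: Pants, the
verb: are, the predicate: a kind of car. This would yield: Pants are a kind
of car. In the same way, code that assigns a string to a float variable, or
increments a string,
is syntactically valid, but doesn't make sense because a float number can
not have a string assigned to it, and a string can not be incremented.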
4. Semantic analysis is the process of examining the types and values of the
statements used to make sure they make sense. During the semantic
analysis, the types, values, and other required information about statements
are recorded and checked. For a statement that assigns a string to a float,
the semantic analysis would reveal the types do not match and can not be
converted. For a statement such as:
float y = 5 + 3.0;
the semantic analysis would reveal that 5 is an integer, and 3.0 is a
double, and also that the rules for the language allow 5 to be converted to
a double so that the addition can be performed. It
would recognize y as a float, and perform another conversion from the double
result to a float.
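The conversions for a statement like float y = 5 + 3.0; can be sketched as below. The type ranking and the rule that a literal containing a decimal point is a double are assumptions modelled on C; the function names are hypothetical.

```python
RANK = {"int": 0, "float": 1, "double": 2}  # assumed ordering, narrowest first


def literal_type(text):
    """Assume a literal with a decimal point is a double, otherwise an int."""
    return "double" if "." in text else "int"


def analyse_assignment(target_type, left_literal, right_literal):
    """List the conversions needed for: target = left + right."""
    left = literal_type(left_literal)
    right = literal_type(right_literal)
    result = left if RANK[left] >= RANK[right] else right  # the wider type wins
    conversions = []
    if left != result:
        conversions.append((left_literal, left, result))
    if right != result:
        conversions.append((right_literal, right, result))
    if result != target_type:
        conversions.append(("result", result, target_type))
    return conversions


if __name__ == "__main__":
    for item, from_type, to_type in analyse_assignment("float", "5", "3.0"):
        print(f"convert {item} from {from_type} to {to_type}")
```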
5. Intermediate code generation
Depending on the compiler, this step may be skipped, and instead the program
may be translated directly into the target language (usually machine object
code). If it is used, the intermediate code is designed to be simple
and easily translated into machine language for any number of different
computers.
The part of the compiler which deals with processing the source files,
analyzing the language and generating the intermediate code is called the
front end, while the process of optimizing and converting the intermediate
code into the target language is called the back end.
6. Code optimization
During this process the code generated is analyzed and improved for
efficiency. The compiler analyzes the code to see if improvements can be
made to the intermediate code that couldn't be made earlier. For example,
some languages restrict or do not allow pointers, while all machine
languages support them,
so the code optimizer may detect this case and internally use pointers.
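As a small illustration of improving intermediate code, the sketch below performs constant folding, a common textbook optimization. It is a swapped-in example, not the pointer case mentioned above, and the four-field instruction format (operation, destination, left, right) is an assumption.

```python
def fold_constants(instructions):
    """Replace additions of known constants with their computed result."""
    known = {}      # names whose value is a known constant at this point
    optimized = []
    for op, dest, left, right in instructions:
        left = known.get(left, left)     # substitute constants already found
        right = known.get(right, right)
        if op == "add" and isinstance(left, int) and isinstance(right, int):
            known[dest] = left + right
            optimized.append(("set", dest, left + right, None))
        else:
            known.pop(dest, None)        # dest no longer holds a known constant
            optimized.append((op, dest, left, right))
    return optimized


if __name__ == "__main__":
    program = [("add", "t1", 2, 3), ("add", "t2", "t1", "x")]
    for instruction in fold_constants(program):
        print(instruction)
```

The addition 2 + 3 is done once by the compiler instead of every time the program runs.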
7. Code generation
Finally, after the intermediate code has been generated and optimized, the
compiler will generate code for the specific target language.
It is usually not the final machine code, but is instead object code,
which contains all the instructions, but not all of the final memory
addresses.
Interpreter
Last modified: Monday, December 10, 2001
A program that executes instructions written in a high-level language. There are two ways
to run programs written in a high-level language. The most common is to compile the
program; the other method is to pass the program through an interpreter.
Both interpreters and compilers are available for most high-level languages.
However, BASIC and LISP are especially designed to be executed by an interpreter. In
addition, page description languages, such as PostScript, use an interpreter. Every
PostScript printer, for example, has a built-in interpreter that executes PostScript instructions.
Efficiency
The main disadvantage of interpreters is that when a program is interpreted, it typically runs more slowly
than if it had been compiled. Interpreting code is slower than running the compiled code because the
interpreter must analyze each statement in the program each time it is executed and then perform the
desired action, whereas the compiled code just performs the action within a fixed context determined by
the compilation. This run-time analysis is known as "interpretive overhead". Access to variables is also
slower in an interpreter because the mapping of identifiers to storage locations must be done repeatedly
at run-time rather than at compile time.
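The difference in variable access can be sketched as follows. The statement format (target, left, right), meaning target = left + right, and the function names are assumptions for the example.

```python
def interpret(statements, variables):
    """Map each identifier to its storage by name every time a statement runs."""
    name_lookups = 0
    for target, left, right in statements:
        variables[target] = variables[left] + variables[right]
        name_lookups += 3
    return name_lookups


def compile_statements(statements, names):
    """Map identifiers to storage slots once, ahead of execution."""
    slots = {name: index for index, name in enumerate(names)}
    return [(slots[t], slots[l], slots[r]) for t, l, r in statements]


def run_compiled(compiled, storage):
    """Execute using fixed slot numbers; no names are looked up at run time."""
    for target, left, right in compiled:
        storage[target] = storage[left] + storage[right]


if __name__ == "__main__":
    statements = [("c", "a", "b")] * 1000
    print(interpret(statements, {"a": 1, "b": 2, "c": 0}))  # lookups by name
    compiled = compile_statements(statements, ["a", "b", "c"])
    storage = [1, 2, 0]
    run_compiled(compiled, storage)
    print(storage)
```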
Where a language has both an interpreter and a compiler, a routine that has been tested and debugged
under the interpreter can be compiled and thus benefit from faster execution while other routines are
being developed. Many interpreters do not execute the
source code as it stands but convert it into some more compact internal form.
An interpreter translates high-level instructions into an intermediate form, which it then executes. In
contrast, a compiler translates high-level instructions directly into machine language. Compiled programs
generally run faster than interpreted programs. The advantage of an interpreter, however, is that it does
not need to go through the compilation stage during which machine instructions are generated. This
process can be time-consuming if the program is long. The interpreter, on the other hand, can
immediately execute high-level programs. For this reason, interpreters are sometimes used during
the development of a program, when a programmer wants to add small sections at a time and test them
quickly. In addition, interpreters are often used in education because they allow students to program
interactively.
A compiler makes the conversion to machine language just
once, while an interpreter typically converts the program every time it is executed (or in some languages
like early versions of BASIC, every time a single instruction is executed).
An interpreter usually just needs to translate to an intermediate representation or not translate at all, thus
requiring less time before the changes can be tested.
This often makes interpreted languages easier to learn, and makes it easier to find bugs and correct problems.
Thus simple interpreted languages tend to have a friendlier environment for beginners.
Execution environment
An interpreter will make source translations during runtime. This means every line has to be converted
each time the program runs. This process slows down the program execution and is a major
disadvantage of interpreters over compilers. Another main disadvantage of an interpreter is that it must be
present on the machine as additional software to run the program.
Compilers and Interpreters
These are programs which translate computer programs from high-level
languages such as Pascal, C++, Java or JavaScript into the raw 1s and 0s
which the computer can understand, but the human programmers cannot.
[Figure: you write the program; the computer translates it into a form which it can run]
Compiler characteristics:
• spends a lot of time analyzing and processing the program
• the resulting executable is some form of machine-specific binary code
• the computer hardware interprets (executes) the resulting code
• program execution is fast
Interpreter characteristics:
• relatively little time is spent analyzing and processing the program
• the resulting code is some sort of intermediate code
• the resulting code is interpreted by another program
• program execution is relatively slow
Advantages of an Interpreter
Disadvantages of an Interpreter
• Interpreters normally translate and execute
programs line by line, converting each program
statement into a sequence of machine code
instructions and executing these instructions without
retaining the translated version.