Modern programmers work mainly in third- and fourth-generation languages (3GLs and 4GLs). These languages are efficient to write in, but a machine cannot execute them directly. Generating machine code from them is the task of a translator. This paper surveys the translators most commonly used by present-day programmers. It describes the various kinds of translator, how they work, and their advantages and disadvantages.
Introduction
Over the last few decades, computer programming has become one of the most important skills for any kind of work involving digital electronics. Throughout this growth, attempts have been made to simplify programming languages and make them better suited to humans. This effort paved the way for high-level languages, but a machine cannot understand such languages directly. Working with them therefore always requires software that converts human-friendly code into machine-friendly binary. Software that takes human-friendly code as input and produces machine-understandable, executable binary data as output is called a translator. In everyday life, a translator is a person who reproduces statements given in one language in another language by applying a set of grammatical rules. Similarly, a language translator in computing is system software designed to translate code written in one programming language into another on the basis of a given set of rules.
Assembly level language: - An assembly level language is a low-level, architecture-specific programming language for computers, microprocessors, microcontrollers, and other programmable devices. It implements a symbolic representation of the machine codes and other constants needed to program a given CPU architecture. This representation is usually defined by the hardware manufacturer, and is based on mnemonics that symbolize processing steps (instructions), processor registers, memory locations, and other language features.
High level language: - A high-level programming language is a programming language with strong abstraction from the details of the computer. It uses natural-language elements that are easy to use and understand, which makes developing a program simpler and more understandable than in a low-level language. The amount of abstraction provided defines how "high-level" a programming language is.
Types of Translators
A wide variety of language translators is available today for translating code from one language to another. The most common ones are:
Compiler: - A compiler is a program that takes written source code and turns it into machine language. When run, a compiler analyses all of the language statements in the source code and builds the machine-language object code.
Assembler: - An assembler translates assembly language into machine language. Assembly language uses computer-specific commands and a structure similar to machine language, but uses names instead of numbers. An assembler is similar to a compiler, but it translates only programs written in assembly language. To do this, the assembler takes basic computer instructions from the assembly language and converts them into a pattern of bits that the processor uses to perform its operations.
Interpreter: - An interpreter is a translator that converts programs into machine-executable form each time they are executed. It analyses and executes each line of source code, in order, without looking at the entire program. Instead of requiring a separate step before program execution, an interpreter processes the program as it is being executed.
Alongside these basic types, there are other translators that serve specific purposes in situations that do not arise on a day-to-day basis. Some of these translators are:
Decompiler: - A computer program that performs, as far as possible, the reverse operation to that of a compiler, i.e. it translates an executable file into a human-readable format. When working with a decompiler, keep in mind that it does not reconstruct the original source code, and that its output is far less intelligible to a human than the original source code.
Disassembler: - A computer program that translates machine language into assembly language. A disassembler is principally a reverse-engineering tool, because disassembly (the output of a disassembler) is often formatted for human readability rather than for suitability as input to an assembler.
Binary recompiler: - Software that takes executable binaries as input, analyses their structure, applies transformations and optimizations, and outputs new, optimized executable binaries.
Source-to-source compiler: - A type of compiler that takes a program in one high-level programming language as its input and outputs a program in another high-level language.
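To make the disassembler entry concrete, here is a minimal sketch in Python. The instruction set is hypothetical and not taken from any real CPU: every instruction is assumed to occupy two bytes (an opcode followed by one operand byte), and the `MNEMONICS` table is invented for illustration.

```python
# Hypothetical opcode table (an assumption for illustration only).
MNEMONICS = {0x01: "LOAD", 0x02: "ADD", 0x03: "JMP", 0xFF: "HALT"}


def disassemble(code):
    """Translate a list of bytes back into assembly-like text.

    Assumes len(code) is even: each instruction is an opcode byte
    followed by one operand byte.
    """
    lines = []
    for offset in range(0, len(code), 2):
        opcode, operand = code[offset], code[offset + 1]
        # Unknown opcodes are shown as "???" instead of raising an error.
        name = MNEMONICS.get(opcode, "???")
        lines.append(f"{offset:04X}: {name} {operand}")
    return lines


listing = disassemble([0x03, 4, 0x01, 7, 0xFF, 0])
```

Note that the jump target comes back as the bare address `4`. Whatever label the programmer originally wrote is not stored in the machine code, which is one reason such tools cannot reproduce the original source.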
Having seen the different types of translators available, let us look at the most basic ones in more detail. A compact description of each is given below:
Compiler
As described above, the term compiler is primarily used for programs that translate a high-level programming language into a lower-level language.
Objectives of Compiler
Compilers bridge source programs in high-level languages with the underlying hardware. At its most basic, a compiler is required to:
Determine the correctness of the syntax of programs,
Generate correct and efficient object code,
Organize the run-time environment, and
Format its output according to assembler and/or linker conventions.
Passes of a Compiler
Compiling a program is not a simple task that can be completed in a single step. It is a complex process carried out through a number of steps. Each such step is called a pass, and each pass is designed to perform one specific function towards the final goal of translating the code. The passes through which code goes during translation are:
Lexical analysis: - The process of converting a sequence of characters into a sequence of tokens.
Pre-processing: - A pre-processor is a program that processes its input data to produce output that is used as input to another program. The output is said to be a pre-processed form of the input data. The amount and kind of processing done depends on the nature of the pre-processor; some pre-processors can perform only relatively simple textual substitutions and macro expansions, while others have the power of full-fledged programming languages.
Parsing: - The process of analysing a text, made of a sequence of tokens, to determine its grammatical structure with respect to a given formal grammar.
Semantic analysis (syntax-directed translation): - A method of translating a string into a sequence of actions by attaching one such action to each rule of a grammar. Parsing a string of the grammar produces a sequence of rule applications, and this provides a simple way to attach semantics to the syntax.
Code generation: - The process by which a compiler converts some intermediate representation of source code into machine code that can be readily executed by a machine.
Code optimization: - The process of modifying a program or code to make some aspect of it work more efficiently or use fewer resources.
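The lexical analysis and parsing passes can be sketched as follows. This is a minimal illustration in Python, not the method of any particular compiler. The token set, the three-rule grammar, and the nested-tuple tree format are all assumptions made for this example.

```python
import re

# Hypothetical token set for a tiny expression language (an assumption).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(source):
    """Lexical analysis: turn a character sequence into (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":  # whitespace produces no token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parsing: build a nested-tuple tree for the assumed grammar
    expr := term (('+'|'-') term)*
    term := atom (('*'|'/') atom)*
    atom := NUMBER | IDENT | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def advance():
        nonlocal pos
        pos += 1

    def atom():
        kind, text = peek()
        if kind == "NUMBER":
            advance()
            return ("num", int(text))
        if kind == "IDENT":
            advance()
            return ("var", text)
        if (kind, text) == ("OP", "("):
            advance()
            node = expr()
            if peek() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    def term():
        node = atom()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = peek()[1]
            advance()
            node = (op, node, atom())
        return node

    def expr():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = peek()[1]
            advance()
            node = (op, node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return tree


tree = parse(tokenize("a + 2*3"))
```

For the input `a + 2*3`, the tree groups the multiplication below the addition. That grouping is how the grammar encodes operator precedence.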
The back end is responsible for translating the intermediate representation (IR) produced by the middle end into assembly code. One or more target instructions are chosen for each IR instruction. Register allocation assigns processor registers to the program variables where possible. The back end also makes use of the hardware by working out how to keep parallel execution units busy, how to fill delay slots, and so on. Although many of these optimization problems are NP-hard, heuristic techniques for solving them are well developed.
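As an illustration of optimization and code generation, the sketch below folds constant sub-expressions and then emits instructions for a hypothetical stack machine. The tree format (`("num", n)`, `("var", name)`, `(op, left, right)`) and the instruction names `PUSH`, `LOAD`, `ADD`, `SUB`, `MUL` are assumptions for this example. A real back end would also perform the instruction selection and register allocation described above, which this sketch leaves out.

```python
def fold_constants(node):
    """Code optimization: evaluate sub-expressions whose operands are constants."""
    if node[0] in ("num", "var"):
        return node
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if left[0] == "num" and right[0] == "num":
        if op == "+":
            return ("num", left[1] + right[1])
        if op == "-":
            return ("num", left[1] - right[1])
        if op == "*":
            return ("num", left[1] * right[1])
    return (op, left, right)


# Hypothetical stack-machine instruction for each operator.
INSTRUCTION_FOR = {"+": "ADD", "-": "SUB", "*": "MUL"}


def generate(node, code=None):
    """Code generation: emit stack-machine instructions in post-order."""
    if code is None:
        code = []
    if node[0] == "num":
        code.append(f"PUSH {node[1]}")
    elif node[0] == "var":
        code.append(f"LOAD {node[1]}")
    else:
        op, left, right = node
        generate(left, code)
        generate(right, code)
        code.append(INSTRUCTION_FOR[op])
    return code


tree = ("+", ("var", "a"), ("*", ("num", 2), ("num", 3)))
unoptimized = generate(tree)
optimized = generate(fold_constants(tree))
```

Here the unoptimized code needs five instructions, while the folded tree needs three, because `2*3` is computed once at compile time instead of on every run.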
Advantages of a Compiler
Compiled code is fast in execution.
The object/executable code produced by a compiler can be distributed or executed without the compiler being present.
The object program can be used whenever required, without the need for recompilation.
Disadvantages of a Compiler
Debugging a program is much harder, so a compiler is less helpful for locating errors.
When an error is found, the whole program has to be recompiled.
Interpreter
An interpreter behaves very differently from compilers and assemblers. It converts programs into machine-executable form each time they are executed. It analyses and executes each line of source code, in order, without looking at the entire program. Instead of requiring a separate step before program execution, an interpreter processes the program as it is being executed. When an interpreter executes a program, no object code is produced, so the program has to be interpreted every time it is run. For example, if the program performs a section of code 1000 times, that section is translated 1000 times, since each line is interpreted and then executed.
In general, an interpreter is a computer program that executes instructions written in a programming language. It is a program that does one of the following:
Executes the source code directly.
Translates source code into some efficient intermediate representation (code) and immediately executes this.
Explicitly executes stored precompiled code made by a compiler which is part of the interpreter system.
While interpreting and compiling are the two main means by which programming languages are implemented, they are not fully mutually exclusive categories. One reason is that most interpreting systems also perform some translation work, just like compilers. The terms "interpreted language" and "compiled language" merely mean that the canonical implementation of that language is an interpreter or a compiler; a high-level language is basically an abstraction which is (ideally) independent of particular implementations.
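The cost of re-analysing a repeated section can be seen in a small sketch. The three-instruction language below (`set`, `add`, `jumplt`) is hypothetical and exists only for this example. The interpreter counts how many times it has to decode a source line.

```python
def interpret(lines):
    """Execute a tiny, hypothetical line-oriented language directly from source.

    Supported statements (assumptions for this sketch):
      set VAR N            assign the integer N to VAR
      add VAR N            add the integer N to VAR
      jumplt VAR N TARGET  jump to line index TARGET if VAR < N
    Returns the final variables and the number of line decodes performed.
    """
    variables = {}
    decode_count = 0
    pc = 0  # index of the line to execute next
    while pc < len(lines):
        parts = lines[pc].split()  # the line is re-analysed every time it is reached
        decode_count += 1
        op = parts[0]
        if op == "set":
            variables[parts[1]] = int(parts[2])
        elif op == "add":
            variables[parts[1]] += int(parts[2])
        elif op == "jumplt":
            if variables[parts[1]] < int(parts[2]):
                pc = int(parts[3])
                continue
        else:
            raise ValueError(f"unknown statement: {lines[pc]!r}")
        pc += 1
    return variables, decode_count


program = ["set x 0", "add x 1", "jumplt x 1000 1"]
variables, decode_count = interpret(program)
```

The program has only three source lines, but the loop body runs 1000 times, so the interpreter decodes 2001 lines in total. A compiler would have analysed each of the three lines once.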
Advantages of an Interpreter
Good at locating errors in programs.
Debugging is easier, since the interpreter stops when it encounters an error.
If an error is detected, there is no need to retranslate the whole program.
Disadvantages of an Interpreter
Execution is rather slow.
No object code is produced, so translation has to be done every time the program is run.
Assembler
An assembler is a utility program used to translate assembly language statements into the target computer's machine code. It performs a more or less isomorphic translation (a one-to-one mapping) from mnemonic statements into machine instructions and data. Assembly language implements a symbolic representation of the machine codes and other constants needed to program a given CPU architecture. This representation is usually defined by the hardware manufacturer, and is based on mnemonics that symbolize processing steps (instructions), processor registers, memory locations, and other language features. An assembly language is thus specific to a certain physical (or virtual) computer architecture. This is in contrast to most high-level programming languages, which, ideally, are portable. Typically, a modern assembler creates object code by translating assembly instruction mnemonics into opcodes, and by resolving symbolic names for memory locations and other entities. The use of symbolic references is a key feature of assemblers, saving tedious calculations and manual address updates after program modifications. Most assemblers also include macro facilities for performing textual substitution, for example to generate common short sequences of instructions inline instead of as called subroutines.
Types of Assembler
There are basically two types of assemblers based on how many times the assembler scans the source code to produce the executable program. One-pass assemblers go through the source code once. Any symbol used before it is defined will require "errata" at the end of the object code (or, at least, no earlier than the point where the symbol is defined) telling the linker or the loader to "go back" and overwrite a placeholder which had been left where the as yet undefined symbol was used.
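The one-pass scheme with "errata" can be sketched as below. The instruction set is hypothetical: every instruction is assumed to be two bytes (opcode, operand), labels are written as `name:` on their own line, and `apply_fixups` is an invented stand-in for the work the text assigns to the linker or loader.

```python
# Hypothetical opcode table (an assumption for illustration only).
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "JMP": 0x03, "HALT": 0xFF}


def assemble_one_pass(lines):
    """Scan the source once, recording a fixup for each not-yet-defined symbol."""
    code, symbols, fixups = [], {}, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):  # label definition: remember the current address
            symbols[line[:-1]] = len(code)
            continue
        parts = line.split()
        operand = parts[1] if len(parts) > 1 else "0"
        code.append(OPCODES[parts[0]])
        if operand.isdigit():
            code.append(int(operand))
        elif operand in symbols:  # backward reference: already known
            code.append(symbols[operand])
        else:  # forward reference: leave a placeholder and note it
            fixups.append((len(code), operand))
            code.append(0)
    return code, symbols, fixups


def apply_fixups(code, symbols, fixups):
    """Overwrite each placeholder once every symbol is defined."""
    patched = list(code)
    for offset, name in fixups:
        patched[offset] = symbols[name]
    return patched


program = ["JMP end", "LOAD 7", "end:", "HALT"]
code, symbols, fixups = assemble_one_pass(program)
final_code = apply_fixups(code, symbols, fixups)
```

In this run `JMP end` is emitted with a placeholder operand of 0, and the fixup list records that offset 1 must later be overwritten with the address of `end`.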
[Fig. 5: Flow chart of a single-pass assembler]
Two-pass assemblers create a table with all symbols and their values in the first pass, and then use the table in a second pass to generate code.
[Fig. 6: Flow chart of the first pass of a two-pass assembler]
[Fig. 7: Flow chart of the second pass of a two-pass assembler]
In both cases, the assembler must be able to determine the size of each instruction on the first or only pass in order to calculate the addresses of symbols. This means that if the size of an operation referring to an operand defined later depends on the type or distance of the operand, the assembler will make a pessimistic estimate when first encountering the operation, and if necessary pad it with one or more "no-operation" instructions in the second pass or the errata. The original reason for the use of one-pass assemblers was speed of assembly; however, modern computers perform two-pass assembly without unacceptable delay. The advantage of the two-pass assembler is that the absence of a need for errata makes the linker (or the loader, if the assembler directly produces executable code) simpler and faster.
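A two-pass assembler can be sketched in the same spirit. The instruction set is again hypothetical, with every instruction fixed at two bytes (opcode, operand). That fixed size is a simplifying assumption: it lets the first pass compute addresses trivially and avoids the size-estimation problem described above.

```python
# Hypothetical opcode table (an assumption for illustration only).
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "JMP": 0x03, "HALT": 0xFF}
INSTRUCTION_SIZE = 2  # assumed fixed size: one opcode byte plus one operand byte


def first_pass(lines):
    """Pass 1: build the symbol table by tracking the address of each line."""
    symbols, address = {}, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            symbols[line[:-1]] = address
        else:
            address += INSTRUCTION_SIZE
    return symbols


def second_pass(lines, symbols):
    """Pass 2: emit code, resolving every symbolic operand from the table."""
    code = []
    for line in lines:
        line = line.strip()
        if not line or line.endswith(":"):
            continue
        parts = line.split()
        operand = parts[1] if len(parts) > 1 else "0"
        value = int(operand) if operand.isdigit() else symbols[operand]
        code.extend([OPCODES[parts[0]], value])
    return code


def assemble_two_pass(lines):
    return second_pass(lines, first_pass(lines))


machine_code = assemble_two_pass(["JMP end", "LOAD 7", "end:", "HALT"])
```

Because the symbol table is complete before any code is emitted, the forward reference to `end` is resolved directly and no placeholder or errata list is needed.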
Data sections: - These are instructions used to define data elements that hold data and variables. They define the type of data, its length, and its alignment. These instructions can also define whether the data is available to outside programs (programs assembled separately) or only to the program in which the data section is defined.
Assembly directives: - Assembly directives, also called pseudo opcodes, are instructions that are executed by an assembler at assembly time, not by a CPU at run time. They can make the assembly of the program dependent on parameters input by a programmer, so that one program can be assembled in different ways, perhaps for different applications. They can also be used to manipulate the presentation of a program to make it easier to read and maintain.
Macros: - Once a macro has been defined, its name can be used in place of a mnemonic. When the assembler processes such a statement, it replaces the statement with the text lines associated with that macro, and then processes them as if they existed in the source code file (including, in some assemblers, expansion of any macros existing in the replacement text). In the mainframe era, macros were used to customize large-scale software systems for specific customers. Customer personnel also used them to meet their employers' needs by making specific versions of manufacturer operating systems.
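Macro expansion as textual substitution can be sketched as follows. The macro table format and the `&NAME` parameter syntax are assumptions chosen for this example and do not follow any specific assembler.

```python
def expand_macros(lines, macros):
    """Replace each statement whose first word names a macro with the macro body.

    `macros` maps a macro name to (parameter_names, body_lines); inside the
    body, a parameter is written as &NAME (an assumed syntax). This sketch
    assumes no parameter name is a prefix of another. Expansion is repeated
    on the result, so a macro body may itself use other macros.
    """
    output = []
    for line in lines:
        parts = line.split()
        if parts and parts[0] in macros:
            params, body = macros[parts[0]]
            bindings = dict(zip(params, parts[1:]))
            expanded = []
            for body_line in body:
                for name, value in bindings.items():
                    body_line = body_line.replace("&" + name, value)
                expanded.append(body_line)
            output.extend(expand_macros(expanded, macros))
        else:
            output.append(line)
    return output


# Hypothetical macros: INCR adds 1 to a variable, INCR2 uses INCR twice.
macros = {
    "INCR": (["VAR"], ["LOAD &VAR", "ADD 1", "STORE &VAR"]),
    "INCR2": (["VAR"], ["INCR &VAR", "INCR &VAR"]),
}
expanded = expand_macros(["INCR X", "HALT"], macros)
```

After expansion, the assembler would process the resulting lines exactly as if the programmer had typed them into the source file.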
Advantages of Assemblers
The major advantages of using an assembler, i.e. of coding your project in assembly language, are:
Working with an assembler offers a range of capabilities that are not (all) available to 3GL or 4GL programmers.
Easy resolving of parry errors
Efficient usage of available memory
Dynamic memory management
Optimization
Usage of operating system facilities
Virtual look-aside facility
Concurrent access to several datasets
Subtasks
Reentrancy
Disadvantages of Assemblers
The main disadvantages of assembly language compared with high-level languages are that it is not portable (it is written for a particular instruction set) and that programmers are less productive, since assembly language is less expressive than high-level languages.