
From Source Code to Executable

Dr. Axel Kohlmeyer


Scientific Computing Expert
Information and Telecommunication Section

The Abdus Salam International Centre for Theoretical Physics
http://sites.google.com/site/akohlmey/

akohlmey@ictp.it

Pre-process / Compile / Link

- Creating an executable includes multiple steps
- The "compiler" is a wrapper for several commands that are executed in succession
- The "compiler flags" similarly fall into categories and are handed down to the respective tools
- When compiling for different languages, only the first steps are language specific
- We will look into a C example first, since this is the language the OS is (mostly) written in

A simple C Example

Consider the minimal C program 'hello.c':

    #include <stdio.h>
    int main(int argc, char **argv)
    {
        printf("hello world\n");
        return 0;
    }

i.e.: what happens if we do:

    > gcc -o hello hello.c

(try: gcc -v -o hello hello.c)

Step 1: Pre-processing

- Pre-processing is mandatory in C (and C++)
- Pre-processing handles '#' directives:
  - File inclusion, with nested inclusion
  - Conditional compilation and macro expansion
- In this case: /usr/include/stdio.h and all files included by it are inserted, and the contained macros are expanded
- Use the -E flag to stop after pre-processing:

    > cc -E -o hello.pp.c hello.c
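The directive handling described above can be tried on a slightly richer file than hello.c. The following is a minimal sketch; the macro names SQUARE, GREETING, LOG and VERBOSE and both function names are hypothetical and do not come from the slides.

```c
#include <stdio.h>

/* Hypothetical macros for illustration only. */
#define SQUARE(x) ((x) * (x))    /* function-like macro, expanded textually */
#define GREETING "hello world"   /* object-like macro */

#ifdef VERBOSE                   /* conditional compilation, enabled with -DVERBOSE */
#define LOG(msg) fprintf(stderr, "log: %s\n", (msg))
#else
#define LOG(msg) ((void)0)       /* expands to a no-op when VERBOSE is not defined */
#endif

/* After pre-processing, the return statement reads: return ((n) * (n)); */
int square_of(int n)
{
    LOG("computing a square");
    return SQUARE(n);
}

/* After pre-processing, the return statement reads: return "hello world"; */
const char *greeting(void)
{
    return GREETING;
}
```

Running cc -E on a file with this content should show the macros already replaced by their expansions, and adding -DVERBOSE should switch which LOG definition is used. To execute the functions, call them from a main() of your own.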
4

Step 2: Compilation

- The compiler converts a high-level language into the specific instruction set of the target CPU
- Individual steps:
  - Parse text (lexical / syntactical analysis)
  - Do language specific transformations
  - Translate to internal representation units (IRs)
  - Optimization (reorder, merge, eliminate)
  - Replace IRs with pieces of assembler language
- Try:

    > gcc -S hello.c    (produces hello.s)
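One way to observe the optimization step is to compile a small function with gcc -S at different optimization levels and compare the resulting assembly. The sketch below uses two hypothetical helper functions (not from the slides). With optimization enabled, compilers commonly reduce the loop to a constant and drop the unreachable branch, but the exact output depends on the compiler version and flags.

```c
/* Hypothetical helper: sums 1..10 with loop bounds known at compile time,
 * which makes the whole loop a candidate for evaluation by the optimizer. */
int sum_to_ten(void)
{
    int total = 0;
    int i;
    for (i = 1; i <= 10; ++i) {
        total += i;
    }
    return total;
}

/* Hypothetical helper: the branch can never be taken because debug is a
 * constant zero, so it is a candidate for elimination. */
int dead_branch(int x)
{
    const int debug = 0;
    if (debug) {
        x = -x;
    }
    return x + 1;
}
```

Comparing the output of gcc -S -O0 and gcc -S -O2 for this file is a simple way to see "reorder, merge, eliminate" in practice.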



Compilation cont'd

- gcc replaced printf with puts (try: gcc -fno-builtin -S hello.c)
- Generated hello.s:

            .file   "hello.c"
            .section        .rodata
    .LC0:
            .string "hello world"
            .text
    .globl main
            .type   main, @function
    main:
            pushl   %ebp
            movl    %esp, %ebp
            andl    $-16, %esp
            subl    $16, %esp
            movl    $.LC0, (%esp)
            call    puts
            movl    $0, %eax
            leave
            ret
            .size   main, .-main
            .ident  "GCC: (GNU) 4.5.1 20100924 (Red Hat 4.5.1-4)"
            .section        .note.GNU-stack,"",@progbits

Step 3: Assembler / Step 4: Linker

- Assembler (as) translates assembly to binary
  - Creates so-called object files (in ELF format)
- Try:

    > gcc -c hello.c
    > nm hello.o
    00000000 T main
             U puts

- Linker (ld) puts the binary together with startup code and required libraries
- Final step, result is an executable. Try:

    > gcc -o hello hello.o

Adding Libraries

Example 2: exp.c

    #include <math.h>
    #include <stdio.h>
    int main(int argc, char **argv)
    {
        double a=2.0;
        printf("exp(2.0)=%f\n", exp(a));
        return 0;
    }

    > gcc -o exp exp.c

- Fails with "undefined reference to 'exp'"
- exp() is in "libm", but the compiler does not link to it by default

    => gcc -o exp exp.c -lm

Symbols in Object Files & Visibility

- Compiled object files have multiple sections and a symbol table describing their entries:
  - "Text": this is executable code
  - "Data": pre-allocated variables storage
  - "Constants": read-only data
  - "Undefined": symbols that are used but not defined
  - "Debug": debugger information (e.g. line numbers)
- Entries in the object files can be inspected with either the "nm" tool or the "readelf" command

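Example File: visibility.c

    #include <stdio.h>   /* printf */
    #include <stdlib.h>  /* abs */

    static const int val1 = -5;
    const int val2 = 10;
    static int val3 = -20;
    int val4 = -15;
    extern int errno;

    static int add_abs(const int v1, const int v2) {
        return abs(v1)+abs(v2);
    }

    int main(int argc, char **argv) {
        int val5 = 20;
        printf("%d / %d / %d\n",
               add_abs(val1,val2),
               add_abs(val3,val4),
               add_abs(val1,val5));
        return 0;
    }

nm visibility.o:

    00000000 t add_abs
             U errno
    00000024 T main
             U printf
    00000000 r val1
    00000004 R val2
    00000000 d val3
    00000004 D val4

Lower-case type letters (t, r, d) mark symbols local to the object file (declared static); upper-case letters (T, R, D) mark globally visible symbols; U marks undefined references.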


What Happens During Linking?

- Historically, the linker combines a "startup object" (crt1.o) with all compiled or listed object files, the C library (libc) and a "finish object" (crtn.o) into an executable (a.out)
- With shared libraries it is more complicated
- The linker then "builds" the executable by matching undefined references with available entries in the symbol tables of the objects
- crt1.o has an undefined reference to "main", thus C programs start at the main() function

Libraries

- Static libraries built with the "ar" command are collections of objects with a global symbol table
- When linking to a static library, object code is copied into the resulting executable and all direct addresses are recomputed (e.g. for "jumps")
- Symbols are resolved "from left to right", so circular dependencies require listing libraries multiple times or using a special linker flag (e.g. --start-group / --end-group with GNU ld)
- When linking, only the name of the symbol is checked, not whether its argument list matches

More on Shared Libraries

- Shared libraries are more like executables that are missing the main() function
- When linking to a shared library, a marker is added to load the library by its "generic" name and the list of undefined symbols
- When resolving a symbol (function) from a shared library, all addresses have to be recomputed (relocated) on the fly
- The shared linker program is executed first and then loads the executable and its dependencies

Dynamic Linker Issues

- Linux defaults to dynamic libraries:

    > ldd hello
        linux-gate.so.1 =>  (0x0049d000)
        libc.so.6 => /lib/libc.so.6 (0x005a0000)
        /lib/ld-linux.so.2 (0x0057b000)

- /etc/ld.so.conf and LD_LIBRARY_PATH define where to search for shared libraries
- gcc -Wl,-rpath,/some/dir will encode /some/dir into the binary for searching

What is Different in Fortran?

- Basic compilation principles are the same
- In Fortran, symbols are case insensitive => most compilers translate them to lower case
- To make Fortran symbols different from C symbols, their names are modified (e.g. functions have an underscore appended)
- Fortran programs don't have a "main" in the same way as C programs have (no arguments)
  - PROGRAM => MAIN__ (in gfortran)
  - C-like main provided as startup (to store args)
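The name modification matters most when calling Fortran from C. The sketch below shows what the C side might look like, assuming the common convention of a lower-case name with one trailing underscore and all arguments passed by reference. The routine name vecsum is hypothetical, and the "Fortran" routine is written in C here only so that the example links on its own. In a real build the symbol vecsum_ would come from a compiled Fortran object.

```c
/* ASSUMPTION: lower-case symbol, one trailing underscore, arguments by
 * reference. Conventions differ between compilers and flags. */

/* Stand-in for a hypothetical Fortran "SUBROUTINE VECSUM(N, X, RESULT)".
 * Written in C here so the sketch is self-contained. */
void vecsum_(const int *n, const double *x, double *result)
{
    double s = 0.0;
    int i;
    for (i = 0; i < *n; ++i) {
        s += x[i];
    }
    *result = s;
}

/* C-friendly wrapper that hides the by-reference calling convention. */
double vecsum(int n, const double *x)
{
    double result;
    vecsum_(&n, x, &result);
    return result;
}
```

Because the linker only matches symbol names, a mismatch between the C prototype and the actual Fortran argument list would not be caught at link time, so such prototypes need to be kept in sync by hand.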