
C Programming Notes

Steve Summit

These notes are part of the UW Experimental College course on Introductory C Programming. They are based on notes prepared (beginning in Spring, 1995) to supplement the book The C Programming Language, by Brian Kernighan and Dennis Ritchie, or K&R as the book and its authors are affectionately known. (The second edition was published in 1988 by Prentice-Hall, ISBN 0-13-110362-8.) These notes are now (as of Winter, 1995-6) intended to be stand-alone, although the sections are still cross-referenced to those of K&R, for the reader who wants to pursue a more in-depth exposition.

Chapter 1: Introduction
C is (as K&R admit) a relatively small language, but one which (to its admirers, anyway) wears well. C's small, unambitious feature set is a real advantage: there's less to learn; there isn't excess baggage in the way when you don't need it. It can also be a disadvantage: since it doesn't do everything for you, there's a lot you have to do yourself. (Actually, this is viewed by many as an additional advantage: anything the language doesn't do for you, it doesn't dictate to you, either, so you're free to do that something however you want.)

C is sometimes referred to as a "high-level assembly language." Some people think that's an insult, but it's actually a deliberate and significant aspect of the language. If you have programmed in assembly language, you'll probably find C very natural and comfortable (although if you continue to focus too heavily on machine-level details, you'll probably end up with unnecessarily nonportable programs). If you haven't programmed in assembly language, you may be frustrated by C's lack of certain higher-level features. In either case, you should understand why C was designed this way: so that seemingly-simple constructions expressed in C would not expand to arbitrarily expensive (in time or space) machine language constructions when compiled. If you write a C program simply and succinctly, it is likely to result in a succinct, efficient machine language executable. If you find that the executable program resulting from a C program is not efficient, it's probably because of something silly you did, not because of something the compiler did behind your back which you have no control over.

In any case, there's no point in complaining about C's low-level flavor: C is what it is.

A programming language is a tool, and no tool can perform every task unaided. If you're building a house, and I'm teaching you how to use a hammer, and you ask how to assemble rafters and trusses into gables, that's a legitimate question, but the answer has fallen out of the realm of "How do I use a hammer?" and into "How do I build a house?". In the same way, we'll see that C does not have built-in features to perform every function that we might ever need to do while programming.

As mentioned above, C imposes relatively few built-in ways of doing things on the programmer. Some common tasks, such as manipulating strings, allocating memory, and doing input/output (I/O), are performed by calling on library functions. Other tasks which you might want to do, such as creating or listing directories, or interacting with a mouse, or displaying windows or other user-interface elements, or doing color graphics, are not defined by the C language at all. You can do these things from a C program, of course, but you will be calling on services which are peculiar to your programming environment (compiler, processor, and operating system) and which are not defined by the C standard. Since this course is about portable C programming, it will also be steering clear of facilities not provided in all C environments.

Another aspect of C that's worth mentioning here is that it is, to put it bluntly, a bit dangerous. C does not, in general, try hard to protect a programmer from mistakes. If you write a piece of code which will (through some oversight of yours) do something wildly different from what you intended it to do, up to and including deleting your data or trashing your disk, and if it is possible for the compiler to compile it, it generally will. You won't get warnings of the form "Do you really mean to...?" or "Are you sure you really want to...?". C is often compared to a sharp knife: it can do a surgically precise job on some exacting task you have in mind, but it can also do a surgically precise job of cutting off your finger. It's up to you to use it carefully.

This aspect of C is very widely criticized; it is also used (justifiably) to argue that C is not a good teaching language. C aficionados love this aspect of C because it means that C does not try to protect them from themselves: when they know what they're doing, even if it's risky or obscure, they can do it. Students of C hate this aspect of C because it often seems as if the language is some kind of a conspiracy specifically designed to lead them into booby traps and "gotcha!"s.

This is another aspect of the language which it's fairly pointless to complain about. If you take care and pay attention, you can avoid many of the pitfalls. These notes will point out many of the obvious (and not so obvious) trouble spots.

1.1 A First Example


[This section corresponds to K&R Sec. 1.1]

The best way to learn programming is to dive right in and start writing real programs. This way, concepts which would otherwise seem abstract make sense, and the positive feedback you get from getting even a small program to work gives you a great incentive to improve it or write the next one.

Diving in with "real" programs right away has another advantage, if only pragmatic: if you're using a conventional compiler, you can't run a fragment of a program and see what it does; nothing will run until you have a complete (if tiny or trivial) program. You can't learn everything you'd need to write a complete program all at once, so you'll have to take some things "on faith" and parrot them in your first programs before you begin to understand them. (You can't learn to program just one expression or statement at a time any more than you can learn to speak a foreign language one word at a time. If all you know is a handful of words, you can't actually say anything: you also need to know something about the language's word order and grammar and sentence structure and declension of articles and verbs.)

Besides the occasional necessity to take things on faith, there is a more serious potential drawback of this "dive in and program" approach: it's a small step from learning-by-doing to learning-by-trial-and-error, and when you learn programming by trial-and-error, you can very easily learn many errors. When you're not sure whether something will work, or you're not even sure what you could use that might work, and you try something, and it does work, you do not have any guarantee that what you tried worked for the right reason. You might just have "learned" something that works only by accident or only on your compiler, and it may be very hard to un-learn it later, when it stops working.

Therefore, whenever you're not sure of something, be very careful before you go off and try it "just to see if it will work." Of course, you can never be absolutely sure that something is going to work before you try it, otherwise we'd never have to try things. But you should have an expectation that something is going to work before you try it, and if you can't predict how to do something or whether something would work and find yourself having to determine it experimentally, make a note in your mind that whatever you've just learned (based on the outcome of the experiment) is suspect.

The first example program in K&R is the first example program in any language: print or display a simple string, and exit. Here is my version of K&R's "hello, world" program:
#include <stdio.h>

int main()
{
    printf("Hello, world!\n");
    return 0;
}

(If you have a C compiler, the first thing to do is figure out how to type this program in and compile it and run it and see where its output went. If you don't have a C compiler yet, the first thing to do is to find one.)

The first line is practically boilerplate; it will appear in almost all programs we write. It asks that some definitions having to do with the "Standard I/O Library" be included in our program; these definitions are needed if we are to call the library function printf correctly.

The second line says that we are defining a function named main. Most of the time, we can name our functions anything we want, but the function name main is special: it is the function that will be "called" first when our program starts running. The empty pair of parentheses indicates that our main function accepts no arguments, that is, there isn't any information which needs to be passed in when the function is called.

The braces { and } surround a list of statements in C. Here, they surround the list of statements making up the function main.

The line
printf("Hello, world!\n");

is the first statement in the program. It asks that the function printf be called; printf is a library function which prints formatted output. The parentheses surround printf's argument list: the information which is handed to it which it should act on. The semicolon at the end of the line terminates the statement.

(printf's name reflects the fact that C was first developed when Teletypes and other printing terminals were still in widespread use. Today, of course, video displays are far more common. printf "prints" to the standard output, that is, to the default location for program output to go. Nowadays, that's almost always a video screen or a window on that screen. If you do have a printer, you'll typically have to do something extra to get a program to print to it.)

printf's first (and, in this case, only) argument is the string which it should print. The string, enclosed in double quotes "", consists of the words "Hello, world!" followed by a special sequence: \n. In strings, any two-character sequence beginning with the backslash \ represents a single special character. The sequence \n represents the "new line" character, which prints a carriage return or line feed or whatever it takes to end one line of output and move down to the next. (This program only prints one line of output, but it's still important to terminate it.)

The second line in the main function is

return 0;

In general, a function may return a value to its caller, and main is no exception. When main returns (that is, reaches its end and stops functioning), the program is at its end, and the return value from main tells the operating system (or whatever invoked the program that main is the main function of) whether it succeeded or not. By convention, a return value of 0 indicates success.

This program may look so absolutely trivial that it seems as if it's not even worth typing it in and trying to run it, but doing so may be a big (and is certainly a vital) first hurdle. On an unfamiliar computer, it can be arbitrarily difficult to figure out how to enter a text file containing program source, or how to compile and link it, or how to invoke it, or what happened after (if?) it ran. The most experienced C programmers immediately go back to this one, simple program whenever they're trying out a new system or a new way of entering or building programs or a new way of printing output from within programs. As Kernighan and Ritchie say, everything else is comparatively easy.

How you compile and run this (or any) program is a function of the compiler and operating system you're using. The first step is to type it in, exactly as shown; this may involve using a text editor to create a file containing the program text. You'll have to give the file a name, and all C compilers (that I've ever heard of) require that files containing C source end with the extension .c. So you might place the program text in a file called hello.c.

The second step is to compile the program. (Strictly speaking, compilation consists of two steps, compilation proper followed by linking, but we can overlook this distinction at first, especially because the compiler often takes care of initiating the linking step automatically.) On many Unix systems, the command to compile a C program from a source file hello.c is
cc -o hello hello.c

You would type this command at the Unix shell prompt, and it requests that the cc (C compiler) program be run, placing its output (i.e. the new executable program it creates) in the file hello, and taking its input (i.e. the source code to be compiled) from the file hello.c.

The third step is to run (execute, invoke) the newly-built hello program. Again on a Unix system, this is done simply by typing the program's name:
hello

Depending on how your system is set up (in particular, on whether the current directory is searched for executables, based on the PATH variable), you may have to type
./hello

to indicate that the hello program is in the current directory (as opposed to some "bin" directory full of executable programs, elsewhere).

You may also have your choice of C compilers. On many Unix machines, the cc command is an older compiler which does not recognize modern, ANSI Standard C syntax. An old compiler will accept the simple programs we'll be starting with, but it will not accept most of our later programs. If you find yourself getting baffling compilation errors on programs which you've typed in exactly as they're shown, it probably indicates that you're using an older compiler. On many machines, another compiler called acc or gcc is available, and you'll want to use it, instead. (Both acc and gcc are typically invoked the same as cc; that is, the above cc command would instead be typed, say, gcc -o hello hello.c .)

(One final caveat about Unix systems: don't name your test programs test, because there's already a standard command called test, and you and the command interpreter will get badly confused if you try to replace the system's test command with your own, not least because your own almost certainly does something completely different.)

Under MS-DOS, the compilation procedure is quite similar. The name of the command you type will depend on your compiler (e.g. cl for the Microsoft C compiler, tc or bcc for Borland's Turbo C, etc.). You may have to manually perform the second, linking step, perhaps with a command named link or tlink. The executable file which the compiler/linker creates will have a name ending in .exe (or perhaps .com), but you can still invoke it by typing the base name (e.g. hello). See your compiler documentation for complete details; one of the manuals should contain a demonstration of how to enter, compile, and run a small program that prints some simple output, just as we're trying to describe here.

In an integrated or "visual" programming environment, such as those on the Macintosh or under various versions of Microsoft Windows, the steps you take to enter, compile, and run a program are somewhat different (and, theoretically, simpler). Typically, there is a way to open a new source window, type source code into it, give it a file name, and add it to the program (or "project") you're building. If necessary, there will be a way to specify what other source files (or "modules") make up the program. Then, there's a button or menu selection which compiles and runs the program, all from within the programming environment. (There will also be a way to create a standalone executable file which you can run from outside the environment.) In a PC-compatible environment, you may have to choose between creating DOS programs or Windows programs. (If you have troubles pertaining to the printf function, try specifying a target environment of MS-DOS. Supposedly, some compilers which are targeted at Windows environments won't let you call printf, because until you call some fancier functions to request that a window be created, there's no window for printf to print to.) Again, check the introductory or tutorial manual that came with the programming package; it should walk you through the steps necessary to get your first program running.

1.2 Second Example


Our second example is of little more practical use than the first, but it introduces a few more programming language elements:
#include <stdio.h>

/* print a few numbers, to illustrate a simple loop */

int main()
{
    int i;

    for(i = 0; i < 10; i = i + 1)
        printf("i is %d\n", i);

    return 0;
}

As before, the line #include <stdio.h> is boilerplate which is necessary since we're calling the printf function, and main() and the pair of braces {} indicate and delineate the function named main we're (again) writing.

The first new line is the line
/* print a few numbers, to illustrate a simple loop */

which is a comment. Anything between the characters /* and */ is ignored by the compiler, but may be useful to a person trying to read and understand the program. You can add comments anywhere you want to in the program, to document what the program is, what it does, who wrote it, how it works, what the various functions are for and how they work, what the various variables are for, etc.

The second new line, down within the function main, is
int i;

which declares that our function will use a variable named i. The variable's type is int, which is a plain integer.

Next, we set up a loop:
for(i = 0; i < 10; i = i + 1)

The keyword for indicates that we are setting up a "for loop." A for loop is controlled by three expressions, enclosed in parentheses and separated by semicolons. These expressions say that, in this case, the loop starts by setting i to 0, that it continues as long as i is less than 10, and that after each iteration of the loop, i should be incremented by 1 (that is, have 1 added to its value).

Finally, we have a call to the printf function, as before, but with several differences. First, the call to printf is within the body of the for loop. This means that control flow does not pass once through the printf call, but instead that the call is performed as many times as are dictated by the for loop. In this case, printf will be called several times: once when i is 0, once when i is 1, once when i is 2, and so on until i is 9, for a total of 10 times.

A second difference in the printf call is that the string to be printed, "i is %d", contains a percent sign. Whenever printf sees a percent sign, it indicates that printf is not supposed to print the exact text of the string, but is instead supposed to read another one of its arguments to decide what to print. The letter after the percent sign tells it what type of argument to expect and how to print it. In this case, the letter d indicates that printf is to expect an int, and to print it in decimal. Finally, we see that printf is in fact being called with another argument, for a total of two, separated by commas. The second argument is the variable i, which is in fact an int, as required by %d. The effect of all of this is that each time it is called, printf will print a line containing the current value of the variable i:
i is 0
i is 1
i is 2
...

After several trips through the loop, i will eventually equal 9. After that trip through the loop, the third control expression i = i + 1 will increment its value to 10. The condition i < 10 is no longer true, so no more trips through the loop are taken. Instead, control flow jumps down to the statement following the for loop, which is the return statement. The main function returns, and the program is finished.

1.3 Program Structure


We'll have more to say later about program structure, but for now let's observe a few basics. A program consists of one or more functions; it may also contain global variables. (Our two example programs so far have contained one function apiece, and no global variables.) At the top of a source file are typically a few boilerplate lines such as #include <stdio.h>, followed by the definitions (i.e. code) for the functions. (It's also possible to split up the several functions making up a larger program into several source files, as we'll see in a later chapter.)

Each function is further composed of declarations and statements, in that order. When a sequence of statements should act as one (for example, when they should all serve together as the body of a loop) they can be enclosed in braces (just as for the outer body of the entire function). The simplest kind of statement is an expression statement, which is an expression (presumably performing some useful operation) followed by a semicolon. Expressions are further composed of operators, objects (variables), and constants.

C source code consists of several lexical elements. Some are words, such as for, return, main, and i, which are either keywords of the language (for, return) or identifiers (names) we've chosen for our own functions and variables (main, i). There are constants such as 1 and 10 which introduce new values into the program. There are operators such as =, +, and >, which manipulate variables and values. There are other punctuation characters (often called delimiters), such as parentheses and squiggly braces {}, which indicate how the other elements of the program are grouped. Finally, all of the preceding elements can be separated by whitespace: spaces, tabs, and the "carriage returns" between lines.

The source code for a C program is, for the most part, "free form." This means that the compiler does not care how the code is arranged: how it is broken into lines, how the lines are indented, or whether whitespace is used between things like variable names and other punctuation. (Lines like #include <stdio.h> are an exception; they must appear alone on their own lines, generally unbroken. Only lines beginning with # are affected by this rule; we'll see other examples later.) You can use whitespace, indentation, and appropriate line breaks to make your programs more readable for yourself and other people (even though the compiler doesn't care). You can place explanatory comments anywhere in your program: any text between the characters /* and */ is ignored by the compiler. (In fact, the compiler pretends that all it saw was whitespace.) Though comments are ignored by the compiler, well-chosen comments can make a program much easier to read (for its author, as well as for others).

The usage of whitespace is our first style issue. It's typical to leave a blank line between different parts of the program, to leave a space on either side of operators such as + and =, and to indent the bodies of loops and other control flow constructs. Typically, we arrange the indentation so that the subsidiary statements controlled by a loop statement (the "loop body," such as the printf call in our second example program) are all aligned with each other and placed one tab stop (or some consistent number of spaces) to the right of the controlling statement. This indentation (like all whitespace) is not required by the compiler, but it makes programs much easier to read. (However, it can also be misleading, if used incorrectly or in the face of inadvertent mistakes. The compiler will decide what "the body of the loop" is based on its own rules, not the indentation, so if the indentation does not match the compiler's interpretation, confusion is inevitable.)

To drive home the point that the compiler doesn't care about indentation, line breaks, or other whitespace, here are a few (extreme) examples: The fragments
for(i = 0; i < 10; i = i + 1)
    printf("%d\n", i);

and

for(i = 0; i < 10; i = i + 1) printf("%d\n", i);

and

for(i=0;i<10;i=i+1)printf("%d\n",i);

and

    for(i = 0; i < 10; i = i + 1)
printf("%d\n", i);

and

for      (    i
   =   0   ;
i   <   10
;   i   =   i   +   1
)   printf   (
"%d\n"   ,   i
)   ;

and

for
(i=0;
i<10;i=
i+1)printf
("%d\n", i);

are all treated exactly the same way by the compiler.

Some programmers argue forever over the best set of "rules" for indentation and other aspects of programming style, calling to mind the old philosopher's debates about the number of angels that could dance on the head of a pin. Style issues (such as how a program is laid out) are important, but they're not something to be too dogmatic about, and there are also other, deeper style issues besides mere layout and typography. Kernighan and Ritchie take a fairly moderate stance:

    Although C compilers do not care about how a program looks, proper indentation and spacing are critical in making programs easy for people to read. We recommend writing only one statement per line, and using blanks around operators to clarify grouping. The position of braces is less important, although people hold passionate beliefs. We have chosen one of several popular styles. Pick a style that suits you, then use it consistently.

There is some value in having a reasonably standard style (or a few standard styles) for code layout. Please don't take the above advice to "pick a style that suits you" as an invitation to invent your own brand-new style. If (perhaps after you've been programming in C for a while) you have specific objections to specific facets of existing styles, you're welcome to modify them, but if you don't have any particular leanings, you're probably best off copying an existing style at first. (If you want to place your own stamp of originality on the programs that you write, there are better avenues for your creativity than inventing a bizarre layout; you might instead try to make the logic easier to follow, or the user interface easier to use, or the code freer of bugs.)

Chapter 2: Basic Data Types and Operators


The type of a variable determines what kinds of values it may take on. An operator computes new values out of old ones. An expression consists of variables, constants, and operators combined to perform some useful computation. In this chapter, we'll learn about C's basic types, how to write constants and declare variables of these types, and what the basic operators are.

As Kernighan and Ritchie say, "The type of an object determines the set of values it can have and what operations can be performed on it." This is a fairly formal, mathematical definition of what a type is, but it is traditional (and meaningful). There are several implications to remember:

1. The "set of values" is finite. C's int type cannot represent all of the integers; its float type cannot represent all floating-point numbers.
2. When you're using an object (that is, a variable) of some type, you may have to remember what values it can take on and what operations you can perform on it. For example, there are several operators which play with the binary (bit-level) representation of integers, but these operators are not meaningful for and may not be applied to floating-point operands.
3. When declaring a new variable and picking a type for it, you have to keep in mind the values and operations you'll be needing.

In other words, picking a type for a variable is not some abstract academic exercise; it's closely connected to the way(s) you'll be using that variable.

2.1 Types
[This section corresponds to K&R Sec. 2.2]

There are only a few basic data types in C. The first ones we'll be encountering and using are:

char        a character
int         an integer, in the range -32,767 to 32,767
long int    a larger integer (up to +-2,147,483,647)
float       a floating-point number
double      a floating-point number, with more precision and perhaps greater range than float

If you can look at this list of basic types and say to yourself, "Oh, how simple, there are only a few types, I won't have to worry much about choosing among them," you'll have an easy time with declarations. (Some masochists wish that the type system were more complicated so that they could specify more things about each variable, but those of us who would rather not have to specify these extra things each time are glad that we don't have to.)

The ranges listed above for types int and long int are the guaranteed minimum ranges. On some systems, either of these types (or, indeed, any C type) may be able to hold larger values, but a program that depends on extended ranges will not be as portable. Some programmers become obsessed with knowing exactly what the sizes of data objects will be in various situations, and go on to write programs which depend on these exact sizes. Determining or controlling the size of an object is occasionally important, but most of the time we can sidestep size issues and let the compiler do most of the worrying.

(From the ranges listed above, we can determine that type int must be at least 16 bits, and that type long int must be at least 32 bits. But neither of these sizes is exact; many systems have 32-bit ints, and some systems have 64-bit long ints.)

You might wonder how the computer stores characters. The answer involves a character set, which is simply a mapping between some set of characters and some set of small numeric codes. Most machines today use the ASCII character set, in which the letter A is represented by the code 65, the ampersand & is represented by the code 38, the digit 1 is represented by the code 49, the space character is represented by the code 32, etc. (Most of the time, of course, you have no need to know or even worry about these particular code values; they're automatically translated into the right shapes on the screen or printer when characters are printed out, and they're automatically generated when you type characters on the keyboard. Eventually, though, we'll appreciate, and even take some control over, exactly when these translations, from characters to their numeric codes, are performed.) Character codes are usually small: the largest printable code value in ASCII is 126, which is the ~ (tilde) character. Characters usually fit in a byte, which is usually 8 bits. In C, type char is defined as occupying one byte, so it is usually 8 bits.

Most of the simple variables in most programs are of types int, long int, or double. Typically, we'll use int and double for most purposes, and long int any time we need to hold integer values greater than 32,767. As we'll see, even when we're manipulating individual characters, we'll usually use an int variable, for reasons to be discussed later. Therefore, we'll rarely use individual variables of type char, although we'll use plenty of arrays of char.

2.2 Constants
[This section corresponds to K&R Sec. 2.3]

A constant is just an immediate, absolute value found in an expression. The simplest constants are decimal integers, e.g. 0, 1, 2, 123. Occasionally it is useful to specify constants in base 8 or base 16 (octal or hexadecimal); this is done by prefixing an extra 0 (zero) for octal, or 0x for hexadecimal: the constants 100, 0144, and 0x64 all represent the same number. (If you're not using these non-decimal constants, just remember not to use any leading zeroes. If you accidentally write 0123 intending to get one hundred and twenty three, you'll get 83 instead, which is 123 base 8.) We write constants in decimal, octal, or hexadecimal for our convenience, not the compiler's. The compiler doesn't care; it always converts everything into binary internally, anyway. (There is, however, no good way to specify constants in source code in binary.)

A constant can be forced to be of type long int by suffixing it with the letter L (in upper or lower case, although upper case is strongly recommended, because a lower case l looks too much like the digit 1).

A constant that contains a decimal point or the letter e (or both) is a floating-point constant: 3.14, 10., .01, 123e4, 123.456e7. The e indicates multiplication by a power of 10; 123.456e7 is 123.456 times 10 to the 7th, or 1,234,560,000. (Floating-point constants are of type double by default.)

We also have constants for specifying characters and strings. (Make sure you understand the difference between a character and a string: a character is exactly one character; a string is a set of zero or more characters; a string containing one character is distinct from a lone character.) A character constant is simply a single character between single quotes: 'A', '.', '%'. The numeric value of a character constant is, naturally enough, that character's value in the machine's character set. (In ASCII, for example, 'A' has the value 65.)

A string is represented in C as a sequence or array of characters. (We'll have more to say about arrays in general, and strings in particular, later.) A string constant is a sequence of zero or more characters enclosed in double quotes: "apple", "hello, world", "this is a test".

Within character and string constants, the backslash character \ is special, and is used to represent characters not easily typed on the keyboard or for various reasons not easily typed in constants. The most common of these "character escapes" are:
\n \$ \r \4 \" \\ a a a a a a 66newline44 character $ac&space carria%e return (without a line feed) sin%le 7uote (e.%. in a character constant) dou$le 7uote (e.%. in a strin% constant) sin%le $ac&slash

For example, "he said \"hi\"" is a string constant which contains two double quotes, and '\'' is a character constant consisting of a (single) single quote. Notice once again that the character constant 'A' is very different from the string constant "A".
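The points about constants can be checked with a small sketch. The function names here are made up for illustration, and the value returned for 'A' assumes a machine that uses ASCII.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if decimal 100, octal 0144,
   and hexadecimal 0x64 all denote the same number. */
int same_hundred(void)
{
    return 100 == 0144 && 0144 == 0x64;
}

/* A leading zero makes a constant octal, so 0123 is 83, not 123. */
int leading_zero_value(void)
{
    return 0123;
}

/* A character constant's value is its code in the machine's
   character set (65 for 'A', assuming ASCII). */
int value_of_A(void)
{
    return 'A';
}

/* The escapes \" each stand for one double quote character,
   so this string holds 12 characters in all. */
int quoted_length(void)
{
    return (int)strlen("he said \"hi\"");
}
```

Each function returns a plain int, so the results are easy to print or compare when experimenting.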

2.3 Declarations
[This section corresponds to K&R Sec. 2.4]

Informally, a variable (also called an object) is a place you can store a value. So that you can refer to it unambiguously, a variable needs a name. You can think of the variables in your program as a set of boxes or cubbyholes, each with a label giving its name; you might imagine that storing a value "in" a variable consists of writing the value on a slip of paper and placing it in the cubbyhole.

A declaration tells the compiler the name and type of a variable you'll be using in your program. In its simplest form, a declaration consists of the type, the name of the variable, and a terminating semicolon:
char c;
int i;
float f;

You can also declare several variables of the same type in one declaration, separating them with commas:

int i1, i2;

Later we'll see that declarations may also contain initializers, qualifiers and storage classes, and that we can declare arrays, functions, pointers, and other kinds of data structures.

The placement of declarations is significant. You can't place them just anywhere (i.e. they cannot be interspersed with the other statements in your program). They must either be placed at the beginning of a function, or at the beginning of a brace-enclosed block of statements (which we'll learn about in the next chapter), or outside of any function. Furthermore, the placement of a declaration, as well as its storage class, controls several things about its visibility and lifetime, as we'll see later.

You may wonder why variables must be declared before use. There are two reasons:

1. It makes things somewhat easier on the compiler; it knows right away what kind of storage to allocate and what code to emit to store and manipulate each variable; it doesn't have to try to intuit the programmer's intentions.
2. It forces a bit of useful discipline on the programmer: you cannot introduce variables willy-nilly; you must think about them enough to pick appropriate types for them. (The compiler's error messages to you, telling you that you apparently forgot to declare a variable, are as often helpful as they are a nuisance: they're helpful when they tell you that you misspelled a variable, or forgot to think about exactly how you were going to use it.)

Although there are a few places where declarations can be omitted (in which case the compiler will assume an implicit declaration), making use of these removes the advantages of reason 2 above, so I recommend always declaring everything explicitly.

Most of the time, I recommend writing one declaration per line. For the most part, the compiler doesn't care what order declarations are in. You can order the declarations alphabetically, or in the order that they're used, or so that related declarations sit next to each other. Collecting all variables of the same type together on one line essentially orders declarations by type, which isn't a very useful order (it's only slightly more useful than random order).

A declaration for a variable can also contain an initial value. This initializer consists of an equals sign and an expression, which is usually a single constant:

int i = 1;
int i1 = 10, i2 = 20;
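As a minimal sketch of these rules, the hypothetical function below (its name is not from the notes) places all of its declarations at the beginning of the function, some with initializers and one without.

```c
/* Hypothetical example: declarations grouped at the top of a function. */
int declared_total(void)
{
    int i = 1;              /* one declaration, with an initializer */
    int i1 = 10, i2 = 20;   /* two variables of the same type in one declaration */
    int total;              /* no initializer: no useful value until assigned */

    total = i + i1 + i2;    /* statements come after the declarations */
    return total;
}
```

With the initial values shown, the function returns 31.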

2.4 Variable Names


[This section corresponds to K&R Sec. 2.1]

Within limits, you can give your variables and functions any names you want. These names (the formal term is "identifiers") consist of letters, numbers, and underscores. For our purposes, names must begin with a letter. Theoretically, names can be as long as you want, but extremely long ones get tedious to type after a while, and the compiler is not required to keep track of extremely long ones perfectly. (What this means is that if you were to name a variable, say, supercalafragalisticespialidocious, the compiler might get lazy and pretend that you'd named it supercalafragalisticespialidocio, such that if you later misspelled it supercalafragalisticespialidociouz, the compiler wouldn't catch your mistake. Nor would the compiler necessarily be able to tell the difference if for some perverse reason you deliberately declared a second variable named supercalafragalisticespialidociouz.)

The capitalization of names in C is significant: the variable names variable, Variable, and VARIABLE (as well as silly combinations like variAble) are all distinct.

A final restriction on names is that you may not use keywords (the words such as int and for which are part of the syntax of the language) as the names of variables or functions (or as identifiers of any kind).
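A short sketch shows that capitalization matters. The function name is invented for illustration, and declaring names that differ only in case is done here purely to make the point, not as recommended style.

```c
/* Hypothetical example: three distinct variables whose names
   differ only in capitalization. */
int distinct_names(void)
{
    int variable = 1;
    int Variable = 2;
    int VARIABLE = 3;

    /* Each name refers to its own storage, so the sum is 1 + 2 + 3. */
    return variable + Variable + VARIABLE;
}
```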

2.5 Arithmetic Operators


[This section corresponds to K&R Sec. 2.5]

The basic operators for performing arithmetic are the same in many computer languages:

+    addition
-    subtraction
*    multiplication
/    division
%    modulus (remainder)

The - operator can be used in two ways: to subtract two numbers (as in a - b), or to negate one number (as in -a + b or a + -b).

When applied to integers, the division operator / discards any remainder, so 1 / 2 is 0 and 7 / 4 is 1. But when either operand is a floating-point quantity (type float or double), the division operator yields a floating-point result, with a potentially nonzero fractional part. So 1 / 2.0 is 0.5, and 7.0 / 4.0 is 1.75.

The modulus operator % gives you the remainder when two integers are divided: 1 % 2 is 1; 7 % 4 is 3. (The modulus operator can only be applied to integers.)

An additional arithmetic operation you might be wondering about is exponentiation. Some languages have an exponentiation operator (typically ^ or **), but C doesn't. (To square or cube a number, just multiply it by itself.)

Multiplication, division, and modulus all have higher precedence than addition and subtraction. The term "precedence" refers to how "tightly" operators bind to their operands (that is, to the things they operate on). In mathematics, multiplication has higher precedence than addition, so 1 + 2 * 3 is 7, not 9. In other words, 1 + 2 * 3 is equivalent to 1 + (2 * 3). C is the same way.

All of these operators "group" from left to right, which means that when two or more of them have the same precedence and participate next to each other in an expression, the evaluation conceptually proceeds from left to right. For example, 1 - 2 - 3 is equivalent to (1 - 2) - 3 and gives -4, not +2. ("Grouping" is sometimes called associativity, although the term is used somewhat differently in programming than it is in mathematics. Not all C operators group from left to right; a few group from right to left.)

Whenever the default precedence or associativity doesn't give you the grouping you want, you can always use explicit parentheses. For example, if you wanted to add 1 to 2 and then multiply the result by 3, you could write (1 + 2) * 3.

By the way, the word "arithmetic" as used in the title of this section is an adjective, not a noun, and it's pronounced differently than the noun: the accent is on the third syllable.
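The arithmetic rules above can be tried out with a few small functions. This is only a sketch, and every function name in it is hypothetical.

```c
/* Integer division discards any remainder. */
int int_quotient(int a, int b)
{
    return a / b;
}

/* The modulus operator gives the remainder of an integer division. */
int int_remainder(int a, int b)
{
    return a % b;
}

/* With floating-point operands, division keeps the fractional part. */
double real_quotient(double a, double b)
{
    return a / b;
}

/* Multiplication binds more tightly than addition: 1 + (2 * 3). */
int default_precedence(void)
{
    return 1 + 2 * 3;
}

/* Explicit parentheses override the default precedence. */
int forced_grouping(void)
{
    return (1 + 2) * 3;
}

/* Subtraction groups from left to right: (1 - 2) - 3. */
int left_to_right(void)
{
    return 1 - 2 - 3;
}
```

Calling int_quotient(7, 4) gives 1 while real_quotient(7.0, 4.0) gives 1.75, which is the integer versus floating-point distinction described in the text.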

2.6 Assignment Operators


[This section corresponds to K&R Sec. 2.10]

The assignment operator = assigns a value to a variable. For example,

x = 1

sets x to 1, and

a = b

sets a to whatever b's value is. The expression

i = i + 1

is, as we've mentioned elsewhere, the standard programming idiom for increasing a variable's value by 1: this expression takes i's old value, adds 1 to it, and stores it back into i. (C provides several "shortcut" operators for modifying variables in this and similar ways, which we'll meet later.)

We've called the = sign the "assignment operator" and referred to "assignment expressions" because, in fact, = is an operator just like + or -. C does not have "assignment statements"; instead, an assignment like a = b is an expression and can be used wherever any expression can appear. Since it's an expression, the assignment a = b has a value, namely, the same value that's assigned to a. This value can then be used in a larger expression; for example, we might write

c = a = b

which is equivalent to

c = (a = b)

and assigns b's value to both a and c. (The assignment operator, therefore, groups from right to left.) Later we'll see other circumstances in which it can be useful to use the value of an assignment expression.

It's usually a matter of style whether you initialize a variable with an initializer in its declaration or with an assignment expression near where you first use it. That is, there's no particular difference between

int a = 10;

and

int a;

/* later... */

a = 10;
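A brief sketch of assignment used as an expression, and of the increment idiom, follows. Both function names are hypothetical.

```c
/* Hypothetical example: chained assignment gives both a and c
   the value of b, so the sum returned is twice b. */
int sum_after_chain(int b)
{
    int a;
    int c;

    c = a = b;      /* parsed as c = (a = b) */
    return a + c;
}

/* Takes i's old value, adds 1 to it, and stores the result back into i. */
int incremented(int i)
{
    i = i + 1;
    return i;
}
```

For instance, sum_after_chain(5) returns 10, because a and c both end up holding 5.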

2.7 Function Calls


We'll have much more to say about functions in a later chapter, but for now let's just look at how they're called. (To review: a function is a piece of code, written by you or by someone else, which performs some useful, compartmentalizable task.) You call a function by mentioning its name followed by a pair of parentheses. If the function takes any arguments, you place the arguments between the parentheses, separated by commas. These are all function calls:

printf("Hello, world!\n")
printf("%d\n", i)
sqrt(144.)
getchar()

The arguments to a function can be arbitrary expressions. Therefore, you don't have to say things like

int sum = a + b + c;
printf("sum = %d\n", sum);

if you don't want to; you can instead collapse it to

printf("sum = %d\n", a + b + c);

Many functions return values, and when they do, you can embed calls to these functions within larger expressions:

c = sqrt(a * a + b * b)
x = r * cos(theta)
i = f1(f2(j))

The first expression squares a and b, computes the square root of the sum of the squares, and assigns the result to c. (In other words, it computes a * a + b * b, passes that number to the sqrt function, and assigns sqrt's return value to c.) The second expression passes the value of the variable theta to the cos (cosine) function, multiplies the result by r, and assigns the result to x. The third expression passes the value of the variable j to the function f2, passes the return value of f2 immediately to the function f1, and finally assigns f1's return value to the variable i.
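To make the nested call concrete, here is a sketch in which f1 and f2 are given arbitrary, made-up definitions. The notes do not say what these functions do, so the bodies below are assumptions chosen only so that the result is easy to follow.

```c
/* Assumed definition for illustration: f2 squares its argument. */
int f2(int j)
{
    return j * j;
}

/* Assumed definition for illustration: f1 adds 1 to its argument. */
int f1(int n)
{
    return n + 1;
}

/* f2 is called first; its return value is passed straight to f1,
   and f1's return value is assigned to i. */
int nested_call(int j)
{
    int i;

    i = f1(f2(j));
    return i;
}

/* An argument can be an arbitrary expression, here a + b + c. */
int call_with_expression(int a, int b, int c)
{
    return f1(a + b + c);
}
```

With these definitions, nested_call(3) computes f2(3), which is 9, and then f1(9), which is 10.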

Chapter 3: Statements and Control Flow


Statements are the "steps" of a program. Most statements compute and assign values or call functions, but we will eventually meet several other kinds of statements as well. By default, statements are executed in sequence, one after another. We can, however, modify that sequence by using control flow constructs which arrange that a statement or group of statements is executed only if some condition is true or false, or executed over and over again to form a loop. (A somewhat different kind of control flow happens when we call a function: execution of the caller is suspended while the called function proceeds. We'll discuss functions in chapter 5.)

My definitions of the terms statement and control flow are somewhat circular. A statement is an element within a program which you can apply control flow to; control flow is how you specify the order in which the statements in your program are executed. (A weaker definition of a statement might be "a part of your program that does something," but this definition could as easily be applied to expressions or functions.)

3.1 Expression Statements


[This section corresponds to K&R Sec. 3.1]

Most of the statements in a C program are expression statements. An expression statement is simply an expression followed by a semicolon. The lines

i = 0;

i = i + 1;

and

printf("Hello, world!\n");

are all expression statements. (In some languages, such as Pascal, the semicolon separates statements, such that the last statement is not followed by a semicolon. In C, however, the semicolon is a statement terminator; all simple statements are followed by semicolons. The semicolon is also used for a few other things in C; we've already seen that it terminates declarations, too.)

Expression statements do all of the real work in a C program. Whenever you need to compute new values for variables, you'll typically use expression statements (and they'll typically contain assignment operators). Whenever you want your program to do something visible, in the real world, you'll typically call a function (as part of an expression statement). We've already seen the most basic example: calling the function printf to print text to the screen. But anything else you might do (read or write a disk file, talk to a modem or printer, draw pictures on the screen) will also involve function calls. (Furthermore, the functions you call to do these things are usually different depending on which operating system you're using. The C language does not define them, so we won't be talking about or using them much.)

Expressions and expression statements can be arbitrarily complicated. They don't have to consist of exactly one simple function call, or of one simple assignment to a variable. For one thing, many functions return values, and the values they return can then be used by other parts of the expression. For example, C provides a sqrt (square root) function, which we might use to compute the hypotenuse of a right triangle like this:

c = sqrt(a*a + b*b);

To be useful, an expression statement must do something; it must have some lasting effect on the state of the program. (Formally, a useful statement must have at least one side effect.) The first two sample expression statements in this section (above) assign new values to the variable i, and the third one calls printf to print something out, and these are good examples of statements that do something useful.

(To make the distinction clear, we may note that degenerate constructions such as

0;

i;

or

i + 1;

are syntactically valid statements, in that they consist of an expression followed by a semicolon, but in each case, they compute a value without doing anything with it, so the computed value is discarded, and the statement is useless. But if the "degenerate" statements in this paragraph don't make much sense to you, don't worry; it's because they, frankly, don't make much sense.)

It's also possible for a single expression to have multiple side effects, but it's easy for such an expression to be (a) confusing or (b) undefined. For now, we'll only be looking at expressions (and, therefore, statements) which do one well-defined thing at a time.

3.2 if Statements
[This section corresponds to K&R Sec. 3.2]

The simplest way to modify the control flow of a program is with an if statement, which in its simplest form looks like this:

if(x > max)
    max = x;

Even if you didn't know any C, it would probably be pretty obvious that what happens here is that if x is greater than max, x gets assigned to max. (We'd use code like this to keep track of the maximum value of x we'd seen: for each new x, we'd compare it to the old maximum value max, and if the new value was greater, we'd update max.)

More generally, we can say that the syntax of an if statement is:

if( expression )
    statement

where expression is any expression and statement is any statement.

What if you have a series of statements, all of which should be executed together or not at all depending on whether some condition is true? The answer is that you enclose them in braces:

if( expression )
    {
    statement1
    statement2
    statement3
    }

As a general rule, anywhere the syntax of C calls for a statement, you may write a series of statements enclosed by braces. (You do not need to, and should not, put a semicolon after the closing brace, because the series of statements enclosed by braces is not itself a simple expression statement.)

An if statement may also optionally contain a second statement, the "else clause," which is to be executed if the condition is not met. Here is an example:

if(n > 0)
    average = sum / n;
else
    {
    printf("can't compute average\n");
    average = 0;
    }

The first statement or block of statements is executed if the condition is true, and the second statement or block of statements (following the keyword else) is executed if the condition is not true. In this example, we can compute a meaningful average only if n is greater than 0; otherwise, we print a message saying that we cannot compute the average. The general syntax of an if statement is therefore

if( expression )
    statement1
else
    statement2

(where both statement1 and statement2 may be lists of statements enclosed in braces).

It's also possible to nest one if statement inside another. (For that matter, it's in general possible to nest any kind of statement or control flow construct within another.) For example, here is a little piece of code which decides roughly which quadrant of the compass you're walking into, based on an x value which is positive if you're walking east, and a y value which is positive if you're walking north:
if(x > 0)
    {
    if(y > 0)
        printf("Northeast.\n");
    else
        printf("Southeast.\n");
    }
else
    {
    if(y > 0)
        printf("Northwest.\n");
    else
        printf("Southwest.\n");
    }

When you have one if statement (or loop) nested inside another, it's a very good idea to use explicit braces {}, as shown, to make it clear (both to you and to the compiler) how they're nested and which else goes with which if. It's also a good idea to indent the various levels, also as shown, to make the code more readable to humans. Why do both? You use indentation to make the code visually more readable to yourself and other humans, but the compiler doesn't pay attention to the indentation (since all whitespace is essentially equivalent and is essentially ignored). Therefore, you also have to make sure that the punctuation is right.

Here is an example of another common arrangement of if and else. Suppose we have a variable grade containing a student's numeric grade, and we want to print out the corresponding letter grade. Here is code that would do the job:
if(grade >= 90)
    printf("A");
else if(grade >= 80)
    printf("B");
else if(grade >= 70)
    printf("C");
else if(grade >= 60)
    printf("D");
else
    printf("F");

What happens here is that exactly one of the five printf calls is executed, depending on which of the conditions is true. Each condition is tested in turn, and if one is true, the corresponding statement is executed, and the rest are skipped. If none of the conditions is true, we fall through to the last one, printing "F".

In the cascaded if/else/if/else/... chain, each else clause is another if statement. This may be more obvious at first if we reformat the example, including every set of braces and indenting each if statement relative to the previous one:

if(grade >= 90)
    {
    printf("A");
    }
else
    {
    if(grade >= 80)
        {
        printf("B");
        }
    else
        {
        if(grade >= 70)
            {
            printf("C");
            }
        else
            {
            if(grade >= 60)
                {
                printf("D");
                }
            else
                {
                printf("F");
                }
            }
        }
    }

By examining the code this way, it should be obvious that exactly one of the printf calls is executed, and that whenever one of the conditions is found true, the remaining conditions do not need to be checked and none of the later statements within the chain will be executed. But once you've convinced yourself of this and learned to recognize the idiom, it's generally preferable to arrange the statements as in the first example, without trying to indent each successive if statement one tabstop further out. (Obviously, you'd run into the right margin very quickly if the chain had just a few more cases!)
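The cascaded chain can be packaged as a function so that its behavior at the boundaries is easy to check. This is a sketch only: the names letter_grade and safe_average are made up, and returning the letter instead of printing it is a choice made here for convenience, not something the notes prescribe.

```c
/* Hypothetical helper: maps a numeric grade to a letter using the
   cascaded if/else chain. Exactly one return statement is executed. */
char letter_grade(int grade)
{
    if(grade >= 90)
        return 'A';
    else if(grade >= 80)
        return 'B';
    else if(grade >= 70)
        return 'C';
    else if(grade >= 60)
        return 'D';
    else
        return 'F';
}

/* Hypothetical helper: computes an integer average only when n is
   greater than 0, and falls back to 0 otherwise. */
int safe_average(int sum, int n)
{
    int average;

    if(n > 0)
        average = sum / n;
    else
        {
        average = 0;
        }
    return average;
}
```

Because the conditions are tested in order, a grade of exactly 90 yields 'A' and a grade of 89 falls through to the second test and yields 'B'.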

3.3 Boolean Expressions


An if statement like

if(x > max)
    max = x;

is perhaps deceptively simple. Conceptually, we say that it checks whether the condition x > max is "true" or "false". The mechanics underlying C's conception of "true" and "false," however, deserve some explanation. We need to understand how true and false values are represented, and how they are interpreted by statements like if.

As far as C is concerned, a true/false condition can be represented as an integer. (An integer can represent many values; here we care about only two values: "true" and "false." The study of mathematics involving only two values is called Boolean algebra, after George Boole, a mathematician who refined this study.) In C, "false" is represented by a value of 0 (zero), and "true" is represented by any value that is nonzero. Since there are many nonzero values (at least 65,534, for values of type int), when we have to pick a specific value for "true," we'll pick 1.

The relational operators such as <, <=, >, and >= are in fact operators, just like +, -, *, and /. The relational operators take two values, look at them, and "return" a value of 1 or 0 depending on whether the tested relation was true or false. The complete set of relational operators in C is:

<     less than
<=    less than or equal
>     greater than
>=    greater than or equal
==    equal
!=    not equal

For example, 1 < 2 is 1, 3 > 4 is 0, 5 == 5 is 1, and 6 != 6 is 0.

We've now encountered perhaps the most easy-to-stumble-on "gotcha!" in C: the equality-testing operator is ==, not a single =, which is assignment. If you accidentally write

if(a = 0)

(and you probably will at some point; everybody makes this mistake), it will not test whether a is zero, as you probably intended. Instead, it will assign 0 to a, and then perform the "true" branch of the if statement if a is nonzero. But a will have just been assigned the value 0, so the "true" branch will never be taken! (This could drive you crazy while debugging: you wanted to do something if a was 0, and after the test, a is 0, whether it was supposed to be or not, but the "true" branch is nevertheless not taken.)

The relational operators work with arbitrary numbers and generate true/false values. You can also combine true/false values by using the Boolean operators, which take true/false values as operands and compute new true/false values. The three Boolean operators are:

&&    and
||    or
!     not (takes one operand; "unary")

The && ("and") operator takes two true/false values and produces a true (1) result if both operands are true (that is, if the left-hand side is true and the right-hand side is true). The || ("or") operator takes two true/false values and produces a true (1) result if either operand is true. The ! ("not") operator takes a single true/false value and negates it, turning false to true and true to false (0 to 1 and nonzero to 0).

For example, to test whether the variable i lies between 1 and 10, you might use

if(1 < i && i < 10)
    ...

Here we're expressing the relation "i is between 1 and 10" as "1 is less than i and i is less than 10."

It's important to understand why the more obvious expression

if(1 < i < 10)    /* WRONG */

would not work. The expression 1 < i < 10 is parsed by the compiler analogously to 1 + i + 10. The expression 1 + i + 10 is parsed as (1 + i) + 10 and means "add 1 to i, and then add the result to 10." Similarly, the expression 1 < i < 10 is parsed as (1 < i) < 10 and means "see if 1 is less than i, and then see if the result is less than 10." But in this case, "the result" is 1 or 0, depending on whether i is greater than 1. Since both 0 and 1 are less than 10, the expression 1 < i < 10 would always be true in C, regardless of the value of i!

Relational and Boolean expressions are usually used in contexts such as an if statement, where something is to be done or not done depending on some condition. In these cases what's actually checked is whether the expression representing the condition has a zero or nonzero value. As long as the expression is a relational or Boolean expression, the interpretation is just what we want. For example, when we wrote

if(x > max)

the > operator produced a 1 if x was greater than max, and a 0 otherwise. The if statement interprets 0 as false and 1 (or any nonzero value) as true.

But what if the expression is not a relational or Boolean expression? As far as C is concerned, the controlling expression (of conditional statements like if) can in fact be any expression: it doesn't have to "look like" a Boolean expression; it doesn't have to contain relational or logical operators. All C looks at (when it's evaluating an if statement, or anywhere else where it needs a true/false value) is whether the expression evaluates to 0 or nonzero. For example, if you have a variable x, and you want to do something if x is nonzero, it's possible to write

if(x)
    statement

and the statement will be executed if x is nonzero (since nonzero means "true").
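These rules are easy to confirm with a few small functions. The sketch below uses invented names; the mistaken forms are written with extra parentheses, which spell out how the compiler parses them and may also keep some compilers from warning about them.

```c
/* Correct range test: 1 is less than i AND i is less than 10. */
int between_right(int i)
{
    return 1 < i && i < 10;
}

/* What 1 < i < 10 actually means: (1 < i) is 0 or 1,
   and both of those are less than 10. */
int between_wrong(int i)
{
    return (1 < i) < 10;
}

/* The = gotcha: this assigns 0 to a and then tests the result,
   so the "true" branch is never taken. */
int assign_instead_of_compare(int a)
{
    if((a = 0))
        return 1;
    return 0;
}

/* Any nonzero value counts as "true" in an if statement. */
int is_true(int x)
{
    if(x)
        return 1;
    else
        return 0;
}
```

Note that between_wrong(50) returns 1 even though 50 is nowhere near the range, while between_right(50) returns 0.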


This possibility (that the controlling expression of an if statement doesn't have to "look like" a Boolean expression) is both useful and potentially confusing. It's useful when you have a variable or a function that is "conceptually Boolean," that is, one that you consider to hold a true or false (actually nonzero or zero) value. For example, if you have a variable verbose which contains a nonzero value when your program should run in verbose mode and zero when it should be quiet, you can write things like

if(verbose)
    printf("Starting first pass\n");

and this code is both legal and readable, besides which it does what you want. The standard library contains a function isupper() which tests whether a character is an upper-case letter, so if c is a character, you might write

if(isupper(c))
    ...

Both of these examples (verbose and isupper()) are useful and readable.

However, you will eventually come across code like

if(n)
    average = sum / n;

where n is just a number. Here, the programmer wants to compute the average only if n is nonzero (otherwise, of course, the code would divide by 0), and the code works, because, in the context of the if statement, the trivial expression n is (as always) interpreted as "true" if it is nonzero, and "false" if it is zero.

"Coding shortcuts" like these can seem cryptic, but they're also quite common, so you'll need to be able to recognize them even if you don't choose to write them in your own code. Whenever you see code like

if(x)

or

if(f())

where x or f() do not have obvious "Boolean" names, you can read them as "if x is nonzero" or "if f() returns nonzero."

3.4 while Loops


[This section corresponds to half of K&R Sec. 3.5]

Loops generally consist of two parts: one or more control expressions which (not surprisingly) control the execution of the loop, and the body, which is the statement or set of statements which is executed over and over.

The most basic loop in C is the while loop. A while loop has one control expression, and executes as long as that expression is true. This example repeatedly doubles the number 2 (2, 4, 8, 16, ...) and prints the resulting numbers as long as they are less than 1000:

int x = 2;

while(x < 1000)
    {
    printf("%d\n", x);
    x = x * 2;
    }

(Once again, we've used braces {} to enclose the group of statements which are to be executed together as the body of the loop.)

The general syntax of a while loop is

while( expression )
    statement

A while loop starts out like an if statement: if the condition expressed by the expression is true, the statement is executed. However, after executing the statement, the condition is tested again, and if it's still true, the statement is executed again. (Presumably, the condition depends on some value which is changed in the body of the loop.) As long as the condition remains true, the body of the loop is executed over and over again. (If the condition is false right at the start, the body of the loop is not executed at all.)

As another example, if you wanted to print a number of blank lines, with the variable n holding the number of blank lines to be printed, you might use code like this:

while(n > 0)
    {
    printf("\n");
    n = n - 1;
    }

After the loop finishes (when control "falls out" of it, due to the condition being false), n will have the value 0, assuming it was not negative to begin with.

You use a while loop when you have a statement or group of statements which may have to be executed a number of times to complete their task. The controlling expression represents the condition "the loop is not done" or "there's more work to do." As long as the expression is true, the body of the loop is executed; presumably, it makes at least some progress at its task. When the expression becomes false, the task is done, and the rest of the program (beyond the loop) can proceed. When we think about a loop in this way, we can see an additional important property: if the expression evaluates to "false" before the very first trip through the loop, we make zero trips through the loop. In other words, if the task is already done (if there's no work to do) the body of the loop is not executed at all. (It's always a good idea to think about the "boundary conditions" in a piece of code, and to make sure that the code will work correctly when there is no work to do, or when there is a trivial task to do, such as sorting an array of one number. Experience has shown that bugs at boundary conditions are quite common.)
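The zero-trip property can be checked with a variant of the doubling loop that counts its trips instead of printing. This is a sketch with a made-up function name, and it assumes the limit is small enough that doubling x never overflows an int.

```c
/* Hypothetical helper: counts how many of the values 2, 4, 8, ...
   are less than limit. Assumes limit is modest, so that x * 2
   does not overflow. */
int count_doublings(int limit)
{
    int x = 2;
    int count = 0;

    while(x < limit)
        {
        count = count + 1;
        x = x * 2;
        }
    return count;
}
```

For a limit of 1000 the loop makes nine trips (for 2 through 512). For a limit of 2 or less the condition is false at the start, so the body never runs and the function returns 0.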

3.5 for Loops


[This section corresponds to the other half of K&R Sec. 3.5]

Our second loop, which we've seen at least one example of already, is the for loop. The first one we saw was:

for (i = 0; i < 10; i = i + 1)
    printf("i is %d\n", i);

More generally, the syntax of a for loop is

for( expr1 ; expr2 ; expr3 )
    statement

(Here we see that the for loop has three control expressions. As always, the statement can be a brace-enclosed block.)

Many loops are set up to cause some variable to step through a range of values, or, more generally, to set up an initial condition and then modify some value to perform each succeeding loop as long as some condition is true. The three expressions in a for loop encapsulate these conditions: expr1 sets up the initial condition, expr2 tests whether another trip through the loop should be taken, and expr3 increments or updates things after each trip through the loop and prior to the next one. In our first example, we had i = 0 as expr1, i < 10 as expr2, i = i + 1 as expr3, and the call to printf as statement, the body of the loop. So the loop began by setting i to 0, proceeded as long as i was less than 10, printed out i's value during each trip through the loop, and added 1 to i between each trip through the loop.

When a for loop is executed, first, expr1 is evaluated. Then, expr2 is evaluated, and if it is true, the body of the loop (statement) is executed. Then, expr3 is evaluated to go to the next step, and expr2 is evaluated again, to see if there is a next step. During the execution of a for loop, the sequence is:

expr1
expr2
statement
expr3
expr2
statement
expr3
...
expr2
statement
expr3
expr2

The first thing executed is expr<sub> </sub>. expr<sub>#</sub> is evaluated after every trip through the loop. The last thing executed is al,ays expr<sub>!</sub>! because ,hen expr<sub>!</sub> evaluates false! the loop exits. 8ll three expressions of a for loop are optional. If you leave out expr<sub> </sub>! there simply is no initiali@ation step! and the variable s% used ,ith the loop had better have been initiali@ed already. If you leave out expr<sub>!</sub>! there is no test! and the default for the for loop is that another trip through the loop should be ta&en such that unless you brea& out of it some other ,ay! the loop runs forever%. If you leave out expr<sub>#</sub>! there is no increment step. The semicolons separate the three controlling expressions of a for loop. These semicolons! by the ,ay! have nothing to do ,ith statement terminators.% If you leave out one or more of the expressions! the semicolons remain. Therefore! one ,ay of ,riting a deliberately infinite loop in C is
for(;;) ...
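As an illustration of a loop with all three expressions omitted, here is a small sketch. The function name count_digits is an assumption made for this example, and it relies on the break statement to leave the loop.

```c
/* Counts the decimal digits of n using a loop with no controlling
   expressions; the only way out is the break statement. */
int count_digits(int n)
{
    int count = 0;

    for(;;)
        {
        count = count + 1;
        n = n / 10;
        if(n == 0)
            break;
        }

    return count;
}
```

Because the test sits at the bottom of the body, the loop always makes at least one trip, so count_digits(0) is 1.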


It's useful to compare C's for loop to the equivalent loops in other computer languages you might know. The C loop

    for(i = x; i <= y; i = i + z)

is roughly equivalent to:

    for I = X to Y step Z     (BASIC)
    do 10 i=x,y,z             (FORTRAN)
    for i := x to y           (Pascal)

In C (unlike FORTRAN), if the test condition is false before the first trip through the loop, the loop won't be traversed at all. In C (unlike Pascal), a loop control variable (in this case, i) is guaranteed to retain its final value after the loop completes, and it is also legal to modify the control variable within the loop, if you really want to. (When the loop terminates due to the test condition turning false, the value of the control variable after the loop will be the first value for which the condition failed, not the last value for which it succeeded.)

It's also worth noting that a for loop can be used in more general ways than the simple, iterative examples we've seen so far. The ``control variable'' of a for loop does not have to be an integer, and it does not have to be incremented by an additive increment. It could be ``incremented'' by a multiplicative factor (1, 2, 4, 8, ...) if that was what you needed, or it could be a floating-point variable, or it could be another type of variable which we haven't met yet which would step, not over numeric values, but over the elements of an array or other data structure. Strictly speaking, a for loop doesn't have to have a ``control variable'' at all; the three expressions can be anything, although the loop will make the most sense if they are related and together form the expected initialize, test, increment sequence.

The powers-of-two example of the previous section does fit this pattern, so we could rewrite it like this:

    int x;

    for(x = 2; x < 1000; x = x * 2)
        printf("%d\n", x);

There is no earth-shaking or fundamental difference between the while and for loops. In fact, given the general for loop

    for(expr<sub>1</sub>; expr<sub>2</sub>; expr<sub>3</sub>)
        statement

you could usually rewrite it as a while loop, moving the initialize and increment expressions to statements before and within the loop:

    expr<sub>1</sub> ;
    while(expr<sub>2</sub>)
        {
        statement
        expr<sub>3</sub> ;
        }

Similarly, given the general while loop

    while(expr)
        statement

you could rewrite it as a for loop:

    for(; expr; )
        statement
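To make the correspondence concrete, here is one loop written both ways. This is a sketch; the function names sum_for and sum_while are assumptions chosen for the example.

```c
/* Sums the integers from 1 to n with a for loop. */
int sum_for(int n)
{
    int i, sum = 0;

    for(i = 1; i <= n; i = i + 1)
        sum = sum + i;

    return sum;
}

/* The same loop rewritten as a while loop, with the initialize
   and increment expressions moved to separate statements. */
int sum_while(int n)
{
    int i, sum = 0;

    i = 1;                      /* expr1 */
    while(i <= n)               /* expr2 */
        {
        sum = sum + i;          /* statement */
        i = i + 1;              /* expr3 */
        }

    return sum;
}
```

Both functions return the same value for any n. The rewrite is only ``usually'' possible because of the continue statement: in a for loop, continue still evaluates the increment expression, while in the while version it would jump past the increment statement at the bottom of the body.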


Another contrast between the for and while loops is that although the test expression (expr<sub>2</sub>) is optional in a for loop, it is required in a while loop. If you leave out the controlling expression of a while loop, the compiler will complain about a syntax error. (To write a deliberately infinite while loop, you have to supply an expression which is always nonzero. The most obvious one would simply be while(1) .)

If it's possible to rewrite a for loop as a while loop and vice versa, why do they both exist? Which one should you choose? In general, when you choose a for loop, its three expressions should all manipulate the same variable or data structure, using the initialize, test, increment pattern. If they don't manipulate the same variable or don't follow that pattern, wedging them into a for loop buys nothing and a while loop would probably be clearer. (The reason that one loop or the other can be clearer is simply that, when you see a for loop, you expect to see an idiomatic initialize/test/increment of a single variable, and if the for loop you're looking at doesn't end up matching that pattern, you've been momentarily misled.)

3.6 break and continue


[This section corresponds to K&R Sec. 3.7]

Sometimes, due to an exceptional condition, you need to jump out of a loop early, that is, before the main controlling expression of the loop causes it to terminate normally. Other times, in an elaborate loop, you may want to jump back to the top of the loop (to test the controlling expression again, and perhaps begin a new trip through the loop) without playing out all the steps of the current loop. The break and continue statements allow you to do these two things. (They are, in fact, essentially restricted forms of goto.)

To put everything we've seen in this chapter together, as well as demonstrate the use of the break statement, here is a program for printing prime numbers between 1 and 100:

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        int i, j;

        printf("%d\n", 2);

        for(i = 3; i <= 100; i = i + 1)
            {
            for(j = 2; j < i; j = j + 1)
                {
                if(i % j == 0)
                    break;          /* found a divisor: not prime */
                if(j > sqrt(i))
                    {
                    printf("%d\n", i);
                    break;          /* no divisor found: prime */
                    }
                }
            }

        return 0;
    }


The outer loop steps the variable i through the numbers from 3 to 100; the code tests to see if each number has any divisors other than 1 and itself. The trial divisor j loops from 2 up to i. j is a divisor of i if the remainder of i divided by j is 0, so the code uses C's ``remainder'' or ``modulus'' operator % to make this test. (Remember that i % j gives the remainder when i is divided by j.)

If the program finds a divisor, it uses break to break out of the inner loop, without printing anything. But if it notices that j has risen higher than the square root of i, without its having found any divisors, then i must not have any divisors, so i is prime, and its value is printed. (Once we've determined that i is prime by noticing that j > sqrt(i), there's no need to try the other trial divisors, so we use a second break statement to break out of the loop in that case, too.)

The simple algorithm and implementation we used here (like many simple prime number algorithms) does not work for 2, the only even prime number, so the program ``cheats'' and prints out 2 no matter what, before going on to test the numbers from 3 to 100.

Many improvements to this simple program are of course possible; you might experiment with it. (Did you notice that the ``test'' expression of the inner loop for(j = 2; j < i; j = j + 1) is in a sense unnecessary, because the loop always terminates early due to one of the two break statements?)
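The program above uses only break. As one possible variation, the sketch below uses both statements: break leaves the inner loop as soon as a divisor is found, and continue skips the rest of the outer loop's body for numbers that turned out not to be prime. The function name count_primes and the flag has_divisor are assumptions made for this example, and the test j * j <= i is swapped in for the call to sqrt.

```c
/* Counts the primes from 2 to limit by trial division. */
int count_primes(int limit)
{
    int i, j;
    int count = 0;

    for(i = 2; i <= limit; i = i + 1)
        {
        int has_divisor = 0;

        /* only divisors up to the square root of i need trying */
        for(j = 2; j * j <= i; j = j + 1)
            {
            if(i % j == 0)
                {
                has_divisor = 1;
                break;          /* leave the inner loop early */
                }
            }

        if(has_divisor)
            continue;           /* on to the next i; skip the counting */

        count = count + 1;
        }

    return count;
}
```

Written this way, the inner loop's test expression does real work, and 2 needs no special treatment: for i equal to 2 the inner loop makes no trips at all, so no divisor is found.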


Chapter 4: More about Declarations (and Initialization)


4.1 Arrays
So far, we've been declaring simple variables: the declaration

    int i;

declares a single variable, named i, of type int. It is also possible to declare an array of several elements. The declaration

    int a[10];

declares an array, named a, consisting of ten elements, each of type int. Simply speaking, an array is a variable that can hold more than one value. You specify which of the several values you're referring to at any given time by using a numeric subscript. (Arrays in programming are similar to vectors or matrices in mathematics.) We can represent the array a above with a picture like this:

    [Figure: the array a]

In C, arrays are zero-based: the ten elements of a 10-element array are numbered from 0 to 9. The subscript which specifies a single element of an array is simply an integer expression in square brackets. The first element of the array is a[0], the second element is a[1], etc. You can use these ``array subscript expressions'' anywhere you can use the name of a simple variable, for example:

    a[0] = 10;
    a[1] = 20;
    a[2] = a[0] + a[1];

Notice that the subscripted array references (i.e. expressions such as a[0] and a[1]) can appear on either side of the assignment operator.

The subscript does not have to be a constant like 0 or 1; it can be any integral expression. For example, it's common to loop over all elements of an array:

    int i;

    for(i = 0; i < 10; i = i + 1)
        a[i] = 0;

This loop sets all ten elements of the array a to 0.

Arrays are a real convenience for many problems, but there is not a lot that C will do with them for you automatically. In particular, you can neither set all elements of an array at once nor assign one array to another; both of the assignments

    a = 0;          /* WRONG */

and

    int b[10];
    b = a;          /* WRONG */

are illegal.


To set all of the elements of an array to some value, you must do so one by one, as in the loop example above. To copy the contents of one array to another, you must again do so one by one:

    int b[10];

    for(i = 0; i < 10; i = i + 1)
        b[i] = a[i];

Remember that for an array declared

    int a[10];

there is no element a[10]; the topmost element is a[9]. This is one reason that zero-based loops are also common in C. Note that the for loop

    for(i = 0; i < 10; i = i + 1)
        ...

does just what you want in this case: it starts at 0, the number 10 suggests (correctly) that it goes through 10 iterations, but the less-than comparison means that the last trip through the loop has i set to 9. (The comparison i <= 9 would also work, but it would be less clear and therefore poorer style.)

In the little examples so far, we've always looped over all 10 elements of the sample array a. It's common, however, to use an array that's bigger than necessarily needed, and to use a second variable to keep track of how many elements of the array are currently in use. For example, we might have an integer variable

    int na;         /* number of elements of a[] in use */

Then, when we wanted to do something with a (such as print it out), the loop would run from 0 to na, not 10 (or whatever a's size was):

    for(i = 0; i < na; i = i + 1)
        printf("%d\n", a[i]);
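The bookkeeping for a partially filled array might be sketched as follows. The constant MAXA and the functions add_value and sum_in_use are assumptions introduced for this example; the notes themselves only describe the variable na.

```c
#define MAXA 10         /* declared size of the array (an assumption) */

static int a[MAXA];
static int na = 0;      /* number of elements of a[] in use */

/* Appends v to the in-use part of a.
   Returns 1 on success, or 0 if the array is already full. */
int add_value(int v)
{
    if(na >= MAXA)
        return 0;
    a[na] = v;
    na = na + 1;
    return 1;
}

/* Sums only the elements currently in use, not all MAXA of them. */
int sum_in_use(void)
{
    int i, sum = 0;

    for(i = 0; i < na; i = i + 1)
        sum = sum + a[i];

    return sum;
}
```

The check in add_value is what keeps na from ever exceeding the declared size of a.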

Naturally, we would have to ensure that na's value was always less than or equal to the number of elements actually declared in a.

Arrays are not limited to type int; you can have arrays of char or double or any other type.

Here is a slightly larger example of the use of arrays. Suppose we want to investigate the behavior of rolling a pair of dice. The total roll can be anywhere from 2 to 12, and we want to count how often each roll comes up. We will use an array to keep track of the counts: a[2] will count how many times we've rolled 2, etc.

We'll simulate the roll of a die by calling C's random number generation function, rand(). Each time you call rand(), it returns another pseudo-random integer. The values that rand() returns typically span a large range, so we'll use C's modulus (or ``remainder'') operator % to produce random numbers in the range we want. The expression rand() % 6 produces random numbers in the range 0 to 5, and rand() % 6 + 1 produces random numbers in the range 1 to 6.

Here is the program:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int i;
        int d1, d2;
        int a[13];      /* uses [2..12] */

        for(i = 2; i <= 12; i = i + 1)
            a[i] = 0;

        for(i = 0; i < 100; i = i + 1)
            {
            d1 = rand() % 6 + 1;
            d2 = rand() % 6 + 1;
            a[d1 + d2] = a[d1 + d2] + 1;
            }

        for(i = 2; i <= 12; i = i + 1)
            printf("%d: %d\n", i, a[i]);

        return 0;
    }

We include the header <stdlib.h> because it contains the necessary declarations for the rand() function. We declare the array of size 13 so that its highest element will be a[12]. (We're wasting a[0] and a[1]; this is no great loss.) The variables d1 and d2 contain the rolls of the two individual dice; we add them together to decide which cell of the array to increment, in the line

    a[d1 + d2] = a[d1 + d2] + 1;

After 100 rolls, we print the array out. Typically (as craps players well know), we'll see mostly 7's, and relatively few 2's and 12's.

(By the way, it turns out that using the % operator to reduce the range of the rand function is not always a good idea. We'll say more about this problem in an exercise.)
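One commonly used alternative to the % operator is to divide rand()'s value instead, so that the result depends on the high-order part of the number rather than the low-order part. The sketch below shows that technique; it is not necessarily the approach the exercise has in mind, and the names roll_die, count_rolls, and rolls are assumptions made for this example.

```c
#include <stdlib.h>

int rolls[13];          /* counts of each total; uses [2..12] */

/* Returns a pseudo-random roll from 1 to 6. The division maps
   rand()'s whole range 0..RAND_MAX onto the six values 0..5. */
int roll_die(void)
{
    return rand() / (RAND_MAX / 6 + 1) + 1;
}

/* Clears the counts, then rolls a pair of dice nrolls times. */
void count_rolls(int nrolls)
{
    int i, d1, d2;

    for(i = 0; i <= 12; i = i + 1)
        rolls[i] = 0;

    for(i = 0; i < nrolls; i = i + 1)
        {
        d1 = roll_die();
        d2 = roll_die();
        rolls[d1 + d2] = rolls[d1 + d2] + 1;
        }
}
```

Calling srand() with a fixed value before count_rolls() makes a run repeatable; whatever the seed, the counts in rolls[2] through rolls[12] add up to nrolls.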

4.1.1 Array Initialization


Although it is not possible to assign to all elements of an array at once using an assignment expression, it is possible to initialize some or all elements of an array when the array is defined. The syntax looks like this:

    int a[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

The list of values, enclosed in braces {}, separated by commas, provides the initial values for successive elements of the array.

(Under older, pre-ANSI C compilers, you could not always supply initializers for ``local'' arrays inside functions; you could only initialize ``global'' arrays, those outside of any function. Those compilers are now rare, so you shouldn't have to worry about this distinction any more. We'll talk more about local and global variables later in this chapter.)

If there are fewer initializers than elements in the array, the remaining elements are automatically initialized to 0. For example,

    int a[10] = {0, 1, 2, 3, 4, 5, 6};

would initialize a[7], a[8], and a[9] to 0. When an array definition includes an initializer, the array dimension may be omitted, and the compiler will infer the dimension from the number of initializers. For example,

    int b[] = {10, 11, 12, 13, 14};

would declare, define, and initialize an array b of 5 elements (i.e. just as if you'd typed int b[5]). Only the dimension is omitted; the brackets [] remain to indicate that b is in fact an array.

In the case of arrays of char, the initializer may be a string constant:

    char s1[7] = "Hello,";
    char s2[10] = "there,";
    char s3[] = "world!";

As before, if the dimension is omitted, it is inferred from the size of the string initializer. (We haven't covered strings in detail yet; we'll do so in chapter 8. It turns out, though, that all strings in C are terminated by a special character with the value 0. Therefore, the array s3 will be of size 7, and the explicitly-sized s1 does need to be of size at least 7. For s2, the last 4 characters in the array will all end up being this zero-value character.)
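These rules can be checked with the sizeof operator, which has not been introduced in these notes and is used here only as an assumption for the sake of the demonstration. The three small functions are hypothetical names for this sketch.

```c
/* Arrays declared outside of any function, so that the
   functions below can inspect them. */
int a[10] = {0, 1, 2, 3, 4, 5, 6};     /* a[7], a[8], a[9] become 0 */
int b[] = {10, 11, 12, 13, 14};        /* dimension inferred from the list */
char s3[] = "world!";                  /* 6 characters plus the terminator */

/* Number of elements the compiler inferred for b. */
int b_length(void)
{
    return (int)(sizeof(b) / sizeof(b[0]));
}

/* Number of chars the compiler inferred for s3. */
int s3_size(void)
{
    return (int)sizeof(s3);
}

/* Sum of the elements of a that had no initializer. */
int tail_sum(void)
{
    return a[7] + a[8] + a[9];
}
```

Here b_length() yields 5, s3_size() yields 7, and tail_sum() yields 0, matching the description above.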

4.1.2 Arrays of Arrays (``Multidimensional'' Arrays)


[This section is optional and may be skipped.]

When we said that ``Arrays are not limited to type int; you can have arrays of... any other type,'' we meant that more literally than you might have guessed. If you have an ``array of int,'' it means that you have an array each of whose elements is of type int. But you can have an array each of whose elements is of type x, where x is any type you choose. In particular, you can have an array each of whose elements is another array! We can use these arrays of arrays for the same sorts of tasks as we'd use multidimensional arrays in other computer languages (or matrices in mathematics). Naturally, we are not limited to arrays of arrays, either; we could have an array of arrays of arrays, which would act like a 3-dimensional array, etc.

The declaration of an array of arrays looks like this:

    int a2[5][7];

You have to read complicated declarations like these ``inside out.'' What this one says is that a2 is an array of 5 somethings, and that each of the somethings is an array of 7 ints. More briefly, ``a2 is an array of 5 arrays of 7 ints,'' or, ``a2 is an array of array of int.'' In the declaration of a2, the brackets closest to the identifier a2 tell you what a2 first and foremost is. That's how you know it's an array of 5 arrays of size 7, not the other way around. You can think of a2 as having 5 ``rows'' and 7 ``columns,'' although this interpretation is not mandatory. (You could also treat the ``first'' or inner subscript as ``x'' and the second as ``y.'' Unless you're doing something fancy, all you have to worry about is that the subscripts when you access the array match those that you used when you declared it, as in the examples below.)

To illustrate the use of multidimensional arrays, we might fill in the elements of the above array a2 using this piece of code:

    int i, j;

    for(i = 0; i < 5; i = i + 1)
        {
        for(j = 0; j < 7; j = j + 1)
            a2[i][j] = 10 * i + j;
        }

This pair of nested loops sets a2[1][2] to 12, a2[4][1] to 41, etc. Since the first dimension of a2 is 5, the first subscripting index variable, i, runs from 0 to 4. Similarly, the second subscript varies from 0 to 6.

We could print a2 out (in a two-dimensional way, suggesting its structure) with a similar pair of nested loops:

    for(i = 0; i < 5; i = i + 1)
        {
        for(j = 0; j < 7; j = j + 1)
            printf("%d\t", a2[i][j]);
        printf("\n");
        }

(The character \t in the printf string is the tab character.)

Just to see more clearly what's going on, we could make the ``row'' and ``column'' subscripts explicit by printing them, too:

    for(j = 0; j < 7; j = j + 1)
        printf("\t%d:", j);
    printf("\n");

    for(i = 0; i < 5; i = i + 1)
        {
        printf("%d:", i);
        for(j = 0; j < 7; j = j + 1)
            printf("\t%d", a2[i][j]);
        printf("\n");
        }

This last fragment would print

            0:      1:      2:      3:      4:      5:      6:
    0:      0       1       2       3       4       5       6
    1:      10      11      12      13      14      15      16
    2:      20      21      22      23      24      25      26
    3:      30      31      32      33      34      35      36
    4:      40      41      42      43      44      45      46

Finally, there's no reason we have to loop over the ``rows'' first and the ``columns'' second; depending on what we wanted to do, we could interchange the two loops, like this:

    for(j = 0; j < 7; j = j + 1)
        {
        for(i = 0; i < 5; i = i + 1)
            printf("%d\t", a2[i][j]);
        printf("\n");
        }

Notice that i is still the first subscript and it still runs from 0 to 4, and j is still the second subscript and it still runs from 0 to 6.
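The same point, that the order of the loops is independent of the order of the subscripts, can be seen in a sketch that adds up one ``row'' or one ``column'' of the array. The function names fill_a2, row_sum, and col_sum are assumptions made for this example.

```c
static int a2[5][7];

/* Fills a2 so that a2[i][j] holds 10 * i + j. */
void fill_a2(void)
{
    int i, j;

    for(i = 0; i < 5; i = i + 1)
        {
        for(j = 0; j < 7; j = j + 1)
            a2[i][j] = 10 * i + j;
        }
}

/* Adds up one "row": the first subscript is fixed
   and the second runs from 0 to 6. */
int row_sum(int i)
{
    int j, sum = 0;

    for(j = 0; j < 7; j = j + 1)
        sum = sum + a2[i][j];

    return sum;
}

/* Adds up one "column": the second subscript is fixed
   and the first runs from 0 to 4. */
int col_sum(int j)
{
    int i, sum = 0;

    for(i = 0; i < 5; i = i + 1)
        sum = sum + a2[i][j];

    return sum;
}
```

After fill_a2() has been called, row_sum(1) adds 10 through 16 and gives 91, while col_sum(2) adds 2, 12, 22, 32, and 42 and gives 110. In both functions i remains the first subscript and j the second.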


4.2 Visibility and Lifetime (Global Variables, etc.)


We haven't said so explicitly, but variables are channels of communication within a program. You set a variable to a value at one point in a program, and at another point (or points) you read the value out again. The two points may be in adjoining statements, or they may be in widely separated parts of the program.

How long does a variable last? How widely separated can the setting and fetching parts of the program be, and how long after a variable is set does it persist? Depending on the variable and how you're using it, you might want different answers to these questions.

The visibility of a variable determines how much of the rest of the program can access that variable. You can arrange that a variable is visible only within one part of one function, or in one function, or in one source file, or anywhere in the program. (We haven't really talked about source files yet; we'll be exploring them soon.)

Why would you want to limit the visibility of a variable? For maximum flexibility, wouldn't it be handy if all variables were potentially visible everywhere? As it happens, that arrangement would be too flexible: everywhere in the program, you would have to keep track of the names of all the variables declared anywhere else in the program, so that you didn't accidentally re-use one. Whenever a variable had the wrong value by mistake, you'd have to search the entire program for the bug, because any statement in the entire program could potentially have modified that variable. You would constantly be stepping all over yourself by using a common variable name like i in two parts of your program, and having one snippet of code accidentally overwrite the values being used by another part of the code. The communication would be sort of like an old party line: you'd always be accidentally interrupting other conversations, or having your conversations interrupted.

To avoid this confusion, we generally give variables the narrowest or smallest visibility they need. A variable declared within the braces {} of a function is visible only within that function; variables declared within functions are called local variables. If another function somewhere else declares a local variable with the same name, it's a different variable entirely, and the two don't clash with each other.

On the other hand, a variable declared outside of any function is a global variable, and it is potentially visible anywhere within the program. You use global variables when you do want the communications path to be able to travel to any part of the program. When you declare a global variable, you will usually give it a longer, more descriptive name (not something generic like i) so that whenever you use it you will remember that it's the same variable everywhere.

Another word for the visibility of variables is scope.

How long do variables last? By default, local variables (those declared within a function) have automatic duration: they spring into existence when the function is called, and they (and their values) disappear when the function returns. Global variables, on the other hand, have static duration: they last, and the values stored in them persist, for as long as the program does. (Of course, the values can in general still be overwritten, so they don't necessarily persist forever.)


Finally, it is possible to split a program up into several source files, for easier maintenance. When several source files are combined into one program (we'll be seeing how in the next chapter) the compiler must have a way of correlating the global variables which might be used to communicate between the several source files. Furthermore, if a global variable is going to be useful for communication, there must be exactly one of it: you wouldn't want one function in one source file to store a value in one global variable named globalvar, and then have another function in another source file read from a different global variable named globalvar. Therefore, a global variable should have exactly one defining instance, in one place in one source file. If the same variable is to be used anywhere else (i.e. in some other source file or files), the variable is declared in those other file(s) with an external declaration, which is not a defining instance. The external declaration says, ``hey, compiler, here's the name and type of a global variable I'm going to use, but don't define it here, don't allocate space for it; it's one that's defined somewhere else, and I'm just referring to it here.'' If you accidentally have two distinct defining instances for a variable of the same name, the compiler (or the linker) will complain that it is ``multiply defined.''

It is also possible to have a variable which is global in the sense that it is declared outside of any function, but private to the one source file it's defined in. Such a variable is visible to the functions in that source file but not to any functions in any other source files, even if they try to issue a matching declaration.

You get any extra control you might need over visibility and lifetime, and you distinguish between defining instances and external declarations, by using storage classes. A storage class is an extra keyword at the beginning of a declaration which modifies the declaration in some way. Generally, the storage class (if any) is the first word in the declaration, preceding the type name. (Strictly speaking, this ordering has not traditionally been necessary, and you may see some code with the storage class, type name, and other parts of a declaration in an unusual order.)

We said that, by default, local variables had automatic duration. To give them static duration (so that, instead of coming and going as the function is called, they persist for as long as the program does), you precede their declaration with the static keyword:
static int i;

By default, a declaration of a global variable (especially if it specifies an initial value) is the defining instance. To make it an external declaration, of a variable which is defined somewhere else, you precede it with the keyword extern:

    extern int j;

Finally, to arrange that a global variable is visible only within its containing source file, you precede it with the static keyword:

    static int k;

Notice that the static keyword can do two different things: it adjusts the duration of a local variable from automatic to static, or it adjusts the visibility of a global variable from truly global to private-to-the-file.
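The first of these two uses can be seen in a short sketch. The function names next_id and always_one are assumptions made for this example.

```c
/* counter has static duration: it is set to 0 once, before the
   first call, and keeps its value from one call to the next. */
int next_id(void)
{
    static int counter = 0;

    counter = counter + 1;
    return counter;
}

/* counter here has automatic duration: it is created, and set
   to 0 again, every time the function is called. */
int always_one(void)
{
    int counter = 0;

    counter = counter + 1;
    return counter;
}
```

Successive calls to next_id() return 1, 2, 3, and so on, while always_one() returns 1 every time. In both functions counter is still a local variable: no other function can refer to it by name.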


To summarize, we've talked about two different attributes of a variable: visibility and duration. These are orthogonal, as shown in this table:

                         duration:
    visibility:    automatic                 static
    local          normal local variables    static local variables
    global         N/A                       normal global variables

We can also distinguish between file-scope global variables and truly global variables, based on the presence or absence of the static keyword.

We can also distinguish between external declarations and defining instances of global variables, based on the presence or absence of the extern keyword.

4.3 Default Initialization


The duration of a variable (whether static or automatic) also affects its default initialization.

If you do not explicitly initialize them, automatic-duration variables (that is, local, non-static ones) are not guaranteed to have any particular initial value; they will typically contain garbage. It is therefore a fairly serious error to attempt to use the value of an automatic variable which has never been initialized or assigned to: the program will either work incorrectly, or the garbage value may just happen to be ``correct'' such that the program appears to work correctly! However, the particular value that the garbage takes on can vary depending literally on anything: other parts of the program, which compiler was used, which hardware or operating system the program is running on, the time of day, the phase of the moon. (Okay, maybe the phase of the moon is a bit of an exaggeration.) So you hardly want to say that a program which uses an uninitialized variable ``works''; it may seem to work, but it works for the wrong reason, and it may stop working tomorrow.

Static-duration variables (global and static local), on the other hand, are guaranteed to be initialized to 0 if you do not use an explicit initializer in the definition.

(Once upon a time, there was another distinction between the initialization of automatic vs. static variables: you could initialize aggregate objects, such as arrays, only if they had static duration. If your compiler complains when you try to initialize a local array, it's probably an old, pre-ANSI compiler. Modern, ANSI-compatible compilers remove this limitation, so it's no longer much of a concern.)
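The guarantee for static-duration variables can be checked directly. The sketch below uses hypothetical names (table, calls, statics_start_at_zero) and deliberately never reads an uninitialized automatic variable, since doing so would be exactly the error described above.

```c
static int table[5];        /* static duration, no explicit initializer */

/* Returns 1 if every static-duration variable it examines holds 0. */
int statics_start_at_zero(void)
{
    static int calls;       /* static local, no explicit initializer */
    int i;
    int all_zero = 1;       /* automatic, so we initialize it ourselves */

    for(i = 0; i < 5; i = i + 1)
        {
        if(table[i] != 0)
            all_zero = 0;
        }

    if(calls != 0)
        all_zero = 0;

    return all_zero;
}
```

Since nothing in this sketch ever assigns to table or calls, the function returns 1 on every call.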

4.4 Examples
Here is an example demonstrating almost everything we've seen so far:

    int globalvar = 1;
    extern int anotherglobalvar;
    static int privatevar;

    void f(void)
    {
        int localvar;
        int localvar2 = 2;
        static int persistentvar;
    }

Here we have six variables, three declared outside and three declared inside of the function f().

globalvar is a global variable. The declaration we see is its defining instance (it happens also to include an initial value). globalvar can be used anywhere in this source file, and it could be used in other source files, too (as long as corresponding external declarations are issued in those other source files).

anotherglobalvar is a second global variable. It is not defined here; the defining instance for it (and its initialization) is somewhere else.

privatevar is a ``private'' global variable. It can be used anywhere within this source file, but functions in other source files cannot access it, even if they try to issue external declarations for it. (If other source files try to declare a global variable called ``privatevar'', they'll get their own; they won't be sharing this one.) Since it has static duration and receives no explicit initialization, privatevar will be initialized to 0.

localvar is a local variable within the function f(). It can be accessed only within the function f(). (If any other part of the program declares a variable named ``localvar'', that variable will be distinct from the one we're looking at here.) localvar is conceptually ``created'' each time f() is called, and disappears when f() returns. Any value which was stored in localvar last time f() was running will be lost and will not be available next time f() is called. Furthermore, since it has no explicit initializer, the value of localvar will in general be garbage each time f() is called.

localvar2 is also local, and everything that we said about localvar applies to it, except that since its declaration includes an explicit initializer, it will be initialized to 2 each time f() is called.

Finally, persistentvar is again local to f(), but it does maintain its value between calls to f(). It has static duration but no explicit initializer, so its initial value will be 0.

The defining instances and external declarations we've been looking at so far have all been of simple variables. There are also defining instances and external declarations of functions, which we'll be looking at in the next chapter.

(Also, don't worry about static variables for now if they don't make sense to you; they're a relatively sophisticated concept, which you won't need to use at first.)

The term declaration is a general one which encompasses defining instances and external declarations; defining instances and external declarations are two different kinds of declarations. Furthermore, either kind of declaration suffices to inform the compiler of the name and type of a particular variable (or function). If you have the defining instance of a global variable in a source file, the rest of that source file can use that variable without having to issue any external declarations. It's only in source files where the defining instance hasn't been seen that you need external declarations.

You will sometimes hear a defining instance referred to simply as a ``definition,'' and you will sometimes hear an external declaration referred to simply as a ``declaration.'' These usages are mildly ambiguous, in that you can't tell out of context whether a ``declaration'' is a generic declaration (that might be a defining instance or an external declaration) or whether it's an external declaration that specifically is not a defining instance. (Similarly, there are other constructions that can be called ``definitions'' in C, namely the definitions of preprocessor macros, structures, and typedefs, none of which we've met.) In these notes, we'll try to make things clear by using the unambiguous terms defining instance and external declaration. Elsewhere, you may have to look at the context to determine how the terms ``definition'' and ``declaration'' are being used.
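The difference between the two kinds of declaration can be shown even within a single source file. In the sketch below, the variable total and the function add_to_total are hypothetical names; the external declaration comes first, and the one defining instance appears only at the end.

```c
extern int total;       /* external declaration: name and type only */

/* This function may use total because a declaration has been seen,
   even though the defining instance comes later in the file. */
void add_to_total(int n)
{
    total = total + n;
}

int total = 0;          /* defining instance: space is allocated here */
```

In a program split across several source files, the last line would appear in exactly one file, and the first line would appear in each of the other files that use total.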


Chapter 5: Functions and Program Structure


[This chapter corresponds to K&R chapter 4.]

A function is a ``black box'' that we've locked part of our program into. The idea behind a function is that it compartmentalizes part of the program, and in particular, that the code within the function has some useful properties:

1. It performs some well-defined task, which will be useful to other parts of the program.
2. It might be useful to other programs as well; that is, we might be able to reuse it (and without having to rewrite it).
3. The rest of the program doesn't have to know the details of how the function is implemented. This can make the rest of the program easier to think about.
4. The function performs its task well. It may be written to do a little more than is required by the first program that calls it, with the anticipation that the calling program (or some other program) may later need the extra functionality or improved performance. (It's important that a finished function do its job well, otherwise there might be a reluctance to call it, and it therefore might not achieve the goal of reusability.)
5. By placing the code to perform the useful task into a function, and simply calling the function in the other parts of the program where the task must be performed, the rest of the program becomes clearer: rather than having some large, complicated, difficult-to-understand piece of code repeated wherever the task is being performed, we have a single simple function call, and the name of the function reminds us which task is being performed.
6. Since the rest of the program doesn't have to know the details of how the function is implemented, the rest of the program doesn't care if the function is reimplemented later, in some different way (as long as it continues to perform its same task, of course!). This means that one part of the program can be rewritten, to improve performance or add a new feature (or simply to fix a bug), without having to rewrite the rest of the program.

Functions are probably the most important weapon in our battle against software complexity. You'll want to learn when it's appropriate to break processing out into functions (and also when it's not), and how to set up function interfaces to best achieve the qualities mentioned above: reusability, information hiding, clarity, and maintainability.

5.1 Function Basics


So what defines a function? It has a name that you call it by, and a list of zero or more arguments or parameters that you hand to it for it to act on or to direct its work; it has a body containing the actual instructions (statements) for carrying out the task the function is supposed to perform; and it may give you back a return value, of a particular type.

Here is a very simple function, which accepts one argument, multiplies it by 2, and hands that value back:

    int multbytwo(int x)
    {
        int retval;
        retval = x * 2;
        return retval;
    }

On the first line we see the return type of the function (int), the name of the function (multbytwo), and a list of the function's arguments, enclosed in parentheses. Each argument has both a name and a type; multbytwo accepts one argument, of type int, named x. The name x is arbitrary, and is used only within the definition of multbytwo. The caller of this function only needs to know that a single argument of type int is expected; the caller does not need to know what name the function will use internally to refer to that argument. (In particular, the caller does not have to pass the value of a variable named x.)

Next we see, surrounded by the familiar braces, the body of the function itself. This function consists of one declaration (of a local variable retval) and two statements. The first statement is a conventional expression statement, which computes and assigns a value to retval, and the second statement is a return statement, which causes the function to return to its caller, and also specifies the value which the function returns to its caller.

The return statement can return the value of any expression, so we don't really need the local retval variable; the function could be collapsed to

    int multbytwo(int x)
    {
        return x * 2;
    }

How do we call a function? We've been doing so informally since day one, but now we have a chance to call one that we've written, in full detail. Here is a tiny skeletal program to call multbytwo:

    #include <stdio.h>

    extern int multbytwo(int);

    int main()
    {
        int i, j;
        i = 3;
        j = multbytwo(i);
        printf("%d\n", j);
        return 0;
    }

This looks much like our other test programs, with the exception of the new line

    extern int multbytwo(int);

This is an external function prototype declaration. It is an external declaration, in that it declares something which is defined somewhere else. (We've already seen the defining instance of the function multbytwo, but maybe the compiler hasn't seen it yet.) The function prototype declaration contains the three pieces of information about the function that a caller needs to know: the function's name, return type, and argument type(s). Since we don't care what name the multbytwo function will use to refer to its first argument, we don't need to mention it. (On the other hand, if a function takes several arguments, giving them names in the prototype may make it easier to remember which is which, so names may optionally be used in function prototype declarations.) Finally, to remind us that this is an external declaration and not a defining instance, the prototype is preceded by the keyword extern.

The presence of the function prototype declaration lets the compiler know that we intend to call this function, multbytwo. The information in the prototype lets the compiler generate the correct code for calling the function, and also enables the compiler to check up on our code (by making sure, for example, that we pass the correct number of arguments to each function we call).

Down in the body of main, the action of the function call should be obvious: the line

    j = multbytwo(i);

calls multbytwo, passing it the value of i as its argument. When multbytwo returns, the return value is assigned to the variable j. (Notice that the value of main's local variable i will become the value of multbytwo's parameter x; this is absolutely not a problem, and is a normal sort of affair.)

This example is written out in "longhand," to make each step explicit. The variable i isn't really needed, since we could just as well call

    j = multbytwo(3);

And the variable j isn't really needed, either, since we could just as well call

    printf("%d\n", multbytwo(3));

Here, the call to multbytwo is a subexpression which serves as the second argument to printf. The value returned by multbytwo is passed immediately to printf. (Here, as in general, we see the flexibility and generality of expressions in C. An argument passed to a function may be an arbitrarily complex subexpression, and a function call is itself an expression which may be embedded as a subexpression within arbitrarily complicated surrounding expressions.)

We should say a little more about the mechanism by which an argument is passed down from a caller into a function. Formally, C is call by value, which means that a function receives copies of the values of its arguments. We can illustrate this with an example. Suppose, in our implementation of multbytwo, we had gotten rid of the unnecessary retval variable like this:

    int multbytwo(int x)
    {
        x = x * 2;
        return x;
    }

We might wonder, if we wrote it this way, what would happen to the value of the variable i when we called

    j = multbytwo(i);

When our implementation of multbytwo changes the value of x, does that change the value of i up in the caller? The answer is no. x receives a copy of i's value, so when we change x we don't change i.

However, there is an exception to this rule. When the argument you pass to a function is not a single variable, but is rather an array, the function does not receive a copy of the array, and it therefore can modify the array in the caller. The reason is that it might be too expensive to copy the entire array, and furthermore, it can be useful for the function to write into the caller's array, as a way of handing back more data than would fit in the function's single return value. We'll see an example of an array argument (which the function deliberately writes into) in the next chapter.
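The difference between passing a single variable and passing an array can be seen in a small sketch. The function names double_copy and double_all are made up for this illustration, and the for loop is just one convenient way to walk the array:

```c
/* Receives a COPY of the caller's int; changing x has no effect on the caller. */
int double_copy(int x)
{
    x = x * 2;
    return x;
}

/* Receives access to the caller's own array (not a copy),
   so the caller sees every element that is written here. */
void double_all(int a[], int n)
{
    int i;
    for (i = 0; i < n; i = i + 1)
        a[i] = a[i] * 2;
}
```

If i holds 3, then double_copy(i) returns 6 and i still holds 3 afterwards. If an array v holds 1, 2, 3, then after double_all(v, 3) it holds 2, 4, 6.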

5.2 Function Prototypes


In modern C programming, it is considered good practice to use prototype declarations for all functions that you call. As we mentioned, these prototypes help to ensure that the compiler can generate correct code for calling the functions, as well as allowing the compiler to catch certain mistakes you might make.

Strictly speaking, however, prototypes are optional. If you call a function for which the compiler has not seen a prototype, the compiler will do the best it can, assuming that you're calling the function correctly.

If prototypes are a good idea, and if we're going to get in the habit of writing function prototype declarations for functions we call that we've written (such as multbytwo), what happens for library functions such as printf? Where are their prototypes? The answer is in that boilerplate line

    #include <stdio.h>

we've been including at the top of all of our programs. stdio.h is conceptually a file full of external declarations and other information pertaining to the "Standard I/O" library functions, including printf. The #include directive (which we'll meet formally in a later chapter) arranges that all of the declarations within stdio.h are considered by the compiler, rather as if we'd typed them all in ourselves. Somewhere within these declarations is an external function prototype declaration for printf, which satisfies the rule that there should be a prototype for each function we call. (For other standard library functions we call, there will be other "header files" to include.)

Finally, one more thing about external function prototype declarations. We've said that the distinction between external declarations and defining instances of normal variables hinges on the presence or absence of the keyword extern. The situation is a little bit different for functions. The "defining instance" of a function is the function, including its body (that is, the brace-enclosed list of declarations and statements implementing the function). An external declaration of a function, even without the keyword extern, looks nothing like a function definition. Therefore, the keyword extern is optional in function prototype declarations. If you wish, you can write

    int multbytwo(int);

and this is just as good an external function prototype declaration as

    extern int multbytwo(int);

(In the first form, without the extern, as soon as the compiler sees the semicolon, it knows it's not going to see a function body, so the declaration can't be a definition.) You may want to stay in the habit of using extern in all external declarations, including function declarations, since "extern = external declaration" is an easier rule to remember.
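As a small sketch of these points, here is a prototype that names its parameters, a caller that relies on it, and the defining instance further down. The functions clamp and clamp_percent are invented for this example; they are not part of any library:

```c
/* Prototype: the parameter names are optional and serve as documentation.
   Writing "extern" in front of it would not change its meaning. */
int clamp(int value, int low, int high);

/* This caller can be compiled as soon as the prototype has been seen,
   even though the definition of clamp appears later. */
int clamp_percent(int value)
{
    return clamp(value, 0, 100);
}

/* The defining instance: the same first line, followed by a body. */
int clamp(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}
```

Because the compiler has seen the prototype, a mistaken call such as clamp(value, 0), with too few arguments, would be reported at compile time rather than misbehaving at run time.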

5.3 Function Philosophy


What makes a good function? The most important aspect of a good "building block" is that it have a single, well-defined task to perform. When you find that a program is hard to manage, it's often because it has not been designed and broken up into functions cleanly. Two obvious reasons for moving code down into a function are because:

1. It appeared in the main program several times, such that by making it a function, it can be written just once, and the several places where it used to appear can be replaced with calls to the new function.
2. The main program was getting too big, so it could be made (presumably) smaller and more manageable by lopping part of it off and making it a function.

These two reasons are important, and they represent significant benefits of well-chosen functions, but they are not sufficient to automatically identify a good function. As we've been suggesting, a good function has at least these two additional attributes:

3. It does just one well-defined task, and does it well.
4. Its interface to the rest of the program is clean and narrow.

Attribute 3 is just a restatement of two things we said above. Attribute 4 says that you shouldn't have to keep track of too many things when calling a function. If you know what a function is supposed to do, and if its task is simple and well-defined, there should be just a few pieces of information you have to give it to act upon, and one or just a few pieces of information which it returns to you when it's done. If you find yourself having to pass lots and lots of information to a function, or remember details of its internal implementation to make sure that it will work properly this time, it's often a sign that the function is not sufficiently well-defined. (A poorly-defined function may be an arbitrary chunk of code that was ripped out of a main program that was getting too big, such that it essentially has to have access to all of that main function's local variables.)

The whole point of breaking a program up into functions is so that you don't have to think about the entire program at once; ideally, you can think about just one function at a time.
We say that a good function is a "black box," which is supposed to suggest that the "container" it's in is opaque--callers can't see inside it (and the function inside can't see out). When you call a function, you only have to know what it does, not how it does it. When you're writing a function, you only have to know what it's supposed to do, and you don't have to know why or under what circumstances its caller will be calling it. (When designing a function, we should perhaps think about the callers just enough to ensure that the function we're designing will be easy to call, and that we aren't accidentally setting things up so that callers will have to think about any internal details.)

Some functions may be hard to write (if they have a hard job to do, or if it's hard to make them do it truly well), but that difficulty should be compartmentalized along with the function itself. Once you've written a "hard" function, you should be able to sit back and relax and watch it do that hard work on call from the rest of your program. It should be pleasant to notice (in the ideal case) how much easier the rest of the program is to write, now that the hard work can be deferred to this workhorse function.

(In fact, if a difficult-to-write function's interface is well-defined, you may be able to get away with writing a quick-and-dirty version of the function first, so that you can begin testing the rest of the program, and then go back later and rewrite the function to do the hard parts. As long as the function's original interface anticipated the hard parts, you won't have to rewrite the rest of the program when you fix the function.)

What I've been trying to say in the preceding few paragraphs is that functions are important for far more important reasons than just saving typing. Sometimes, we'll write a function which we only call once, just because breaking it out into a function makes things clearer and easier.

If you find that difficulties pervade a program, that the hard parts can't be buried inside black-box functions and then forgotten about; if you find that there are hard parts which involve complicated interactions among multiple functions, then the program probably needs redesigning.

For the purposes of explanation, we've been seeming to talk so far only about "main programs" and the functions they call and the rationale behind moving some piece of code down out of a "main program" into a function. But in reality, there's obviously no need to restrict ourselves to a two-tier scheme. Any function we find ourself writing will often be appropriately written in terms of sub-functions, sub-sub-functions, etc. (Furthermore, the "main program," main(), is itself just a function.)

5.4 Separate Compilation--Logistics


When a program consists of many functions, it can be convenient to split them up into several source files. Among other things, this means that when a change is made, only the source file containing the change has to be recompiled, not the whole program.

The job of putting the pieces of a program together and producing the final executable falls to a tool called the linker. (We may or may not need to invoke the linker explicitly; a compiler often invokes it automatically, as needed.) The linker looks through all of the pieces making up the program, sorting out the external declarations and defining instances. The compiler has noted the definitions made by each source file, as well as the declarations of things used by each source file but (presumably) defined elsewhere. For each thing (global variable or function) used but not defined by one piece of the program, the linker looks for another piece which does define that thing.

The logistics of writing a program in several source files, and then compiling and linking all of the source files together, depend on the programming environment you're using. We'll cover two possibilities, depending on whether you're using a traditional command-line compiler or a newer integrated development environment (IDE) or other graphical user interface (GUI) compiler.

When using a command-line compiler, there are usually two main steps involved in building an executable program from one or more source files. First, each source file is compiled, resulting in an object file containing the machine instructions (generated by the compiler) corresponding to just the code in that source file. Second, the various object files are linked together, with each other and with libraries containing code for functions which you did not write (such as printf), to produce a final, executable program.

Under Unix, the cc command can perform one or both steps. So far, we've been using extremely simple invocations of cc such as

    cc -o hello hello.c

This invocation compiles a single source file, hello.c, links it, and places the executable in a file named hello.

Suppose we have a program which we're trying to build from three separate source files, x.c, y.c, and z.c. We could compile all three of them, and link them together, all at once, with the command

    cc -o myprog x.c y.c z.c

Alternatively, we could compile them separately: the -c option to cc tells it to compile only, but not to link. Instead of building an executable, it merely creates an object file, with a name ending in .o, for each source file compiled. So the three commands

    cc -c x.c
    cc -c y.c
    cc -c z.c

would compile x.c, y.c, and z.c and create object files x.o, y.o, and z.o. Then, the three object files could be linked together using

    cc -o myprog x.o y.o z.o

When the cc command is given an .o file, it knows that it does not have to compile it (it's an object file, already compiled); it just sends it through to the link process.

Above we mentioned that the second, linking step also involves pulling in library functions. Normally, the functions from the Standard C library are linked in automatically. Occasionally, you must request a library manually; one common situation under Unix is that the math functions tend to be in a separate math library, which is requested by using -lm on the command line. Since the libraries must typically be searched after your program's own object files are linked (so that the linker knows which library functions your program uses), any -l option must appear after the names of your files on the command line. For example, to link the object file mymath.o (previously compiled with cc -c mymath.c) together with the math library, you might use

    cc -o mymathprog mymath.o -lm

(The l in the -l option is the lower case ell, for library; it is not the digit 1.)

Everything we've said about cc also applies to most other Unix C compilers. (Many of you will be using gcc, the FSF's GNU C Compiler.)

There are command-line compilers for MS-DOS systems which work similarly. For example, the Microsoft C compiler comes with a CL ("compile and link") command, which works almost the same as Unix cc. You can compile and link in one step:

    cl hello.c

or you can compile only:

    cl /c hello.c

creating an object file named hello.obj which you can link later.

The preceding has all been about command-line compilers. If you're using some kind of integrated development environment, such as Borland's Turbo C or the Microsoft Programmer's Workbench or Visual C or Think C or Codewarrior, most of the mechanical details are taken care of for you. (There's also less I can say here about these environments, because they're all different.) Typically you define a "project," and there's a way to specify the list of files (modules) which make up your project. The modules might be source files which you typed in or obtained elsewhere, or they might be source files which you created within the environment (perhaps by requesting a "New source file," and typing it in). Typically, the programming environment has a single "build" button which does whatever's required to build (and perhaps even execute) your program. There may also be configuration windows in which you can specify compiler options (such as whether you'd like it to accept C or C++). "See your manual for details."


Chapter 6: Basic I/O


So far, we've been using printf to do output, and we haven't had a way of doing any input. In this chapter, we'll learn a bit more about printf, and we'll begin learning about character-based input and output.

6.1 printf

printf's name comes from print formatted. It generates output under the control of a format string (its first argument) which consists of literal characters to be printed and also special character sequences--format specifiers--which request that other arguments be fetched, formatted, and inserted into the string. Our very first program was nothing more than a call to printf, printing a constant string:

    printf("Hello, world!\n");

Our second program also featured a call to printf:

    printf("i is %d\n", i);

In that case, whenever printf "printed" the string "i is %d", it did not print it verbatim; it replaced the two characters %d with the value of the variable i.

There are quite a number of format specifiers for printf. Here are the basic ones:

    %d    print an int argument in decimal
    %ld   print a long int argument in decimal
    %c    print a character
    %s    print a string
    %f    print a float or double argument
    %e    same as %f, but use exponential notation
    %g    use %e or %f, whichever is better
    %o    print an int argument in octal (base 8)
    %x    print an int argument in hexadecimal (base 16)
    %%    print a single %

It is also possible to specify the width and precision of numbers and strings as they are inserted (somewhat like FORTRAN format statements); we'll present those details in a later chapter. (Very briefly, for those who are curious: a notation like %3d means to print an int in a field at least 3 spaces wide; a notation like %5.2f means to print a float or double in a field at least 5 spaces wide, with two places to the right of the decimal.)

To illustrate with a few more examples: the call

    printf("%c %d %f %e %s %d%%\n", '1', 2, 3.14, 56000000., "eight", 9);

would print

    1 2 3.140000 5.600000e+07 eight 9%

The call

    printf("%d %o %x\n", 100, 100, 100);

would print

    100 144 64

Successive calls to printf just build up the output a piece at a time, so the calls

    printf("Hello, ");
    printf("world!\n");

would also print Hello, world! (on one line of output).
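For the curious, the width and precision notations mentioned above can be tried out with a short sketch. It uses snprintf, a standard library relative of printf which stores its output in a character array instead of printing it; the function name format_widths and the square brackets in the format string are our own choices, there only to make the padding visible:

```c
#include <stdio.h>

/* Formats an int in a field at least 3 wide, and a double in a field
   at least 5 wide with 2 digits after the decimal point.
   The result is stored in buf, which has room for size characters. */
void format_widths(char buf[], int size, int n, double d)
{
    snprintf(buf, size, "[%3d] [%5.2f]", n, d);
}
```

Called with 7 and 3.14159, this leaves the string "[  7] [ 3.14]" in buf. Called with 1234 and 2.5, it leaves "[1234] [ 2.50]": the width is a minimum, so a number that needs more room simply takes it.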

Earlier we learned that C represents characters internally as small integers corresponding to the characters' values in the machine's character set (typically ASCII). This means that there isn't really much difference between a character and an integer in C; most of the difference is in whether we choose to interpret an integer as an integer or a character. printf is one place where we get to make that choice: %d prints an integer value as a string of digits representing its decimal value, while %c prints the character corresponding to a character set value. So the lines

    char c = 'A';
    int i = 97;
    printf("c = %c, i = %d\n", c, i);

would print c as the character A and i as the number 97. But if, on the other hand, we called

    printf("c = %d, i = %c\n", c, i);

we'd see the decimal value (printed by %d) of the character 'A', followed by the character (whatever it is) which happens to have the decimal value 97.

You have to be careful when calling printf. It has no way of knowing how many arguments you've passed it or what their types are other than by looking for the format specifiers in the format string. If there are more format specifiers (that is, more % signs) than there are arguments, or if the arguments have the wrong types for the format specifiers, printf can misbehave badly, often printing nonsense numbers or (even worse) numbers which mislead you into thinking that some other part of your program is broken.

Because of some automatic conversion rules which we haven't covered yet, you have a small amount of latitude in the types of the expressions you pass as arguments to printf. The argument for %c may be of type char or int, and the argument for %d may be of type char or int. The string argument for %s may be a string constant, an array of characters, or a pointer to some characters (though we haven't really covered strings or pointers yet). Finally, the arguments corresponding to %e, %f, and %g may be of types float or double. But other combinations do not work reliably: %d will not print a long int or a float or a double; %ld will not print an int; %e, %f, and %g will not print an int.
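Here is a sketch of what "matching" looks like in practice. As an assumption for the sake of having something checkable, it again uses snprintf to store the text in an array rather than print it, and the function names are invented. The second function uses a cast, (double)count, to convert an int explicitly before handing it to %f; casts are a feature we haven't formally covered:

```c
#include <stdio.h>

/* Each format specifier is paired with an argument of the type it expects:
   %d with an int, %ld with a long int, %f with a float or double. */
void format_matched(char buf[], int size, int i, long big, float f)
{
    snprintf(buf, size, "%d %ld %f", i, big, f);
}

/* %f will not print an int reliably, so the int is converted first. */
void format_int_as_float(char buf[], int size, int count)
{
    snprintf(buf, size, "%f", (double)count);
}
```

With the arguments 42, 100000L, and 2.5f, format_matched produces "42 100000 2.500000". With the argument 3, format_int_as_float produces "3.000000".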

6.2 Character Input and Output


[This section corresponds to K&R Sec. 1.5]

Unless a program can read some input, it's hard to keep it from doing exactly the same thing every time it's run, and thus being rather boring after a while.

The most basic way of reading input is by calling the function getchar. getchar reads one character from the "standard input," which is usually the user's keyboard, but which can sometimes be redirected by the operating system. getchar returns (rather obviously) the character it reads, or, if there are no more characters available, the special value EOF ("end of file").

A companion function is putchar, which writes one character to the "standard output." (The standard output is, again not surprisingly, usually the user's screen, although it, too, can be redirected. printf, like putchar, prints to the standard output; in fact, you can imagine that printf calls putchar to actually print each of the characters it formats.)

Using these two functions, we can write a very basic program to copy the input, a character at a time, to the output:
    #include <stdio.h>

    /* copy input to output */

    int main()
    {
        int c;

        c = getchar();

        while(c != EOF) {
            putchar(c);
            c = getchar();
        }

        return 0;
    }

This code is straightforward, and I encourage you to type it in and try it out. It reads one character, and if it is not the EOF code, enters a while loop, printing one character and reading another, as long as the character read is not EOF. This is a straightforward loop, although there's one mystery surrounding the declaration of the variable c: if it holds characters, why is it an int?

We said that a char variable could hold integers corresponding to character set values, and that an int could hold integers of more arbitrary values (up to +-32767). Since most character sets contain a few hundred characters (nowhere near 32767), an int variable can in general comfortably hold all char values, and then some. Therefore, there's nothing wrong with declaring c as an int. But in fact, it's important to do so, because getchar can return every character value, plus that special, non-character value EOF, indicating that there are no more characters. Type char is only guaranteed to be able to hold all the character values; it is not guaranteed to be able to hold this "no more characters" value without possibly mixing it up with some actual character value. (It's like trying to cram five pounds of books into a four-pound box, or 13 eggs into a carton that holds a dozen.) Therefore, you should always remember to use an int for anything you assign getchar's return value to.

When you run the character copying program, and it begins copying its input (your typing) to its output (your screen), you may find yourself wondering how to stop it. It stops when it receives end-of-file (EOF), but how do you send EOF? The answer depends on what kind of computer you're using. On Unix and Unix-related systems, it's almost always control-D. On MS-DOS machines, it's control-Z followed by the RETURN key. Under Think C on the Macintosh, it's control-D, just like Unix. On other systems, you may have to do some research to learn how to send EOF.

(Note, too, that the character you type to generate an end-of-file condition from the keyboard is not the same as the special EOF value returned by getchar. The EOF value returned by getchar is a code indicating that the input system has detected an end-of-file condition, whether it's reading the keyboard or a file or a magnetic tape or a network connection or anything else. In a disk file, at least, there is not likely to be any character in the file corresponding to EOF; as far as your program is concerned, EOF indicates the absence of any more characters to read.)

Another excellent thing to know when doing any kind of programming is how to terminate a runaway program. If a program is running forever waiting for input, you can usually stop it by sending it an end-of-file, as above, but if it's running forever not waiting for something, you'll have to take more drastic measures. Under Unix, control-C (or, occasionally, the DELETE key) will terminate the current program, almost no matter what. Under MS-DOS, control-C or control-BREAK will sometimes terminate the current program, but by default MS-DOS only checks for control-C when it's looking for input, so an infinite loop can be unkillable. There's a DOS command,

    break on

which tells DOS to look for control-C more often, and I recommend using this command if you're doing any programming. (If a program is in a really tight infinite loop under MS-DOS, there can be no way of killing it short of rebooting.) On the Mac, try command-period or command-option-ESCAPE.

Finally, don't be disappointed (as I was) the first time you run the character copying program. You'll type a character, and see it on the screen right away, and assume it's your program working, but it's only your computer echoing every key you type, as it always does. When you hit RETURN, a full line of characters is made available to your program. It then zips several times through its loop, reading and printing all the characters in the line in quick succession. In other words, when you run this program, it will probably seem to copy the input a line at a time, rather than a character at a time. You may wonder how a program could instead read a character right away, without waiting for the user to hit RETURN. That's an excellent question, but unfortunately the answer is rather complicated, and beyond the scope of our discussion here. (Among other things, how to read a character right away is one of the things that's not defined by the C language, and it's not defined by any of the standard library functions, either. How to do it depends on which operating system you're using.)

Stylistically, the character-copying program above can be said to have one minor flaw: it contains two calls to getchar, one which reads the first character and one which reads (by virtue of the fact that it's in the body of the loop) all the other characters. This seems inelegant and perhaps unnecessary, and it can also be risky: if there were more things going on within the loop, and if we ever changed the way we read characters, it would be easy to change one of the getchar calls but forget to change the other one. Is there a way to rewrite the loop so that there is only one call to getchar, responsible for reading all the characters? Is there a way to read a character, test it for EOF, and assign it to the variable c, all at the same time? There is. It relies on the fact that the assignment operator, =, is just another operator in C. An assignment is not (necessarily) a standalone statement; it is an expression, and it has a value (the value that's assigned to the variable on the left-hand side), and it can therefore participate in a larger, surrounding expression. Therefore, most C programmers would write the character-copying loop like this:
    while((c = getchar()) != EOF)
        putchar(c);

What does this mean? The function getchar is called, as before, and its return value is assigned to the variable c. Then the value is immediately compared against the value EOF. Finally, the true/false value of the comparison controls the while loop: as long as the value is not EOF, the loop continues executing, but as soon as an EOF is received, no more trips through the loop are taken, and it exits. The net result is that the call to getchar happens inside the test at the top of the while loop, and doesn't have to be repeated before the loop and within the loop (more on this in a bit).

Stated another way, the syntax of a while loop is always

    while( expression ) ...

A comparison (using the != operator) is of course an expression; the syntax is

    expression != expression

And an assignment is an expression; the syntax is

    expression = expression

What we're seeing is just another example of the fact that expressions can be combined with essentially limitless generality and therefore infinite variety. The left-hand side of the != operator (its first expression) is the (sub)expression c = getchar(), and the combined expression is the expression needed by the while loop.

The extra parentheses around

    (c = getchar())

are important, and are there because the precedence of the != operator is higher than that of the = operator. If we (incorrectly) wrote

    while(c = getchar() != EOF)        /* WRONG */

the compiler would interpret it as

    while(c = (getchar() != EOF))

That is, it would assign the result of the != operator to the variable c, which is not what we want.
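The effect of the two groupings can be seen without any input at all, in a sketch that walks an array of ints instead of calling getchar. Everything here is made up for illustration: the function names, and the use of -1 as a stand-in for EOF to mark the end of the values. The second function spells out, with explicit parentheses, the grouping the compiler would apply if the inner parentheses were left off:

```c
/* Correct grouping: c receives each value, which is then tested against -1. */
int sum_with_parens(const int values[])
{
    int i = 0, c, sum = 0;
    while ((c = values[i]) != -1) {
        sum = sum + c;
        i = i + 1;
    }
    return sum;
}

/* The grouping the compiler gives to  while(c = values[i] != -1) :
   c receives the 1-or-0 result of the comparison, not the value itself. */
int sum_grouped_wrongly(const int values[])
{
    int i = 0, c, sum = 0;
    while ((c = (values[i] != -1)) != 0) {
        sum = sum + c;
        i = i + 1;
    }
    return sum;
}
```

For an array holding 5, 7, 9, -1, the first function returns 21, the sum of the values. The second returns 3: both loops make three trips, but in the second c is 1 on every trip. In the character-copying program the same mistake would call putchar with the value 1 for every character read.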

("Precedence" refers to the rules for which operators are applied to their operands in which order, that is, to the rules controlling the default grouping of expressions and subexpressions. For example, the multiplication operator * has higher precedence than the addition operator +, which means that the expression a + b * c is parsed as a + (b * c). We'll have more to say about precedence later.)

The line

    while((c = getchar()) != EOF)

epitomizes the cryptic brevity which C is notorious for. You may find this terseness infuriating (and you're not alone!), and it can certainly be carried too far, but bear with me for a moment while I defend it.

The simple example we've been discussing illustrates the tradeoffs well. We have four things to do:

1. call getchar,
2. assign its return value to a variable,
3. test the return value against EOF, and
4. process the character (in this case, print it out again).

We can5t eliminate any of these steps. We have to assign %etchar5s value to a variable ,e can5t ?ust use it directly% because ,e have to do t,o different things ,ith it test! and print%. Therefore! compressing the assignment and test into the same line is the only good ,ay of avoiding t,o distinct calls to %etchar. >ou may not agree that the compressed idiom is better for being more compact or easier to read! but the fact that there is no, only one call to %etchar is a real virtue. )on5t thin& that you5ll have to ,rite compressed lines li&e
while((c ) %etchar()) !) >LH)

right a,ay! or in order to be an 99expert C programmer.55 'ut! for better or ,orse! most experienced C programmers do li&e to use these idioms ,hether they5re ?ustified or not%! so you5ll need to be able to at least recogni@e and understand them ,hen you5re reading other peoples5 code.

6.3 Reading Lines

It's often convenient for a program to process its input not a character at a time but rather a line at a time, that is, to read an entire line of input and then act on it all at once. The standard C library has a couple of functions for reading lines, but they have a few awkward features, so we're going to learn more about character input (and about writing functions in general) by writing our own function to read one line. Here it is:

    #include <stdio.h>

    /* Read one line from standard input, */
    /* copying it to line array (but no more than max chars). */
    /* Does not place terminating \n in line array. */
    /* Returns line length, or 0 for empty line, or EOF for end-of-file. */

    int getline(char line[], int max)
    {
        int nch = 0;
        int c;
        max = max - 1;          /* leave room for '\0' */

        while((c = getchar()) != EOF)
        {
            if(c == '\n')
                break;

            if(nch < max)
            {
                line[nch] = c;
                nch = nch + 1;
            }
        }

        if(c == EOF && nch == 0)
            return EOF;

        line[nch] = '\0';
        return nch;
    }

As the comment indicates, this function will read one line of input from the standard input, placing it into the line array. The size of the line array is given by the max argument; the function will never write more than max characters into line.

The main body of the function is a getchar loop, much as we used in the character-copying program. In the body of this loop, however, we're storing the characters in an array (rather than immediately printing them out). Also, we're only reading one line of characters, then stopping and returning.

There are several new things to notice here.

First of all, the getline function accepts an array as a parameter. As we've said, array parameters are an exception to the rule that functions receive copies of their arguments: in the case of arrays, the function does have access to the actual array passed by the caller, and can modify it. Since the function is accessing the caller's array, not creating a new one to hold a copy, the function does not have to declare the argument array's size; it's set by the caller. (Thus, the brackets in "char line[]" are empty.) However, so that we won't overflow the caller's array by reading too long a line into it, we allow the caller to pass along the size of the array, which we promise not to exceed.

Second, we see an example of the break statement. The top of the loop looks like our earlier character-copying loop (it stops when it reaches EOF), but we only want this loop to read one line, so we also stop (that is, break out of the loop) when we see the \n character signifying end-of-line. An equivalent loop, without the break statement, would be

    while((c = getchar()) != EOF && c != '\n')
    {
        if(nch < max)
        {
            line[nch] = c;
            nch = nch + 1;
        }
    }

We haven't learned about the internal representation of strings yet, but it turns out that strings in C are simply arrays of characters, which is why we are reading the line into an array of characters. The end of a string is marked by the special character, '\0'. To make sure that there's always room for that character, on our way in we subtract 1 from max, the argument that tells us how many characters we may place in the line array. When we're done reading the line, we store the end-of-string character '\0' at the end of the string we've just built in the line array.

Finally, there's one subtlety in the code which isn't too important for our purposes now but which you may wonder about: it's arranged to handle the possibility that a few characters (i.e. the apparent beginning of a line) are read, followed immediately by an EOF, without the usual \n end-of-line character. (That's why we return EOF only if we received EOF and we hadn't read any characters first.)

In any case, the function returns the length (number of characters) of the line it read, not including the \n. (Therefore, it returns 0 for an empty line.) Like getchar, it returns EOF when there are no more lines to read. (It happens that EOF is a negative number, so it will never match the length of a line that getline has read.)

Here is an example of a test program which calls getline, reading the input a line at a time and then printing each line back out:

    #include <stdio.h>

    extern int getline(char [], int);

    int main()
    {
        char line[256];

        while(getline(line, 256) != EOF)
            printf("you typed \"%s\"\n", line);

        return 0;
    }

The notation char [] in the function prototype for getline says that getline accepts as its first argument an array of char. When the program calls getline, it is careful to pass along the actual size of the array. (You might notice a potential problem: since the number 256 appears in two places, if we ever decide that 256 is too small, and that we want to be able to read longer lines, we could easily change one of the instances of 256, and forget to change the other one. Later we'll learn ways of solving, that is, avoiding, this sort of problem.)
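One practical caveat: on POSIX systems, <stdio.h> declares a function of its own named getline, with a different parameter list, so compiling our function under that name may produce a "conflicting types" error. The sketch below repeats the same logic under a hypothetical name, readline_from. As a further assumption not made in these notes, it reads from a FILE * stream using getc (which works like getchar, but on any open stream), so that it can be tried on a file as well as on the standard input:

```c
#include <stdio.h>

/* Hypothetical variant of the getline function above.
   Reads one line from the stream fp into line[], storing at most
   max-1 characters plus the terminating '\0'.
   Returns the line length, 0 for an empty line, or EOF at end-of-file. */
int readline_from(FILE *fp, char line[], int max)
{
    int nch = 0;
    int c;
    max = max - 1;              /* leave room for '\0' */

    while((c = getc(fp)) != EOF)
    {
        if(c == '\n')
            break;
        if(nch < max)
        {
            line[nch] = c;
            nch = nch + 1;
        }
    }

    /* EOF is reported only if nothing at all was read first */
    if(c == EOF && nch == 0)
        return EOF;

    line[nch] = '\0';
    return nch;
}
```

A call such as readline_from(stdin, line, 256) should behave like getline(line, 256). Note that a line too long for the array is truncated, and the rest of it is read and discarded rather than being left for the next call.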

6.4 Reading Numbers

The getline function of the previous section reads one line from the user, as a string. What if we want to read a number? One straightforward way is to read a string as before, and then immediately convert the string to a number. The standard C library contains a number of functions for doing this. The simplest to use are atoi(), which converts a string to an integer, and atof(), which converts a string to a floating-point number. (Both of these functions are declared in the header <stdlib.h>, so you should #include that header at the top of any file using these functions.) You could read an integer from the user like this:

    #include <stdlib.h>

    char line[256];
    int n;
    printf("Type an integer:\n");
    getline(line, 256);
    n = atoi(line);

Now the variable n contains the number typed by the user. (This assumes that the user did type a valid number, and that getline did not return EOF.)

Reading a floating-point number is similar:

    #include <stdlib.h>

    char line[256];
    double x;
    printf("Type a floating-point number:\n");
    getline(line, 256);
    x = atof(line);

(atof is actually declared as returning type double, but you could also use it with a variable of type float, because in general, C automatically converts between float and double as needed.)

Another way of reading in numbers, which you're likely to see in other books on C, involves the scanf function, but it has several problems, so we won't discuss it for now. (Superficially, scanf seems simple enough, which is why it's often used, especially in textbooks. The trouble is that to perform input reliably using scanf is not nearly as easy as it looks, especially when you're not sure what the user is going to type.)
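To see the conversions in isolation, here is a small sketch. The two helper functions are hypothetical, and they take already-read lines as arguments so that no input is needed to try them:

```c
#include <stdlib.h>

/* Hypothetical helper: convert two lines of digits to integers
   and add them. */
int sum_of_lines(char first[], char second[])
{
    return atoi(first) + atoi(second);
}

/* Hypothetical helper: convert a line to a floating-point number
   and halve it. */
double half_of_line(char line[])
{
    return atof(line) / 2;
}
```

For instance, sum_of_lines("12", "30") gives 42. The caveat about valid numbers matters in practice: atoi gives back 0 when the string does not begin with a number at all, so a result of 0 does not tell you whether the user typed "0" or something like "abc".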

Chapter 7: More Operators

In this chapter we'll meet some (though still not all) of C's more advanced arithmetic operators. The ones we'll meet here have to do with making common patterns of operations easier.

It's extremely common in programming to have to increment a variable by 1, that is, to add 1 to it. (For example, if you're processing each element of an array, you'll typically write a loop with an index or pointer variable stepping through the elements of the array, and you'll increment the variable each time through the loop.) The classic way to increment a variable is with an assignment like

    i = i + 1

Such an assignment is perfectly common and acceptable, but it has a few slight problems:

1. As we've mentioned, it looks a little odd, especially from an algebraic perspective.

2. If the object being incremented is not a simple variable, the idiom can become cumbersome to type, and correspondingly more error-prone. For example, the expression

       a[i+j+2*k] = a[i+j+2*k] + 1

   is a bit of a mess, and you may have to look closely to see that the similar-looking expression

       a[i+j+2*k] = a[i+j+2+k] + 1

   probably has a mistake in it.

3. Since incrementing things is so common, it might be nice to have an easier way of doing it.

In fact, C provides not one but two other, simpler ways of incrementing variables and performing other similar operations.

7.1 Assignment Operators

[This section corresponds to K&R Sec. 2.10]

The first and more general way is that any time you have the pattern

    v = v op e

where v is any variable (or anything like a[i]), op is any of the binary arithmetic operators we've seen so far, and e is any expression, you can replace it with the simplified

    v op= e

For example, you can replace the expressions

    i = i + 1
    j = j - 1
    k = k * (n + 1)
    a[i] = a[i] / b

with

    i += 1
    j -= 1
    k *= n + 1
    a[i] /= b

In an example in a previous chapter, we used the assignment

    a[d1 + d2] = a[d1 + d2] + 1;

to count the rolls of a pair of dice. Using +=, we could simplify this expression to

    a[d1 + d2] += 1;

As these examples show, you can use the "op=" form with any of the arithmetic operators (and with several other operators that we haven't seen yet). The expression, e, does not have to be the constant 1; it can be any expression. You don't always need as many explicit parentheses when using the op= operators: the expression

    k *= n + 1

is interpreted as

    k = k * (n + 1)
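The point about parentheses is easy to check with a small sketch. The helper functions here are hypothetical, written only so that the three spellings can be compared side by side:

```c
/* Hypothetical helper: the longhand form, with explicit parentheses. */
int longhand(int k, int n)
{
    k = k * (n + 1);
    return k;
}

/* Hypothetical helper: the op= form.  The whole right-hand side,
   n + 1, is evaluated before the multiplication. */
int shorthand(int k, int n)
{
    k *= n + 1;
    return k;
}

/* For contrast: longhand without the parentheses means (k * n) + 1,
   which is a different computation. */
int no_parens(int k, int n)
{
    k = k * n + 1;
    return k;
}
```

With k equal to 3 and n equal to 4, the first two functions both give 15, while the third gives 13.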

7.2 Increment and Decrement Operators

[This section corresponds to K&R Sec. 2.8]

The assignment operators of the previous section let us replace v = v op e with v op= e, so that we didn't have to mention v twice. In the most common cases, namely when we're adding or subtracting the constant 1 (that is, when op is + or - and e is 1), C provides another set of shortcuts: the autoincrement and autodecrement operators. In their simplest forms, they look like this:

    ++i        add 1 to i
    --j        subtract 1 from j

These correspond to the slightly longer i += 1 and j -= 1, respectively, and also to the fully "longhand" forms i = i + 1 and j = j - 1.

The ++ and -- operators apply to one operand (they're unary operators). The expression ++i adds 1 to i, and stores the incremented result back in i. This means that these operators don't just compute new values; they also modify the value of some variable. (They share this property, modifying some variable, with the assignment operators; we can say that these operators all have side effects. That is, they have some effect, on the side, other than just computing a new value.)

The incremented (or decremented) result is also made available to the rest of the expression, so an expression like

    k = 2 * ++i

means "add one to i, store the result back in i, multiply it by 2, and store that result in k." (This is a pretty meaningless expression; our actual uses of ++ later will make more sense.)

Both the ++ and -- operators have an unusual property: they can be used in two ways, depending on whether they are written to the left or the right of the variable they're operating on. In either case, they increment or decrement the variable they're operating on; the difference concerns whether it's the old or the new value that's "returned" to the surrounding expression. The prefix form ++i increments i and returns the incremented value. The postfix form i++ increments i, but returns the prior, non-incremented value. Rewriting our previous example slightly, the expression

    k = 2 * i++

means "take i's old value and multiply it by 2, increment i, store the result of the multiplication in k."

The distinction between the prefix and postfix forms of ++ and -- will probably seem strained at first, but it will make more sense once we begin using these operators in more realistic situations.

For example, our getline function of the previous chapter used the statements

    line[nch] = c;
    nch = nch + 1;

as the body of its inner loop. Using the ++ operator, we could simplify this to

    line[nch++] = c;

We wanted to increment nch after deciding which element of the line array to store into, so the postfix form nch++ is appropriate.

Notice that it only makes sense to apply the ++ and -- operators to variables (or to other "containers," such as a[i]). It would be meaningless to say something like

    1++

or

    (2+3)++

The ++ operator doesn't just mean "add one"; it means "add one to a variable" or "make a variable's value one more than it was before." But (2+3) is not a variable, it's an expression; so there's no place for ++ to store the incremented result.

Another unfortunate example is

    i = i++;

which some confused programmers sometimes write, presumably because they want to be extra sure that i is incremented by 1. But i++ all by itself is sufficient to increment i by 1; the extra (explicit) assignment to i is unnecessary and in fact counterproductive, meaningless, and incorrect. If you want to increment i (that is, add one to it, and store the result back in i), either use

    i = i + 1;

or

    i += 1;

or

    ++i;

or

    i++;

Don't try to use some bizarre combination.

Did it matter whether we used ++i or i++ in this last example? Remember, the difference between the two forms is what value (either the old or the new) is passed on to the surrounding expression. If there is no surrounding expression, if the ++i or i++ appears all by itself, to increment i and do nothing else, you can use either form; it makes no difference. (Two ways that an expression can appear "all by itself," with "no surrounding expression," are when it is an expression statement terminated by a semicolon, as above, or when it is one of the controlling expressions of a for loop.) For example, both the loops

    for(i = 0; i < 10; ++i)
        printf("%d\n", i);

and

    for(i = 0; i < 10; i++)
        printf("%d\n", i);

will behave exactly the same way and produce exactly the same results. (In real code, postfix increment is probably more common, though prefix definitely has its uses, too.)

In the preceding section, we simplified the expression

    a[d1 + d2] = a[d1 + d2] + 1;

from a previous chapter down to

    a[d1 + d2] += 1;

Using ++, we could simplify it still further to

    a[d1 + d2]++;

or

    ++a[d1 + d2];

(Again, in this case, both are equivalent.)

We'll see more examples of these operators in the next section and in the next chapter.
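The difference between the prefix and postfix forms shows up only in the value handed to the surrounding expression, as this small sketch illustrates. The helper functions are hypothetical; each one works on its own private copy of i:

```c
/* Hypothetical helper: prefix form.
   i is incremented, and the new value takes part in the multiplication. */
int use_prefix(int i)
{
    int k;
    k = 2 * ++i;
    return k;
}

/* Hypothetical helper: postfix form.
   The old value takes part in the multiplication; i is incremented too. */
int use_postfix(int i)
{
    int k;
    k = 2 * i++;
    return k;
}
```

Starting from 3, use_prefix gives 8 (that is, 2 * 4), while use_postfix gives 6 (that is, 2 * 3). In both functions i itself ends up as 4.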

7.3 Order of Evaluation

[This section corresponds to K&R Sec. 2.12]

When you start using the ++ and -- operators in larger expressions, you end up with expressions which do several things at once, i.e., they modify several different variables at more or less the same time. When you write such an expression, you must be careful not to have the expression "pull the rug out from under itself" by assigning two different values to the same variable, or by assigning a new value to a variable at the same time that another part of the expression is trying to use the value of that variable.

Actually, we had already started writing expressions which did several things at once even before we met the ++ and -- operators. The expression

    (c = getchar()) != EOF

assigns getchar's return value to c, and compares it to EOF. The ++ and -- operators make it much easier to cram a lot into a small expression: the example

    line[nch++] = c;

from the previous section assigned c to line[nch], and incremented nch. We'll eventually meet expressions which do three things at once, such as

    a[i++] = b[j++];

which assigns b[j] to a[i], and increments i, and increments j.

If you're not careful, though, it's easy for this sort of thing to get out of hand. Can you figure out exactly what the expression

    a[i++] = b[i++];        /* WRONG */

should do? I can't, and here's the important part: neither can the compiler. We know that the definition of postfix ++ is that the former value, before the increment, is what goes on to participate in the rest of the expression, but the expression a[i++] = b[i++] contains two ++ operators. Which of them happens first? Does this expression assign the old ith element of b to the new ith element of a, or vice versa? No one knows.

When the order of evaluation matters but is not well-defined (that is, when we can't say for sure which order the compiler will evaluate the various dependent parts in) we say that the meaning of the expression is undefined, and if we're smart we won't write the expression in the first place. (Why would anyone ever write an "undefined" expression? Because sometimes, the compiler happens to evaluate it in the order a programmer wanted, and the programmer assumes that since it works, it must be okay.)

For example, suppose we carelessly wrote this loop:

    int i, a[10];
    i = 0;
    while(i < 10)
        a[i] = i++;         /* WRONG */

It looks like we're trying to set a[0] to 0, a[1] to 1, etc. But what if the increment i++ happens before the compiler decides which cell of the array a to store the (unincremented) result in? We might end up setting a[1] to 0, a[2] to 1, etc., instead. Since, in this case, we can't be sure which order things would happen in, we simply shouldn't write code like this. In this case, what we're doing matches the pattern of a for loop, anyway, which would be a better choice:

    for(i = 0; i < 10; i++)
        a[i] = i;

Now that the increment i++ isn't crammed into the same expression that's setting a[i], the code is perfectly well-defined, and is guaranteed to do what we want.

In general, you should be wary of ever trying to second-guess the order an expression will be evaluated in, with two exceptions:

1. You can obviously assume that precedence will dictate the order in which binary operators are applied. This typically says more than just what order things happen in, but also what the expression actually means. (In other words, the precedence of * over + says more than that the multiplication "happens first" in 1 + 2 * 3; it says that the answer is 7, not 9.)

2. Although we haven't mentioned it yet, it is guaranteed that the logical operators && and || are evaluated left-to-right, and that the right-hand side is not evaluated at all if the left-hand side determines the outcome.

To look at one more example, it might seem that the code

    int i = 7;
    printf("%d\n", i++ * i++);

would have to print 56, because no matter which order the increments happen in, 7*8 is 8*7 is 56. But ++ just says that the increment happens later, not that it happens immediately, so this code could print 49 (if the compiler chose to perform the multiplication first, and both increments later). And, it turns out that ambiguous expressions like this are such a bad idea that the ANSI C Standard does not require compilers to do anything reasonable with them at all. Theoretically, the above code could end up printing 42, or 8923409342, or 0, or crashing your computer.

Programmers sometimes mistakenly imagine that they can write an expression which tries to do too much at once and then predict exactly how it will behave based on "order of evaluation." For example, we know that multiplication has higher precedence than addition, which means that in the expression

    i + j * k

j will be multiplied by k, and then i will be added to the result. Informally, we often say that the multiplication happens "before" the addition. That's true in this case, but it doesn't say as much as we might think about a more complicated expression, such as

    i++ + j++ * k++

In this case, besides the addition and multiplication, i, j, and k are all being incremented. We can not say which of them will be incremented first; it's the compiler's choice. (In particular, it is not necessarily the case that j++ or k++ will happen first; the compiler might choose to save i's value somewhere and increment i first, even though it will have to keep the old value around until after it has done the multiplication.)

In the preceding example, it probably doesn't matter which variable is incremented first. It's not too hard, though, to write an expression where it does matter. In fact, we've seen one already: the ambiguous assignment a[i++] = b[i++]. We still don't know which i++ happens first. (We can not assume, based on the right-to-left behavior of the = operator, that the right-hand i++ will happen first.) But if we had to know what a[i++] = b[i++] really did, we'd have to know which i++ happened first.

Finally, note that parentheses don't dictate overall evaluation order any more than precedence does. Parentheses override precedence and say which operands go with which operators, and they therefore affect the overall meaning of an expression, but they don't say anything about the order of subexpressions or side effects. We could not "fix" the evaluation order of any of the expressions we've been discussing by adding parentheses. If we wrote

    i++ + (j++ * k++)

we still wouldn't know which of the increments would happen first. (The parentheses would force the multiplication to happen before the addition, but precedence already would have forced that, anyway.) If we wrote

    (i++) * (i++)

the parentheses wouldn't force the increments to happen before the multiplication or in any well-defined order; this parenthesized version would be just as undefined as i++ * i++ was.

There's a line from Kernighan & Ritchie, which I am fond of quoting when discussing these issues [Sec. 2.12, p. 54]:

    The moral is that writing code that depends on order of evaluation is a bad programming practice in any language. Naturally, it is necessary to know what things to avoid, but if you don't know how they are done on various machines, you won't be tempted to take advantage of a particular implementation.

The first edition of K&R said

    ...if you don't know how they are done on various machines, that innocence may help to protect you.

I actually prefer the first edition wording. Many textbooks encourage you to write small programs to find out how your compiler implements some of these ambiguous expressions, but it's just one step from writing a small program to find out, to writing a real program which makes use of what you've just learned. But you don't want to write programs that work only under one particular compiler, that take advantage of the way that one compiler (but perhaps no other) happens to implement the undefined expressions. It's fine to be curious about what goes on "under the hood," and many of you will be curious enough about what's going on with these "forbidden" expressions that you'll want to investigate them, but please keep very firmly in mind that, for real programs, the very easiest way of dealing with ambiguous, undefined expressions (which one compiler interprets one way and another interprets another way and a third crashes on) is not to write them in the first place.
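The usual cure for an expression that tries to do too much is to give each side effect a statement of its own, or a variable of its own. The following sketch shows three well-defined fragments; all three function names are hypothetical and chosen just for this illustration:

```c
/* Hypothetical helper: multiply two successive values of a counter.
   Each increment is in its own statement, so the order is not in doubt. */
int product_of_next_two(int i)
{
    int first, second;
    first = i++;
    second = i++;
    return first * second;
}

/* Hypothetical helper: copy b[0..n-1] into a[0..n-1].
   a[i++] = b[j++] is fine here because i and j are different variables. */
void copy_ints(int a[], int b[], int n)
{
    int i, j;
    i = 0;
    j = 0;
    while(j < n)
        a[i++] = b[j++];
}

/* The && operator is one of the two exceptions noted above: its
   right-hand side is evaluated only if the left-hand side is true,
   so the % below never sees a zero divisor. */
int divides_evenly(int n, int d)
{
    return d != 0 && n % d == 0;
}
```

Unlike i++ * i++, a call to product_of_next_two(7) must give 56 on every compiler.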

Chapter 8: Strings

Strings in C are represented by arrays of characters. The end of the string is marked with a special character, the null character, which is simply the character with the value 0. (The null character has no relation except in name to the null pointer. In the ASCII character set, the null character is named NUL.) The null or string-terminating character is represented by another character escape sequence, \0. (We've seen it once already, in the getline function of chapter 6.)

Because C has no built-in facilities for manipulating entire arrays (copying them, comparing them, etc.), it also has very few built-in facilities for manipulating strings.

In fact, C's only truly built-in string-handling is that it allows us to use string constants (also called string literals) in our code. Whenever we write a string, enclosed in double quotes, C automatically creates an array of characters for us, containing that string, terminated by the \0 character. For example, we can declare and define an array of characters, and initialize it with a string constant:

    char string[] = "Hello, world!";

In this case, we can leave out the dimension of the array, since the compiler can compute it for us based on the size of the initializer (14, including the terminating \0). This is the only case where the compiler sizes a string array for us, however; in other cases, it will be necessary that we decide how big the arrays and other data structures we use to hold strings are.

To do anything else with strings, we must typically call functions. The C library contains a few basic string manipulation functions, and to learn more about strings, we'll be looking at how these functions might be implemented.

Since C never lets us assign entire arrays, we use the strcpy function to copy one string to another:

    #include <string.h>

    char string1[] = "Hello, world!";
    char string2[20];

    strcpy(string2, string1);

The destination string is strcpy's first argument, so that a call to strcpy mimics an assignment expression (with the destination on the left-hand side). Notice that we had to allocate string2 big enough to hold the string that would be copied to it. Also, at the top of any source file where we're using the standard library's string-handling functions (such as strcpy) we must include the line

    #include <string.h>

which contains external declarations for these functions.

Since C won't let us compare entire arrays, either, we must call a function to do that, too. The standard library's strcmp function compares two strings, and returns 0 if they are identical, or a negative number if the first string is alphabetically "less than" the second string, or a positive number if the first string is "greater." (Roughly speaking, what it means for one string to be "less than" another is that it would come first in a dictionary or telephone book, although there are a few anomalies.) Here is an example:

    char string3[] = "this is";
    char string4[] = "a test";

    if(strcmp(string3, string4) == 0)
        printf("strings are equal\n");
    else
        printf("strings are different\n");

This code fragment will print "strings are different". Notice that strcmp does not return a Boolean, true/false, zero/nonzero answer, so it's not a good idea to write something like

    if(strcmp(string3, string4))
        ...

because it will behave backwards from what you might reasonably expect. (Nevertheless, if you start reading other people's code, you're likely to come across conditionals like if(strcmp(a, b)) or even if(!strcmp(a, b)). The first does something if the strings are unequal; the second does something if they're equal. You can read these more easily if you pretend for a moment that strcmp's name were strdiff, instead.)

Another standard library function is strcat, which concatenates strings. It does not concatenate two strings together and give you a third, new string; what it really does is append one string onto the end of another. (If it gave you a new string, it would have to allocate memory for it somewhere, and the standard library string functions generally never do that for you automatically.) Here's an example:

    char string5[20] = "Hello, ";
    char string6[] = "world!";

    printf("%s\n", string5);

    strcat(string5, string6);

    printf("%s\n", string5);

The first call to printf prints "Hello, ", and the second one prints "Hello, world!", indicating that the contents of string6 have been tacked on to the end of string5. Notice that we declared string5 with extra space, to make room for the appended characters.

If you have a string and you want to know its length (perhaps so that you can check whether it will fit in some other array you've allocated for it), you can call strlen, which returns the length of the string (i.e. the number of characters in it), not including the \0:

    char string7[] = "abc";
    int len = strlen(string7);
    printf("%d\n", len);
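As a sketch of the kind of check just mentioned, here is a hypothetical helper (append_if_fits is not a standard library function) which uses strlen to decide whether a strcat is safe before doing it. It assumes the caller passes along the true size of the destination array, much as we did for getline:

```c
#include <string.h>

/* Hypothetical helper: append src to dest only if the result, including
   the terminating '\0', fits in the destsize characters of dest.
   Returns 1 if the append was done, 0 if it would not have fit. */
int append_if_fits(char dest[], int destsize, char src[])
{
    int needed = (int)(strlen(dest) + strlen(src)) + 1;

    if(needed > destsize)
        return 0;

    strcat(dest, src);
    return 1;
}
```

With a 20-character array holding "Hello, ", appending "world!" needs 7 + 6 + 1 = 14 characters, so it succeeds; with a 10-character array the function declines and leaves the destination untouched.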

Finally, you can print strings out with printf using the %s format specifier, as we've been doing in these examples already (e.g. printf("%s\n", string5);).

Since a string is just an array of characters, all of the string-handling functions we've just seen can be written quite simply, using no techniques more complicated than the ones we already know. In fact, it's quite instructive to look at how these functions might be implemented. Here is a version of strcpy:

    mystrcpy(char dest[], char src[])
    {
        int i = 0;

        while(src[i] != '\0')
        {
            dest[i] = src[i];
            i++;
        }

        dest[i] = '\0';
    }

We've called it mystrcpy instead of strcpy so that it won't clash with the version that's already in the standard library. Its operation is simple: it looks at characters in the src string one at a time, and as long as they're not \0, assigns them, one by one, to the corresponding positions in the dest string. When it's done, it terminates the dest string by appending a \0. (After exiting the while loop, i is guaranteed to have a value one greater than the subscript of the last character in src.) For comparison, here's a way of writing the same code, using a for loop:

    for(i = 0; src[i] != '\0'; i++)
        dest[i] = src[i];

    dest[i] = '\0';

Yet a third possibility is to move the test for the terminating \0 character out of the for loop header and into the body of the loop, using an explicit if and break statement, so that we can perform the test after the assignment and therefore use the assignment inside the loop to copy the \0 to dest, too:

    for(i = 0; ; i++)
    {
        dest[i] = src[i];
        if(src[i] == '\0')
            break;
    }
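One more variant, sketched here under the hypothetical name mystrcpy2, performs the copy inside the loop test itself, in the same way that our getchar loop in chapter 6 combined an assignment with a comparison. The void return type is an addition for the sake of current compilers:

```c
/* Hypothetical variant of mystrcpy.
   Each character, including the final '\0', is copied by the assignment
   in the loop test; the loop stops right after the '\0' has been copied. */
void mystrcpy2(char dest[], char src[])
{
    int i = 0;

    while((dest[i] = src[i]) != '\0')
        i++;
}
```

Because the \0 is copied by the same assignment that is being tested, no separate terminating statement is needed after the loop.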

(There are in fact many, many ways to write strcpy. Many programmers like to combine the assignment and test, using an expression like (dest[i] = src[i]) != '\0'. This is actually the same sort of combined operation as we used in our getchar loop in chapter 6.)

Here is a version of strcmp:

    mystrcmp(char str1[], char str2[])
    {
        int i = 0;

        while(1)
        {
            if(str1[i] != str2[i])
                return str1[i] - str2[i];
            if(str1[i] == '\0' || str2[i] == '\0')
                return 0;
            i++;
        }
    }

Characters are compared one at a time. If two characters in one position differ, the strings are different, and we are supposed to return a value less than zero if the first string (str1) is alphabetically less than the second string. Since characters in C are represented by their numeric character set values, and since most reasonable character sets assign values to characters in alphabetical order, we can simply subtract the two differing characters from each other: the expression str1[i] - str2[i] will yield a negative result if the i'th character of str1 is less than the corresponding character in str2. (As it turns out, this will behave a bit strangely when comparing upper- and lower-case letters, but it's the traditional approach, which the standard versions of strcmp tend to use.) If the characters are the same, we continue around the loop, unless the characters we just compared were (both) \0, in which case we've reached the end of both strings, and they were equal. Notice that we used what may at first appear to be an infinite loop: the controlling expression is the constant 1, which is always true. What actually happens is that the loop runs until one of the two return statements breaks out of it (and the entire function). Note also that when one string is longer than the other, the first test will notice this (because one string will contain a real character at the [i] location, while the other will contain \0, and these are not equal) and the return value will be computed by subtracting the real character's value from 0, or vice versa. (Thus the shorter string will be treated as "less than" the longer.)

Finally, here is a version of strlen:

    int mystrlen(char str[])
    {
        int i;

        for(i = 0; str[i] != '\0'; i++)
            {}

        return i;
    }

In this case, all we have to do is find the \0 that terminates the string, and it turns out that the three control expressions of the for loop do all the work; there's nothing left to do in the body. Therefore, we use an empty pair of braces {} as the loop body. Equivalently, we could use a null statement, which is simply a semicolon:

    for(i = 0; str[i] != '\0'; i++)
        ;

Empty loop bodies can be a bit startling at first, but they're not unheard of.

Everything we've looked at so far has come out of C's standard libraries. As one last example, let's write a substr function, for extracting a substring out of a larger string. We might call it like this:

    char string8[] = "this is a test";
    char string9[10];
    substr(string9, string8, 5, 4);
    printf("%s\n", string9);

The idea is that we'll extract a substring of length 4, starting at character 5 (0-based) of string8, and copy the substring to string9. Just as with strcpy, it's our responsibility to declare the destination string (string9) big enough. Here is an implementation of substr. Not surprisingly, it's quite similar to strcpy:

    substr(char dest[], char src[], int offset, int len)
    {
        int i;
        for(i = 0; i < len && src[offset + i] != '\0'; i++)
            dest[i] = src[i + offset];
        dest[i] = '\0';
    }

3H

If you compare this code to the code for mystrcpy, you'll see that the only differences are that characters are fetched from src[offset + i] instead of src[i], and that the loop stops when len characters have been copied (or when the src string runs out of characters, whichever comes first).

In this chapter, we've been careless about declaring the return types of the string functions, and (with the exception of mystrlen) they haven't returned values. The real string functions do return values, but they're of type "pointer to character," which we haven't discussed yet.

When working with strings, it's important to keep firmly in mind the differences between characters and strings. We must also occasionally remember the way characters are represented, and the relation between character values and integers.

As we have had several occasions to mention, a character is represented internally as a small integer, with a value depending on the character set in use. For example, we might find that 'A' had the value 65, that 'a' had the value 97, and that '+' had the value 43. (These are, in fact, the values in the ASCII character set, which most computers use. However, you don't need to learn these values, because the vast majority of the time, you use character constants to refer to characters, and the compiler worries about the values for you. Using character constants in preference to raw numeric values also makes your programs more portable.)

As we may also have mentioned, there is a big difference between a character and a string, even a string which contains only one character (other than the \0). For example, 'A' is not the same as "A". To drive home this point, let's illustrate it with a few examples.

If you have a string:
    char string[] = "hello, world!";

you can modify its first character by saying

    string[0] = 'H';

(Of course, there's nothing magic about the first character; you can modify any character in the string in this way. Be aware, though, that it is not always safe to modify strings in-place like this; we'll say more about the modifiability of strings in a later chapter on pointers.) Since you're replacing a character, you want a character constant, 'H'. It would not be right to write

    string[0] = "H";    /* WRONG */

because "H" is a string (an array of characters), not a single character. (The destination of the assignment, string[0], is a char, but the right-hand side is a string; these types don't match.)

On the other hand, when you need a string, you must use a string. To print a single newline, you could call

    printf("\n");

It would not be correct to call

    printf('\n');    /* WRONG */

printf always wants a string as its first argument. (As one final example, putchar wants a single character, so putchar('\n') would be correct, and putchar("\n") would be incorrect.)

We must also remember the difference between strings and integers. If we treat the character '1' as an integer, perhaps by saying

    int i = '1';

we will probably not get the value 1 in i; we'll get the value of the character '1' in the machine's character set. (In ASCII, it's 49.) When we do need to find the numeric value of a digit character (or to go the other way, to get the digit character with a particular value) we can make use of the fact that, in any character set used by C, the values for the digit characters, whatever they are, are contiguous. In other words, no matter what values '0' and '1' have, '1' - '0' will be 1 (and, obviously, '0' - '0' will be 0). So, for a variable c holding some digit character, the expression

    c - '0'

gives us its value. (Similarly, for an integer value i, i + '0' gives us the corresponding digit character, as long as 0 <= i <= 9.)

Just as the character '1' is not the integer 1, the string "123" is not the integer 123. When we have a string of digits, we can convert it to the corresponding integer by calling the standard function atoi:

    char string[] = "123";
    int i = atoi(string);
    int j = atoi("456");

Later we'll learn how to go in the other direction, to convert an integer into a string. (One way, as long as what you want to do is print the number out, is to call printf, using %d in the format string.)


Chapter 9: The C Preprocessor

Conceptually, the "preprocessor" is a translation phase that is applied to your source code before the compiler proper gets its hands on it. (Once upon a time, the preprocessor was a separate program, much as the compiler and linker may still be separate programs today.) Generally, the preprocessor performs textual substitutions on your source code, in three sorts of ways:

File inclusion: inserting the contents of another file into your source file, as if you had typed it all in there.

Macro substitution: replacing instances of one piece of text with another.

Conditional compilation: arranging that, depending on various circumstances, certain parts of your source code are seen or not seen by the compiler at all.

The next three sections will introduce these three preprocessing functions.

The syntax of the preprocessor is different from the syntax of the rest of C in several respects. First of all, the preprocessor is "line based." Each of the preprocessor directives we're going to learn about (all of which begin with the # character) must begin at the beginning of a line, and each ends at the end of the line. (The rest of C treats line ends as just another whitespace character, and doesn't care how your program text is arranged into lines.) Secondly, the preprocessor does not know about the structure of C--about functions, statements, or expressions. It is possible to play strange tricks with the preprocessor to turn something which does not look like C into C (or vice versa). It's also possible to run into problems when a preprocessor substitution does not do what you expected it to, because the preprocessor does not respect the structure of C statements and expressions (but you expected it to). For the simple uses of the preprocessor we'll be discussing, you shouldn't have any of these problems, but you'll want to be careful before doing anything tricky or outrageous with the preprocessor. (As it happens, playing tricky and outrageous games with the preprocessor is considered sporting in some circles, but it rapidly gets out of hand, and can lead to bewilderingly impenetrable programs.)

9.1 File Inclusion

[This section corresponds to K&R Sec. 4.11.1]

A line of the form

    #include <filename.h>

or

    #include "filename.h"

causes the contents of the file filename.h to be read, parsed, and compiled at that point. (After filename.h is processed, compilation continues on the line following the #include line.) For example, suppose you got tired of retyping external function prototypes such as

    extern int getline(char [], int);

at the top of each source file. You could instead place the prototype in a header file, perhaps getline.h, and then simply place

    #include "getline.h"

at the top of each source file where you called getline. (You might not find it worthwhile to create an entire header file for a single function, but if you had a package of several related functions, it might be very useful to place all of their declarations in one header file.) As we may have mentioned, that's exactly what the Standard header files such as stdio.h are--collections of declarations (including external function prototype declarations) having to do with various sets of Standard library functions. When you use #include to read in a header file, you automatically get the prototypes and other declarations it contains, and you should use header files, precisely so that you will get the prototypes and other declarations they contain.

The difference between the <> and "" forms is where the preprocessor searches for filename.h. As a general rule, it searches for files enclosed in <> in central, standard directories, and it searches for files enclosed in "" in the "current directory," or the directory containing the source file that's doing the including. Therefore, "" is usually used for header files you've written, and <> is usually used for headers which are provided for you (which someone else has written).

The extension ".h", by the way, simply stands for "header," and reflects the fact that #include directives usually sit at the top (head) of your source files, and contain global declarations and definitions which you would otherwise put there. (That extension is not mandatory--you can theoretically name your own header files anything you wish--but .h is traditional, and recommended.)

As we've already begun to see, the reason for putting something in a header file, and then using #include to pull that header file into several different source files, is when the something (whatever it is) must be declared or defined consistently in all of the source files. If, instead of using a header file, you typed the something in to each of the source files directly, and the something ever changed, you'd have to edit all those source files, and if you missed one, your program could fail in subtle (or serious) ways due to the mismatched declarations (i.e. due to the incompatibility between the new declaration in one source file and the old one in a source file you forgot to change). Placing common declarations and definitions into header files means that if they ever change, they only have to be changed in one place, which is a much more workable system.

What should you put in header files?

External declarations of global variables and functions. We said that a global variable must have exactly one defining instance, but that it can have external declarations in many places. We said that it was a grave error to issue an external declaration in one place saying that a variable or function has one type, when the defining instance in some other place actually defines it with another type. (If the two places are two source files, separately compiled, the compiler will probably not even catch the discrepancy.) If you put the external declarations in a header file, however, and include the header wherever it's needed, the declarations are virtually guaranteed to be consistent. It's a good idea to include the header in the source file where the defining instance appears, too, so that the compiler can check that the declaration and definition match. (That is, if you ever change the type, you do still have to change it in two places: in the source file where the defining instance occurs, and in the header file where the external declaration appears. But at least you don't have to change it in an arbitrary number of places, and, if you've set things up correctly, the compiler can catch any remaining mistakes.)

Preprocessor macro definitions (which we'll meet in the next section).

Structure definitions (which we haven't seen yet).

Typedef declarations (which we haven't seen yet).

However, there are a few things not to put in header files:

Defining instances of global variables. If you put these in a header file, and include the header file in more than one source file, the variable will end up multiply defined.

Function bodies (which are also defining instances). You don't want to put these in headers for the same reason--it's likely that you'll end up with multiple copies of the function and hence "multiply defined" errors. People sometimes put commonly-used functions in header files and then use #include to bring them (once) into each program where they use that function, or use #include to bring together the several source files making up a program, but both of these are poor ideas. It's much better to learn how to use your compiler or linker to combine together separately-compiled object files.

Since header files typically contain only external declarations, and should not contain function bodies, you have to understand just what does and doesn't happen when you #include a header file. The header file may provide the declarations for some functions, so that the compiler can generate correct code when you call them (and so that it can make sure that you're calling them correctly), but the header file does not give the compiler the functions themselves. The actual functions will be combined into your program at the end of compilation, by the part of the compiler called the linker. The linker may have to get the functions out of libraries, or you may have to tell the compiler/linker where to find them. In particular, if you are trying to use a third-party library containing some useful functions, the library will often come with a header file describing those functions. Using the library is therefore a two-step process: you must #include the header in the files where you call the library functions, and you must tell the linker to read in the functions from the library itself.

9.2 Macro Definition and Substitution

[This section corresponds to K&R Sec. 4.11.2]

A preprocessor line of the form

    #define name text

defines a macro with the given name, having as its value the given replacement text. After that (for the rest of the current source file), wherever the preprocessor sees that name, it will replace it with the replacement text. The name follows the same rules as ordinary identifiers (it can contain only letters, digits, and underscores, and may not begin with a digit). Since macros behave quite differently from normal variables (or functions), it is customary to give them names which are all capital letters (or at least which begin with a capital letter). The replacement text can be absolutely anything--it's not restricted to numbers, or simple strings, or anything.

The most common use for macros is to propagate various constants around and to make them more self-documenting. We've been saying things like

    char line[100];
    ...
    getline(line, 100);

but this is neither readable nor reliable; it's not necessarily obvious what all those 100's scattered around the program are, and if we ever decide that 100 is too small for the size of the array to hold lines, we'll have to remember to change the number in two (or more) places. A much better solution is to use a macro:

    #define MAXLINE 100
    char line[MAXLINE];
    ...
    getline(line, MAXLINE);

Now, if we ever want to change the size, we only have to do it in one place, and it's more obvious what the words MAXLINE sprinkled through the program mean than the magic numbers 100 did.

Since the replacement text of a preprocessor macro can be anything, it can also be an expression, although you have to realize that, as always, the text is substituted (and perhaps evaluated) later. No evaluation is performed when the macro is defined. For example, suppose that you write something like

    #define A 2
    #define B 3
    #define C A + B

(this is a pretty meaningless example, but the situation does come up in practice). Then, later, suppose that you write

    int x = C * 2;

If A, B, and C were ordinary variables, you'd expect x to end up with the value 10. But let's see what happens.

The preprocessor always substitutes text for macros exactly as you have written it. So it first substitutes the replacement text for the macro C, resulting in

    int x = A + B * 2;

Then it substitutes the macros A and B, resulting in

    int x = 2 + 3 * 2;

Only when the preprocessor is done doing all this substituting does the compiler get into the act. But when it evaluates that expression (using the normal precedence of multiplication over addition), it ends up initializing x with the value 8!

To guard against this sort of problem, it is always a good idea to include explicit parentheses in the definitions of macros which contain expressions. If we were to define the macro C as

    #define C (A + B)

then the declaration of x would ultimately expand to

    int x = (2 + 3) * 2;

and x would be initialized to 10, as we probably expected.

Notice that there does not have to be (and in fact there usually is not) a semicolon at the end of a #define line. (This is just one of the ways that the syntax of the preprocessor is different from the rest of C.) If you accidentally type

    #define MAXLINE 100;    /* WRONG */

then when you later declare

    char line[MAXLINE];

the preprocessor will expand it to

    char line[100;];    /* WRONG */

which is a syntax error. This is what we mean when we say that the preprocessor doesn't know much of anything about the syntax of C--in this last example, the value or replacement text for the macro MAXLINE was the 4 characters 1 0 0 ; , and that's exactly what the preprocessor substituted (even though it didn't make any sense).

Simple macros like MAXLINE act sort of like little variables, whose values are constant (or constant expressions). It's also possible to have macros which look like little functions (that is, you invoke them with what looks like function call syntax, and they expand to replacement text which is a function of the actual arguments they are invoked with) but we won't be looking at these yet.

9.3 Conditional Compilation

[This section corresponds to K&R Sec. 4.11.3]

The last preprocessor directive we're going to look at is #ifdef. If you have the sequence

    #ifdef name
    program text
    #else
    more program text
    #endif

in your program, the code that gets compiled depends on whether a preprocessor macro by that name is defined or not. If it is (that is, if there has been a #define line for a macro called name), then "program text" is compiled and "more program text" is ignored. If the macro is not defined, "more program text" is compiled and "program text" is ignored. This looks a lot like an if statement, but it behaves completely differently: an if statement controls which statements of your program are executed at run time, but #ifdef controls which parts of your program actually get compiled.

Just as for the if statement, the #else in an #ifdef is optional. There is a companion directive #ifndef, which compiles code if the macro is not defined (although the "#else clause" of an #ifndef directive will then be compiled if the macro is defined). There is also an #if directive which compiles code depending on whether a compile-time expression is true or false. (The expressions which are allowed in an #if directive are somewhat restricted, however, so we won't talk much about #if here.)

Conditional compilation is useful in two general classes of situations:

You are trying to write a portable program, but the way you do something is different depending on what compiler, operating system, or computer you're using. You place different versions of your code, one for each situation, between suitable #ifdef directives, and when you compile the program in a particular environment, you arrange to have the macro names defined which select the variants you need in that environment. (For this reason, compilers usually have ways of letting you define macros from the invocation command line or in a configuration file, and many also predefine certain macro names related to the operating system, processor, or compiler in use. That way, you don't have to change the code to change the #define lines each time you compile it in a different environment.)

For example, in ANSI C, the function to delete a file is remove. On older Unix systems, however, the function was called unlink. So if filename is a variable containing the name of a file you want to delete, and if you want to be able to compile the program under these older Unix systems, you might write

    #ifdef unix
        unlink(filename);
    #else
        remove(filename);
    #endif

Then, you could place the line

    #define unix

at the top of the file when compiling under an old Unix system. (Since all you're using the macro unix for is to control the #ifdef, you don't need to give it any replacement text at all. Any definition for a macro, even if the replacement text is empty, causes an #ifdef to succeed.)

(In fact, in this example, you wouldn't even need to define the macro unix at all, because C compilers on old Unix systems tend to predefine it for you, precisely so you can make tests like these.)

You want to compile several different versions of your program, with different features present in the different versions. You bracket the code for each feature with #ifdef directives, and (as for the previous case) arrange to have the right macros defined or not to build the version you want to build at any given time. This way, you can build the several different versions from the same source code. (One common example is whether you turn debugging statements on or off. You can bracket each debugging printout with #ifdef DEBUG and #endif, and then turn on debugging only when you need it.)

For example, you might use lines like this:

    #ifdef DEBUG
    printf("x is %d\n", x);
    #endif

to print out the value of the variable x at some point in your program to see if it's what you expect. To enable debugging printouts, you insert the line

    #define DEBUG

at the top of the file, and to turn them off, you delete that line, but the debugging printouts quietly remain in your code, temporarily deactivated, but ready to reactivate if you find yourself needing them again later. (Also, instead of inserting and deleting the #define line, you might use a compiler flag such as -DDEBUG to define the macro DEBUG from the compiler invocation line.)

Conditional compilation can be very handy, but it can also get out of hand. When large chunks of the program are completely different depending on, say, what operating system the program is being compiled for, it's often better to place the different versions in separate source files, and then only use one of the files (corresponding to one of the versions) to build the program on any given system. Also, if you are using an ANSI Standard compiler and you are writing ANSI-compatible code, you usually won't need so much conditional compilation, because the Standard specifies exactly how the compiler must do certain things, and exactly which library functions it must provide, so you don't have to work so hard to accommodate the old variations among compilers and libraries.

Chapter 10: Pointers

Pointers are often thought to be the most difficult aspect of C. It's true that many people have various problems with pointers, and that many programs founder on pointer-related bugs. Actually, though, many of the problems are not so much with the pointers per se but rather with the memory they point to, and more specifically, when there isn't any valid memory which they point to. As long as you're careful to ensure that the pointers in your programs always point to valid memory, pointers can be useful, powerful, and relatively trouble-free tools. (We'll talk about memory allocation in the next chapter.)

[This chapter is the only one in this series that contains any graphics. If you are using a text-only browser, there are a few figures you won't be able to see.]

A pointer is a variable that points at, or refers to, another variable. That is, if we have a pointer variable of type "pointer to int," it might point to the int variable i, or to the third cell of the int array a. Given a pointer variable, we can ask questions like, "What's the value of the variable that this pointer points to?"

Why would we want to have a variable that refers to another variable? Why not just use that other variable directly? The answer is that a level of indirection can be very useful. (Indirection is just another word for the situation when one variable refers to another.)

Imagine a club which elects new officers each year. In its clubroom, it might have a set of mailboxes for each member, along with special mailboxes for the president, secretary, and treasurer. The bank doesn't mail statements to the treasurer under the treasurer's name; it mails them to "treasurer," and the statements go to the mailbox marked "treasurer." This way, the bank doesn't have to change the mailing address it uses every year. The mailboxes labeled "president," "treasurer," and "secretary" are a little bit like pointers--they don't refer to people directly.

If we make the analogy that a mailbox holding letters is like a variable holding numbers, then mailboxes for the president, secretary, and treasurer aren't quite like pointers, because they're still mailboxes which in principle could hold letters directly. But suppose that mail is never actually put in those three mailboxes: suppose each of the officers' mailboxes contains a little marker listing the name of the member currently holding that office. When you're sorting mail, and you have a letter for the treasurer, you first go to the treasurer's mailbox, but rather than putting the letter there, you read the name on the marker there, and put the mail in the mailbox for that person. Similarly, if the club is poorly organized, and the treasurer stops doing his job, and you're the president, and one day you get a call from the bank saying that the club's account is in arrears and the treasurer hasn't done anything about it and asking if you, the president, can look into it; and if the club is so poorly organized that you've forgotten who the treasurer is, you can go to the treasurer's mailbox, read the name on the marker there, and go to that mailbox (which is probably overflowing) to find all the treasury-related mail.

We could say that the markers in the mailboxes for the president, secretary, and treasurer were pointers to other mailboxes. In an analogous way, pointer variables in C contain pointers to other variables or memory locations.

10.1 Basic Pointer Operations

[This section corresponds to K&R Sec. 5.1]

The first things to do with pointers are to declare a pointer variable, set it to point somewhere, and finally manipulate the value that it points to. A simple pointer declaration looks like this:

    int *ip;

This declaration looks like our earlier declarations, with one obvious difference: that asterisk. The asterisk means that ip, the variable we're declaring, is not of type int, but rather of type pointer-to-int. (Another way of looking at it is that *ip, which as we'll see is the value pointed to by ip, will be an int.)

We may think of setting a pointer variable to point to another variable as a two-step process: first we generate a pointer to that other variable, then we assign this new pointer to the pointer variable. We can say (but we have to be careful when we're saying it) that a pointer variable has a value, and that its value is "pointer to that other variable". This will make more sense when we see how to generate pointer values.

Pointers (that is, pointer values) are generated with the "address-of" operator &, which we can also think of as the "pointer-to" operator. We demonstrate this by declaring (and initializing) an int variable i, and then setting ip to point to it:

    int i = 5;
    ip = &i;

The assignment expression ip = &i; contains both parts of the "two-step process": &i generates a pointer to i, and the assignment operator assigns the new pointer to (that is, places it "in") the variable ip. Now ip "points to" i, which we can illustrate with this picture:

i is a variable of type int, so the value in its box is a number, 5. ip is a variable of type pointer-to-int, so the "value" in its box is an arrow pointing at another box. Referring once again back to the "two-step process" for setting a pointer variable: the & operator draws us the arrowhead pointing at i's box, and the assignment operator =, with the pointer variable ip on its left, anchors the other end of the arrow in ip's box.

We discover the value pointed to by a pointer using the "contents-of" operator, *. Placed in front of a pointer, the * operator accesses the value pointed to by that pointer. In other words, if ip is a pointer, then the expression *ip gives us whatever it is that's in the variable or location pointed to by ip. For example, we could write something like

    printf("%d\n", *ip);

which would print 5, since ip points to i, and i is (at the moment) 5.

(You may wonder how the asterisk * can be the pointer contents-of operator when it is also the multiplication operator. There is no ambiguity here: it is the multiplication operator when it sits between two variables, and it is the contents-of operator when it sits in front of a single variable. The situation is analogous to the minus sign: between two variables or expressions it's the subtraction operator, but in front of a single variable or expression it's the negation operator. Technical terms you may hear for these distinct roles are unary and binary: a binary operator applies to two operands, usually on either side of it, while a unary operator applies to a single operand.)

The contents-of operator * does not merely fetch values through pointers; it can also set values through pointers. We can write something like

    *ip = 7;

which means "set whatever ip points to to 7." Again, the * tells us to go to the location pointed to by ip, but this time, the location isn't the one to fetch from--we're on the left-hand side of an assignment operator, so *ip tells us the location to store to. (The situation is no different from array subscripting expressions such as a[3] which we've already seen appearing on both sides of assignments.)

The result of the assignment *ip = 7 is that i's value is changed to 7, and the picture changes to:

If we called printf("%d\n", *ip) again, it would now print 7.

At this point, you may be wondering why we're going through this rigamarole--if we wanted to set i to 7, why didn't we do it directly? We'll begin to explore that next, but first let's notice the difference between changing a pointer (that is, changing what variable it points to) and changing the value at the location it points to. When we wrote *ip = 7, we changed the value pointed to by ip, but if we declare another variable j:

    int j = 3;

and write

    ip = &j;

we've changed ip itself. The picture now looks like this:

We have to be careful when we say that a pointer assignment changes "what the pointer points to." Our earlier assignment

    *ip = 7;

changed the value pointed to by ip, but this more recent assignment

    ip = &j;

has changed what variable ip points to. It's true that "what ip points to" has changed, but this time, it has changed for a different reason. Neither i (which is still 7) nor j (which is still 3) has changed. (What has changed is ip's value.) If we again call

    printf("%d\n", *ip);

this time it will print 3.


We can also assign pointer values to other pointer variables. If we declare a second pointer variable:

    int *ip2;

then we can say

    ip2 = ip;

Now ip2 points where ip does; we've essentially made a "copy" of the arrow:

Now, if we set ip to point back to i again:

    ip = &i;

the two arrows point to different places:

We can now see that the two assignments

    ip2 = ip;

and

    *ip2 = *ip;

do two very different things. The first would make ip2 again point to where ip points (in other words, back to i again). The second would store, at the location pointed to by ip2, a copy of the value pointed to by ip; in other words (if ip and ip2 still point to i and j respectively) it would set j to i's value, or 7.

It's important to keep very clear in your mind the distinction between a pointer and what it points to. The two are like apples and oranges (or perhaps oil and water); you can't mix them. You can't "set ip to 5" by writing something like
ip ) 2; #( K;LBM (#

$ is an integer! but ip is a pointer. >ou probably ,anted to 99set the value pointed to by ip to $!55 ,hich you express by ,riting
(ip ) 2;

Similarly! you can5t 99see ,hat ip is55 by ,riting 8gain!

printf(",d\n", ip); #( K;LBM (# ip is a pointer.to.int! but ,d expects an int. printf(",d\n", (ip);

To print what ip points to! use

Ginally! a fe, more notes about pointer declarations. The ( in a pointer declaration is related to! but different from! the contents.of operator (. 8fter ,e declare a pointer variable
int (ip;

-1

the expression sets


ip ) Ii ,hat ip points (ip ) 2

to that is! ,hich location it points to%! ,hile the expression

sets the value of the location pointed to by ip. =n the other hand! if ,e declare a pointer variable and include an initiali@er6
int (ip. ) Ii;

,e5re setting the initial value for ip.! ,hich is ,here ip. ,ill point! so that initial value is a pointer. In other ,ords! the ( in the declaration int (ip. ) Ii; is not the contents.of operator! it5s the indicator that ip. is a pointer.% If you have a pointer declaration containing an initiali@ation! and you ever have occasion to brea& it up into a simple declaration and a conventional assignment! do it li&e this6
int (ip.; ip. ) Ii;

)on5t ,rite

int (ip.; (ip. ) Ii;

or you5ll be trying to mix oil and ,ater again. 8lso! ,hen ,e ,rite
int (ip;

although the asteris& affects ip5s type! it goes ,ith the identifier name ip! not ,ith the type int on the left. To declare t,o pointers at once! the declaration loo&s li&e
int (ip*, (ip-;

Some people ,rite pointer declarations li&e this6


int( ip;

This ,or&s for one pointer! because C essentially ignores ,hitespace. 'ut if you ever ,rite
int( ip*, ip-; #( W;L=5=1O K;LBM (#

it ,ill declare one pointer.to.int ip* and one plain int ip-! ,hich is probably not ,hat you meant. What is all of this good for; If it ,as ?ust for changing variables li&e i from $ to H! it ,ould not be good for much. What it5s good for! among other things! is ,hen for various reasons ,e don5t &no, exactly ,hich variable ,e ,ant to change! ?ust li&e the ban& didn5t &no, exactly ,hich club member it ,anted to send the statement to.

10.1 Basic Pointer Operations

[This section corresponds to K&R Sec. 5.1]

The first things to do with pointers are to declare a pointer variable, set it to point somewhere, and finally manipulate the value that it points to. A simple pointer declaration looks like this:

    int *ip;

This declaration looks like our earlier declarations, with one obvious difference: that asterisk. The asterisk means that ip, the variable we're declaring, is not of type int, but rather of type pointer-to-int. (Another way of looking at it is that *ip, which as we'll see is the value pointed to by ip, will be an int.)

We may think of setting a pointer variable to point to another variable as a two-step process: first we generate a pointer to that other variable, then we assign this new pointer to the pointer variable. We can say (but we have to be careful when we're saying it) that a pointer variable has a value, and that its value is "pointer to that other variable". This will make more sense when we see how to generate pointer values.

Pointers (that is, pointer values) are generated with the "address-of" operator &, which we can also think of as the "pointer-to" operator. We demonstrate this by declaring (and initializing) an int variable i, and then setting ip to point to it:

    int i = 5;
    ip = &i;

The assignment expression ip = &i; contains both parts of the "two-step process": &i generates a pointer to i, and the assignment operator assigns the new pointer to (that is, places it "in") the variable ip. Now ip "points to" i, which we can illustrate with this picture:

[Figure: two boxes. The box for i contains 5; the box for ip contains an arrow pointing at i's box.]

i is a variable of type int, so the value in its box is a number, 5. ip is a variable of type pointer-to-int, so the "value" in its box is an arrow pointing at another box. Referring once again back to the "two-step process" for setting a pointer variable: the & operator draws us the arrowhead pointing at i's box, and the assignment operator =, with the pointer variable ip on its left, anchors the other end of the arrow in ip's box.

We discover the value pointed to by a pointer using the "contents-of" operator, *. Placed in front of a pointer, the * operator accesses the value pointed to by that pointer. In other words, if ip is a pointer, then the expression *ip gives us whatever it is that's in the variable or location pointed to by ip. For example, we could write something like

    printf("%d\n", *ip);

which would print 5, since ip points to i, and i is (at the moment) 5.

(You may wonder how the asterisk * can be the pointer contents-of operator when it is also the multiplication operator. There is no ambiguity here: it is the multiplication operator when it sits between two variables, and it is the contents-of operator when it sits in front of a single variable. The situation is analogous to the minus sign: between two variables or expressions it's the subtraction operator, but in front of a single variable or expression it's the negation operator. Technical terms you may hear for these distinct roles are unary and binary: a binary operator applies to two operands, usually on either side of it, while a unary operator applies to a single operand.)

The contents-of operator * does not merely fetch values through pointers; it can also set values through pointers. We can write something like

    *ip = 7;

which means "set whatever ip points to to 7." Again, the * tells us to go to the location pointed to by ip, but this time, the location isn't the one to fetch from: we're on the left-hand side of an assignment operator, so *ip tells us the location to store to. (The situation is no different from array subscripting expressions such as a[3] which we've already seen appearing on both sides of assignments.)

The result of the assignment *ip = 7 is that i's value is changed to 7, and the picture changes to:

[Figure: the same two boxes; i's box now contains 7, and ip's arrow still points at it.]

If we called printf("%d\n", *ip) again, it would now print 7.

At this point, you may be wondering why we're going through this rigamarole: if we wanted to set i to 7, why didn't we do it directly? We'll begin to explore that next, but first let's notice the difference between changing a pointer (that is, changing what variable it points to) and changing the value at the location it points to. When we wrote *ip = 7, we changed the value pointed to by ip, but if we declare another variable j:

    int j = 3;

and write

    ip = &j;

we've changed ip itself. The picture now looks like this:

[Figure: ip's arrow now points at j's box, which contains 3; i's box still contains 7.]

We have to be careful when we say that a pointer assignment changes "what the pointer points to." Our earlier assignment

    *ip = 7;

changed the value pointed to by ip, but this more recent assignment

    ip = &j;

has changed what variable ip points to. It's true that "what ip points to" has changed, but this time, it has changed for a different reason. Neither i (which is still 7) nor j (which is still 3) has changed. (What has changed is ip's value.) If we again call

    printf("%d\n", *ip);

this time it will print 3.

We can also assign pointer values to other pointer variables. If we declare a second pointer variable:
    int *ip2;

then we can say

    ip2 = ip;

Now ip2 points where ip does; we've essentially made a "copy" of the arrow:

[Figure: ip and ip2 both point at j's box.]

Now, if we set ip to point back to i again:

    ip = &i;

the two arrows point to different places:

[Figure: ip points at i's box; ip2 still points at j's box.]

We can now see that the two assignments

    ip2 = ip;

and

    *ip2 = *ip;

do two very different things. The first would make ip2 again point to where ip points (in other words, back to i again). The second would store, at the location pointed to by ip2, a copy of the value pointed to by ip; in other words (if ip and ip2 still point to i and j respectively) it would set j to i's value, or 7.

It's important to keep very clear in your mind the distinction between a pointer and what it points to. The two are like apples and oranges (or perhaps oil and water); you can't mix them. You can't "set ip to 5" by writing something like

    ip = 5;             /* WRONG */

5 is an integer, but ip is a pointer. You probably wanted to "set the value pointed to by ip to 5," which you express by writing

    *ip = 5;

Similarly, you can't "see what ip is" by writing

    printf("%d\n", ip);     /* WRONG */

Again, ip is a pointer-to-int, but %d expects an int. To print what ip points to, use

    printf("%d\n", *ip);
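As a check on the distinction between a pointer and what it points to, here is a minimal sketch that replays the assignments from this section in one place. The function name pointer_vs_value and its two output parameters are hypothetical, introduced only so that the final values can be inspected.

```c
/* Hypothetical demo: replays the assignments discussed in this section
   and hands back the final values of i and j through i_out and j_out. */
void pointer_vs_value(int *i_out, int *j_out)
{
    int i = 7;
    int j = 3;
    int *ip = &j;       /* ip points to j */
    int *ip2;

    ip2 = ip;           /* copies the pointer: ip2 also points to j */
    ip = &i;            /* retargets ip only; ip2 still points to j */
    *ip2 = *ip;         /* copies the value: j becomes 7 */
    ip2 = ip;           /* copies the pointer: ip2 now points to i */
    *ip2 = 5;           /* stores through ip2: i becomes 5, j is untouched */

    *i_out = i;
    *j_out = j;
}
```

After a call such as pointer_vs_value(&a, &b), a holds 5 and b holds 7: the assignment with asterisks on both sides moved a value, while the assignments without asterisks only moved arrows.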

Finally, a few more notes about pointer declarations. The * in a pointer declaration is related to, but different from, the contents-of operator *. After we declare a pointer variable

    int *ip;

the expression

    ip = &i

sets what ip points to (that is, which location it points to), while the expression

    *ip = 5

sets the value of the location pointed to by ip. On the other hand, if we declare a pointer variable and include an initializer:

    int *ip3 = &i;

we're setting the initial value for ip3, which is where ip3 will point, so that initial value is a pointer. (In other words, the * in the declaration int *ip3 = &i; is not the contents-of operator, it's the indicator that ip3 is a pointer.)

If you have a pointer declaration containing an initialization, and you ever have occasion to break it up into a simple declaration and a conventional assignment, do it like this:

    int *ip3;
    ip3 = &i;

Don't write

    int *ip3;
    *ip3 = &i;

or you'll be trying to mix oil and water again.

Also, when we write

    int *ip;

although the asterisk affects ip's type, it goes with the identifier name ip, not with the type int on the left. To declare two pointers at once, the declaration looks like

    int *ip1, *ip2;

Some people write pointer declarations like this:

    int* ip;

This works for one pointer, because C essentially ignores whitespace. But if you ever write

    int* ip1, ip2;      /* PROBABLY WRONG */

it will declare one pointer-to-int ip1 and one plain int ip2, which is probably not what you meant.

What is all of this good for? If it was just for changing variables like i from 5 to 7, it would not be good for much. What it's good for, among other things, is when for various reasons we don't know exactly which variable we want to change, just like the bank didn't know exactly which club member it wanted to send the statement to.
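To make the "we don't know exactly which variable we want to change" idea concrete, here is a minimal sketch. The helper names larger_of and clear_larger are hypothetical; the point is only that the store goes through a pointer chosen at run time, so the code never has to name the variable it changes.

```c
/* Hypothetical helper: returns a pointer to whichever of the two
   ints currently holds the larger value (a on ties). */
int *larger_of(int *a, int *b)
{
    if(*b > *a)
        return b;
    return a;
}

/* Sets the larger of the two variables to 0.  The assignment goes
   through the pointer, so this code never names the variable it changes. */
void clear_larger(int *a, int *b)
{
    int *ip = larger_of(a, b);
    *ip = 0;
}
```

Given int i = 5, j = 7; a call clear_larger(&i, &j) zeroes j and leaves i alone; a second call then zeroes i.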

10.2 Pointers and Arrays; Pointer Arithmetic

[This section corresponds to K&R Sec. 5.3]

Pointers do not have to point to single variables. They can also point at the cells of an array. For example, we can write

    int *ip;
    int a[10];
    ip = &a[3];

and we would end up with ip pointing at the fourth cell of the array a (remember, arrays are 0-based, so a[0] is the first cell). We could illustrate the situation like this:

[Figure: the ten-cell array a, with ip's arrow pointing at the cell a[3].]

We'd use this ip just like the one in the previous section: *ip gives us what ip points to, which in this case will be the value in a[3].

Once we have a pointer pointing into an array, we can start doing pointer arithmetic. Given that ip is a pointer to a[3], we can add 1 to ip:

    ip + 1

What does it mean to add one to a pointer? In C, it gives a pointer to the cell one farther on, which in this case is a[4]. To make this clear, let's assign this new pointer to another pointer variable:

    ip2 = ip + 1;

Now the picture looks like this:

[Figure: ip points at a[3]; ip2 points at a[4].]

If we now do

    *ip2 = 4;

we've set a[4] to 4. But it's not necessary to assign a new pointer value to a pointer variable in order to use it; we could also compute a new pointer value and use it immediately:

    *(ip + 1) = 5;

In this last example, we've changed a[4] again, setting it to 5. The parentheses are needed because the unary "contents of" operator * has higher precedence than (i.e., binds more tightly than) the addition operator. If we wrote *ip + 1, without the parentheses, we'd be fetching the value pointed to by ip, and adding 1 to that value. The expression *(ip + 1), on the other hand, accesses the value one past the one pointed to by ip.

Given that we can add 1 to a pointer, it's not surprising that we can add and subtract other numbers as well. If ip still points to a[3], then

    *(ip + 3) = 7;

sets a[6] to 7, and

    *(ip - 2) = 4;

sets a[1] to 4.
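The stores described above can be collected into one small sketch. The function name arithmetic_demo and its out parameter are hypothetical, added only so the resulting array can be examined; the last store is an extra line showing what the unparenthesized form *ip + 1 computes.

```c
/* Hypothetical demo: starts from an all-zero 10-element array, repeats
   the stores from the text through ip = &a[3], and copies the result out. */
void arithmetic_demo(int out[10])
{
    int a[10] = {0};
    int *ip = &a[3];
    int *ip2 = ip + 1;  /* points to a[4] */
    int k;

    *ip2 = 4;           /* a[4] = 4 */
    *(ip + 1) = 5;      /* a[4] again, now 5 */
    *(ip + 3) = 7;      /* a[6] = 7 */
    *(ip - 2) = 4;      /* a[1] = 4 */
    *ip = *ip + 1;      /* no parentheses: adds 1 to the value in a[3] */

    for(k = 0; k < 10; k++)
        out[k] = a[k];
}
```

After the call, out[1] is 4, out[3] is 1, out[4] is 5, out[6] is 7, and every other cell is still 0.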

Up above, we added 1 to ip and assigned the new pointer to ip2, but there's no reason we can't add one to a pointer, and change the same pointer:

    ip = ip + 1;

Now ip points one past where it used to (to a[4], if we hadn't changed it in the meantime). The shortcuts we learned in a previous chapter all work for pointers, too: we could also increment a pointer using

    ip += 1;

or

    ip++;

Of course, pointers are not limited to ints. It's quite common to use pointers to other types, especially char. Here is the innards of the mystrcmp function we saw in a previous chapter, rewritten to use pointers. (mystrcmp, you may recall, compares two strings, character by character.)

    char *p1 = &str1[0], *p2 = &str2[0];

    while(1)
    {
        if(*p1 != *p2)
            return *p1 - *p2;
        if(*p1 == '\0' || *p2 == '\0')
            return 0;
        p1++;
        p2++;
    }

The autoincrement operator ++ (like its companion, --) makes it easy to do two things at once. We've seen idioms like a[i++] which accesses a[i] and simultaneously increments i, leaving it referencing the next cell of the array a. We can do the same thing with pointers: an expression like *ip++ lets us access what ip points to, while simultaneously incrementing ip so that it points to the next element. The preincrement form works, too: *++ip increments ip, then accesses what it points to. Similarly, we can use notations like *ip-- and *--ip.

As another example, here is the strcpy (string copy) loop from a previous chapter, rewritten to use pointers:

    char *dp = &dest[0], *sp = &src[0];
    while(*sp != '\0')
        *dp++ = *sp++;
    *dp = '\0';

(One question that comes up is whether the expression *p++ increments p or what it points to. The answer is that it increments p. To increment what p points to, you can use (*p)++.)

When you're doing pointer arithmetic, you have to remember how big the array the pointer points into is, so that you don't ever point outside it. If the array a has 10 elements, you can't access a[50] or a[-1] or even a[10] (remember, the valid subscripts for a 10-element array run from 0 to 9). Similarly, if a has 10 elements and ip points to a[3], you can't compute or access ip + 10 or ip - 5. (There is one special case: you can, in this case, compute, but not access, a pointer to the nonexistent element just beyond the end of the array, which in this case is &a[10]. This becomes useful when you're doing pointer comparisons, which we'll look at next.)
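The difference between *p++ and (*p)++ is easy to see in a short sketch. Both helper names below (sum_ints and bump) are hypothetical. Note that when the loop in sum_ints finishes, p has been advanced to just past the last element it read, which is allowed because that final position is never accessed.

```c
/* Hypothetical helper: adds up n ints starting at p.  Each pass fetches
   the value p points to and then advances p to the next element. */
int sum_ints(int *p, int n)
{
    int sum = 0;

    while(n-- > 0)
        sum += *p++;    /* increments the pointer, not the value */
    return sum;
}

/* Hypothetical helper: increments what p points to; p itself is unchanged. */
void bump(int *p)
{
    (*p)++;
}
```

For an array holding 1, 2, 3, 4, sum_ints(a, 4) returns 10 and leaves the array as it was, while bump(&a[0]) changes a[0] from 1 to 2.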

10.3 Pointer Subtraction and Comparison

As we've seen, you can add an integer to a pointer to get a new pointer, pointing somewhere beyond the original (as long as it's in the same array). For example, you might write

    ip2 = ip1 + 3;

Applying a little algebra, you might wonder whether

    ip2 - ip1 = 3

and the answer is, yes. When you subtract two pointers, as long as they point into the same array, the result is the number of elements separating them. You can also ask (again, as long as they point into the same array) whether one pointer is greater or less than another: one pointer is "greater than" another if it points beyond where the other one points. You can also compare pointers for equality and inequality: two pointers are equal if they point to the same variable or to the same cell in an array, and are (obviously) unequal if they don't. (When testing for equality or inequality, the two pointers do not have to point into the same array.)

One common use of pointer comparisons is when copying arrays using pointers. Here is a code fragment which copies 10 elements from array1 to array2, using pointers. It uses an end pointer, ep, to keep track of when it should stop copying.

    int array1[10], array2[10];
    int *ip1, *ip2 = &array2[0];
    int *ep = &array1[10];
    for(ip1 = &array1[0]; ip1 < ep; ip1++)
        *ip2++ = *ip1;

As we mentioned, there is no element array1[10], but it is legal to compute a pointer to this (nonexistent) element, as long as we only use it in pointer comparisons like this (that is, as long as we never try to fetch or store the value that it points to).
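Pointer subtraction is also a handy way to turn a pointer back into a subscript. The following is a minimal sketch; the function name index_of is hypothetical, and the return type ptrdiff_t (from <stddef.h>) is the standard type for the result of subtracting two pointers.

```c
#include <stddef.h>

/* Hypothetical helper: returns the index of the first element equal to
   target, or -1 if there is none.  The index is obtained by subtracting
   the start of the array from the pointer that found the match. */
ptrdiff_t index_of(int a[], int n, int target)
{
    int *ep = a + n;    /* end pointer: one past the last element */
    int *ip;

    for(ip = a; ip < ep; ip++)
    {
        if(*ip == target)
            return ip - a;      /* number of elements between a and ip */
    }
    return -1;
}
```

As in the copying loop, the end pointer ep is only ever compared against, never used to fetch or store a value.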

10.4 Null Pointers

We said that the value of a pointer variable is a pointer to some other variable. There is one other value a pointer may have: it may be set to a null pointer. A null pointer is a special pointer value that is known not to point anywhere. What this means is that no other valid pointer, to any other variable or array cell or anything else, will ever compare equal to a null pointer.

The most straightforward way to "get" a null pointer in your program is by using the predefined constant NULL, which is defined for you by several standard header files, including <stdio.h>, <stdlib.h>, and <string.h>. To initialize a pointer to a null pointer, you might use code like

    #include <stdio.h>

    int *ip = NULL;

and to test it for a null pointer before inspecting the value pointed to you might use code like

    if(ip != NULL)
        printf("%d\n", *ip);

It is also possible to refer to the null pointer by using a constant 0, and you will see some code that sets null pointers by simply doing

    int *ip = 0;

(In fact, NULL is a preprocessor macro which typically has the value, or replacement text, 0.)

Furthermore, since the definition of "true" in C is a value that is not equal to 0, you will see code that tests for non-null pointers with abbreviated code like

    if(ip)
        printf("%d\n", *ip);

This has the same meaning as our previous example; if(ip) is equivalent to if(ip != 0) and to if(ip != NULL).

All of these uses are legal, and although I recommend that you use the constant NULL for clarity, you will come across the other forms, so you should be able to recognize them.

You can use a null pointer as a placeholder to remind yourself (or, more importantly, to help your program remember) that a pointer variable does not point anywhere at the moment and that you should not use the "contents of" operator on it (that is, you should not try to inspect what it points to, since it doesn't point to anything). A function that returns pointer values can return a null pointer when it is unable to perform its task. (A null pointer used in this way is analogous to the EOF value that functions like getchar return.)

As an example, let us write our own version of the standard library function strstr, which looks for one string within another, returning a pointer to the string if it can, or a null pointer if it cannot. Here is the function, using the obvious brute-force algorithm: at every character of the input string, the code checks for a match there of the pattern string:

    #include <stddef.h>

    char *mystrstr(char input[], char pat[])
    {
        char *start, *p1, *p2;
        for(start = &input[0]; *start != '\0'; start++)
        {                       /* for each position in input string... */
            p1 = pat;           /* prepare to check for pattern string there */
            p2 = start;
            while(*p1 != '\0')
            {
                if(*p1 != *p2)  /* characters differ */
                    break;
                p1++;
                p2++;
            }
            if(*p1 == '\0')     /* found match */
                return start;
        }

        return NULL;
    }

The start pointer steps over each character position in the input string. At each character, the inner loop checks for a match there, by using p1 to step over the pattern string (pat), and p2 to step over the input string (starting at start). We compare successive characters until either (a) we reach the end of the pattern string (*p1 == '\0'), or (b) we find two characters which differ. When we're done with the inner loop, if we reached the end of the pattern string (*p1 == '\0'), it means that all preceding characters matched, and we found a complete match for the pattern starting at start, so we return start. Otherwise, we go around the outer loop again, to try another starting position. If we run out of those (if *start == '\0'), without finding a match, we return a null pointer.

Notice that the function is declared as returning (and does in fact return) a pointer-to-char.

We can use mystrstr (or its standard library counterpart strstr) to determine whether one string contains another:

    if(mystrstr("Hello, world!", "lo") == NULL)
        printf("no\n");
    else
        printf("yes\n");

In general, C does not initialize pointers to null for you, and it never tests pointers to see if they are null before using them. If one of the pointers in your programs points somewhere some of the time but not all of the time, an excellent convention to use is to set it to a null pointer when it doesn't point anywhere valid, and to test to see if it's a null pointer before using it. But you must use explicit code to set it to NULL, and to test it against NULL. (In other words, just setting an unused pointer variable to NULL doesn't guarantee safety; you also have to check for the null value before using the pointer.) On the other hand, if you know that a particular pointer variable is always valid, you don't have to insert a paranoid test against NULL before using it.
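Here is a minimal sketch of the convention just described: one function returns a null pointer when it has nothing to point at, and its caller tests for NULL before using the result. Both function names (first_negative and first_negative_or) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: returns a pointer to the first negative element
   of a, or a null pointer if the array contains none. */
int *first_negative(int a[], int n)
{
    int k;

    for(k = 0; k < n; k++)
    {
        if(a[k] < 0)
            return &a[k];
    }
    return NULL;
}

/* Caller-side convention: test for NULL before using the contents-of
   operator, and fall back to a caller-supplied value otherwise. */
int first_negative_or(int a[], int n, int fallback)
{
    int *ip = first_negative(a, n);

    if(ip != NULL)
        return *ip;
    return fallback;
}
```

The explicit test in first_negative_or is what makes the convention safe; returning NULL from first_negative does nothing by itself to stop a careless caller from writing *ip.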

10.5 "Equivalence" between Pointers and Arrays

There are a number of similarities between arrays and pointers in C. If you have an array

    int a[10];

you can refer to a[0], a[1], a[2], etc., or to a[i] where i is an int. If you declare a pointer variable ip and set it to point to the beginning of an array:

    int *ip = &a[0];

you can refer to *ip, *(ip+1), *(ip+2), etc., or to *(ip+i) where i is an int.

There are also differences, of course. You cannot assign two arrays; the code

    int a[10], b[10];
    a = b;              /* WRONG */

is illegal. As we've seen, though, you can assign two pointer variables:

    int *ip1, *ip2;
    ip1 = &a[0];
    ip2 = ip1;

Pointer assignment is straightforward; the pointer on the left is simply made to point wherever the pointer on the right does. We haven't copied the data pointed to (there's still just one copy, in the same place); we've just made two pointers point to that one place.

The similarities between arrays and pointers end up being quite useful, and in fact C builds on the similarities, leading to what is called "the equivalence of arrays and pointers in C." When we speak of this "equivalence" we do not mean that arrays and pointers are the same thing (they are in fact quite different), but rather that they can be used in related ways, and that certain operations may be used between them.

The first such operation is that it is possible to (apparently) assign an array to a pointer:

    int a[10];
    int *ip;
    ip = a;

What can this mean? In that last assignment ip = a, aren't we mixing apples and oranges again? It turns out that we are not; C defines the result of this assignment to be that ip receives a pointer to the first element of a. In other words, it is as if you had written

    ip = &a[0];

The second facet of the equivalence is that you can use the "array subscripting" notation [i] on pointers, too. If you write

    ip[3]

it is just as if you had written

    *(ip + 3)

So when you have a pointer that points to a block of memory, such as an array or a part of an array, you can treat that pointer "as if" it were an array, using the convenient [i] notation. In other words, at the beginning of this section when we talked about *ip, *(ip+1), *(ip+2), and *(ip+i), we could have written ip[0], ip[1], ip[2], and ip[i]. As we'll see, this can be quite useful (or at least convenient).

The third facet of the equivalence (which is actually a more general version of the first one we mentioned) is that whenever you mention the name of an array in a context where the "value" of the array would be needed, C automatically generates a pointer to the first element of the array, as if you had written &array[0]. When you write something like

    int a[10];
    int *ip;
    ip = a + 3;

it is as if you had written

    ip = &a[0] + 3;

which (and you might like to convince yourself of this) gives the same result as if you had written

    ip = &a[3];

For example, if the character array

    char string[100];

contains some string, here is another way to find its length:

    int len;
    char *p;
    for(p = string; *p != '\0'; p++)
        ;
    len = p - string;

After the loop, p points to the '\0' terminating the string. The expression p - string is equivalent to p - &string[0], and gives the length of the string. (Of course, we could also call strlen; in fact here we've essentially written another implementation of strlen.)
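If you would like to convince yourself of these three facets, a small self-checking sketch may help. The function name notations_agree is hypothetical; it returns 1 only if every pair of notations discussed in this section refers to the same cell.

```c
/* Hypothetical demo: returns 1 if the notations discussed in this
   section all agree for every element of a 10-element array. */
int notations_agree(void)
{
    int a[10];
    int *ip = a;                /* same as ip = &a[0] */
    int *mid = a + 3;           /* same as mid = &a[3] */
    int i;

    for(i = 0; i < 10; i++)
        a[i] = i * i;

    for(i = 0; i < 10; i++)
    {
        if(ip[i] != *(ip + i))  /* subscripting a pointer */
            return 0;
        if(*(a + i) != a[i])    /* pointer arithmetic on the array name */
            return 0;
        if(&a[i] != a + i)      /* both are pointers to the same cell */
            return 0;
    }

    if(mid[2] != a[5])          /* a pointer into part of an array */
        return 0;
    return 1;
}
```

None of these comparisons can fail in a conforming compiler; the sketch is only a way of seeing the equivalences side by side.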

10.6 Arrays and Pointers as Function Arguments

[This section corresponds to K&R Sec. 5.2]

Earlier, we learned that functions in C receive copies of their arguments. (This means that C uses call by value; it means that a function can modify one of its arguments without modifying the value in the caller.) We didn't say so at the time, but when a function is called, the copies of the arguments are made as if by assignment. But since arrays can't be assigned, how can a function receive an array as an argument? The answer will explain why arrays are an apparent exception to the rule that functions cannot modify their arguments.

We've been regularly calling a function getline like this:

    char line[100];
    getline(line, 100);

with the intention that getline read the next line of input into the character array line. But in the previous paragraph, we learned that when we mention the name of an array in an expression, the compiler generates a pointer to its first element. So the call above is as if we had written

    char line[100];
    getline(&line[0], 100);

In other words, the getline function does not receive an array of char at all; it actually receives a pointer to char!

As we've seen throughout this chapter, it's straightforward to manipulate the elements of an array using pointers, so there's no particular insurmountable difficulty if getline receives a pointer. One question remains, though: we had been defining getline with its line parameter declared as an array:

    int getline(char line[], int max)
    {
        ...
    }

We mentioned that we didn't have to specify a size for the line parameter, with the explanation that getline really used the array in its caller, where the actual size was specified. But that declaration certainly does look like an array, so how can it work when getline actually receives a pointer?

The answer is that the C compiler does a little something behind your back. It knows that whenever you mention an array name in an expression, it (the compiler) generates a pointer to the array's first element. Therefore, it knows that a function can never actually receive an array as a parameter. Therefore, whenever it sees you defining a function that seems to accept an array as a parameter, the compiler quietly pretends that you had declared it as accepting a pointer, instead. The definition of getline above is compiled exactly as if it had been written

    int getline(char *line, int max)
    {
        ...
    }

Let's look at how getline might be written if we thought of its first parameter (argument) as a pointer, instead:

    int getline(char *line, int max)
    {
        int nch = 0;
        int c;
        max = max - 1;              /* leave room for '\0' */

    #ifndef FGETLINE
        while((c = getchar()) != EOF)
    #else
        while((c = getc(fp)) != EOF)
    #endif
        {
            if(c == '\n')
                break;

            if(nch < max)
            {
                *(line + nch) = c;
                nch = nch + 1;
            }
        }

        if(c == EOF && nch == 0)
            return EOF;

        *(line + nch) = '\0';
        return nch;
    }

But, as we've learned, we can also use "array subscript" notation with pointers, so we could rewrite the pointer version of getline like this:

    int getline(char *line, int max)
    {
        int nch = 0;
        int c;
        max = max - 1;              /* leave room for '\0' */

    #ifndef FGETLINE
        while((c = getchar()) != EOF)
    #else
        while((c = getc(fp)) != EOF)
    #endif
        {
            if(c == '\n')
                break;

            if(nch < max)
            {
                line[nch] = c;
                nch = nch + 1;
            }
        }

        if(c == EOF && nch == 0)
            return EOF;

        line[nch] = '\0';
        return nch;
    }

But this is exactly what we'd written before (see chapter 6, Sec. 6.3), except that the declaration of the line parameter is different. In other words, within the body of the function, it hardly matters whether we thought line was an array or a pointer, since we can use array subscripting notation with both arrays and pointers.

These games that the compiler is playing with arrays and pointers may seem bewildering at first, and it may seem faintly miraculous that everything comes out in the wash when you declare a function like getline that seems to accept an array. The equivalence in C between arrays and pointers can be confusing, but it does work and is one of the central features of C. If the games which the compiler plays (pretending that you declared a parameter as a pointer when you thought you declared it as an array) bother you, you can do two things:

1. Continue to pretend that functions can receive arrays as parameters; declare and use them that way, but remember that unlike other arguments, a function can modify the copy in its caller of an argument that (seems to be) an array.

2. Realize that arrays are always passed to functions as pointers, and always declare your functions as accepting pointers.
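The practical consequence can be seen in a short sketch. All three function names below (fill, fill_ptr, and add_one_to_copy) are hypothetical; the first two are the same function declared in the two styles just described, and the third shows an ordinary int argument for contrast.

```c
/* Declared with array notation; compiled as if a were declared int *a. */
void fill(int a[], int n, int value)
{
    int i;

    for(i = 0; i < n; i++)
        a[i] = value;           /* stores into the caller's array */
}

/* The same function written with an explicit pointer parameter. */
void fill_ptr(int *a, int n, int value)
{
    int i;

    for(i = 0; i < n; i++)
        *(a + i) = value;
}

/* An ordinary int argument is a copy: changing x here has no effect
   on the variable the caller passed. */
int add_one_to_copy(int x)
{
    x = x + 1;
    return x;
}
```

After int a[3] = {1, 2, 3}; fill(a, 3, 9); every cell of the caller's a is 9, whereas after int v = 5; add_one_to_copy(v); the caller's v is still 5.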

10.7 Strings

Because of the "equivalence" of arrays and pointers, it is extremely common to refer to and manipulate strings as character pointers, or char *'s. It is so common, in fact, that it is easy to forget that strings are arrays, and to imagine that they're represented by pointers. (Actually, in the case of strings, it may not even matter that much if the distinction gets a little blurred; there's certainly nothing wrong with referring to a character pointer, suitably initialized, as a "string.") Let's look at a few of the implications:

1. Any function that manipulates a string will actually accept it as a char * argument. The caller may pass an array containing a string, but the function will receive a pointer to the array's (string's) first element (character).

2. The %s format in printf expects a character pointer.

3. Although you have to use strcpy to copy a string from one array to another, you can use simple pointer assignment to assign a string to a pointer. The string being assigned might either be in an array or pointed to by another pointer. In other words, given

    char string[] = "Hello, world!";
    char *p1, *p2;

both

    p1 = string

and

    p2 = p1

are legal. Remember, though, that when you assign a pointer, you're making a copy of the pointer but not of the data it points to. In the first example, p1 ends up pointing to the string in string. In the second example, p2 ends up pointing to the same string as p1. In any case, after a pointer assignment, if you ever change the string (or other data) pointed to, the change is "visible" to both pointers.

4. Many programs manipulate strings exclusively using character pointers, never explicitly declaring any actual arrays. As long as these programs are careful to allocate appropriate memory for the strings, they're perfectly valid and correct.

When you start working heavily with strings, however, you have to be aware of one subtle fact. When you initialize a character array with a string constant:
    char string[] = "Hello, world!";

you end up with an array containing the string, and you can modify the array's contents to your heart's content:

    string[0] = 'J';

However, it's possible to use string constants (the formal term is string literals) at other places in your code. Since they're arrays, the compiler generates pointers to their first elements when they're used in expressions, as usual. That is, if you say

    char *p1 = "Hello";
    int len = strlen("world");

it's almost as if you'd said

    char internal_string_1[] = "Hello";
    char internal_string_2[] = "world";
    char *p1 = &internal_string_1[0];
    int len = strlen(&internal_string_2[0]);

Here, the arrays named internal_string_1 and internal_string_2 are supposed to suggest the fact that the compiler is actually generating little temporary arrays every time you use a string constant in your code. However, the subtle fact is that the arrays which are "behind" the string constants are not necessarily modifiable. In particular, the compiler may store them in read-only memory. Therefore, if you write

    char *p3 = "Hello, world!";
    p3[0] = 'J';

your program may crash, because it may try to store a value (in this case, the character 'J') into nonwritable memory.

The moral is that whenever you're building or modifying strings, you have to make sure that the memory you're building or modifying them in is writable. That memory should either be an array you've allocated, or some memory which you've dynamically allocated by the techniques which we'll see in the next chapter. Make sure that no part of your program will ever try to modify a string which is actually one of the unnamed, unwritable arrays which the compiler generated for you in response to one of your string constants. (The only exception is array initialization, because if you write to such an array, you're writing to the array, not to the string literal which you used to initialize the array.)
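The safe and unsafe cases can be put side by side in a minimal sketch. The function name jello and its buf parameter are hypothetical; buf is assumed to point to writable memory with room for at least 30 characters. The unsafe store appears only in a comment, since actually executing it may crash.

```c
#include <string.h>

/* Hypothetical demo: writes "Jello, world! / Hello, world!" into buf,
   which must be writable and at least 30 characters long. */
void jello(char *buf)
{
    char string[] = "Hello, world!";    /* array: a writable copy of the literal */
    char *p = "Hello, world!";          /* pointer to the literal: treat as read-only */

    string[0] = 'J';                    /* fine: modifies the array */
    /* p[0] = 'J'; is the mistake described above and may crash */

    strcpy(buf, string);                /* buf now holds "Jello, world!" */
    strcat(buf, " / ");
    strcat(buf, p);                     /* reading through p is always fine */
}
```

The second half of the result still reads "Hello, world!", because changing the array string did not touch the string literal that p points to.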

10.8 Example: Breaking a Line into "Words"


In an earlier assignment, an "extra credit" version of a problem asked you to write a little checkbook balancing program that accepted a series of lines of the form
    deposit 1000
    check 10
    check 12.34
    deposit 50
    check 20

It was a surprising nuisance to do this in an ad hoc way, using only the tools we had at the time. It was easy to read each line, but it was cumbersome to break it up into the word ("deposit" or "check") and the amount.

I find it very convenient to use a more general approach: first, break lines like these into a series of whitespace-separated words, then deal with each word separately. To do this, we will use an array of pointers to char, which we can also think of as an "array of strings," since a string is an array of char, and a pointer-to-char can easily point at a string. Here is the declaration of such an array:
    char *words[10];

This is the first complicated C declaration we've seen: it says that words is an array of 10 pointers to char. We're going to write a function, getwords, which we can call like this:
    int nwords;
    nwords = getwords(line, words, 10);

where line is the line we're breaking into words, words is the array to be filled in with the (pointers to the) words, and nwords (the return value from getwords) is the number of words which the function finds. (As with getline, we tell the function the size of the array so that if the line should happen to contain more words than that, it won't overflow the array).

Here is the definition of the getwords function. It finds the beginning of each word, places a pointer to it in the array, finds the end of that word (which is signified by at least one whitespace character) and terminates the word by placing a '\0' character after it. (The '\0' character will overwrite the first whitespace character following the word.) Note that the original input string is therefore modified by getwords: if you were to try to print the input line after calling getwords, it would appear to contain only its first word (because of the first inserted '\0').
    #include <stddef.h>
    #include <ctype.h>

    int getwords(char *line, char *words[], int maxwords)
    {
    char *p = line;
    int nwords = 0;

    while(1)
        {
        while(isspace(*p))
            p++;

        if(*p == '\0')
            return nwords;

        words[nwords++] = p;

        while(!isspace(*p) && *p != '\0')
            p++;

        if(*p == '\0')
            return nwords;

        *p++ = '\0';

        if(nwords >= maxwords)
            return nwords;
        }
    }

Each time through the outer while loop, the function tries to find another word. First it skips over whitespace (which might be leading spaces on the line, or the space(s) separating this word from the previous one). The isspace function is new: it's in the standard library, declared in the header file <ctype.h>, and it returns nonzero ("true") if the character you hand it is a space character (a space or a tab, or any other whitespace character there might happen to be).

When the function finds a non-whitespace character, it has found the beginning of another word, so it places the pointer to that character in the next cell of the words array. Then it steps through the word, looking at non-whitespace characters, until it finds another whitespace character, or the \0 at the end of the line. If it finds the \0, it's done with the entire line; otherwise, it changes the whitespace character to a \0, to terminate the word it's just found, and continues. (If it's found as many words as will fit in the words array, it returns prematurely.)

Each time it finds a word, the function increments the number of words (nwords) it has found. Since arrays in C start at [0], the number of words the function has found so far is also the index of the cell in the words array where the next word should be stored. The function actually assigns the next word and increments nwords in one expression:

    words[nwords++] = p;

You should convince yourself that this arrangement works, and that (in this case) the preincrement form
    words[++nwords] = p;    /* WRONG */

would not behave as desired.

When the function is done (when it finds the \0 terminating the input line, or when it runs out of cells in the words array) it returns the number of words it has found. Here is a complete example of calling getwords:
    char line[] = "this is a test";
    int i;

    nwords = getwords(line, words, 10);
    for(i = 0; i < nwords; i++)
        printf("%s\n", words[i]);
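As a rough sketch of how the "break into words, then deal with each word" approach might be applied to the checkbook lines that motivated this section, the block below pairs a copy of getwords with a hypothetical helper, apply_line, which is not part of these notes. The copy of getwords assumes an explicit int return type and adds unsigned char casts on the isspace arguments, which guards against characters with negative values.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* getwords as described in the text (explicit int return type and
   unsigned char casts are additions made for this sketch) */
int getwords(char *line, char *words[], int maxwords)
{
    char *p = line;
    int nwords = 0;

    while(1) {
        while(isspace((unsigned char)*p))
            p++;
        if(*p == '\0')
            return nwords;
        words[nwords++] = p;
        while(!isspace((unsigned char)*p) && *p != '\0')
            p++;
        if(*p == '\0')
            return nwords;
        *p++ = '\0';
        if(nwords >= maxwords)
            return nwords;
    }
}

/* Hypothetical helper: apply one "deposit N" or "check N" line to *balance.
   Returns 0 on success, -1 if the line is not understood.
   The line is modified, because getwords writes '\0's into it. */
int apply_line(char *line, double *balance)
{
    char *words[2];
    int n = getwords(line, words, 2);

    if(n != 2)
        return -1;
    if(strcmp(words[0], "deposit") == 0)
        *balance += atof(words[1]);
    else if(strcmp(words[0], "check") == 0)
        *balance -= atof(words[1]);
    else
        return -1;
    return 0;
}
```

Because getwords modifies its input, each line handed to apply_line must live in a writable array, not in a string literal.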


Chapter 11: Memory Allocation


In this chapter, we'll meet malloc, C's dynamic memory allocation function, and we'll cover dynamic memory allocation in some detail.

As we begin doing dynamic memory allocation, we'll begin to see (if we haven't seen it already) what pointers can really be good for. Many of the pointer examples in the previous chapter (those which used pointers to access arrays) didn't do all that much for us that we couldn't have done using arrays. However, when we begin doing dynamic memory allocation, pointers are the only way to go, because what malloc returns is a pointer to the memory it gives us. (Due to the equivalence between pointers and arrays, though, we will still be able to think of dynamically allocated regions of storage as if they were arrays, and even to use array-like subscripting notation on them.)

You have to be careful with dynamic memory allocation. malloc operates at a pretty "low level"; you will often find yourself having to do a certain amount of work to manage the memory it gives you. If you don't keep accurate track of the memory which malloc has given you, and the pointers of yours which point to it, it's all too easy to accidentally use a pointer which points "nowhere", with generally unpleasant results. (The basic problem is that if you assign a value to the location pointed to by a pointer:
    *p = 0;

and if the pointer p points "nowhere", well actually it can be construed to point somewhere, just not where you wanted it to, and that "somewhere" is where the 0 gets written. If the "somewhere" is memory which is in use by some other part of your program, or even worse, if the operating system has not protected itself from you and "somewhere" is in fact in use by the operating system, things could get ugly.)

11.1 Allocating Memory with malloc


[This section corresponds to parts of K&R Secs. 5.4, 5.6, 6.5, and 7.8.5]

A problem with many simple programs, including in particular little teaching programs such as we've been writing so far, is that they tend to use fixed-size arrays which may or may not be big enough. We have an array of 100 ints for the numbers which the user enters and wishes to find the average of; what if the user enters 101 numbers? We have an array of 100 chars which we pass to getline to receive the user's input; what if the user types a line of 200 characters? If we're lucky, the relevant parts of the program check how much of an array they've used, and print an error message or otherwise gracefully abort before overflowing the array. If we're not so lucky, a program may sail off the end of an array, overwriting other data and behaving quite badly. In either case, the user doesn't get his job done. How can we avoid the restrictions of fixed-size arrays?

The answers all involve the standard library function malloc. Very simply, malloc returns a pointer to n bytes of memory which we can do anything we want to with. If we didn't want to read a line of input into a fixed-size array, we could use malloc, instead. Here's the first step:
    #include <stdlib.h>

    char *line;
    int linelen = 100;
    line = malloc(linelen);
    /* incomplete -- malloc's return value not checked */
    getline(line, linelen);

malloc is declared in <stdlib.h>, so we #include that header in any program that calls malloc. A "byte" in C is, by definition, an amount of storage suitable for storing one character, so the above invocation of malloc gives us exactly as many chars as we ask for.

We could illustrate the resulting pointer like this:

[Figure: the pointer variable line pointing at a row of unnamed boxes, one per allocated byte]

The 100 bytes of memory (not all of which are shown) pointed to by line are those allocated by malloc. (They are brand-new memory, conceptually a bit different from the memory which the compiler arranges to have allocated automatically for our conventional variables. The 100 boxes in the figure don't have a name next to them, because they're not storage for a variable we've declared.)

As a second example, we might have occasion to allocate a piece of memory, and to copy a string into it with strcpy:
    char *p = malloc(15);
    /* incomplete -- malloc's return value not checked */
    strcpy(p, "Hello, world!");

When copying strings, remember that all strings have a terminating \0 character. If you use strlen to count the characters in a string for you, that count will not include the trailing \0, so you must add one before calling malloc:
    char *somestring, *copy;
    ...
    copy = malloc(strlen(somestring) + 1);    /* +1 for \0 */
    /* incomplete -- malloc's return value not checked */
    strcpy(copy, somestring);

What if we're not allocating characters, but integers? If we want to allocate 100 ints, how many bytes is that? If we know how big ints are on our machine (i.e. depending on whether we're using a 16- or 32-bit machine) we could try to compute it ourselves, but it's much safer and more portable to let C compute it for us. C has a sizeof operator, which computes the size, in bytes, of a variable or type. It's just what we need when calling malloc. To allocate space for 100 ints, we could call
    int *ip = malloc(100 * sizeof(int));

The use of the sizeof operator tends to look like a function call, but it's really an operator, and it does its work at compile time.

Since we can use array indexing syntax on pointers, we can treat a pointer variable after a call to malloc almost exactly as if it were an array. In particular, after the above call to malloc initializes ip to point at storage for 100 ints, we can access ip[0], ip[1], ... up to ip[99]. This way, we can get the effect of an array even if we don't know until run time how big the "array" should be. (In a later section we'll see how we might deal with the case where we're not even sure at the point we begin using it how big an "array" will eventually have to be.)

Our examples so far have all had a significant omission: they have not checked malloc's return value. Obviously, no real computer has an infinite amount of memory available, so there is no guarantee that malloc will be able to give us as much memory as we ask for. If we call malloc(100000000), or if we call malloc(10) 10,000,000 times, we're probably going to run out of memory.

When malloc is unable to allocate the requested memory, it returns a null pointer. A null pointer, remember, points definitively nowhere. It's a "not a pointer" marker; it's not a pointer you can use. (As we said in section 9.4, a null pointer can be used as a failure return from a function that returns pointers, and malloc is a perfect example.) Therefore, whenever you call malloc, it's vital to check the returned pointer before using it! If you call malloc, and it returns a null pointer, and you go off and use that null pointer as if it pointed somewhere, your program probably won't last long. Instead, a program should immediately check for a null pointer, and if it receives one, it should at the very least print an error message and exit, or perhaps figure out some way of proceeding without the memory it asked for. But it cannot go on to use the null pointer it got back from malloc in any way, because that null pointer by definition points nowhere. ("It cannot use a null pointer in any way" means that the program cannot use the * or [] operators on such a pointer value, or pass it to any function that expects a valid pointer.)

A call to malloc, with an error check, typically looks something like this:
    int *ip = malloc(100 * sizeof(int));
    if(ip == NULL)
        {
        printf("out of memory\n");
        /* exit or return */
        }

After printing the error message, this code should return to its caller, or exit from the program entirely; it cannot proceed with the code that would have used ip.

Of course, in our examples so far, we've still limited ourselves to "fixed size" regions of memory, because we've been calling malloc with fixed arguments like 10 or 100. (Our call to getline is still limited to 100-character lines, or whatever number we set the linelen variable to; our ip variable still points at only 100 ints.) However, since the sizes are now values which can in principle be determined at run-time, we've at least moved beyond having to recompile the program (with a bigger array) to accommodate longer lines, and with a little more work, we could arrange that the "arrays" automatically grew to be as large as required. (For example, we could write something like getline which could read the longest input line actually seen.) We'll begin to explore this possibility in a later section.
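The pieces of this section (adding one for the \0, using sizeof, and checking for a null pointer) might be combined as in the following sketch. The function names copy_string and make_squares are hypothetical, chosen only for illustration; each returns a null pointer on failure so that its caller can decide whether to exit or carry on.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated copy of s,
   or a null pointer if malloc fails. The caller frees the result. */
char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);    /* +1 for \0 */
    if(copy == NULL)
        return NULL;
    strcpy(copy, s);
    return copy;
}

/* Hypothetical helper: allocate an "array" of n ints holding 0*0, 1*1, ...
   Returns a null pointer if n is not positive or malloc fails. */
int *make_squares(int n)
{
    int *ip;
    int i;

    if(n <= 0)
        return NULL;
    ip = malloc(n * sizeof(int));          /* let C compute the size of an int */
    if(ip == NULL)
        return NULL;
    for(i = 0; i < n; i++)
        ip[i] = i * i;                     /* array-style indexing on the pointer */
    return ip;
}
```

A caller would test each result against NULL before using it, exactly as in the malloc check shown above.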

11.2 Freeing Memory


Memory allocated with malloc lasts as long as you want it to. It does not automatically disappear when a function returns, as automatic-duration variables do, but it does not have to remain for the entire duration of your program, either. Just as you can use malloc to control exactly when and how much memory you allocate, you can also control exactly when you deallocate it.

In fact, many programs use memory on a transient basis. They allocate some memory, use it for a while, but then reach a point where they don't need that particular piece any more. Because memory is not inexhaustible, it's a good idea to deallocate (that is, release or free) memory you're no longer using.

Dynamically allocated memory is deallocated with the free function. If p contains a pointer previously returned by malloc, you can call
    free(p);

which will "give the memory back" to the stock of memory (sometimes called the "arena" or "pool") from which malloc requests are satisfied. Calling free is sort of the ultimate in recycling: it costs you almost nothing, and the memory you give back is immediately usable by other parts of your program. (Theoretically, it may even be usable by other programs.)

(Freeing unused memory is a good idea, but it's not mandatory. When your program exits, any memory which it has allocated but not freed should be automatically released. If your computer were to somehow "lose" memory just because your program forgot to free it, that would indicate a problem or deficiency in your operating system.)

Naturally, once you've freed some memory you must remember not to use it any more. After calling
    free(p);

it is probably the case that p still points at the same memory. However, since we've given it back, it's now "available," and a later call to malloc might give that memory to some other part of your program. If the variable p is a global variable or will otherwise stick around for a while, one good way to record the fact that it's not to be used any more would be to set it to a null pointer:
    free(p);
    p = NULL;

Now we don't even have the pointer to the freed memory any more, and (as long as we check to see that p is non-NULL before using it), we won't misuse any memory via the pointer p.

When thinking about malloc, free, and dynamically-allocated memory in general, remember again the distinction between a pointer and what it points to. If you call malloc to allocate some memory, and store the pointer which malloc gives you in a local pointer variable, what happens when the function containing the local pointer variable returns? If the local pointer variable has automatic duration (which is the default, unless the variable is declared static), it will disappear when the function returns. But for the pointer variable to disappear says nothing about the memory pointed to! That memory still exists and, as far as malloc and free are concerned, is still allocated. The only thing that has disappeared is the pointer variable you had which pointed at the allocated memory. (Furthermore, if it contained the only copy of the pointer you had, once it disappears, you'll have no way of freeing the memory, and no way of using it, either. Using memory and freeing memory both require that you have at least one pointer to the memory!)
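The "free it, then set the pointer to null" habit can be wrapped up in a small function, sketched below. The name release is hypothetical, and passing the address of the pointer (an int **) is an assumption made so that the function can null out the caller's own variable.

```c
#include <stdlib.h>

/* Hypothetical helper: free the block that *pp points to, then set the
   caller's pointer to NULL so it cannot be used by accident afterwards. */
void release(int **pp)
{
    free(*pp);      /* calling free on a null pointer is harmless */
    *pp = NULL;
}
```

After int *p = malloc(10 * sizeof(int)); a call of release(&p) leaves p null, and calling release(&p) a second time does no damage.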

11.3 Reallocating Memory Blocks


Sometimes you're not sure at first how much memory you'll need. For example, if you need to store a series of items you read from the user, and if the only way to know how many there are is to read them until the user types some "end" signal, you'll have no way of knowing, as you begin reading and storing the first few, how many you'll have seen by the time you do see that "end" marker. You might want to allocate room for, say, 100 items, and if the user enters a 101st item before entering the "end" marker, you might wish for a way to say "uh, malloc, remember those 100 items I asked for? Could I change my mind and have 200 instead?"

In fact, you can do exactly this, with the realloc function. You hand realloc an old pointer (such as you received from an initial call to malloc) and a new size, and realloc does what it can to give you a chunk of memory big enough to hold the new size. For example, if we wanted the ip variable from an earlier example to point at 200 ints instead of 100, we could try calling
    ip = realloc(ip, 200 * sizeof(int));

Since you always want each block of dynamically-allocated memory to be contiguous (so that you can treat it as if it were an array), you and realloc have to worry about the case where realloc can't make the old block of memory bigger "in place," but rather has to relocate it elsewhere in order to find enough contiguous space for the new requested size. realloc does this by returning a new pointer. If realloc was able to make the old block of memory bigger, it returns the same pointer. If realloc has to go elsewhere to get enough contiguous memory, it returns a pointer to the new memory, after copying your old data there. (In this case, after it makes the copy, it frees the old block.) Finally, if realloc can't find enough memory to satisfy the new request at all, it returns a null pointer. Therefore, you usually don't want to overwrite your old pointer with realloc's return value until you've tested it to make sure it's not a null pointer. You might use code like this:
    int *newp;
    newp = realloc(ip, 200 * sizeof(int));
    if(newp != NULL)
        ip = newp;
    else    {
        printf("out of memory\n");
        /* exit or return */
        /* but ip still points at 100 ints */
        }

If realloc returns something other than a null pointer, it succeeded, and we set ip to what it returned. (We've either set ip to what it used to be or to a new pointer, but in either case, it points to where our data is now.) If realloc returns a null pointer, however, we hang on to our old pointer in ip which still points at our original 100 values.

Putting this all together, here is a piece of code which reads lines of text from the user, treats each line as an integer by calling atoi, and stores each integer in a dynamically-allocated "array":
    #define MAXLINE 100

    char line[MAXLINE];
    int *ip;
    int nalloc, nitems;

    nalloc = 100;
    ip = malloc(nalloc * sizeof(int));        /* initial allocation */
    if(ip == NULL)
        {
        printf("out of memory\n");
        exit(1);
        }

    nitems = 0;

    while(getline(line, MAXLINE) != EOF)
        {
        if(nitems >= nalloc)
            {                                 /* increase allocation */
            int *newp;
            nalloc += 100;
            newp = realloc(ip, nalloc * sizeof(int));
            if(newp == NULL)
                {
                printf("out of memory\n");
                exit(1);
                }
            ip = newp;
            }

        ip[nitems++] = atoi(line);
        }

We use two different variables to keep track of the "array" pointed to by ip. nalloc is how many elements we've allocated, and nitems is how many of them are in use. Whenever we're about to store another item in the "array," if nitems >= nalloc, the old "array" is full, and it's time to call realloc to make it bigger.

Finally, we might ask what the return type of malloc and realloc is, if they are able to return pointers to char or pointers to int or (though we haven't seen it yet) pointers to any other type. The answer is that both of these functions are declared (in <stdlib.h>) as returning a type we haven't seen, void * (that is, pointer to void). We haven't really seen type void, either, but what's going on here is that void * is specially defined as a "generic" pointer type, which may be used (strictly speaking, assigned to or from) any pointer type.
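The nalloc/nitems bookkeeping can also be packaged as a function, as in the sketch below. The name append_int and its pointer parameters are assumptions made for illustration, not part of these notes. The sketch relies on the fact that realloc, when handed a null pointer, behaves like malloc, so the "array" can start out empty.

```c
#include <stdlib.h>

/* Hypothetical helper: append value to the dynamically allocated "array"
   *ipp, which has room for *nallocp ints, of which *nitemsp are in use.
   Grows the allocation by 100 ints when it is full.
   Returns 1 on success, or 0 if out of memory (old contents stay intact). */
int append_int(int **ipp, int *nallocp, int *nitemsp, int value)
{
    if(*nitemsp >= *nallocp) {
        int newalloc = *nallocp + 100;
        int *newp = realloc(*ipp, newalloc * sizeof(int));
        if(newp == NULL)
            return 0;               /* *ipp still points at the old block */
        *ipp = newp;
        *nallocp = newalloc;
    }
    (*ipp)[(*nitemsp)++] = value;
    return 1;
}
```

A caller might start with int *ip = NULL; int nalloc = 0, nitems = 0; and call append_int(&ip, &nalloc, &nitems, atoi(line)) once per input line, calling free(ip) when the numbers are no longer needed.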

11.4 Pointer Safety


At the beginning of the previous chapter, we said that the hard thing about pointers is not so much manipulating them as ensuring that the memory they point to is valid. When a pointer doesn't point where you think it does, if you inadvertently access or modify the memory it points to, you can damage other parts of your program, or (in some cases) other programs or the operating system itself!

When we use pointers to simple variables, as in section 10.1, there's not much that can go wrong. When we use pointers into arrays, as in section 10.2, and begin moving the pointers around, we have to be more careful, to ensure that the roving pointers always stay within the bounds of the array(s). When we begin passing pointers to functions, and especially when we begin returning them from functions (as in the strstr function of section 10.4) we have to be more careful still, because the code using the pointer may be far removed from the code which owns or allocated the memory.

One particular problem concerns functions that return pointers. Where is the memory to which the returned pointer points? Is it still around by the time the function returns? The strstr function returns either a null pointer (which points definitively nowhere, and which the caller presumably checks for) or it returns a pointer which points into the input string, which the caller supplied, which is pretty safe. One thing a function must not do, however, is return a pointer to one of its own, local, automatic-duration arrays. Remember that automatic-duration variables (which includes all non-static local variables), including automatic-duration arrays, are deallocated and disappear when the function returns. If a function returns a pointer to a local array, that pointer will be invalid by the time the caller tries to use it.

Finally, when we're doing dynamic memory allocation with malloc, realloc, and free, we have to be most careful of all. Dynamic allocation gives us a lot more flexibility in how our programs use memory, although with that flexibility comes the responsibility that we manage dynamically allocated memory carefully. The possibilities for misdirected pointers and associated mayhem are greatest in programs that make heavy use of dynamic memory allocation. You can reduce these possibilities by designing your program in such a way that it's easy to ensure that pointers are used correctly and that memory is always allocated and deallocated correctly. (If, on the other hand, your program is designed in such a way that meeting these guarantees is a tedious nuisance, sooner or later you'll forget or neglect to, and maintenance will be a nightmare.)
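The rule about not returning a pointer to a local array can be illustrated with a small sketch. Both function names here are hypothetical. The incorrect version is disabled with #if 0 so that it is never compiled; the second version shows one possible alternative, returning dynamically allocated memory which the caller is then responsible for freeing.

```c
#include <stdio.h>
#include <stdlib.h>

#if 0
/* WRONG: buf is an automatic-duration array, so it disappears when
   the function returns, and the returned pointer is invalid. */
char *format_int_bad(int n)
{
    char buf[32];
    sprintf(buf, "%d", n);
    return buf;
}
#endif

/* Hypothetical helper: return n formatted as a decimal string in memory
   obtained from malloc, or a null pointer if out of memory.
   The memory outlives the function; the caller must free it. */
char *format_int(int n)
{
    char *buf = malloc(32);     /* assumed ample for the digits of an int */
    if(buf == NULL)
        return NULL;
    sprintf(buf, "%d", n);
    return buf;
}
```

Another common design is to have the caller pass in its own array and size, as getline and getwords do, which avoids dynamic allocation altogether.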

Chapter 12: Input and Output


So far, we've been calling printf to print formatted output to the "standard output" (wherever that is). We've also been calling getchar to read single characters from the "standard input," and putchar to write single characters to the standard output. "Standard input" and "standard output" are two predefined I/O streams which are implicitly available to us. In this chapter we'll learn how to take control of input and output by opening our own streams, perhaps connected to data files, which we can read from and write to.

12.1 File Pointers and fopen


[This section corresponds to K&R Sec. 7.5]

How will we specify that we want to access a particular data file? It would theoretically be possible to mention the name of a file each time it was desired to read from or write to it. But such an approach would have a number of drawbacks. Instead, the usual approach (and the one taken in C's stdio library) is that you mention the name of the file once, at the time you open it. Thereafter, you use some little token, in this case the file pointer, which keeps track (both for your sake and the library's) of which file you're talking about. Whenever you want to read from or write to one of the files you're working with, you identify that file by using its file pointer (that is, the file pointer you obtained when you opened the file). As we'll see, you store file pointers in variables just as you store any other data you manipulate, so it is possible to have several files open, as long as you use distinct variables to store the file pointers.

You declare a variable to store a file pointer like this:
    FILE *fp;

The type FILE is predefined for you by <stdio.h>. It is a data structure which holds the information the standard I/O library needs to keep track of the file for you. For historical reasons, you declare a variable which is a pointer to this FILE type. The name of the variable can (as for any variable) be anything you choose; it is traditional to use the letters fp in the variable name (since we're talking about a file pointer). If you were reading from two files at once you'd probably use two file pointers:
    FILE *fp1, *fp2;

If you were reading from one file and writing to another you might declare an input file pointer and an output file pointer:
    FILE *ifp, *ofp;

Like any pointer variable, a file pointer isn't any good until it's initialized to point to something. (Actually, no variable of any type is much good until you've initialized it.) To actually open a file, and receive the "token" which you'll store in your file pointer variable, you call fopen. fopen accepts a file name (as a string) and a mode value indicating among other things whether you intend to read or write this file. (The mode variable is also a string.) To open the file input.dat for reading you might call
    ifp = fopen("input.dat", "r");

The mode string "r" indicates reading. Mode "w" indicates writing, so we could open output.dat for output like this:
    ofp = fopen("output.dat", "w");

The other values for the mode string are less frequently used. The third major mode is "a" for append. (If you use "w" to write to a file which already exists, its old contents will be discarded.) You may also add a + character to the mode string to indicate that you want to both read and write, or a b character to indicate that you want to do "binary" (as opposed to text) I/O.

One thing to beware of when opening files is that it's an operation which may fail. The requested file might not exist, or it might be protected against reading or writing. (These possibilities ought to be obvious, but it's easy to forget them.) fopen returns a null pointer if it can't open the requested file, and it's important to check for this case before going off and using fopen's return value as a file pointer. Every call to fopen will typically be followed with a test, like this:
    ifp = fopen("input.dat", "r");
    if(ifp == NULL)
        {
        printf("can't open file\n");
        /* exit or return */
        }

If fopen returns a null pointer, and you store it in your file pointer variable and go off and try to do I/O with it, your program will typically crash.

It's common to collapse the call to fopen and the assignment in with the test:
    if((ifp = fopen("input.dat", "r")) == NULL)
        {
        printf("can't open file\n");
        /* exit or return */
        }

You don't have to write these "collapsed" tests if you're not comfortable with them, but you'll see them in other people's code, so you should be able to read them.

12.2 I/O with File Pointers


For each of the I/O library functions we've been using so far, there's a companion function which accepts an additional file pointer argument telling it where to read from or write to. The companion function to printf is fprintf, and the file pointer argument comes first. To print a string to the output.dat file we opened in the previous section, we might call
    fprintf(ofp, "Hello, world!\n");

The companion function to getchar is getc, and the file pointer is its only argument. To read a character from the input.dat file we opened in the previous section, we might call
    int c;
    c = getc(ifp);

The companion function to putchar is putc, and the file pointer argument comes last. To write a character to output.dat, we could call
    putc(c, ofp);
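Putting getc and putc together, a character-by-character copy from one open stream to another might be sketched as follows. The name copy_stream is hypothetical. Note that c is declared as an int, not a char, so that the special value EOF can be distinguished from every real character.

```c
#include <stdio.h>

/* Hypothetical helper: copy everything from ifp to ofp, one character
   at a time. Returns the number of characters copied. */
long copy_stream(FILE *ifp, FILE *ofp)
{
    int c;          /* int, so that EOF can be told apart from data */
    long n = 0;

    while((c = getc(ifp)) != EOF) {
        putc(c, ofp);
        n++;
    }
    return n;
}
```

With the ifp and ofp variables opened as in the previous section, copy_stream(ifp, ofp) would copy input.dat to output.dat.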

Our own getline function calls getchar and so always reads the standard input. We could write a companion fgetline function which reads from an arbitrary file pointer:
    #include <stdio.h>

    /* Read one line from fp, */
    /* copying it to line array (but no more than max chars). */
    /* Does not place terminating \n in line array. */
    /* Returns line length, or 0 for empty line, or EOF for end-of-file. */

    int fgetline(FILE *fp, char line[], int max)
    {
    int nch = 0;
    int c;
    max = max - 1;            /* leave room for '\0' */

    while((c = getc(fp)) != EOF)
        {
        if(c == '\n')
            break;

        if(nch < max)
            {
            line[nch] = c;
            nch = nch + 1;
            }
        }

    if(c == EOF && nch == 0)
        return EOF;

    line[nch] = '\0';
    return nch;
    }

Now we could read one line from ifp by calling
    char line[MAXLINE];
    ...
    fgetline(ifp, line, MAXLINE);

12.3 Predefined Streams


Besides the file pointers which we explicitly open by calling fopen, there are also three predefined streams. stdin is a constant file pointer corresponding to standard input, and stdout is a constant file pointer corresponding to standard output. Both of these can be used anywhere a file pointer is called for; for example, getchar() is the same as getc(stdin) and putchar(c) is the same as putc(c, stdout).

The third predefined stream is stderr. Like stdout, stderr is typically connected to the screen by default. The difference is that stderr is not redirected when the standard output is redirected. For example, under Unix or MS-DOS, when you invoke
    program > filename

anything printed to stdout is redirected to the file filename, but anything printed to stderr still goes to the screen. The intent behind stderr is that it is the "standard error output"; error messages printed to it will not disappear into an output file. For example, a more realistic way to print an error message when a file can't be opened would be
    if((ifp = fopen(filename, "r")) == NULL)
        {
        fprintf(stderr, "can't open file %s\n", filename);
        /* exit or return */
        }

where filename is a string variable indicating the file name to be opened. Not only is the error message printed to stderr, but it is also more informative in that it mentions the name of the file that couldn't be opened. (We'll see another example in the next chapter.)

12.4 Closing Files


Although you can open multiple files, there's a limit to how many you can have open at once. If your program will open many files in succession, you'll want to close each one as you're done with it; otherwise the standard I/O library could run out of the resources it uses to keep track of open files. Closing a file simply involves calling fclose with the file pointer as its argument:
    fclose(fp);

Calling fclose arranges that (if the file was open for output) any last, buffered output is finally written to the file, and that those resources used by the operating system (and the C library) for this file are released. If you forget to close a file, it will be closed automatically when the program exits.
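The whole life cycle of a file pointer (open, check, read, close) might look like the following sketch, which counts the lines in a named file. The function count_lines is hypothetical; the copy of fgetline follows the definition in section 12.2, and MAXLINE is assumed to be 100 as elsewhere in these notes.

```c
#include <stdio.h>

#define MAXLINE 100

/* fgetline as defined in section 12.2 */
int fgetline(FILE *fp, char line[], int max)
{
    int nch = 0;
    int c;
    max = max - 1;                  /* leave room for '\0' */

    while((c = getc(fp)) != EOF) {
        if(c == '\n')
            break;
        if(nch < max) {
            line[nch] = c;
            nch = nch + 1;
        }
    }
    if(c == EOF && nch == 0)
        return EOF;
    line[nch] = '\0';
    return nch;
}

/* Hypothetical helper: return the number of lines in the named file,
   or -1 (after a message on stderr) if the file can't be opened. */
int count_lines(const char *filename)
{
    char line[MAXLINE];
    int nlines = 0;
    FILE *ifp = fopen(filename, "r");

    if(ifp == NULL) {
        fprintf(stderr, "can't open file %s\n", filename);
        return -1;
    }
    while(fgetline(ifp, line, MAXLINE) != EOF)
        nlines++;
    fclose(ifp);                    /* done with the file: release it */
    return nlines;
}
```

Because count_lines closes the file before returning, it can be called on many files in succession without running into the limit on open files.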

12.5 Example: Reading a Data File


Suppose you had a data file consisting of rows and columns of numbers:
    1    2    34
    5    6    78
    9    10   112

Suppose you wanted to read these numbers into an array. (Actually, the array will be an array of arrays, or a "multidimensional" array; see section 4.1.2.) We can write code to do this by putting together several pieces: the fgetline function we just showed, and the getwords function from chapter 10. Assuming that the data file is named input.dat, the code would look like this:
    #define MAXLINE 100
    #define MAXROWS 10
    #define MAXCOLS 10

    int array[MAXROWS][MAXCOLS];
    char *filename = "input.dat";
    FILE *ifp;
    char line[MAXLINE];
    char *words[MAXCOLS];
    int nrows = 0;
    int n;
    int i;

    ifp = fopen(filename, "r");
    if(ifp == NULL)
        {
        fprintf(stderr, "can't open %s\n", filename);
        exit(EXIT_FAILURE);
        }

    while(fgetline(ifp, line, MAXLINE) != EOF)
        {
        if(nrows >= MAXROWS)
            {
            fprintf(stderr, "too many rows\n");
            exit(EXIT_FAILURE);
            }

        n = getwords(line, words, MAXCOLS);

        for(i = 0; i < n; i++)
            array[nrows][i] = atoi(words[i]);
        nrows++;
        }

Each trip through the loop reads one line from the file, using fgetline. Each line is broken up into "words" using getwords; each "word" is actually one number. The numbers are however still represented as strings, so each one is converted to an int by calling atoi before being stored in the array. The code checks for two different error conditions (failure to open the input file, and too many lines in the input file) and if one of these conditions occurs, it prints an error message, and exits.

The exit function is a Standard library function which terminates your program. It is declared in <stdlib.h>, and accepts one argument, which will be the exit status of the program. EXIT_FAILURE is a code, also defined by <stdlib.h>, which indicates that the program failed. Success is indicated by a code of EXIT_SUCCESS, or simply 0. (These values can also be returned from main(); calling exit with a particular status value is essentially equivalent to returning that same status value from main.)
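The example above leans on our own fgetline and getwords. As a rough, self-contained sketch of the same per-line step, the function below converts one line of text into ints using the standard strtol, which (unlike atoi) reports where conversion stopped. The name parse_row is invented here for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: convert the whitespace-separated integers in
   `line` into row[0..max-1]. Returns how many were stored. */
int parse_row(const char *line, int row[], int max)
{
    int n = 0;
    char *end;

    while(n < max)
    {
        long val = strtol(line, &end, 10);
        if(end == line)     /* no digits found: end of line or junk */
            break;
        row[n++] = (int)val;
        line = end;         /* continue after the number just read */
    }
    return n;
}
```

Calling parse_row(line, array[nrows], MAXCOLS) inside the reading loop would fill one row of the array at a time.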

Chapter 13: Reading the Command Line


[This section corresponds to K&R Sec. 5.10]

We've mentioned several times that a program is rarely useful if it does exactly the same thing every time you run it. Another way of giving a program some variable input to work on is by invoking it with command line arguments.

(We should probably admit that command line user interfaces are a bit old-fashioned, and currently somewhat out of favor. If you've used Unix or MS-DOS, you know what a command line is, but if your experience is confined to the Macintosh or Microsoft Windows or some other Graphical User Interface, you may never have seen a command line. In fact, if you're learning C on a Mac or under Windows, it can be tricky to give your program a command line at all. Think C for the Macintosh provides a way; I'm not sure about other compilers. If your compilation environment doesn't provide an easy way of simulating an old-fashioned command line, you may skip this chapter.)

C's model of the command line is that it consists of a sequence of words, typically separated by whitespace. Your main program can receive these words as an array of strings, one word per string. In fact, the C run-time startup code is always willing to pass you this array, and all you have to do to receive it is to declare main as accepting two parameters, like this:
    int main(int argc, char *argv[])
    {
        ...
    }

When main is called, argc will be a count of the number of command-line arguments, and argv will be an array ("vector") of the arguments themselves. Since each word is a string which is represented as a pointer-to-char, argv is an array-of-pointers-to-char. Since we are not defining the argv array, but merely declaring a parameter which references an array somewhere else (namely, in main's caller, the run-time startup code), we do not have to supply an array dimension for argv. (Actually, since functions never receive arrays as parameters in C, argv can also be thought of as a pointer-to-pointer-to-char, or char **. But multidimensional arrays and pointers to pointers can be confusing, and we haven't covered them, so we'll talk about argv as if it were an array.) (Also, there's nothing magic about the names argc and argv. You can give main's two parameters any names you like, as long as they have the appropriate types. The names argc and argv are traditional.)

The first program to write when playing with argc and argv is one which simply prints its arguments:
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int i;

        /* argv[0] is the program name; argv[1] onward are the arguments */
        for(i = 0; i < argc; i++)
            printf("arg %d: %s\n", i, argv[i]);
        return 0;
    }

(This program is essentially the Unix or MS-DOS echo command.)

If you run this program, you'll discover that the set of "words" making up the command line includes the command you typed to invoke your program (that is, the name of your program). In other words, argv[0] typically points to the name of your program, and argv[1] is the first argument.

There are no hard-and-fast rules for how a program should interpret its command line. There is one set of conventions for Unix, another for MS-DOS, another for VMS. Typically you'll loop over the arguments, perhaps treating some as option flags and others as actual arguments (input files, etc.), interpreting or acting on each one. Since each argument is a string, you'll have to use strcmp or the like to match arguments against any patterns you might be looking for. Remember that argc contains the number of words on the command line, and that argv[0] is the command name, so if argc is 1, there are no arguments to inspect. (You'll never want to look at argv[i], for i >= argc, because it will be a null or invalid pointer.)

As another example, also illustrating fopen and the file I/O techniques of the previous chapter, here is a program which copies one or more input files to its standard output. Since "standard output" is usually the screen by default, this is therefore a useful program for displaying files. (It's analogous to the obscurely-named Unix cat command, and to the MS-DOS type command.) You might also want to compare this program to the character-copying program of section 6.2.
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int i;
        FILE *fp;
        int c;

        for(i = 1; i < argc; i++)
        {
            fp = fopen(argv[i], "r");
            if(fp == NULL)
            {
                fprintf(stderr, "cat: can't open %s\n", argv[i]);
                continue;       /* skip this file, try the next */
            }

            while((c = getc(fp)) != EOF)
                putchar(c);

            fclose(fp);
        }

        return 0;
    }
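The argument loop described earlier in this chapter (some arguments treated as option flags, others as file names, matched with strcmp) might be sketched as follows. The -v flag and the function name count_files are assumptions made up for this example; a real program would follow whatever conventions its platform uses.

```c
#include <string.h>

/* Hypothetical helper: scan the command line, setting *verbose if
   "-v" appears and returning the number of non-option arguments. */
int count_files(int argc, char *argv[], int *verbose)
{
    int i;
    int nfiles = 0;

    *verbose = 0;
    for(i = 1; i < argc; i++)       /* argv[0] is the program name */
    {
        if(strcmp(argv[i], "-v") == 0)
            *verbose = 1;
        else
            nfiles++;
    }
    return nfiles;
}
```

main would simply pass its own argc and argv along, as in count_files(argc, argv, &verbose).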

As a historical note, the Unix cat program is so named because it can be used to concatenate two files together, like this:

    cat a b > c

This illustrates why it's a good idea to print error messages to stderr, so that they don't get redirected. The "can't open file" message in this example also includes the name of the program as well as the name of the file.

Yet another piece of information which it's usually appropriate to include in error messages is the reason why the operation failed, if known. For operating system problems, such as inability to open a file, a code indicating the error is often stored in the global variable errno. The standard library function strerror will convert an errno value to a human-readable error message string. Therefore, an even more informative error message printout would be
    fp = fopen(argv[i], "r");
    if(fp == NULL)
        fprintf(stderr, "cat: can't open %s: %s\n",
                argv[i], strerror(errno));

If you use code like this, you can #include <errno.h> to get the declaration for errno, and <string.h> to get the declaration for strerror().
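Putting those pieces together with the needed headers gives something like the sketch below. The function name open_or_complain is hypothetical. One caveat, stated as an assumption: the C standard does not itself promise that fopen sets errno on failure, although Unix-like systems do, so the reason printed may be uninformative elsewhere.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>

/* Hypothetical helper: try to open a file for reading; on failure,
   print "prog: can't open file: reason" to stderr and return NULL. */
FILE *open_or_complain(const char *prog, const char *filename)
{
    FILE *fp;

    errno = 0;                      /* so a stale value isn't reported */
    fp = fopen(filename, "r");
    if(fp == NULL)
        fprintf(stderr, "%s: can't open %s: %s\n",
                prog, filename, strerror(errno));
    return fp;
}
```

In the cat program, the call would be fp = open_or_complain("cat", argv[i]), followed by continue if the result is NULL.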

Chapter 14: What's Next?


This last handout contains a brief list of the significant topics in C which we have not covered, and which you'll want to investigate further if you want to know all of C.

Types and Declarations


We have not talked about the void, short int, and long double types. void is a type with no values, used as a placeholder to indicate functions that do not return values or that accept no arguments, and in the "generic" pointer type void * that can point to anything. short int is an integer type that might use less space than a plain int; long double is a floating-point type that might have even more range or precision than plain double.

The char type and the various sizes of int also have "unsigned" versions, which are declared using the keyword unsigned. Unsigned types cannot hold negative values but have guaranteed properties on overflow. (Whether a plain char is signed or unsigned is implementation-defined; you can use the keyword signed to force a character type to contain signed characters.) Unsigned types are also useful when manipulating individual bits and bytes, when "sign extension" might otherwise be a problem.

Two additional type qualifiers const and volatile allow you to declare variables (or pointers to data) which you promise not to change, or which might change in unexpected ways behind the program's back.

There are user-defined structure and union types. A structure or struct is a "record" consisting of one or more values of one or more types combined into one entity which can be manipulated as a whole. A union is a type which, at any one time, can hold a value from one of a specified set of types.

There are user-defined enumeration types ("enum") which are like integers but which are meant to hold values from some fixed, predefined set, and for which the values are referred to by name instead of by number.

Pointers can point to functions as well as to data.

Types can be arbitrarily complicated, when you start using multiple levels of pointers, arrays, functions, structures, and/or unions. Eventually, it's important to understand the concept of a declarator: in the declaration

    int i, *ip, *fpi();

we have the base type int and three declarators i, *ip, and *fpi(). The declarator gives the name of a variable (or function) and also indicates whether it is a simple variable or a pointer, array, function, or some more elaborate combination (array of pointers, function returning pointer, etc.). In the example, i is declared to be a plain int, ip is declared to be a pointer to int, and fpi is declared to be a function returning pointer to int. (Complicated declarators may also contain parentheses for grouping, since there's a precedence hierarchy in declarators as well as expressions: [] for arrays and () for functions have higher precedence than * for pointers.)

We have not said much about pointers to pointers, or arrays of arrays (i.e. multidimensional arrays), or the ramifications of array/pointer equivalence on multidimensional arrays. (In particular, a reference to an array of arrays does not generate a pointer to a pointer; it generates a pointer to an array. You cannot pass a multidimensional array to a function which accepts pointers to pointers.)

Variables can be declared, with the register keyword, with a hint that they be placed in high-speed CPU registers, for efficiency. (These hints are rarely needed or used today, because modern compilers do a good job of register allocation by themselves, without hints.)

A mechanism called typedef allows you to define user-defined aliases (i.e. new and perhaps more-convenient names) for other types.
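To give a flavor of how several of these features fit together, here is a small sketch using an enum, a struct containing a union, a typedef, and a pointer to a function. All of the names (shape, area_func, and so on) are invented for this example.

```c
enum shape_kind { CIRCLE, RECTANGLE };      /* named integer values */

struct shape                                /* a "record" */
{
    enum shape_kind kind;
    union                                   /* holds one member at a time */
    {
        double radius;                      /* used when kind == CIRCLE */
        struct { double w, h; } rect;       /* used when kind == RECTANGLE */
    } u;
};

/* area_func is an alias for "pointer to function taking a
   pointer to struct shape and returning double" */
typedef double (*area_func)(const struct shape *);

double circle_area(const struct shape *s)
{
    return 3.14159265 * s->u.radius * s->u.radius;
}

double rect_area(const struct shape *s)
{
    return s->u.rect.w * s->u.rect.h;
}

/* Pick the function that matches the shape, then call through it. */
double area(const struct shape *s)
{
    area_func f = (s->kind == CIRCLE) ? circle_area : rect_area;
    return f(s);
}
```

The kind member records which member of the union is currently meaningful; nothing in the language checks this for you.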

Operators
The bitwise operators &, |, ^, and ~ operate on integers thought of as binary numbers or strings of bits. The & operator is bitwise AND, the | operator is bitwise OR, the ^ operator is bitwise exclusive-OR (XOR), and the ~ operator is a bitwise negation or complement. (&, |, and ^ are "binary" in that they take two operands; ~ is unary.) These operators let you work with the individual bits of a variable; one common use is to treat an integer as a set of single-bit flags. You might define the 3rd (2**2) bit as the "verbose" flag bit by defining

    #define VERBOSE 4

Then you can "turn the verbose bit on" in an integer variable flags by executing

    flags = flags | VERBOSE;

or

    flags |= VERBOSE;

and turn it off with

    flags = flags & ~VERBOSE;

or

    flags &= ~VERBOSE;

and test whether it's set with

    if(flags & VERBOSE)
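The three flag operations can be wrapped up as tiny functions, sketched below. The function names and the second flag DEBUG are assumptions added for illustration, and the sketch uses unsigned int for the flag word since unsigned types are the safer choice for bit manipulation.

```c
#define VERBOSE 4       /* the 3rd bit, as in the text */
#define DEBUG   8       /* a second flag, assumed for illustration */

/* Return flags with the given bit turned on. */
unsigned int set_flag(unsigned int flags, unsigned int bit)
{
    return flags | bit;
}

/* Return flags with the given bit turned off. */
unsigned int clear_flag(unsigned int flags, unsigned int bit)
{
    return flags & ~bit;
}

/* Return 1 if the given bit is set in flags, 0 otherwise. */
int test_flag(unsigned int flags, unsigned int bit)
{
    return (flags & bit) != 0;
}
```

Because each flag occupies its own bit, setting or clearing one leaves the others untouched.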

The left-shift and right-shift operators << and >> let you shift an integer left or right by some number of bit positions; for example, value << 2 shifts value left by two bits.

The ?: or conditional operator (also called the "ternary operator") essentially lets you embed an if/then statement in an expression. The assignment

    a = expr ? b : c;

is roughly equivalent to

    if(expr)
        a = b;
    else
        a = c;

Since you can use ?: anywhere in an expression, it can do things that if/then can't, or that would be cumbersome with if/then. For example, the function call

    f(a, b, c ? d : e);

is roughly equivalent to

    if(c)
        f(a, b, d);
    else
        f(a, b, e);

(Exercise: what would the call

    g(a, b, c ? d : e, h ? i : j, k);

be equivalent to?)

The comma operator lets you put two separate expressions where one is required; the expressions are executed one after the other. The most common use for comma operators is when you want multiple variables controlling a for loop, for example:

    for(i = 0, j = 10; i < j; i++, j--)
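A classic use of exactly this two-variable loop is reversing a string in place, with i climbing from the front while j descends from the back. The sketch below is one way to write it; the function name reverse is our own choice.

```c
#include <string.h>

/* Reverse a string in place, walking i up from the start
   and j down from the end until they meet. */
void reverse(char *s)
{
    size_t i, j;
    char tmp;

    if(s[0] == '\0')
        return;     /* empty string: strlen(s) - 1 would wrap around */

    for(i = 0, j = strlen(s) - 1; i < j; i++, j--)
    {
        tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
    }
}
```

Both the initialization and the increment parts of the for loop use the comma operator to handle i and j together.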

A cast operator allows you to explicitly force conversion of a value from one type to another. A cast consists of a type name in parentheses. For example, you could convert an int to a double by typing

    int i = 10;
    double d;
    d = (double)i;

(In this case, though, the cast is redundant, since this is a conversion that C would have performed for you automatically, i.e. if you'd just said d = i.) You use explicit casts in those circumstances where C does not do a needed conversion automatically. One example is division: if you're dividing two integers and you want a floating-point result, you must explicitly force at least one of the operands to floating-point, otherwise C will perform an integer division and will discard the remainder. The code

    int i = 1, j = 2;
    double d = i / j;

will set d to 0, but

    d = (double)i / j;

will set d to 0.5.

You can also "cast to void" to explicitly indicate that you're ignoring a function's return value, as in

    (void)fclose(fp);

or

    (void)printf("Hello, world!\n");

(Usually, it's a bad idea to ignore return values, but in some cases it's essentially inevitable, and the (void) cast keeps some compilers from issuing warnings every time you ignore a value.)

There's a precise, mildly elaborate set of rules which C uses for converting values automatically, in the absence of explicit casts.

The . and -> operators let you access the members (components) of structures and unions.
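The division case is the cast you're most likely to need in practice, so here it is as a pair of functions for comparison. This is only a sketch; the names average and truncated_average are made up.

```c
/* Average of `count` values whose total is `sum`.
   The cast makes this a floating-point division. */
double average(int sum, int count)
{
    return (double)sum / count;
}

/* The same computation without the cast, kept for comparison. */
double truncated_average(int sum, int count)
{
    return sum / count;     /* integer division happens first */
}
```

With a sum of 7 and a count of 2, the first function yields 3.5 and the second 3.0: by the time the result is converted to double, the remainder has already been discarded.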

Statements
The switch statement allows you to jump to one of a number of numeric case labels depending on the value of an expression; it's more convenient than a long if/else chain. (However, you can use switch only when the expression is integral and all of the case labels are compile-time constants.)

The do/while loop is a loop that tests its controlling expression at the bottom of the loop, so that the body of the loop always executes at least once even if the condition is initially false. (C's do/while loop is therefore like Pascal's repeat/until loop, while C's while loop is like Pascal's while/do loop.)

Finally, when you really need to write "spaghetti code," C does have the all-purpose goto statement, and labels to go to.
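Here is a brief sketch of the two statements we skipped. The functions count_digits and is_vowel are invented examples, chosen because each one suits its statement naturally.

```c
/* Count the decimal digits of n. A do/while loop fits because
   even 0 has one digit, so the body must run at least once. */
int count_digits(int n)
{
    int digits = 0;

    do
    {
        digits++;
        n /= 10;
    } while(n != 0);

    return digits;
}

/* Classify a character with switch; several case labels
   can share one block of code. */
int is_vowel(int c)
{
    switch(c)
    {
    case 'a': case 'e': case 'i': case 'o': case 'u':
        return 1;
    default:
        return 0;
    }
}
```

Written with an ordinary while loop, count_digits would need a special case for 0; written with if/else, is_vowel would need five separate comparisons.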

Functions
Functions can't return arrays, and it's tricky to write a function as if it returns an array (perhaps by simulating the array with a pointer) because you have to be careful about allocating the memory that the returned pointer points to.

The functions we've written have all accepted a well-defined, fixed number of arguments. printf accepts a variable number of arguments (depending on how many % signs there are in the format string) but we haven't seen how to declare and write functions that do this.
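One common way of "returning an array" is to allocate it with malloc and return the pointer, leaving the caller responsible for freeing it. The sketch below shows the idea under that assumption; make_squares is a made-up name.

```c
#include <stdlib.h>

/* Hypothetical helper: return a newly allocated array holding
   0*0, 1*1, ..., (n-1)*(n-1). Returns NULL if n <= 0 or if
   allocation fails. The caller owns the memory and must free() it. */
int *make_squares(int n)
{
    int i;
    int *a;

    if(n <= 0)
        return NULL;

    a = malloc(n * sizeof(int));
    if(a == NULL)
        return NULL;

    for(i = 0; i < n; i++)
        a[i] = i * i;
    return a;
}
```

Returning a pointer to a local array instead would not work, since the local array ceases to exist when the function returns.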

C Preprocessor
If you're careful, it's possible (and can be useful) to use #include within a header file, so that you end up with "nested header files."

It's possible to use #define to define "function-like" macros that accept arguments; the expansion of the macro can therefore depend on the arguments it's "invoked" with.

Two special preprocessing operators # and ## let you control the expansion of macro arguments in fancier ways.

The preprocessor directive #if lets you conditionally include (or, with #else, conditionally not include) a section of code depending on some arbitrary compile-time expression. (#if can also do the same macro-definedness tests as #ifdef and #ifndef, because the expression can use a defined() operator.)

Other preprocessing directives are #elif, #error, #line, and #pragma.

There are a few predefined preprocessor macros, some required by the C standard, others perhaps defined by particular compilation environments. These are useful for conditional compilation (#ifdef, #ifndef).
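A few of these preprocessor features are sketched below. Every macro name here (SQUARE, STR, GLUE, GR_EXAMPLE_DEBUG, LOG_LEVEL) is an assumption chosen for illustration.

```c
#define SQUARE(x)   ((x) * (x))     /* function-like macro */
#define STR(x)      #x              /* # turns the argument into a string */
#define GLUE(a, b)  a ## b          /* ## pastes two tokens together */

#if defined(GR_EXAMPLE_DEBUG)       /* hypothetical macro name */
#define LOG_LEVEL 2
#else
#define LOG_LEVEL 0
#endif

/* GLUE(max, _of_two) expands to the single name max_of_two */
int GLUE(max, _of_two)(int a, int b)
{
    return a > b ? a : b;
}
```

The extra parentheses in SQUARE matter: without them, SQUARE(1 + 2) would expand to 1 + 2 * 1 + 2. Even with them, an argument with side effects such as SQUARE(i++) is evaluated twice, which is one reason function-like macros call for care.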

Standard Library Functions


C's standard library contains many features and functions which we haven't seen.

We've seen many of printf's formatting capabilities, but not all. Besides format specifier characters for a few types we haven't seen, you can also control the width, precision, justification (left or right) and a few other attributes of printf's format conversions. (In their full complexity, printf formats are about as elaborate and powerful as FORTRAN format statements.)

A scanf function lets you do "formatted input" analogous to printf's formatted output. scanf reads from the standard input; a variant fscanf reads from a specified file pointer.

The sprintf and sscanf functions let you "print" and "read" to and from in-memory strings instead of files. We've seen that atoi lets you convert a numeric string into an integer; the inverse operation can be performed with sprintf:

    int i = 10;
    char str[10];
    sprintf(str, "%d", i);
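As a small sketch of the round trip, the two functions below convert an int to a string with sprintf and back again with sscanf. The function names are invented; note that sprintf does no bounds checking, so the caller must supply a large enough buffer.

```c
#include <stdio.h>

/* Convert an int to its decimal string form.
   buf must have room for the digits, a sign, and the \0. */
void int_to_string(int value, char *buf)
{
    sprintf(buf, "%d", value);
}

/* Convert a decimal string back to an int with sscanf.
   Returns 1 on success, 0 if no number could be read. */
int string_to_int(const char *s, int *value)
{
    return sscanf(s, "%d", value) == 1;
}
```

Unlike atoi, the sscanf version can tell the caller that the string did not contain a number at all.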

We've used printf and fprintf to write formatted output, and getchar, getc, putchar, and putc to read and write characters. There are also functions gets, fgets, puts, and fputs for reading and writing lines (though we rarely need these, especially if we're using our own getline and maybe fgetline), and also fread and fwrite for reading or writing arbitrary numbers of characters. (gets has no way to limit how much it reads, so fgets is the safer choice.)

It's possible to "un-read" a character, that is, to push it back on an input stream, with ungetc. (This is useful if you accidentally read one character too far, and would prefer that some other part of your program read that character instead.)

You can use the ftell, fseek, and rewind functions to jump around in files, performing random access (as opposed to sequential) I/O.

The feof and ferror functions will tell you whether you got EOF due to an actual end-of-file condition or due to a read error of some sort. You can clear errors and end-of-file conditions with clearerr.

You can open files in "binary" mode, or for simultaneous reading and writing. (These options involve extra characters appended to fopen's mode string: b for binary, + for read/write.)

There are several more string functions in <string.h>. A second set of string functions strncpy, strncat, and strncmp all accept a third argument telling them to stop after n characters if they haven't found the \0 marking the end of the string. A third set of "mem" functions, including memcpy and memcmp, operate on blocks of memory which aren't necessarily strings and where \0 is not treated as a terminator. The strchr and strrchr functions find characters in strings. There is a motley collection of "span" and "scan" functions, strspn, strcspn, and strpbrk, for searching out or skipping over sequences of characters all drawn from a specified set of characters. The strtok function aids in breaking up a string into words or "tokens," much like our own getwords function.
The header file <ctype.h> contains several functions which let you classify and manipulate characters: check for letters or digits, convert between upper- and lower-case, etc.

A host of mathematical functions are defined in the header file <math.h>. (As we've mentioned, besides including <math.h>, you may on some Unix systems have to ask for a special library containing the math functions while compiling/linking.)
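To show how strtok, mentioned above, can stand in for something like getwords, here is a rough sketch. The function split_words is hypothetical and is not the getwords from chapter 10, though it takes its arguments in the same order.

```c
#include <string.h>

/* Hypothetical stand-in for getwords, built on strtok: split `line`
   at spaces, tabs, and newlines, storing up to `max` word pointers.
   strtok overwrites separators in `line` with \0 as it goes. */
int split_words(char *line, char *words[], int max)
{
    int n = 0;
    char *p = strtok(line, " \t\n");

    while(p != NULL && n < max)
    {
        words[n++] = p;
        p = strtok(NULL, " \t\n");  /* NULL means "keep going" */
    }
    return n;
}
```

Since strtok modifies the string it is given, the line must be a writable array, not a string literal.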

There's a random-number generator, rand, and a way to "seed" it, srand. rand returns integers from 0 up to RAND_MAX (where RAND_MAX is a constant #defined in <stdlib.h>). One way of getting random integers from 1 to n is to call

    (int)(rand() / (RAND_MAX + 1.0) * n) + 1

Another way is

    rand() / (RAND_MAX / n + 1) + 1

It seems like it would be simpler to just say

    rand() % n + 1

but this method is imperfect (or rather, it's imperfect if n is a power of two and your system's implementation of rand() is imperfect, as all too many of them are).

Several functions let you interact with the operating system under which your program is running. The exit function returns control to the operating system immediately, terminating your program and returning an "exit status." The getenv function allows you to read your operating system's or process's "environment variables" (if any). The system function allows you to invoke an operating-system command (i.e. another program) from within your program.

The qsort function allows you to sort an array (of any type); you supply a comparison function (via a function pointer) which knows how to compare two array elements, and qsort does the rest. The bsearch function allows you to search for elements in sorted arrays; it, too, operates in terms of a caller-supplied comparison function.

Several functions (time, asctime, ctime, gmtime, localtime, mktime, difftime, and strftime) allow you to determine the current date and time, print dates and times, and perform other date/time manipulations. For example, to print today's date in a program, you can write
    #include <time.h>

    time_t now;
    now = time((time_t *)NULL);
    printf("It's %.24s", ctime(&now));
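Returning to qsort and bsearch: the only part you have to write yourself is the comparison function. Here is a minimal sketch for arrays of int; the names compare_ints, sort_ints, and find_int are invented for this example.

```c
#include <stdlib.h>

/* Comparison function in the form qsort expects: it receives
   pointers to two elements and returns <0, 0, or >0. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);   /* avoids overflow from x - y */
}

/* Sort an array of n ints into ascending order. */
void sort_ints(int *a, size_t n)
{
    qsort(a, n, sizeof a[0], compare_ints);
}

/* Look for key in a sorted array; returns its index or -1. */
int find_int(const int *a, size_t n, int key)
{
    const int *p = bsearch(&key, a, n, sizeof a[0], compare_ints);

    return p == NULL ? -1 : (int)(p - a);
}
```

The same comparison function serves both calls, which is typical: bsearch needs the array to be sorted in the order that function defines.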

The header file <stdarg.h> lets you manipulate variable-length function argument lists (such as the ones printf is called with). Additional members of the printf family of functions let you write your own functions which accept printf-like format specifiers and variable numbers of arguments but call on the standard printf to do most of the work.

There are facilities for dealing with multibyte and "wide" characters and strings, for use with multinational character sets.
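Both ideas can be sketched briefly: a function that walks its own variable-length argument list with <stdarg.h>, and a printf-like wrapper that hands the list to vfprintf. The names sum_ints and error_printf are hypothetical.

```c
#include <stdio.h>
#include <stdarg.h>

/* Hypothetical variadic function: add up `count` int arguments. */
int sum_ints(int count, ...)
{
    va_list ap;
    int i, total = 0;

    va_start(ap, count);
    for(i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Hypothetical printf-like wrapper: prefix the message with
   "error: " and let vfprintf do the formatting. Returns the
   value returned by vfprintf. */
int error_printf(FILE *fp, const char *fmt, ...)
{
    va_list ap;
    int r;

    fputs("error: ", fp);
    va_start(ap, fmt);
    r = vfprintf(fp, fmt, ap);
    va_end(ap);
    return r;
}
```

A variadic function has no built-in way of knowing how many arguments it was passed, which is why sum_ints takes an explicit count and printf relies on its format string.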
