
Chapter 1: Introduction

C is (as K&R admit) a relatively small language, but one which (to its
admirers, anyway) wears well. C's small, unambitious feature set is a real
advantage: there's less to learn; there isn't excess baggage in the way when you
don't need it. It can also be a disadvantage: since it doesn't do everything for
you, there's a lot you have to do yourself. (Actually, this is viewed by many as an
additional advantage: anything the language doesn't do for you, it doesn't dictate
to you, either, so you're free to do that something however you want.)

C is sometimes referred to as a "high-level assembly language." Some people
think that's an insult, but it's actually a deliberate and significant aspect of the
language. If you have programmed in assembly language, you'll probably find C
very natural and comfortable (although if you continue to focus too heavily on
machine-level details, you'll probably end up with unnecessarily nonportable
programs). If you haven't programmed in assembly language, you may be
frustrated by C's lack of certain higher-level features. In either case, you should
understand why C was designed this way: so that seemingly-simple
constructions expressed in C would not expand to arbitrarily expensive (in time or
space) machine language constructions when compiled. If you write a C program
simply and succinctly, it is likely to result in a succinct, efficient machine
language executable. If you find that the executable program resulting from a C
program is not efficient, it's probably because of something silly you did, not
because of something the compiler did behind your back which you have no
control over. In any case, there's no point in complaining about C's low-level
flavor: C is what it is.

A programming language is a tool, and no tool can perform every task unaided. If
you're building a house, and I'm teaching you how to use a hammer, and you ask
how to assemble rafters and trusses into gables, that's a legitimate question, but
the answer has fallen out of the realm of "How do I use a hammer?" and into
"How do I build a house?". In the same way, we'll see that C does not have
built-in features to perform every function that we might ever need to do while
programming.

As mentioned above, C imposes relatively few built-in ways of doing things on
the programmer. Some common tasks, such as manipulating strings, allocating
memory, and doing input/output (I/O), are performed by calling on library
functions. Other tasks which you might want to do, such as creating or listing
directories, or interacting with a mouse, or displaying windows or other
user-interface elements, or doing color graphics, are not defined by the C language at
all. You can do these things from a C program, of course, but you will be calling
on services which are peculiar to your programming environment (compiler,
processor, and operating system) and which are not defined by the C standard.
Since this course is about portable C programming, it will also be steering clear
of facilities not provided in all C environments.

Another aspect of C that's worth mentioning here is that it is, to put it bluntly, a bit
dangerous. C does not, in general, try hard to protect a programmer from
mistakes. If you write a piece of code which will (through some oversight of
yours) do something wildly different from what you intended it to do, up to and
including deleting your data or trashing your disk, and if it is possible for the
compiler to compile it, it generally will. You won't get warnings of the form "Do
you really mean to...?" or "Are you sure you really want to...?". C is often
compared to a sharp knife: it can do a surgically precise job on some exacting
task you have in mind, but it can also do a surgically precise job of cutting off
your finger. It's up to you to use it carefully.

This aspect of C is very widely criticized; it is also used (justifiably) to argue that
C is not a good teaching language. C aficionados love this aspect of C because it
means that C does not try to protect them from themselves: when they know
what they're doing, even if it's risky or obscure, they can do it. Students of C hate
this aspect of C because it often seems as if the language is some kind of a
conspiracy specifically designed to lead them into booby traps and "gotcha!"s.

This is another aspect of the language which it's fairly pointless to complain
about. If you take care and pay attention, you can avoid many of the pitfalls.
These notes will point out many of the obvious (and not so obvious) trouble
spots.

1.1 A First Example
1.2 Second Example
1.3 Program Structure

1.1 A First Example
[This section corresponds to K&R Sec. 1.1]

The best way to learn programming is to dive right in and start writing real
programs. This way, concepts which would otherwise seem abstract make
sense, and the positive feedback you get from getting even a small program to
work gives you a great incentive to improve it or write the next one.

Diving in with "real" programs right away has another advantage, if only
pragmatic: if you're using a conventional compiler, you can't run a fragment of a
program and see what it does; nothing will run until you have a complete (if tiny
or trivial) program. You can't learn everything you'd need to write a complete
program all at once, so you'll have to take some things "on faith" and parrot
them in your first programs before you begin to understand them. (You can't
learn to program just one expression or statement at a time any more than you
can learn to speak a foreign language one word at a time. If all you know is a
handful of words, you can't actually say anything: you also need to know
something about the language's word order and grammar and sentence structure
and declension of articles and verbs.)

Besides the occasional necessity to take things on faith, there is a more serious
potential drawback of this "dive in and program" approach: it's a small step from
learning-by-doing to learning-by-trial-and-error, and when you learn programming
by trial-and-error, you can very easily learn many errors. When you're not sure
whether something will work, or you're not even sure what you could use that
might work, and you try something, and it does work, you do not have any
guarantee that what you tried worked for the right reason. You might just have
"learned" something that works only by accident or only on your compiler, and it
may be very hard to un-learn it later, when it stops working.

Therefore, whenever you're not sure of something, be very careful before you go
off and try it "just to see if it will work." Of course, you can never be absolutely
sure that something is going to work before you try it, otherwise we'd never have
to try things. But you should have an expectation that something is going to work
before you try it, and if you can't predict how to do something or whether
something would work and find yourself having to determine it experimentally,
make a note in your mind that whatever you've just learned (based on the
outcome of the experiment) is suspect.

The first example program in K&R is the first example program in any language:
print or display a simple string, and exit. Here is my version of K&R's "hello,
world" program:

    #include <stdio.h>

    int main()
    {
        printf("Hello, world!\n");
        return 0;
    }

If you have a C compiler, the first thing to do is figure out how to type this
program in and compile it and run it and see where its output went. (If you don't
have a C compiler yet, the first thing to do is to find one.)

The first line is practically boilerplate; it will appear in almost all programs we
write. It asks that some definitions having to do with the "Standard I/O Library"
be included in our program; these definitions are needed if we are to call the
library function printf correctly.

The second line says that we are defining a function named main. Most of the
time, we can name our functions anything we want, but the function name main is
special: it is the function that will be "called" first when our program starts
running. The empty pair of parentheses indicates that our main function accepts
no arguments, that is, there isn't any information which needs to be passed in
when the function is called.

The braces { and } surround a list of statements in C. Here, they surround the list
of statements making up the function main.

The line

    printf("Hello, world!\n");

is the first statement in the program. It asks that the function printf be called;
printf is a library function which prints formatted output. The parentheses surround
printf's argument list: the information which is handed to it which it should act on.
The semicolon at the end of the line terminates the statement.

(printf's name reflects the fact that C was first developed when Teletypes and
other printing terminals were still in widespread use. Today, of course, video
displays are far more common. printf "prints" to the standard output, that is,
to the default location for program output to go. Nowadays, that's almost always
a video screen or a window on that screen. If you do have a printer, you'll
typically have to do something extra to get a program to print to it.)

printf's first (and, in this case, only) argument is the string which it should print.
The string, enclosed in double quotes "", consists of the words "Hello, world!"
followed by a special sequence: \n. In strings, any two-character sequence
beginning with the backslash \ represents a single special character. The
sequence \n represents the "new line" character, which prints a carriage return
or line feed or whatever it takes to end one line of output and move down to the
next. (This program only prints one line of output, but it's still important to
terminate it.)

The second line in the main function is

    return 0;

In general, a function may return a value to its caller, and main is no exception.
When main returns (that is, reaches its end and stops functioning), the program is
at its end, and the return value from main tells the operating system (or whatever
invoked the program that main is the main function of) whether it succeeded or
not. By convention, a return value of 0 indicates success.

This program may look so absolutely trivial that it seems as if it's not even worth
typing it in and trying to run it, but doing so may be a big (and is certainly a vital)
first hurdle. On an unfamiliar computer, it can be arbitrarily difficult to figure out
how to enter a text file containing program source, or how to compile and link it,
or how to invoke it, or what happened after (if?) it ran. The most experienced C
programmers immediately go back to this one, simple program whenever they're
trying out a new system or a new way of entering or building programs or a new
way of printing output from within programs. As Kernighan and Ritchie say,
everything else is comparatively easy.

How you compile and run this (or any) program is a function of the compiler and
operating system you're using. The first step is to type it in, exactly as shown; this
may involve using a text editor to create a file containing the program text. You'll
have to give the file a name, and all C compilers (that I've ever heard of) require
that files containing C source end with the extension .c. So you might place the
program text in a file called hello.c.

The second step is to compile the program. (Strictly speaking, compilation
consists of two steps, compilation proper followed by linking, but we can overlook
this distinction at first, especially because the compiler often takes care of
initiating the linking step automatically.) On many Unix systems, the command to
compile a C program from a source file hello.c is

    cc -o hello hello.c

You would type this command at the Unix shell prompt, and it requests that the cc
(C compiler) program be run, placing its output (i.e. the new executable program
it creates) in the file hello, and taking its input (i.e. the source code to be
compiled) from the file hello.c.

The third step is to run (execute, invoke) the newly-built hello program. Again on a
Unix system, this is done simply by typing the program's name:

    hello

Depending on how your system is set up (in particular, on whether the current
directory is searched for executables, based on the PATH variable), you may
have to type

    ./hello

to indicate that the hello program is in the current directory (as opposed to some
"bin" directory full of executable programs, elsewhere).

You may also have your choice of C compilers. On many Unix machines, the cc
command is an older compiler which does not recognize modern, ANSI Standard
C syntax. An old compiler will accept the simple programs we'll be starting with,
but it will not accept most of our later programs. If you find yourself getting
baffling compilation errors on programs which you've typed in exactly as they're
shown, it probably indicates that you're using an older compiler. On many
machines, another compiler called acc or gcc is available, and you'll want to use it,
instead. (Both acc and gcc are typically invoked the same as cc; that is, the above
cc command would instead be typed, say, gcc -o hello hello.c .)

(One final caveat about Unix systems: don't name your test programs test,
because there's already a standard command called test, and you and the
command interpreter will get badly confused if you try to replace the system's test
command with your own, not least because your own almost certainly does
something completely different.)

Under MS-DOS, the compilation procedure is quite similar. The name of the
command you type will depend on your compiler (e.g. cl for the Microsoft C
compiler, tc or bcc for Borland's Turbo C, etc.). You may have to manually
perform the second, linking step, perhaps with a command named link or tlink. The
executable file which the compiler/linker creates will have a name ending in .exe
(or perhaps .com), but you can still invoke it by typing the base name (e.g. hello).
See your compiler documentation for complete details; one of the manuals
should contain a demonstration of how to enter, compile, and run a small
program that prints some simple output, just as we're trying to describe here.

In an integrated or "visual" programming environment, such as those on the
Macintosh or under various versions of Microsoft Windows, the steps you take to
enter, compile, and run a program are somewhat different (and, theoretically,
simpler). Typically, there is a way to open a new source window, type source
code into it, give it a file name, and add it to the program (or "project") you're
building. If necessary, there will be a way to specify what other source files (or
"modules") make up the program. Then, there's a button or menu selection
which compiles and runs the program, all from within the programming
environment. (There will also be a way to create a standalone executable file
which you can run from outside the environment.) In a PC-compatible
environment, you may have to choose between creating DOS programs or
Windows programs. (If you have troubles pertaining to the printf function, try
specifying a target environment of MS-DOS. Supposedly, some compilers which
are targeted at Windows environments won't let you call printf, because until
you call some fancier functions to request that a window be created, there's no
window for printf to print to.) Again, check the introductory or tutorial manual
that came with the programming package; it should walk you through the steps
necessary to get your first program running.

1.2 Second Example
Our second example is of little more practical use than the first, but it introduces
a few more programming language elements:

    #include <stdio.h>

    /* print a few numbers, to illustrate a simple loop */

    int main()
    {
        int i;

        for(i = 0; i < 10; i = i + 1)
            printf("i is %d\n", i);

        return 0;
    }

As before, the line #include <stdio.h> is boilerplate which is necessary since we're
calling the printf function, and main() and the pair of braces {} indicate and
delineate the function named main we're (again) writing.

The first new line is the line

    /* print a few numbers, to illustrate a simple loop */

which is a comment. Anything between the characters /* and */ is ignored by the
compiler, but may be useful to a person trying to read and understand the
program. You can add comments anywhere you want to in the program, to
document what the program is, what it does, who wrote it, how it works, what the
various functions are for and how they work, what the various variables are for,
etc.

The second new line, down within the function main, is

    int i;

which declares that our function will use a variable named i. The variable's type is
int, which is a plain integer.

Next, we set up a loop:

    for(i = 0; i < 10; i = i + 1)

The keyword for indicates that we are setting up a "for loop." A for loop is
controlled by three expressions, enclosed in parentheses and separated by
semicolons. These expressions say that, in this case, the loop starts by setting i
to 0, that it continues as long as i is less than 10, and that after each iteration of
the loop, i should be incremented by 1 (that is, have 1 added to its value).

Finally, we have a call to the printf function, as before, but with several
differences. First, the call to printf is within the body of the for loop. This means
that control flow does not pass once through the printf call, but instead that the
call is performed as many times as are dictated by the for loop. In this case, printf
will be called several times: once when i is 0, once when i is 1, once when i is 2,
and so on until i is 9, for a total of 10 times.

A second difference in the printf call is that the string to be printed, "i is %d",
contains a percent sign. Whenever printf sees a percent sign, it indicates that printf
is not supposed to print the exact text of the string, but is instead supposed to
read another one of its arguments to decide what to print. The letter after the
percent sign tells it what type of argument to expect and how to print it. In this
case, the letter d indicates that printf is to expect an int, and to print it in decimal.

Finally, we see that printf is in fact being called with another argument, for a total
of two, separated by commas. The second argument is the variable i, which is in
fact an int, as required by %d. The effect of all of this is that each time it is called,
printf will print a line containing the current value of the variable i:

    i is 0
    i is 1
    i is 2
    ...

After several trips through the loop, i will eventually equal 9. After that trip
through the loop, the third control expression i = i + 1 will increment its value to
10. The condition i < 10 is no longer true, so no more trips through the loop are
taken. Instead, control flow jumps down to the statement following the for loop,
which is the return statement. The main function returns, and the program is
finished.

1.3 Program Structure
We'll have more to say later about program structure, but for now let's observe a
few basics. A program consists of one or more functions; it may also contain
global variables. (Our two example programs so far have contained one function
apiece, and no global variables.) At the top of a source file are typically a few
boilerplate lines such as #include <stdio.h>, followed by the definitions (i.e. code) for
the functions. (It's also possible to split up the several functions making up a
larger program into several source files, as we'll see in a later chapter.)

Each function is further composed of declarations and statements, in that order.
When a sequence of statements should act as one (for example, when they
should all serve together as the body of a loop) they can be enclosed in braces
(just as for the outer body of the entire function). The simplest kind of statement
is an expression statement, which is an expression (presumably performing
some useful operation) followed by a semicolon. Expressions are further
composed of operators, objects (variables), and constants.

C source code consists of several lexical elements. Some are words, such as for,
return, main, and i, which are either keywords of the language (for, return) or
identifiers (names) we've chosen for our own functions and variables (main, i).
There are constants such as 1 and 10 which introduce new values into the
program. There are operators such as =, +, and >, which manipulate variables
and values. There are other punctuation characters (often called delimiters), such
as parentheses and squiggly braces {}, which indicate how the other elements of
the program are grouped. Finally, all of the preceding elements can be separated
by whitespace: spaces, tabs, and the "carriage returns" between lines.

The source code for a C program is, for the most part, "free form." This means
that the compiler does not care how the code is arranged: how it is broken into
lines, how the lines are indented, or whether whitespace is used between things
like variable names and other punctuation. (Lines like #include <stdio.h> are an
exception; they must appear alone on their own lines, generally unbroken. Only
lines beginning with # are affected by this rule; we'll see other examples later.)

You can use whitespace, indentation, and appropriate line breaks to make your
programs more readable for yourself and other people (even though the compiler
doesn't care). You can place explanatory comments anywhere in your program:
any text between the characters /* and */ is ignored by the compiler. (In fact, the
compiler pretends that all it saw was whitespace.) Though comments are ignored
by the compiler, well-chosen comments can make a program much easier to
read (for its author, as well as for others).

The usage of whitespace is our first style issue. It's typical to leave a blank line
between different parts of the program, to leave a space on either side of
operators such as + and =, and to indent the bodies of loops and other control
flow constructs. Typically, we arrange the indentation so that the subsidiary
statements controlled by a loop statement (the "loop body," such as the printf call
in our second example program) are all aligned with each other and placed one
tab stop (or some consistent number of spaces) to the right of the controlling
statement. This indentation (like all whitespace) is not required by the compiler,
but it makes programs much easier to read. (However, it can also be misleading,
if used incorrectly or in the face of inadvertent mistakes. The compiler will decide
what "the body of the loop" is based on its own rules, not the indentation, so if
the indentation does not match the compiler's interpretation, confusion is
inevitable.)

To drive home the point that the compiler doesn't care about indentation, line
breaks, or other whitespace, here are a few (extreme) examples: The fragments

    for(i = 0; i < 10; i = i + 1)
        printf("%d\n", i);

and

    for(i = 0; i < 10; i = i + 1) printf("%d\n", i);

and

    for(i = 0; i < 10; i = i + 1)
    printf("%d\n", i);

and

    for ( i
    = 0 ;
    i < 10
    ; i =
    i + 1
    ) printf (
    "%d\n" , i
    ) ;

are all treated exactly the same way by the compiler.
Some programmers argue forever over the best set of "rules" for indentation and
other aspects of programming style, calling to mind the old philosophers' debates
about the number of angels that could dance on the head of a pin. Style issues
(such as how a program is laid out) are important, but they're not something to be
too dogmatic about, and there are also other, deeper style issues besides mere
layout and typography. Kernighan and Ritchie take a fairly moderate stance:

    Although C compilers do not care about how a program looks, proper
    indentation and spacing are critical in making programs easy for people
    to read. We recommend writing only one statement per line, and using
    blanks around operators to clarify grouping. The position of braces is
    less important, although people hold passionate beliefs. We have chosen
    one of several popular styles. Pick a style that suits you, then use it
    consistently.

There is some value in having a reasonably standard style (or a few standard
styles) for code layout. Please don't take the above advice to "pick a style that
suits you" as an invitation to invent your own brand-new style. If (perhaps after
you've been programming in C for a while) you have specific objections to
specific facets of existing styles, you're welcome to modify them, but if you don't
have any particular leanings, you're probably best off copying an existing style at
first. (If you want to place your own stamp of originality on the programs that you
write, there are better avenues for your creativity than inventing a bizarre layout;
you might instead try to make the logic easier to follow, or the user interface
easier to use, or the code freer of bugs.)

Chapter 2: Basic Data Types and Operators

The type of a variable determines what kinds of values it may take on. An
operator computes new values out of old ones. An expression consists of
variables, constants, and operators combined to perform some useful
computation. In this chapter, we'll learn about C's basic types, how to write
constants and declare variables of these types, and what the basic operators are.

As Kernighan and Ritchie say, "The type of an object determines the set of
values it can have and what operations can be performed on it." This is a fairly
formal, mathematical definition of what a type is, but it is traditional (and
meaningful). There are several implications to remember:

1. The "set of values" is finite. C's int type can not represent all of the
   integers; its float type can not represent all floating-point numbers.
2. When you're using an object (that is, a variable) of some type, you may
   have to remember what values it can take on and what operations you can
   perform on it. For example, there are several operators which play with the
   binary (bit-level) representation of integers, but these operators are not
   meaningful for and may not be applied to floating-point operands.
3. When declaring a new variable and picking a type for it, you have to keep
   in mind the values and operations you'll be needing.

In other words, picking a type for a variable is not some abstract academic
exercise; it's closely connected to the way(s) you'll be using that variable.

2.1 Types
2.2 Constants
2.3 Declarations
2.4 Variable Names
2.5 Arithmetic Operators
2.6 Assignment Operators
2.7 Function Calls

This page by Steve Summit // Copyright 1995, 1996

2.1 Types
[This section corresponds to K&R Sec. 2.2]

There are only a few basic data types in C. The first ones we'll be encountering
and using are:

    char      a character
    int       an integer, in the range -32,767 to 32,767
    long int  a larger integer (up to +-2,147,483,647)
    float     a floating-point number
    double    a floating-point number, with more precision and perhaps
              greater range than float

If you can look at this list of basic types and say to yourself, "Oh, how simple,
there are only a few types, I won't have to worry much about choosing among
them," you'll have an easy time with declarations. (Some masochists wish that
the type system were more complicated so that they could specify more things
about each variable, but those of us who would rather not have to specify these
extra things each time are glad that we don't have to.)

The ranges listed above for types int and long int are the guaranteed minimum
ranges. On some systems, either of these types (or, indeed, any C type) may be
able to hold larger values, but a program that depends on extended ranges will
not be as portable. Some programmers become obsessed with knowing exactly
what the sizes of data objects will be in various situations, and go on to write
programs which depend on these exact sizes. Determining or controlling the size
of an object is occasionally important, but most of the time we can sidestep size
issues and let the compiler do most of the worrying.

(From the ranges listed above, we can determine that type int must be at least 16
bits, and that type long int must be at least 32 bits. But neither of these sizes is
exact; many systems have 32-bit ints, and some systems have 64-bit long ints.)

You might wonder how the computer stores characters. The answer involves a
character set, which is simply a mapping between some set of characters and
some set of small numeric codes. Most machines today use the ASCII character
set, in which the letter A is represented by the code 65, the ampersand & is
represented by the code 38, the digit 1 is represented by the code 49, the space
character is represented by the code 32, etc. (Most of the time, of course, you
have no need to know or even worry about these particular code values; they're
automatically translated into the right shapes on the screen or printer when
characters are printed out, and they're automatically generated when you type
characters on the keyboard. Eventually, though, we'll appreciate, and even take
some control over, exactly when these translations, from characters to their
numeric codes, are performed.) Character codes are usually small: the largest
printable code value in ASCII is 126, which is the ~ (tilde) character.
Characters usually fit in a byte, which is usually 8 bits. In C, type char is defined
as occupying one byte, so it is usually 8 bits.

Most of the simple variables in most programs are of types int, long int, or
double. Typically, we'll use int and double for most purposes, and long int any
time we need to hold integer values greater than 32,767. As we'll see, even when
we're manipulating individual characters, we'll usually use an int variable, for
reasons to be discussed later. Therefore, we'll rarely use individual variables of
type char, although we'll use plenty of arrays of char.

2.2 Constants
[This section corresponds to K&R Sec. 2.3]

A constant is just an immediate, absolute value found in an expression. The
simplest constants are decimal integers, e.g. 0, 1, 2, 123. Occasionally it is useful
to specify constants in base 8 or base 16 (octal or hexadecimal); this is done by
prefixing an extra 0 (zero) for octal, or 0x for hexadecimal: the constants 100, 0144,
and 0x64 all represent the same number. (If you're not using these non-decimal
constants, just remember not to use any leading zeroes. If you accidentally write
0123 intending to get one hundred and twenty three, you'll get 83 instead, which
is 123 base 8.)

We write constants in decimal, octal, or hexadecimal for our convenience, not the
compiler's. The compiler doesn't care; it always converts everything into binary
internally, anyway. (There is, however, no good portable way to specify constants
in source code in binary; a 0b prefix for binary constants was standardized only
in C23.)

A constant can be forced to be of type long int by suffixing it with the letter L (in
upper or lower case, although upper case is strongly recommended, because a
lower case l looks too much like the digit 1).

A constant that contains a decimal point or the letter e (or both) is a floating-point
constant: 3.14, 10., .01, 123e4, 123.456e7. The e indicates multiplication by a power
of 10; 123.456e7 is 123.456 times 10 to the 7th, or 1,234,560,000. (Floating-point
constants are of type double by default.)

We also have constants for specifying characters and strings. (Make sure you
understand the difference between a character and a string: a character is
exactly one character; a string is a sequence of zero or more characters; a string
containing one character is distinct from a lone character.) A character constant
is simply a single character between single quotes: 'A', '.', '%'. The numeric value
of a character constant is, naturally enough, that character's value in the
machine's character set. (In ASCII, for example, 'A' has the value 65.)

A string is represented in C as a sequence or array of characters. (We'll have
more to say about arrays in general, and strings in particular, later.) A string
constant is a sequence of zero or more characters enclosed in double quotes:
"apple", "hello, world", "this is a test".

Within character and string constants, the backslash character \ is special, and is
used to represent characters not easily typed on the keyboard or for various
reasons not easily typed in constants. The most common of these "character
escapes" are:

    \n    a "newline" character
    \b    a backspace
    \r    a carriage return (without a line feed)
    \'    a single quote (e.g. in a character constant)
    \"    a double quote (e.g. in a string constant)
    \\    a single backslash

For example, "he said \"hi\"" is a string constant which contains two double
quotes, and '\'' is a character constant consisting of a (single) single quote.
Notice once again that the character constant 'A' is very different from the string
constant "A".
2.3 Declarations
[This section corresponds to K&R Sec. 2.4]
Informally, a variable (also called an object) is a place you can store a value. So
that you can refer to it unambiguously, a variable needs a name. You can think of
the variables in your program as a set of boxes or cubbyholes, each with a label
giving its name; you might imagine that storing a value ``in'' a variable consists of
writing the value on a slip of paper and placing it in the cubbyhole.

A declaration tells the compiler the name and type of a variable you'll be using in
your program. In its simplest form, a declaration consists of the type, the name of
the variable, and a terminating semicolon:

    char c;
    int i;
    float f;

You can also declare several variables of the same type in one declaration,
separating them with commas:

    int i1, i2;
Later we'll see that declarations may also contain initializers, qualifiers and
storage classes, and that we can declare arrays, functions, pointers, and other
kinds of data structures.

The placement of declarations is significant. You can't place them just anywhere
(i.e. they cannot be interspersed with the other statements in your program).
They must either be placed at the beginning of a function, or at the beginning of a
brace-enclosed block of statements (which we'll learn about in the next chapter),
or outside of any function. Furthermore, the placement of a declaration, as well
as its storage class, controls several things about its visibility and lifetime, as we'll
see later.

You may wonder why variables must be declared before use. There are two
reasons:

1. It makes things somewhat easier on the compiler; it knows right away
   what kind of storage to allocate and what code to emit to store and
   manipulate each variable; it doesn't have to try to intuit the programmer's
   intentions.
2. It forces a bit of useful discipline on the programmer: you cannot introduce
   variables willy-nilly; you must think about them enough to pick appropriate
   types for them. (The compiler's error messages to you, telling you that you
   apparently forgot to declare a variable, are as often helpful as they are a
   nuisance: they're helpful when they tell you that you misspelled a variable,
   or forgot to think about exactly how you were going to use it.)

Although there are a few places where declarations can be omitted (in which
case the compiler will assume an implicit declaration), making use of these
removes the advantages of reason 2 above, so I recommend always declaring
everything explicitly.
Most of the time, I recommend writing one declaration per line. For the most part,
the compiler doesn't care what order declarations are in. You can order the
declarations alphabetically, or in the order that they're used, or to put related
declarations next to each other. Collecting all variables of the same type together
on one line essentially orders declarations by type, which isn't a very useful order
(it's only slightly more useful than random order).

A declaration for a variable can also contain an initial value. This initializer
consists of an equals sign and an expression, which is usually a single constant:

    int i = 1;
    int i1 = 10, i2 = 20;
2.4 Variable Names
[This section corresponds to K&R Sec. 2.1]
Within limits, you can give your variables and functions any names you want.
These names (the formal term is ``identifiers'') consist of letters, numbers, and
underscores. For our purposes, names must begin with a letter. Theoretically,
names can be as long as you want, but extremely long ones get tedious to type
after a while, and the compiler is not required to keep track of extremely long
ones perfectly. (What this means is that if you were to name a variable, say,
supercalafragalisticespialidocious, the compiler might get lazy and pretend that you'd
named it supercalafragalisticespialidocio, such that if you later misspelled it
supercalafragalisticespialidociouz, the compiler wouldn't catch your mistake. Nor would
the compiler necessarily be able to tell the difference if for some perverse reason
you deliberately declared a second variable named supercalafragalisticespialidociouz.)

The capitalization of names in C is significant: the variable names variable,
Variable, and VARIABLE (as well as silly combinations like variAble) are all distinct.

A final restriction on names is that you may not use keywords (the words such as
int and for which are part of the syntax of the language) as the names of
variables or functions (or as identifiers of any kind).
2.5 Arithmetic Operators
[This section corresponds to K&R Sec. 2.5]
The basic operators for performing arithmetic are the same in many computer
languages:

    +    addition
    -    subtraction
    *    multiplication
    /    division
    %    modulus (remainder)

The - operator can be used in two ways: to subtract two numbers (as in a - b), or
to negate one number (as in -a + b or a + -b).

When applied to integers, the division operator / discards any remainder, so 1 / 2
is 0 and 7 / 4 is 1. But when either operand is a floating-point quantity (type float or
double), the division operator yields a floating-point result, with a potentially
nonzero fractional part. So 1 / 2.0 is 0.5, and 7.0 / 4.0 is 1.75.

The modulus operator % gives you the remainder when two integers are divided:
1 % 2 is 1; 7 % 4 is 3. (The modulus operator can only be applied to integers.)
An additional arithmetic operation you might be wondering about is
exponentiation. Some languages have an exponentiation operator (typically ^ or
**), but C doesn't. (To square or cube a number, just multiply it by itself.)

Multiplication, division, and modulus all have higher precedence than addition
and subtraction. The term ``precedence'' refers to how ``tightly'' operators bind to
their operands (that is, to the things they operate on). In mathematics,
multiplication has higher precedence than addition, so 1 + 2 * 3 is 7, not 9. In other
words, 1 + 2 * 3 is equivalent to 1 + (2 * 3). C is the same way.

All of these operators ``group'' from left to right, which means that when two or
more of them have the same precedence and participate next to each other in an
expression, the evaluation conceptually proceeds from left to right. For example,
1 - 2 - 3 is equivalent to (1 - 2) - 3 and gives -4, not +2. (``Grouping'' is sometimes
called associativity, although the term is used somewhat differently in
programming than it is in mathematics. Not all C operators group from left to
right; a few group from right to left.)

Whenever the default precedence or associativity doesn't give you the grouping
you want, you can always use explicit parentheses. For example, if you wanted
to add 1 to 2 and then multiply the result by 3, you could write (1 + 2) * 3.

By the way, the word ``arithmetic'' as used in the title of this section is an
adjective, not a noun, and it's pronounced differently than the noun: the accent is
on the third syllable.
2.6 Assignment Operators
[This section corresponds to K&R Sec. 2.10]
The assignment operator = assigns a value to a variable. For example,

    x = 1

sets x to 1, and

    a = b

sets a to whatever b's value is. The expression

    i = i + 1

is, as we've mentioned elsewhere, the standard programming idiom for
increasing a variable's value by 1: this expression takes i's old value, adds 1 to it,
and stores it back into i. (C provides several ``shortcut'' operators for modifying
variables in this and similar ways, which we'll meet later.)

We've called the = sign the ``assignment operator'' and referred to ``assignment
expressions'' because, in fact, = is an operator just like + or -. C does not have
``assignment statements''; instead, an assignment like a = b is an expression and
can be used wherever any expression can appear. Since it's an expression, the
assignment a = b has a value, namely, the same value that's assigned to a. This
value can then be used in a larger expression; for example, we might write

    c = a = b

which is equivalent to

    c = (a = b)

and assigns b's value to both a and c. (The assignment operator, therefore,
groups from right to left.) Later we'll see other circumstances in which it can be
useful to use the value of an assignment expression.

It's usually a matter of style whether you initialize a variable with an initializer in
its declaration or with an assignment expression near where you first use it. That
is, there's no particular difference between

    int a = 10;

and

    int a;
    /* later... */
    a = 10;
2.7 Function Calls
We'll have much more to say about functions in a later chapter, but for now let's
just look at how they're called. (To review: a function is a piece of code,
written by you or by someone else, which performs some useful,
compartmentalizable task.) You call a function by mentioning its name followed
by a pair of parentheses. If the function takes any arguments, you place the
arguments between the parentheses, separated by commas. These are all
function calls:

    printf("Hello, world!\n")
    printf("%d\n", i)

The arguments to a function can be arbitrary expressions. Therefore, you don't
have to say things like

    int sum = a + b + c;
    printf("sum = %d\n", sum);

if you don't want to; you can instead collapse it to

    printf("sum = %d\n", a + b + c);

Many functions return values, and when they do, you can embed calls to these
functions within larger expressions:

    c = sqrt(a * a + b * b)
    x = r * cos(theta)
    i = f1(f2(j))

The first expression squares a and b, computes the square root of the sum of the
squares, and assigns the result to c. (In other words, it computes a * a + b * b,
passes that number to the sqrt function, and assigns sqrt's return value to c.)
The second expression passes the value of the variable theta to the cos (cosine)
function, multiplies the result by r, and assigns the result to x. The third
expression passes the value of the variable j to the function f2, passes the
return value of f2 immediately to the function f1, and finally assigns f1's return
value to the variable i.
Chapter 3: Statements and Control Flow
Statements are the ``steps'' of a program. Most statements compute and assign
values or call functions, but we will eventually meet several other kinds of
statements as well. By default, statements are executed in sequence, one after
another. We can, however, modify that sequence by using control flow
constructs which arrange that a statement or group of statements is executed
only if some condition is true or false, or executed over and over again to form a
loop. (A somewhat different kind of control flow happens when we call a function:
execution of the caller is suspended while the called function proceeds. We'll
discuss functions in chapter 5.)

My definitions of the terms statement and control flow are somewhat circular. A
statement is an element within a program which you can apply control flow to;
control flow is how you specify the order in which the statements in your program
are executed. (A weaker definition of a statement might be ``a part of your
program that does something,'' but this definition could as easily be applied to
expressions or functions.)

3.1 Expression Statements
3.2 if Statements
3.3 Boolean Expressions
3.4 while Loops
3.5 for Loops
3.6 break and continue
3.1 Expression Statements
[This section corresponds to K&R Sec. 3.1]
Most of the statements in a C program are expression statements. An expression
statement is simply an expression followed by a semicolon. The lines

    i = 0;
    i = i + 1;
    printf("Hello, world!\n");

are all expression statements. (In some languages, such as Pascal, the
semicolon separates statements, such that the last statement is not followed by a
semicolon. In C, however, the semicolon is a statement terminator; all simple
statements are followed by semicolons. The semicolon is also used for a few
other things in C; we've already seen that it terminates declarations, too.)

Expression statements do all of the real work in a C program. Whenever you
need to compute new values for variables, you'll typically use expression
statements (and they'll typically contain assignment operators). Whenever you
want your program to do something visible, in the real world, you'll typically call a
function (as part of an expression statement). We've already seen the most basic
example: calling the function printf to print text to the screen. But anything else
you might do--read or write a disk file, talk to a modem or printer, draw pictures
on the screen--will also involve function calls. (Furthermore, the functions you call
to do these things are usually different depending on which operating system
you're using. The C language does not define them, so we won't be talking about
or using them much.)

Expressions and expression statements can be arbitrarily complicated. They
don't have to consist of exactly one simple function call, or of one simple
assignment to a variable. For one thing, many functions return values, and the
values they return can then be used by other parts of the expression. For
example, C provides a sqrt (square root) function, which we might use to compute
the hypotenuse of a right triangle like this:

    c = sqrt(a*a + b*b);

To be useful, an expression statement must do something; it must have some
lasting effect on the state of the program. (Formally, a useful statement must
have at least one side effect.) The first two sample expression statements in this
section (above) assign new values to the variable i, and the third one calls printf to
print something out, and these are good examples of statements that do
something useful.

(To make the distinction clear, we may note that degenerate constructions such
as

    0;
    i;
    i + 1;

are syntactically valid statements--they consist of an expression followed by a
semicolon--but in each case, they compute a value without doing anything with it,
so the computed value is discarded, and the statement is useless. But if the
``degenerate'' statements in this paragraph don't make much sense to you, don't
worry; it's because they, frankly, don't make much sense.)

It's also possible for a single expression to have multiple side effects, but it's
easy for such an expression to be (a) confusing or (b) undefined. For now, we'll
only be looking at expressions (and, therefore, statements) which do one
well-defined thing at a time.
3.2 if Statements
[This section corresponds to K&R Sec. 3.2]
The simplest way to modify the control flow of a program is with an if statement,
which in its simplest form looks like this:

    if(x > max)
        max = x;

Even if you didn't know any C, it would probably be pretty obvious that what
happens here is that if x is greater than max, x gets assigned to max. (We'd use
code like this to keep track of the maximum value of x we'd seen--for each new x,
we'd compare it to the old maximum value max, and if the new value was greater,
we'd update max.)

More generally, we can say that the syntax of an if statement is:

    if( expression )
        statement

where expression is any expression and statement is any statement.

What if you have a series of statements, all of which should be executed together
or not at all depending on whether some condition is true? The answer is that
you enclose them in braces:

    if( expression ) {
        statement<sub>1</sub>
        statement<sub>2</sub>
        statement<sub>3</sub>
    }

As a general rule, anywhere the syntax of C calls for a statement, you may write
a series of statements enclosed by braces. (You do not need to, and should not,
put a semicolon after the closing brace, because the series of statements
enclosed by braces is not itself a simple expression statement.)

An if statement may also optionally contain a second statement, the ``else
clause,'' which is to be executed if the condition is not met. Here is an example:

    if(n > 0)
        average = sum / n;
    else {
        printf("can't compute average\n");
        average = 0;
    }

The first statement or block of statements is executed if the condition is true, and
the second statement or block of statements (following the keyword else) is
executed if the condition is not true. In this example, we can compute a
meaningful average only if n is greater than 0; otherwise, we print a message
saying that we cannot compute the average. The general syntax of an if
statement is therefore

    if( expression )
        statement<sub>1</sub>
    else
        statement<sub>2</sub>

(where both statement<sub>1</sub> and statement<sub>2</sub> may be lists of
statements enclosed in braces).

It's also possible to nest one if statement inside another. (For that matter, it's in
general possible to nest any kind of statement or control flow construct within
another.) For example, here is a little piece of code which decides roughly which
quadrant of the compass you're walking into, based on an x value which is
positive if you're walking east, and a y value which is positive if you're walking
north:

    if(x > 0) {
        if(y > 0)
            printf("Northeast.\n");
        else printf("Southeast.\n");
    }
    else {
        if(y > 0)
            printf("Northwest.\n");
        else printf("Southwest.\n");
    }

When you have one if statement (or loop) nested inside another, it's a very good
idea to use explicit braces {}, as shown, to make it clear (both to you and to the
compiler) how they're nested and which else goes with which if. It's also a good
idea to indent the various levels, also as shown, to make the code more readable
to humans. Why do both? You use indentation to make the code visually more
readable to yourself and other humans, but the compiler doesn't pay attention to
the indentation (since all whitespace is essentially equivalent and is essentially
ignored). Therefore, you also have to make sure that the punctuation is right.
Here is an example of another common arrangement of if and else. Suppose we
have a variable grade containing a student's numeric grade, and we want to print
out the corresponding letter grade. Here is code that would do the job:

    if(grade >= 90)
        printf("A");
    else if(grade >= 80)
        printf("B");
    else if(grade >= 70)
        printf("C");
    else if(grade >= 60)
        printf("D");
    else printf("F");

What happens here is that exactly one of the five printf calls is executed,
depending on which of the conditions is true. Each condition is tested in turn, and
if one is true, the corresponding statement is executed, and the rest are skipped.
If none of the conditions is true, we fall through to the last one, printing ``F''.

In the cascaded if/else/if/else/... chain, each else clause is another if statement.
This may be more obvious at first if we reformat the example, including every set
of braces and indenting each if statement relative to the previous one:

    if(grade >= 90) {
        printf("A");
    }
    else {
        if(grade >= 80) {
            printf("B");
        }
        else {
            if(grade >= 70) {
                printf("C");
            }
            else {
                if(grade >= 60) {
                    printf("D");
                }
                else {
                    printf("F");
                }
            }
        }
    }

By examining the code this way, it should be obvious that exactly one of the
printf calls is executed, and that whenever one of the conditions is found true,
the remaining conditions do not need to be checked and none of the later
statements within the chain will be executed. But once you've convinced yourself
of this and learned to recognize the idiom, it's generally preferable to arrange the
statements as in the first example, without trying to indent each successive if
statement one tabstop further out. (Obviously, you'd run into the right margin very
quickly if the chain had just a few more cases!)
3.3 Boolean Expressions
An if statement like

    if(x > max)
        max = x;

is perhaps deceptively simple. Conceptually, we say that it checks whether the
condition x > max is ``true'' or ``false''. The mechanics underlying C's conception
of ``true'' and ``false,'' however, deserve some explanation. We need to
understand how true and false values are represented, and how they are
interpreted by statements like if.

As far as C is concerned, a true/false condition can be represented as an integer.
(An integer can represent many values; here we care about only two values:
``true'' and ``false.'' The study of mathematics involving only two values is called
Boolean algebra, after George Boole, a mathematician who refined this study.) In
C, ``false'' is represented by a value of 0 (zero), and ``true'' is represented by any
value that is nonzero. Since there are many nonzero values (at least 65,534, for
values of type int), when we have to pick a specific value for ``true,'' we'll pick 1.

The relational operators such as <, <=, >, and >= are in fact operators, just like +,
-, *, and /. The relational operators take two values, look at them, and ``return'' a
value of 1 or 0 depending on whether the tested relation was true or false. The
complete set of relational operators in C is:

    <     less than
    <=    less than or equal
    >     greater than
    >=    greater than or equal
    ==    equal
    !=    not equal

For example, 1 < 2 is 1, 3 > 4 is 0, 5 == 5 is 1, and 6 != 6 is 0.
We've now encountered perhaps the most easy-to-stumble-on ``gotcha!'' in C:
the equality-testing operator is ==, not a single =, which is assignment. If you
accidentally write

    if(a = 0)

(and you probably will at some point; everybody makes this mistake), it will not
test whether a is zero, as you probably intended. Instead, it will assign 0 to a, and
then perform the ``true'' branch of the if statement if a is nonzero. But a will have
just been assigned the value 0, so the ``true'' branch will never be taken! (This
could drive you crazy while debugging--you wanted to do something if a was 0,
and after the test, a is 0, whether it was supposed to be or not, but the ``true''
branch is nevertheless not taken.)
The relational operators work with arbitrary numbers and generate true/false
values. You can also combine true/false values by using the Boolean operators,
which take true/false values as operands and compute new true/false values.
The three Boolean operators are:

    &&    and
    ||    or
    !     not (takes one operand; ``unary'')

The && (``and'') operator takes two true/false values and produces a true (1)
result if both operands are true (that is, if the left-hand side is true and the
right-hand side is true). The || (``or'') operator takes two true/false values and
produces a true (1) result if either operand is true. The ! (``not'') operator takes a
single true/false value and negates it, turning false to true and true to false (0 to 1
and nonzero to 0).

For example, to test whether the variable i lies between 1 and 10, you might use

    if(1 < i && i < 10)

Here we're expressing the relation ``i is between 1 and 10'' as ``1 is less than i
and i is less than 10.''

It's important to understand why the more obvious expression

    if(1 < i < 10)    /* WRONG */

would not work. The expression 1 < i < 10 is parsed by the compiler analogously to
1 + i + 10. The expression 1 + i + 10 is parsed as (1 + i) + 10 and means ``add 1 to i,
and then add the result to 10.'' Similarly, the expression 1 < i < 10 is parsed as
(1 < i) < 10 and means ``see if 1 is less than i, and then see if the result is less
than 10.'' But in this case, ``the result'' is 1 or 0, depending on whether i is greater
than 1. Since both 0 and 1 are less than 10, the expression 1 < i < 10 would always
be true in C, regardless of the value of i!
Relational and Boolean expressions are usually used in contexts such as an if
statement, where something is to be done or not done depending on some
condition. In these cases what's actually checked is whether the expression
representing the condition has a zero or nonzero value. As long as the
expression is a relational or Boolean expression, the interpretation is just what
we want. For example, when we wrote

    if(x > max)

the > operator produced a 1 if x was greater than max, and a 0 otherwise. The if
statement interprets 0 as false and 1 (or any nonzero value) as true.

But what if the expression is not a relational or Boolean expression? As far as C
is concerned, the controlling expression (of conditional statements like if) can in
fact be any expression: it doesn't have to ``look like'' a Boolean expression; it
doesn't have to contain relational or logical operators. All C looks at (when it's
evaluating an if statement, or anywhere else where it needs a true/false value) is
whether the expression evaluates to 0 or nonzero. For example, if you have a
variable x, and you want to do something if x is nonzero, it's possible to write

    if(x)
        statement

and the statement will be executed if x is nonzero (since nonzero means ``true'').

This possibility (that the controlling expression of an if statement doesn't have to
``look like'' a Boolean expression) is both useful and potentially confusing. It's
useful when you have a variable or a function that is ``conceptually Boolean,''
that is, one that you consider to hold a true or false (actually nonzero or zero)
value. For example, if you have a variable verbose which contains a nonzero value
when your program should run in verbose mode and zero when it should be
quiet, you can write things like

    if(verbose)
        printf("Starting first pass\n");

and this code is both legal and readable, besides which it does what you want.
The standard library contains a function isupper() which tests whether a character
is an upper-case letter, so if c is a character, you might write

    if(isupper(c))
        ...

Both of these examples (verbose and isupper()) are useful and readable.

However, you will eventually come across code like

    if(n)
        average = sum / n;

where n is just a number. Here, the programmer wants to compute the average
only if n is nonzero (otherwise, of course, the code would divide by 0), and the
code works, because, in the context of the if statement, the trivial expression n is
(as always) interpreted as ``true'' if it is nonzero, and ``false'' if it is zero.

``Coding shortcuts'' like these can seem cryptic, but they're also quite common,
so you'll need to be able to recognize them even if you don't choose to write
them in your own code. Whenever you see code like

    if(x)

or

    if(f())

where x or f() do not have obvious ``Boolean'' names, you can read them as ``if
x is nonzero'' or ``if f() returns nonzero.''
3.4 while Loops
[This section corresponds to half of K&R Sec. 3.5]
Loops generally consist of two parts: one or more control expressions which (not
surprisingly) control the execution of the loop, and the body, which is the
statement or set of statements which is executed over and over.

The most basic loop in C is the while loop. A while loop has one control
expression, and executes as long as that expression is true. This example
repeatedly doubles the number 2 (2, 4, 8, 16, ...) and prints the resulting numbers
as long as they are less than 1000:

    int x = 2;

    while(x < 1000) {
        printf("%d\n", x);
        x = x * 2;
    }

(Once again, we've used braces {} to enclose the group of statements which are
to be executed together as the body of the loop.)

The general syntax of a while loop is

    while( expression )
        statement

A while loop starts out like an if statement: if the condition expressed by the
expression is true, the statement is executed. However, after executing the
statement, the condition is tested again, and if it's still true, the statement is
executed again. (Presumably, the condition depends on some value which is
changed in the body of the loop.) As long as the condition remains true, the body
of the loop is executed over and over again. (If the condition is false right at the
start, the body of the loop is not executed at all.)

As another example, if you wanted to print a number of blank lines, with the
variable n holding the number of blank lines to be printed, you might use code
like this:

    while(n > 0) {
        printf("\n");
        n = n - 1;
    }

After the loop finishes (when control ``falls out'' of it, due to the condition being
false), n will have the value 0 (assuming it did not start out negative).
You use a while loop when you have a statement or group of statements which
may have to be executed a number of times to complete their task. The
controlling expression represents the condition ``the loop is not done'' or ``there's
more work to do.'' As long as the expression is true, the body of the loop is
executed; presumably, it makes at least some progress at its task. When the
expression becomes false, the task is done, and the rest of the program (beyond
the loop) can proceed. When we think about a loop in this way, we can see an
additional important property: if the expression evaluates to ``false'' before the
very first trip through the loop, we make zero trips through the loop. In other
words, if the task is already done (if there's no work to do) the body of the loop is
not executed at all. (It's always a good idea to think about the ``boundary
conditions'' in a piece of code, and to make sure that the code will work correctly
when there is no work to do, or when there is a trivial task to do, such as sorting
an array of one number. Experience has shown that bugs at boundary conditions
are quite common.)
3.( ,or -oops
:1his section corres'onds to the other half of K&R &ec. 8.Q;
-ur second loo', which we've seen at least one e!am'le of already, is the for
loo'. 1he first one we saw was:
for (i K D i ? 4D i K i L 4)
'rintf(Bi is MdCnB, i)
Gore generally, the synta! of a for loo' is
for( expr<sub>1</sub> expr<sub></sub> expr<sub>!</sub> )
(*ere we see that the for loo' has three control e!'ressions. #s always, the
statement can be a brace%enclosed bloc(.)
Many loops are set up to cause some variable to step through a range of values,
or, more generally, to set up an initial condition and then modify some value to
perform each succeeding loop as long as some condition is true. The three
expressions in a for loop encapsulate these conditions: expr<sub>1</sub> sets up
the initial condition, expr<sub>2</sub> tests whether another trip through the loop
should be taken, and expr<sub>3</sub> increments or updates things after each
trip through the loop and prior to the next one. In our first example, we had i = 0
as expr<sub>1</sub>, i < 10 as expr<sub>2</sub>, i = i + 1 as expr<sub>3</sub>, and
the call to printf as statement, the body of the loop. So the loop began by setting i
to 0, proceeded as long as i was less than 10, printed out i's value during each
trip through the loop, and added 1 to i between each trip through the loop.
When a for loop is executed, first, expr<sub>1</sub> is evaluated. Then,
expr<sub>2</sub> is evaluated, and if it is true, the body of the loop (statement)
is executed. Then, expr<sub>3</sub> is evaluated to go to the next step, and
expr<sub>2</sub> is evaluated again, to see if there is a next step. During the
execution of a for loop, the sequence is:
    expr<sub>1</sub>, expr<sub>2</sub>, statement, expr<sub>3</sub>, expr<sub>2</sub>, statement, expr<sub>3</sub>, ..., expr<sub>3</sub>, expr<sub>2</sub>
The first thing executed is expr<sub>1</sub>. expr<sub>3</sub> is evaluated after
every trip through the loop. The last thing executed is always expr<sub>2</sub>,
because when expr<sub>2</sub> evaluates false, the loop exits.
All three expressions of a for loop are optional. If you leave out expr<sub>1</sub>,
there simply is no initialization step, and the variable(s) used with the loop had
better have been initialized already. If you leave out expr<sub>2</sub>, there is no
test, and the default for the for loop is that another trip through the loop should be
taken (such that unless you break out of it some other way, the loop runs
forever). If you leave out expr<sub>3</sub>, there is no increment step.
The semicolons separate the three controlling expressions of a for loop. (These
semicolons, by the way, have nothing to do with statement terminators.) If you
leave out one or more of the expressions, the semicolons remain. Therefore, one
way of writing a deliberately infinite loop in C is
    for(;;)
        ...
It's useful to compare C's for loop to the equivalent loops in other computer
languages you might know. The C loop
    for(i = x; i <= y; i = i + z)
is roughly equivalent to:
    for I = X to Y step Z      (BASIC)
    do 10 i=x,y,z              (FORTRAN)
    for i := x to y            (Pascal)
In C (unlike FORTRAN), if the test condition is false before the first trip through
the loop, the loop won't be traversed at all. In C (unlike Pascal), a loop control
variable (in this case, i) is guaranteed to retain its final value after the loop
completes, and it is also legal to modify the control variable within the loop, if you
really want to. (When the loop terminates due to the test condition turning false,
the value of the control variable after the loop will be the first value for which the
condition failed, not the last value for which it succeeded.)
It's also worth noting that a for loop can be used in more general ways than the
simple, iterative examples we've seen so far. The ``control variable'' of a for loop
does not have to be an integer, and it does not have to be incremented by an
additive increment. It could be ``incremented'' by a multiplicative factor (1, 2, 4, 8,
...) if that was what you needed, or it could be a floating-point variable, or it could
be another type of variable which we haven't met yet which would step, not over
numeric values, but over the elements of an array or other data structure. Strictly
speaking, a for loop doesn't have to have a ``control variable'' at all; the three
expressions can be anything, although the loop will make the most sense if they
are related and together form the expected initialize, test, increment sequence.
The powers-of-two example of the previous section does fit this pattern, so we
could rewrite it like this:
    int x;
    for(x = 2; x < 1000; x = x * 2)
        printf("%d\n", x);
There is no earth-shaking or fundamental difference between the while and for
loops. In fact, given the general for loop
    for(expr<sub>1</sub>; expr<sub>2</sub>; expr<sub>3</sub>)
        statement
you could usually rewrite it as a while loop, moving the initialize and increment
expressions to statements before and within the loop:
    expr<sub>1</sub>;
    while(expr<sub>2</sub>) {
        statement
        expr<sub>3</sub>;
    }
Similarly, given the general while loop
    while(expr)
        statement
you could rewrite it as a for loop:
    for( ; expr; )
        statement
Another contrast between the for and while loops is that although the test
expression (expr<sub>2</sub>) is optional in a for loop, it is required in a while loop.
If you leave out the controlling expression of a while loop, the compiler will
complain about a syntax error. (To write a deliberately infinite while loop, you have
to supply an expression which is always nonzero. The most obvious one would
simply be while(1) .)
If it's possible to rewrite a for loop as a while loop and vice versa, why do they
both exist? Which one should you choose? In general, when you choose a for
loop, its three expressions should all manipulate the same variable or data
structure, using the initialize, test, increment pattern. If they don't manipulate the
same variable or don't follow that pattern, wedging them into a for loop buys
nothing and a while loop would probably be clearer. (The reason that one loop or
the other can be clearer is simply that, when you see a for loop, you expect to
see an idiomatic initialize/test/increment of a single variable, and if the for loop
you're looking at doesn't end up matching that pattern, you've been momentarily misled.)
3.6 break and continue
[This section corresponds to K&R Sec. 3.7]
Sometimes, due to an exceptional condition, you need to jump out of a loop
early, that is, before the main controlling expression of the loop causes it to
terminate normally. Other times, in an elaborate loop, you may want to jump back
to the top of the loop (to test the controlling expression again, and perhaps begin
a new trip through the loop) without playing out all the steps of the current loop.
The break and continue statements allow you to do these two things. (They are, in
fact, essentially restricted forms of goto.)
To put everything we've seen in this chapter together, as well as demonstrate the
use of the break statement, here is a program for printing prime numbers between
1 and 100:
    #include <stdio.h>
    #include <math.h>

    int main()
    {
        int i, j;
        printf("%d\n", 2);
        for(i = 3; i <= 100; i = i + 1) {
            for(j = 2; j < i; j = j + 1) {
                if(i % j == 0)
                    break;              /* found a divisor: not prime */
                if(j > sqrt(i)) {
                    printf("%d\n", i);  /* no divisor found: prime */
                    break;
                }
            }
        }
        return 0;
    }
The outer loop steps the variable i through the numbers from 3 to 100; the code
tests to see if each number has any divisors other than 1 and itself. The trial
divisor j loops from 2 up to i. j is a divisor of i if the remainder of i divided by j is 0,
so the code uses C's ``remainder'' or ``modulus'' operator % to make this test.
(Remember that i % j gives the remainder when i is divided by j.)
If the program finds a divisor, it uses break to break out of the inner loop, without
printing anything. But if it notices that j has risen higher than the square root of i,
without its having found any divisors, then i must not have any divisors, so i is
prime, and its value is printed. (Once we've determined that i is prime by noticing
that j > sqrt(i), there's no need to try the other trial divisors, so we use a second
break statement to break out of the loop in that case, too.)
The simple algorithm and implementation we used here (like many simple prime
number algorithms) does not work for 2, the only even prime number, so the
program ``cheats'' and prints out 2 no matter what, before going on to test the
numbers from 3 to 100.
Many improvements to this simple program are of course possible; you might
experiment with it. (Did you notice that the ``test'' expression of the inner loop
for(j = 2; j < i; j = j + 1) is in a sense unnecessary, because the loop
always terminates early due to one of the two break statements?)
4.1 Arrays
So far, we've been declaring simple variables: the declaration
    int i;
declares a single variable, named i, of type int. It is also possible to declare an
array of several elements. The declaration
    int a[10];
declares an array, named a, consisting of ten elements, each of type int. Simply
speaking, an array is a variable that can hold more than one value. You specify
which of the several values you're referring to at any given time by using a
numeric subscript. (Arrays in programming are similar to vectors or matrices in
mathematics.) We can represent the array a above with a picture like this:
In C, arrays are zero-based: the ten elements of a 10-element array are
numbered from 0 to 9. The subscript which specifies a single element of an array
is simply an integer expression in square brackets. The first element of the array
is a[0], the second element is a[1], etc. You can use these ``array subscript
expressions'' anywhere you can use the name of a simple variable, for example:
    a[0] = 10;
    a[1] = 20;
    a[2] = a[0] + a[1];
Notice that the subscripted array references (i.e. expressions such as a[0] and
a[1]) can appear on either side of the assignment operator.
The subscript does not have to be a constant like 0 or 1; it can be any integral
expression. For example, it's common to loop over all elements of an array:
    int i;
    for(i = 0; i < 10; i = i + 1)
        a[i] = 0;
This loop sets all ten elements of the array a to 0.
Arrays are a real convenience for many problems, but there is not a lot that C will
do with them for you automatically. In particular, you can neither set all elements
of an array at once nor assign one array to another; both of the assignments
    a = 0;          /* WRONG */
and
    int b[10];
    b = a;          /* WRONG */
are illegal.
To set all of the elements of an array to some value, you must do so one by one,
as in the loop example above. To copy the contents of one array to another, you
must again do so one by one:
    int b[10];
    for(i = 0; i < 10; i = i + 1)
        b[i] = a[i];
Remember that for an array declared
    int a[10];
there is no element a[10]; the topmost element is a[9]. This is one reason that
zero-based loops are also common in C. Note that the for loop
    for(i = 0; i < 10; i = i + 1)
does just what you want in this case: it starts at 0, the number 10 suggests
(correctly) that it goes through 10 iterations, but the less-than comparison means
that the last trip through the loop has i set to 9. (The comparison i <= 9 would also
work, but it would be less clear and therefore poorer style.)
In the little examples so far, we've always looped over all 10 elements of the
sample array a. It's common, however, to use an array that's bigger than
necessarily needed, and to use a second variable to keep track of how many
elements of the array are currently in use. For example, we might have an
integer variable
    int na;     /* number of elements of a[] in use */
Then, when we wanted to do something with a (such as print it out), the loop
would run from 0 to na, not 10 (or whatever a's size was):
    for(i = 0; i < na; i = i + 1)
        printf("%d\n", a[i]);
Naturally, we would have to ensure that na's value was always less than
or equal to the number of elements actually declared in a.
Arrays are not limited to type int; you can have arrays of char or double or any
other type.
Here is a slightly larger example of the use of arrays. Suppose we want to
investigate the behavior of rolling a pair of dice. The total roll can be anywhere
from 2 to 12, and we want to count how often each roll comes up. We will use an
array to keep track of the counts: a[2] will count how many times we've rolled 2,
etc.
We'll simulate the roll of a die by calling C's random number generation function,
rand(). Each time you call rand(), it returns a different, pseudo-random integer. The
values that rand() returns typically span a large range, so we'll use C's modulus
(or ``remainder'') operator % to produce random numbers in the range we want.
The expression rand() % 6 produces random numbers in the range 0 to 5, and
rand() % 6 + 1 produces random numbers in the range 1 to 6.
Here is the program:
    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        int i;
        int d1, d2;
        int a[13];      /* uses [2..12] */

        for(i = 2; i <= 12; i = i + 1)
            a[i] = 0;

        for(i = 0; i < 100; i = i + 1) {
            d1 = rand() % 6 + 1;
            d2 = rand() % 6 + 1;
            a[d1 + d2] = a[d1 + d2] + 1;
        }

        for(i = 2; i <= 12; i = i + 1)
            printf("%d: %d\n", i, a[i]);
        return 0;
    }
We include the header <stdlib.h> because it contains the necessary declarations
for the rand() function. We declare the array of size 13 so that its highest element
will be a[12]. (We're wasting a[0] and a[1]; this is no great loss.) The variables d1
and d2 contain the rolls of the two individual dice; we add them together to decide
which cell of the array to increment, in the line
    a[d1 + d2] = a[d1 + d2] + 1;
After 100 rolls, we print the array out. Typically (as craps players well know), we'll
see mostly 7's, and relatively few 2's and 12's.
(By the way, it turns out that using the % operator to reduce the range of the rand
function is not always a good idea. We'll say more about this problem in an
4.1.1 Array Initialization
4.1.2 Arrays of Arrays (``Multidimensional'' Arrays)