
C syntax

The syntax of the C programming language, the set of rules governing the writing of software in the language, is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. The development of this syntax was a major milestone in the history of computer science, as C was the first widely successful high-level language for operating-system development.

In many cases, there are multiple equivalent ways to designate the type; for example, signed short int and short are synonymous.

C syntax makes use of the maximal munch principle.

The char type is distinct from both signed char and unsigned char, but is guaranteed to have the same representation as one of them. The _Bool and long long types have been standardized since 1999 (C99) and may not be supported by older C compilers. Type _Bool is usually accessed via the macro name bool defined by the standard header stdbool.h.

The representation of some types may include unused "padding" bits, which occupy storage but are not included in the width. The following table provides a complete list of the standard integer types and their minimum allowed widths (including any sign bit).

1 Data structures

Main article: C variable types and declarations

In general, the widths and representation scheme implemented for any given platform are chosen based on the machine architecture, with some consideration given to the ease of importing source code developed for other platforms. The width of the int type varies especially widely among C implementations; it often corresponds to the most "natural" word size for the specific platform. The standard header limits.h defines macros for the minimum and maximum representable values of the standard integer types as implemented on any specific platform.

1.1 Primitive data types

The C language represents numbers in three forms: integral, real and complex. This distinction reflects similar distinctions in the instruction set architecture of most central processing units. Integral data types store numbers in the set of integers, while real and complex numbers represent numbers (or pairs of numbers) in the set of real numbers in floating-point form.

In addition to the standard integer types, there may be other "extended" integer types, which can be used for typedefs in standard headers. For more precise specification of width, programmers can and should use typedefs from the standard header stdint.h.

All C integer types have signed and unsigned variants. If signed or unsigned is not specified explicitly, in most circumstances signed is assumed. However, for historic reasons plain char is a type distinct from both signed char and unsigned char. It may be a signed type or an unsigned type, depending on the compiler and the character set (C guarantees that members of the C basic character set have positive values). Also, bit field types specified as plain int may be signed or unsigned, depending on the compiler.

Integer constants may be specified in source code in several ways. Numeric values can be specified as decimal (example: 1022), octal with zero (0) as a prefix (01776), or hexadecimal with 0x (zero x) as a prefix (0x3FE). A character in single quotes (example: 'R'), called a "character constant", represents the value of that character in the execution character set, with type int. Except for character constants, the type of an integer constant is determined by the width required to represent the specified value, but is always at least as wide as int. This can be overridden by appending an explicit length and/or signedness modifier; for example, 12lu has type unsigned long. There are no negative integer constants, but the same effect can often be obtained by using a unary negation operator "-".

1.1.1 Integer types

C's integer types come in different fixed sizes, capable of representing various ranges of numbers. The type char occupies exactly one byte (the smallest addressable storage unit), which is typically 8 bits wide. (Although char can represent any of C's "basic" characters, a wider type may be required for international character sets.) Most integer types have both signed and unsigned varieties, designated by the signed and unsigned keywords. Signed integer types may use a two's complement, ones' complement, or sign-and-magnitude representation.
1.1.2 Enumerated type

The enumerated type in C, specified with the enum keyword, and often just called an "enum" (usually pronounced ee'-num or ee'-noom), is a type designed to represent values across a series of named constants. Each of the enumerated constants has type int. Each enum type itself is compatible with char or a signed or unsigned integer type, but each implementation defines its own rules for choosing a type.

Some compilers warn if an object with enumerated type is assigned a value that is not one of its constants. However, such an object can be assigned any value in the range of its compatible type, and enum constants can be used anywhere an integer is expected. For this reason, enum values are often used in place of preprocessor #define directives to create named constants. Such constants are generally safer to use than macros, since they reside within a specific identifier namespace.

An enumerated type is declared with the enum specifier and an optional name (or tag) for the enum, followed by a list of one or more constants contained within curly braces and separated by commas, and an optional list of variable names. Subsequent references to a specific enumerated type use the enum keyword and the name of the enum. By default, the first constant in an enumeration is assigned the value zero, and each subsequent value is incremented by one over the previous constant. Specific values may also be assigned to constants in the declaration, and any subsequent constants without specific values will be given incremented values from that point onward. For example, consider the following declaration:

enum colors { RED, GREEN, BLUE = 5, YELLOW } paint_color;

This declares the enum colors type; the int constants RED (whose value is 0), GREEN (whose value is one greater than RED, 1), BLUE (whose value is the given value, 5), and YELLOW (whose value is one greater than BLUE, 6); and the enum colors variable paint_color. The constants may be used outside of the context of the enum (where any integer value is allowed), and values other than the constants may be assigned to paint_color, or any other variable of type enum colors.

1.1.3 Floating point types

The floating-point form is used to represent numbers with a fractional component. They do not, however, represent most rational numbers exactly; they are instead a close approximation. There are three types of real values, denoted by their specifiers: single precision (float), double precision (double), and double extended precision (long double). Each of these may represent values in a different form, often one of the IEEE floating point formats.

Floating-point constants may be written in decimal notation, e.g. 1.23. Scientific notation may be used by adding e or E followed by a decimal exponent, e.g. 1.23e2 (which has the value 123.0). Either a decimal point or an exponent is required (otherwise, the number is parsed as an integer constant). Hexadecimal floating-point constants follow similar rules, except that they must be prefixed by 0x and use p or P to specify a binary exponent, e.g. 0xAp-2 (which has the value 2.5, since 10 × 2^-2 = 10 ÷ 4). Both decimal and hexadecimal floating-point constants may be suffixed by f or F to indicate a constant of type float, by l (letter l) or L to indicate type long double, or left unsuffixed for a double constant.

The standard header file float.h defines the minimum and maximum values of the implementation's floating-point types float, double, and long double. It also defines other limits that are relevant to the processing of floating-point numbers.

1.1.4 Storage class specifiers

Every object has a storage class. This specifies most basically the storage duration, which may be static (default for global), automatic (default for local), or dynamic (allocated), together with other features (linkage and register hint). Dynamic storage is allocated and deallocated using the malloc() and free() library functions.

Variables declared within a block by default have automatic storage, as do those explicitly declared with the auto[2] or register storage class specifiers. The auto and register specifiers may only be used within functions and function argument declarations; as such, the auto specifier is always redundant. Objects declared outside of all blocks and those explicitly declared with the static storage class specifier have static storage duration. Static variables are initialized to zero by default by the compiler.

Objects with automatic storage are local to the block in which they were declared and are discarded when the block is exited. Additionally, objects declared with the register storage class may be given higher priority by the compiler for access to registers; although they may not actually be stored in registers, objects with this storage class may not be used with the address-of (&) unary operator.

Objects with static storage persist for the program's entire duration. In this way, the same object can be accessed by a function across multiple calls. Objects with allocated storage duration are created and destroyed explicitly with malloc, free, and related functions.

The extern storage class specifier indicates that the storage for an object has been defined elsewhere. When used inside a block, it indicates that the storage has been defined by a declaration outside of that block. When used outside of all blocks, it indicates that the storage has been defined outside of the compilation unit. The extern storage class specifier is redundant when used on a function declaration. It indicates that the declared function has been defined outside of the compilation unit.

Note that storage specifiers apply only to functions and objects; other things such as type and enum declarations are private to the compilation unit in which they appear. Types, on the other hand, have qualifiers (see below).

1.1.5 Type qualifiers

Main article: Type qualifier

Types can be qualified to indicate special properties of their data. The type qualifier const indicates that a value does not change once it has been initialized. Attempting to modify a const qualified value yields undefined behavior, so some C compilers store them in rodata or (for embedded systems) in read-only memory (ROM). The type qualifier volatile indicates to an optimizing compiler that it may not remove apparently redundant reads or writes, as the value may change even if it was not modified by any expression or statement, or multiple writes may be necessary, such as for memory-mapped I/O.

1.2 Incomplete types

An incomplete type is a structure or union type whose members have not yet been specified, an array type whose dimension has not yet been specified, or the void type (the void type cannot be completed). Such a type may not be instantiated (its size is not known), nor may its members be accessed (they, too, are unknown); however, the derived pointer type may be used (but not dereferenced).

They are often used with pointers, either as forward or external declarations. For instance, code could declare an incomplete type like this:

struct thing *pt;

This declares pt as a pointer to struct thing and the incomplete type struct thing. All pointers to structure types have the same size and representation regardless of which structure they point to, so this statement is valid by itself (as long as pt is not dereferenced). The incomplete type can be completed later in the same scope by redeclaring it:

struct thing { int num; }; /* thing struct type is now completed */

Incomplete types are used to implement recursive structures; the body of the type declaration may be deferred to later in the translation unit:

typedef struct Bert Bert;
typedef struct Wilma Wilma;
struct Bert { Wilma *wilma; };
struct Wilma { Bert *bert; };

Incomplete types are also used for data hiding; the incomplete type is defined in a header file, and the body only within the relevant source file.

1.3 Pointers

In declarations the asterisk modifier (*) specifies a pointer type. For example, where the specifier int would refer to the integer type, the specifier int* refers to the type "pointer to integer". Pointer values associate two pieces of information: a memory address and a data type. The following line of code declares a pointer-to-integer variable called ptr:

int *ptr;

1.3.1 Referencing

When a non-static pointer is declared, it has an unspecified value associated with it. The address associated with such a pointer must be changed by assignment prior to using it. In the following example, ptr is set so that it points to the data associated with the variable a:

int *ptr;
int a;
ptr = &a;

In order to accomplish this, the "address-of" operator (unary &) is used. It produces the memory location of the data object that follows.

1.3.2 Dereferencing

The pointed-to data can be accessed through a pointer value. In the following example, the integer variable b is set to the value of integer variable a, which is 10:

int *p;
int a, b;
a = 10;
p = &a;
b = *p;

In order to accomplish that task, the unary dereference operator, denoted by an asterisk (*), is used. It returns the data to which its operand (which must be of pointer type) points. Thus, the expression *p denotes the same value as a. Dereferencing a null pointer yields undefined behavior.

1.4 Arrays

1.4.1 Array definition

Arrays are used in C to represent structures of consecutive elements of the same type. The definition of a (fixed-size) array has the following syntax:

int array[100];

which defines an array named array to hold 100 values of the primitive type int. If declared within a function, the array dimension may also be a non-constant expression, in which case memory for the specified number of elements will be allocated. In most contexts in later use, a mention of the variable array is converted to a pointer to the first item in the array. The sizeof operator is an exception: sizeof array yields the size of the entire array (that is, 100 times the size of an int, and sizeof(array) / sizeof(int) will return 100). Another exception is the & (address-of) operator, which yields a pointer to the entire array, for example

int (*ptr_to_array)[100] = &array;

1.4.2 Accessing elements

The primary facility for accessing the values of the elements of an array is the array subscript operator. To access the i-indexed element of array, the syntax would be array[i], which refers to the value stored in that array element.

Array subscript numbering begins at 0 (see Zero-based indexing). The largest allowed array subscript is therefore equal to the number of elements in the array minus 1. To illustrate this, consider an array a declared as having 10 elements; the first element would be a[0] and the last element would be a[9].

C provides no facility for automatic bounds checking for array usage. Though logically the last subscript in an array of 10 elements would be 9, subscripts 10, 11, and so forth could accidentally be specified, with undefined results.

Due to arrays and pointers being interchangeable, the addresses of each of the array elements can be expressed in equivalent pointer arithmetic. The following table illustrates both methods for the existing array:

Element                First       Second          Third           nth
Array subscript        array[0]    array[1]        array[2]        array[n - 1]
Dereferenced pointer   *array      *(array + 1)    *(array + 2)    *(array + n - 1)

Since the expression a[i] is semantically equivalent to *(a+i), which in turn is equivalent to *(i+a), the expression can also be written as i[a], although this form is rarely used.

1.4.4 Dynamic arrays

Main article: C dynamic memory allocation

Arrays that can be resized dynamically can be produced with the help of the C standard library. The malloc function provides a simple method for allocating memory. It takes one parameter: the amount of memory to allocate in bytes. Upon successful allocation, malloc returns a generic (void) pointer value, pointing to the beginning of the allocated space. The pointer value returned is converted to an appropriate type implicitly by assignment. If the allocation could not be completed, malloc returns a null pointer. The following segment is therefore similar in function to a variable-length array declaration such as int a[n]:

#include <stdlib.h> /* declares malloc */
...
int *a;
a = malloc(n * sizeof(int));
a[3] = 10;

The result is a "pointer to int" variable (a) that points to the first of n contiguous int objects; due to array-pointer equivalence this can be used in place of an actual array name, as shown in the last line. The advantage in using this dynamic allocation is that the amount of memory that is allocated to it can be limited to what is actually needed at run time, and this can be changed as needed (using the standard library function realloc).

When the dynamically-allocated memory is no longer needed, it should be released back to the run-time system. This is done with a call to the free function. It takes a single parameter: a pointer to previously allocated memory. This is the value that was returned by a previous call to malloc. It is considered good practice to then set the pointer variable to NULL so that further attempts to access the memory to which it points will fail. If this is not done, the variable becomes a dangling pointer, and such errors in the code (or manipulations by an attacker) might be very hard to detect and lead to obscure and potentially dangerous malfunction caused by memory corruption.

free(a);
a = NULL;
1.4.3 Variable-length arrays

C99 standardised variable-length arrays (VLAs) within block scope. Such array variables are allocated based on an integer value computed at runtime upon entry to a block, and are deallocated at the end of the block.[3] As of C11 this feature is no longer required to be implemented by the compiler.

int n = ...;
int a[n];
a[3] = 10;

This syntax produces an array whose size is fixed until the end of the block.

1.4.5 Multidimensional arrays

In addition, C supports arrays of multiple dimensions, which are stored in row-major order. Technically, C multidimensional arrays are just one-dimensional arrays whose elements are arrays. The syntax for declaring multidimensional arrays is as follows:

int array2d[ROWS][COLUMNS];

where ROWS and COLUMNS are constants. This defines a two-dimensional array. Reading the subscripts from left to right, array2d is an array of length ROWS, each element of which is an array of COLUMNS integers.

To access an integer element in this multidimensional array, one would use

array2d[4][3]

Again, reading from left to right, this accesses the 5th row, and the 4th element in that row. The expression array2d[4] is an array, which we are then subscripting with [3] to access the fourth integer.

Higher-dimensional arrays can be declared in a similar manner.

A multidimensional array should not be confused with an array of references to arrays (also known as an Iliffe vector or sometimes an array of arrays). The former is always rectangular (all subarrays must be the same size), and occupies a contiguous region of memory. The latter is a one-dimensional array of pointers, each of which may point to the first element of a subarray in a different place in memory, and the sub-arrays do not have to be the same size. The latter can be created by multiple uses of malloc.

1.5 Strings

Main article: C string

In C, string constants (literals) are surrounded by double quotes ("), e.g. "Hello world!" and are compiled to an array of the specified char values with an additional null terminating character (0-valued) code to mark the end of the string.

String literals may not contain embedded newlines; this proscription somewhat simplifies parsing of the language. To include a newline in a string, the backslash escape \n may be used, as below.

There are several standard library functions for operating with string data (not necessarily constant) organized as array of char using this null-terminated format; see below.

C's string-literal syntax has been very influential, and has made its way into many other languages, such as C++, Objective-C, Perl, Python, PHP, Java, Javascript, C#, Ruby. Nowadays, almost all new languages adopt or build upon C-style string syntax. Languages that lack this syntax tend to precede C.

1.5.1 Backslash escapes

Main article: Escape sequences in C

If you wish to include a double quote inside the string, that can be done by escaping it with a backslash (\), for example, "This string contains \"double quotes\".". To insert a literal backslash, one must double it, e.g. "A backslash looks like this: \\".

Backslashes may be used to enter control characters, etc., into a string:

Escape   Meaning
\\       Literal backslash
\"       Double quote
\'       Single quote
\n       Newline (line feed)
\r       Carriage return
\b       Backspace
\t       Horizontal tab
\f       Form feed
\a       Alert (bell)
\v       Vertical tab
\?       Question mark (used to escape trigraphs)
\nnn     Character with octal value nnn
\xhh     Character with hexadecimal value hh

The use of other backslash escapes is not defined by the C standard, although compiler vendors often provide additional escape codes as language extensions.

1.5.2 String literal concatenation

C has string literal concatenation, meaning that adjacent string literals are concatenated at compile time; this allows long strings to be split over multiple lines, and also allows string literals resulting from C preprocessor defines and macros to be appended to strings at compile time:

printf(__FILE__ ": %d: Hello " "world\n", __LINE__);

will expand to

printf("helloworld.c" ": %d: Hello " "world\n", 10);

which is syntactically equivalent to

printf("helloworld.c: %d: Hello world\n", 10);

1.5.3 Character constants

Individual character constants are single-quoted, e.g. 'A', and have type int (in C++, char). The difference is that "A" represents a null-terminated array of two characters, 'A' and '\0', whereas 'A' directly represents the character value (65 if ASCII is used). The same backslash-escapes are supported as for strings, except that (of course) " can validly be used as a character without being escaped, whereas ' must now be escaped.

A character constant cannot be empty (i.e. '' is invalid syntax), although a string may be (it still has the null terminating character). Multi-character constants (e.g. 'xy') are valid, although rarely useful; they let one store several characters in an integer (e.g. 4 ASCII characters can fit in a 32-bit integer, 8 in a 64-bit one). Since the order in which the characters are packed into an int is not specified, portable use of multi-character constants is difficult.

1.5.4 Wide character strings

Since type char is 1 byte wide, a single char value typically can represent at most 256 distinct character codes, not nearly enough for all the characters in use worldwide. To provide better support for international characters, the first C standard (C89) introduced wide characters (encoded in type wchar_t) and wide character strings, which are written as L"Hello world!"

Wide characters are most commonly either 2 bytes (using a 2-byte encoding such as UTF-16) or 4 bytes (usually UTF-32), but Standard C does not specify the width for wchar_t, leaving the choice to the implementor. Microsoft Windows generally uses UTF-16, thus the above string would be 26 bytes long for a Microsoft compiler; the Unix world prefers UTF-32, thus compilers such as GCC would generate a 52-byte string. A 2-byte wide wchar_t suffers the same limitation as char, in that certain characters (those outside the BMP) cannot be represented in a single wchar_t, but must be represented using surrogate pairs.

The original C standard specified only minimal functions for operating with wide character strings; in 1995 the standard was modified to include much more extensive support, comparable to that for char strings. The relevant functions are mostly named after their char equivalents, with the addition of a "w" or the replacement of "str" with "wcs"; they are specified in <wchar.h>, with <wctype.h> containing wide-character classification and mapping functions.

The now generally recommended method[5] of supporting international characters is through UTF-8, which is stored in char arrays, and can be written directly in the source code if using a UTF-8 editor, because UTF-8 is a direct ASCII extension.

1.5.5 Variable width strings

A common alternative to wchar_t is to use a variable-width encoding, whereby a logical character may extend over multiple positions of the string. Variable-width strings may be encoded into literals verbatim, at the risk of confusing the compiler, or using numerical backslash escapes (e.g. "\xc3\xa9" for "é" in UTF-8). The UTF-8 encoding was specifically designed (under Plan 9) for compatibility with the standard library string functions; supporting features of the encoding include a lack of embedded nulls, no valid interpretations for subsequences, and trivial resynchronisation. Encodings lacking these features are likely to prove incompatible with the standard library functions; encoding-aware string functions are often used in such cases.

1.5.6 Library functions

Strings, both constant and variable, can be manipulated without using the standard library. However, the library contains many useful functions for working with null-terminated strings.

1.6 Structures and unions

1.6.1 Structures

Structures and unions in C are defined as data containers consisting of a sequence of named members of various types. They are similar to records in other programming languages. The members of a structure are stored in consecutive locations in memory, although the compiler is allowed to insert padding between or after members (but not before the first member) for efficiency or as padding required for proper alignment by the target architecture. The size of a structure is equal to the sum of the sizes of its members, plus the size of the padding.

1.6.2 Unions

Unions in C are related to structures and are defined as objects that may hold (at different times) objects of different types and sizes. They are analogous to variant records in other programming languages. Unlike structures, the components of a union all refer to the same location in memory. In this way, a union can be used at various times to hold different types of objects, without the need to create a separate object for each new type. The size of a union is equal to the size of its largest component type.

1.6.3 Declaration

Structures are declared with the struct keyword and unions are declared with the union keyword. The specifier keyword is followed by an optional identifier name, which is used to identify the form of the structure or union. The identifier is followed by the declaration of the structure or union's body: a list of member declarations, contained within curly braces, with each declaration terminated by a semicolon. Finally, the declaration concludes with an optional list of identifier names, which are declared as instances of the structure or union.

For example, the following statement declares a structure named s that contains three members; it will also declare an instance of the structure known as tee:

struct s { int x; float y; char *z; } tee;

And the following statement will declare a similar union named u and an instance of it named n:

union u { int x; float y; char *z; } n;

Members of structures and unions cannot have an incomplete or function type. Thus members cannot be an instance of the structure or union being declared (because it is incomplete at that point) but can be pointers to the type being declared.

Once a structure or union body has been declared and given a name, it can be considered a new data type using the specifier struct or union, as appropriate, and the name. For example, the following statement, given the above structure declaration, declares a new instance of the structure s named r:

struct s r;

It is also common to use the typedef specifier to eliminate
the need for the struct or union keyword in later references to the structure. The first identifier after the body of the structure is taken as the new name for the structure type (structure instances may not be declared in this context). For example, the following statement will declare a new type known as s_type that will contain some structure:

typedef struct { ... } s_type;

Future statements can then use the specifier s_type (instead of the expanded struct specifier) to refer to the structure.

1.6.4 Accessing members

Members are accessed using the name of the instance of a structure or union, a period (.), and the name of the member. For example, given the declaration of tee from above, the member known as y (of type float) can be accessed using the following syntax:

tee.y

Structures are commonly accessed through pointers. Consider the following example that defines a pointer to tee, known as ptr_to_tee:

struct s *ptr_to_tee = &tee;

Member y of tee can then be accessed by dereferencing ptr_to_tee and using the result as the left operand:

(*ptr_to_tee).y

Which is identical to the simpler tee.y above as long as ptr_to_tee points to tee. Due to operator precedence ("." being higher than "*"), the shorter *ptr_to_tee.y is incorrect for this purpose, instead being parsed as *(ptr_to_tee.y), and thus the parentheses are necessary. Because this operation is common, C provides an abbreviated syntax for accessing a member directly from a pointer. With this syntax, the name of the instance is replaced with the name of the pointer and the period is replaced with the character sequence ->. Thus, the following method of accessing y is identical to the previous two:

ptr_to_tee->y

Members of unions are accessed in the same way.

This can be chained; for example, in a linked list, one may refer to n->next->next for the second following node (assuming that n->next is not null).

1.6.5 Assignment

Assigning values to individual members of structures and unions is syntactically identical to assigning values to any other object. The only difference is that the lvalue of the assignment is the name of the member, as accessed by the syntax mentioned above.

A structure can also be assigned as a unit to another structure of the same type. Structures (and pointers to structures) may also be used as function parameter and return types.

For example, the following statement assigns the value of 74 (the ASCII code point for the letter 'J') to the member named x in the structure tee, from above:

tee.x = 74;

And the same assignment, using ptr_to_tee in place of tee, would look like:

ptr_to_tee->x = 74;

Assignment with members of unions is identical.

1.6.6 Other operations

According to the C standard, the only legal operations that can be performed on a structure are copying it, assigning to it as a unit (or initializing it), taking its address with the address-of (&) unary operator, and accessing its members. Unions have the same restrictions. One of the operations implicitly forbidden is comparison: structures and unions cannot be compared using C's standard comparison facilities (==, >, <, etc.).

1.6.7 Bit fields

C also provides a special type of structure member known as a bit field, which is an integer with an explicitly specified number of bits. A bit field is declared as a structure member of type int, signed int, unsigned int, or _Bool, following the member name by a colon (:) and the number of bits it should occupy. The total number of bits in a single bit field must not exceed the total number of bits in its declared type.

As a special exception to the usual C syntax rules, it is implementation-defined whether a bit field declared as type int, without specifying signed or unsigned, is signed or unsigned. Thus, it is recommended to explicitly specify signed or unsigned on all structure members for portability.

Unnamed fields consisting of just a colon followed by a number of bits are also allowed; these indicate padding. Specifying a width of zero for an unnamed field is used to force alignment to a new word.[6]

The members of bit fields do not have addresses, and as such cannot be used with the address-of (&) unary operator. The sizeof operator may not be applied to bit fields.

The following declaration declares a new structure type known as f and an instance of it known as g. Comments provide a description of each of the members:

struct f {
    unsigned int flag : 1; /* a bit flag: can either be on (1) or off (0) */
    signed int num : 4;    /* a signed 4-bit field; range -7...7 or -8...7 */
    signed int : 3;        /* 3 bits of padding to round out to 8 bits */
} g;

1.7 Initialization

Default initialization depends on the storage class specifier, described above.

Because of the language's grammar, a scalar initializer may be enclosed in any number of curly brace pairs. Most compilers issue a warning if there is more than one such pair, though.

int x = 12;
int y = { 23 }; //Legal, no warning
int z = { { 34 } }; //Legal, expect a warning

Structures, unions and arrays can be initialized in their declarations using an initializer list. Unless designators are used, the components of an initializer correspond with the elements in the order they are defined and stored, thus all preceding values must be provided before any particular element's value. Any unspecified elements are set to zero (except for unions). Mentioning too many initialization values yields an error.

In C89, a union was initialized with a single value applied to its first member. That is, the union u defined above could only have its int x member initialized:

union u value = { 3 };

Using a designated initializer, the member to be initialized does not have to be the first member:

union u value = { .y = 3.1415 };

If an array has unknown size (i.e. the array was an incomplete type), the number of initializers determines the size of the array and its type becomes complete:

int x[] = { 0, 1, 2 } ;

Compound designators can be used to provide explicit initialization when unadorned initializer lists might be misunderstood. In the example below, w is declared as an array of structures, each structure consisting of a member a (an array of 3 int) and a member b (an int). The initializer sets the size of w to 2 and sets the values of the first element of each a:
The following statement will initialize a new instance of
the structure s known as pi:

struct { int a[3], b; } w[] = { [0].a = {1}, [1].a[0] = 2 };


This is equivalent to:
struct { int a[3], b; } w[] = { { { 1, 0, 0 }, 0 }, { { 2, 0, 0
}, 0 } };
There is no way to specify repetition of an initializer in
standard C.

struct s { int x; oat y; char *z; }; struct s pi = { 3,


3.1415, Pi };
1.7.2 Compound literals

1.7.1

Designated initializers

It is possible to borrow the initialization methodology to


generate compound structure and array literals:

int* ptr; ptr = (int[]){ 10, 20, 30, 40 }; struct s pi; pi =


Designated initializers allow members to be initialized by (struct s){ 3, 3.1415, Pi };
name, in any order, and without explicitly providing the
preceding values. The following initialization is equivaCompound literals are often combined with designated
lent to the previous one:
initializers to make the declaration more readable:[3]
struct s pi = { .z = Pi, .x = 3, .y = 3.1415 };
pi = (struct s){ .z = Pi, .x = 3, .y = 3.1415 };
Using a designator in an initializer moves the initialization
cursor. In the example below, if MAX is greater than
10, there will be some zero-valued elements in the middle
of a; if it is less than 10, some of the values provided by
the rst ve initializers will be overridden by the second 2 Operators
ve (if MAX is less than 5, there will be a compilation
error):
Main article: Operators in C and C++
int a[MAX] = { 1, 3, 5, 7, 9, [MAX-5] = 8, 6, 4, 2, 0 };

3 Control structures

C is a free-form language.

Bracing style varies from programmer to programmer and can be the
subject of debate. See Indent style for more details.

3.1 Compound statements

In the items in this section, any <statement> can be replaced with a
compound statement. Compound statements have the form:

    { <optional-declaration-list> <optional-statement-list> }

and are used as the body of a function or anywhere that a single
statement is expected. The declaration-list declares variables to be
used in that scope, and the statement-list are the actions to be
performed. Braces define their own scope, and automatic variables
defined inside those braces will be automatically deallocated at the
closing brace. Since C99, declarations and statements can be freely
intermixed within a compound statement (as in C++).

3.2 Selection statements

C has two types of selection statements: the if statement and the switch
statement.

The if statement is in the form:

    if (<expression>) <statement1> else <statement2>

In the if statement, if the <expression> in parentheses is nonzero
(true), control passes to <statement1>. If the else clause is present
and the <expression> is zero (false), control will pass to <statement2>.
The else <statement2> part is optional and, if absent, a false
<expression> will simply result in skipping over the <statement1>. An
else always matches the nearest previous unmatched if; braces may be
used to override this when necessary, or for clarity.

The switch statement causes control to be transferred to one of several
statements depending on the value of an expression, which must have
integral type. The substatement controlled by a switch is typically
compound. Any statement within the substatement may be labeled with one
or more case labels, which consist of the keyword case followed by a
constant expression and then a colon (:). The syntax is as follows:

    switch (<expression>) {
        case <label1> :
            <statements 1>
        case <label2> :
            <statements 2>
            break;
        default :
            <statements 3>
    }

No two of the case constants associated with the same switch may have
the same value. There may be at most one default label associated with a
switch. If none of the case labels are equal to the expression in the
parentheses following switch, control passes to the default label or, if
there is no default label, execution resumes just beyond the entire
construct.

Switches may be nested; a case or default label is associated with the
innermost switch that contains it. Switch statements can "fall through",
that is, when one case section has completed its execution, statements
will continue to be executed downward until a break; statement is
encountered. Fall-through is useful in some circumstances, but is
usually not desired. In the preceding example, if <label2> is reached,
the statements <statements 2> are executed and nothing more inside the
braces. However, if <label1> is reached, both <statements 1> and
<statements 2> are executed since there is no break to separate the two
case statements.

It is possible, although unusual, to insert the switch labels into the
sub-blocks of other control structures. Examples of this include Duff's
device and Simon Tatham's implementation of coroutines in PuTTY.[7]

3.3 Iteration statements

C has three forms of iteration statement:

    do <statement> while ( <expression> ) ;
    while ( <expression> ) <statement>
    for ( <expression> ; <expression> ; <expression> ) <statement>

In the while and do statements, the sub-statement is executed repeatedly
so long as the value of the expression remains non-zero (equivalent to
true). With while, the test, including all side effects from
<expression>, occurs before each iteration (execution of <statement>);
with do, the test occurs after each iteration. Thus, a do statement
always executes its sub-statement at least once, whereas while may not
execute the sub-statement at all.

The statement:

    for (e1; e2; e3) s;

is equivalent to:

    e1; while (e2) { s; cont: e3; }

except for the behaviour of a continue; statement (which in the for loop
jumps to e3 instead of e2). If e2 is blank, it would have to be replaced
with a 1.

Any of the three expressions in the for loop may be omitted. A missing
second expression makes the while test always non-zero, creating a
potentially infinite loop.

Since C99, the first expression may take the form of a declaration,
typically including an initializer, such as:

    for (int i = 0; i < limit; ++i) {
        // ...
    }

When the first expression of a for statement is a declaration, the
declaration's scope is limited to the extent of the for loop.

3.4 Jump statements

Jump statements transfer control unconditionally. There are four types
of jump statements in C: goto, continue, break, and return.

The goto statement looks like this:

    goto <identifier> ;

The identifier must be a label (followed by a colon) located in the
current function. Control transfers to the labeled statement.

A continue statement may appear only within an iteration statement and
causes control to pass to the loop-continuation portion of the innermost
enclosing iteration statement. That is, within each of the statements

    while (expression) {
        /* ... */
        cont: ;
    }

    do {
        /* ... */
        cont: ;
    } while (expression);

    for (expr1; expr2; expr3) {
        /* ... */
        cont: ;
    }

a continue not contained within a nested iteration statement is the same
as goto cont.

The break statement is used to end a for loop, while loop, do loop, or
switch statement. Control passes to the statement following the
terminated statement.

A function returns to its caller by the return statement. When return is
followed by an expression, the value is returned to the caller as the
value of the function. Encountering the end of the function is
equivalent to a return with no expression. In that case, if the function
is declared as returning a value and the caller tries to use the
returned value, the result is undefined.

3.4.1 Storing the address of a label

GCC extends the C language with a unary && operator that returns the
address of a label. This address can be stored in a void* variable type
and may be used later in a goto instruction. For example, the following
prints "hi " in an infinite loop:

    void *ptr = &&J1;
    J1: printf("hi ");
    goto *ptr;

This feature can be used to implement a jump table.

4 Functions

4.1 Syntax

A C function definition consists of a return type (void if no value is
returned), a unique name, a list of parameters in parentheses, and
various statements:

    <return-type> functionName( <parameter-list> )
    {
        <statements>
        return <expression of type return-type>;
    }

A function with non-void return type should include at least one return
statement. The parameters are given by the <parameter-list>, a
comma-separated list of parameter declarations, each item in the list
being a data type followed by an identifier: <data-type>
<variable-identifier>, <data-type> <variable-identifier>, ....

If there are no parameters, the <parameter-list> may be left empty or
optionally be specified with the single word void.

It is possible to define a function as taking a variable number of
parameters by providing an ellipsis (...) as the last parameter instead
of a data type and variable identifier. A commonly used function that
does this is the standard library function printf, which has the
declaration:

    int printf (const char*, ...);

Manipulation of these parameters can be done by using the routines in
the standard library header <stdarg.h>.

4.1.1 Function Pointers

A pointer to a function can be declared as follows:

    <return-type> (*<function-name>)(<parameter-list>);

The following program shows use of a function pointer for selecting
between addition and subtraction:

    #include <stdio.h>

    int (*operation)(int x, int y);

    int add(int x, int y) { return x + y; }

    int subtract(int x, int y) { return x - y; }

    int main(int argc, char* args[])
    {
        int foo = 1, bar = 1;

        operation = add;
        printf("%d + %d = %d\n", foo, bar, operation(foo, bar));
        operation = subtract;
        printf("%d - %d = %d\n", foo, bar, operation(foo, bar));
        return 0;
    }

4.2 Global structure

After preprocessing, at the highest level a C program consists of a
sequence of declarations at file scope. These may be partitioned into
several separate source files, which may be compiled separately; the
resulting object modules are then linked along with
implementation-provided run-time support modules to produce an
executable image.

The declarations introduce functions, variables and types.

C functions are akin to the subroutines of Fortran or the procedures of
Pascal.

A definition is a special type of declaration. A variable definition
sets aside storage and possibly initializes it, a function definition
provides its body.

An implementation of C providing all of the standard library functions
is called a hosted implementation. Programs written for hosted
implementations are required to define a special function called main,
which is the first function called when execution of the program begins.

Hosted implementations start program execution by invoking the main
function, which must be defined following one of these prototypes:

    int main() {...}
    int main(void) {...}
    int main(int argc, char *argv[]) {...}
    int main(int argc, char **argv) {...}

The first two definitions are equivalent (and both are compatible with
C++). It is probably up to individual preference which one is used (the
current C standard contains two examples of main() and two of
main(void), but the draft C++ standard uses main()). The return value of
main (which should be int) serves as termination status returned to the
host environment.

The C standard defines return values 0 and EXIT_SUCCESS as indicating
success and EXIT_FAILURE as indicating failure. (EXIT_SUCCESS and
EXIT_FAILURE are defined in <stdlib.h>.) Other return values have
implementation-defined meanings; for example, under Linux a program
killed by a signal yields a return code of the numerical value of the
signal plus 128.

A minimal correct C program consists of an empty main routine, taking no
arguments and doing nothing:

    int main(void){}

Because no return statement is present, main returns 0 on exit.[3] (This
is a special-case feature introduced in C99 that applies only to main.)

The main function will usually call other functions to help it perform
its job.

Some implementations are not hosted, usually because they are not
intended to be used with an operating system. Such implementations are
called free-standing in the C standard. A free-standing implementation
is free to specify how it handles program startup; in particular it need
not require a program to define a main function.

Functions may be written by the programmer or provided by existing
libraries. Interfaces for the latter are usually declared by including
header files (with the #include preprocessing directive) and the library
objects are linked into the final executable image. Certain library
functions, such as printf, are defined by the C standard; these are
referred to as the standard library functions.

A function may return a value to caller (usually another C function, or
the hosting environment for the function main). The printf function
mentioned above returns how many characters were printed, but this value
is often ignored.

4.3 Argument passing

In C, arguments are passed to functions by value while other languages
may pass variables by reference. This means that the receiving function
gets copies of the values and has no direct way of altering the original
variables. For a function to alter a variable passed from another
function, the caller must pass its address (a pointer to it), which can
then be dereferenced in the receiving function. See Pointers for more
information.

    void incInt(int *y)
    {
        (*y)++;  // Increase the value of 'x', in 'main' below, by one
    }

    int main(void)
    {
        int x = 0;
        incInt(&x);  // pass the address of the variable 'x'
        return 0;
    }

The function scanf works the same way:

    int x;
    scanf("%d", &x);

In order to pass an editable pointer to a function (such as for the
purpose of returning an allocated array to the calling code) you have to
pass a pointer to that pointer: its address.

    #include <stdio.h>
    #include <stdlib.h>

    void allocate_array(int ** const a_p, const int A)
    {
        /* allocate array of A ints;
           assigning to *a_p alters the 'a' in main() */
        *a_p = malloc(sizeof(int) * A);
    }

    int main(void)
    {
        int * a;  /* create a pointer to one or more ints, this will be the array */
        /* pass the address of 'a' */
        allocate_array(&a, 42);
        /* 'a' is now an array of length 42 and can be manipulated and freed here */
        free(a);
        return 0;
    }

The parameter int **a_p is a pointer to a pointer to an int, which is
the address of the pointer a defined in the main function in this case.

4.3.1 Array parameters

Function parameters of array type may at first glance appear to be an
exception to C's pass-by-value rule. The following program will print 2,
not 1:

    #include <stdio.h>

    void setArray(int array[], int index, int value)
    {
        array[index] = value;
    }

    int main(void)
    {
        int a[1] = {1};
        setArray(a, 0, 2);
        printf ("a[0]=%d\n", a[0]);
        return 0;
    }

However, there is a different reason for this behavior. In fact, a
function parameter declared with an array type is treated like one
declared to be a pointer. That is, the preceding declaration of setArray
is equivalent to the following:

    void setArray(int *array, int index, int value)

At the same time, C rules for the use of arrays in expressions cause the
value of a in the call to setArray to be converted to a pointer to the
first element of array a. Thus, in fact this is still an example of
pass-by-value, with the caveat that it is the address of the first
element of the array being passed by value, not the contents of the
array.

5 Miscellaneous

5.1 Reserved keywords

The following words are reserved, and may not be used as identifiers:

Implementations may reserve other keywords, such as asm, although
implementations typically provide non-standard keywords that begin with
one or two underscores.

5.2 Case sensitivity

C identifiers are case sensitive (e.g., foo, FOO, and Foo are the names
of different objects). Some linkers may map external identifiers to a
single case, although this is uncommon in most modern linkers.

5.3 Comments

Text starting with /* is treated as a comment and ignored. The comment
ends at the next */; it can occur within expressions, and can span
multiple lines. Accidental omission of the comment terminator is
problematic in that the next comment's properly constructed comment
terminator will be used to terminate the initial comment, and all code
in between the comments will be considered as a comment. C-style
comments do not nest; that is, accidentally placing a comment within a
comment has unintended results:

    1 /*
    2 This line will be ignored.
    3 /*
    4 A compiler warning may be produced here. These lines will also be ignored.
    5 The comment opening token above did not start a new comment,
    6 and the comment closing token below will close the comment begun on line 1.
    7 */
    8 This line and the line below it will not be ignored. Both will likely produce compile errors.
    9 */

C++ style line comments start with // and extend to the end of the line.
This style of comment originated in BCPL and became valid C syntax in
C99; it is not available in the original K&R C nor in ANSI C:

    // this line will be ignored by the compiler

    /* these lines
       will be ignored by the compiler */

    x = *p/*q;  /* this comment starts after the 'p' */

5.4 Command-line arguments

The parameters given on a command line are passed to a C program with
two predefined variables: the count of the command-line arguments in
argc and the individual arguments as character strings in the pointer
array argv. So the command:

    myFilt p1 p2 p3

results in something like:

While individual strings are arrays of contiguous characters, there is
no guarantee that the strings are stored as a contiguous group.

The name of the program, argv[0], may be useful when printing diagnostic
messages or for making one binary serve multiple purposes. The
individual values of the parameters may be accessed with argv[1],
argv[2], and argv[3], as shown in the following program:

    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int i;

        printf ("argc\t= %d\n", argc);
        for (i = 0; i < argc; i++)
            printf ("argv[%i]\t= %s\n", i, argv[i]);
        return 0;
    }

5.5 Evaluation order

In any reasonably complex expression, there arises a choice as to the
order in which to evaluate the parts of the expression: (1+1)+(3+3) may
be evaluated in the order (1+1)+(3+3), (2)+(3+3), (2)+(6), (8), or in
the order (1+1)+(3+3), (1+1)+(6), (2)+(6), (8). Formally, a conforming C
compiler may evaluate expressions in any order between sequence points
(this allows the compiler to do some optimization). Sequence points are
defined by:

• Statement ends at semicolons.
• The sequencing operator: a comma. However, commas that delimit
  function arguments are not sequence points.
• The short-circuit operators: logical and (&&, which can be read
  "and then") and logical or (||, which can be read "or else").
• The ternary operator (?:): This operator evaluates its first
  sub-expression first, and then its second or third (never both of
  them) based on the value of the first.
• Entry to and exit from a function call (but not between evaluations of
  the arguments).

Expressions before a sequence point are always evaluated before those
after a sequence point. In the case of short-circuit evaluation, the
second expression may not be evaluated depending on the result of the
first expression. For example, in the expression (a() || b()), if the
first argument evaluates to nonzero (true), the result of the entire
expression cannot be anything else than true, so b() is not evaluated.
Similarly, in the expression (a() && b()), if the first argument
evaluates to zero (false), the result of the entire expression cannot be
anything else than false, so b() is not evaluated.
The arguments to a function call may be evaluated in any order, as long
as they are all evaluated by the time the function is entered. The
following expression, for example, has undefined behavior:

    printf("%s %s\n", argv[i = 0], argv[++i]);

5.6 Undefined behavior

Main article: Undefined behavior

An aspect of the C standard (not unique to C) is that the behavior of
certain code is said to be "undefined". In practice, this means that the
program produced from this code can do anything, from working as the
programmer intended, to crashing every time it is run.

For example, the following code produces undefined behavior, because the
variable b is modified more than once with no intervening sequence
point:

    #include <stdio.h>

    int main(void)
    {
        int a, b = 1;

        a = b++ + b++;
        printf("%d\n", a);
        return 0;
    }

Because there is no sequence point between the modifications of b in
"b++ + b++", it is possible to perform the evaluation steps in more than
one order, resulting in an ambiguous statement. This can be fixed by
rewriting the code to insert a sequence point in order to enforce an
unambiguous behavior, for example:

    a = b++;
    a += b++;

6 See also

• Blocks (C language extension)
• C programming language
• C variable types and declarations
• Operators in C and C++

7 References

[1] The long long modifier was introduced in the C99 standard.
[2] The meaning of auto is a type specifier rather than a storage class
    specifier in C++0x.
[3] Klemens, Ben (2012). 21st Century C. O'Reilly Media. ISBN 1449327141.
[4] Balagurusamy, E. Programming in ANSI C. Tata McGraw Hill. p. 366.
[5] See UTF-8 first section for references.
[6] Kernighan & Ritchie
[7] Tatham, Simon (2000). Coroutines in C.
    http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

General

• Kernighan, Brian W.; Ritchie, Dennis M. (1988). The C Programming
  Language (2nd ed.). Upper Saddle River, New Jersey: Prentice Hall PTR.
  ISBN 0-13-110370-9.
• American National Standard for Information Systems - Programming
  Language - C - ANSI X3.159-1989

8 External links

• The syntax of C in Backus-Naur form
• Programming in C
• The comp.lang.c Frequently Asked Questions Page
• Storage Classes

9 Text and image sources, contributors, and licenses

9.1 Text

C syntax Source: https://en.wikipedia.org/wiki/C_syntax?oldid=669435856 Contributors: The Anome, Arvindn, Edward, TakuyaMurata, MichaelJanich, Mxn, Charles Matthews, Dcoetzee, Bevo, Jni, Robbot, RedWolf, Tobycat, Cutler, Tobias Bergemann, Kevin Sa,
Ds13, JF Bastien, Darrien, Neilc, Vadmium, Urhixidur, Joyous!, Andreas Kaufmann, Abdull, Adashiel, Corti, KneeLess, Qutezuce,
Spoon!, Cmdrjameson, Jonsafari, Mick8882003, B k, ABCD, Rotring, SteinbDJ, Kbolino, BillC, MattGiuca, Hdante, Toussaint, Btyner,
Tbird20d, TheIncredibleEdibleOompaLoompa, StuartBrady, RobertG, Akihabara, Crazycomputers, Fresheneesz, YurikBot, Pip2andahalf,
Chungyan5, Cleared as led, Nick, Mikeblas, LeoNerd, Daghall, Flipjargendy, Ms2ger, TomJF, Alex Ruddick, Dkasak, Myrdred, BiT,
Marc Kupper, Papa November, Nbarth, Abaddon314159, RedHillian, BIL, Ritchie333, Cybercobra, Matt.forestpath, Acdx, Gennaro
Prota, Derek farn, HeroTsai, CPMcE, Loadmaster, Hvn0413, EdC~enwiki, JoeBot, SkyWalker, Ahy1, AndrewHowse, Cydebot, Oerjan,
Trevyn, DmitTrix, WinBot, Widefox, Seaphoto, Billyoneal, MER-C, IanOsgood, DervishD, Tedickey, DAGwyn, ANONYMOUS COWARD0xC0DE, Gwern, R'n'B, Nono64, Pfaben, Dhrubnarayan, Flyer22, Fratrep, ClueBot, Scrumplesplunge, Iandiver, WestwoodMatt,
Frederico1234, Johnuniq, MystBot, Addbot, Btx40, Jarble, Luckas-bot, Yobot, AnomieBOT, 1exec1, RandomAct, Kristjan.Jonasson,
Sae1962, Miklcct, LittleWink, Trappist the monk, Vrenator, Der schiefe Turm, Keegscee, EmausBot, John of Reading, Ida Shaw, Sbmeirow, Ptsneves, Mobywfm, Gustedt, 28bot, ClueBot NG, ScottSteiner, Visakhmr, Widr, Helpful Pixie Bot, GKFX, AshishDandekar7,
Kurrahee, Skunk44, Jimw338, Epicgenius, Xiaokaoy and Anonymous: 137

9.2 Images

• File:Folder_Hexagonal_Icon.svg Source: https://upload.wikimedia.org/wikipedia/en/4/48/Folder_Hexagonal_Icon.svg License: Cc-by-sa-3.0 Contributors: ? Original artist: ?
• File:Question_book-new.svg Source: https://upload.wikimedia.org/wikipedia/en/9/99/Question_book-new.svg License: Cc-by-sa-3.0 Contributors: Created from scratch in Adobe Illustrator. Based on Image:Question book.png created by User:Equazcion Original artist: Tkgd2007
• File:Text_document_with_red_question_mark.svg Source: https://upload.wikimedia.org/wikipedia/commons/a/a4/Text_document_with_red_question_mark.svg License: Public domain Contributors: Created by bdesham with Inkscape; based upon Text-x-generic.svg from the Tango project. Original artist: Benjamin D. Esham (bdesham)

9.3 Content license

Creative Commons Attribution-Share Alike 3.0
