
WRITING OPTIMIZED C CODE FOR MICROCONTROLLER APPLICATIONS

By Wilson Chan
Toshiba America Electronics Components, Inc.
Email: wilson.chan@taec.toshiba.com

INTRODUCTION
If you have a microcontroller project that requires a small program, or the
application has very limited memory resources, you may prefer to program in
Assembly language. Nowadays, as the performance of microcontrollers has been
improving, application systems have become larger and more complicated. As a result,
programs can no longer be coded in Assembly language easily. To improve development
efficiency, many microcontroller based products are programmed in C. Generally, when
programs are written in C and compiled by a C compiler, the code efficiency decreases
compared to an Assembly language program. In order to improve code efficiency, most
C compilers make use of optimization techniques. Often, the output code is optimized
for size, or execution speed, or both. Besides relying on the C compiler to generate
efficient code, the programmer can lend a helping hand to the C compiler by adopting
certain programming styles. This paper provides an overview of common optimizing
techniques used by C compilers and recommends C programming guidelines that will
result in optimized code for microcontroller applications.
PROGRAMMING MODEL
Some microcontrollers do not have hardware support for a C stack. If you plan to
develop your embedded applications in C, you should select a microcontroller with a
stack-based architecture. If the microcontroller has dedicated address-specifying/index
registers, they will also help the C compiler to generate more efficient code.
In this paper, we'll use a C compiler for a microcontroller which has a
programming model as shown in Figure 1 to illustrate the effect of various optimization
methods on the quality of the generated machine code. The W, A, B, C, D, E, H, L
registers are 8-bit general purpose registers. They can be used in pairs as four 16-bit
general purpose registers: WA, BC, DE and HL. The IX and IY registers are
special-purpose 16-bit registers used as address-specifying registers under register indirect
addressing mode and as index registers under index addressing mode. The SP register is
a 16-bit stack pointer. The PC register is a 16-bit program counter. The PSW is a 16-bit
program status word register. JF is the jump status flag, ZF is the zero flag, CF is the
carry flag, HF is the half carry flag, SF is the sign flag, and VF is the overflow flag.

General-purpose registers (8 bit x 8):

    W   A
    B   C
    D   E
    H   L

Special-purpose registers (16 bit):

    IX, IY, SP, PC

PSW = JF, ZF, CF, HF, SF, VF

Figure 1 Programming Model

COMMON OPTIMIZATIONS IN C COMPILERS


This section describes common optimization techniques often found in optimizing
C compilers.
Convolution and Propagation of Constants
This optimization method, more commonly known as constant folding and constant
propagation, propagates constants in place of variables whenever possible and
computes any constant expression at compile time rather than at run time.
Consider the C program example in Figure 2. For the C statement in line 3, the C
compiler computes the addition of two constants at compile time rather than at execution
time. Similarly, for C statements from lines 4 to 6, the constant in line 4 is propagated to
lines 5 and 6. This optimization technique reduces program size and increases execution
speed.
C Language Program
    1 int i, a, b, c;
    2 test(){
    3     i = 1 + 2;
    4     a = 1;
    5     b = a + 2;
    6     c = a + b;
    7 }

Without Optimization
    _test:
        ld    WA,0x1
        ld    BC,0x2
        add   WA,BC
        ld    (_i),WA
        ld    WA,0x1
        ld    IY,_a
        ld    (IY),WA
        inc   WA
        inc   WA
        ld    IX,_b
        ld    (IX),WA
        ld    WA,(IY)
        add   WA,(IX)
        ld    (_c),WA
        ret

With Optimization
    _test:
        ld    WA,0x3
        ld    (_i),WA
        ld    WA,0x1
        ld    (_a),WA
        ld    WA,0x3
        ld    (_b),WA
        ld    WA,0x4
        ld    (_c),WA
        ret

Figure 2 Convolution and Propagation of Constants
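As a hedged illustration of what this means for source code, the sketch below derives a timer reload value from named constants. The clock frequency, prescaler, tick rate and the function name timer_reload_value are hypothetical and not taken from any particular device. The point is only that an expression built entirely from constants can be evaluated by the compiler, so readable symbolic constants need not cost anything at run time.

```c
#include <stdint.h>

/* Hypothetical hardware parameters, chosen only for illustration. */
#define CPU_CLOCK_HZ    8000000UL
#define TIMER_PRESCALE  64UL
#define TICK_RATE_HZ    1000UL

/* Every operand is a constant, so a compiler that folds constants
   reduces the whole expression to a single immediate value. */
#define TIMER_RELOAD \
    ((uint16_t)(CPU_CLOCK_HZ / TIMER_PRESCALE / TICK_RATE_HZ))

/* Returns the precomputed reload value; no division is needed at run time
   when the compiler folds the constant expression. */
uint16_t timer_reload_value(void)
{
    return TIMER_RELOAD;
}
```

With these assumed values the expression works out to 8000000 / 64 / 1000 = 125.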

Dead-Code Elimination
This optimization method deletes code that has no effect on the program's result,
such as an assignment to a variable that is never read afterwards. Consider the
C program example in Figure 3. With dead-code elimination optimization, the C
compiler eliminates the C statement in line 3.
C Language Program
    1 int test(){
    2     int a;
    3     a = 1;
    4     return 0;
    5 }

Without Optimization
    _test:
        ld    WA,0x1
        xor   WA,WA
        ret

With Optimization
    _test:
        xor   WA,WA
        ret

Figure 3 Dead-Code Elimination

Strength Reduction
This optimization method replaces expensive operations with less expensive ones.
Consider the C program example in Figure 4. Multiplying by 2 is done most
efficiently with a left shift rather than an integer multiplication. Without
optimization, the generated code calls a multiplication function supplied by the C
run-time library, which takes much longer than a left-shift operation.
C Language Program
    int i;
    test() {
        i *= 2;
    }

Without Optimization
    _test:
        ld    BC,0x2
        ld    WA,(_i)
        cal   C87C_muli
        ld    (_i),WA
        ret

With Optimization
    _test:
        ld    IY,_i
        ld    WA,(IY)
        shlca WA
        ld    (IY),WA
        ret

Figure 4 Strength Reduction
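The same idea can be applied by hand when a compiler does not do it. The following is a minimal sketch with illustrative function names that are not part of the original paper. It uses unsigned operands, for which a left shift by n is defined as multiplication by 2^n (modulo the width of the type).

```c
/* Multiplication by a power of two: an optimizing compiler is expected
   to emit a shift for this, as in Figure 4. */
unsigned int times_two(unsigned int i)
{
    return i * 2u;
}

/* The same operation written explicitly as a left shift. */
unsigned int times_two_shift(unsigned int i)
{
    return i << 1;
}

/* Multiplication by 10 decomposed into shifts and one addition:
   10 * i = 8 * i + 2 * i. */
unsigned int times_ten_shift(unsigned int i)
{
    return (i << 3) + (i << 1);
}
```

Whether the hand-written shift form is actually smaller or faster depends on the compiler and target, so it is worth checking the generated listing before adopting it.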

Common Sub-Expression Elimination


This optimization method reduces the number of operations by reusing the result of
the first operation in subsequent statements that contain the same operation. Consider
the C program example in Figure 5. With optimization, the sub-expression i + 1 is
calculated once instead of twice.

C Language Program
    int test(int *a, int *b, int i)
    {
        return(a[i+1] + b[i+1]);
    }

Without Optimization
    _test:
        ld    WA,(SP+0x7)
        inc   WA
        shlca WA
        ld    IX,WA
        add   IX,(SP+0x3)
        ld    WA,(SP+0x7)
        inc   WA
        shlca WA
        ld    DE,WA
        add   DE,(SP+0x5)
        ld    WA,(DE)
        add   WA,(IX)
        ret

With Optimization
    _test:
        ld    WA,(SP+0x7)
        inc   WA
        shlca WA
        ld    IX,WA
        add   IX,(SP+0x3)
        ld    DE,WA
        add   DE,(SP+0x5)
        ld    WA,(DE)
        add   WA,(IX)
        ret

Figure 5 Common Sub-Expression Elimination
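Where the compiler does not perform this optimization, the shared sub-expression can be factored out in the source. The sketch below is one way to do that, assuming the same function shape as Figure 5. The name sum_next and the local variable k are illustrative.

```c
/* Returns a[i+1] + b[i+1], computing the shared index only once. */
int sum_next(const int *a, const int *b, int i)
{
    int k = i + 1;      /* common sub-expression, evaluated a single time */
    return a[k] + b[k];
}
```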

Code Motion
This optimization method is often used to optimize loops. Generally speaking,
most of the program execution time is spent in loops. Therefore, it is important for C
compilers to provide optimization for loops. Consider the C program example in Figure
6. First, the invariant operation, b + c, is moved outside of the loop. Second, array
address calculations that use an induction variable (one updated at each iteration) are
reduced to incrementing an accumulator. The optimized code is not only smaller (26
bytes versus 39 bytes), but also executes much faster.
C Language Program
    int a[10], b, c;
    test(){
        int i;
        for (i = 0; i < 10; i++)
            a[i] = b + c;
    }

Without Optimization
    _test:
        ld    BC,0x0
        cmp   BC,0xa
        j     sge,L1
    L2:
        ld    WA,BC
        shlca WA
        ld    DE,WA
        add   DE,_a
        ld    WA,(_b)
        add   WA,(_c)
        ld    (DE),WA
        inc   BC
        cmp   BC,0xa
        j     slt,L2
    L1:
        ret

With Optimization
    _test:
        ld    BC,(_b)
        add   BC,(_c)
        ld    WA,_a
        ld    DE,WA
        add   WA,0x14
    L2:
        ld    (DE),BC
        inc   DE
        inc   DE
        cmp   DE,WA
        j     lt,L2
        ret

Figure 6 Code Motion
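Both transformations can also be written directly in C. The sketch below hoists the invariant sum out of the loop and walks the array with a pointer, which corresponds roughly to the optimized listing in Figure 6. The names fill_hoisted and ARRAY_LEN are illustrative assumptions.

```c
#define ARRAY_LEN 10

int a[ARRAY_LEN], b, c;

/* Fills a[] with b + c. The sum is computed once before the loop, and the
   index arithmetic is replaced by a pointer that advances one element
   per iteration. */
void fill_hoisted(void)
{
    int sum = b + c;            /* loop-invariant value, hoisted */
    int *p = a;
    int *end = a + ARRAY_LEN;   /* one past the last element */

    while (p != end)
        *p++ = sum;
}
```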

Caching Of Memory Access


This optimization method reduces the number of memory accesses, thus speeding
up program execution. Consider the C program example in Figure 7. With optimization,
the number of memory accesses is reduced from three to one. Also, the resultant code
occupies less memory (12 bytes versus 20 bytes).
C Language Program
    int a;
    int test() {
        if (a != 1)
            return a;
        else
            return a - 1;
    }

Without Optimization
    _test:
        ld    WA,(_a)
        cmp   WA,0x1
        j     t,L1
        ld    WA,(_a)
        ret
    L1:
        ld    WA,(_a)
        dec   WA
        ret

With Optimization
    _test:
        ld    WA,(_a)
        cmp   WA,0x1
        j     t,L1
        ret
    L1:
        dec   WA
        ret

Figure 7 Caching of Memory Access


Switch Table Optimization
Switch statements are very common in C programs. In this optimization, the C
compiler analyzes the case values in the switch statement and then decides on the
best way to implement it. Consider the C program examples in Figure 8. The C
compiler implements the switch statement in C program 1 as a series of compare and
branch instructions. For C program 2, which has one more case, the C compiler
instead generates a table of values (S50000 in the listing) and indexes into it,
which improves code efficiency.

C Language Program 1
    unsigned char j;
    test1(unsigned char i) {
        switch(i) {
        case 1:
            j = 1;
            break;
        case 2:
            j = 2;
            break;
        case 3:
            j = 3;
            break;
        default:
            break;
        }
    }

With Optimization
    _test1:
        ld    A,(SP+0x3)
        ld    W,0x0
        cmp   WA,0x3
        j     t,L10
        cmp   WA,0x2
        j     t,L9
        cmp   WA,0x1
        j     f,L6
        ld    (_j),0x1
        ret
    L9:
        ld    (_j),0x2
        ret
    L10:
        ld    (_j),0x3
    L6:
        ret

C Language Program 2
    unsigned char j;
    test2(unsigned char i) {
        switch(i) {
        case 1:
            j = 1;
            break;
        case 2:
            j = 2;
            break;
        case 3:
            j = 3;
            break;
        case 4:
            j = 4;
            break;
        default:
            break;
        }
    }

With Optimization
    S50000:
        db    1
        db    2
        db    3
        db    4
    _test2:
        ld    A,(SP+0x3)
        ld    W,0x0
        dec   WA
        cmp   WA,0x3
        j     gt,L14
        ld    IX,S50000
        add   IX,WA
        ld    A,(IX)
        ld    (_j),A
    L14:
        ret

Figure 8 Switch Table Optimization
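When a compiler falls back to compare-and-branch code for a switch like the one in C program 2, the table lookup can be written out by hand. The following sketch mirrors the structure of the optimized _test2 listing (decrement, range check, indexed load). The names case_table and set_j_from_table are hypothetical.

```c
#include <stdint.h>

uint8_t j;

/* One entry per case value 1..4, in order. */
static const uint8_t case_table[4] = { 1, 2, 3, 4 };

/* Table-driven equivalent of the switch in C program 2. Values of i
   outside 1..4 leave j unchanged, like the default case. */
void set_j_from_table(uint8_t i)
{
    uint8_t idx = (uint8_t)(i - 1u);   /* i == 0 wraps to 255 and is rejected */

    if (idx < 4u)
        j = case_table[idx];
}
```

The unsigned wrap-around lets a single comparison reject both i == 0 and i > 4.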

Microcontroller Specific Optimization


Every microcontroller has a specific instruction set. The types of instructions
vary according to the particular microcontroller. The C compiler can generate more
efficient code by using instructions specific to a microcontroller. Consider the C program
example in Figure 9. The C compiler takes advantage of the bit manipulation instructions
provided by the microcontroller to generate efficient code.
C Language Program
    unsigned char a;
    test( ) {
        a &= ~0x1;
        a |= 0x4;
    }

With Optimization
    _test:
        ld    IY,_a
        clr   (IY).0
        set   (IY).2
        ret

Figure 9 Microcontroller Specific Optimization
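A compiler can only map an operation onto a single-bit instruction if it can see that the mask is a constant with one bit set. The sketch below shows one common way to keep that visible in the source. The macro BIT, the variable flags and the function update_flags are illustrative names, and whether a given compiler emits set/clr style instructions for them is an assumption to verify in the listing.

```c
#include <stdint.h>

/* Constant single-bit mask for bit position n (0..7). */
#define BIT(n) ((uint8_t)(1u << (n)))

uint8_t flags;

/* Clears bit 0 and sets bit 2, the same operations as in Figure 9. */
void update_flags(void)
{
    flags &= (uint8_t)~BIT(0);
    flags |= BIT(2);
}
```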

PROGRAMMING GUIDELINES FOR EFFICIENT C CODE


It is important to choose a good optimizing C compiler for your microcontroller
project. However, besides relying on the C compiler, you can also lend the C compiler a
helping hand in generating efficient code by observing certain programming guidelines.
This section describes these programming guidelines.
Use Microcontroller Specific C Language Extensions
In order to facilitate the portability of C programs, a general rule of thumb is to
use ANSI C constructs as much as possible, especially function prototyping. For cases
that require tighter code, use the C compiler's language extensions to access hardware,
locate variables in memory, specify interrupt service routines, etc. The disadvantage is
that using language extensions renders the C program non-portable.
    /* Function Prototype Examples */
    void void_ptr_func();           /* function returning nothing */
    char *char_ptr_func();          /* function returning a pointer to char */
    int ifunc( char B, int *DEW );  /* function returning an int,
                                       1st parameter is a char,
                                       2nd parameter is a pointer to int */

    /* Prototype declaration for interrupt processing functions. */
    extern void int5(), inttc1(), intsio2(), int3(),
                inttc4(), intsio1(), inttc3(), int2(),
                inttbt(), int1(), inttc1(), int0(),
                intwdt(), intsw(), reset();

    /* Definition of vector table. */
    #pragma section const interrupt near 0xffe0
    /* const or code can be used for section type */
    void (*intvector[])() = { int5, inttc1, intsio2, int3,
                              inttc4, intsio1, inttc3, int2,
                              inttbt, int1, inttc1, int0,
                              intwdt, intsw, reset };
    /* Define pointer variable (or array) in vector table. */
    /* Initialize to set the function address (function name). */

Figure 10 Function Prototype Examples

    static void DecSafeTicks(int nSafeTicks) {
        __DI;                   /* special function - disable interrupts */
        nTicks -= nSafeTicks;
        __asm(" ei");           /* inline assembly - enable interrupts */
    }

    /* An example of I/O variable definition using directive #pragma */
    #pragma io port0 0x00
    unsigned char port0;

    /* An example of I/O variable definition using __io */
    unsigned char __io(0x00) port0;

Figure 11 C Language Extension Examples

Use the C Compiler's Optimization Options


The C compiler may provide multiple levels of optimization, for example,
optimization levels 0, 1, 2 and 3. Usually, the higher the level, the more optimization
methods the C compiler uses to generate machine code. However, depending
on the coding style, it is possible for the code generated at level 3 to be larger than
the code generated at level 0.

Level  Function
-----  --------------------------------------------------------------
0      Minimum optimization (default)
       Stack release absorption. Branch instruction optimization.
       Deletion of unnecessary instructions.
1      Basic block optimization
       Propagation of copying restricted ranges.
       Gathering of common partial expressions in restricted ranges.
2      Optimization of more than basic blocks
       Propagation of copying whole functions.
       Gathering of common partial expressions of whole functions.
3      Maximum optimization
       Loop optimization and other miscellaneous optimization.

Figure 12 C Compiler's Optimization Options Example

The C compiler may have an option that minimizes program size. Use it if it is available.
A possible side effect of this option is that in certain situations, the resultant code may be
smaller in terms of number of bytes, but it may execute slower than the code generated
without specifying the option.
Format       -XS
Function     Specifies the output of minimum object code size.
Description  When this option is specified, part of optimization is skipped.
             The default, when this option is not specified, is the output
             of code with execution speed priority.

Figure 13 C Compiler Code Size Optimization Example

Optimize Usage of Memory Spaces


Many microcontrollers have more than one memory space. For example, one
memory space may be accessible with an 8-bit offset, another may require a
16-bit offset, and yet another may require an address space modifier. You can
decrease program size by explicitly locating frequently used variables in the
memory space that requires the fewest bytes for addressing.

Example:
    int a0 = 0, a1;                  /* default memory area */
    int __tiny at0 = 0, __tiny at1;  /* to tiny area */
    void fcn(void) {
        a1 = a0;
        at1 = at0;
    }

    Opcode
    _fcn:
                    ; 16-bit address offset
    E1000048  R     ld    WA,(_a0)
    F1000068  R     ld    (_a1),WA
                    ; 8-bit address offset
    E00048    R     ld    WA,(_at0)
    F00068    R     ld    (_at1),WA
    FA              ret

Figure 14 Optimize Usage of Memory Space Example

Use of the const Keyword


Using a constant is more efficient than a const variable in terms of execution
speed and program size. The C compiler can easily access a constant with immediate
addressing whereas it needs to use index addressing to access a const variable which is
usually placed in ROM. However, if a function argument is a pointer to a read-only
string (a const data object), using the const keyword to declare the pointer argument may
help the C compiler, in certain cases, to generate more efficient code.
Example:
    #pragma section const
    const int ix = 3000;
    char y[] = { 'A','B' };
    static char z = 4;

    A const declaration can only be used for variables defined in const
    or code sections, or for external variables. Variables in const sections
    automatically take the const attribute, so no const declaration needs
    to be coded for them. The const declaration also affects external
    variables.

Figure 15 Use of const Keyword Example
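The two uses of constants discussed above can be contrasted in portable C. This is a minimal sketch with illustrative names (LIMIT, over_limit, count_char). It shows a compile-time constant that a compiler can encode as an immediate operand, and a pointer parameter declared const because the function only reads the string. Any size or speed benefit is compiler dependent.

```c
#include <stddef.h>

/* A true constant: available to the compiler as an immediate value,
   with no storage in ROM or RAM. */
#define LIMIT 3000

/* Compares against the constant; returns 1 if value exceeds LIMIT. */
int over_limit(int value)
{
    return value > LIMIT;
}

/* Counts occurrences of ch in a read-only string. The const qualifier
   documents, and lets the compiler check, that the string is not modified. */
size_t count_char(const char *s, char ch)
{
    size_t n = 0;

    while (*s != '\0') {
        if (*s == ch)
            n++;
        s++;
    }
    return n;
}
```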

Use Of Auto Variables


For temporary variables, do not declare them as global variables. Rather, declare
them as auto variables.
When global variables are passed as arguments to a function, use the arguments,
not the global variables, in expressions within the function.
For global variables that are accessed frequently in a function, copy the global
variables into auto variables and use the auto variables within the function.
Consider the C program examples in Figure 16. C program 1 produces 30 bytes of
machine code, whereas C program 2 generates 18 bytes of machine code.

C Language Program 1
    unsigned char *a, j;
    test( ) {
        for (j=0; j<100; j++)
            *a++ = 0;
    }

With Optimization
    _test:
        ld    (_j),0x0
    L2:
        ld    IY,_a
        ld    DE,(IY)
        ld    WA,(IY)
        inc   WA
        ld    (IY),WA
        ld    (DE),0x0
        inc   (_j)
        cmp   (_j),0x64
        j     lt,L2
        ret

C Language Program 2
    unsigned char *a, j;
    test( ) {
        unsigned char *c, i;
        c = a;
        for (i=0; i<100; i++)
            *c++ = 0;
    }

With Optimization
    _test:
        ld    BC,(_a)
        xor   A,A
    L6:
        ld    DE,BC
        inc   BC
        ld    (DE),0x0
        inc   A
        cmp   A,0x64
        j     lt,L6
        ret

Figure 16 Auto Variable Example
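The same guideline applies when a global is both read and written inside a loop. The sketch below, with the hypothetical names checksum and update_checksum, reads the global once into an auto variable, works on the copy, and writes the result back once. It assumes that nothing else, such as an interrupt service routine, needs to see the intermediate values of the global while the loop runs.

```c
#include <stdint.h>

uint8_t checksum;   /* global running checksum */

/* Adds len bytes from buf into the global checksum (modulo 256). */
void update_checksum(const uint8_t *buf, uint8_t len)
{
    uint8_t sum = checksum;     /* single read of the global */
    uint8_t i;

    for (i = 0; i < len; i++)
        sum = (uint8_t)(sum + buf[i]);

    checksum = sum;             /* single write back */
}
```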

Use Unsigned Data Types


Use unsigned data types with a data size that matches the natural width of the
microcontroller's registers. Also, use the smallest data type that can get the job done. For
example, if you write a C program for an 8-bit microcontroller, use the unsigned char
data type in loop control operations, as array subscripts and as bit-field members. If
the C compiler enforces ANSI C's integer promotion rule by default, specify an option to
disable it. Otherwise, this rule will enlarge the program size.

Example:
    struct field {
        unsigned char a:1;
        unsigned char b:3;
        unsigned char c:3;
        unsigned char d:1;
    };
    struct field array[10];

    void fcn( ) {
        unsigned char i;
        for (i=0; i < 10; i++) {
            array[i].b = 5;
            array[i].c = 7;
        }
    }

    _fcn:
        ld    IY,_array
        ld    IX,IY
        ld    BC,IY
        add   IY,0xa
    L4:
        ld    DE,BC
        ld    A,(DE)
        and   A,0x8f
        or    A,0x50
        ld    (DE),A
        or    (IX),0xe
        inc   BC
        inc   IX
        cmp   IX,IY
        j     lt,L4
        ret

Figure 17 Unsigned Data Type Example

Pass Function Arguments with Registers


If the C compiler supports register variables for auto variables and function
arguments, use the facility. Register variables are placed in the microcontroller's
registers, which results in a smaller and faster program.

Example:
    char a_src[41] = {"Hello"};
    char a_des[41];
    void testcpy(void) {
        register char *p_src = a_src;
        register char *p_des = a_des;
        while (*p_src)
            *p_des++ = *p_src++;
        *p_des = '\0';
    }

    _testcpy:
        push  HL
        ld    IY,_a_src
        ld    IX,_a_des
        j     L3
    L2:
        ld    DE,IX
        inc   IX
        ld    HL,IY
        inc   IY
        ld    A,(HL)
        ld    (DE),A
    L3:
        cmp   (IY),0x0
        j     f,L2
        ld    (IX),0x0
        pop   HL
        ret

Figure 18 Register Type Example
Function type __adecl is used to interface with assembly language functions. Up
to three arguments are passed via registers. Arguments not passed via registers are
passed via the stack, as with the __cdecl type. Unlike the __cdecl type, arguments are
evaluated sequentially from the left. The merit of using the __adecl type is that, in
many cases, passing arguments via registers simplifies the coding of argument
processing in assembly language.

Example:
    int g1, g2, g3, g4, sum;
    int __adecl RegParm( int p1,
            int p2, int p3, int p4) {
        g1 = p1;
        g2 = p2;
        g3 = p3;
        g4 = p4;
        return(p1+p2);
    }
    void test( ) {
        sum = RegParm(2, -2, 88, 88);
    }

    .RegParm:
        ld    (_g1),WA
        ld    (_g2),BC
        ld    (_g3),DE
        ld    DE,(SP+0x3)
        ld    (_g4),DE
        add   WA,BC
        pop   DE
        pop   BC
        j     DE
    _test:
        ld    WA,0xffa8
        push  WA
        ld    DE,0x58
        ld    WA,0x2
        ld    BC,0xfffe
        cal   .RegParm
        ld    (_sum),WA
        ret

Figure 19 Function Type __adecl Example
Avoid Using Signed Variables, Long Types and Floating Points


Expressions involving these data types often result in calls to run-time library
functions in the generated code. For the example in Figure 20, the C compiler generates
a function call to _fld_ff in the run-time library to support floating point type.

Example:
    float array4[10], vf;
    fcn1( ) {
        unsigned char i;
        for (i=0; i < 10; i++)
            array4[i] = vf;
    }

    _fcn1:
        push  DE
        push  HL
        ld    (SP+0x4),0x0
    L2:
        ld    A,(SP+0x4)
        ld    W,0x0
        shlca WA
        shlca WA
        ld    BC,_array4
        ld    HL,WA
        add   HL,BC
        ld    BC,_vf
        ld    WA,HL
        cal   ._fld_ff
        inc   (SP+0x4)
        cmp   (SP+0x4),0xa
        j     lt,L2
        pop   HL
        pop   DE
        ret

Figure 20 Support Function in Run-Time Library Example
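One common way to follow this guideline is to keep quantities as scaled unsigned integers instead of floating point values. The sketch below assumes a hypothetical 8-bit analog-to-digital converter whose counts are each worth 20 millivolts. The constant MV_PER_COUNT and the function adc_to_millivolts are illustrative and do not describe any particular device.

```c
#include <stdint.h>

/* Assumed resolution of a hypothetical 8-bit ADC, in millivolts per count. */
#define MV_PER_COUNT 20u

/* Converts a raw 8-bit reading to millivolts using only unsigned integer
   arithmetic. The largest result, 255 * 20 = 5100, fits in 16 bits, so no
   long or floating point type is needed. */
uint16_t adc_to_millivolts(uint8_t raw)
{
    return (uint16_t)(raw * MV_PER_COUNT);
}
```

Storing 5100 millivolts rather than 5.1 volts keeps the value exact and avoids pulling floating point support routines into the program.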

Avoid Operations Involving Different Data Types


The C compiler follows the ANSI C data type promotion rules to process
expressions that involve different data types, resulting in extra object code and
execution time. In expressions that contain char and int, char gets promoted to int.
In expressions that contain both signed and unsigned integers, signed integers are
generally converted to unsigned integers. In expressions that contain floating point
types, float gets promoted to double. In expressions that contain only char, if the C
compiler enforces ANSI C's integer promotion rule by default, specify an option to
disable it. Otherwise, this rule will enlarge the program size.

Example:
    char c1;
    int i1, i2, i3;

    fcn1( ) {
        i1 = c1 + i3;
    }
    fcn2( ) {
        i1 = i2 + i3;
    }

    _fcn1:
        ld    A,(_c1)
        test  A.7
        subb  W,W
        add   WA,(_i3)
        ld    (_i1),WA
        ret
    _fcn2:
        ld    WA,(_i2)
        add   WA,(_i3)
        ld    (_i1),WA
        ret

Figure 21 Operation Involving Different Data Types Example
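The effect of these conversion rules can be observed directly in standard C. The following sketch uses illustrative function names and assumes 8-bit char, which is the usual case. It shows that two unsigned char operands are added as int, and that a signed value compared with an unsigned one is first converted to unsigned.

```c
/* x + y is evaluated as int, so 200 + 100 gives 300 here. */
int widened_sum(unsigned char x, unsigned char y)
{
    return x + y;
}

/* Narrowing the int result back to unsigned char keeps only the low
   8 bits: 300 becomes 44. */
unsigned char add_u8(unsigned char x, unsigned char y)
{
    return (unsigned char)(x + y);
}

/* Mixed signed/unsigned comparison. The cast spells out the conversion
   the compiler applies implicitly to the signed operand, so a negative s
   becomes a very large unsigned value. */
int less_than_mixed(int s, unsigned int u)
{
    return (unsigned int)s < u;
}
```

For example, less_than_mixed(-1, 1u) returns 0, because -1 converts to the largest unsigned int value before the comparison.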

Avoid Operations that Overload the Stack


Avoid using recursive functions with many arguments and auto variables, and
functions with variable length argument lists. If a structure or an array is used as
argument in a function, pass a pointer to the data instead of passing the data on the stack.

Example:
    struct s1 {
        char *text;
        int count;
    };
    extern struct s1 ays1[5];
    int sum;

    int SumCount( struct s1 *p1, int p2) {
        int i, j;
        for (i=0, j=0; i < p2; i++, p1++)
            j += p1->count;
        return(j);
    }
    void test3( ) {
        sum = SumCount(ays1, (sizeof(ays1)/sizeof(struct s1)));
    }

    _SumCount:
        ld    IX,(SP+0x3)
        xor   IY,IY
        ld    WA,IY
        ld    DE,(SP+0x5)
        cmp   DE,0x0
        j     sle,L1
    L2:
        ld    BC,(IX+0x2)
        add   WA,BC
        inc   IY
        add   IX,0x4
        cmp   IY,DE
        j     slt,L2
    L1:
        ret
    _test3:
        ld    WA,0x5
        push  WA
        ld    WA,_ays1
        push  WA
        cal   _SumCount
        ld    SP,SP+0x4
        ld    (_sum),WA
        ret

Figure 22 Passing Pointer in Function Argument Example
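To make the stack cost of the two calling styles concrete, the sketch below defines a hypothetical structure and two accessor functions. The names record, count_by_value and count_by_pointer are illustrative. Passing by value requires the caller to copy the whole structure, typically onto the stack, whereas passing a pointer copies only an address.

```c
struct record {
    char *text;
    int   count;
    int   history[16];
};

/* By value: the entire structure is copied for every call. */
int count_by_value(struct record r)
{
    return r.count;
}

/* By pointer: only the address is passed. const signals that the
   function does not modify the caller's data. */
int count_by_pointer(const struct record *r)
{
    return r->count;
}
```

Both functions return the same result; the difference is in how many bytes the call has to move.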

CONCLUSION
We have described some of the commonly used optimizing methods of C
compilers. Not all C compilers are created equal. Therefore, it is important to choose a
C compiler that incorporates good optimizing methods in generating code for your
particular microcontroller. We have also discussed a set of C programming guidelines
that will help improve code efficiency of your microcontroller applications.
