Sie sind auf Seite 1von 18

Type Checking

Type Checking • The general topic of type checking includes two parts – Type synthesis –

• The general topic of type checking includes two parts

– Type synthesis – assigning a type to each expression in the language

– Type checking – making sure that these types are used in contexts where they are legal, catching type-related errors

• Strongly typed languages are ones in which every expression can be assigned an unambiguous type

– Weakly typed languages could have run-time errors due to type incompatibility

• Statically typed languages vs. dynamically type languages

– Capable of being checked at compile time vs. not

• Static type checking vs. dynamic type checking

– Actually check types at compile time vs. run time

• Run time type checking is less efficient because type information must be stored

1

Base Types

Base Types • Numbers – integer • C specifies length in relative terms, short and long

• Numbers

– integer

• C specifies length in relative terms, short and long

• Java specifies specific lengths: byte, short, int and long have 8, 16, 32 and 64 bits, respectively

– floating point numbers

• Many languages have two sizes, 32 and 64 bit

– Can use IEEE representation standards

• Characters

– Single letter, used to be 8 bit ASCII standard, now can also be a 16 bit Unicode

• Booleans

– Two values: true and false

– C uses the 0 and 1 subrange of integers

• Strings

Compound Types: Arrays and Strings

Compound Types: Arrays and Strings • Arrays – aggregate of values of the same type –

• Arrays – aggregate of values of the same type

– Arrays have a base type for elements, may have an indexing range for each dimension int a [100][25], in C

– If the indexing range is known, then the compiler can compute space allocation, otherwise indexing is relative and space is allocated by a run-time allocator

– The main operation is indexing; some languages allow whole array operations

• Strings – sequence of characters

– Also can have bit strings

– C treats strings as arrays of characters

– Strings can have comparison operators that use lexicographic order “fee” < “fie”

More Compound Types

More Compound Types • Records or Structures – components may have different types and may be

• Records or Structures – components may have different types and may be indexed by names

struct { double r; int i;

}

– Representation as ordered product of elements

• Variant records or Unions – a component may be one of a choice of types union ( double r; int i; }

– Representation can save space for the largest element and may have a tag to indicate which type the value is

– Take care not to have run-time (value) errors

Other Types

Other Types • Enumerated types – the programmer can create a type name for a specific

• Enumerated types – the programmer can create a type name for a specific set of constant values enum WeekDay {Sunday, Monday, … Saturday}

– Representation as for a small set

• Pointers – abstraction of addresses

– Can create a reference to, or derefence an object

– Distinguish between “pointer to integer” and “pointer to boolean”, etc.

– C allows arithmetic on pointers

• Void

• Classes – may or may not create new types

– Classes can be represented by an extended type of record for data and methods

– But are not just types since they also have features such as inheritance

Function Types, New Type Names

Function Types, New Type Names • Procedure and function types are sometimes called signatures – Give

• Procedure and function types are sometimes called signatures

– Give number and types of parameters

• May include parameter-passing information such as by value or by reference

– Give type of result (or indicate no result) strlength : String -> unsigned int

• Type declarations or type definitions allow the programmer to assign new type names to a type expression

– These names should also be stored in the symbol table

– Will have scope and other attributes

– Definitions may be recursive in some languages

• If so, size of object will be unknown

Representing Types

Representing Types • Types can be represented as expressions, or quite often as trees: array(9) type

• Types can be represented as expressions, or quite often as trees:

array(9) type
array(9)
type
function arglist result type
function
arglist
result
type
argn type
argn
type
as expressions, or quite often as trees: array(9) type function arglist result type argn type arg1

arg1

type

Type Equivalence

Type Equivalence • Name equivalence – in this kind of equivalence, two types must have the

• Name equivalence – in this kind of equivalence, two types must have the same name

– Assumes that the programmer will introduce new names exactly when they want types to be different if you say t1 = int and t2 = int, t1 and t2 are different

• Structural equivalence – two objects are interchangeable if their types have the same fields with equivalent types. int x[10][10] and int y[10][10] x and y have equivalent types, the 10 by 10 arrays

– More complex situations arise in structural equivalence of other compound types, e.g. may have mutually recursive type definitions

• Type checking rules extend this to a notion of type compatibility

Type Synthesis and Type Checking

Type Synthesis and Type Checking • Assigning types to language expressions (type synthesis) can be done

• Assigning types to language expressions (type synthesis) can be done by a traversal of the abstract syntax tree

• At each node, a type checking rule will say which types are allowed

• Description here is for languages with type declarations

– Constants are assigned a type

• If there is not a known type, then there will be a set of possibles

– Variables are looked up in the symbol table

• Note that we are assuming an L-attributed grammar so that declarations are processed first

– Assignment

• The type of the assignable entity on the left must be the typeEqual to the type of the value on the right

Types for expressions

Types for expressions • Arithmetic and other operators have result types defined in terms of the

• Arithmetic and other operators have result types defined in terms of the types of the subnodes in the tree

• Statements have substructures that need to be checked for type correctness

– Condition of if and while statements must have type boolean

• Array reference

– Suppose we have exp1 -> exp2[exp3], then an adhoc SDT:

if (isArrayType(exp2.type)) and typeEqual(exp3.type, integer) then exp1.type = exp2.type.basetype // get the basetype child else type-error(exp1)

• Function calls have similar rules to check the signature of the function name the parameters

Issues for typeEqual

Issues for typeEqual • This is sometimes called type compatibility • Overloading – Arithmetic operators 2

• This is sometimes called type compatibility

• Overloading

– Arithmetic operators 2 + 3 means integer addition 2.0 + 3.0 means floating pt addition

– Language may have an arithmetic operator table, telling which type of operator will be used based on the expressions

Type of: a

b

a+b

 

int

int

int

int

float

float

int

double

double

float

float

float

float

double

double

double

double

double

11

Overloading Functions

Overloading Functions • Can declare the same function (or method) name with different numbers and types

• Can declare the same function (or method) name with different numbers and types of parameters int max(int x,y) double max(double x,y)

– Java and C++ allow such overloaded declarations

• Need to augment the symbol table functionality to allow for the name to have multiple signatures

– The lookup procedure is always give a name to look up – we can add a typelist argument and the lookup can tell us if there is a function declared with that signature

– Or the lookup procedure is given the name to look up – and in the case of a method, it can return sets of allowable signatures

Conversion and Coercion

Conversion and Coercion • The typeEqual comparison is commonly extend to allow arithmetic expressions of mixed

• The typeEqual comparison is commonly extend to allow arithmetic expressions of mixed type and for other cases of types which are compatible, but not equal

– If mixed types are allowed in an arithmetic expression, then a conversion should be inserted (into the AST or the code)

2 * 3.14 t1 = float ( 2) t2 = t1 * 3.14

becomes code like

• Conversion from one type to another is said to be implicit it if is done automatically by the compiler, and is also called coercion

• Conversion is said to be explicit if the programmer must write the conversion

– Called casts in C and Java languages

Widening and Narrowing

Widening and Narrowing • The rules for Java type conversion distinguishes between – Widening conversions which

• The rules for Java type conversion distinguishes between

– Widening conversions which preserve information

– Narrowing conversion which can lose information

• Conversions between primitive types in Java:

(there is also widening and narrowing for references, i.e. objects)

Widening

Narrowing (usually with a cast)

double

double

Widening Narrowing (usually with a cast) double double float float long long int int

float

float

Widening Narrowing (usually with a cast) double double float float long long int int

long

long

Widening Narrowing (usually with a cast) double double float float long long int int

int

int

     
short
short

char

byte

short char byte

short

char

 

byte

 

Generating Type Conversions

Generating Type Conversions • For an arithmetic expressions e1 e2 op e3, the algorithm can generally

• For an arithmetic expressions e1 e2 op e3, the algorithm can generally use widening, to possibly a third type that is greater than both the types of e2 and e3 in the widening tree:

Let the new type of e1 be the max of e2.type and e3.type Generate a widening conversion of e1 if necessary Generate a widening conversion of e2 if necessary Set the type (and later the code) of e1

• Type conversions and coercions also apply to assignment

if r is a double and i is an int:

allow r = i;

– C also allows i = r, with the corresponding loss of information

– For classes, we have the subtype principle, objects of subclasses can be assigned to objects of superclasses (with the corresponding loss of information), but not vice versa

Continuing type synthesis and checking rules

Continuing type synthesis and checking rules • Expressions: the two expressions involved with boolean operators, such

• Expressions: the two expressions involved with boolean operators, such as &&, must both be boolean

• Functions: the type of each actual parameter must be typeEqual to its formal parameter

• Classes:

– if specified, the parent of the class must be a properly declared class

– If a class says that it implements an interface, then all methods of the interface must be implemented

Polymorphic Typing

Polymorphic Typing • A language is polymorphic is language constructs can have more than one type

• A language is polymorphic is language constructs can have more than one type procedure swap(anytype x,y) where anytype is considered to be a type variable

• Polymorphic functions have type patterns or type schemes, instead of actual type expressions

• The type checker must check that the types of the actual parameters fit the pattern

– Technically, the type checker must find a substitution of actual types for type variables that satisfies the type equivalence between the formal type pattern and the actual type expression

– In complex cases with recursion, may need to do unification to solve the substitution problem

• Most notably in the language ML

• Java and C# allow polymorphic type constructors, called generics

References

References • Keith Cooper and Linda Torczon, Engineering a Compiler , Elsevier, 2004 • Kenneth C.

• Keith Cooper and Linda Torczon, Engineering a Compiler, Elsevier, 2004

• Kenneth C. Louden, Compiler Construction: Principles and Practices, PWS Publishing, 1997.

• Aho, Lam, Sethi, and Ullman, Compilers: Principles, Techniques, and Tools. Addison-Wesley, 2006. (The purple dragon book)

• Charles Fischer and Richard LeBlanc, Jr., Crafting a Compiler with C, Benjamin Cummings, 1991.