
Symbol Resolution and Relocation

The essentials of MCLinker

Symbol Resolution
  Resolving references across symbols
  Merging multiple symbol tables into one

Relocation
  Performing section merge
  Resolving all resolvable relocations
  Replacing symbolic references with actual addresses (binding)

[Figure: MCLinker overview. Labels: Bitcode, Object file, and Archive inputs; Bitcode reader, Object reader, and Archive reader; MCLDFile; Symbol table; Sections a, b, and c; Relocation; Symbol Resolution; Output.]

http://code.google.com/p/mclinker

11/11/18

MCLinker

MCLDFile

Instances of MCLDFile are the inputs and outputs of MCLinker

MCLDFile provides a consistent abstraction of object files across a variety of targets and formats. It has:
  symbol tables
  sections
  relocation entries

Linking operations on MCLDFile are efficient
  Symbol lookup is fast and uses limited memory
  Memory-mapped I/O keeps physical memory usage as low as possible
  Loading on demand keeps virtual memory usage as low as possible


Symbol Table In MCLinker (1/2)

MCLinker avoids copying symbols between symbol tables
  Symbol tables record only references to symbols, not instances
  A common symbol pool stores the symbol instances of all symbol tables

MCLinker keeps the number of walks over symbol tables as low as possible
  MCLinker does not merge symbol tables as a separate step of symbol resolution
  MCLinker resolves symbols while it reads the symbol tables of the inputs
  MCLinker visits only the symbols it needs
  Dynamic and common symbols are grouped into different sets
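
The pool-and-references idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Symbol`, `SymbolPool`, and `SymbolTable` and the `kind` attribute are hypothetical and are not MCLinker's API, and the sketch deliberately leaves out the resolution rules (a later sighting of a name simply returns the pooled instance).

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    # Hypothetical stand-in for a pooled symbol instance.
    name: str
    is_dyn: bool = False
    kind: str = "reference"  # "defined", "common", or "reference"


class SymbolPool:
    """Owns the single instance of every symbol, keyed by name."""

    def __init__(self):
        self._symbols = {}

    def intern(self, name, **attrs):
        # Return the pooled instance, creating it on first sight only.
        sym = self._symbols.get(name)
        if sym is None:
            sym = Symbol(name, **attrs)
            self._symbols[name] = sym
        return sym

    def __len__(self):
        return len(self._symbols)


class SymbolTable:
    """Per-input table: holds references into the pool, grouped by category."""

    def __init__(self, pool):
        self._pool = pool
        self.dynamic = []
        self.common = []
        self.regular = []

    def add(self, name, **attrs):
        sym = self._pool.intern(name, **attrs)
        # Grouping lets later passes visit only the category they need.
        if sym.is_dyn:
            self.dynamic.append(sym)
        elif sym.kind == "common":
            self.common.append(sym)
        else:
            self.regular.append(sym)
        return sym
```

Two tables that add the same name end up holding the same object, so nothing is copied between tables and the pool grows by one entry per distinct name.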


Symbol Table In MCLinker (2/2)


Store only references to symbols
Group symbols into different categories
Resolve symbols when a symbol is added to the common symbol pool
The common symbol pool records the real instances of symbols

[Figure: Symbol Table 1 and Symbol Table 2, each with Dynamic, Common, and Non-Dynamic groups, reference symbols; the Common Symbol Pool defines symbols.]

Symbol in MCLinker

MCLinker defines a format-independent abstraction of symbols, called LDSymbol
  Supports Mach-O, ELF, and COFF
  Supports both 32- and 64-bit

MCLinker transforms symbols of different formats into LDSymbol, as in the following figure:
LDSymbol (field : width in bits)
  name
  is_dyn : 1
  type : 2
  bind : 2
  in_section
  value : 64
  size : 64
  other : 8

ELF Symbol
  st_name
  st_info
  st_shndx
  st_value
  st_size
  st_other

MachO Nlist
  n_un
  n_type
  n_desc
  n_sect
  n_value

COFF Symbol
  Name
  Type
  StorageClass
  SectionNum
  Value
  NumAux
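
As an illustration of the ELF column, the following Python sketch turns ELF symbol fields into an LDSymbol-like record. The ELF constants and the `st_info >> 4` binding extraction follow the ELF specification; the `LDSymbol` dataclass, the `from_elf` helper, the string values for `type` and `bind`, and the rule that derives `type` from `st_shndx` are assumptions made for this sketch, not MCLinker's actual code. The name is taken as an already resolved string rather than a string-table offset.

```python
from dataclasses import dataclass

# Constants from the ELF specification.
SHN_UNDEF = 0
SHN_COMMON = 0xFFF2
STB_LOCAL, STB_GLOBAL, STB_WEAK = 0, 1, 2

# Hypothetical names for the binding values kept in LDSymbol.
_BIND_NAMES = {STB_LOCAL: "local", STB_GLOBAL: "global", STB_WEAK: "weak"}


@dataclass
class LDSymbol:
    name: str
    is_dyn: bool
    type: str        # "defined", "common", or "reference"
    bind: str
    in_section: int
    value: int
    size: int
    other: int


def from_elf(name, st_info, st_shndx, st_value, st_size, st_other, is_dyn=False):
    """Build an LDSymbol from ELF symbol fields (name already resolved)."""
    elf_bind = st_info >> 4  # upper four bits of st_info hold the binding
    if st_shndx == SHN_UNDEF:
        sym_type = "reference"
    elif st_shndx == SHN_COMMON:
        sym_type = "common"
    else:
        sym_type = "defined"
    # Bindings other than local, global, and weak are not handled here.
    return LDSymbol(name, is_dyn, sym_type, _BIND_NAMES[elf_bind],
                    st_shndx, st_value, st_size, st_other)
```

A Mach-O or COFF reader would fill the same record from its own fields, which is what keeps the rest of the linker format-independent.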

Symbol Resolution

Steps

1. Get an input symbol from an input file
2. If no symbol in the output symbol table has the same name, add the input symbol to the output symbol table
3. Otherwise, compare the input symbol with the existing output symbol according to Table 1
4. Depending on the result of the comparison, either discard the input symbol or override the output symbol
Table 1. The priorities of attribute values in symbol comparison

Attribute   Priority of attribute values
is_dyn      not a dynamic object > is a dynamic object
type        defined > common > reference
bind        global > weak
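
The resolution steps and Table 1 can be sketched as follows. The slides do not say how the three attributes combine, so comparing them in the listed order (is_dyn, then type, then bind) is an assumption of this sketch, as is keeping the existing symbol on a tie. The names `Sym`, `priority`, and `resolve` are hypothetical.

```python
from dataclasses import dataclass

# Higher rank wins, following Table 1.
TYPE_RANK = {"defined": 2, "common": 1, "reference": 0}
BIND_RANK = {"global": 1, "weak": 0}


@dataclass
class Sym:
    name: str
    is_dyn: bool
    type: str
    bind: str


def priority(sym):
    # Assumption: attributes are compared in the order Table 1 lists them.
    return (0 if sym.is_dyn else 1, TYPE_RANK[sym.type], BIND_RANK[sym.bind])


def resolve(output, sym):
    """Add sym to the output table, or let the higher-priority symbol win."""
    old = output.get(sym.name)
    if old is None or priority(sym) > priority(old):
        output[sym.name] = sym  # first sighting, or input overrides output
    return output[sym.name]     # otherwise the input symbol is discarded
```

Because `resolve` is called once per input symbol as the reader walks a symbol table, resolution happens while reading and needs no separate merge pass.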


Sections in MCLinker

MCLDFile reuses the definitions of sections in the LLVM machine code (MC) layer
  MCSection has the attributes (name, type, ...) of a section
  MCSectionData records the size and offset of a section
  MCFragment is the storage of data

Readers in MCLinker transform into MCSection only the sections that hold information defined by the program
  In general, readers transform only text and data sections
  In ELF, readers transform only sections of type SHT_PROGBITS with the SHF_ALLOC flag
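
The ELF rule can be written as a small predicate. The constants are the values given in the ELF specification; the function name `is_program_section` and the sample section list are hypothetical and only serve the illustration.

```python
# Values from the ELF specification.
SHT_PROGBITS = 1
SHF_ALLOC = 0x2


def is_program_section(sh_type, sh_flags):
    """True for sections of type SHT_PROGBITS that also carry SHF_ALLOC."""
    return sh_type == SHT_PROGBITS and bool(sh_flags & SHF_ALLOC)


# (name, sh_type, sh_flags) triples as a reader might see them.
sections = [
    (".text", 1, 0x6),      # PROGBITS, ALLOC | EXECINSTR
    (".data", 1, 0x3),      # PROGBITS, WRITE | ALLOC
    (".comment", 1, 0x30),  # PROGBITS, MERGE | STRINGS, not allocated
    (".symtab", 2, 0x0),    # SYMTAB
]

kept = [name for name, sh_type, sh_flags in sections
        if is_program_section(sh_type, sh_flags)]
```

With this sample list, only `.text` and `.data` pass the filter; the symbol table and the comment section are handled elsewhere or dropped.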


Relocation Entries in MCLinker

MCLinker defines a format-independent relocation called LDRelocation
  Supports Mach-O, ELF, and COFF

Like BFD, MCLinker uses a target-independent relocation algorithm for all targets
  LDRelocation has a target-independent data structure, howto, that describes how to apply a relocation
  This relieves porting effort: targets do not each have to implement their own relocation functions

TargetBackend additionally provides target-dependent relocation functions to improve performance as needed


LDRelocation
  symbol
  offset
  addend
  howto

howto
  type
  right_shift
  size
  bit_size
  pcrel
  bit_position
  overflow
  target_callback
  src_mask
  dst_mask
  pcrel_offset


Applying Relocations by howto


Steps

1. Compute the relocation value

   Relocation = S + A - P
   S : the value of the symbol
   A : the value of the addend
   P : the value derived from the offset

2. Shift the relocation value right by right_shift (>>=) and left by bit_position (<<=)
3. Apply the relocation value (as in the following figure)
4. Write the final result back to the relocated location
Example: SUB R0, R1, #1024 (instruction fields: Rn, Rd, Rotate, Immed 8)

  instruction                          11100010010000010000111100111111
  ~dst_mask                            11111111111111111111111100000000
  result high (instruction and ~dst_mask)
                                       111000100100000100001111
  final value of relocation (offset + addend + symbol address)
                                       11111111111111111111110000000100
  src_mask                                                     11111111
  dst_mask                                                     11111111
  result low ((instruction and src_mask) + relocation value, and dst_mask)
                                                               01000011
  final result                         11100010010000010000111101000011
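
Steps 2 to 4 can be checked against the bit patterns in the figure with a short Python sketch. The function name `apply_howto` and the 32-bit word size are assumptions; the combination rule (keep the bits outside dst_mask, add the relocation value to the bits selected by src_mask, then mask with dst_mask) is inferred from the figure's "and" and "sum" labels. Step 1, computing the relocation value itself, is left to the caller.

```python
MASK32 = 0xFFFFFFFF  # assumption: the relocated field lives in a 32-bit word


def apply_howto(word, relocation, right_shift, bit_position, src_mask, dst_mask):
    """Patch `word` with `relocation` as described by the howto fields."""
    # Step 2: position the relocation value.
    relocation = (relocation >> right_shift) << bit_position
    # Step 3: add to the existing field contents, then keep only the field.
    low = ((word & src_mask) + relocation) & dst_mask
    high = word & ~dst_mask & MASK32
    # Step 4: the caller writes this word back to the relocated location.
    return (high | low) & MASK32


# Values taken from the figure; right_shift = 0 and bit_position = 0 are
# assumed because the figure shows no shifting.
patched = apply_howto(
    word=0b11100010010000010000111100111111,
    relocation=0b11111111111111111111110000000100,
    right_shift=0,
    bit_position=0,
    src_mask=0xFF,
    dst_mask=0xFF,
)
```

With these inputs `patched` equals 0b11100010010000010000111101000011, the final result shown in the figure. Because the masks and shifts are data, the same function serves every relocation type that fits this pattern.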


Memory Allocation Policy in MCLinker

MCLinker has its own memory allocator, called MemoryArea
  MCLinker does not use LLVM MemoryBuffer directly

Linkers' demands on memory allocation policy differ from compilers'
  The average size of object files differs from that of source files
  Linkers perform more file operations than compilers, so their performance is more sensitive to the use of memory-mapped I/O

LLVM MemoryBuffer is designed for compilers, not linkers
  The average size of the members of libc.a is close to, but less than, one page
  However, LLVM MemoryBuffer uses memory-mapped I/O only when the request is larger than four pages

Policy                              Advantage                             Disadvantage
Memory Mapped I/O: mmap()           ~x2 faster file copy                  1. Start address must be on a page boundary
                                                                          2. Memory size must be a multiple of the page size
Dynamic Memory: malloc() + read()   No constraints on either the start    Slow file copy
                                    address or the requested size
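
The two policies can be compared side by side in Python. This is a sketch rather than MCLinker's implementation: the function names are hypothetical, and Python's `mmap` module stands in for `mmap()` while a plain seek-and-read stands in for `malloc()` plus `read()`. It shows the alignment constraint in practice: the mapping has to start on an allocation-granularity boundary, so the requested start is rounded down and the difference is skipped afterwards.

```python
import mmap


def read_with_mmap(path, start, size):
    """Read `size` bytes at `start` through a memory-mapped view."""
    gran = mmap.ALLOCATIONGRANULARITY
    aligned = (start // gran) * gran  # the mapping offset must be aligned
    delta = start - aligned           # bytes to skip inside the mapping
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), delta + size,
                       access=mmap.ACCESS_READ, offset=aligned) as view:
            return bytes(view[delta:delta + size])


def read_with_read(path, start, size):
    """Read the same bytes into freshly allocated memory: no alignment rule."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(size)
```

Both functions return the same bytes for any range inside the file; they differ only in how the memory behind those bytes is obtained.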

Components of MemoryArea (1/2)


Three layers of MemoryArea

MemoryArea
  Clients request virtual memory space from MemoryArea
  MemoryArea creates MemorySpaces and MemoryRegions to satisfy clients' requests
  MemoryArea decides whether to use dynamic memory or memory-mapped I/O

MemorySpace
  MemorySpace is a container for a contiguous, non-overlapping range of virtual addresses
  Virtual memory is allocated by either malloc or memory-mapped I/O
  Clients do not access memory through MemorySpace directly; they access it through MemoryRegions

MemoryRegion
  MemoryRegion marks a range of virtual memory within a MemorySpace
  Clients access memory through MemoryRegions
  Several MemoryRegions can map to the same MemorySpace
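
The three layers can be modelled in a few lines of Python. The class names follow the slide, but every method and attribute here is hypothetical, a `bytes` object stands in for the file, and the rule for reusing an existing MemorySpace is an assumption of the sketch.

```python
class MemorySpace:
    """A contiguous buffer holding one copied range of the file."""

    def __init__(self, file_offset, data):
        self.file_offset = file_offset
        self.buffer = bytearray(data)


class MemoryRegion:
    """A window onto part of a MemorySpace; the only handle clients get."""

    def __init__(self, space, start, size):
        self._view = memoryview(space.buffer)[start:start + size]

    def read(self):
        return bytes(self._view)

    def write(self, pos, data):
        self._view[pos:pos + len(data)] = data


class MemoryArea:
    """Hands out MemoryRegions and creates MemorySpaces behind the scenes."""

    def __init__(self, file_bytes):
        self._file = file_bytes  # stand-in for a file in secondary storage
        self.spaces = []

    def request(self, offset, size):
        # Assumption: reuse a space that already covers the requested range.
        for space in self.spaces:
            end = space.file_offset + len(space.buffer)
            if space.file_offset <= offset and offset + size <= end:
                return MemoryRegion(space, offset - space.file_offset, size)
        space = MemorySpace(offset, self._file[offset:offset + size])
        self.spaces.append(space)
        return MemoryRegion(space, 0, size)
```

Two regions requested over overlapping ranges share one MemorySpace in this model, so a write through one region is visible through the other.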


Components of MemoryArea (2/2)


Every MemoryRegion maps to a MemorySpace
LDObjectReader reads or writes memory through MemoryRegions
The MemorySpace ranges mapped by MemoryRegions may overlap
Parts of a file in secondary storage are copied to a MemorySpace by memory mapped I/O (mmap) or dynamic memory

[Figure: LDObjectReader accesses MemoryRegions; MemoryArea holds the MemorySpaces, each filled from the file by mmap or dynamic memory.]

How does MemoryArea allocate memory?

1. LDObjectReader requests memory space of the specified size from MemoryArea
2. MemoryArea decides the memory allocation policy by size and threshold, as in Table 2
3. MemoryArea maps a MemorySpace to a MemoryRegion
4. The reader reads and writes memory only through the MemoryRegion

[Figure: LDObjectReader, MemoryRegion, MemoryArea, MemorySpace, and secondary storage.]
Table 2.

                             Request Size >= Threshold     Request Size < Threshold
Memory Policy                Using memory mapped I/O       Allocating dynamic memory
Allocated MemorySpace Size   Page alignment                As requested size
Feature                      Fast memory read and write    Reducing memory fragments
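
Table 2 amounts to a single comparison, sketched below. The function name `choose_policy`, the returned policy strings, the default page size of 4096 bytes, and the reading of "page alignment" as rounding the size up to a whole number of pages are assumptions; the slides do not give the threshold value, so the caller supplies it.

```python
def choose_policy(request_size, threshold, page_size=4096):
    """Return (policy, allocated_size) for a request, following Table 2."""
    if request_size >= threshold:
        # Memory mapped I/O: round the size up to a whole number of pages.
        pages = -(-request_size // page_size)  # ceiling division
        return "mmap", pages * page_size
    # Dynamic memory: allocate exactly what was asked for.
    return "malloc", request_size
```

For example, with a threshold of one page, a 5000-byte request maps two pages, while a 100-byte request is served from dynamic memory at its exact size.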
