■ So this is a presentation about compiler design featuring a Brainf*ck Compiler.
■ Components of a compiler, overview
■ Designing and implementing parsers
■ Design and implementation of the Brainf*ck Compiler
■ Implementation of and code generation for stack machines
■ .. should have a rough idea of how compilers work.
■ .. should be able to implement code-generators for stack machines.
Brainf*ck
● Overview
● Instructions
● Implementing "while"
● Implementing "x=y"
● Implementing "if"
● Functions
Overview
■ All other characters in the input are ignored.
■ A Brainfuck program has an implicit byte pointer which is free to move around within an array of 30000 bytes, initially all set to zero. The pointer itself is initialized to point to the beginning of this array.
Instructions
>   Increment the pointer.                                                    ++p;
<   Decrement the pointer.                                                    --p;
+   Increment the byte at the pointer.                                        ++*p;
-   Decrement the byte at the pointer.                                        --*p;
.   Output the byte at the pointer.                                           putchar(*p);
,   Input a byte and store it in the byte at the pointer.                     *p = getchar();
[   Jump forward past the matching ] if the byte at the pointer is zero.      while (*p) {
]   Jump backward to the matching [ unless the byte at the pointer is zero.   }
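
The instruction table maps almost directly onto a small interpreter. The following is a minimal sketch, not the runtime shipped with the compiler: the function name `bf_run` is hypothetical, input is taken from a string and output is written to a buffer (both assumptions made so the function is easy to test), and the pointer is not bounds-checked.

```c
#include <stddef.h>
#include <string.h>

#define BF_CELLS 30000

/* Run a Brainf*ck program. Input is read from the string `in` and output
 * is written to `out` (assumptions for testability, instead of
 * getchar/putchar). Returns the number of bytes written to `out`.
 * The pointer is not bounds-checked. */
size_t bf_run(const char *prog, const char *in, char *out, size_t outcap)
{
    static unsigned char cells[BF_CELLS];
    unsigned char *p = cells;            /* the implicit byte pointer */
    size_t n = 0;

    memset(cells, 0, sizeof cells);      /* all bytes start at zero */

    for (const char *pc = prog; *pc; ++pc) {
        switch (*pc) {
        case '>': ++p; break;
        case '<': --p; break;
        case '+': ++*p; break;
        case '-': --*p; break;
        case '.': if (n < outcap) out[n++] = (char)*p; break;
        case ',': *p = *in ? (unsigned char)*in++ : 0; break;
        case '[':
            if (*p == 0) {               /* skip forward to the matching ] */
                int depth = 1;
                while (depth > 0 && *++pc) {
                    if (*pc == '[') ++depth;
                    else if (*pc == ']') --depth;
                }
                if (depth > 0) return n; /* unbalanced program */
            }
            break;
        case ']':
            if (*p != 0) {               /* jump back to the matching [ */
                int depth = 1;
                while (depth > 0 && pc > prog) {
                    --pc;
                    if (*pc == ']') ++depth;
                    else if (*pc == '[') --depth;
                }
                if (depth > 0) return n; /* unbalanced program */
            }
            break;
        default:
            break;                       /* all other characters are ignored */
        }
    }
    return n;
}
```

Calling `bf_run(",[.,]", "hi", buf, sizeof buf)` echoes the input into `buf`. Scanning for the matching bracket on every jump is simple but slow.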
Implementing "while"
<move pointer to a>
[
<foobar>
<move pointer to a>
]
Implementing "x=y"
[ -
<move pointer to x> +
<move pointer to t> ]
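
Only part of this fragment survives. One common way to copy a cell in Brainf*ck, which seems consistent with the visible lines, is to move y into a temporary t and then move t back into both x and y. The sketch below writes that idea as C on three byte cells. The function name `bf_assign`, the clearing steps and the first loop are assumptions, not taken from the slide.

```c
/* What a Brainf*ck "x = y" sequence could do, written as C on byte cells.
 * Each while loop corresponds to one [ ... ] loop in the generated code. */
void bf_assign(unsigned char *x, unsigned char *y, unsigned char *t)
{
    *x = 0;                 /* <move pointer to x> [-]   (assumed) */
    *t = 0;                 /* <move pointer to t> [-]   (assumed) */

    while (*y) {            /* <move pointer to y> [ -   (assumed) */
        --*y;
        ++*t;               /* <move pointer to t> + */
    }                       /* <move pointer to y> ] */

    while (*t) {            /* <move pointer to t> [ - */
        --*t;
        ++*x;               /* <move pointer to x> + */
        ++*y;               /* <move pointer to y> +     (assumed) */
    }                       /* <move pointer to t> ] */
}
```

The temporary is needed because a Brainf*ck loop can only read a cell by counting it down to zero, so the source value has to be restored afterwards.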
Functions
■ The generated code may become huge if macros are used intensively.
■ So recursions must be implemented using explicit stacks.
Lexer and Parser
■ Tokens may have additional attributes. E.g. the textual input "123" may be transformed to the token TOKEN NUMBER with the integer value 123 attached to it.
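
A token with an attribute can be sketched as a small struct. The names below (`struct token`, `next_token`, `TOKEN_OTHER`, `TOKEN_EOF`) are hypothetical and only illustrate the idea. This is not the lexer used by the compiler, which is generated by flex.

```c
#include <ctype.h>

enum token_type { TOKEN_NUMBER, TOKEN_OTHER, TOKEN_EOF };

struct token {
    enum token_type type;
    int value;      /* attribute: the number, or the character for TOKEN_OTHER */
};

/* Read one token from *src and advance *src past it. */
struct token next_token(const char **src)
{
    struct token tok = { TOKEN_EOF, 0 };
    const char *s = *src;

    while (isspace((unsigned char)*s))
        ++s;                                   /* skip whitespace */

    if (isdigit((unsigned char)*s)) {
        tok.type = TOKEN_NUMBER;
        while (isdigit((unsigned char)*s))     /* "123" becomes value 123 */
            tok.value = tok.value * 10 + (*s++ - '0');
    } else if (*s) {
        tok.type = TOKEN_OTHER;
        tok.value = (unsigned char)*s++;       /* keep the character itself */
    }

    *src = s;
    return tok;
}
```

The parser then only looks at `type` to decide which rule applies, while the reduction code uses `value`.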
■ Instead the parse-tree just defines the order in which so-called reduction functions are called.
expression: sum;
sum: product
   | sum '+' product { $$ = $1 + $3; }
   | sum '-' product { $$ = $1 - $3; };
■ Most hand written parsers are LL(1) parsers.
■ Most parser generators create LALR(1) parsers.
■ Reduce-reduce conflicts should be avoided when writing the BNF.
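
To illustrate the first point, here is a minimal hand-written LL(1) parser (recursive descent) for sums and products of integers. It is a sketch under stated assumptions: the `product` rule (numbers joined by `*`) and all function names are hypothetical, and the left-recursive `sum` rule from the grammar is rewritten as a loop, since an LL parser cannot handle left recursion directly. Error handling is omitted.

```c
#include <ctype.h>

/* Hypothetical parser state: a cursor into the input text. */
static const char *cur;

static void skip_ws(void)
{
    while (isspace((unsigned char)*cur))
        ++cur;
}

/* NUMBER: one or more digits */
static int parse_number(void)
{
    int v = 0;
    skip_ws();
    while (isdigit((unsigned char)*cur))
        v = v * 10 + (*cur++ - '0');
    return v;
}

/* product: NUMBER ( '*' NUMBER )* */
static int parse_product(void)
{
    int v = parse_number();
    for (;;) {
        skip_ws();
        if (*cur != '*')        /* one character of lookahead decides */
            return v;
        ++cur;
        v *= parse_number();
    }
}

/* sum: product ( ('+' | '-') product )* */
static int parse_sum(void)
{
    int v = parse_product();
    for (;;) {
        skip_ws();
        if (*cur == '+') {
            ++cur;
            v += parse_product();
        } else if (*cur == '-') {
            ++cur;
            v -= parse_product();
        } else {
            return v;
        }
    }
}

/* expression: sum */
int parse_expression(const char *text)
{
    cur = text;
    return parse_sum();
}
```

Each function plays the role of a reduction function: it returns the value that a yacc action would store in `$$`.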
Code Generators
● Overview
● Simple Code Generators
Overview
■ Usually the code-generation is split up into different stages, such as:
  ◆ Creating an abstract syntax tree
  ◆ Creating an intermediate code
  ◆ Creating the output code
■ A code-generator which creates assembler code is usually much easier to write than a code-generator creating binaries.
Simple Code Generators
■ This is possible if no anonymous variables exist (BFC) or the target machine is a stack-machine (SPL).
Example:
if_stmt:
    TK_IF TK_ARGS_BEGIN TK_STRING TK_ARGS_END stmt
    {
        $$ = xprintf(0, 0, "%s{", debug_info());
        $$ = xprintf($$, $5, "(#tmp_if)<#tmp_if>[-]"
                     "<%s>[-<#tmp_if>+]"
                     "<#tmp_if>[[-<%s>+]\n", $3, $3);
        $$ = xprintf($$, 0, "]}");
    }
Tools
● Overview
● Flex / Lex
● Yacc / Bison
● Burg / iBurg
● PCCTS
Overview
■ Most of these tools cover the lexer/parser step only.
■ Most of these tools generate C code from a declarative language.
■ Use those tools but understand what they are doing!
Flex / Lex
■ The lex input file (*.l) is a list of regular expressions and actions.
■ Most actions simply return the token to the parser.
Yacc / Bison
■ Bison is a parser generator.
■ The generated parser is a LALR(1) parser.
Burg / iBurg
■ iBurg is a “Code Generator Generator”.
■ The code generator generated by iBurg implements the “dynamic programming” algorithm.
■ It is a bit like a parser for an abstract syntax tree with an extremely ambiguous BNF.
■ The reductions have cost values applied and an iBurg code generator chooses the cheapest fit.
PCCTS
■ PCCTS is a parser generator for LL(k) parsers in C++.
■ The PCCTS toolkit was written by Terence J. Parr of the MageLang Institute.
■ His current project is antlr 2 - a complete redesign of pccts, written in Java, that generates Java or C++.
Complex Code Generators
● Overview
● Abstract syntax trees
● Intermediate representations
● Basic block analysis
● Backpatching
● Dynamic programming
● Optimizations
Overview
■ However, I will try to give a rough overview of the topic and explain the most important terms.
Abstract syntax trees
■ In compilers for such languages, an abstract syntax tree is created by the parser.
Intermediate representations
■ Usually the intermediate code is some kind of three-address code assembler language.
■ Intermediate representations which are easily converted to trees (such as functional approaches) are better for dynamic programming, but are usually not optimal for ad-hoc code generators.
Basic block analysis
■ Optimizations in basic blocks are an entirely different class of optimization than those which can be applied to a larger code block.
■ .. programming.
Backpatching
■ This problem is solved by outputting a dummy target address and fixing it later.
■ The Brainf*ck compiler doesn’t need backpatching because Brainf*ck doesn’t have jump instructions and addresses.
■ However, the Brainf*ck runtime bundled with the compiler is using backpatching to optimize the runtime speed.
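
As an illustration of the technique, a Brainf*ck runtime could resolve all bracket pairs once before execution. Each `[` first gets a dummy target, which is fixed as soon as the matching `]` is seen. This is a sketch of the general idea only. Whether the bundled runtime works exactly this way is not stated here, and `bf_link_jumps`, `MAX_NEST` and the `target` array are hypothetical names.

```c
#include <stddef.h>

#define MAX_NEST 256

/* For every '[' and ']' in prog, store the index of its partner in
 * target[] (which must have at least strlen(prog) entries).
 * Returns 0 on success, -1 on unbalanced brackets or too deep nesting. */
int bf_link_jumps(const char *prog, size_t *target)
{
    size_t open[MAX_NEST];   /* positions of '[' still waiting for a target */
    size_t depth = 0;

    for (size_t i = 0; prog[i]; ++i) {
        if (prog[i] == '[') {
            if (depth == MAX_NEST)
                return -1;
            target[i] = 0;             /* dummy target address */
            open[depth++] = i;
        } else if (prog[i] == ']') {
            if (depth == 0)
                return -1;
            size_t j = open[--depth];
            target[i] = j;             /* backward jump is known right away */
            target[j] = i;             /* backpatch: fix the dummy */
        }
    }
    return depth == 0 ? 0 : -1;
}
```

With such a table, executing `[` or `]` becomes a single array lookup instead of a scan for the matching bracket.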
Dynamic programming
■ Code generators such as Burg and iBurg are implementing the dynamic programming algorithm.
■ In the first phase, the tree is labeled to find the cheapest matches in the rule set (bottom-up).
■ In the 2nd phase, the code for the cheapest solution is generated (top-down).
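
The two phases can be sketched on a toy rule set. Everything here is an assumption made for illustration: the node types, the rules and their costs (load a constant: 1, add two registers: 1, add a constant to a register: 1) and the mnemonics `LOADI`, `ADD` and `ADDI`. Real Burg/iBurg generators track costs per nonterminal, while this sketch has only one ("value in a register").

```c
#include <stddef.h>
#include <stdio.h>

enum op { OP_CONST, OP_REG, OP_ADD };
enum { RULE_REG, RULE_LOAD_CONST, RULE_ADD_REG_REG, RULE_ADD_REG_CONST };

struct node {
    enum op op;
    struct node *left, *right;  /* used by OP_ADD only */
    int cost;                   /* cheapest cost to get the value into a register */
    int rule;                   /* rule chosen by the labeler */
};

/* Phase 1 (bottom-up): label every node with its cheapest rule. */
void label(struct node *n)
{
    switch (n->op) {
    case OP_REG:
        n->cost = 0; n->rule = RULE_REG;
        break;
    case OP_CONST:
        n->cost = 1; n->rule = RULE_LOAD_CONST;
        break;
    case OP_ADD:
        label(n->left);
        label(n->right);
        n->cost = n->left->cost + n->right->cost + 1;
        n->rule = RULE_ADD_REG_REG;
        /* a constant operand can be folded into an add-immediate */
        if (n->right->op == OP_CONST && n->left->cost + 1 < n->cost) {
            n->cost = n->left->cost + 1;
            n->rule = RULE_ADD_REG_CONST;
        }
        break;
    }
}

/* Phase 2 (top-down): follow the chosen rules, print one mnemonic per
 * instruction and return the number of instructions emitted. */
int emit(const struct node *n)
{
    int count;

    switch (n->rule) {
    case RULE_LOAD_CONST:
        puts("LOADI");
        return 1;
    case RULE_ADD_REG_REG:
        count = emit(n->left);
        count += emit(n->right);
        puts("ADD");
        return count + 1;
    case RULE_ADD_REG_CONST:
        count = emit(n->left);   /* the constant needs no code of its own */
        puts("ADDI");
        return count + 1;
    default:                     /* RULE_REG: value is already in a register */
        return 0;
    }
}
```

For the tree `(r + 1) + 2` the labeler picks the add-immediate rule twice, so two instructions are emitted instead of four.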
Optimizations
■ So most compilers don’t have a separate “the optimizer” code path.
  ◆ Peephole optimizations
The BF Compiler
● Overview
● Assembler
● Compiler
● Running
● Implementation
Overview
■ The assembler handles variable names and manages the pointer position.
■ The compiler reads BFC input files and creates assembler code.
■ The assembler has an ad-hoc lexer and parser.
■ The compiler has a flex generated lexer and a bison generated parser.
Assembler
■ The ] operator sets the pointer back to the position where it was at [.
Compiler
■ C-like expressions for =, +=, -=, if and while are available.
■ Macros can be defined with macro x() { ... }.
■ All variables are passed using call-by-reference.
Running
■ So compilation is done by:
$ ./bfc < hanoi.bfc | ./bfa > hanoi.bf
Code: 53884 bytes, Data: 275 bytes.
Stack Machines
● Overview
● Example
Overview
■ Every instruction pops its arguments from the stack and pushes the result back on the stack.
Example
x = 5 * ( 3 + y );
PUSHC "5"
PUSHC "3"
PUSH "y"
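
The listing above breaks off after the third instruction. To show how such code runs, here is a minimal stack machine sketch. Only `PUSHC` and `PUSH` appear in the listing. The opcodes `ADD`, `MUL` and `POP`, the integer operands (instead of strings), the variable slots and the name `vm_run` are assumptions made for this example. There is no stack overflow or underflow checking.

```c
enum opcode { PUSHC, PUSH, ADD, MUL, POP };

struct insn {
    enum opcode op;
    int arg;        /* constant for PUSHC, variable slot for PUSH and POP */
};

/* Execute n instructions; variables live in vars[]. */
void vm_run(const struct insn *code, int n, int *vars)
{
    int stack[64];
    int sp = 0;     /* index of the next free stack slot */

    for (int i = 0; i < n; ++i) {
        switch (code[i].op) {
        case PUSHC: stack[sp++] = code[i].arg;        break;
        case PUSH:  stack[sp++] = vars[code[i].arg];  break;
        case ADD:   sp--; stack[sp - 1] += stack[sp]; break; /* pop 2, push 1 */
        case MUL:   sp--; stack[sp - 1] *= stack[sp]; break; /* pop 2, push 1 */
        case POP:   vars[code[i].arg] = stack[--sp];  break;
        }
    }
}
```

With `x` in slot 0 and `y` in slot 1, the statement could be run as `PUSHC 5, PUSHC 3, PUSH 1, ADD, MUL, POP 0`. Because operands are always on top of the stack, a code generator can emit this sequence in a single post-order walk of the expression, with no register allocation.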
The SPL Project
● Overview
● WebSPL
● Example
Overview
■ It has support for arrays, hashes, objects, Perl regular expressions, etc.
■ The entire state of the virtual machine can be dumped at any time and execution of the program resumed later.
■ It’s possible to run pre-compiled binaries, program directly in the VM assembly, use multi threading, step-debug programs, etc.
WebSPL
■ It creates a state over the stateless HTTP protocol using the dump/restore features of SPL.
■ I.e. it is possible to print out an updated HTML page and then call a function which “waits” for the user to do something and then returns.
■ WebSPL is still missing some bindings for various SQL implementations, XML and XSLT bindings, the WSF (WebSPL Forms) library and some other stuff.
LL(regex) parsers
● Overview
● Left recursions
● Example
Overview
■ Usually parsers read lexemes (tokens) from a lexer.
■ Usually LL(N) parsers are LL(1) parsers.
■ LL(regex) parsers are LL parsers with no lexer but a regex engine.
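
A rough sketch of the idea: instead of asking a lexer for the next token, the parser tries a regular expression at the current input position. The example assumes the POSIX `<regex.h>` API is available. The helper `match`, the function `parse_sum` and the tiny grammar (integers joined by `+`) are hypothetical. Recompiling the pattern on every call keeps the sketch short but would be too slow for real use.

```c
#include <regex.h>
#include <stdlib.h>

/* Try to match the anchored POSIX extended regex `pattern` at *pos.
 * On success advance *pos past the match and return the match length,
 * otherwise leave *pos alone and return -1. */
static int match(const char **pos, const char *pattern)
{
    regex_t re;
    regmatch_t m;
    int len = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, *pos, 1, &m, 0) == 0 && m.rm_so == 0) {
        len = (int)m.rm_eo;
        *pos += len;
    }
    regfree(&re);
    return len;
}

/* sum: NUMBER ( '+' NUMBER )*   with the terminals given as regexes.
 * Stores the value in *result; returns 0 on success, -1 on a syntax error. */
int parse_sum(const char *text, int *result)
{
    const char *p = text;
    const char *start;

    match(&p, "^[ \t]*");                       /* optional leading blanks */
    start = p;
    if (match(&p, "^[0-9]+") < 0)
        return -1;
    *result = atoi(start);

    while (match(&p, "^[ \t]*\\+[ \t]*") >= 0) { /* the regex is the lookahead */
        start = p;
        if (match(&p, "^[0-9]+") < 0)
            return -1;
        *result += atoi(start);
    }

    match(&p, "^[ \t]*");                       /* optional trailing blanks */
    return *p == '\0' ? 0 : -1;
}
```

The lookahead is no longer a fixed number of tokens: which alternative is taken depends on which regex matches at the current position.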
URLs and References
■ “The Dragonbook”
Compilers: Principles, Techniques and Tools
by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman
Addison-Wesley 1986; ISBN 0-201-10088-6
http://www.clifford.at/papers/2004/compiler/