
Software Engineering Essentials

With a Game Programming Focus

Version 1.0

Table of Contents

Preface
    To The Reader
    Prerequisites
Memory and Hardware
    Simplified CPU Architecture and Cache Awareness
    Prefetching
    Additional Topics of Interest
Data Structures
    Additional References
Algorithms
    Sorting
        Sorting: Integers
        Sorting: Strings
        Sorting: Auxiliary Data
    Searching
        Recursion
        BFS and DFS
        Hashing
        Root Finding
        Non-linear Min/Maximization
        No Derivative
Linear Algebra
    Linear Transformations
    Affine Transformations
    Math Library
    Dot Product
    Planes and Lines
    Lines
    Interpolation
    Bilinear Interpolation
Multi-Threading
Language Design
    Lexer String Matching
    Parsing
    Code Generation
    Type Reflection
Software Design and Architecture
    File Dependencies and Messaging
    API Design
    Example Layer Problem
Game Architecture
    Compilation
    Iteration
    Memory Storage
    Run-time Object Model
    Specific Solutions

Preface
Learning software engineering takes time, patience, and never has to end. To become a solid
software engineer (at least to the author's personal standards) involves understanding one's
own limitations as a developer, followed by a continuous grind to press those professional
boundaries to the next level. This implies hard work as another ingredient to the time and
patience recipe.

The goal of this PDF is to provide readers with a trustworthy compilation of information
essential to becoming a solid software engineer. Many topics referenced in this PDF are well
presented by other authors; for brevity's sake external works are often cited instead of directly
presenting such topics with new materials.

This PDF takes a focus on game programming, which means the language of choice is C/C++.
This does not mean readers interested in other applications of computer science will not
benefit from the read. Au contraire: the author views problems in the game programming
domain as rich, plentiful, and ideal for continuous professional challenge. Anecdotally, Elon Musk
has been seen recruiting engineers from the game industry for his SpaceX endeavors due to the
challenges game programmers are used to facing (circa 2014).

To The Reader
As a writer the author prefers to start each section with a high-level overview. The overview
serves to connect various concepts together in order to allow readers to understand the
relationships between various topics. Readers interested only in direct application or technical
knowledge may feel free to skip the overview of each section.

This PDF was created upon commission. Special thanks to the commissioner, who wishes to
remain anonymous, for their generous support.

This PDF will contain many opinionated statements without any empirical evidence. Readers
will just have to deal with it. The value of this PDF is in the author's professional merit; read:
YMMV. At the time this document was created the author had no academic degree of any
kind. Readers are urged to take this document with a grain of salt and a healthy dose of
skepticism.


Prerequisites
Most of the material in this document aims to provide practical and intuitive knowledge. This
can be contrasted with the references to external materials this document makes; fundamentals
are largely assumed to be already familiar to the reader, such as:

1. Structs, functions, arrays, pointers, basic C++ template usage
2. Some familiarity with the C++ standard library
3. Compiling, linking
4. Classes, member functions, inheritance, virtual dispatch
5. C stack and heap
6. Processes, threads, some familiarity with operating system concepts
7. Familiarity with trigonometry and basic differential calculus

Littered throughout the document are references to external materials and sources. If ever the
theory presented here becomes too abstract or doesn't make sense, perhaps another source
will present the material in a manner more suited to the reader.

Here are some recommended resources for the above topics:

1. C Programming, a Modern Approach 2nd Ed. by K. N. King
2. Any C++ book, preferably multiple of them since they are all terrible (in my opinion)
3. See 1.
4. See 2.
5. See 1.
6. Wikipedia searches, google searches. These topics are fairly simple. Perhaps try to find
   an online course freely viewable to the public and read through lecture notes +
   homeworks.
7. See 6.

Memory and Hardware


Code runs on hardware and hardware contains physical limitations. The limitations imposed by
hardware give rise to software engineering creativity. There are two major forms of software
engineering creativity: abstraction and algorithms.

Abstraction
    Suppression of details by defining an alternative level of system interaction. In
    this case the system means whatever problem-solving tools the engineer
    has at his disposal. As an example, assembly languages are abstracted by the C
    programming language. The purpose of C is to generate assembly. Since
    assembly languages are often specific to the hardware they operate
    upon, an engineer may be required to learn many various forms of assembly to
    write code on various pieces of hardware.

If a hardware vendor implements a C compiler, suddenly engineers writing C code can use the
vendor's compiler to generate hardware-specific assembly. In this way C abstracts the specific
details of the assembly instructions behind a different layer of language constructs.
Algorithms
    For the purposes of this PDF algorithms are the steps, code, or rules that an
    engineer defines in order to solve problems. For example, C compilers often use
    algorithms for reading source code files, turning individual keywords into tokens.
    Algorithms are then used to turn tokens into data structures (like trees). These
    structures are then converted, via algorithms, into assembly instructions. In
    general algorithms are about reading data, transforming it, and outputting some
    new data.

Many software engineers heavily focus on abstractions and as such spend little time actually
writing useful code. Abstraction should be treated as a dirty word. When a new programmer
first learns of abstractions an illusion is pulled over their naive view of power. The illusion is
that if a little more time is spent making code more generic, with another layer of indirection,
or more reusable, it becomes 10x more powerful. Abstraction is an illusion of power.

The power that abstraction seems to provide comes from neglect of the details an
abstraction hides. If these details are not taken into consideration when they are important
then the abstraction has failed. In practice the failure rate of abstraction is immensely high, and as
such all code should be viewed through the lens of skepticism.


Skepticism of code disallows any code to exist that cannot justify its own existence through
necessity. When applied heavily, skepticism of code can prevent much over-engineering and
many harmful abstractions. There's an old saying that "dumb code is smarter", meaning the simplest
code that gets the job done is the smartest. The engineer brain constantly attempts to solve
very difficult problems, so difficult that an engineer's focus must be zeroed in on the algorithms
used to solve problems.

Distractions are costly, and abstractions muddy many (often important) details of the code. The
more time an engineer must sit and think "what is this code really doing", the more likely it is
that abstractions are completely getting in the way of productivity. In short, abstraction makes
difficult problems impossible.

If an algorithm respects the hardware it runs upon a curious phenomenon occurs: often times
the code becomes simpler. How can this be? Respecting modern hardware often comes down
to respecting the CPU cache. This translates to using contiguous arrays, which leads to
straightforward for loops. If memory is stored in arrays engineers can reason about memory
consumption and worst-case memory footprints. This can lead to pre-allocation of memory for
later-to-run algorithms, resulting in overall simpler code (as opposed to a myriad of dynamically
allocated and inter-coupled objects).

The reality of hardware is that hardware likes arrays. Modern CPUs will prefetch memory
automatically when code runs over arrays linearly. The CPU likes small amounts of data that can
fit into its tiny cache.

Simplified CPU Architecture and Cache Awareness


A CPU has cores. Each core can run a thread (and some cores on certain CPUs can run multiple
threads simultaneously). See the Multi-Threading section for more details about threads.

Assembly instructions are fed to the CPU to be physically executed. Instructions are executed
by means of registers. A CPU register holds a piece of memory, commonly 32 or 64 bits of data.
Modern SIMD registers store 128 bits of data or more. Each register can be used by the CPU to
execute an instruction.

There is a limited number of registers within a CPU, as they are very expensive to construct. If
it were possible, CPUs would contain unlimited registers for infinitely fast execution. To store
more data than what fits within the registers, the main memory (RAM) is used. RAM stands for
random access memory, which means the RAM can be accessed more or less like a giant array.
When the CPU puts data into a register, that data came from the RAM.

Pulling data in from memory into a register is a very slow operation, often hundreds or
thousands of times slower than executing simple instructions. Modern CPUs employ memory
caches. Conceptually a CPU memory cache acts like a closer to the CPU, but smaller, piece of
RAM. Fetching memory from the cache is often an order of magnitude or so faster than
fetching from main memory.

The CPU loads memory into the cache via cache lines. A cache line is a small chunk of memory,
often 64 bytes in size. The entire cache consists of cache lines. Whenever a program requires
access to any data in main memory an entire cache line will be pulled into the cache. Cache
lines are aligned to memory boundaries according to their size. This means that if the executing
program requires a single byte of data from main memory, and this data does not already exist
in the cache, the CPU will pull in the cache line on the 64-byte boundary the single byte lays
upon.

When cache lines are pulled into cache, old cache lines are evicted. Which line is
evicted depends on the implementation of the CPU, but will likely be the least recently used
cache line (or perhaps decided by some heuristic). The reader is referred to Naughty Dog's
Dogged Determination presentation for more CPU information.

The implication here is that whenever a cache line is read it is up to the programmer to try to
use as much of that cache line as possible. Even though we might have 8 gigabytes of RAM, if
we aren't running in the cache the CPU will be sitting there waiting. Even if a single byte is read
from main memory an entire cache line will be fetched. Reading a single byte from a random
location in main memory is about the worst possible way to use memory. Try to use every
single byte of every single cache line pulled in from memory.

This tends toward the idea of using very compact and concise data structures. If a data
structure is packed together in memory it can be operated upon by the CPU very quickly once it
arrives in the CPU's cache.
The cache isn't very big. Here's a nice slide by Scott Meyers on the topic:

32KB of L1 data cache is tiny. You don't even get to use all of it, as the operating system does
need to do stuff too!

Here are some additional references:

    CPU Caches and Why You Care (video) by Scott Meyers
    What Every Programmer Should Know About Memory (pdf) by Ulrich Drepper et al.
    Managing Data Relationships (blog post) by Noel Llopis


Prefetching
Prefetching exists to try to hide the latency of fetching memory and placing it into the CPU's
cache. A prefetch is when a cache line is preemptively fetched and placed into the cache, such
that when the memory is actually requested a cache hit occurs.

Hardware can detect patterns in real memory accesses, but it can only detect pretty simple
patterns like array traversals. Hardware is made in such a way that it can detect iterating over
arrays forwards, backwards, and with variable (but constant) element step size. It can also do
this for all hardware threads simultaneously. However, if you're not looping over an array you
can't count on any intelligent prefetching. It will take two or more cache misses in a
recognizable pattern to start automatic prefetching.

Usually compilers provide a specific intrinsic to hint to the CPU to grab a specific cache line
from somewhere in memory. This can be used by programmers to ease out a final bit of
performance, given a proper access pattern to prefetch for.
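
As a concrete sketch, here is what a manual prefetch might look like while walking a linked
list, which is exactly the case the hardware prefetcher cannot predict. The node layout and the
choice to prefetch one node ahead are illustrative assumptions; real code must be profiled.
_mm_prefetch comes from the SSE intrinsics header (GCC and Clang also offer
__builtin_prefetch):

#include <xmmintrin.h>

struct Node
{
    int value;
    struct Node* next;
};

int SumList( struct Node* n )
{
    int sum = 0;
    while ( n )
    {
        // Hint the CPU to start pulling the next node's cache line in
        // while we work on the current node.
        if ( n->next )
            _mm_prefetch( (const char*)n->next, _MM_HINT_T0 );
        sum += n->value;
        n = n->next;
    }
    return sum;
}

Note that prefetching only one node ahead rarely hides the full memory latency; the point here
is the mechanism, not the tuning.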

Additional Topics of Interest

Branch Prediction
Register Pressure
Custom allocators
False sharing
Pointer aliasing
CPU pipelining
Superscalar and SIMD architecture

10 Data Structures

Data Structures
Data structures are incredibly important. Learn them all, implement them all. Ideally be able to
implement each one on demand (not necessarily super efficiently, but well enough to solve an
interview problem). Having a working knowledge of most data structures allows an engineer to
understand problems with more intellectual vocabulary.

There are a plethora of freely available resources on each of these topics. I encourage readers
to go forth and google search each of these topics until their souls hunger no longer:
Array
Linked List (link 1, link 2)
Trees
BST
Trie
Stack
Queue
Heap
Hash Table
Graph

Additional References

    C Programming, a Modern Approach 2nd Ed. by K. N. King
    Wikipedia
    Bitsquid Foundation library
    Lua the programming language (advanced hash table implementation)

Algorithms
Sorting
In real life there are usually only a couple of ways sorting goes down: std::sort is used, or some
other library function is used. In very rare cases a custom sorting routine can be employed,
though honestly this probably will not be necessary 95% of the time.

The downside to sorting is that for some reason many big tech firms have asked me about a
bunch of annoying sorting algorithms. A solid software engineer would be out of his mind to
go off implementing complicated sorting algorithms from scratch when battle-tested, bug-free,
efficient library implementations are readily available. Not only is writing from-scratch code
risky in terms of bugs and efficiency, it's also an expensive time sink. Despite these arguments I
am consistently asked about sorting algorithms by companies like Microsoft, SpaceX, Google,
etc.


I'm not really sure what the fascination is here, especially when sorting algorithm questions are
bundled with big O notation questions, but the fact is that since these questions are asked by
interviewers the ability to answer them is valuable.
Sorting: Integers
Integer sorting is actually an extremely common task. Integers can be used to index into arrays
or represent important information, and as such integers are very commonly sorted. Luckily
integers are one of the simplest (if not the simplest) types of memory that can be sorted.

In production code go ahead and include <algorithm> and std::sort your array of integers. For
interview questions I would recommend being able to produce, on the spot:

    2-3 N^2 sorting algorithms (like bubble sort, selection sort, etc.)
    Quick Sort or Merge Sort
    Counting Sort (and Radix Sort)

These sorting algorithms are not too difficult to understand and subsequently memorize,
however it can be a huge annoying pain to do so. Stick with it, grind through the slow process of
practicing these algorithms and your interviews will likely pay off.
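
Of the three, counting sort is the one least often seen in the wild, so here is a minimal sketch.
It assumes the integers fall in a known small range [0, k); the 1024 cap is an arbitrary example
bound:

// Sorts values in-place, assuming every value lies in [0, k).
void CountingSort( int* values, int count, int k )
{
    int counts[ 1024 ] = { 0 }; // assumes k <= 1024

    // Tally how many times each value occurs.
    for ( int i = 0; i < count; ++i )
        counts[ values[ i ] ]++;

    // Write each value back out in order, as many times as it was seen.
    int write = 0;
    for ( int v = 0; v < k; ++v )
        for ( int c = 0; c < counts[ v ]; ++c )
            values[ write++ ] = v;
}

This runs in O(n + k) time, which beats comparison sorts whenever k is small relative to n.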
Sorting: Strings
String sorting is difficult to do in an efficient manner. Strings are defined as arrays of integers,
and depending on the format each individual character can have various lengths (like in UTF-8
encoding). Each individual string can also be variable length. The problem becomes sorting an
array of arrays of integers. If the strings are stored in nul-terminated format then each string's
length may be unknown until computed.

Unfortunately the C++ std library is rather ill-equipped with a good mechanism for
high-performance string sorting. std::string is capable of performing a silly number of heap
allocations and deallocations during std::sort's swap loops. Pairing std::string and std::sort may
be a good general-purpose sorting routine, however when a more high-performance solution is
required alternative solutions must be used.

Make sure to research any companies you are applying to beforehand. Smaller companies are especially likely to
ask a highly specific set of questions. For example I interviewed at multiple small game studios (10-30
employees) where not a single sorting algorithm question was asked. Instead these smaller companies focused on
asking questions relevant to their studio culture, or founder's style. All in all smaller companies seemed to respond
keenly when I portrayed myself as a product-focused candidate, interested in creating a great product.

Often times the best solution is to stop using strings and rethink the problem. Is sorting strings
really a necessity? Can the problem be transformed into a similar format requiring the sorting
of integers instead? If the answer is no, some potential solutions may be:

    Insert strings into a trie. A depth-first search will yield a lexicographically correct sorting
    of the strings.
    Implement a string radix/bucket sort.

However it must be stressed that most problems likely do not require highly efficient string
sorting algorithms. Engineers often just have a certain itch to implement complicated solutions
when much easier ones suffice, resulting in wasted implementation time, high risk of bugs,
and in the end poor efficiency will likely arise either way. This is just the nature of complex
algorithms! Often times a simpler and specific approach, tailored to the exact problem to solve,
will end up highly efficient.
Sorting: Auxiliary Data
Often times data is sorted in some type of { key, value } format. When keys must be sorted
the values may need to be sorted as well. Sorting key value tuples can be fairly tricky in C and
C++ since C code must explicitly state, at compile time, what kind of data will flow through the
code.

Some sort of code-generation mechanism can allow an engineer to use a sorting routine
generated on the spot for an instance of a type of data. C++ templates are the obvious
choice and work quite well. Since std::sort is a template and can take an optional predicate,
efficient sorts can be generated, as the predicate will likely be inlined during the code
generation phase of compilation.


Using std::sort in this manner will swap around both the keys and values in memory together,
assuming each are stored in a contiguous struct. Here's an example:

struct Pair
{
    Key key;
    Value value;
};

bool SortPredicate( const Pair& A, const Pair& B )
{
    return Key_A_LessThan_B( A.key, B.key );
}

Pair pairs[ N ];
std::sort( pairs, pairs + N, SortPredicate );

An alternative method can be to construct an array of indices into the array of { key, value }
tuples. The predicate can use each index to look up a { key, value } pair, like so:

bool SortPredicate( const int& iA, const int& iB )
{
    Pair& A = Pair::instances[ iA ];
    Pair& B = Pair::instances[ iB ];
    return PairCompare( A, B );
}

It should be understood that when using C++ templates, often C++-styled solutions are
necessary. This sorting predicate is a great example. In order to get std::sort to sort the array of
indices based on the Pair structs they represent, a static member variable was employed to
allow the predicate to access the array of Pair instances. This solution reeks of C++-isms; not
that this is necessarily a bad thing! It just ought to be understood.
If a different code generation technique were used, perhaps code can be generated that looks
much more like so:

void QuickSort( int* indices, Pair* pairs, int pairCount )
{
    // ...
    if ( SortPredicate( A, B ) )
        Swap( indices[ 0 ], indices[ 1 ] );
    // ...
}

In the above example the predicate is inlined into the sorting routine, and the routine has an
understanding of the relationship between the indices array and the pairs array. Since this
solution intimately understands the specific data it is sorting, a simple solution arises, one
without the need for static member variables, references, or any other C++-isms. Implementing
a code generator, or finding a suitable one, can be a time-consuming task. In the end the style
of code-gen used is best chosen based on the project at hand, along with the problems the
project demands be solved.

Searching
Most of the time searching involves the traversal of data structures. Depending on the project,
searching mathematical functions can also come up. If an engineer has solid knowledge of data
structures, the problem at hand, and the relationships between data, coming up with
efficient-enough search algorithms will likely be only a matter of time.

Recursion
Recursive searching operations are usually good for on-the-fly debugging routines. For example,
say a tree structure is used to hierarchically organize some data. In games programming
common use cases are graphics scene organization and physics scene organization. Another
example could be to represent file modifications as a tree of file sections. This would allow for
colliding files together to search for merge conflicts in source control software.

The reason recursion is stated as useful for debugging is: recursive search techniques are often
fairly quick and easy to get up and running in a robust manner. Each recursive function call uses
the stack to allocate space to push old registers, space for local variables, etc. This alleviates the
need for the user to manage any memory manually.

However, moving data around on the stack can consume quite a bit of memory during recursive
searching, and in many cases can lead to a stability risk as stack space can be exhausted. The
alternative is to use manual memory handling techniques, such as pre-allocated memory and
stack-based data structures.

Recursion is also useful for creating small parsing tools, or doing other kinds of data
preprocessing. Preprocessing steps typically do not occur when clients use shipped software
and can be run off-line, and as such don't usually need to be too performant.

Very often interviewers ask recursion-related programming problems, so knowing how to
search through a tree, or a graph, are highly valuable skills. Common approaches to recursive
searching are breadth-first searches and depth-first searches.

BFS and DFS


BFS stands for breadth-first search, while DFS stands for depth-first search. Either can be
implemented recursively or iteratively. There are a million resources on these topics, most of
which explain these concepts exhaustively. A quick google search will yield all the information
anyone could ever hope for on these topics.


My favorite of the searches mentioned here would be the iterative DFS. More often than not
the DFS will behave in a more cache-friendly way than the BFS due to the orders of
traversal and layouts of the traversed data structures. Here's some pseudo code (which depicts
code I have written in production-ready projects):

#define Push( node ) stack[ stack_pointer++ ] = node
#define Pop( ) stack[ --stack_pointer ]

Node* DFS( Node* graph )
{
    const int N = 256;
    int stack_pointer = 0;
    Node* stack[ N ];
    Push( graph );

    while ( stack_pointer )
    {
        Node* seed = Pop( );

        if ( Done( seed ) )
        {
            ClearState( graph );
            return seed;
        }

        seed->visited = 1;

        Edge* edge = seed->edge_list;
        while ( edge )
        {
            Node* B = edge->B;
            if ( !B->visited )
                Push( B );
            edge = edge->next; // advance the edge list, or this loops forever
        }
    }

    return NULL; // searched the whole graph without finding a match
}

Hashing
From Wikipedia:
    A hash function is any function that can be used to map data of arbitrary size to
    data of fixed size.

The way I interpret this is: transform any data into an integer. This is useful since integers can
index into arrays. Hashing strings is a very common operation, and so I'd like to recommend
two of my favorite string hashing algorithms: djb2 and FNV-1a. These two hashing algorithms
are straightforward, don't require much code to implement, and can easily be inserted into
pre-existing projects. They are also good enough for most applications.
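
To show just how little code is involved, here is a minimal sketch of the 32-bit FNV-1a variant
(the offset basis and prime are the published FNV constants):

#include <stdint.h>

uint32_t FNV1a( const char* s )
{
    uint32_t h = 0x811C9DC5; // FNV offset basis
    while ( *s )
    {
        h ^= (uint8_t)*s++; // xor in each byte of the string
        h *= 0x01000193;    // then multiply by the FNV prime
    }
    return h;
}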
However most large code bases I've seen employ a CRC-32 hashing algorithm of some kind. For
example the Unreal Engine can be seen to contain an implementation within crc.h, where the
MemCrc32 function seems to be used quite often throughout Unreal.

Root Finding
Sometimes projects require simulations of some kind that involve finding the roots of
two-dimensional functions of the form f( x ) = y. These roots are the points where the function
is zero valued (i.e. where the curve crosses the x axis, at f( x ) = 0). Some examples of
situations where this can be useful are:

    Vehicle dynamics simulation
    Pre-calculating shader parameters
    Continuous collision detection
    Statistics

More often than not the functions in question are linear, or quadratic, or simple enough that
the Newton-Raphson method will work well. Specifically, if the assumptions made in the Proof
of Quadratic Convergence are met then Newton's method will converge.

Newton's method works by using a tangent line approximation of a curve, intersecting the
tangent with the x axis at y = 0, and using this point as the starting point of the next iteration.
Wikipedia has a wonderful animation showing the algorithm converge (shown on the right of
the previous link).

If the derivative is not known or is hard to calculate, the secant method can be used instead. A
secant is a line that intersects two points of a curve. If these two points are close enough
together they can themselves approximate the derivative of the function in question.
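
Here is a minimal Newton-Raphson sketch, under the assumption that the caller supplies both f
and its derivative; the iteration cap and tolerance are arbitrary example values:

#include <math.h>

float NewtonRaphson( float (*f)( float ), float (*dfdx)( float ), float x0 )
{
    float x = x0;
    for ( int i = 0; i < 50; ++i )
    {
        float y = f( x );
        if ( fabsf( y ) < 1.0e-6f )
            break; // close enough to a root
        // slide down the tangent line to where it crosses y = 0
        x = x - y / dfdx( x );
    }
    return x;
}

The secant method is the same loop with dfdx( x ) replaced by the slope between the two most
recent sample points, ( f( x ) - f( x_prev ) ) / ( x - x_prev ).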


Non-linear Min/Maximization
If one is lucky enough to be dealing with a non-linear function that has an easy-to-compute
derivative, finding minima/maxima can make use of searching for where the derivative function
is equal to zero.

In this case a root finding algorithm can be used to search for a local minimum/maximum by
inspecting the derivative and treating df/dx( x ) = 0 as the root. However, in order to do so the
minimum/maximum must first be bracketed.

Imagine our non-linear function looks like the curve above. To bracket the minimum means to
find points 1 and 4 (known as sample points) such that it is known that at least one minimum
lies between the sample points.

Bracketing algorithms are conceptually simple, though care must be taken during
implementation! Imagine we start with sample point 1, and move along the derivative to
sample point 2. We now know our derivative increased, but is still negative; we have traveled
downhill. Onward to sample point 3: derivative increased, but still negative; downhill again.
Finally we arrive at sample point 4, at which the derivative is positive. It is now known that at
least one minimum exists somewhere between sample points 3 and 4, and in most cases
there's likely only a single minimum.

Once bracketed, the sample points 3 and 4 can be handed off to a root-finding algorithm to
iteratively refine the derivative towards 0. Once close enough, as defined by some error
tolerance (typically a hard-coded value), the minimum is found.

The same kind of technique can be applied to finding maxima. However, how can it be known
that the current minimum or maximum is the global minimum, or the global maximum? Well,
the short answer is: it's not possible to know without searching the entire non-linear space
(unless special knowledge is known a priori about the function).

Say our function looks like the above cross-section. This function is quite non-linear and
contains various minima. Typically the previous techniques can be used to find any particular
minimum given a starting point, however it can be difficult to determine if a minimum is the
global minimum. One technique for searching for the global minimum is called simulated
annealing.

Annealing is a term from metallurgy. The idea is that if a metal is molten and allowed to cool
down very quickly, all of the molecules or atoms will be aligned in random orientations,
resulting in a brittle metal. If the metal cools slower, the atoms have time to align themselves
in a more orderly fashion, thus relieving internal stresses and resulting in a less brittle metal.

We can take the concept of annealing and apply it to sample points. Imagine finding a minimum
with the previously mentioned bracket + search technique. This is analogous to tossing a ball
randomly into the cross-section. The ball will land somewhere, probably in a local minimum (as
opposed to the global minimum). Annealing is like violently shaking the cross-section so the ball
bounces out of the first minimum, and lands somewhere else. As this continues we shake less
and less violently, until the ball (likely) settles into the global minimum.

Simulated annealing can make use of a "virtual temperature", which represents a radius
from which sample points can be taken. The temperature can then be dialed down over
multiple searches, while keeping track of the best minimum. Once cooled completely, the
annealing process can be repeated as long as necessary to feel comfortable with the best found
minimum, which is then asserted as the global minimum (on good faith).
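
Here is a minimal sketch of the temperature-as-radius idea. It keeps only improving samples
(a simplification of full annealing, which sometimes accepts worse ones to escape local minima);
the cooling rate and iteration counts are arbitrary example values:

#include <stdlib.h>

// Random float in [ -radius, radius ].
float RandomInRadius( float radius )
{
    return ( (float)rand( ) / (float)RAND_MAX * 2.0f - 1.0f ) * radius;
}

float Anneal( float (*f)( float ), float x0 )
{
    float best_x = x0;
    float best_y = f( x0 );
    float temperature = 10.0f; // initial sampling radius

    while ( temperature > 0.001f )
    {
        // Shake the ball around the best spot found so far.
        for ( int i = 0; i < 100; ++i )
        {
            float x = best_x + RandomInRadius( temperature );
            float y = f( x );
            if ( y < best_y )
            {
                best_x = x;
                best_y = y;
            }
        }
        temperature *= 0.9f; // cool down, shake less violently
    }

    return best_x;
}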


Readers are pointed to Timothy Masters' Practical Neural Network Recipes for more
information on non-linear optimization and annealing.
No Derivative
If one is unlucky enough to not have the luxury of a derivative function at hand, secants can be
used to approximate the derivative. Care must be taken during implementation to bracket a
minimum without making any false assumptions! Often times this kind of
minimization/maximization is only useful if a lot of information about the function at hand is
available.

For example let us inspect Pacejka's magic formula, which is useful for car tire simulation. The
function takes in the slip ratio and outputs an expected force exerted onto the ground by the
tire. Here is a graph of a typical (simplified) Pacejka function:

https://en.wikipedia.org/wiki/Hans_B._Pacejka

Pacejka's magic formula, for professional car simulation purposes, can take very many coefficient parameters.
For many game-ish simulations the simplified model can be used, which takes 4 tunable coefficient parameters
(as shown in the Wikipedia image above).


This function is defined as:

F( x ) = D * sin( C * atan( B * x - E * ( B * x - atan( B * x ) ) ) )

If we assign b = B * x we arrive at:

F( b ) = D * sin( C * atan( b - E * ( b - atan( b ) ) ) )

It can easily be seen that the maximum and minimum reside at y = D and y = -D respectively.
This allows one to calculate an error from the minimum or maximum! If error can be known
then the maximal points can be searched for. It is also easy to see there is only one maximum
and one minimum, which allows the initial bracketing algorithm to make various assumptions!

Once bracketed, the bracket interval can be iteratively refined via methods similar to bisection
until convergence with respect to a tolerance is achieved.

Linear Algebra
Games are made of 2D or 3D spaces. These spaces are defined by the mathematics of linear
algebra. Linear algebra can provide an engineer with an intellectual vocabulary to think and
reason about the space a game occupies. This ability is invaluable for defining how things
move, interact with one another, and in general operate within a space.

Readers are heavily urged to get the book Essential Mathematics for Games and Interactive
Applications by Van Verth. This book covers all critical mathematics required for game
development and is presented spectacularly! My writings here will not do the readers justice
on their own, and should be thought of as supplements to a more well-rounded educational
foundation.


Linear Transformations
A linear transformation is one that preserves arrangements of lines. For example, if line A and
line B do not intersect and both undergo a linear transformation, the results A' and B' will not
intersect.

Linear transformations in 3D can be represented as a 3x3 matrix. Often times these matrices
are used to transform points or vectors like so:

[ a11 a12 a13 ]   { x }   { x' }
[ a21 a22 a23 ] * { y } = { y' }
[ a31 a32 a33 ]   { z }   { z' }

If we call the matrix itself A and the vector v, we can rewrite the above equation as:

A * v = v'

If we have two matrices A and B, both of which are linear transformations, we can compose
the matrix C such that:

C = A * B

C * v = ( A * B ) * v = v'

In words: C will transform v the same way as ( A * B ). Common linear transformation
operations are (but not limited to): rotation, scaling, shear, flipping (or mirroring), etc.
These properties can be used to compose trees of transformations in order to form a
hierarchical animation model. In the case of animation, the matrices themselves can be thought
of as bones in skeletal animation.

However linear transformations alone aren't quite enough to implement full skeletal animation,
or to implement other interesting transformations.
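
As a small concrete sketch, here is what the matrix-times-vector operation above might look
like in code. The Vec3/Mat3 types and row layout are illustrative assumptions (the Math Library
section later uses a very similar shape):

struct Vec3
{
    float x, y, z;
};

struct Mat3
{
    Vec3 x, y, z; // rows of the matrix
};

float Dot( Vec3 a, Vec3 b )
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// v' = A * v: each component is a row of A dotted with v
Vec3 Mul( const Mat3& A, Vec3 v )
{
    Vec3 out;
    out.x = Dot( A.x, v );
    out.y = Dot( A.y, v );
    out.z = Dot( A.z, v );
    return out;
}

Composing two transforms then amounts to Mul( A, Mul( B, v ) ), or precomputing C = A * B
once and reusing it.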


Affine Transformations
An affine transformation applies a linear transformation, followed by a translation.
Affine transformations can change the positions of points through translation, whereas linear
transformations preserve linear arrangements.

An affine transformation in 3D has the form:

A * v + b = v'

In the above equation A * v linearly transforms the vector (or point) v, and then adds a vector b
to the result, thus resulting in a new value v' (which is a point). If readers are confused by the
difference between vectors and points a quick google search should remedy the confusion! In
short: vectors are directions and points are locations. Knowing this one can, in a hand-wavy
way, say: linear transformations modify vectors while affine transformations modify points.

Most modern graphics software, and many older educational graphics materials, use 4x4
matrices, or 4x4 matrix notation. We can view a 4x4 matrix as an affine transformation. Let A
be a 3x3 matrix and b be a vector; we then have:

M = [ A  b ]
    [ 0  1 ]

where A occupies the upper-left 3x3 block, b is the translation column, and the bottom row
is { 0 0 0 1 }.

M is a 4x4 matrix, and can be used to transform 4-component vectors. In this 4-dimensional
linear algebra notation 3D vectors are represented as:

v = { x y z 0 }

while a point would have the number 1 in place of the 0. What is the reason for this strange
notation? The answer is: an affine (or, more generally, projective) transformation in N
dimensions can be represented as a linear transformation in N+1 dimensions.


This is useful when trying to describe view matrices (such as a projection matrix), which are
non-linear in 3D. Once lifted to the fourth dimension the problem can be treated in a linear
manner, and once completed, can then be projected back down to the third dimension. This
can be thought of as a qualitative description of homogeneous coordinates.

In regards to skeletal animation from the previous section: with affine transformations, each
bone can both rotate and translate its children as the hierarchy is composed.

Math Library
Do not use templates in your math libraries. Templates will generate tons of code in every
single translation unit that the compiler operates upon. Once all this code-gen is finished the
linker has to painstakingly churn through the turmoil of duplicate symbols a bajillion times!
Just write out your specific math functions as needed. Solve specific problems that the project
actually demands.

I recommend taking Erin Catto's lead in his Box2D library and writing a math library that
models affine transformations like so (very similar code can be seen in other very popular
physics engines, both professional and open-source):
struct Rotation
{
    // Rows of matrix, aka orthonormal basis
    Vec3 x;
    Vec3 y;
    Vec3 z;
};

struct Transform
{
    // linear component (3x3 matrix)
    // can optionally contain scaling
    Rotation r;

    // affine component
    Vec3 p;
};

// performs an affine transformation
// A * v + b = v'
// where A = t.r, b = t.p
Vec3 Mul( Transform t, Vec3 v )
{
    return Mul( t.r, v ) + t.p;
}

// "transposed", or inverse affine transformation
// v' = A^T * (v - b)
Vec3 MulT( Transform t, Vec3 v )
{
    return MulT( t.r, v - t.p );
}

Here is a good resource by Richard Mitton on constructing solid math libraries, which also takes
into account SIMD instructions (more on SIMD later).


Dot Product
The dot product comes from the law of cosines. Here's the formula:

c^2 = a^2 + b^2 - 2ab * cos( θ )    (1)

This is just an equation that relates the cosine of an angle within a triangle to its various side
lengths a, b and c. The Wikipedia page (link above) does a nice job of explaining this.
Equation (1) can be rewritten as:

c^2 - a^2 - b^2 = -2ab * cos( θ )    (2)

The right hand side of equation (2) is interesting! Let's say that instead of writing the equation
with side lengths a, b and c, it is written with two vectors: u and v. The third side can be
represented as u - v. Re-writing equation (2) in vector notation yields:

|u - v|^2 - |u|^2 - |v|^2 = -2|u||v| * cos( θ )    (3)

Which can be expressed in scalar form as:

( u_x - v_x )^2 + ( u_y - v_y )^2 + ( u_z - v_z )^2
- ( u_x^2 + u_y^2 + u_z^2 ) - ( v_x^2 + v_y^2 + v_z^2 ) = -2|u||v| * cos( θ )    (4)

Crossing out some redundant terms, and getting rid of the -2 on each side of the equation, this
ugly equation can be turned into a much more approachable version:

u_x * v_x + u_y * v_y + u_z * v_z = |u||v| * cos( θ )    (5)

Equation (5) is the equation for the dot product. If both u and v are unit vectors then the
equation will simplify to:

dot( u, v ) = cos( θ )    (6)

If u and v are not unit vectors, equation (5) says that the dot product between both vectors is
equal to cos( θ ) scaled by the lengths of u and v. This is a nice thing to know! For
example: the squared length of a vector is just itself dotted with itself.

If u is a unit vector and v is not, then dot( u, v ) will return the distance which v travels in
the u direction. Here's a slide from a slideshow I created some time ago about this property:

This is useful for understanding the plane equation in three dimensions (or any other
dimension):

a*x + b*y + c*z - d = 0    (7)

The normal of a plane would be the vector { a, b, c }. If this normal is a unit vector, then d
represents the distance to the plane from the origin. If the normal is not a unit vector then d is
scaled by the length of the normal.

To compute the distance of a point to this plane, any point can be substituted into the plane
equation, assuming the normal of the plane equation is of unit length. This operation computes
the distance along the normal a given point travels. The subtraction of d can be viewed as
translating the plane to the origin in order to convert the "distance along the normal" to a
"distance to the plane".
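
Using the Vec3 and Dot helpers sketched earlier, the signed point-to-plane distance described
above might look like this (the Plane struct, storing the plane as dot( n, p ) = d with a
unit-length n, is an illustrative assumption):

struct Plane
{
    Vec3 n;  // unit-length plane normal { a, b, c }
    float d; // distance of the plane from the origin
};

// Signed distance from point to plane: positive in front, negative behind,
// zero when the point lies on the plane.
float DistanceToPlane( Plane plane, Vec3 point )
{
    return Dot( plane.n, point ) - plane.d;
}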

More on planes and lines in the next section.


Planes and Lines


The equation of a plane in 2D (which can be thought of as a 2D line) looks like:

a*x + b*y - c = 0

In 3D we add on the z component and rename the c from 2D to d in 3D:

a*x + b*y + c*z - d = 0

Hopefully something has jumped out at the readers! If not, here's another slide I made some
time ago:

Planes are a dot product! Along with this d term. { a, b, c } is called the plane normal. { x, y, z }
is any point on the plane. If we put any numbers into { x, y, z }, and then subtract d from the
dot product with the normal, we end up with the signed distance of that point from the plane
(zero whenever the point lies on the plane). In this way d can be thought of as representing
the distance of the plane to the origin.

Since d = a*x + b*y + c*z for points on the plane, d is equal to the normal dotted with the
inputs. By definition of the dot product, this means d is scaled by the length of the normal. If
the normal is unit length, d represents distance to the origin in normalized units.

These facts allow us to piece together an algorithm to take any point and project it onto the
surface of any plane (here's another one of my old slides):

(NSFW) https://www.youtube.com/watch?v=OoAlf0-U7EA


The above slide is describing how to compute a vector (shown in red) that can be used to take P
straight to the plane by a translation. If we subtract this vector from P, P is translated to the
plane (the first equation in the slide).
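
A minimal sketch of that projection, again assuming the Plane struct and helpers from above
along with component-wise Vec3 subtraction and scaling operators:

// Project point P onto the plane by walking back along the normal by
// the signed distance.
Vec3 ProjectToPlane( Plane plane, Vec3 P )
{
    float distance = Dot( plane.n, P ) - plane.d;
    return P - plane.n * distance;
}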
If the reader has been diligently following along, the reader ought to be able to construct an
algorithm to rotate and translate a plane! Translating a plane can be done by adjusting d, and
rotating a plane can be done by adjusting the normal. In this way the normal can be thought of
as the linear component, and d can be thought of as the affine component.

Finding the intersection point of two 2D planes involves taking two 2D plane equations, setting
them equal to one another, and solving for the point of intersection. It's a straightforward
two-equations-two-unknowns scenario, and can be solved in-code directly via algebraic
substitution, as in the sketch below.
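
For instance, a direct solve of the two-equations-two-unknowns case (lines stored as
a*x + b*y = c) might look like this sketch, using Cramer's rule:

#include <stdbool.h>

// Intersect a0*x + b0*y = c0 with a1*x + b1*y = c1.
// Returns false when the lines are parallel.
bool IntersectLines2D( float a0, float b0, float c0,
                       float a1, float b1, float c1,
                       float* x, float* y )
{
    float det = a0 * b1 - a1 * b0;
    if ( det == 0 )
        return false; // parallel (or identical) lines
    *x = ( c0 * b1 - c1 * b0 ) / det;
    *y = ( a0 * c1 - a1 * c0 ) / det;
    return true;
}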
Lines
In 3D, lines commonly take on the vector form. Vector form is nice since a vector can be used
to describe the line's direction. A point in 3D space on the line (any point) can be used to fix
the line's direction to a location. In this way the line's direction vector becomes bound.

Let the aforementioned point be named P and the direction vector be named D. We then have:

L( t ) = P + D * t

Where L is the line and t is a scalar parameter (or in other words, just a floating point number).
If we multiply D with t we end up stretching, shrinking, or flipping D. If we add D * t to P, we
slide P along the D direction. In this way the equation can be used to describe any point on the
line L.


Interpolation
Interpolation is documented all over the place! Some good terms to google search for are:
LERP, bilinear interpolation, SLERP, Bezier curves, splines, etc.

Here's an old post of mine describing splines, lerp, and cubic Bezier curves. Unfortunately I
couldn't find a good resource on linear interpolation (lerp) from a quick google search, so here
goes a quick explanation:

Recall from the earlier Planes and Lines section, under Lines, that the vector form of a 3D line
looks like:

L( t ) = P + D * t

Where L is the line, D is a direction vector, P is a point fixed on L, and t is the scalar parameter.
Now let's say we have two points A and B instead of just one single point, and let's restrict t to
the (closed) interval [0, 1]. We then have:

P = A * ( 1 - t ) + B * t

This equation can be rearranged to form:

P = ( B - A ) * t + A

In the above equation we see the expression B - A, which results in a vector. This vector points
from A in the direction of B (when we fix it to the point A via the + operator), and is scaled by t.
The t value determines how far along the interval from A to B the point P resides.

The last two equations are called lerp, which stands for linear interpolation. Lerp lets an
engineer define points between a range of two other points. Points are not the only type of
data that can be lerp'd; scalars, integers, and pretty much whatever else the heart desires can
be lerp'd. Lerp returns values between a range, which is a form of interpolation, whereas
extrapolation can return a value outside of a range. If we use lerp with a value of t outside the
interval [0, 1] we would be performing an extrapolation. When I mentioned the concept of
tangent line approximation in the root finding section, that approximation is a form of
extrapolation.
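
In code, lerp is a one-liner; here is a scalar sketch matching the last equation above (the same
shape works for Vec3 or anything else with + and * defined):

float Lerp( float a, float b, float t )
{
    return a + ( b - a ) * t;
}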

Bilinear Interpolation
Bilinear interpolation is done by:

    Lerp from A to B, call the result P
    Lerp from C to D, call the result Q
    Lerp from P to Q, this is the result of bilinear interpolation

Bilinear interpolation is used in modern graphics cards to pull colors out of a rectangular
texture. The idea is to lerp along the x axis of the texture on two neighboring rows of pixels,
then lerp between those two results along the y axis to produce an intermediary color. Usually
the lerp operations are performed on a 2x2 square of pixels. This is called bilinear filtering.
Modern GPUs achieve blazing fast filtering speeds by doing each of these bilinear interpolations
in parallel to one another, likely implemented directly in hardware.
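
Using the Lerp from the previous section, a sketch over four corner values a, b, c, d (a 2x2
pixel block, bottom row a-b and top row c-d) might look like:

float Bilerp( float a, float b, float c, float d, float tx, float ty )
{
    float p = Lerp( a, b, tx ); // along x, bottom edge
    float q = Lerp( c, d, tx ); // along x, top edge
    return Lerp( p, q, ty );    // along y, between the two edge results
}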
By understanding this one can implement a quick and dirty bilinear filtering scheme for
pixel-art games in 2D. Say a game is set up to have a perfect 1:1 texel-to-pixel ratio. In a shader
it can be quite easy to offset each pixel by { 0.5f, 0.5f } to trigger bilinear filtering across the
entire screen, which will happen "for free" since the GPU performs bilinear filtering in the
hardware itself. { 0.5f, 0 } and { 0, 0.5f } can be used to trigger filtering along a specific axis.

Multi-Threading
In my highly opinionated opinion there are no good resources available anywhere on good
ways to go about multi-threading. There just aren't. Why? Who knows! Perhaps the dudes who
are good at multi-threading are so dang tired after work every day they have no energy to write
documents like this one. Or perhaps they get so tired and rich (rich because they are paid
oodles of money for being good at something so ridiculously hard) they just retire and leave the
rest of us in the dust.

Jokes aside, I firmly believe there's only one good way to multi-thread code, and that's through
a job system. Forget locks, semaphores, turnstiles, condition variables, critical sections,
atomics, or anything else related to multi-threading. The job pool (aka thread queue, or job
system) is by far my preferred method (at least for now, until someone comes up with
something better).

Rather than pour into the details, yet again, I will leave the readers with a link to a presentation
on this topic that I gave in the past: link here. Implementation probably requires a semaphore,
so here's the best resource on semaphores.
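
To give a rough shape to the idea, here is a minimal job pool sketch built on C++11 primitives,
where a condition variable plays the role of the semaphore. The class and method names are
illustrative, not from the presentation linked above:

#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class JobSystem
{
public:
    explicit JobSystem( int worker_count )
    {
        for ( int i = 0; i < worker_count; ++i )
            workers.emplace_back( [this]( ) { WorkerLoop( ); } );
    }

    ~JobSystem( )
    {
        {
            std::lock_guard<std::mutex> lock( mutex );
            quit = true;
        }
        wake.notify_all( );
        for ( std::thread& t : workers )
            t.join( );
    }

    void AddJob( std::function<void( )> job )
    {
        {
            std::lock_guard<std::mutex> lock( mutex );
            jobs.push( std::move( job ) );
        }
        wake.notify_one( ); // wake one sleeping worker
    }

private:
    void WorkerLoop( )
    {
        for ( ;; )
        {
            std::function<void( )> job;
            {
                std::unique_lock<std::mutex> lock( mutex );
                wake.wait( lock, [this]( ) { return quit || !jobs.empty( ); } );
                if ( quit && jobs.empty( ) )
                    return;
                job = std::move( jobs.front( ) );
                jobs.pop( );
            }
            job( ); // run the job outside the lock
        }
    }

    std::vector<std::thread> workers;
    std::queue<std::function<void( )>> jobs;
    std::mutex mutex;
    std::condition_variable wake;
    bool quit = false;
};

The appeal of the approach is visible in the interface: callers only ever see AddJob, and all the
locking lives inside the pool.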


Language Design
I've found that language design has shaped my viewpoint on what good code is. I would define
language design as the concepts/algorithms revolving around compiling code into machine
language, as well as the design of the language's grammar.

Von Neumann architecture is about taking machine instructions and feeding them to a
processor, which executes them and produces an output. In this way code (instructions) is just
a type of data, freely modifiable. As long as the correct sequence of bits is fed to the CPU, the
program will execute accordingly.

Understanding this can allow an engineer to increase productivity by thinking about the code
they are writing, and how to leverage its patterns. For example, an engineer could write a code
parser that interprets code. Once achieved, the parser can transform the code into another
format (perhaps an abstract syntax tree), one which shows useful or interesting information not
readily available from the raw code format.

In turn the parser output can then be transformed back into new code. This type of code
generation can be used to fuel very implementation-heavy tasks that might normally take a lot
of tedious and monotonous work on the part of the engineer.

Lexer String Matching


A lexer is a program that performs lexical analysis, and may also be referred to as a tokenizer.
The job of the lexer is to produce tokens. A token is a meaningful representation of a lexeme. A
lexeme is just a piece of raw code, like a keyword or function name. Tokens are usually
implemented as an integer, or enumeration. Therefore a good way to implement a lexer is to
scan source code and output integral tokens that represent each chunk of meaningful code
data.

Writing a lexer can range from very simple to very complicated depending on the features
required for a given language. For C-style languages Sean T. Barrett has implemented an
absolutely wonderful lexer, perfect for use as a reference when constructing a production-ready
lexer. Here is the source to Sean's lexer.

There exist tools to generate lexers, such as the tool lex (which is commonly used with the
parser generator yacc). However writing a custom lexer can be quite straightforward and not
too time consuming, if the language of choice is a C-style language.

For Sean's implementation style the concept is to store a small lex_state struct that keeps track
of an offset into the source file, along with a couple other small data members. A single
function, perhaps called Lex_Next( lex_state* state ), is used to grab the next token from the
source file.
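
A stripped-down sketch of that style might look like the following. The token kinds and struct
fields are illustrative assumptions, far simpler than Sean's actual lexer:

#include <ctype.h>

typedef enum
{
    TOKEN_EOF,
    TOKEN_IDENTIFIER,
    TOKEN_NUMBER,
    TOKEN_PUNCT,
} token_kind;

typedef struct
{
    const char* at; // current offset into the source text
} lex_state;

typedef struct
{
    token_kind kind;
    const char* first; // start of the lexeme
    int length;
} token;

token Lex_Next( lex_state* state )
{
    // skip whitespace
    while ( *state->at == ' ' || *state->at == '\t' || *state->at == '\n' || *state->at == '\r' )
        ++state->at;

    token t;
    t.first = state->at;

    if ( *state->at == 0 )
    {
        t.kind = TOKEN_EOF;
    }
    else if ( isalpha( *state->at ) || *state->at == '_' )
    {
        t.kind = TOKEN_IDENTIFIER;
        while ( isalnum( *state->at ) || *state->at == '_' )
            ++state->at;
    }
    else if ( isdigit( *state->at ) )
    {
        t.kind = TOKEN_NUMBER;
        while ( isdigit( *state->at ) )
            ++state->at;
    }
    else
    {
        t.kind = TOKEN_PUNCT; // single-character punctuation
        ++state->at;
    }

    t.length = (int)( state->at - t.first );
    return t;
}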


Parsing
Parsing is a topic people typically tend to get very passionate about, and so I won't say too
much here; there are infinite resources available at the tip of a google search. I will say my
preferred method of parsing is with a recursive descent parser.

In short, a recursive descent parser will look at a single token at a time, along with the next
token coming directly after the current token. The language being parsed should a priori be
designed such that at any given moment the meaning of the two tokens is enough to deduce
exactly where in the language grammar the parser resides.

A typical hand-written recursive descent parser will be very efficient (likely much more so than
a generated parser), and can be extremely fun (and educational) to implement.

I recommend staying away from backtracking under the argument that it just isn't necessary,
along with any other kind of parsing scheme. Again, these are just my own personal
conclusions, as I don't see more complicated or exotic grammars as any better or more
necessary than a straightforward language design that can be parsed with an LL(1) recursive
descent parser.

One of my favorite resources on what is, in my opinion, a pretty good parser to study is the one
Lua has equipped itself with. Here is a paper describing the implementation of Lua 5.0. Lua's
source code is freely viewable online.

One exceptionally practical implementation of lexing and parsing is within the Handmade Hero
video series. Search for the videos about meta-programming.

If anyone wants more reading on implementing compilers I would suggest Compilers by Aho
et al. I highly recommend getting the older edition from 1988. The older edition contains C code
(unlike the newer ones with verbose Java), and I find it overall much easier to read and more
practical. I've read maybe 6 or 7 other compiler books and honestly none of the others even
come close!

One word of practicality: if implementing a custom parser for a C-style language, I've found
parsing expressions to be the most difficult thing about it. I've written my experiences on this
topic here.


Code Generation
Once code is parsed the real juicy information can be known. Information about types, the form
of the code, and pretty much anything else of interest is readily available. How to store this
information is not a very easy matter to decide, however. Should the parser output new code
directly? Should an abstract syntax tree be formed? Should some kind of symbol table be
present? What data structures should be used to represent the code? How should these data
structures be transformed? These are all open questions that I myself do not have the answers
to.

However I can share my own practical experiences in hopes someone else can figure out
something more on this topic than myself.

Type Reflection
Without a doubt type reflection is a must-have for any large project. The ability for code to
intimately understand the data it operates upon allows for automation of very tedious and
repetitive tasks, the big one being serialization.

Unfortunately type reflection isn't really a thing in C or C++ and we are all filled with hatred and
lament at this fact. Ideally one could write this in C:

struct Player
{
    float x;
    float y;
};

member_t* m = Player.members;
for ( int i = 0; i < countof( m ); ++i ) { /* ... */ }

The reason being is that writing a general purpose serializer (albeit a simple one without many
features) could be as easy as:

void Serialize( void* memory, type_t* t, FILE* fp )
{
    if ( t->members )
    {
        for ( int i = 0; i < t->member_count; ++i )
        {
            member_t* m = t->members + i;
            // recurse into each member at its offset within the struct
            Serialize( (char*)memory + m->offset, m->type, fp );
        }
    }
    else
    {
        switch ( t->type_id )
        {
        case _INT_T:
            fprintf( fp, "%d", *(int*)memory );
            break;
        /* other cases ... */
        }
    }
}


Where the member_t and type_t could look something like this:

typedef struct member_t member_t;

struct type_t
{
    char* name;
    int size;
    member_t* members;
    int member_count; // number of entries in members
    int type_id;      // e.g. _INT_T for leaf types
};

struct member_t
{
    type_t* type;
    char* name;
    int offset;
};

Equipped with a lexer and parser an engineer could quite easily generate the above code
definitions and instantiate various type_t and member_t instances for all the data types present
within the code. New types of data can even be defined and described at run-time!

Converting between different versions of data can also make sense. An algorithm to do so
would loop over the member_t's of version A (older) and mutate them into an instance of
version B's members.

I've seen these tactics implemented at various game studios, often with custom technology,
ranging from fairly simple to way over-engineered. Here is the best introduction to this topic I
know of.

Software Design and Architecture


File dependencies are often the bane of large projects. Each individual file will likely be
compiled as a separate translation unit. This sucks, as each file probably pulls in a ton of
common code shared across all translation units by means of headers. stdio.h is a good
example.

All these headers do is set up symbols for the compiler in order to perform program linkage
(and to verify program correctness). This means that a precompiled header can be used as a
hacky solution. Common headers can be packaged up into a special translation unit and
compiled once. All other translation units can just copy-paste the precompiled header unit at
the top of their own unit, thus massively reducing compile times for all translation units.

I call this a hack because it's just a band-aid solution. The real problem is that compiling code
in tons of different translation units is silly! Sure, having multiple files is great, but actually
separating code into different translation units should probably not happen nearly as often as it
does in the modern world.



The argument in favor of many translation units is: "we have cool build tools that let us
compile only the translation units that were modified!" Tools like make, or msbuild. Well, to
that I say "Guphaw!" Have fun with your fancy build tools when you require a dedicated
salaried engineer to maintain them. I'll happily stick with my own unity build for as long as
possible.

I can say I've seen this type of thing happen at one studio I've been to, and I sure loved it. I've
also done it on my own personal projects, which gave me great personal happiness. As for
much larger codebases, unfortunately I just don't have the experience to back myself up here.

File Dependencies and Messaging


Messaging is another one of those topics people seem to get very passionate about.
Unfortunately whenever a bunch of programmers get passionate about something they often
obfuscate and over-engineer unwieldy solutions, and messaging is one such case.

Messaging is just about sending data from one part of the program to another. This can be
across function calls, across time, or across a network (or across anything else you can think
of). That's it. The simplest form of messaging might be considered a plain old C-style function
call, which incurs some kind of jump assembly instruction, along with some stack allocation.

In larger projects with many lines of code, often times it's best to make sure lots of data can be
sent over a more opaque connection compared to that of a plain function call. This allows
various bits of code to not require explicit data type declarations (like those seen in a header),
and instead focus more on the algorithm of sending the data, often as a stream of bits.

Here is an article I wrote a while ago with a decent introduction to the concept of messaging.
Much more discussion about how to implement messaging is a bit beyond the scope of this
document, as the author does not really take an interest in things much more complicated than
a function call.


API Design
API design is wicked difficult. I'm not too great at it myself! Please, readers, equip yourself with some of Casey Muratori's API design knowledge! Casey may be one of the best people on the planet at designing effective APIs. Casey implemented the Granny SDK for game development, which is wildly successful and broadly regarded as a great API.
In general it seems there are a few different kinds of APIs that Casey outlined in his video (link above); one of his slides breaks them down roughly as follows:

The "layer" portion would describe code that sits on top of a rock-solid (never changing) service, like a piece of hardware, perhaps a GPU (e.g. OpenGL or DirectX).
An "engine" would be what most people are developing: software that solves a consumer's problems, and often makes use of a bunch of pre-existing tools and paradigms. Examples would be Unity, Unreal, etc. Another example is a new engineer coming into a company with a big pre-existing codebase. That engineer will be writing new code in his small little box, whereas the huge reused box will be the company's code.
Finally we have the "component", which defines an input and output for the user. Examples could be physics middleware, or the Granny SDK itself. In all instances it can be extremely difficult to decide what goes where, and to what degree of control to expose things to the user. Please, watch Casey's video! He will do the topic much more justice than I, as I am a noob compared to him.
If possible I recommend readers get a hold of tried and tested APIs like the Granny SDK, or the Havok physics SDK. Unreal is open source and also a great way to learn about the Unreal API.



Study the sources of these products (or other similar products) and try to form opinions.
Another good example is the Maya SDK.
Example Layer Problem
In an attempt to provide readers with some food for thought, below is an example scenario of implementing a layer API.
Say we are implementing an OS layer that abstracts underlying hardware and exposes syscalls to the user. Perhaps we have written the file abstraction and implemented POSIX-style syscalls such as open, close, flush, and write. Our job is to write some documentation on these syscalls after implementing them. Here is some pseudo documentation:

int open( const char *pathname, int flags )
    - returns a file descriptor given a path

void close( int filedes )
    - closes a file given a file descriptor

And so on and so forth for the other functions. One day a user submits a bug report saying they
cannot figure out WTF is up with this code. It crashes on the close syscall:
int fd = open( path, flags );
write( fd, memory, count );
// ...
close( fd ); // assert( fp != 0 ); !!!

What might the problem be? The user clearly opened, wrote and closed a file. All seems well,
right? close is crashing with a fairly cryptic assert message of fp != 0. Can you imagine what the
problem may be?
In this case the user forgot to call flush to submit the changes to the file, and as such the OS file code is asserting due to an internal pointer residing at 0. The user couldn't quite figure out what the problem was because the internal code had a fairly cryptic assert, and the user didn't know how close was implemented.
However, the implementer of close (us) was completely happy with the use of the assert. It immediately caught the user bug, and was also valuable during development to assert implementation correctness.
Should an exception have been thrown by close, so the user could catch the exception and read the error themselves? If that's the case then exceptions need to be enabled by all users wishing to use close. Should a more verbose assert have existed? Perhaps. Should an error message be returned by close? If so these error messages need to be documented, and this requires the user to read the documentation and then correctly implement retrieving and interpreting the error message.
All in all there is no best solution, as they all have tradeoffs. This is API design. It's difficult.
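To make one of those tradeoffs concrete, here is a sketch of what an error-code flavor of close could look like (the internals here are made up for illustration, not how any real OS implements it):

// Hypothetical error-code style: close reports failure instead of asserting.
enum close_result { CLOSE_OK, CLOSE_BAD_DESCRIPTOR, CLOSE_NOT_FLUSHED };

struct file { int in_use; int dirty; };
static struct file file_table[ 128 ]; // pretend internal file table

enum close_result close( int filedes )
{
	if ( filedes < 0 || filedes >= 128 || !file_table[ filedes ].in_use )
		return CLOSE_BAD_DESCRIPTOR;
	if ( file_table[ filedes ].dirty )
		return CLOSE_NOT_FLUSHED; // the user forgot to call flush
	file_table[ filedes ].in_use = 0;
	return CLOSE_OK;
}

The failure mode is now self-describing, but every single caller must check the return value, and each error code becomes part of the documented API surface.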

Game Architecture
In this section I will attempt to take readers on a small thought-experiment of designing the
architecture for a pretend platformer game.
Personally I am not a fan of game engine architectures. Currently my opinion is that architecture gets in the way of making a game. To that end I don't even really believe in game engines. I believe in writing a custom codebase (while reusing some common pieces) tailored to solve specific problems of the game.
This preference only works well if the core design of the game is known from project start, and rapid iteration is achieved (somehow) in the early stages of the project. The rapid iteration needs to have a smooth transition into "shipping mode" as the project matures, since at some point iteration and shipping will become mutually exclusive.
Given these constraints I will attempt to walk readers through a brief design process of
mentally constructing a valid game engine for a single-player platformer style game.

Compilation
Right off the bat let's set up the game engine to have fast compilation time, and make it easy for people to add more files to the project at-will. Let's use the unity build style (described earlier in this document), where all files are bundled together to form a single translation unit for the compiler to churn through at lightning speed.
Next up, let's use a hand-made script to compile the project. A batch file, python script, bash script, or anything else will suffice; anything that can execute the compiler and pass along the main translation unit (along with any compiler flags) will work just fine. Compiling is as simple as executing the script, and should take less than a second throughout project development.
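As an example, such a script can be a single line (the compiler, flags, and file names here are assumptions; substitute your own toolchain):

# build.sh -- hypothetical one-liner for the unity build style
g++ -g -Wall unity_build.cpp -o game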

Iteration
Ensuring fast iteration can be done through three main methods: reading data off disk during run-time (data files) that determine program behavior, scripting language integration, and run-time compilation of C++.
The first option is known as data driving the program, and can be implemented in a manner as simple as reading values from a text file upon program startup. The benefit here is that tunable parameters can be added as necessary; however, the bulk of the work is in the C++ code that interprets the data files.
Level editors are a good way to think about this: a level editor is likely creating these data files so the game can read them, and adjust execution, as necessary. A level editor is a good example of building a tool to generate the data that drives the program. The downside to this style is that it can take a very long time to create these kinds of asset creation tools.
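A minimal sketch of the data file idea (the file format and all names are invented for illustration): read name/value pairs from a text file at startup and treat them as tunable parameters.

// Hypothetical tuning file "params.txt", one "name value" pair per line:
//   jump_height 3.5
//   move_speed 10.0
#include <stdio.h>
#include <string.h>

float load_param( const char* path, const char* name, float fallback )
{
	FILE* fp = fopen( path, "r" );
	if ( !fp ) return fallback;
	char key[ 64 ];
	float value = fallback;
	float v;
	while ( fscanf( fp, "%63s %f", key, &v ) == 2 )
	{
		if ( !strcmp( key, name ) ) { value = v; break; }
	}
	fclose( fp );
	return value;
}

// usage: float jump_height = load_param( "params.txt", "jump_height", 3.0f );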
The second option would be to incorporate a scripting language, such as Lua, into the game project. The benefit here is that Lua files can be reloaded at run-time in order to adjust program logic on the fly. This takes the whole data driven idea a step further! Not only are game assets and tunable parameters up for change, but so is the game's own logic. The downside here is that scripting languages are not as efficient as native C code.
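For readers curious what option two looks like in practice, here is a bare-bones sketch using the standard Lua C API (the script path is hypothetical, and real code would also watch the file for changes):

// Hypothetical Lua hot-reload: re-running the file replaces its functions.
#include <stdio.h>
#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>

static lua_State* L;

void scripts_init( void )
{
	L = luaL_newstate();
	luaL_openlibs( L );
}

void scripts_reload( const char* path )
{
	if ( luaL_dofile( L, path ) != 0 )
	{
		fprintf( stderr, "lua error: %s\n", lua_tostring( L, -1 ) );
		lua_pop( L, 1 );
	}
}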
The final option is similar to incorporating a scripting language, but is not quite as flexible. This tradeoff in flexibility comes with the benefit of native C code for blazing fast speed! Let's choose this option.
One way to implement run-time compiled C++ is to place all game logic within a dynamic library, such as a DLL on Windows (or a .so file on Linux). The OS can execute the compiler while the program is running, and write out a new dynamic library. The program can then notice a new library is available, unload the one it currently has, load the new one, and continue running with the new logic.
Losses in flexibility with this style are: function addresses can change when a library is reloaded, thus destroying many functor implementations; virtual table addresses will certainly change if they are linked into the dynamic library; if the C run-time was statically linked to the DLL, all heap memory allocated by the DLL will be lost upon reload; all static memory (bss, data, code section, etc.) of the process space for the DLL will be overwritten and lost upon reload; and perhaps more limitations I have forgotten to list here.
Casey Muratori of Handmade Hero has shown a great way to implement this style of dynamic
code compilation in his various videos.
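Here is a bare-bones sketch of the reload loop on Windows (the DLL name and exported function are assumptions, and error handling is omitted for brevity):

// Hypothetical Win32 hot-reload: watch game.dll and reload when it changes.
#include <windows.h>

typedef void ( *game_update_fn )( void* memory, float dt );

static HMODULE game_dll;
static game_update_fn game_update;
static FILETIME last_write_time;

void maybe_reload_game_dll( void )
{
	WIN32_FILE_ATTRIBUTE_DATA attr;
	if ( !GetFileAttributesExA( "game.dll", GetFileExInfoStandard, &attr ) ) return;
	if ( CompareFileTime( &attr.ftLastWriteTime, &last_write_time ) == 0 ) return;

	if ( game_dll ) FreeLibrary( game_dll ); // unload the stale code
	// copy first so the compiler can overwrite game.dll while we run
	CopyFileA( "game.dll", "game_in_use.dll", FALSE );
	game_dll = LoadLibraryA( "game_in_use.dll" );
	if ( !game_dll ) return;
	game_update = (game_update_fn)GetProcAddress( game_dll, "game_update" );
	last_write_time = attr.ftLastWriteTime;
}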
Memory Storage
Clearly defining where memory is stored and where it comes from becomes integral to the
success of the dynamic library reloading. There are a couple options:

- The main executable owns all memory. This memory is handed to the dynamic library to operate upon, but the dynamic library does not own this memory. When the library is reloaded no program memory is lost. This requires the memory to be handed to the library over an interface, and the library must store a pointer to this memory. The main executable and the library can both statically link all project code definitions, thus simplifying data definitions. Or, the main executable need not know about the dynamic library's data definitions at all, and can simply hand opaque memory to the dynamic library.
- The dynamic library owns whatever memory it needs. Upon reloading, the dynamic library serializes all run-time objects into temporary memory held by the main executable. Once reloaded, all objects are serialized back into the dynamic library's run-time memory.

I highly recommend the first option due to ease of implementation. However, the second option may be preferred! For example, if the user wishes to use virtual dispatch, the second method can gracefully handle patching of virtual tables upon serialization back into the run-time.
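A sketch of the first option (names are hypothetical; the style mirrors what Handmade Hero demonstrates): the executable allocates one big block at startup and hands a pointer to it across the DLL boundary every frame.

// Hypothetical interface: the executable owns the block, the DLL borrows it.
#include <stdlib.h>

struct game_memory
{
	void* block; // one big allocation owned by the executable
	size_t size; // never freed or moved while the game runs
};

// Signature exported by the game DLL; the DLL stores no pointers of its
// own outside this block, so nothing is lost when the DLL is reloaded.
typedef void ( *game_update_fn )( struct game_memory* memory, float dt );

int main( void )
{
	struct game_memory memory;
	memory.size = 64 * 1024 * 1024;
	memory.block = malloc( memory.size );
	// each frame: maybe_reload_game_dll(); game_update( &memory, dt );
	free( memory.block );
	return 0;
}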
In order to support polymorphism with the first style, the user may need to implement their own virtual table, either through actual function pointers and an actual table, or through C switches (or if-else combos). These options do not break down like C++ virtual tables, since all of these methods reside in the process space of the dynamic library itself, and when reloaded will be set to new and appropriate values.
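As a sketch of the switch flavor (the types here are invented for illustration):

// Hypothetical hand-rolled dispatch: an enum tag instead of a C++ vtable.
enum entity_kind { ENTITY_PLAYER, ENTITY_ENEMY };

struct entity
{
	enum entity_kind kind; // plain data; lives in the persistent block
	float x, y;
};

// Lives in the DLL's code section, so a reload picks up new behavior.
void entity_update( struct entity* e, float dt )
{
	switch ( e->kind )
	{
	case ENTITY_PLAYER: e->x += 10.0f * dt; break; // player logic here
	case ENTITY_ENEMY:  e->x -=  5.0f * dt; break; // enemy logic here
	}
}

Only the plain enum tag is stored in persistent memory, so nothing dangles when the code is swapped out.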
Once a scheme is chosen and implemented, developers will likely want to modify code while the game is running, for example by triggering a recompile and reload via an in-game hotkey. It would be wise to implement such a mechanism within the game itself.

Run-time Object Model


Jason Gregory's book Game Engine Architecture has a great section on all the various choices of the run-time object model. I highly suggest reading his book! Whatever model is chosen can be incorporated into our pretend design.

Specific Solutions
The rest of the platformer game can be tackled on a per-feature basis. Since I stated we were making a platformer, some questions immediately arise:

- Can the player jump?
- Are there typical Mario-style enemies?
- How does the player interact with obstacles, or enemies?
- Are there tiles in the game?
- Is there advanced physics, such as bridges or joints?
- What kind of animations do we want: skeletal, 2D frame-based, etc.?
- What do levels look like?
- How are the level transitions defined?

And the list can go on and on. These questions help to define the features the game requires. Each feature can be solved with a specific solution! Development in this pretend project would consist of picking features to implement, while trying to write only the code necessary to solve the given feature. In this way the architecture of the game is all the implementation details involved in achieving each feature. From here on lies the realm of game design.
