Semantic Compression

We all know how to program in C++, don't we? I mean, we've all read a selection of wonderful books by the gaggle of bearded fellows who defined the language in the first place, so we've all learned the best ways to write C++ code that solves real-world problems.

First, you look at the real-world problem (say, a payroll system) and you see that it has some plural nouns in it: "employees", "managers", etc. So the first thing you need to do is make classes for each of these nouns. There should be an employee class and a manager class, at least.

But really, both of those are just people. So we probably need a base class called "person", so that things in our program that don't care whether you're an employee or a manager can just treat you as a person. This is very humanizing, and makes the other classes feel less like cogs in a corporate machine!

There's a bit of a problem, though. Isn't a manager also an employee? So manager should probably inherit from employee, and then employee can inherit from person. Now we're really getting somewhere! We haven't actually thought about how to write any code, sure, but we're modeling the objects that are involved, and once we have those solid, the code is just going to write itself.

Wait, shoot, you know what? I just realized, what if we have contractors? We definitely need a contractor class, because they are not employees. The contractor class could inherit from the person class, because all contractors are people (aren't they?). That would be totally sweet.

But then what does the manager class inherit from? If it inherits from the employee class, then we can't have managers who work on contract. If it inherits from the contractor class, then we can't have full-time managers. This is turning out to be a really hard programming problem, like the Simplex algorithm or something!

OK, we could have manager inherit from both classes, and then just not use one of them. But that's not type-safe enough. This isn't some sloppy JavaScript program! But you know what? BAM! I've got the solution right here: we templatize the manager class. We templatize the manager class on its base class, and then everything that works with manager classes is templatized on that as well!

This is going to be the best payroll system ever! As soon as I get all these classes and templates spec'd out, I'm going to fire up my editor and get to work on the UML diagrams.

Programmers Posting Programming Posts

It'd be great if everything I just wrote had been farcical, but sadly, there are actually a lot of programmers in the world who think like this.
I'm not talking about "Bob the Intern"; I'm talking about all kinds of programmers, including famous programmers who give lectures and write books. I am also sad to say that there was a time in my life when I thought this way, too. I was introduced to "object oriented programming" when I was 18, and it took me until I was about 24 to realize it was all a load of horseshit (and the realization was thanks in no small part to my taking a job with RAD Game Tools, which thankfully never bought into the whole OOP nightmare).

But despite the fact that many programmers out there have gone through bad phases like this and eventually come to smart conclusions about how to actually write good code efficiently, it seems that the landscape of educational materials out there still overwhelmingly falls into the "objectively bad" category. I suspect this has something to do with the fact that good programming seems very straightforward once you know how to do it, unlike, say, a fancy math technique that retains its sexiness and makes you want to spend the time to post about it. So, although I don't have any data to back this up, I strongly suspect that experienced programmers rarely spend time posting about how they program because they just don't think it's anything special.

But they should! It may not be special, but it's necessary, and if good programmers don't start posting about how to do good programming, we'll never get out of this nasty place where everyone has to go through six years of writing horrible object-oriented programs before they realize they're wasting their time. So what I'd like to do with this next set of Witness articles is spend some serious word count talking about the purely mechanical process of putting code into a computer, and it is my sincere hope that other experienced programmers out there will take some time to do the same. Personally, I'd love to read more about the techniques actual good programmers out there use when they sit down to code.

To start things off, I am going to detail a straightforward set of code transformations that I did on The Witness's editor code. In the coming weeks, I'll move from that into some larger examples where I wrote more pieces from scratch, but the entire time I'll be focusing solely on code and how it's structured. Nothing that I'm going to cover has any fancy algorithms or math or anything; it's all just pure plumbing.

Jon Starts Things Off Right

In the built-in editor for The Witness, there is a piece of UI called the "Movement Panel". It is a floating window with some buttons on it that are used to perform operations on entities like "rotate 90 degrees". Originally it was quite small and had only a few buttons, but when I started working on the editor, I added a bunch of features that needed to go in the movement panel. This was going to expand its contents considerably, and it meant I had to learn how to add elements to the UI, which I'd never done before. I examined the existing code, which looked like this:

    int num_categories = 4;
    int category_height = ypad + 1.2 * body_font->character_height;
    float x0 = x;
    float y0 = y;

    float title_height = draw_title(x0, y0, title);
    float height = title_height + num_categories * category_height + ypad;
    my_height = height;

    y0 -= title_height;

    {
        y0 -= category_height;

        char *string = "Auto Snap";
        bool pressed = draw_big_text_button(x0, y0, my_width, category_height, string);
        if (pressed) do_auto_snap(this);
    }

    {
        y0 -= category_height; // step down to the next row, as in the block above

        char *string = "Reset Orientation";
        bool pressed = draw_big_text_button(x0, y0, my_width, category_height, string);
        if (pressed) {
            ...
        }
    }
    ...

The first thing I noticed here was that Jon, the original programmer, did a really nice job setting me up for success with what I was about to do. A lot of times, you open up some code for something simple like this, and you find that it is just a massive tangle of unnecessary structure and indirection. Here, instead, we find an extremely straightforward series of things happening, that read exactly like how you would instruct a person to draw a UI panel: "First, figure out where the title bar should go. Then, draw the title bar. Now, below that, draw the Auto Snap button. If it's pressed, do auto snapping..." This is exactly how programming should go. I suspect that most anyone could read this code and know what it was doing, and probably intuit how to add more buttons without having to read anything beyond just this excerpt.

However, nice as the code was, it was obviously not set up for doing large amounts of UI, because all the layout work was still being done by hand, in-line. This is mildly inconvenient in the snippet above, but gets more onerous once you consider more complex layouts, like this piece of the UI that has four separate buttons that occur on the same row:

    {
        float w = my_width / 4.0f;
        float x1 = x0 + w;
        float x2 = x1 + w;
        float x3 = x2 + w;

        unsigned long button_color;
        unsigned long button_color_bright;
        unsigned long text_color;

        get_button_properties(this, motion_mask_x,
            &button_color, &button_color_bright, &text_color);
        bool x_pressed = draw_big_text_button(x0, y0, w,
            category_height, "X", button_color,
            button_color_bright, text_color);

        get_button_properties(this, motion_mask_y,
            &button_color, &button_color_bright, &text_color);
        bool y_pressed = draw_big_text_button(x1, y0, w,
            category_height, "Y", button_color,
            button_color_bright, text_color);

        get_button_properties(this, motion_mask_z,
            &button_color, &button_color_bright, &text_color);
        bool z_pressed = draw_big_text_button(x2, y0, w,
            category_height, "Z", button_color,
            button_color_bright, text_color);

        get_button_properties(this, motion_local,
            &button_color, &button_color_bright, &text_color);
        bool local_pressed = draw_big_text_button(x3, y0, w,
            category_height, "Local", button_color,
            button_color_bright, text_color);

        if (x_pressed) motion_mask_x = !motion_mask_x;
        if (y_pressed) motion_mask_y = !motion_mask_y;
        if (z_pressed) motion_mask_z = !motion_mask_z;
        if (local_pressed) motion_local = !motion_local;
    }

So, before I started adding lots of new buttons, I already felt like I should spend a little time working on the underlying code to make it simpler to add new things. Why did I feel that way, and how did I know what "simpler" means in this case?

Semantic Compression

I look at programming as having essentially two parts: figuring out what the processor actually needs to do to get something done, and then figuring out the most efficient way to express that in the language I'm using. Increasingly, it is the latter that accounts for what programmers actually spend their time on: wrangling all those algorithms and all that math into a coherent whole that doesn't collapse under its own weight.

So any experienced programmer who's any good has had to come up with some way (if even just by intuition) of thinking about what it means to program efficiently. By "efficiently", this doesn't just mean that the code is optimized. Rather, it means that the development of the code is optimized: that the code is structured in such a way so as to minimize the amount of human effort necessary to type it, get it working, modify it, and debug it enough for it to be shippable.

I like to think of efficiency as holistically as possible. If you look at the development process for a piece of code as a whole, you won't overlook any hidden costs. Given a certain level of performance and quality required by the places the code gets used, beginning at its inception and ending with the last time the code is ever used by anyone for any reason, the goal is to minimize the amount of human effort it cost. This includes the time to type it in. It includes the time to debug it. It includes the time to modify it.
It includes the time to adapt it for other uses. It includes any work done to other code to get it to work with this code that perhaps wouldn't have been necessary if the code were written differently. All work on the code for its entire usable lifetime is included.

When considered in this way, my experience has led me to conclude that the most efficient way to program is to approach your code as if you were a dictionary compressor. Like, literally, pretend you were a really great version of PKZip, running continuously on your code, looking for ways to make it (semantically) smaller. And just to be clear, I mean semantically smaller, as in less duplicated or similar code, not physically smaller, as in less text, although the two often go hand-in-hand.

This is a very bottom-up programming methodology, a pseudo-variant of which has recently gained the moniker "refactoring", even though that is a ridiculous term for a number of reasons that are not worth belaboring at the moment. I also think that the formal "refactoring" stuff missed the main point, but that's also not worth belaboring. Point being, they are sort-of related, and hopefully you will understand the similarities and differences more over the course of this article series.

So what does compression-oriented programming look like, and why is it efficient?

Like a good compressor, I don't reuse anything until I have at least two instances of it occurring. Many programmers don't understand how important this is, and try to write "reusable" code right off the bat, but that is probably one of the biggest mistakes you can make. My mantra is, "make your code usable before you try to make it reusable".

I always begin by just typing out exactly what I want to happen in each specific case, without any regard to "correctness" or "abstraction" or any other buzzword, and I get that working. Then, when I find myself doing the same thing a second time somewhere else, that is when I pull out the reusable portion and share it, effectively "compressing" the code. I like "compress" better as an analogy, because it means something useful, as opposed to the often-used "abstracting", which doesn't really imply anything useful. Who cares if code is abstract?

Waiting until there are (at least) two examples of a piece of code means I not only save time thinking about how to reuse it until I know I really need to, but it also means I always have at least two different real examples of what the code has to do before I try to make it reusable. This is crucial for efficiency, because if you only have one example, or worse, no examples (in the case of code written preemptively), then you are very likely to make mistakes in the way you write it and end up with code that isn't conveniently reusable. This leads to even more wasted time once you go to use it, because either it will be cumbersome, or you will have to redo it to make it work the way you need it to. So I try very hard to never make code "prematurely reusable", to evoke Knuth.

Similarly, like a magical globally optimizing compressor (which sadly PKZip isn't), when you are presented with new places where a previously reused piece of code could be reused again, you make a decision: if the reusable code is already suitable, you just use it, but if it's not, you decide whether or not you should modify how it works, or whether you should introduce a new layer on top of or underneath it. Multiresolution entry points are a big part of making code reusable, but I'll save discussion of that for a later article, since it's a topic unto itself.

Finally, the underlying assumption in all of this is, if you compress your code to a nice compact form, it is easy to read, because there's a minimal amount of it, and the semantics tend to mirror the real "language" of the problem, because like a real language, those things that are expressed most often are given their own names and are used consistently. Well-compressed code is also easy to maintain, because all the places in the code that are doing identical things all go through the same paths, but code that is unique is not needlessly complicated or separated from its use. Finally, well-compressed code is easy to extend, because producing more code that does similar operations is simple, as all the necessary code is there in a nicely recomposable way.

These are all things that most programming methodologies claim to do in an abstract fashion (build UML diagrams, make class hierarchies, make systems of objects, etc.), but always fail to achieve, because the hard part of code is getting the details right. Starting from a place where the details don't exist inevitably means you will forget or overlook something that will cause your plans to fail or lead to suboptimal results. Starting with the details and repeatedly compressing to arrive at the eventual architecture avoids all the pitfalls of trying to conceive the architecture ahead of time.

With all that in mind, let's take a look at how all this can be applied to the simple Witness UI code.

Shared Stack Frames

The first bit of code compression I did on the UI code happens to be one of my very favorites, since it's trivial to do and yet is extremely satisfying. Basically, in C++, functions are very selfish. They keep all their local variables to themselves, and you can't really do anything about that (although as the cancerous C++ specification continues to metastasize, it's starting to add more options for this, but that is a separate issue). So when I see code like the Witness UI code that's doing stuff like this:

    int category_height = ypad + 1.2 * body_font->character_height;
    float y0 = y;
    ...
    y0 -= category_height;
    ...
    y0 -= category_height;
    ...
    y0 -= category_height;
    ...

I think it's time for me to make a shared stack frame.

What I mean by this is, anywhere there's going to be a panel UI in The Witness, this sort of thing is going to happen. I looked at the other panels in the editor, of which there were several, and they all had substantively the exact same code as I showed in the original snippet: same startup, same button calculations, etc. So it's clear that I want to compress all this so that each thing only happens in one place, then just gets used by everyone else.

But it's not really feasible to wrap what's going on purely in a function, because there are systems of variables that interact, and they interact in multiple places that need to connect with each other. So the first thing I did to this code was to pull those variables out into a structure that can serve as a sort of shared stack frame for all these operations if I want them to be separate functions:

    struct Panel_Layout
    {
        float width;      // renamed from "my_width"
        float row_height; // renamed from "category_height"
        float at_x;       // renamed from "x0"
        float at_y;       // renamed from "y0"
    };

Simple, right? You just grab the variables that you see that are being used in a repetitive way, and you put them in a struct. Typically, I use InterCaps for variable names and lowercase_with_underscores for types, but since I am in the Witness codebase, I try to adhere to its general conventions where possible, and it uses Uppercase_With_Underscores for types and lowercase_with_underscores for variables.

After I substituted the structure in for the local variables, the code looked like this:
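
    Panel_Layout layout;
    layout.row_height = ypad + 1.2 * body_font->character_height;
    layout.at_x = x;
    layout.at_y = y;

    float title_height = draw_title(layout.at_x, layout.at_y, title);
    float height = title_height + num_categories * layout.row_height + ypad;
    my_height = height;

    layout.at_y -= title_height;

    {
        layout.at_y -= layout.row_height;

        char *string = "Auto Snap";
        bool pressed = draw_big_text_button(layout.at_x, layout.at_y,
            my_width, layout.row_height, string);
        if (pressed) do_auto_snap(this);
    }

    {
        layout.at_y -= layout.row_height; // category_height no longer exists as a local

        char *string = "Reset Orientation";
        bool pressed = draw_big_text_button(layout.at_x, layout.at_y,
            my_width, layout.row_height, string);
        if (pressed) {
            ...
        }
    }
    ...
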
Not an improvement yet, but it was a necessary first step. Next I pulled the redundant code out into functions: one at startup, and one for each time there's a new row of UI. Normally, I would probably not make these member functions, but since The Witness is a more C++-ish codebase than my own, I thought it was more consistent with the style (and I don't have a strong preference either way):

    Panel_Layout::Panel_Layout(Panel *panel, float left_x, float top_y, float width)
    {
        this->width = width; // keep the panel width; the parameter shadows the member
        row_height = panel->ypad + 1.2 * panel->body_font->character_height;
        at_y = top_y;
        at_x = left_x;
    }

    void Panel_Layout::row()
    {
        at_y -= row_height; // move the cursor down by one row
    }

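
To make the shared stack frame idea concrete, here is a minimal, self-contained sketch of the constructor and row() above. It is only an illustration: Font and Panel are hypothetical stand-ins for the real Witness types (only the fields this snippet touches are modeled), two_row_drop() is a helper invented for the example, and storing the width in the constructor is an assumption added here because later button calls need it.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for the real editor types; only the fields
// used by the layout code are modeled here.
struct Font  { float character_height; };
struct Panel { float ypad; Font *body_font; };

// The "shared stack frame": values that used to be locals in one big
// function, now shared by every function that helps lay out a panel.
struct Panel_Layout
{
    float width;
    float row_height;
    float at_x;
    float at_y;

    Panel_Layout(Panel *panel, float left_x, float top_y, float width);
    void row();
};

Panel_Layout::Panel_Layout(Panel *panel, float left_x, float top_y, float width)
{
    this->width = width; // assumption: kept for later button calls
    row_height = panel->ypad + 1.2f * panel->body_font->character_height;
    at_x = left_x;
    at_y = top_y;
}

void Panel_Layout::row()
{
    at_y -= row_height; // y decreases as the layout moves down the panel
}

// Hypothetical helper: lays out two rows and reports how far the cursor moved.
float two_row_drop()
{
    Font font = {10.0f};
    Panel panel = {4.0f, &font};
    Panel_Layout layout(&panel, 0.0f, 100.0f, 200.0f);
    layout.row();
    layout.row();
    float drop = 100.0f - layout.at_y;
    // Two rows should move the cursor by exactly two row heights.
    assert(std::fabs(drop - 2.0f * layout.row_height) < 1e-3f);
    return drop;
}
```

With a padding of 4 and a character height of 10, each row is about 16 units tall, so two calls to row() move the cursor about 32 units. The point is that no caller has to repeat that arithmetic.
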
Once I had the structure, it was also trivial to take these two lines

    float title_height = draw_title(x0, y0, title);
    y0 -= title_height;

from the original and wrap them up:

    void Panel_Layout::window_title(char *title)
    {
        float title_height = draw_title(at_x, at_y, title);
        at_y -= title_height;
    }

So then the code looked like this:
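
    Panel_Layout layout(this, x, y, my_width);
    layout.window_title(title);

    float height = title_height + num_categories * layout.row_height + ypad;
    my_height = height;

    {
        layout.row();

        char *string = "Auto Snap";
        bool pressed = draw_big_text_button(layout.at_x, layout.at_y,
            layout.width, layout.row_height, string);
        if (pressed) do_auto_snap(this);
    }

    {
        layout.row();

        char *string = "Reset Orientation";
        bool pressed = draw_big_text_button(layout.at_x, layout.at_y,
            layout.width, layout.row_height, string);
        if (pressed) {
            ...
        }
    }
    ...
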
Although that wouldn't be necessary if this was the only panel (since the code only happens once), all the Witness UI panels did the same thing, so pulling it out meant I could go compress all that code too (which I did, but which I won't be covering here).

Things were looking better, but I also wanted to get rid of the weird "num_categories" bit and the height calculation. Looking at that code further, I determined that all it was really doing was pre-counting how high the panel would be after all the rows were used. Since there was no actual reason why this had to be set up front, I figured hey, why not do it after all the rows have been made, so I can just count how many actually got added rather than forcing the program to pre-declare that? That makes it less error prone, because the two cannot get out of sync. So I added a "complete" function that gets run at the end of a panel layout:

    void Panel_Layout::complete(Panel *panel)
    {
        panel->my_height = top_y - at_y;
    }

I went back to the constructor and made sure I saved "top_y" as the starting y, so all I had to do was just subtract the two. Poof! No more need for the precalculation:
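
    {
        layout.row();
        char *string = "Auto Snap";
        bool pressed = draw_big_text_button(layout.at_x, layout.at_y,
            layout.width, layout.row_height, string);
        if (pressed) do_auto_snap(this);
    }
    {
        layout.row();
        char *string = "Reset Orientation";
        bool pressed = draw_big_text_button(layout.at_x, layout.at_y,
            layout.width, layout.row_height, string);
        if (pressed) {
            ...
        }
    }
    ...
    layout.complete(this);
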
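
As a rough illustration of why measuring beats pre-counting, the sketch below strips the layout down to the parts that matter for the height. Layout_Sketch and measured_height() are hypothetical names invented for this example, the row height is a fixed number rather than something derived from a font, and complete() returns the height instead of writing it into a panel.

```cpp
#include <cassert>

// Hypothetical, trimmed-down layout: fixed row height, no drawing.
struct Layout_Sketch
{
    float row_height;
    float top_y; // saved starting y, so the height can be measured later
    float at_y;  // current cursor; moves down as rows are added

    Layout_Sketch(float start_y, float height_per_row)
    {
        row_height = height_per_row;
        top_y = start_y;
        at_y = start_y;
    }

    void row() { at_y -= row_height; }

    // Measures the height actually used, instead of trusting a
    // separately maintained row count such as num_categories.
    float complete() const { return top_y - at_y; }
};

// Hypothetical helper: adds `rows` rows and returns the measured height.
float measured_height(int rows)
{
    assert(rows >= 0);
    Layout_Sketch layout(100.0f, 16.0f);
    for (int i = 0; i < rows; ++i) layout.row();
    return layout.complete();
}
```

Adding or removing a row changes the result without any other line needing to know about it, which is the "cannot get out of sync" property described above.
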
The code was getting a lot more concise, but it was also clear from the often-repeated draw_big_text_button calls that there was plenty of compressibility left. So I took those out next:

    bool Panel_Layout::push_button(char *text)
    {
        bool result = panel->draw_big_text_button(
            at_x, at_y, width, row_height, text);
        return(result);
    }

which left the code looking rather nice and compact:

    {
        layout.row();

        char *string = "Auto Snap";
        bool pressed = layout.push_button(string);
        if (pressed) do_auto_snap(this);
    }

    {
        layout.row();

        char *string = "Reset Orientation";
        bool pressed = layout.push_button(string);
        if (pressed) {
            ...
        }
    }
    ...

and I decided to pretty it up a bit by reducing some of the unnecessary verbosity:

    layout.row();
    if(layout.push_button("Auto Snap")) {do_auto_snap(this);}

    layout.row();
    if(layout.push_button("Reset Orientation"))
    {
        ...
    }
    ...

Ah! It's like a breath of fresh air compared to the original, isn't it? Look at how nice that looks! It's getting close to the minimum amount of information necessary to actually define the unique UI of the movement panel, which is how we know we're doing a good job of compressing. And adding new buttons is getting very simple: no more in-line math, just one call to make a row and another to make a button.
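
For readers who want something they can compile, here is a hedged sketch of how a panel body might read once everything goes through the layout. It is not the Witness code: the real drawing call is replaced by a stub that records each button rectangle and reports a press when the label matches a simulated click, and Layout, Drawn_Button, clicked_label, run_panel, and the third button are all names invented for this example.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical record of one drawn button, standing in for real rendering.
struct Drawn_Button { float x, y, w, h; const char *text; };

struct Layout
{
    float width, row_height, at_x, at_y, top_y;
    const char *clicked_label;       // simulated input: label that was clicked, or null
    std::vector<Drawn_Button> drawn; // what would have been rendered

    Layout(float left_x, float start_y, float w, float row_h, const char *clicked)
        : width(w), row_height(row_h), at_x(left_x), at_y(start_y),
          top_y(start_y), clicked_label(clicked) {}

    void row() { at_y -= row_height; }

    // Stub for the real button call: records the rectangle and returns
    // true when this button is the one that was "clicked".
    bool push_button(const char *text)
    {
        assert(text != nullptr);
        drawn.push_back({at_x, at_y, width, row_height, text});
        return clicked_label && std::strcmp(clicked_label, text) == 0;
    }

    float complete() const { return top_y - at_y; }
};

// Panel body in the compressed style: one row() and one push_button()
// per button. Returns the index of the pressed button, or -1 if none.
int run_panel(Layout &layout)
{
    int pressed = -1;

    layout.row();
    if (layout.push_button("Auto Snap")) pressed = 0;

    layout.row();
    if (layout.push_button("Reset Orientation")) pressed = 1;

    // Adding a button costs two lines and no layout math.
    layout.row();
    if (layout.push_button("Hypothetical New Button")) pressed = 2;

    return pressed;
}
```

A caller would construct a Layout with a starting position, width, and row height, call run_panel(), and then read complete() for the measured height. Every button lands one row below the previous one without the panel code touching a coordinate.
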

Now, I want to point out something really important. Did all that seem pretty straightforward? I'm guessing that there wasn't anything in there where you were like, "oh my god, how did he DO that??" I'm hoping that every step was really obvious, and everyone could have easily done a similar set of steps if charged with just pulling out the common pieces of code into functions.

So, given that, what I want to point out is this: this is the correct way to give birth to "objects". We made a real, usable bundle of code and data: the Panel_Layout structure and its member functions. It does exactly what we want, it fits perfectly, it's really easy to use, it was trivial to design.

Contrast this with the absolute absurdity that you see in "object-oriented" "methodologies" that tell you to start writing things on index cards (like the "class responsibility collaborators" methodology), or breaking out Visio to show how things "interact" using boxes and lines that connect them. You can spend hours with these methodologies and end up more confused about the problem than when you started. But if you just forget all that, and write simple code, you can always create your objects after the fact and you will find that they are exactly what you wanted.

If you're not used to programming like this, you may think I'm exaggerating, but you'll just have to trust me, it's true. I spend exactly zero time thinking about "objects" or what goes where. The fallacy of "object-oriented programming" is exactly that: that code is at all "object-oriented". It isn't. Code is procedurally oriented, and the "objects" are simply constructs that arise that allow procedures to be reused. So if you just let that happen instead of trying to force everything to work backwards, programming becomes immensely more pleasant.

More Compression, Then Expansion

Because I needed to spend some time introducing the concept of compression-oriented programming, and also because I enjoy trashing object-oriented programming, this article is already very long despite only showing a small fraction of the code transformations I did to the Witness UI code. So I will save the next round for next week, where I'll talk about handling that multi-button code I showed, and then how I started using the newly compressed UI semantics to start extending what the UI itself could do.

Casey Muratori
Seattle, WA
