Caching Strategies

DIPLOMA THESIS
Caching Strategies for Load Reduction on High
Traffic Web Applications
carried out at the
Institut für Computersprachen (Institute of Computer Languages)
of the Technische Universität Wien (Vienna University of Technology)
under the supervision of
Ao.Univ.Prof. Dipl.-Ing. Dr. Franz Puntigam
by
Alexander Kirk
Stolberggasse 12/12, 1050 Wien
May 9, 2005
Date Signature
Abstract
In this thesis we discuss the problem of web applications that have to work under heavy load caused by a large number of visitors. We evaluate the application Bandnews.org as an example and tune it using various caching strategies: caching by a proxy server, a compiler cache, database caching using a query cache, and application-based caching using Smarty.
This work shows that a gain in speed is possible if the methods are applied carefully. We compare and combine caching strategies to reach a stage where every page is generated in reasonable time even under high load.
Kurzfassung (German summary, translated)
This diploma thesis addresses the problem of web applications that have to operate under high load and with a large number of users. The application Bandnews.org is examined as an example and accelerated by means of various caching strategies. These include caching via a proxy server, a compiler cache, database caching via a query cache, and application-based caching using Smarty.
This thesis shows that speed gains are possible when the methods are applied prudently. The caching strategies are compared with each other and combined in order to reach a stage at which every page is loaded in acceptable time, even under high load.
Contents
1 Introduction
1.1 Motivation
1.2 Method
1.3 Expected Results
1.4 Outline of the Thesis
2 Terms
2.1 Caching
2.1.1 Invalidation
2.1.2 Privacy
2.2 Load
2.2.1 Using the uptime command
2.2.2 Using the top command
2.2.3 Load Averages
I Environment
3 Application
3.1 Bandnews.org
3.1.1 Technology
3.1.2 Page Structure
3.1.3 myBandnews
3.1.4 BandnewsCMS
4 Tools
4.1 Apache
4.1.1 History
4.1.2 Features
4.1.3 Alternatives
4.2 PHP
4.2.1 History
4.2.2 Language Basics and Structure
4.2.3 Integration with the web server
4.2.4 Additional Libraries
4.2.5 Alternatives
4.3 MySQL
4.3.1 PEAR::DB
4.3.2 Query Cache
4.3.3 Alternatives
4.4 Smarty
4.4.1 Template Basics
4.4.2 Alternatives
4.5 Squid
4.5.1 Use cases
4.5.2 HTTP Acceleration
4.5.3 Alternatives
4.6 Advanced PHP Cache
4.6.1 Concept
4.6.2 Alternatives
4.7 Advanced PHP Debugger
4.7.1 Debugging
4.7.2 Profiling
4.7.3 Alternatives
4.8 ApacheBench ab
4.8.1 Alternatives
II Tuning the Application
5 Evaluation
5.1 Goal definition
5.2 Processing a Request
5.3 Possible Hooking Points
5.3.1 Client Request
5.3.2 PHP Module
5.3.3 Database
5.3.4 Application
5.4 Bandnews.org
5.4.1 Skeleton page
5.4.2 Index page index.php
5.4.3 Search page search.php
5.4.4 Links page links.php
5.5 Testing
5.5.1 Preparations
5.5.2 Testing environment
6 Squid
6.1 Considerations
6.1.1 Caching of whole pages
6.1.2 Programmer’s view
6.1.3 Expected Results
6.2 Preparation
6.2.1 Configuring Apache
6.2.2 Configuring Squid
6.3 Results
6.3.1 skeleton-t.php
6.3.2 pres-skel-t.php
6.3.3 index.php
6.4 Conclusions for Squid
7 APC
7.1 Considerations
7.1.1 Compiler Cache
7.1.2 Code Optimization
7.1.3 Outputting Data
7.1.4 Programmer’s View
7.2 Preparation
7.2.1 Output Buffering
7.3 Results
7.3.1 Results for output testing
7.4 Conclusions for APC
8 MySQL
8.1 Considerations
8.1.1 MySQL Query Cache
8.1.2 Persistent Connections
8.1.3 Query Tuning
8.2 Preparation
8.3 Results
8.3.1 Query Cache
8.3.2 Persistent Connection
8.4 Conclusions for MySQL
9 Smarty Caching
9.1 Considerations
9.1.1 Caching Page Parts
9.1.2 Database Usage
9.2 Preparation
9.3 Results
9.4 Conclusions for Smarty Caching
10 Conclusions
10.1 Further Work
A File Sources
A.1 Benchmark Script
A.2 Patch Files
B List of Figures
C List of Tables
D List of Listings
References
Chapter 1
Introduction
1.1 Motivation
As the Internet, and the World Wide Web (WWW) in particular, gains more and more popularity, servers have to handle correspondingly more requests. The more people (or simply clients) request resources (in this case files) from web servers, the faster the servers have to accept and process the requests. To cope with these requirements, programmers as well as system administrators must take countermeasures.
Since the very beginning of the WWW the requirements for servers have changed not only in terms of traffic but also in the type of content they deliver to the client. Initially static pages had to be served; today, in 2005, content is usually taken from a database, and dynamically generated pages are transferred.
This development takes the main source of load away from the operating
system responsible for reading the files from the hard disk or another type of
memory and shifts it to the program that dynamically generates the page.
Computer hardware has evolved as well, which makes it possible to generate web pages the way they are generated today. Generally speaking, servers are capable of serving most pages in a reasonable amount of time. This is true as long as only a small number of visitors request pages to be generated. The larger the number of clients, the more pages have to be generated simultaneously. Multi-tasking enables servers to do so, but CPU capacity is limited.
If it were up to system administrators alone, they would add more hardware power (for instance by clustering servers or load balancing). Often this can be done only to a certain extent, mainly for financial but also for logistical reasons. From a programmer’s point of view, however, the application itself can be made faster: by optimizing algorithms (an O(n^2) algorithm on a fast computer is easily overtaken by an O(n) algorithm on a slower one once n is large enough) and by applying caching techniques.
The basis for this diploma thesis will be the analysis of caching strategies
for this scenario. They will be used to speed up an existing application. The
combination of various methods will be tested and benchmarked to reach
a stage at which the application runs at reasonable speed even under high
load.
1.2 Method
We explore the topic of this thesis using an existing web site (Bandnews.org) as an example to which the caching strategies are applied.
The site consists of an underlying structure which is common to each page. Therefore the examination is not restricted to standard pages; a skeleton page is taken into account as well. To compare the pages we measure the time for delivery on a single system (an Intel PC, see 5.5.2). Because computer systems differ, these results are only valid in a relative way. The method still produces meaningful results because the differences between versions are at a similar level on faster or slower systems.
We simulate high load on the page using a load generator which makes the server deliver many pages simultaneously.
We examine single pages using a profiler – a tool that measures not only
the overall performance of page generation, but also the time consumed by
single function calls.
1.3 Expected Results
As a result of this work we expect a web application that delivers pages several times faster than an uncached version of the site (considering repeated calls, so that caching can take effect).
Because methods for revealing bottlenecks within the application are applied, faster delivery is also expected for the first call of a web page. This is only considered a side effect; the thesis concentrates on caching pages or parts of pages.
1.4 Outline of the Thesis
The thesis is organized as follows:
In the first part we present the application as well as the tools used. The web site Bandnews.org (see Section 3.1) was chosen as the application. The tools used are the Apache HTTP Server (Section 4.1), PHP (Section 4.2), MySQL (Section 4.3), Smarty (Section 4.4), Squid (Section 4.5), APC (Section 4.6), APD (Section 4.7), and ab (Section 4.8).
The second part, the central part of this diploma thesis, describes and evaluates the caching strategies to be applied.
In Section 5 we test the original site and choose pages for later evaluation. The following sections deal with each technique in detail and provide benchmarking results which are analyzed and discussed. These sections cover Squid (Section 6), APC (Section 7), MySQL (Section 8), and Smarty Caching (Section 9).
In the conclusion (Section 10) we review the results as a whole. Section 10.1 gives an outlook on how future work can further improve the performance.
The appendix includes source listings and lists of figures, tables and listings.
Chapter 2
Terms
In this section we explain important terms used throughout the thesis.
2.1 Caching
Caching (noun: cache, from the French word cacher, to hide) is the temporary storage of data for later retrieval. This approach requires a certain persistence of the data to be stored. The motivation for caches is the gain of speed whilst trying not to deliver outdated content. The gain of speed manifests itself in three ways (compare with [Wes01]):
• Reduced system load: The retrieval or generation of content is avoided; instead a copy from fast memory is delivered.
• Reduced latency: The decision whether to take a copy from the cache or have the data delivered from the original source can often be made very quickly. In combination with fast media, a very short response time can be achieved.
• Less bandwidth consumption: This applies primarily to hardware and web client proxies: the data does not need to be retrieved over a slow connection but over a faster one (either because of shorter physical distance or because of higher capacity). Storing the cached copy in compressed form can also lead to less bandwidth consumption.
Quick retrieval requires fast memory. That is why hardware caches often use small and expensive, but fast, memory.
Generally speaking, a cache should never be visible to the user; it should be a transparent means of speeding up the delivery of data.
2.1.1 Invalidation
A cache needs to provide means for invalidating its contents or parts of them in order to avoid the delivery of outdated data. There are two main concepts for invalidation:
• Invalidation by command: a copy remains in the cache until the cache is explicitly told to dismiss it or, which is more likely, to replace it with fresh data.
• Invalidation by rule: a copy is deleted from the cache by applying a rule, such as an expiry date or a certain number of retrievals.
Both strategies have their advantages. Choosing the right one depends on the circumstances. Proxies with command-based invalidation need very little logic, but are also very susceptible to delivering invalid data, for example if an invalidation command gets lost for some reason.
The second approach requires more intelligence in the proxy, but minimizes the delivery of old data. Although it is difficult to define a lifetime for a certain cached object, it is quite easy to implement a last-modification check which compares the version available in the cache to the “real” one. This can be done every few times the resource is requested or, if the check is inexpensive, upon each retrieval.
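
The two invalidation concepts can be illustrated with a small in-memory cache. This is a minimal sketch under stated assumptions, not code taken from Bandnews.org: the class name SimpleCache, its methods, and the explicit $now parameter (used instead of the system clock so that the behaviour is reproducible) are all hypothetical.

```php
<?php
// Hypothetical illustration: a tiny in-memory cache that supports both
// invalidation by command and invalidation by rule (an expiry time).
class SimpleCache
{
    // Maps a key to ['value' => ..., 'expires' => timestamp].
    private $entries = [];

    // Store a value that expires $ttl seconds after $now.
    public function set(string $key, $value, int $ttl, int $now): void
    {
        $this->entries[$key] = ['value' => $value, 'expires' => $now + $ttl];
    }

    // Return the cached value, or null if it is missing or has expired.
    public function get(string $key, int $now)
    {
        if (!isset($this->entries[$key])) {
            return null;
        }
        if ($now >= $this->entries[$key]['expires']) {
            // Invalidation by rule: the expiry date has passed.
            unset($this->entries[$key]);
            return null;
        }
        return $this->entries[$key]['value'];
    }

    // Invalidation by command: the cache is explicitly told to dismiss a copy.
    public function invalidate(string $key): void
    {
        unset($this->entries[$key]);
    }
}

$cache = new SimpleCache();
$cache->set('news', 'latest items', 60, 1000); // valid until t = 1060
$hit = $cache->get('news', 1030);              // still fresh
$expired = $cache->get('news', 1060);          // removed by the expiry rule

$cache->set('band', 'band page', 60, 1000);
$cache->invalidate('band');                    // dismissed by command
$dismissed = $cache->get('band', 1001);
```

In this sketch a lost invalidate() call would leave a stale copy in the cache until its expiry time, which is the weakness of purely command-based invalidation described above.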
2.1.2 Privacy
Privacy of data is a delicate field for caching. The content passing through a cache often includes sensitive data which has to remain uncached. Although this issue should be avoided by using encryption, it is quite common, especially in the WWW, to transmit user-specific data through an insecure plain text channel.
Caches should therefore either be aware of a kind of “private flag” or support the tagging of content, i.e. storing extra information with the data that marks it as belonging to a certain user. Often this can be implemented in combination with an authenticated (e.g. by user passwords) proxying system.
2.2 Load
The topic of this diploma thesis includes the technical term “load”. Although most IT professionals know what load is, they would not be able to define it clearly.
A common answer when asking for a definition of “load” is “the degree of occupation of the CPU”. In the Windows operating system the system load is indeed displayed (e.g. in the “Task Manager”) as a percentage between 0% and 100%.
2.2.1 Using the uptime command
In Unix the load is usually expressed by three values that can be displayed, for example, with the uptime command.
Listing 2.1: Output of the uptime command
    alex@notebook:~$ uptime
     12:32:52 up 2:23, 3 users, load average: 1.16, 1.13, 1.09
According to the man page (1) of uptime these are “the system load averages for the past 1, 5, and 15 minutes.” These values therefore do not represent the current load of the system but mean values over three periods of time.
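
To use these values in a script, the three numbers have to be extracted from the output line. The following is a minimal sketch, assuming the output format shown in Listing 2.1 with a dot as decimal separator; the function name parseLoadAverages and the array keys are hypothetical.

```php
<?php
// Hypothetical helper: extract the three load averages from a line
// printed by the uptime command.
function parseLoadAverages(string $uptimeLine): ?array
{
    // Accept "load average:" as well as "load averages:", with a comma
    // and/or whitespace between the three numbers.
    $pattern = '/load averages?:\s*([\d.]+)[,\s]+([\d.]+)[,\s]+([\d.]+)/';
    if (!preg_match($pattern, $uptimeLine, $m)) {
        return null; // the line does not contain load averages
    }
    return [
        '1min'  => (float) $m[1],
        '5min'  => (float) $m[2],
        '15min' => (float) $m[3],
    ];
}

// The line from Listing 2.1.
$line = ' 12:32:52 up 2:23, 3 users, load average: 1.16, 1.13, 1.09';
$load = parseLoadAverages($line);

printf("1 min: %.2f, 5 min: %.2f, 15 min: %.2f\n",
    $load['1min'], $load['5min'], $load['15min']);
```

On Unix-like systems, newer PHP versions also provide the built-in function sys_getloadavg(), which returns the same three values without any parsing.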
2.2.2 Using the top command
The current load of the system can be shown by using the top command (see listing 2.2). It also includes the load averages, but additionally displays a percentage for the CPU load.
Listing 2.2: A part of the output of the top command
    alex@notebook:~$ top
    top - 12:32:52 up 2:23, 3 users, load average: 1.16, 1.13, 1.09
    Tasks: 84 total, 1 running, 83 sleeping, 0 stopped, 0 zombie
    Cpu(s): 5.0% us, 0.3% sy, 0.0% ni, 94.7% id, 0.0% wa, 0.0% hi, 0.0% si
The current occupation of the CPU is split into seven parts to give a more precise overview. The abbreviations mean the following:
us is short for user and describes the amount of time the CPU spends in user mode (a kind of safe mode for user programs, see [Arc03]).
sy is short for system: the amount of time the CPU spends in kernel mode (e.g. for operating with hardware).
id is short for idle: on desktop computers CPUs spend most of their time doing nothing.
ni is short for nice, specifying the time used for processes with lower priority (e.g. started using the command nice).
wa is short for I/O wait: the amount of time the system is waiting for an I/O device such as a hard disk.
si and hi are short for software and hardware interrupt: the time the processor spends dealing with such signals.
2.2.3 Load Averages
Although the top command gives a quite intuitive overview of the current system load, load averages are highly important for diagnosing the “working” load of a system. The current state is not very informative when the system does not respond due to a highly CPU-intensive process; the load average, however, can still be helpful, even afterwards.
Interestingly there does not seem to be a single, valid definition of how the load average is calculated and how it should be interpreted. According to [Gun03] the load averages are constructed using the CPU run queue and the number of jobs currently running on the CPU. The article describes the calculation in more detail, and a second part goes into even further detail.
Part I
Environment
Chapter 3
Application
In this section we describe the application chosen for this diploma thesis. It includes a list of important criteria which led to the choice of this application. These criteria do not apply solely to this web site, so many of the methods described will have similar effects on other sites.
3.1 Bandnews.org
The application chosen for this diploma thesis is Bandnews.org (http://www.bandnews.org/). It is a portal and search engine for the latest band-related news. News items are taken directly from the official source, i.e. the bands’ websites. Pieces of information are extracted automatically and repeatedly throughout the day to present the most recent news first. The news is collected by a spider (a program that automatically “crawls” web sites, i.e. downloads a page and then moves on to the next one by following the links included in the first page) and then made easily available on one page.
This web site is very suitable for the thesis, as the following points apply:
• It is continuously affected by changes: band sites are checked regularly for news. This results in about 3 new items per hour. It is even planned to aggregate news every five minutes. Furthermore, each news item can appear on several pages: band page, genre pages, search pages.
• It retrieves its data from a database: at the time of writing (April 2005) about 34,000 news items are stored in the database. This does not slow down retrieving data in general, but joining against this table becomes expensive. Full text queries are also required for searching.
• It delivers customized pages for each registered user: this makes caching extremely tricky, as a page delivered to one user cannot be reused for another.
These points make the site require “non-standard” caching techniques. A single page is rarely delivered twice, which makes a simple caching solution nearly useless.
Figure 3.1: Screenshot of Bandnews.org
3.1.1 Technology
The application was built in a so-called LAMP environment, an abbreviation for Linux, Apache (see Section 4.1), MySQL (see Section 4.3), and PHP (see Section 4.2).
News aggregation is done by a robot program that fetches news by downloading the news pages (specified as a “feed”) from the bands’ websites, using the tool curl. Because this diploma thesis concentrates on caching strategies for the main site, the retrieval process is not described in detail.
Bandnews.org is a project by Nader Cserny and the author, Alexander Kirk. Design, public relations, texts, etc. are handled by Nader; the author is responsible for programming. The page was built from scratch starting in September 2004; no other software was integrated for the main site.
3.1.2 Page Structure
Figure 3.1 shows a screenshot of the main page (http://www.bandnews.org/). The main content of the site consists of single news items (the arbitrary number of 6 news items per page was chosen). At the top of the page there is a language selector (German and English are provided) and a meta-navigation which stays the same on each page.
To the left there is the primary navigation: the search form, the band selector (a two-step drop down for selecting a band), and a listing of genres (which highlights the current genre and shows sub-genres if there are any). A news language selector (which controls the language of news items to be displayed), a list of recently added bands and statistics can also be found on the left.
The right bar (internally called “side bar”) is part of the myBandnews navigation. Site news (e.g. artist of the week, new features) is also displayed there, as well as a Top Ten list which shows the bands clicked most often.
3.1.3 myBandnews
myBandnews (http://my.bandnews.org/) adds to the complexity of the site as it leads to different pages for each user. When logged in to myBandnews, users can select the favorite bands (see Figure 3.2) that their personal news page will consist of.
The news is also available as an RSS (Rich Site Summary, nowadays often called Really Simple Syndication) feed which can be used to receive news alerts quickly: there are many third-party tools (for the client) that check RSS feeds periodically for news and alert the user when new items appear.
Figure 3.2: Screenshot of myBandnews while selecting personal bands
3.1.4 BandnewsCMS
Another interesting feature is the integrated CMS (Content Management System) which provides bands with an easy tool for creating, modifying and deleting their news. These news items are stored in the Bandnews.org database and are integrated with the bands’ homepages by using an iframe and their own style sheet. This is especially useful for bands who have only a small budget for their web page and cannot afford their own CMS that would allow all members to post news on their page.
One of the bands using this feature is “When the Music’s over” (http://www.whenthemusicsover.com/ and http://bandnews.org/homepage/When+The+Music’s+Over/).
Chapter 4
Tools
In this section we describe the tools which will be used in this thesis. The sequence does not reflect their later use; it was chosen for ease of understanding. If not stated otherwise, the tools are Open Source and licensed under the GNU General Public License (GPL).
4.1 Apache
The web server used in this thesis is the Apache HTTP Server. According to [Net05] it is the web server software used on most hosts today.
4.1.1 History
The development of Apache started in April 1995 as an evolution of the public domain HTTP daemon developed by Rob McCool at the National Center for Supercomputing Applications, University of Illinois, Urbana-Champaign. Originally Apache was a group of patches for the NCSA daemon, with not too many of those patches developed by the newly founded Apache Group. Soon it became evident, though, that the basis lacked extensibility, and so the server was redeveloped from scratch with a new design. Apache 1.0 was released on December 1, 1995.
As early as April 1996 the Apache web server moved to first place in web server popularity.
The versions of Apache used today are 1.3 and 2.0. While 1.3 was an evolution of the first version, extended with various modules, version 2.0 was once again a new design intended to match the requirements placed on web servers in the World Wide Web today. Among its features are better integration of the POSIX thread system and native IPv6 support.
Apache runs on several platforms, including Unix-based systems and Windows NT. Still, when referring to an Apache web server, one commonly refers to a Unix or, even more often, a Linux system. The term LAMP (Linux Apache MySQL PHP, the environment used in this thesis) was coined to describe this very common configuration.
The Apache Group has meanwhile taken on several other projects that are very important in the field of Open Source software. Among the most important are the Jakarta Tomcat web server (for Java-based applications), Ant (a build system, not only for Java projects), and Struts (a framework for Java applications).
There is also an Apache License (which also applies to the HTTP server) that is primarily based on the BSD license. See [Hub04].
Today the market share of the Apache web server is very close to 70%.
4.1.2 Features
Basically the Apache web server is designed to serve data via the HTTP protocol (versions 1.0 and 1.1). Its functionality can be enhanced by a great variety of modules.
Apache HTTP server 1.3 implements a so-called pre-forking model. The term forking describes the creation of child processes by a parent process which controls those sub-processes. The web server creates a certain number of child processes without an immediate need for them. When several requests arrive, no time needs to be spent on forking a new process because the existing processes can be used. If there are not enough child processes, more can be created.
This pre-forking model can be seen as a replacement for threading. Version 2.0 of the Apache web server implements threads, which can increase speed in many scenarios. This is because current operating systems heavily support threads as an alternative to forking; threads can be distributed across different CPUs in multiprocessor (SMP) systems.
Third-party modules are also supported, which leads to a large variety of new capabilities for the web server. For example, PHP (see 4.2) is commonly integrated as a module, allowing higher performance than CGI.
Other important modules are all sorts of authentication modules (via LDAP, MySQL, DBM, etc.), highly configurable logging modules, and rewriting modules (which modify the request URI before processing).
4.1.3 Alternatives
For a Linux system there are very few alternatives. The greatest rival according to [Net05] is Microsoft’s IIS, which is only available for the desktop monopoly operating system Windows (NT).
The only real alternative to Apache under Linux is the Zeus Web Server (developed by Zeus Technology Ltd., receiving some coverage in [Mid02]), which claims to be the fastest web server available. As it is not available free of charge (and is far from Open Source) it was not considered further.
4.2 PHP
The web application is implemented in the programming language PHP, which stands for “PHP: Hypertext Preprocessor”. Nowadays it is a quite commonly used language for creating dynamic web pages.
4.2.1 History
Development of PHP was started by Rasmus Lerdorf in 1995. At that time it was called “Personal Home Page Tools”, a set of Perl programs that tracked accesses to his homepage. Later (1997) he re-coded it in C to provide some more features (e.g. easy access to databases) and called it PHP/FI (“Personal Home Page / Forms Interpreter”).
PHP 3 (released in June 1998) was the first version similar to the PHP most web sites use today. It was highly extensible and provided a solid infrastructure for many databases and protocols. At the end of 1998 hundreds of thousands of web servers reported having PHP installed, which was approximately 10% of the WWW’s servers.
The breakthrough for PHP came with version 4 (released in 2000, using the “Zend Engine”, named after Zeev Suraski and Andi Gutmans). Many more web servers were supported, and new features for programmers were added, such as HTTP session support, output buffering, and security-enhanced methods for receiving user data (“magic quotes”). Several million sites report today that they use PHP 4 (about 20% of the WWW’s servers).
One of the biggest drawbacks of PHP was fixed with version 5, released only in 2004. It provides full-featured object-oriented programming (OOP), whereas earlier versions had only rudimentary support for it, e.g. inheritance was supported but encapsulation was not.
4.2.2 Language Basics and Structure
PHP is a language specialized in delivering web pages. PHP code is therefore
simply integrated with (existing) HTML pages. A simple “Hello World”
script would look like the one shown in listing 4.1.
Listing 4.1: Hello World in PHP – helloworld.php

<html><head><title>Hello World Example</title></head>
<body><?php echo "Hello World!"; ?></body></html>
The code is declared as PHP by using special tags (<?php and ?>; similar to
ASP’s <% and %>). The text between the opening and the closing tag is compiled
and interpreted, while the HTML code is sent untouched to the client.
PHP is a scripting language. This means that there is no need for the pro-
grammer to compile the script before it can be executed. This enables quick
prototyping which matches the requirements of web applications: modifica-
tions have to be integrated quickly.
In PHP, variables (specified by a dollar sign, e.g. $variable) do not have to
be declared before they can be used. From the programmer’s view they are
type-free¹, so you can use the same variable for calculations (e.g. as integer)
and for outputting (as a string) without having to take further care.
Reflecting the web-specialized character of the language, there are a few
predefined variables that make processing of web pages much easier. The $_GET
and $_POST variables automatically contain (in form of an array) the values
received from a URL or an HTML form, depending on the HTTP method with which
the data was sent to the server.
Given the URL http://localhost/test.php?hello=world the $_GET variable
would be a one-element array consisting of the key/value pair [hello]
=> "world".
The $_COOKIE variable contains the values of HTTP cookies (small portions
of data to be stored on the client side). $_SERVER contains server-set data
such as the file system path of the script being executed, or the IP address
of the remote client. The $_SESSION variable is suitable for storing data
which is persistent throughout multiple requests of the same client. With
this variable in use, PHP takes care of generating a so-called session id (for
identifying the user) and setting a cookie containing this id. If clients do
not support cookies the session id is appended to each link (URL rewriting)
and a hidden field containing this id is added to each form on the delivered
page, too.
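How the query string of the example URL above ends up in an array can be sketched by hand. The following lines are only an illustration, assuming PHP 7 or later: within a real request PHP fills $_GET automatically, whereas here the standard functions parse_url and parse_str are used to produce the same structure from a plain string. The variable names $url, $query and $params are chosen for this example only.

```php
<?php
// Sketch: reproduce by hand what PHP does automatically when filling $_GET.
$url = 'http://localhost/test.php?hello=world';

// Extract the query string ("hello=world") from the URL.
$query = parse_url($url, PHP_URL_QUERY);

// Decode it into an associative array, the structure $_GET would have.
$params = [];
parse_str($query, $params);

print_r($params);

// Inside a real request the value would simply be read as $_GET['hello'].
$hello = isset($params['hello']) ? $params['hello'] : '(not set)';
echo "hello = $hello\n";
```

The isset check reflects that a client may omit any parameter, so a script should not rely on a key being present.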
¹ This cannot be applied to completely different data types. For example, using an
array as a string results in the text “Array”.
In PHP, arrays play an important role. Every predefined variable mentioned so
far is an array, i.e. a data structure that maintains key/value pairs, which
is implemented internally as a hash table. PHP provides many functions and
constructs for arrays (e.g. foreach will cycle through each element), as in
everyday use you mostly have to deal with structured data. A database
query will commonly return an array representing the data in a natural
and intuitive way. Multi-dimensional arrays can be created at will (just by
specifying an array as value), making it easy to juggle with data.
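A short sketch may illustrate these properties. It assumes PHP 7 or later; the band names and numbers are made up for this example and merely resemble the rows a database query might return.

```php
<?php
// A multi-dimensional array: each element is itself a key/value array,
// similar to the rows a database query might return (values are made up).
$bands = [
    ['name' => 'Example Band A', 'news' => 3],
    ['name' => 'Example Band B', 'news' => 7],
];

// Appending with [] assigns the next free integer key (here: 2).
$bands[] = ['name' => 'Example Band C', 'news' => 0];

// foreach cycles through all elements, giving access to key and value.
$total = 0;
foreach ($bands as $index => $band) {
    echo $index . ': ' . $band['name'] . ' (' . $band['news'] . " news items)\n";
    $total += $band['news'];
}
echo "Total: $total\n";
```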
What makes the language quite special is the great number of functions
provided. Compared to other languages there are few internal functions.
These are mainly used for variable manipulation (for example for cropping
strings). The majority of functions are provided by third-party libraries
which are bundled with PHP and whose enormous functionality can therefore be
accessed easily.
The scope of functions ranges from database wrappers for a great many database
types (the most important ones being MySQL, PostgreSQL, Oracle, and Berkeley
DB), a library for image manipulation (GD), compression (such as gzip or
bzip2) and encryption libraries (mcrypt), to libraries for accessing
remote services such as up-/downloading, SOAP calls, or XML-RPC. So a
large part of PHP’s task is to be a framework for third-party libraries.
4.2.3 Integration with the web server
As an external product, PHP needs to be integrated with a web server. Usually
this is Apache. Two scenarios are possible:
• Using a module (so-called mod_php), the PHP interpreter is included
with the web server, so if a file is requested that requires parsing, the
module is invoked and the output of the executed script is returned.
• With CGI (see section 4.2.5) the PHP interpreter is executed as an
external program.
The CGI variant is available for all web servers that support a cgi-bin
directory, such as IIS on the Windows platform. Generally speaking, this
approach should be avoided when integrating PHP if a web server module
is available, as the module provides an enormous gain in speed.
4.2.4 Additional Libraries
The popularity of PHP has given rise to a large number of third-party tools
which provide even more functionality.
On the one hand there is a “semi-official” database of tools which is called
PEAR (PHP Extension and Application Repository). It covers various topics
such as protocol implementations, abstraction layers for databases, and
reference implementations of algorithms (e.g. cryptographic algorithms).
The number of maintainers is limited, though. As most of them are experienced
programmers, this ensures a high quality of the included components.
Additionally a QA (Quality Assurance) team checks for a high
standard. Documentation is provided for each project, often including
voluminous tutorials.
On the other hand many source code repositories exist (e.g. Hotscripts.com)
which are usually user-contributed. This has both its good and its bad
sides: these repositories contain thousands of pieces of source code, so there
are not too many “common problems” which have not been solved yet.
As everybody (experienced programmer or not) can contribute
anything, the quality of the code is (naturally) highly diverse. What is
more critical: documentation is commonly poor.
4.2.5 Alternatives
There are many projects on the market that have either developed their
own programming language or modified an existing language for use with
the WWW. The alternatives can be classified in two categories: scripted
and compiled languages.
• ASP.NET is a solution by Microsoft which is based on the .NET
framework and therefore solely supported on Microsoft Windows systems².
In contrast to other projects it is not restricted to a single
language, although commonly only C# and VB.NET are used. For
supported languages an API is provided which includes common functions
for web use. In ASP.NET only code compiled to an intermediate
language (MSIL) can be executed.
² The Mono project supports ASP.NET on various platforms by reimplementing the
proprietary libraries. Their scope will never be the same as on the Windows platform,
but usually little adjustment is needed for getting ASP.NET programs running by using
the Mono compiler.
• Java can also be used for web projects, either in form of JSP (Java
Server Pages) which is similar to PHP in its design, or as Java Servlets
which are “normal” Java programs that implement a certain interface.
Java also has to be compiled (the result is an intermediate language,
though) and requires a special web server, e.g. Apache Tomcat.
• Perl has been used for generation of dynamic pages for a long time
already. Commonly it is integrated with the web server Apache by
using the module mod_perl which integrates the script interpreter.
Perl was not primarily designed for web use, but there are modules
for Perl (e.g. CGI.pm) that provide useful functions for processing web
data. Additionally projects such as Mason provide a whole framework
that implements templating and has many other useful features for web
development.
• (Fast) CGI (Common Gateway Interface) is not a language by itself,
but an interface that enables any program that uses stdin, stdout,
and stderr for I/O to provide its services via a web server. Usually
a directory cgi-bin contains the program or script. When a client
requests a file in that directory it is not delivered directly, but instead
the operating system executes the file as a separate process, and the web
server delivers the output of the program. Usually only the output
from stdout is returned; error messages, which are usually written to
stderr, are saved to the web server’s error log. Data provided by the
client is sent to the program, which allows e.g. form processing. This
is done via environment variables or command line arguments or – for
POST requests – via stdin.
Using the CGI interface, fast languages such as C can be used for
solving CPU intensive problems. On the other hand, there are no
supportive functions, so issues trivial to solve e.g. in PHP require a lot
of code when using CGI.
Fast CGI keeps the program in memory, which eliminates the time for
loading the program and results in a gain in speed.
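The data flow of a CGI program can be sketched in a few lines. The example below is an assumption-laden illustration rather than a complete CGI implementation: it assumes PHP 7 or later, the function name handleCgiRequest and the form field name are hypothetical, and only the two standard CGI environment variables REQUEST_METHOD and QUERY_STRING are considered. To keep the sketch runnable without a web server, environment and standard input are passed in as parameters.

```php
<?php
// Sketch of a CGI program's core: request data arrives through the
// environment (GET) or through standard input (POST), and the response,
// headers first, is written to standard output.
function handleCgiRequest(array $env, string $stdin): string
{
    $method = isset($env['REQUEST_METHOD']) ? $env['REQUEST_METHOD'] : 'GET';

    // GET data is passed in QUERY_STRING, POST data is read from stdin.
    $raw = ($method === 'POST')
        ? $stdin
        : (isset($env['QUERY_STRING']) ? $env['QUERY_STRING'] : '');

    $fields = [];
    parse_str($raw, $fields);
    $name = isset($fields['name']) ? $fields['name'] : 'World';

    // A blank line separates the headers from the body.
    return "Content-Type: text/plain\r\n\r\nHello " . $name . "!\n";
}

// Started by a web server, the real environment and stdin would be used:
// echo handleCgiRequest(getenv(), file_get_contents('php://stdin'));
echo handleCgiRequest(['REQUEST_METHOD' => 'GET', 'QUERY_STRING' => 'name=Squid'], '');
```

Everything that PHP otherwise provides automatically (decoding the form data, sending a default content type) has to be done by the program itself, which illustrates why trivial tasks require more code with plain CGI.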
4.3 MySQL
As the DBMS (Database Management System, actually RDBMS with R meaning
Relational) MySQL was chosen. Throughout the thesis it will also be
referred to as “database”, which is a quite common misnomer.
MySQL was originally designed to achieve high performance and is believed
to be one of the fastest DBMSs currently on the market. MySQL uses the
GPL as license which makes it open source and therefore freely available.
An API for PHP is provided which is commonly integrated with PHP and
adds to its function pool. In fact this integration is the way MySQL is
most commonly used today; the rise of PHP also helped MySQL to emerge.
4.3.1 PEAR::DB
PEAR::DB is not a part of the MySQL distribution and is not solely dependent
on MySQL either. It is rather an abstraction layer from the PEAR
repository (see section 4.2.4) that provides a DBMS-independent layer for
retrieving data.
Apart from the SQL, which has to be understood by the DBMS used (many
systems use their own flavour of SQL, and so does MySQL), switching the DBMS
can easily be done by just changing the DSN string (Data Source Name).
Also the methods for retrieving data (as associative hash, as “normal” array,
etc.) do not differ.
4.3.2 Query Cache
Recent versions of MySQL (since 4.0.1) provide a so-called query cache –
also referred to as Query Folding [Qia96]. SELECT statements are stored
together with their results, which allows very fast responses when the same
query (the exact same string has to be used for querying) is executed a
second time.
When using a database in connection with a web server, it is quite common
that tables do not change very frequently and the same queries are executed
over and over. So a large increase in speed can be expected when activating
the query cache.
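The behaviour described here can be illustrated with a toy model written in PHP (assuming PHP 7 or later). It is not MySQL's implementation: the class ToyQueryCache, its methods and the stand-in "database" closure are hypothetical and only demonstrate two properties, namely that a result is found only under the exact same query string, and that cached results have to be thrown away once the underlying table changes.

```php
<?php
// Toy model of a query cache: results are stored under the exact query
// string and discarded when a table they depend on is modified.
// This only illustrates the idea; it is not how MySQL implements it.
class ToyQueryCache
{
    private $results = [];   // query string => cached result
    public $hits = 0;
    public $misses = 0;

    // $execute is a callable that runs the query against the database.
    public function select(string $sql, callable $execute)
    {
        if (array_key_exists($sql, $this->results)) {
            $this->hits++;
            return $this->results[$sql];
        }
        $this->misses++;
        return $this->results[$sql] = $execute($sql);
    }

    // Called after INSERT/UPDATE/DELETE: drop every cached query that
    // mentions the table (a crude substring check for this sketch).
    public function invalidate(string $table)
    {
        foreach (array_keys($this->results) as $sql) {
            if (stripos($sql, $table) !== false) {
                unset($this->results[$sql]);
            }
        }
    }
}

$cache = new ToyQueryCache();
$db = function ($sql) { return ['row for: ' . $sql]; };  // stand-in for a real database

$cache->select('SELECT * FROM news', $db);   // miss: executed and stored
$cache->select('SELECT * FROM news', $db);   // hit: identical string
$cache->select('select * from news', $db);   // miss: different string, same meaning
$cache->invalidate('news');                  // the table has changed
$cache->select('SELECT * FROM news', $db);   // miss again

echo "hits: {$cache->hits}, misses: {$cache->misses}\n";
```

The third call shows why an application benefits most when it always issues a query in exactly the same textual form.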
4.3.3 Alternatives
There are some alternatives to MySQL that provide additional features
which may make them more favourable for certain uses.
• Oracle is a well-known, fast, commercial RDBMS. Due to its
performance and scalability it is widely used. With the integrated
programming language PL/SQL many problems can be solved
at database level, which is a fast alternative to solving them in the
top-level programming language.
• PostgreSQL is often referred to as a free Oracle. Indeed it provides a
similar set of SQL commands but lacks PL/SQL. This DBMS should
be used when DBMS speed is not a crucial point.
• In the latest versions of PHP SQLite is featured as an alternative to
MySQL. SQLite is a C library that implements an embeddable SQL
database engine. When integrated with PHP, database files are written
directly – it does not act as wrapper for an external engine. Indeed
for smaller projects SQLite is sufficient, but it should be avoided when
dealing with large amounts of data.
MySQL was chosen for its common use in Open Source projects and its
speed. Even though license problems arose in 2004 for using it with PHP, it
can now be recommended, as a special license for this use case has
been published.
4.4 Smarty
Another tool used in this diploma thesis is the Smarty Template Engine. It
is a tool – written in PHP, created by Monte Ohrt and Andrei Zmievski in
2001 – to separate program logic, i.e. the PHP code, from design, stored in
so-called template files.
Figure 4.1: The MVC design pattern (Model: PHP script, View: Smarty, Controller: PHP)
Figure 4.1 shows the MVC (Model-View-Controller [KP88]) design pattern
which developers often try to apply to web applications. Using Smarty this
pattern can be implemented with separation into these components:
• The Model specifies the part of the application that handles the business
logic, i.e. the actual problem is solved here. In this scenario this
part is taken by the PHP scripts written by the programmer.
• Smarty is used for the View component which is responsible for handling
the output and its formatting.
• The processing of the input is done by the Controller, in this case
PHP. It handles, for example, the transition from a parameter in the
HTTP URI to a variable.
In a scenario without Smarty, View and Model are mixed. This would
not only abandon the design pattern but also reduce reusability of source
code [Par04]. The use of Smarty runs contrary to PHP’s original design goal
of embedding code directly into HTML. In fact Smarty only acts as a layer
within a PHP script – this is quite obvious as it is coded in PHP itself.
Figure 4.2: Three-tier architecture
This also represents the common three-tier architecture (see Figure 4.2). It is
quite desirable (also in other parts of information engineering) to split apart
the data (first tier), the business logic (second tier) and the presentation
(third tier). The MVC model is a corresponding design pattern. More
benefits of the three-tier architecture are discussed in [Swe01].
In a company the roles of programmer and layout designer are separate. This
is supported and even pushed by Smarty because designer and programmer
can concurrently work on the same page with the designer changing the
appropriate .tpl file while the programmer makes changes to the PHP code.
Therefore, the use of Smarty is highly recommended.
4.4.1 Template Basics
Template files are quite similar to “normal” PHP files: they embed their logic
into HTML. A Hello World example using Smarty in combination with PHP
would look like this:
Listing 4.2: Hello World in Smarty – hello.tpl

<html><head>
<title>Hello World Example with Smarty</title>
</head><body>{$hello}</body></html>
Listing 4.3: Hello World in Smarty – hello.php

1 <?php
2 include("Smarty.class.php");
3 $smarty = new Smarty();
4 $hello = "Hello World!";
5 $smarty->assign("hello", $hello);
6 $smarty->display("hello.tpl");
7 ?>
In this example, the variable $hello is displayed within the template file,
just by putting it into curly brackets. This is the default setting for
integrating logic and variables in .tpl files³. Template variables are kept
separate from those of PHP: a value has to be explicitly assigned to Smarty
(line 5 of hello.php) to make it accessible in hello.tpl⁴. After that the
Smarty command for displaying the template file is called.
Listing 4.4: Highlighting alternating lines – alternate.tpl

<html><head><title>Alternate Backgrounds</title></head>
<body><table>
{section name=d loop=$data}
  <tr><td
  {if $smarty.section.d.first}
    bgcolor="#CC0000"
  {elseif $smarty.section.d.index is even}
    bgcolor="#CCCCCC"
  {else}
    bgcolor="#DDDDDD"
  {/if}
  >{$data[d]}</td></tr>
{/section}
</table></body></html>
For outputting arrays assigned from PHP, Smarty provides the helper functions
foreach and section. In “sections” arrays are traversed
with keys from 0 to n. foreach acts the same way as in PHP, providing
access to key and value for each entry of the array. While looping, the
$smarty.section variable (or $smarty.foreach, respectively) is filled with
values to be used for design functionality. As an example, listings 4.4 and 4.5
show how a table with alternating background colors is generated (see Figure 4.3
for a screenshot of a web browser displaying the page).
Listing 4.5: Highlighting alternating lines – alternate.php

<?php
include("Smarty.class.php");
$smarty = new Smarty();
$data = array();
for ($i = 0; $i < 10; $i++) {
    $data[] = "value" . $i;
}
$smarty->assign("data", $data);
$smarty->display("alternate.tpl");
?>
³ The delimiters can also be changed to e.g. <smarty: and > to establish a certain
degree of XML (parsing) conformity.
⁴ The name of the assignment and the variable name do not have to be equivalent; this
is only the case in this example.
Figure 4.3: Screenshot of the output of the alternating backgrounds example
In this short example several more aspects of Smarty and PHP are shown.
The program logic of if/elseif/else is available to Smarty for doing simple
tasks (intended for design-oriented conditionals, just like in
the example above). The $smarty.section array provides common states,
for instance the current index or whether it is the first or last iteration of the
loop. Array values are accessed in a PHP-like form (index within square
brackets) when using sections.
Finally it is important to state that templates should not contain logic
that goes beyond (visual) design.
4.4.2 Alternatives
The idea of templating PHP is quite common and various such projects
exist.
• HTML Template IT is part of the PEAR repository, developed by
Ulf Wendel. Contrary to Smarty no programming logic (such as if)
is provided. Repeated sections (for array output) are declared by
specifying a beginning and ending position. The overall performance
is said to be good, though no caching is supported.
• patTemplate – created by Stephan Schmidt – heavily relies upon
XML notation: templates are defined using a certain tag; for alternating
table rows, as in the example above, there is an even/odd classification.
Variables are inserted as XML tags.
Smarty was chosen for its features and its steady improvement.
4.5 Squid
A proxy server is a program that acts on behalf of a client by
requesting data and returning it to the client. In computer security this
would be referred to as a kind of man-in-the-middle. Proxies exist for
various applications. In this diploma thesis Squid is used as “a full-featured
Web proxy cache”, i.e. it is capable of proxying requests of the protocols
HTTP and FTP.
4.5.1 Use cases
Squid is a proxy server for use on Unix/Linux systems (Windows NT is only
supported via Cygwin). Usually it is installed on a server that acts as a
gateway for a (local area) network. Several configurations are possible:
• Proxying HTTP or FTP requests for clients with non-routable addresses
(such as 192.168.x.x) as an alternative to NAT (Network
Address Translation).
• Reducing traffic for frequently visited sites by acting as a caching
proxy.
• Controlling the accessibility of web pages: white or black lists, restriction
on a time basis, access via user id and password.
4.5.2 HTTP Acceleration
None of these configurations really seems to match the topic of this
diploma thesis and the idea of caching web applications itself. However,
there is another configuration called “HTTP acceleration” which forwards
requests to a web server which resides on the same machine. This is often
also known as reverse proxying.
The idea of using a proxy on the same server as a web server has to do with
the design goals of the two programs.
A web server has to provide several features for processing files to be served
(in the case of PHP, for example, the interpreter is commonly integrated
with the Apache web server via module). For each request a copy of the
executable must be held in memory. Therefore, the larger the executable is,
the higher the memory consumption will be for a number of requests.
Proxy servers are designed to be very light-weight programs that primarily

serve the goal of collecting requests and – this is a crucial point – then do
their proxying: contact the web server.
Consider several clients accessing the web server at the same time: for
each request an executable has to be loaded and held in memory until the
request is completed. With HTTP acceleration requests are collected by the
small proxy program, which consumes particularly little memory and can
therefore take many more requests than the web server itself. Only after
a request has been transmitted completely is the web server contacted to
collect the page.
4.5.3 Alternatives
• pound is a proxy that is specially designed for reverse
proxying as well as load balancing. It has been developed by Robert
Segall since 2003.
• The mod_proxy Apache module can also be used for proxying requests.
This is only useful, though, when the proxy resides on its own machine.
Squid was chosen because it is commonly used in production environments
and ships with most Linux distributions by default.
4.6 Advanced PHP Cache
APC is a tool that speeds up execution of PHP scripts by caching the
compiled script in the intermediate language which is eventually executed. It
was written in 2000 by George Schlossnagle, Daniel Cowgill and Rasmus
Lerdorf.
Figure 4.4: PHP script execution (compile main script, execute main script; compile included script, execute included script; complete)
The idea of reducing execution time is based on the way PHP
executes a script (see Figure 4.4). This is basically done in two steps:
1. The source file is read, parsed and converted to intermediate language
(“compiled”).
2. PHP, i.e. the Zend Engine virtual machine, executes the intermediate
code.
These two steps have to be done every time a script is requested – the
compiled result is discarded after execution. The same goes for each file
that is included during execution.
4.6.1 Concept
While this procedure is by design and fits the requirement of a scripting
language to have the ability to make changes to a file without further ado,
the number of executions commonly exceeds the number of changes by far.
What is more: for many scripts – especially those with many “includes” –
it often takes PHP longer to convert the script into intermediate language
than to execute it.
In fact step 1 stays the same for most requests (except when a modification
was made). APC implements the idea of caching the compilation results
until a modification was made to (one of) the PHP source file(s).
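The caching idea can be sketched as a toy model in PHP itself (assuming PHP 7 or later). This is not APC's code: the class ToyCompilerCache is hypothetical, and "compilation" is merely simulated by splitting the source text into words, since the real intermediate code is internal to the Zend Engine. What the sketch does show is the decision logic: a stored result is reused as long as the modification time of the source file is unchanged.

```php
<?php
// Toy model of a compiler cache. "Compiling" is simulated by splitting the
// source into words; the real intermediate code is internal to the engine.
class ToyCompilerCache
{
    private $entries = [];   // path => ['mtime' => int, 'compiled' => array]
    public $compilations = 0;

    public function load(string $path): array
    {
        clearstatcache(true, $path);          // make sure filemtime() is fresh
        $mtime = filemtime($path);

        // Cache hit: the file is known and has not been modified since.
        if (isset($this->entries[$path]) && $this->entries[$path]['mtime'] === $mtime) {
            return $this->entries[$path]['compiled'];
        }

        // Cache miss or stale entry: compile again and store the result.
        $this->compilations++;
        $compiled = preg_split('/\s+/', file_get_contents($path), -1, PREG_SPLIT_NO_EMPTY);
        $this->entries[$path] = ['mtime' => $mtime, 'compiled' => $compiled];
        return $compiled;
    }
}

$file = tempnam(sys_get_temp_dir(), 'apc');
file_put_contents($file, '<?php echo "Hello World!";');

$cache = new ToyCompilerCache();
$cache->load($file);            // first request: compiled
$cache->load($file);            // second request: served from the cache
touch($file, time() + 10);      // simulate a later modification of the file
$cache->load($file);            // modified: compiled again

echo "compilations: {$cache->compilations}\n";
unlink($file);
```

Three requests thus lead to only two compilations; on a production server, where files rarely change, nearly all requests would skip the first step.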
APC works as a loadable module for PHP which is simply integrated by

specifying it in php.ini:
extension = /usr/lib/php4/apc.so
Figure 4.5: Script execution with compiler cache (flowchart for the main script and each included script: Is the script cached? Has the script been modified? Depending on the answers, the compiled script is loaded from the cache, or the script is compiled and inserted into the cache; then it is executed)
It instantly starts working⁵ when PHP or the web server, respectively, is
restarted. The defaults reserve a total storage space of 30 megabytes for
caching compiled scripts. Figure 4.5 shows how the cache is being used. Grey
boxes show where the cache repository is accessed.
⁵ This is done by subclassing the file loading routines of PHP.
4.6.2 Alternatives
There are quite a few other compiler caches around that are also worth a try.
• Zend Accelerator is a commercial compiler cache with closed source. It
was developed by the people who mainly designed the Zend Engine,
which makes this one quite fast. The drawback is that it is not free
of charge.
• Turck MMCache is an open source compiler cache developed by
the company Turck Software St. Petersburg. It is one of the fastest
compiler caches but development stopped in November 2003.
• ionCube Accelerator was developed by Nick Lindridge and is distributed
by his company, ionCube, for free but with closed source.
The author’s choice was APC for its ongoing development and the PHP open
source license.
4.7 Advanced PHP Debugger
The tool called APD is primarily a debugger that can be integrated with
PHP. Mainly it provides functions and tools for debugging and profiling.
APD acts as a PHP module and is activated and controlled using PHP
functions which are provided by the module.
4.7.1 Debugging
The debugging functions of this tool provide the “standard” range of features
of commonly used debuggers. This includes the setting of break points, debugging
output, printing of stacks and currently used variables, and overriding or
renaming of functions.
For this diploma thesis debugging will not be used thoroughly, as a working
and approved application is being tested. It is assumed that no bugs affect the
caching procedure (and if any do, only to a minor degree).
4.7.2 Profiling
Profiling is an important tool for reaching the goal of this diploma thesis. It
can be used to spot inefficiencies in source code by measuring the amount of
time the processor spent in each function. While the script is being executed,
a trace file is generated containing the information collected about the
current execution.
Afterwards this trace file can be processed with the included tool pprof to
gather the information recorded. The output of the tool can be customized
to the needs of the analysis through several options. For example, a call tree can
be printed showing the functions called including their dependencies. The
tool is also capable of listing totals for functions such as time and memory
consumed or the number of calls.
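What kind of information such totals contain can be illustrated with a hand-written sketch (assuming PHP 7 or later). It does not use APD and does not reproduce its trace file format: the class ToyProfiler is hypothetical, functions have to be called through it explicitly, and it measures wall-clock time with microtime rather than processor time.

```php
<?php
// Sketch of what a profiler records: number of calls and time per function.
// Functions must be called through the profiler; time is wall-clock time.
class ToyProfiler
{
    public $calls = [];    // function name => number of calls
    public $seconds = [];  // function name => accumulated time in seconds

    public function call(string $name, callable $function, array $args = [])
    {
        $start = microtime(true);
        $result = call_user_func_array($function, $args);
        $elapsed = microtime(true) - $start;

        $this->calls[$name] = (isset($this->calls[$name]) ? $this->calls[$name] : 0) + 1;
        $this->seconds[$name] = (isset($this->seconds[$name]) ? $this->seconds[$name] : 0.0) + $elapsed;
        return $result;
    }

    public function report()
    {
        arsort($this->seconds);   // most expensive function first
        foreach ($this->seconds as $name => $time) {
            printf("%-12s %3d calls %10.6f s\n", $name, $this->calls[$name], $time);
        }
    }
}

$profiler = new ToyProfiler();
for ($i = 0; $i < 3; $i++) {
    // An artificially expensive function for demonstration purposes.
    $profiler->call('md5_loop', function ($n) {
        for ($j = 0; $j < $n; $j++) { md5((string) $j); }
    }, [10000]);
}
$length = $profiler->call('strlen', 'strlen', ['Bandnews.org']);
$profiler->report();
```

A real profiler obtains the same figures without any change to the calls themselves, because it hooks into the interpreter.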
4.7.3 Alternatives
• Xdebug, created by Derick Rethans, is currently on the verge of version
2, which will enhance its functionality a great deal. The
older version 1 also has its strong points, for example the profiling output
can be directly appended to the generated page.
• DBG is a program developed by Dmitri Dmitrienko. Quite a few
IDEs use it for their debugging capabilities. Under Linux a GDB (the
GNU debugger) like interface is provided with a reduced command set.
Therefore the visualisation tool DDD can be used as a GUI. Support
for this application is still quite limited, though this might change in
future versions. Contrary to APD, no modifications to source code
need to be made.
Choosing the right tool for this thesis was hard, as all three of the
introduced tools have good and distinct features. If it were for debugging only, a
combination of all tools for different cases would have been the best choice.
As mainly profiling is done, APD provides the best functions; especially the
tool for processing trace files sets it apart from the other tools.
4.8 ApacheBench ab
To obtain measurable results a load generation tool is used for measuring
the number of requests a server is able to process in a given time. ab is a tool
that is capable of doing so. It ships together with the Apache web server.
Adam Twiss of Zeus Technology Ltd started its development in 1996, which
was continued in 1998 by the Apache Software Foundation.
ApacheBench is a tool that just does its task of load generation, not much
more. The most important settings used are the number of requests (speci-
fied by command line option -n) and the number of concurrent connections
(-c).
4.8.1 Alternatives
• httperf was developed by David Mosberger and Tai Jin at
Hewlett-Packard Research Labs in 1998. It allows a broader range of functions
than ab. What sets it apart from ab is the ability to request multiple
pages in form of a “user session”.
• Siege, created by Jeffrey Fulmer in 2000, is another load generator
that has some modes for how multiple sites are used for stressing the
web server (incrementally or randomly).
Part II
Tuning the Application
Chapter 5
Evaluation
5.1 Goal definition
Before the analysis of the web application can be started, the goal of the
tuning has to be defined. This is done in order to find the points at which
to install caching mechanisms.
Primarily the web application should be optimized regarding the load of the web
server. That means the generation of a single web page should not be very
CPU intensive. There are mainly three spots where the CPU is involved
heavily:
• The web server (program) takes processor time for reading directories
and files, forking to other processes and running configured
extensions or modules (such as mod_rewrite for manipulating the URI
before the request is processed further).
• The RDBMS needs the CPU for reading files, building and processing
quite complex data structures (e.g. B+ trees), searching through
indices and data manipulation.
• The script interpreter uses time for lexing and interpreting files,
execution of the specified script and the handling of data structures
the language provides.
Reducing the CPU load mainly works through the following two approaches:
• Improving algorithms: Often there exist other algorithms that fulfil
the same requirements but differ in speed and memory
consumption. A better algorithm can decrease the load because it
solves the problem in a more efficient way.
• Caching: Considering the daily usage of a web server, requested pages
or scripts return the same or very similar data on each request. It
appears to be wasteful to re-execute the same code over and over (consuming
CPU time) only to receive the same output in the end anyway.
So a good method for reducing load is to not even start to produce
the load, but to deliver a cached copy.
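The caching approach from the list above can be sketched in a few lines of PHP (assuming PHP 7 or later). This is a generic illustration and not the mechanism evaluated later in the thesis: the function cachedOutput, its parameters and the file naming scheme are hypothetical, and the lifetime of 60 seconds is an arbitrary choice.

```php
<?php
// Sketch: return a stored copy of some output if it is fresh enough,
// otherwise produce it once and store it for later requests.
function cachedOutput(string $key, int $ttl, callable $produce, string $dir): string
{
    $file = $dir . '/' . md5($key) . '.cache';

    // Fresh copy available: the expensive code is not executed at all.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);
    }

    // Otherwise capture everything the producer prints ...
    ob_start();
    $produce();
    $output = ob_get_clean();

    // ... and keep it for the following requests.
    file_put_contents($file, $output, LOCK_EX);
    return $output;
}

$runs = 0;
$page = function () use (&$runs) {
    $runs++;                                   // counts real executions
    echo "<html><body>Generated page</body></html>";
};

$dir = sys_get_temp_dir();
$key = 'index.php?' . uniqid();                // unique key so the demo starts cold
$first  = cachedOutput($key, 60, $page, $dir); // executes the script
$second = cachedOutput($key, 60, $page, $dir); // served from the cache
echo "script executed $runs time(s)\n";

unlink($dir . '/' . md5($key) . '.cache');     // clean up the demo file
```

The second call returns the identical output without running the page code again, which is exactly the load that caching is meant to avoid.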
For web servers caching seems to be the best and biggest chance
not only to reduce the need for CPU time, but also to speed up the
service. Still, this is not that simple and evident because there are many
points where caching can be applied.
What is more important: often it is not trivial to implement a caching
module because of the complexity of the problem or application.
5.2 Processing a Request
The sources of load as well as the recurring processes can best be understood
by having a closer look at how a script is requested by a client and returned by
the server. This is shown in Figure 5.1.
Figure 5.1: Processing a Request (diagram: client and server; the server contains the web server, the PHP module, the PHP script and the RDBMS, connected by arrows numbered 2 to 8)
1. The client establishes a connection to the web server via TCP, typically
to port 80 (a so-called well-known port, defined by IANA), where the web
server is listening for connections.
2. The request for a document is sent to the server using the HTTP
protocol. This would typically look like this:
GET /index.php HTTP/1.1
Host: www.bandnews.org
Accept-Language: en
Accept-Charset: utf-8
(The first two lines are a minimum request for using NameVirtualHosts
– multiple web sites residing on one IP address – with HTTP/1.1)
3. As soon as the request is fully transmitted (this is only a short GET
request, but especially POST requests can become very long, e.g. for
transmitting files) it is processed by the web server: the correct virtual
host is selected, preprocessing modules (e.g. URL rewriting) are
executed, the existence of the requested file is checked, and eventually
the action for the request is chosen and applied.
4. If the file is of a MIME type for which a responsible module exists,
this module is loaded and processes the file. Otherwise the web server simply
delivers the file (skip to step 8). Here is an example of a MIME type
definition in /etc/apache2/mods-available/php4.conf (default location
for Debian-based systems):
<IfModule mod_php4.c>
AddType application/x-httpd-php .php .phtml .php3
AddType application/x-httpd-php-source .phps
</IfModule>
5. The web server module (a script interpreter in this case) reads, parses
and executes the file.
6. A database connection can be opened, either by establishing a new
connection or by reusing a persistent one. Other third-party tools
(libraries, etc.) required by the script are also invoked at this point.
7. Once the script has been executed, its output is delivered as if it were
the content of a file.
8. The file (or output of the script) is returned to the client, reusing the
TCP connection established by the client.
9. HTTP/1.1 [FGM+99] allows the client to reuse this TCP connection
for further requests (this is called “keep alive” [Mog95]; go back to
step 2) if the server is configured to allow this behaviour.
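The client side of steps 1, 2 and 8 can be sketched in a few lines of PHP. This is only an illustration under assumptions: the function name build_get_request is made up for this example, and the part that actually opens the TCP connection is left commented out so that the sketch runs without network access.

```php
<?php
// Build the text of a minimal HTTP/1.1 GET request. Header lines end
// with CRLF and an empty line terminates the header section.
function build_get_request($host, $path, $keep_alive) {
    $lines = array(
        "GET $path HTTP/1.1",
        "Host: $host",   // lets the server select the correct virtual host
        'Connection: ' . ($keep_alive ? 'keep-alive' : 'close'),
    );
    return implode("\r\n", $lines) . "\r\n\r\n";
}

$request = build_get_request('www.bandnews.org', '/index.php', false);
echo $request;

// Sending the request (steps 1, 2 and 8); commented out on purpose:
// $socket = fsockopen('www.bandnews.org', 80, $errno, $errstr, 5);
// fwrite($socket, $request);
// while (!feof($socket)) { echo fgets($socket, 1024); }
// fclose($socket);
```

Passing true as the third argument asks the server to keep the connection open, which corresponds to step 9.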
5.3 Possible Hooking Points
The description of the request process reveals the following points where
caching might have a good effect (to be evaluated). These considerations
are kept fairly general at first; Section 5.4 then focuses on the
example application, Bandnews.org.
5.3.1 Client Request
A client will most commonly request the home page (or index page) first.
Usually this is the page named / or /index.php. This sounds like a good
opportunity to cache this page, so that the script is not even touched and
a previously stored copy is delivered instead. Unfortunately this is not
easily possible with a script-generated page. If it displays random parts
or other highly dynamic content, the page usually cannot be
cached as a whole (but in parts, see Section 9).
The requests of clients are very similar, but the most important thing they
have in common is that they have a slow connection to the server. This
does not necessarily add to the server load, but it keeps the
web server program in memory longer than necessary: the program
has to wait until the TCP connection is closed, which takes longer on
a slow connection (data can usually only be transmitted as
fast as the slowest part of the connection allows).
A solution would be a cache that handles the transmission of the data. It
is better described as a one-time cache or buffer, as the data is
discarded after transmission, but it still fits into the field of caching.
A proxy program consuming very little memory acts on behalf of the
web server: it listens on port 80, accepts the connection, and waits for the
client to submit its request (however long this takes). Then it passes the
request to the web server at full speed, using the loopback interface
when residing on the same machine, or a fast LAN connection otherwise. The
result of the request is transferred back to the proxy at the same high
speed. The proxy then handles the transmission of the data to the client,
while the web server process can be removed from memory or handle the next
request.
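A rough calculation illustrates why this buffering helps. The figures used below (a page of 50 KB, a client on a 56 kbit/s modem, a 100 Mbit/s link between proxy and web server) are assumptions chosen for illustration only and are not measurements from this thesis; protocol overhead and latency are ignored.

```php
<?php
// Seconds a process stays busy delivering $bytes over a link with the
// given bandwidth (protocol overhead and latency are ignored).
function delivery_seconds($bytes, $bits_per_second) {
    return ($bytes * 8) / $bits_per_second;
}

$page  = 50 * 1024;                        // assumed page size: 50 KB
$modem = delivery_seconds($page, 56000);   // assumed slow client: 56 kbit/s
$local = delivery_seconds($page, 100e6);   // assumed fast link: 100 Mbit/s

printf("direct to client: %.2f s, to proxy: %.4f s\n", $modem, $local);
```

With these assumed numbers the web server process would be occupied for about 7.3 seconds when serving the slow client directly, but only for about 4 milliseconds when it can hand the page to the proxy.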
5.3.2 PHP Module
Each time a script is requested by the client, PHP needs to go through
this sequence: it reads the script, parses it, converts it to executable
intermediate code and executes it. If the script requires additional files
(“included” files), this procedure has to be repeated for each file. Commonly
there are many includes; especially when connecting to databases the login
data resides in an included file, a wrapper library is loaded, and so on.
In the procedure necessary for executing a PHP script, the first steps are
heavily recurring. The ratio between how often a re-“compilation” is really
needed (only when the script has been modified) and how often it is actually
done (every time the script is executed) is very unfavourable. For smaller
scripts the time for compilation might be longer than the time for
execution. For example, a script only outputting a few lines runs
considerably faster if the compilation steps are skipped.
The repeated process of compiling can be left out quite easily when its
result, the runnable intermediate code, is stored in a cache. To preserve
the character of a scripting language, the script file only has to be
checked for modifications when it is run, which is by far less expensive
(in terms of time) than a re-compilation. If the script has been modified,
though, a small delay is added, as the script is both checked for
modification and then compiled.
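The principle can be sketched as follows. This is a simplified illustration and not how a real compiler cache is implemented: compile_script is a hypothetical stand-in for the expensive parse and compile step, and the cache is a plain array that only lives for the duration of this script, whereas a real compiler cache keeps the intermediate code across requests.

```php
<?php
// Hypothetical stand-in for the expensive parse/compile step: it merely
// splits the source into tokens and counts how often it was invoked.
function compile_script($path, &$compile_count) {
    $compile_count++;
    return preg_split('/\s+/', trim(file_get_contents($path)));
}

// Return the "intermediate code" for $path. The file is recompiled only if
// its modification time differs from the one stored with the cache entry.
function load_compiled($path, &$cache, &$compile_count) {
    clearstatcache();                  // avoid a stale filemtime() result
    $mtime = filemtime($path);
    if (isset($cache[$path]) && $cache[$path]['mtime'] === $mtime) {
        return $cache[$path]['code'];  // hit: costs only the modification check
    }
    $code = compile_script($path, $compile_count);
    $cache[$path] = array('mtime' => $mtime, 'code' => $code);
    return $code;
}

$script = tempnam(sys_get_temp_dir(), 'cc');
file_put_contents($script, 'echo "hello";');
$cache = array();
$compiles = 0;

load_compiled($script, $cache, $compiles);   // first request: compiled
load_compiled($script, $cache, $compiles);   // second request: from cache
touch($script, time() + 60);                 // simulate an edit of the script
load_compiled($script, $cache, $compiles);   // change detected: recompiled

echo "compilations: $compiles\n";
unlink($script);
```

Three requests lead to only two compilations here; the second request pays for nothing but the modification check.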
5.3.3 Database
When using databases in combination with web servers there are also
opportunities for caching. Similar queries are executed over and over, as
most of the content of web sites is not personalized, which means that most
queries do not differ between requests. As the database changes
comparatively seldom, caching the database output promises a good gain in
speed.
The caching of results would commonly be assigned to the application,
because the database knows little about the way the application retrieves
its data. Moving the cache from the application to the database, however,
allows more efficient cache invalidation, because table modifications can
be caught by internal triggers. The cache does not need to be very
“intelligent”, as queries are re-executed by a computer program. Repeated
queries therefore match byte-wise and are consequently easy to detect.
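The following sketch illustrates both properties, the byte-wise match and the invalidation on table modification. It is not the implementation used by any particular database; the class QueryCacheSketch, the explicit list of tables passed to fetch and the column names in the sample query are assumptions made for this example. It is written for a current PHP version (it uses closures) rather than the PHP 4 used elsewhere in this thesis.

```php
<?php
// Minimal sketch of a result cache keyed by the exact query text.
class QueryCacheSketch {
    private $results = array();  // query text => result rows
    private $byTable = array();  // table name => cached query texts
    public $hits = 0;

    // $tables names the tables the query reads; a real database would
    // derive this information itself.
    public function fetch($sql, $tables, $execute) {
        if (array_key_exists($sql, $this->results)) {  // byte-wise match
            $this->hits++;
            return $this->results[$sql];
        }
        $rows = call_user_func($execute, $sql);
        $this->results[$sql] = $rows;
        foreach ($tables as $table) {
            $this->byTable[$table][] = $sql;
        }
        return $rows;
    }

    // To be called whenever a table is modified: every cached result
    // that was read from this table is dropped.
    public function invalidate($table) {
        if (!isset($this->byTable[$table])) {
            return;
        }
        foreach ($this->byTable[$table] as $sql) {
            unset($this->results[$sql]);
        }
        unset($this->byTable[$table]);
    }
}

$cache = new QueryCacheSketch();
$executed = 0;
$run = function ($sql) use (&$executed) {  // stand-in for the database call
    $executed++;
    return array(array('name' => 'Example Band'));
};

$sql = 'SELECT name FROM bn_band WHERE id = 1';
$cache->fetch($sql, array('bn_band'), $run);  // executed
$cache->fetch($sql, array('bn_band'), $run);  // served from the cache
$cache->invalidate('bn_band');                // e.g. after an UPDATE
$cache->fetch($sql, array('bn_band'), $run);  // executed again

echo "executed: $executed, hits: {$cache->hits}\n";
```

Because the lookup key is the literal query text, a query that differs only in letter case or whitespace counts as a different query in this sketch.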
Another hooking point is the establishment of a connection to the database.
This can be quite expensive and has to be done every time a script needs to
connect to the database. Persistent connections can help in this
scenario: the connection between application and database is not closed when
the script ends but lives on practically forever. The drawbacks of this
connection type must be kept in mind: bugs in the application can block
a persistent connection forever, causing the pool of available connections,
which is limited by definition, to shrink.
5.3.4 Application
General improvements can hardly be suggested for an application. Whether
and which methods of improvement can be used depends heavily on the
application and its requirements. Here the knowledge and experience of the
application programmer are called for. Still, there are some “standard”
approaches to gaining speed.
An important point in caching is the recognition of recurring patterns.
The more often a pattern appears, the better the chances for efficient
caching. Good candidates for caching are data combinations arranged by the
application, for example a result set consisting of combined database
queries. Also, intermediate results of algorithms are often worth caching:
for instance, data structures used by an algorithm have to be built up
first, which is frequently expensive.
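A minimal sketch of such an application-level cache is given below. It is an illustration under assumptions and not the mechanism used by Bandnews.org: the function cache_fetch, the file naming scheme and the sample data are made up for this example, and the code targets a current PHP version (it uses closures).

```php
<?php
// Return the value cached under $key if it is younger than $ttl seconds;
// otherwise call $producer, store its result and return it.
function cache_fetch($dir, $key, $ttl, $producer) {
    $file = $dir . '/' . md5($key) . '.cache';
    clearstatcache();
    if (is_file($file) && time() - filemtime($file) < $ttl) {
        return unserialize(file_get_contents($file));
    }
    $value = call_user_func($producer);
    file_put_contents($file, serialize($value), LOCK_EX);
    return $value;
}

// Hypothetical expensive step: building a lookup structure from raw rows.
$builds = 0;
$build_index = function () use (&$builds) {
    $builds++;
    $rows = array(array(1, 'Rock'), array(2, 'Jazz'));
    $index = array();
    foreach ($rows as $row) {
        $index[$row[0]] = $row[1];
    }
    return $index;
};

$dir = sys_get_temp_dir() . '/cache_sketch_' . uniqid();
mkdir($dir);
$first  = cache_fetch($dir, 'genre-index', 300, $build_index);  // built
$second = cache_fetch($dir, 'genre-index', 300, $build_index);  // from cache
echo "built $builds time(s)\n";
```

The time to live (300 seconds here) is the simplest form of invalidation; whether it is acceptable depends on how stale the cached data may become.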
5.4 Bandnews.org
Evaluating opportunities for application-level improvement requires a
specific look at the application itself. As stated in the introduction (see
Section 1.2), a skeleton of a page as well as a few key pages will be
examined.
When analyzing tuning potential we take a two-stage approach: First the
page is investigated regarding its structure and elements; from this
overview caching opportunities are identified. The second stage involves a
profiler, which identifies further bottlenecks and suggests where to search
for better solutions for the critical regions.
5.4.1 Skeleton page
A skeleton page contains what is common to every other page of the
application. For web sites two different types of skeletons can be
distinguished: an “ultimate” skeleton and a “normal” skeleton that depends
on the ultimate one. This can also be shown using a layer model, see
Figure 5.2.
Figure 5.2: Script layers of an application script
[Diagram of three stacked layers, listed as: Script, Presentation Skeleton,
Application Skeleton.]
• The Application Skeleton is common to every single page of the
application. It is usually used to include database connection routines,
authentication mechanisms, the loading of modules such as Smarty, and other
common functions. For the application Bandnews.org this functionality
is contained in the file inc/setup.inc.php. A simple skeleton
page is therefore a PHP script that only includes this file.
When benchmarking, this file will be called skeleton.php (sometimes
varied with an appended -t or -n).
• The Presentation Skeleton builds on the Application Skeleton, so
it provides all of its functionality. This is achieved simply by
including inc/setup.inc.php. Calling the presentation skeleton in a
web browser results in a page that includes all navigational and recurring
features of a page. The added functionality is thus restricted to
presentational code, commonly including a meta navigation, a primary
navigation such as a search box and/or a menu, as well as header and
footer.
When looking at Listing 5.1, a separation into 4 documents is clearly visible.
Listing 5.1: A Presentation Skeleton – pres-skel.php
    <?php
    require('inc/setup.inc.php');

    include('inc/header.inc.php');
    include('inc/menu.inc.php');
    include('inc/sidebar.inc.php');
    include('inc/footer.inc.php');
    ?>
If only the first file was included, the script would represent the application
skeleton.
The additional files load the corresponding page parts, including database
calls when needed.
5.4.2 Index page index.php
The index page belongs to a group of three page types (described below)
and is a good representative of a common page of the site. In addition to
the presentation skeleton, news items are displayed, which are subject to
constant change. The selection of the news items (i.e. the SELECT statement
and its WHERE clause in the SQL database query) is based on the page type:
• The Index Page shows the most recent news from all bands and
genres, selected by date in a descending order.
• The Genre Page displays the news for a certain genre and its
subgenres, with a news item belonging to the genres of its band. As a band
can be classified into more than one genre, a weak entity is used for
establishing this relation. Displaying this page therefore requires a join
of tables.
• A Band Page is a filtered view of the Index Page, restricted to
a specific band.
The index page usually receives the most hits on a server and is therefore
worth close consideration.
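How the page type determines the query can be sketched as follows. The actual schema and queries of Bandnews.org are not shown in this chapter, so the function news_query, the table bn_news and all column names are hypothetical; subgenres are left out for brevity.

```php
<?php
// Build the SELECT statement for one of the three page types.
// Table and column names are assumptions made for this illustration.
function news_query($page_type, $id = null, $limit = 20) {
    $sql = 'SELECT n.* FROM bn_news n';
    $where = '';
    if ($page_type === 'genre') {
        // a band can belong to several genres, hence the join
        $sql .= ' JOIN bn_band_genre bg ON bg.band_id = n.band_id';
        $where = ' WHERE bg.genre_id = ' . (int) $id;
    } elseif ($page_type === 'band') {
        $where = ' WHERE n.band_id = ' . (int) $id;
    }
    // all three page types show the most recent news first
    return $sql . $where . ' ORDER BY n.date DESC LIMIT ' . (int) $limit;
}

echo news_query('index'), "\n";
echo news_query('genre', 3), "\n";
echo news_query('band', 42), "\n";
```

The three statements differ only in the join and the WHERE clause, which is why the index page can serve as a representative for all three page types.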
5.4.3 Search page search.php
A search page can be separated into two parts:
• The band search: The search query is used for finding bands which
match the expression, taking into consideration not only bands that
provide news, but also those that could not be integrated into
Bandnews.org.
• The news search: Based on the query, matching news items are
displayed, i.e. a selection of news items just as described in Section 5.4.2.
Both parts can be covered by the testing of the index page, because
the MySQL query cache will be included in testing.
5.4.4 Links page links.php
The links page is somewhat unlike the other page types, and precisely
because of that it is worth taking a closer look at it: For each letter of
the alphabet, the matching bands are displayed on one page, including all
relevant information:
• Band name
• Country
• Genres
• Homepage address
The interesting thing about this page is that for database design reasons
the information is split across several tables: band name and homepage
address are stored in the bn_band table, the country (for translation
reasons) in bn_country, and the genres in bn_band_genre and bn_genre, as
a band can have more than one genre assigned (an m:n relation, modeled
using bn_band_genre as weak entity).
Many (expensive) queries are needed to build this page. It is therefore
important to know how the various caching strategies can optimize the
speed of this page.
5.5 Testing
When testing the capacity of a web server, there are several things to be
considered [BD99]. Aspects like the latency of a WAN connection are not
taken into consideration in this thesis; all tests are done via the loopback
interface.
In our scenario, we are not testing a web server program delivering text
files, but the output of a script. This significantly decreases the speed.
In a basic test on the author’s system, the Apache web server is able to
serve a 2 kilobyte document about 3,000 times per second, while a script
generating 2 KB of random data only achieves a throughput of about 190
requests per second (without any tuning).
For each test a fixed sequence of steps is followed to produce comparable
results. This is done by a script developed for this purpose. More
information and the source code can be found in the appendix.
1. The configuration files are modified to reflect the changes to be tested.
2. The web server and database server are restarted to provide a fresh
environment.
3. The script to be tested is requested, without measuring, one or more
times (depending on the test case). This is used to load caches. This
step can be skipped if the generation of the cache is to be included in
the test.
4. The process is paused for a certain amount of time, for instance 10
seconds, to ensure that no trailing requests block anything.
5. The test run is started. The document is requested between some 10 and
some 1000 times; parallel requests (concurrent requests, CCR) are also
possible.
6. The log file of the test is parsed and converted to a format used for
generating a chart.
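Steps 5 and 6 can be sketched as follows, assuming that ApacheBench (ab) serves as the load generator. The two helper functions are made up for this illustration and are not taken from the author's benchmark script; the sample report line only mimics the format of ab's output.

```php
<?php
// Command line for one measured run with ApacheBench:
// -n is the total number of requests, -c the number of concurrent requests.
function ab_command($url, $requests, $concurrency) {
    return sprintf('ab -n %d -c %d %s',
                   $requests, $concurrency, escapeshellarg($url));
}

// Extract the "Requests per second" figure from ab's report,
// or return null if the report does not contain it.
function parse_requests_per_second($ab_output) {
    if (preg_match('/^Requests per second:\s+([0-9.]+)/m', $ab_output, $m)) {
        return (float) $m[1];
    }
    return null;
}

echo ab_command('http://localhost/skeleton.php', 1000, 10), "\n";

// A real run would be: $report = shell_exec(ab_command(...));
// Here a made-up sample in the format of ab's report is parsed instead.
$report = "Time taken for tests:   12.729 seconds\n"
        . "Requests per second:    78.56 [#/sec] (mean)\n";
echo parse_requests_per_second($report), "\n";
```

Keeping the parsing separate from the invocation makes it possible to convert stored log files into chart data later without repeating the test run.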
5.5.1 Preparations
The author’s benchmarking script can automate many tests, as it
automatically generates all possible test cases with each tool switched on
or off. A few steps have to be taken before a tool can be used with the
benchmark.
Usually one or more text files have to be changed to configure a
tool, and the appropriate server has to be restarted. Unfortunately it is
not possible to exchange the whole configuration file, as two tools might
need a change in the same file. The configuration files are therefore
patched using a file representing the changes to be made.
When the benchmark script is started it expects all tools to be disabled
in the configuration files. This can be ensured by running a restoring
script beforehand. The tools need to be turned off anyway for generating
the patch file that integrates a tool with the benchmark. See Listing 5.2
for how such a patch file is generated.
Listing 5.2: Creating a patch file for the MySQL query cache
    cd /tmp
    cp /etc/mysql/my.cnf .
    vi my.cnf    # do the necessary editing
    diff -c /etc/mysql/my.cnf my.cnf > mqc
The result is a file containing only the modified lines (plus some contextual
lines, so that the file can even be patched if the line numbers do not match).
What such a file exactly looks like can be found in the appendix, Listing
A.4.
Figure 5.3: Typical output while benchmarking
When all necessary patch files are generated, the testing can be started.
The benchmark script is invoked with the patch files as command line
arguments. A single option can be tested by specifying just one argument.
All in all 2^n test cases (with n being the number of arguments) will be
run through. The scripts to be tested (k in number) are hardcoded and can
be overridden (see appendix A.1). As a grand total, k · 2^n benchmarks are
run.
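The enumeration of the test cases can be sketched with a bit mask. This is an illustration and not the author's benchmark script (which is found in the appendix); the function name test_cases and the patch file names apc and squid are assumptions, only mqc appears in Listing 5.2.

```php
<?php
// Enumerate every on/off combination of the given patch files.
// Bit i of $mask decides whether patch file i is applied in a test case.
function test_cases($patch_files) {
    $patch_files = array_values($patch_files);
    $n = count($patch_files);
    $cases = array();
    for ($mask = 0; $mask < (1 << $n); $mask++) {
        $enabled = array();
        foreach ($patch_files as $i => $file) {
            if ($mask & (1 << $i)) {
                $enabled[] = $file;
            }
        }
        $cases[] = $enabled;
    }
    return $cases;
}

$cases   = test_cases(array('mqc', 'apc', 'squid'));  // n = 3
$scripts = array('skeleton.php', 'index.php');        // k = 2
echo count($cases), ' test cases, ',
     count($scripts) * count($cases), " benchmark runs\n";
```

For n = 3 patch files and k = 2 scripts this gives 2^3 = 8 test cases and 2 · 8 = 16 benchmark runs, which shows how quickly the total running time grows with every additional tool.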
5.5.2 Testing environment
The testing environment is a Pentium 4 2.8 GHz system with 512 MB of DDR
RAM, running Ubuntu Linux 5.04 (Hoary). The versions of the programs used
are shown in Listing 5.3.
Listing 5.3: Program versions
    $ uname -a
    Linux main 2.6.10-5-686-smp #1 SMP Tue Apr 5 12:41:40 UTC 2005 i686 GNU/Linux

    $ apache2 -v
    Server version: Apache/2.0.53
    Server built: Apr 1 2005 18:17:53

    $ php -v
    PHP 4.3.10-10ubuntu4 (cli) (built: Apr 1 2005 14:16:27)
    Copyright (c) 1997-2004 The PHP Group
    Zend Engine v1.3.0, Copyright (c) 1998-2004 Zend Technologies

    $ mysqld -V
    mysqld Ver 4.1.10a-Debian_2-log for pc-linux-gnu on i386 (Source distribution)

    $ grep @version libs/Smarty.class.php
    * @version 2.6.7

    $ squid -v | head -1
    Squid Cache: Version 2.5.STABLE8

    $ pear info APC | grep Version
    Version 2.0.4

    $ pear info APD | grep Version
    Version 0.9.2

    $ ab -V
    This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
    Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/
The SMP kernel was installed because of the Hyper-Threading feature of the
Pentium 4.
Chapter 6
Squid
The Squid proxy server used here has already been introduced in Section
4.5.
6.1 Considerations
The primary idea which led the author to integrate Squid and its Server
Acceleration Mode [1] was to speed up large requests by relieving the web
server of transferring the data from the client to the server and back.
According to [Wes04] there are many more benefits of this mode:
• Whole pages can be delivered from Squid’s cache if they have been
requested before.
• Squid acts as a kind of dedicated firewall: no direct access to the
web server is possible; if it is attacked or compromised, no stored data
is lost.
• Load balancing can be established quite easily by using Squid as a
reverse proxy.
The latter two points are not considered further in this thesis.
The first point, though, fits the topic and requires a closer look.
[1] RFC 3040 [CMT01] calls the proxy server a surrogate in this mode. The
term “reverse proxy” is also quite common.
6.1.1 Caching of whole pages
As mentioned earlier (see Section 5.3.1), it is often difficult to cache
whole pages generated by a script. A web application usually also consists
of static pages that can be cached. Typically (in scripted environments)
these pages are wrapped by a script in order to provide the standard
navigational environment (see presentation skeleton, Section 5.4.1).
To understand how scripted pages can be cached as a whole, the mechanism of
a caching proxy has to be examined in greater detail.
As stated earlier, one of the top priorities of a proxy server is to be
transparent to the user. It must therefore be ensured that no stale (i.e.
outdated) copy of a document is delivered to the client. To achieve this, a
caching proxy has to rely on the HTTP header fields sent by the servers
(so-called response headers).
The most important fields (also from the view of the web developer) are
(compare to [Wes01]):
• Date
• Last-Modified
• Expires
• Cache-Control
• Content-Length
The Date field can be used to detect clock skew that might interfere with
other date-based headers. The server sends its current time, which can be
compared to the time on the proxy server. In our case this is of little
use, as web and proxy server reside on the same machine and therefore share
one clock.
Even if the clocks of the systems are not synchronized, the Date field can
still be used to convert other fields specifying absolute timestamps to relative
time spans. The issue of wrong timestamps is also discussed in [Mog99].
Of greater importance is the Last-Modified field. It states the time of the
last modification of the document (on the server). In combination
with the Date field, the age of the document can be determined. In
addition, the proxy can use this timestamp to decide
whether its stored copy is stale and has to be refreshed or not.
In HTTP/1.0 the request field If-Modified-Since can be used by the proxy
to ask the web server for a newer version of a document [BLFF96]. If the
document has been changed since the given time, the server simply answers
with a 200 OK response and sends the new version, acting as if there had
been no If-Modified-Since field in the request. Otherwise the 304 Not
Modified message tells the proxy that its copy is still valid.
HTTP/1.1 provides a field called ETag (entity tag), which can be used to
establish a more complex treatment of different versions of documents (e.g.
in different languages) on the server. This is discussed in great detail in
[Wes01] and [FGM+99].
The Expires field provides the proxy with information about the lifetime
of the document. Until this timestamp is reached, the proxy can deliver
the page without revalidating it. By providing a timestamp in the past, the
proxy can be told to revalidate the document upon the next request in any
case.
By using the Cache-Control header, the proxy server can be given specific
information about the caching options of the document. The most important
values are public and private, which specify whether the document
contains user-specific data. In the latter case the document may only be
stored in the client’s personal cache. The no-cache value tells the proxy
not to deliver a stored copy without revalidating it with the server first;
to keep a document from being stored at all, the value no-store is used.
For a quick stale check the field Content-Length can be used. If the
document size stored in the cache does not match the specified one, the
document is likely to have been changed in the meantime. Decisions based on
this field are rather risky, though. For example, different character
encodings (e.g. UTF-8 vs. ISO-8859-1) can lead to different lengths of
documents with the same content.
Because of this, proxy servers usually only use this value to verify that
they do not store a document that has not been transferred completely. A
client must never receive incomplete data from the proxy.
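The way a proxy may combine these fields can be sketched as a small decision function. This is a strongly simplified illustration and does not reproduce Squid's actual rules; the function name proxy_decision and the three result strings are assumptions made for this example, and only the values private and public of Cache-Control and the Expires field are considered.

```php
<?php
// Decide how a shared cache may treat a response with the given headers.
// $now is the current time as a Unix timestamp.
function proxy_decision($headers, $now) {
    $cache_control = isset($headers['Cache-Control'])
        ? strtolower($headers['Cache-Control']) : '';
    $directives = array_map('trim', explode(',', $cache_control));

    if (in_array('private', $directives)) {
        return 'do not store';       // user-specific data: client cache only
    }
    if (isset($headers['Expires'])) {
        $expires = strtotime($headers['Expires']);
        if ($expires !== false && $expires > $now) {
            return 'serve from cache';  // still within its lifetime
        }
    }
    return 'revalidate';             // ask the server, e.g. If-Modified-Since
}

$now = strtotime('Mon, 09 May 2005 12:00:00 GMT');
$examples = array(
    'fresh'   => array('Expires' => 'Thu, 01 Dec 2005 16:00:00 GMT',
                       'Cache-Control' => 'public'),
    'expired' => array('Expires' => 'Sat, 01 Jan 2005 00:00:00 GMT'),
    'private' => array('Cache-Control' => 'private'),
);
foreach ($examples as $name => $headers) {
    echo $name, ': ', proxy_decision($headers, $now), "\n";
}
```

A response without any of these fields ends up in the revalidate branch, which is the situation of most script-generated pages.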
6.1.2 Programmer’s view
Knowing the headers the proxy servers rely on, the programmer has to
ensure that the web application provides the proxy with the correct headers.
As mentioned earlier there are two types of documents: static and dynamic
ones. While the web server commonly takes care of the correct header fields
for static data (it obtains, for instance, the last modification date from
the operating system), the programmer is largely left alone with dynamic
documents.
A fairly easy case is the wrapping of a static document (a mixture of a
static and a dynamic document, i.e. a semi-static one), e.g. when an HTML
document is embedded in the navigational elements of the site. Listing 6.1
shows how the appropriate fields can be set. This piece of code is adapted
and enhanced from sample code in [Sch04].
Listing 6.1: Last modification check
    <?php
    // Send a Last-Modified header, or answer 304 if the client's copy is current
    function last_modified_headers($mod_time) {
        $gmt_mtime = gmdate('D, d M Y H:i:s', $mod_time) . ' GMT';

        // PHP exposes the If-Modified-Since request field under this key
        if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])
                && $_SERVER['HTTP_IF_MODIFIED_SINCE'] == $gmt_mtime) {
            header('HTTP/1.1 304 Not Modified');
            exit;
        } else {
            session_cache_limiter("must-revalidate");
            // header('Cache-Control: must-revalidate');
            header('Last-Modified: ' . $gmt_mtime);
        }
    }
    $document = 'static.html';
    last_modified_headers(filemtime($document));
    include('header.inc.php');
    include($document);
    include('footer.inc.php');
    ?>
Note the call to session_cache_limiter: when PHP sessions are used (started
with session_start), the header field Cache-Control is rewritten by PHP.
Specifying a replacement value with session_cache_limiter controls this
behaviour. One has to be careful with this option because sessions can
affect the user’s privacy (see Section 2.1.2).
A problem arises when either the header or the footer is changed: the
script will still report the old modification date. Either the last
modification dates of all three documents need to be taken into
consideration (see Listing 6.2), or the modification time of the static
page is simply updated by using the command touch static.html. It depends
on the application (e.g. when data from a database is retrieved in either
header or footer) whether or not to choose the former method.
Listing 6.2: Last modification check adapted – pres-skel-t.php
    <?php
    function last_modified_headers($mod_time) { /* code remains unchanged */ }

    $documents = array('header.inc.php', 'static.html', 'footer.inc.php');
    // the youngest of the three files determines the modification date
    $last_modification = -1;
    foreach ($documents as $document) {
        if ($last_modification < filemtime($document)) {
            $last_modification = filemtime($document);
        }
    }
    last_modified_headers($last_modification);
    foreach ($documents as $document) {
        include($document);
    }
    ?>
With dynamic pages it heavily depends on the content whether a time of last
modification can be specified at all. Generally speaking the last modification
date is the date of the “youngest” part of the page. If for a single part of
the page (e.g. included content of a database) this time is unavailable, the
last modification for the whole page is unavailable. This case is not unlikely.
That is the reason why caching of whole pages is problematic for dynamically
generated pages. Therefore, the solution using a proxy does not suffice for
web applications.
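The rule stated above can be written down in a few lines. The function page_last_modified and the sample timestamps are assumptions made for this illustration; parts whose modification time is unknown (e.g. content taken from a database without a timestamp) are represented by null.

```php
<?php
// Last modification time of a page assembled from several parts.
// $part_times holds one Unix timestamp per part, or null where unknown.
function page_last_modified($part_times) {
    if (count($part_times) === 0 || in_array(null, $part_times, true)) {
        return null;            // one unknown part makes the whole page unknown
    }
    return max($part_times);    // the "youngest" part decides
}

$static_parts = array(1115640000, 1115643600);   // e.g. from filemtime()
var_dump(page_last_modified($static_parts));
var_dump(page_last_modified(array_merge($static_parts, array(null))));
```

The first call yields the younger of the two timestamps; the second call yields null, in which case no Last-Modified header can be sent and the proxy cannot reuse its copy.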
6.1.3 Expected Results
For static documents (or “statified” dynamic documents) we can expect a
high gain in speed when using the caching capabilities of the proxy.
It is difficult to estimate what effect the proxy has when its cache cannot
be used. This is exactly the case when testing a purely dynamic script that
is incapable of delivering a last modification date. A gain in speed can be
expected with many concurrent requests or with slow clients.
6.2 Preparation
To prepare Squid for the server acceleration mode, the configuration file has
to be modified. Squid is by default configured to be a client-side caching
proxy.
This section will only give an overview of the most important options. All
options that have been modified for the benchmark tests can be found in
the patch file in the appendix, Listing A.2.
6.2.1 Configuring Apache
The first consideration is to have the proxy listen on the port of the web
server. This is usually the well-known port 80. Before that, the web server
must be told to listen on another port, because only one program can occupy
a port at a time. The new port of the web server is arbitrary, as the
proxy server needs to be configured to forward requests to this port anyway.
The necessary change needs to be made to the file /etc/apache2/ports.conf.
The line Listen 81 will make Apache listen on port 81 instead of
the default port 80.
6.2.2 Configuring Squid
For the Squid proxy server some more changes are needed. As we have
configured the web server to listen on another port, Squid should listen on
port 80 instead. This is done via the option http_port 80 (see Listing
6.3).
Now we need to tell the proxy where the web server resides. This can
be done using the httpd_accel_host and httpd_accel_port options. The
values 127.0.0.1 (localhost via the loopback interface) and 81 do the
correct thing.
As the developers have taken care of security in the default configuration
of Squid, there is one more option to be changed. We need to allow everyone
to access the proxy accelerator (this is the usual purpose of a web server,
as opposed to the restricted audience of a proxy). The option
http_access allow all does exactly that.
These options suffice for turning on server acceleration mode. The
modification of both the Apache and the Squid configuration file requires a
restart of both programs (typically done via the commands [2]
sudo /etc/init.d/apache2 restart and sudo /etc/init.d/squid restart).
Still there are two more options worth considering:
The acceleration switches automatically turn off the caching proxy function.
It is advisable to turn the function on again. This can be done via the
extra option httpd_accel_with_proxy on.
When using the web server in name-based virtual host mode (when more than
one (sub)domain points to one IP address) the HTTP/1.1 request field Host
needs to be passed on by the proxy, too. This is turned off by default for
security reasons once again. The appropriate option is
httpd_accel_uses_host_header on.
Listing 6.3: A reduced Squid configuration file – /etc/squid/squid.conf
    http_port 80
    http_access allow all
    httpd_accel_host 127.0.0.1
    httpd_accel_port 81
    httpd_accel_with_proxy on
    httpd_accel_uses_host_header on
[2] The command sudo allows a standard user to execute a command with the
rights of the super user (typically root). The scripts are executed with
user privileges, but for certain commands more rights are required.
6.3 Results
Let us now have a look at the first results.
The tests started with the optimized versions of skeleton.php and
pres-skel.php (see Listings 6.1 and 6.2, providing last modification
headers), but without Squid activated.
For the tests, 10,000 documents were requested by the load generator;
for the comparison the number of requests per second is taken as the measure
(the total time of the test run can easily be calculated from it). Because
static documents receive an extraordinarily high gain in speed (compare
Table 6.1 with Table 6.2), such a large number was chosen to obtain
representative results.
Table 6.1: Benchmarking results (Requests per second): Without Squid
                           Concurrent requests
File (.php)        1       5      10      25      50     100    1000
skeleton-t     78.56   91.43   84.69   77.24   69.82   62.08   65.91
pres-skel-t     4.42    5.64    5.70    5.99    5.62    5.99    3.44*
* Aborted after 172 requests (because of a time limit of 60s per request)
6.3.1 skeleton-t.php
The (optimized) ultimate skeleton skeleton-t.php is the only page that is
completely independent of the database. The results for this page are
therefore quite good (i.e. fast, see Figure 6.1). This mainly demonstrates
the “full power” of the server, so it can be regarded as the upper bound:
3868.6 requests per second with Squid turned on and 10 concurrent requests.
Table 6.2: Benchmarking results (Requests/s): With Squid
                      Concurrent requests
File (.php)          1         5        10        25
skeleton-t     2,500.3   3,590.4   3,868.6   3,788.3
pres-skel-t    1,699.9   2,349.8   2,523.4   2,470.1
6.3.2 pres-skel-t.php
This skeleton is a more realistic test candidate (see Figure 6.2). Static
pages (such as the About page or the contact form of the application
Bandnews.org) wrapped by a script behave very similarly to pres-skel-t.php.
This is only the case if the presentation skeleton stays the same upon each
request, which is a desirable state.
Table 6.3: Benchmarking results (Requests/s): With Squid (cont.)
                 Concurrent requests
File (.php)         50       100      1000
skeleton-t     3,681.1   3,385.7   2,709.5
pres-skel-t    2,005.9   1,940.1   1,912.9
As the difference in speed is so extraordinarily high (and the
positive effect therefore cannot be overlooked), we concentrate on another
aspect of high load: concurrent requests. In the other tests we will only
look at a maximum of two different concurrency levels.
Figure 6.1: Squid benchmark: skeleton-t.php
[Chart: requests per second (logarithmic scale) over concurrent requests
(1 to 1,000) for squid_disabled and squid_enabled; the plotted values
correspond to the skeleton-t rows of Tables 6.1, 6.2 and 6.3.]
Figure 6.2: Squid benchmark: pres-skel-t.php
[Chart: requests per second (logarithmic scale) over concurrent requests
(1 to 1,000) for Squid_disabled and Squid_enabled; the plotted values
correspond to the pres-skel-t rows of Tables 6.1, 6.2 and 6.3.]
Figure 6.3: Squid benchmark: index.php
[Chart: requests per second over concurrent requests (10 and 100) for
Squid_disabled and Squid_enabled; the plotted values are those of Table 6.4.]
Table 6.4: Benchmarking results (Requests per second): index.php
         Concurrent requests
Squid       10     100
off       2.64    2.25
on        2.64    2.20
6.3.3 index.php
Figure 6.3 (Table 6.4) shows the benchmark results of index.php with Squid
turned on and off. There is no significant difference. In the case of 100
concurrent requests, turning Squid on adds just enough overhead to show
up in the benchmark (2.20 instead of 2.25 requests per second).
6.4 Conclusions for Squid
Squid can accelerate static and partially static pages massively (by a
factor of up to 420) when using the caching functionality of the proxy.
For dynamic pages Squid is no solution. It can even decrease speed due to
the added overhead. This is because Squid can only cache whole pages while
often no date of the last modification can be specified.
For web applications with even higher traffic (as far as static pages are
concerned), (disk) I/O will become a major bottleneck. [Wes04] deals with
this topic and possible solutions in great detail.
For dynamic pages a proxy server does not suffice for acceleration. There-
fore, more testing is necessary in the following sections.
Chapter 7
APC
In this section the Alternative PHP Cache (APC) introduced in section 4.6
will be used.
7.1 Considerations
The idea behind APC has been discussed in detail already (see sections 4.6
and 5.3.2). Nevertheless here is a short overview of what APC does:
7.1.1 Compiler Cache
Since PHP is a scripting language, the script code has to be compiled to a
runnable intermediate format each time the script is executed. This matches
the idea of quick prototyping, as a change to the script file takes effect
as soon as the file is saved.
The ratio between necessary compilations and useless recompilations is very
bad, though. Especially once the application is finished, recompilations
without a code change exceed the necessary ones by far.
This is where compiler caches come in. They store the result of the
compilation and reuse this intermediate code for the next request. Before
that, of course, a quick check is made whether the script has been modified.
If it has, the cached code is invalid and a recompilation is initiated.
The benefit of this tuning increases further when you consider that the
compilation process has to be initiated for every file that is included by
the first script.
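The check described above can be sketched in a few lines of PHP. This is only a toy model and not APC's actual implementation: the class name CompileCache is hypothetical, and the "compilation" is a stand-in that merely reads the file.

```php
<?php
// Toy model of a compiler cache: the "compiled" result of a file is reused
// until the modification time of the file changes. Not APC's real code.
class CompileCache
{
    private $entries = array();  // path => array('mtime' => int, 'code' => string)
    public $compilations = 0;    // counts how often the expensive step ran

    public function get($path)
    {
        clearstatcache(true, $path);   // do not rely on PHP's cached stat results
        $mtime = filemtime($path);

        if (isset($this->entries[$path]) && $this->entries[$path]['mtime'] === $mtime) {
            return $this->entries[$path]['code'];   // unchanged: reuse cached code
        }

        // First request or modified script: compile again and remember the result.
        $this->compilations++;
        $code = $this->compile($path);
        $this->entries[$path] = array('mtime' => $mtime, 'code' => $code);
        return $code;
    }

    // Stand-in for the real compilation to intermediate code (assumption).
    private function compile($path)
    {
        return 'compiled:' . file_get_contents($path);
    }
}
```

Requesting the same unchanged file twice triggers only one compilation; the counter increases again only after the modification time of the file has changed.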
7.1.2 Code Optimization
A topic which has not yet been discussed but adds to a speed increase of the
compiled PHP scripts is the optimization of code. It is worth spending some
effort (and therefore time) on optimizing the PHP code before storing it in
the cache. The cost for doing this is minimal considering that the optimized
code will be reused a few thousand times at least.
Although there is no documentation for APC on the topic of code
optimization, there are still quite a few resources. On the one hand, the
author of another PHP cache, Nick Lindridge, has written an article covering
code optimization [Lin02]. According to [Sch04] the optimizations done are
quite similar. On the other hand, the source code (long live open source!)
is available and documented well enough to give a brief overview of what it
does.
These code optimizations should not be compared to what “real compilers”
like GCC (the GNU Compiler Collection, a set of compilers that is almost
always used to compile open source software) do [Jon05]. They concentrate
instead on common cases that gain much from small adjustments. Longer and
very comprehensive analysis would probably not be worth the effort.
Here is just a short overview of these so-called “peephole optimizations”:
• Removing unnecessary NOOPs. APC simply strips out any NOOP codes it
finds. Even that is not trivial, as the jumps have to be modified to match
the shifted code positions.
• Glue sequences of ADD_STRING together. It is quite a drawback of PHP
(actually due to the inline replacement of variables) that it splits a
string into parts and executes an ADD_STRING command for each word.
• Converting $c++ to ++$c where possible. This is the case when the result
is not used immediately (so-called void context). With $c++ an additional
temporary variable is needed.
• Strip multiple jumps. If an if clause does not contain an else branch, a
jump command points to the next instruction, which is the next one to be
executed anyway.
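To illustrate why even the first optimization in the list above needs some care, here is a minimal sketch of NOOP removal with jump fix-up. It works on a made-up instruction format (arrays with an 'op' key and an optional 'target' index) and is not taken from the APC source; the function name strip_nops is hypothetical, and the instruction list is assumed to be indexed 0, 1, 2, ...

```php
<?php
// Toy peephole pass: remove NOP instructions and re-target the jumps.
function strip_nops(array $code)
{
    // Map every old instruction index to its index after NOP removal.
    // A removed NOP maps to the next surviving instruction.
    $map = array();
    $next = 0;
    foreach ($code as $i => $ins) {
        $map[$i] = $next;
        if ($ins['op'] !== 'NOP') {
            $next++;
        }
    }
    $map[count($code)] = $next;   // a jump may point just past the last instruction

    $out = array();
    foreach ($code as $ins) {
        if ($ins['op'] === 'NOP') {
            continue;                               // drop the NOP itself
        }
        if (isset($ins['target'])) {
            $ins['target'] = $map[$ins['target']];  // adjust to the shifted position
        }
        $out[] = $ins;
    }
    return $out;
}
```

For the sequence ECHO, NOP, JMP (target 4), NOP, RETURN the pass returns ECHO, JMP (target 2), RETURN: the jump still reaches the RETURN instruction although its position has shifted.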
7.1.3 Outputting Data
Especially the second point in the listing above shows the need for some
more testing. Outputting is an important point for a script that is used to
return data to the browser. Therefore the quickest method for transmitting
data has to be determined.
In PHP there is a concept called “Output Buffering.”
Initially, output buffering was integrated with PHP because of the necessity
to send HTTP headers before writing any output (compare to [Sur00]). This
is because PHP instantly sends the output of the script to the browser (it
“flushes” its buffer), but this can only be done after all headers have been
sent to the client. If you wanted to set a cookie (which is done in the HTTP
header with the Set-Cookie field) after you printed some text already, PHP
returns an error message or the cookie is simply dismissed.
Now output buffering not only enables the programmer to set headers at
any stage, but also performance benefits from the concept.
Instead of sending all data instantly to the browser, the data is stored in an
internal buffer. Therefore, all headers can be modified until output buffering
is terminated or the script is finished. At this point the headers and all data
stored in the buffer are sent to the browser.
A useful feature is the optional callback function. It allows the programmer
to modify the contents of the buffer before it is sent. This can be used to
compress the output, e.g. with gzip (see below), or to modify it for
compatibility with character encodings [Kir05].
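A minimal sketch of output buffering with a callback may make this more concrete. The callback squeeze_whitespace is a hypothetical example that only collapses whitespace; a real application might compress the output or convert its character encoding instead.

```php
<?php
// Callback that post-processes the whole buffer before it is sent.
function squeeze_whitespace($buffer)
{
    return preg_replace('/\s+/', ' ', $buffer);
}

ob_start('squeeze_whitespace');   // from here on, output is collected in a buffer
echo "Hello,    buffered\n\n   world!";
// Headers could still be set at this point because nothing has been sent yet.
ob_end_flush();                   // runs the callback and sends the result
```

The script sends "Hello, buffered world!" to the client, although the echo statement contained line breaks and runs of spaces.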
7.1.4 Programmer’s View
The boring thing about this section is that there is nothing to do for the
programmer.
7.2 Preparation
APC is activated easily. In the php.ini file just a line
extension = /usr/lib/php4/apc.so
has to be added (see the patch file in the appendix, Listing A.3).
7.2.1 Output Buffering
For testing the output behaviour, three additional scripts have been created.
Common to all of them is the generation of random output of a double
quoted string (which is examined for contained variables by PHP) as shown
in listing 7.1.
Listing 7.1: Generate lengthy random output

    <?php
    // Print the test string 1 to 4 times (chosen at random) in each of 5000 rounds.
    for ($i = 0; $i < 5000; $i++) {
        echo str_repeat("This is a test string", rand(1, 4));
    }
    ?>
These lines were deliberately not moved into a shared include file (although
that would be the obvious choice), so that the compiler cache handles them
separately for each test script.
no.ob-start(.php) uses no output buffering, i.e. the PHP function
ob_start() is never called.
ob-start.nogz(.php) just calls ob_start() at the beginning of the script.
This enables output buffering, but does not set a post-processing callback
function.
ob-start.gz(.php) adds an internal callback function to output buffering:
ob_gzhandler. Before the data is sent, it is compressed, commonly using
gzip. If the browser reports (using the header field Accept-Encoding)
that it has no support for gzip, another supported compression method is
chosen. For this test only gzip compression is used.
7.3 Results
The effects of switching on APC are not as extraordinary as those for Squid,
but they apply to all of the tested scripts.
Table 7.1: APC benchmarking results (requests per second)

                   APC off             APC on
                   Concurrent requests
File               10       100        10       100
skeleton-t.php     112.17   103.75     409.11   506.44
pres-skel-t.php    7.27     7.26       9.01     8.91
index.php          2.83     2.47       3.25     2.76
links.php          4.31     3.69       4.90     4.19
skeleton-t.php (Figure 7.1) receives the highest benefit from APC. This
is due to two points: There is no need to connect to the database and only
little output is made. The speed gained from dismissing the database is
quite obvious, but an interesting point is the cost of outputting data. We
will take a closer look at this point in Section 7.3.1.
Figure 7.1: APC benchmark: skeleton-t.php (requests per second for 10 and 100 concurrent requests, APC disabled vs. enabled; values as in Table 7.1)
Figure 7.2: APC benchmark: pres-skel-t.php (requests per second for 10 and 100 concurrent requests, APC disabled vs. enabled; values as in Table 7.1)
Figure 7.3: APC benchmark: index.php (requests per second for 10 and 100 concurrent requests, APC disabled vs. enabled; values as in Table 7.1)
Figure 7.4: APC benchmark: links.php (requests per second for 10 and 100 concurrent requests, APC disabled vs. enabled; values as in Table 7.1)
Scripts that need to build a connection to the database also gain speed.
The difference lies between 10 and 25 percent, which is quite a value for
literally uncommenting a line. In the rather uncommon case of no database
connection (we will see later how to make scripts independent of a
database) the request rate rises to almost five times the original value
(506.44 instead of 103.75 requests per second for 100 concurrent requests).
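The percentages can be recomputed from Table 7.1 with a small helper. The function name gain_percent is only chosen for this illustration; it expresses the relative gain of the new request rate over the old one.

```php
<?php
// Relative gain in percent when the request rate rises from $before to $after.
function gain_percent($before, $after)
{
    return ($after - $before) / $before * 100.0;
}

// Values taken from Table 7.1 (requests per second, APC off vs. APC on).
printf("pres-skel-t.php, 10 concurrent requests:  %.1f%%\n", gain_percent(7.27, 9.01));      // 23.9%
printf("index.php, 100 concurrent requests:       %.1f%%\n", gain_percent(2.47, 2.76));      // 11.7%
printf("skeleton-t.php, 100 concurrent requests:  %.1f%%\n", gain_percent(103.75, 506.44));  // 388.1%
```

Note that a gain of 388% corresponds to a request rate of about 4.9 times the original one.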
The results for index.php (Figure 7.3) and links.php (Figure 7.4) show
that with APC about the same performance (in terms of requests per second)
can be achieved for 100 concurrent requests (abbreviated CCR throughout the
thesis) as for 10 CCR without APC.
When comparing requests per second for different concurrency rates, it
has to be kept in mind that the speed from the view of a client still
varies. If benchmarks A and B have the same rate of requests/s, the total
time is also the same regardless of the concurrency rate.
The response time “felt” by the client is therefore longer for more
concurrent requests.
Consider two benchmarks with the same speed of 25 requests per second:
25 users requesting the page at the same time will have to wait one second
each. If the request rate stays the same and 250 users want to access the
page at the same time, every user has to wait 10 seconds. The request rate
is still 25 requests per second (and therefore a “good” result).
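The relation used in this example can be written down directly. The following sketch assumes, as the text does, that all simultaneous requests are finished at about the same time; the function name wait_seconds is hypothetical.

```php
<?php
// Time (in seconds) a client waits when $concurrent clients request the page
// simultaneously and the server sustains $rps requests per second.
function wait_seconds($concurrent, $rps)
{
    return $concurrent / $rps;
}

echo wait_seconds(25, 25), "\n";    // 1
echo wait_seconds(250, 25), "\n";   // 10
```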
7.3.1 Results for output testing
In the previous section the output performance of PHP was claimed to be one
cause of the achieved results. Therefore, some extra testing was included
to check how different settings for outputting data perform.
As stated before and as can be seen in Figure 7.5, output buffering
increases speed. Looking at the results without APC, we see that for 10
concurrent requests output buffering increases speed by only about 4%.
With APC the gap increases to almost 6% (see Table 7.2).
For 100 CCR there is not a gain but a loss. This can be explained by the
higher memory usage of output buffering. With many concurrent requests,
much data has to be stored in buffers: with a mean output size of 262,500
bytes per request, 100 concurrent requests add up to about 25 megabytes
used only for buffered data (there is some additional overhead, of course).
Together with the memory needed by the Apache processes this can quickly
fill the memory.
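The numbers in this estimate can be reproduced as follows. The sketch assumes that the string literal in Listing 7.1 is exactly "This is a test string" (21 bytes, without surrounding spaces), which is consistent with the mean size of 262,500 bytes stated above.

```php
<?php
// Expected output size of the test scripts from Listing 7.1:
// 5000 rounds, each printing the 21-byte string 1 to 4 times (2.5 on average).
$string_length = strlen("This is a test string");   // 21
$mean_repeats  = (1 + 2 + 3 + 4) / 4;                // 2.5
$mean_output   = 5000 * $mean_repeats * $string_length;

// With output buffering, every concurrent request holds its own buffer.
$concurrent  = 100;
$total_bytes = $concurrent * $mean_output;

printf("mean output per request: %d bytes\n", $mean_output);   // 262500
printf("buffered for %d requests: %.1f MB\n",
       $concurrent, $total_bytes / (1024 * 1024));             // 25.0
```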
Table 7.2: Output benchmarking results (requests per second)

             Files (.php)
CCR   APC    no.ob-start   ob-start.nogz   ob-start.gz   skel-t
10    off    38.77         40.29           28.72         112.17
10    on     48.68         51.45           35.69         409.11
100   off    37.59         35.32           26.86         103.75
100   on     46.02         43.01           33.29         506.44
The slowest variant is the gzip-compressed output. This is due to the
expensive algorithm for compressing data. Still, using this mode can be
recommended: since less data needs to be transferred (text data is very
suitable for compression [BK93]), bandwidth becomes less of a factor for
performance. It is also a cost factor: usually you have an account-based
traffic limit; using compression you can serve more visitors at the same
price.
The results for 100 CCR do not show a better performance for output
buffering than with 10 CCR. Instead, there is only a (very small) gain with
APC on and gzip compression turned off.
Figure 7.5: APC benchmark: Skeletons with(out) ob_start() (10 CCR; requests per second on a logarithmic scale per file, APC disabled vs. enabled; values as in Table 7.2)
Figure 7.6: APC benchmark: Skeletons with(out) ob_start() (100 CCR; requests per second on a logarithmic scale per file, APC disabled vs. enabled; values as in Table 7.2)
For documents that do not return any contents (skeleton-t.php) the rate
is evidently even higher since no output has to be stored or sent. Only the
headers are sent in this case. This reduces the used network bandwidth even
more. The headers are not compressed under any circumstances.
Although the gain in speed is not enormous, the use of output buffering
(with compression enabled) is strongly recommended. For large projects
bandwidth also plays an important role, and even a small loss of speed in
combination with a reduction of traffic can save a lot of money.
7.4 Conclusions for APC
APC proves to speed up every PHP script that is requested more than once.
This shows how much time is consumed by compilation when executing a
PHP script.
Turning off APC or not installing a compiler cache is simply a waste of CPU
time; installing and activating APC should always be considered.
Chapter 8
MySQL
8.1 Considerations
An important aspect of most web applications is the database. As the content
is dynamic, it is commonly stored in a database. Therefore, the speed of
the database is nearly as important as the speed of the script. Or, to put it
differently, a tuned database will also speed up the scripts.
There are two concepts that will be tested in this thesis: the MySQL Query
Cache and Persistent Connections. The third concept, Query Tuning,
should also be taken into consideration but is too extensive a topic to be
tested here.
8.1.1 MySQL Query Cache
Since version 4.0.1, MySQL (the database used in this thesis, see Section
4.3) supports a caching mechanism that allows quicker retrieval of the
results of common queries.
MySQL is commonly used as a database for web applications. Characteristic
of this scenario are few changes to the stored data and many identical
queries. The MySQL query cache reserves a given amount of memory for
saving queries plus their results. A script typically issues exactly the same
query (the SQL command has to be the same, byte by byte) several times,
and MySQL can use the cache to return the results instantly.
This concept works as long as nothing is changed in the database that affects
the query. The most common “dangerous” commands are INSERT/REPLACE,
UPDATE, and DELETE. When tables are modified, any relevant entries in the
query cache are flushed; no stale data is ever returned.
When using the MySQL query cache, we want to consider certain queries
not to be cached at all. There is a special command for this case that will
be discussed in Section 8.2.
8.1.2 Persistent Connections
A script that wants to access the database has to connect to the database
daemon first. This is called “establishing a link”. Usually this is done via a
TCP connection and an authentication mechanism.
The cost of connecting to a database and establishing a link depends heavily
on the environment. The most important factors for the connection speed
are the speed (and/or latency) of the network interface (which can also be
the very fast loopback interface if the DBMS resides on the same machine)
and the load on the database machine. Depending on the configuration, a
certain overhead for connecting will slow the script down.
The concept of persistent connections is somewhat similar to preforking in
Apache (see Section 4.1): a set of connections is ready to be reused without
having to go through the whole connection phase. The higher the overhead
for a connection is, the higher the gain from persistent connections will be.
The drawbacks of persistent connections are caused by their persistence.
If for some reason a link is ruined (e.g. by a connection loss or a faulty
script), it cannot be reused any more. There are concepts to detect broken
connections and re-establish them, but they cause extra overhead. A
greater problem are table locks that have been left active by mistake. A
programmer can avoid this by using a so-called “shutdown function” that
releases all locks when the script finishes.
8.1.3 Query Tuning
Apart from caching and speeding up the environment, one also has to make
the queries behave well: take care that they are not (too) wasteful.
Often indices and good database layouts can improve the speed even more
than caching techniques. In combination these techniques result in the
highest speed, of course.
Tuning queries is very application dependent, but [ZB04] gives a good
introduction and leads to good starting points for optimizing the queries.
A common starting point is the slowest queries of the application. They can
be logged automatically by MySQL if a threshold of x seconds is specified.
8.2 Preparation
The changes needed for this test can be made in the configuration files of
MySQL (for the query cache) and PHP (for generally enabling the persistent
connection feature), but it also has to be ensured that the connection
code of the scripts takes advantage of this feature.
To turn on the MySQL Query Cache (abbreviated MQC), the size of the cache
simply has to be set to a non-zero value in the MySQL configuration file
/etc/mysql/my.cnf (see also appendix, Listing A.4):
Listing 8.1: Activate MQC – /etc/mysql/my.cnf

    query_cache_size = 26214400
The size is specified in bytes. Here a cache of 25 megabytes is created.
Persistent connections are activated by enabling them in
/etc/php4/apache2/php.ini. The numbers of possible links and persistent
connections are usually set to unlimited (i.e. -1).
Listing 8.2: Activate persistent connections – /etc/php4/apache2/php.ini

    mysql.allow_persistent = On
    mysql.max_persistent = -1
    mysql.max_links = -1
Additionally we have to ensure that the scripts really use the persistent
connections. When using plain PHP we need to use the function mysql_pconnect
instead of mysql_connect to connect to the database.
Wrapper APIs (like PEAR::DB, see Section 4.3.1) need individual care. In
PEAR::DB the short construct
    $db = DB::connect($dsn, true);
works. Listing 8.3 shows a more verbose and extensible solution.
Listing 8.3: Use persistent connections with PEAR::DB – db/db.php

    require_once('DB.php');

    // Connection parameters (data source name) for PEAR::DB
    $dsn = array(
        'phptype'  => 'mysql',
        'username' => 'bandnews_org',
        'password' => 'xyz',
        'hostspec' => 'localhost',
        'database' => 'bandnews_org',
    );

    // Request a persistent connection instead of a temporary one
    $options = array(
        'persistent' => true,
    );

    $db = DB::connect($dsn, $options);
When using other Wrapper APIs the appropriate steps (usually well docu-
mented) need to be taken too, of course.
8.3 Results
For these tests, APC was turned on. This allows the MySQL query cache to
show its full potential and makes the following results the most useful
ones so far.
8.3.1 Query Cache
As can be seen in Figures 8.2–8.4 and Table 8.1, the important
scripts now really gain speed and reach interesting regions regarding the
possible requests per second. index.php moves up by 471% for 10 CCR
and even by 784% for 100 concurrent requests. links.php also gets faster
by three-digit percentages, with an increase between 104% and 178%. The
underlying pres-skel-t.php receives a similar gain.
Table 8.1: MQC benchmarking results (requests per second)

                   MQC off             MQC on
                   Concurrent requests
File               10       100        10       100
skeleton-t.php     429.61   338.48     404.08   308.76
pres-skel-t.php    9.38     9.27       56.31    53.12
index.php          3.29     2.07       18.80    18.30
links.php          5.05     3.66       10.33    10.19
The only exception is skeleton-t.php (Figure 8.1). There is no gain but a
loss instead, although the numbers can be regarded as equal due to
measurement inaccuracies: the number of tested requests was only 1,000,
and the original request rate already started around 400 requests/s.
If a test took 0.1 seconds longer than the original one for any reason
(e.g. an I/O event), i.e. 2.6s instead of 2.5s, the measured request rate
would already drop from 400 to 385 requests per second. Another point is
that due to the query cache of 25 megabytes there is less memory available.
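The sensitivity of such a short benchmark run can be checked with a one-line calculation. The function name request_rate is hypothetical; the numbers are the ones used in the paragraph above.

```php
<?php
// Request rate reported by a benchmark of $requests requests taking $seconds.
function request_rate($requests, $seconds)
{
    return $requests / $seconds;
}

printf("%.1f\n", request_rate(1000, 2.5));   // 400.0
printf("%.1f\n", request_rate(1000, 2.6));   // 384.6
```

A delay of a mere 0.1 seconds thus already accounts for a difference of about 15 requests per second at this rate.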
Table 8.2 shows, for the requests per second (rps), the corresponding
(mean) number of seconds that it takes to generate a page (gt). This
follows the simple formula $\mathit{rps} = 1/\mathit{gt}$.
Figure 8.1: MQC benchmark: skeleton-t.php (requests per second for 10 and 100 concurrent requests, MQC disabled vs. enabled; values as in Table 8.1)
Figure 8.2: MQC benchmark: pres-skel-t.php (requests per second for 10 and 100 concurrent requests, MQC disabled vs. enabled; values as in Table 8.1)
Figure 8.3: MQC benchmark: index.php (requests per second for 10 and 100 concurrent requests, MQC disabled vs. enabled; values as in Table 8.1)
Figure 8.4: MQC benchmark: links.php (requests per second for 10 and 100 concurrent requests, MQC disabled vs. enabled; values as in Table 8.1)
Table 8.2: Comparison: Requests per second – Generation time (10 CCR)

                   MQC off             MQC on
File               Rps      G. time    Rps      G. time
skeleton-t.php     429.61   0.0023s    404.08   0.0025s
pres-skel-t.php    9.38     0.107s     56.31    0.0176s
index.php          3.29     0.304s     18.80    0.053s
links.php          5.05     0.198s     10.33    0.097s
8.3.2 Persistent Connection
This test was done with an activated MySQL query cache. skeleton-t.php
was not tested since it does not use the database.
The results (Table 8.3) for this test do not show any significant change in
speed. Only index.php (Figure 8.6) profits a little. This result is not
surprising, though. None of the arguments for persistent connections really
fits our scenario: the database server resides on the same machine, there is
no network latency, and the system load is low. Nevertheless, as the use of
persistent connections does not slow down anything significantly, it is a
matter of preference whether to use them or not. The author feels more
comfortable with reusing stuff that ain’t broke (see footnote 1).
Table 8.3: Persistent connection benchmarking results (requests per second)

                   Temp. conn          Persist. conn
                   Concurrent requests
File               10       100        10       100
pres-skel-t.php    54.81    51.46      53.80    51.19
index.php          18.68    14.97      19.41    18.17
links.php          10.32    10.08      10.21    10.00
Footnote 1: Referring to the common saying (not only) among computer
scientists: If it ain’t broke, don’t fix it.
Figure 8.5: Persistent connection benchmark: pres-skel-t.php (requests per second for 10 and 100 concurrent requests, persistent connections disabled vs. enabled; values as in Table 8.3)
Figure 8.6: Persistent connection benchmark: index.php (requests per second for 10 and 100 concurrent requests, persistent connections disabled vs. enabled; values as in Table 8.3)
Figure 8.7: Persistent connection benchmark: links.php (requests per second for 10 and 100 concurrent requests, persistent connections disabled vs. enabled; values as in Table 8.3)
8.4 Conclusions for MySQL
Web applications such as Bandnews.org profit enormously from the MySQL
query cache. Especially for sequential testing the results are amazing.
According to [AB04] the overhead is minimal even for frequently changed
tables. Turning on this feature should always be considered.
The power of persistent connections did not quite show in the benchmarks.
This is primarily due to the fact that script and database run on the same
machine, so that expensive factors for establishing a link, such as network
latency, do not come into play. Still, the use of persistent connections can
be recommended. Because there are detection techniques for damaged
connections and no real limits on the number of connections, the minimal
advantages of temporary connections do not weigh much.
Chapter 9
Smarty Caching
In this section the PHP scripts will be tuned using the Smarty caching
feature (introduced in Section 4.4).
9.1 Considerations
This testing method is different from those described before. Based on the
knowledge of the available tools used until now we will modify the scripts
to receive the best results.
The tool Smarty also provides compiling and caching functionality. These
abilities will be used here.
9.1.1 Caching Page Parts
As already discussed in Section 6.1.2, it is difficult for a dynamic script to
report its last modification date since the data is retrieved from a database.
Therefore, no caching with Squid is possible.
As we defined earlier, the last modification date is the date of the
“youngest” part of the page. Furthermore, if no such date can be determined
for one part of the page, the last modification date for the whole page is
unavailable. This takes proxy servers such as Squid out of play.
If we descend a level and move the caching part to the script (for non-static
pages), we still cannot do anything useful about the part for which the last
modification date is unavailable. All other parts, however, provide such a
date, and therefore the caching process can be applied to them.
Moving the caching part to the PHP script makes it more vulnerable to bugs
regarding the delivery of outdated (stale) information. Therefore, special
care has to be taken with the implementation.
When speaking about caching in this field we usually mean a combination
of compilation and caching. This is also the approach that Smarty takes.
The template files, which follow their own syntax, are transformed into a
PHP script when they are first loaded. From that point on only the compiled
template file is accessed as long as the template file is not modified. This
can already be considered a kind of caching.
When the caching feature of Smarty is enabled, loops and sections are also
eliminated, and the output of the script (corresponding to the template file)
is stored and delivered. This causes another speed increase. For the
programming constructs to be removed, further care has to be taken, which is
discussed in the next section.
9.1.2 Database Usage
As demonstrated in Section 7.3, scripts that do not even touch the database
(such as skeleton-t.php) run considerably faster. The idea of caching only
the parts of the page that allow the detection of a last modification date
does not go far enough for that.
The concept of reducing (or even eliminating) database access can be
compared to the MySQL query cache. The idea behind the concept is as follows:
If the database did not change, the whole application (at least the parts that
rely on database selects, commonly unpersonalized pages) can be stored on
disk. A database change usually only affects certain parts of the application.
If the application is notified about the modification of a certain part of the
page, it can clear the corresponding caches and let the other parts remain
in the cache.
When done carefully, many dynamic pages can be made semi-static. Especially
the main page (index.php) is worth the effort, as it is typically the
most frequently accessed page in a web application.
The parallel to the MySQL query cache is evident (the data from SQL
queries could equally be stored in and retrieved from the query cache). If it
is turned on, the additional effort (modifying existing scripts) seems useless.
Whether this is true depends on the complexity of post-processing the data.
On the index page of Bandnews.org, for example, a news item is put
together from many tables (combining band name, genres, and news data)
and requires additional processing (beautifying URIs, search keyword
highlighting). If there were no post-processing, the MySQL query cache would
indeed suffice.
From its first generation on, the news item stays about the same (see
footnote 1). If it is stored at that point, the database is not needed at
all to display it and is therefore not even invoked.
For notifying the application there exist several concepts. For example:
• A file can be stored on the hard disk and carry the modification infor-
mation in its file modification time.
• More favourable is the direct deletion of corresponding cache parts. If

the caching functionality of Smarty is used carefully, the cache for a
single file can be spread to several directories, for example, dependent
on the section where it is used or on the given parameters.
It is important to mention that this method needs additional effort and
consideration for the administrative part of the site. All parts of an
application where we expect changes need to be aware of the caching aspect
and need to act accordingly. This can be done by centralizing the
modification code (duplicated code should be eliminated anyway), e.g. by
gluing together database call and cache clearance in one function.
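Such a central function could look like the following sketch. It is an illustration under assumptions and not the code used for Bandnews.org: the name modify_and_invalidate is hypothetical, the database write is passed in as an arbitrary callable, and the cache of the affected page part is assumed to live in a directory of its own. With Smarty, the deletion loop would typically be replaced by the library's own call for clearing a cache group.

```php
<?php
// Hypothetical helper: perform a database modification and clear the affected
// cache group in one place, so that no caller can forget the second step.
// $write     any callable doing the actual INSERT/UPDATE/DELETE
// $cache_dir directory holding the cached files of the affected page part
function modify_and_invalidate($write, $cache_dir)
{
    $result = call_user_func($write);

    // Delete all cached files of this group; they are rebuilt on the next request.
    $files = glob($cache_dir . '/*');
    if ($files === false) {
        $files = array();   // glob() may return false on error
    }
    foreach ($files as $file) {
        if (is_file($file)) {
            unlink($file);
        }
    }
    return $result;
}
```

All administrative scripts then call this one function instead of issuing the query themselves, so the cache cannot get out of sync with the database by accident.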
Footnote 1: The feature of dynamic time display (x hours and y minutes ago)
still needs some PHP processing. This small function can be applied to the
cached document.
9.2 Preparation
To easily introduce the Smarty cache, the author proposes to use a
function for including files. The function is called load and introduces the
following convention (with $file as the name of the file to be included,
without extension):
• The file inc/$file.inc.php will be loaded (and must therefore exist).
• The template file templates/$file.tpl is displayed.
We can determine stale documents either by using the last modification
date of the included file or by choosing the second possibility (see previous
section): if the cache file does not exist (i.e. if it has been deleted), the
cached copy is rebuilt.
Listing 9.1: The load function in inc/setup.inc.php

    function load($page, $cacheid = '', $load_php = true) {
        global $smarty;

        // Run the include file only if no cached copy of the template exists.
        if ($load_php && !$smarty->is_cached($page . '.tpl', $cacheid, $_SESSION['language'])) {
            include(ROOT_PATH . 'inc/' . $page . '.inc.php');
        }

        if ($page == 'newsitem') {
            // News items: replace absolute with relative times after fetching.
            include_once(ROOT_PATH . 'inc/genfunc.inc.php');
            $source = $smarty->fetch($page . '.tpl', $cacheid, $_SESSION['language']);
            $source = fixtime($source);
            echo $source;
        } else {
            $smarty->display($page . '.tpl', $cacheid, $_SESSION['language']);
        }
    }
The load function (see Listing 9.1) is quite a universal function but still
adapted for Bandnews.org. For example, there is a special branch for the
news items that dynamically replaces absolute time (e.g. “March 3, 2005
3:20 p.m.”) with relative time (e.g. “1 hour 3 minutes ago”).
It also provides support for a caching ID: a template is often displayed with
varying parameters in different contexts. This can be handled with caching
IDs. Internally such an ID represents a directory in the cache. Even
subdirectories can be specified using the pipe character (|) as separator. If
$cacheid is specified carefully, (only) related cached files are placed in the
same directory and can easily be deleted to invalidate the cache.
An important point is that the associated include file is only loaded if there
is no cached copy available. This behaviour can eliminate database calls
or expensive execution of other PHP code. If the load function were not used
(or the corresponding part, using the $smarty->is_cached method), the
PHP code would still be executed although Smarty does not even touch the
generated contents.
Listing 9.2 shows a modified version of the presentation skeleton. The last
modification code (see Listing 6.2) from pres-skel-t.php was not
reused, as it is assumed that no last modification date could be
determined anyway.
Listing 9.2: The adapted presentation skeleton – pres-skel-n.php

    <?php
    require('inc/setup.inc.php');

    // Each page part goes through load() so that it can be served from the cache.
    load('header');
    load('menu');
    load('sidebar');

    load('footer');
    ?>
9.3 Results
The benchmarks show another large improvement in speed and let the pages
be generated really fast. MySQL query cache and APC were also activated
for this test.
The skeleton (skeleton-t.php, see Figure 9.1) is once again the only script
to lose speed. Since the request rates are very high this can hardly be
felt, but it is still important to our analysis: the activation of the Smarty
cache seems to add some overhead, and there are the previously observed
speed losses due to memory occupied by the MySQL query cache.
All other scripts show high gains (compare with Table 9.1, page 103):
• pres-skel-n.php (Figure 9.2) gains between 151% and 178%: header,
menu, sidebar and footer are no longer dependent on the database.
Instead, the cached file is simply loaded and instantly returned to
the browser.
• For index.php (Figure 9.3, page 102) each of the previous points
applies. Dynamic fields remain, such as relative dates for news items,
but they are not very expensive. Additionally, 6 news items and a band
header for each item are displayed. These news items are stored in the
cache separately so that they can be shared between the common
pages (band page, index page, and search page).
Usually no connection to the database is established unless it is
explicitly needed. This is the case either for a search page or when a
user is logged in. For a logged-in user, each band header receives a
plus or minus sign for adding the band to or deleting it from the
user’s band list, and a custom area that uses the database is displayed
in the sidebar. Overall, the majority of users are not logged in and
therefore do not use the database at all. This gives them the highest
speed boost.
Gains lie between 207% and 236% for 10 and 100 concurrent requests,
respectively.
• links.php profits most from the database queries that are no longer
needed (even though they had already been sped up by the MySQL query
cache).
Figure 9.1: Smarty Caching benchmark: skeleton-t.php
[Bar chart. x-axis: concurrent requests (10, 100); y-axis: requests per
second. smarty_disabled: 386.24 and 263.13; smarty_enabled: 328.72 and
228.06.]
Figure 9.2: Smarty Caching benchmark: pres-skel-n.php
[Bar chart. x-axis: concurrent requests (10, 100); y-axis: requests per
second. smarty_disabled: 57.20 and 54.76; smarty_enabled: 158.88 and
137.22.]
Figure 9.3: Smarty Caching benchmark: index.php
[Bar chart. x-axis: concurrent requests (10, 100); y-axis: requests per
second. smarty_disabled: 20.16 and 17.15; smarty_enabled: 61.80 and
57.65.]
Figure 9.4: Smarty Caching benchmark: links.php
[Bar chart. x-axis: concurrent requests (10, 100); y-axis: requests per
second. smarty_disabled: 10.06 and 10.04; smarty_enabled: 133.48 and
117.44.]
Figure 9.4 shows the enormous gains of 1227% (for 10 CCRs) and 1070%
(for 100 CCRs).
The results profit mainly from APC. The MySQL query cache is only used
for filling the cache and for dynamic fields, if the script needs any at all.
Table 9.1: Smarty Caching benchmarking results (requests per second)

                   SC off              SC on
File               10 CCR   100 CCR    10 CCR   100 CCR
skeleton-t.php     386.24   263.13     328.72   228.06
pres-skel-n.php     57.20    54.76     158.88   137.22
index.php           20.16    17.15      61.80    57.65
links.php           10.06    10.04     133.48   117.44

SC = Smarty Caching; CCR = concurrent requests.
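The percentages quoted in this section can be recomputed from the table as
(new / old - 1) * 100. The helper below is a small sketch for checking the
arithmetic; the function name gain_percent is an assumption, and the inputs
are the requests-per-second values with Smarty Caching off and on.

```sh
#!/bin/sh
# Relative gain in percent when throughput changes from $1 to $2 requests
# per second, rounded to a whole number: (after / before - 1) * 100.
gain_percent() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.0f\n", (after / before - 1) * 100 }'
}

# Values taken from Table 9.1 (Smarty Caching off -> on).
echo "skeleton-t.php   10 CCR: $(gain_percent 386.24 328.72)%"
echo "skeleton-t.php  100 CCR: $(gain_percent 263.13 228.06)%"
echo "pres-skel-n.php  10 CCR: $(gain_percent 57.20 158.88)%"
echo "pres-skel-n.php 100 CCR: $(gain_percent 54.76 137.22)%"
echo "index.php        10 CCR: $(gain_percent 20.16 61.80)%"
echo "index.php       100 CCR: $(gain_percent 17.15 57.65)%"
echo "links.php        10 CCR: $(gain_percent 10.06 133.48)%"
echo "links.php       100 CCR: $(gain_percent 10.04 117.44)%"
```

The negative values for skeleton-t.php correspond to the speed loss
discussed above; all other scripts gain.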
9.4 Conclusions for Smarty Caching
This section shows how much a well-written PHP script can achieve. When
the cache is used carefully and is correctly cleared whenever data is
manipulated, another enormous speed-up is possible. This is especially
true for the pages that could not really be tuned by external means,
index.php and links.php.
An important point of this form of caching is that only a part of the
previously used caching methods remains active. For example, the MySQL
query cache is only used for generating and filling the cache (this applies
to the tested pages).
Chapter 10
Conclusions
In this thesis we tested several caching techniques, each concentrating on a
different aspect of the generation and distribution of dynamically generated
pages belonging to a web application.
The results show that a combination of the demonstrated techniques leads
to a useful result. All in all, speed gains of up to 2,240% (index.php without
caching vs. with Smarty Caching, see Table 10.1 and Figure 10.1, page 107)
are possible if the application is tuned carefully. This applies to pages that
have already been written with some care for speed: the original page
already rendered in 0.5 seconds, but now does the same in less than 0.1s.
While the external caching methods (Squid, APC, and MySQL query cache)
do not need very much effort by the programmer, Smarty Caching requires
either an application design with this option in mind or a redesign.
Table 10.1: Overall benchmarking results (requests/s)

File (.php)   No Tuning     Squid      APC    MySQL   Smarty
skeleton          84.69   3868.60   409.11   404.08        –
pres-skel          5.70   2523.40     9.01    56.31   158.88
index              2.64      2.64     3.25    18.80    61.80
10.1 Further Work
The work for this thesis was carried out on a single desktop PC only. Its
results already show the potential of caching. Having the three main
programs (proxy, web, and database server) reside on the same machine is
not very favourable (but very common).
Further speed-up can be achieved by moving each service to its own
machine. The proxy server provides (as already mentioned) means for load
balancing, so the web server can be set up redundantly. The database
server can be split into several servers that are clustered, or at least into
master and slave databases with the slaves handling search operations.
A network of computers with the needed software already requires quite a
budget. In contrast, everything used in the work on this thesis is software
based and built on freely available open source software.
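The master/slave split can be illustrated by a routing rule that sends
read-only statements to a slave and everything else to the master. The
following is a minimal sketch under stated assumptions: the host names, the
function route_query and the checksum-based choice of a slave are invented
for this example and were not part of the tested setup.

```sh
#!/bin/sh
# Sketch only: decide which database host should receive a statement.
# Host names and the function name are hypothetical.
MASTER_HOST=db-master.example
SLAVE_HOSTS="db-slave1.example db-slave2.example"

# SELECT statements go to one of the slaves, chosen by a checksum of the
# statement text; all other statements go to the master.
route_query() {
    statement=$1
    verb=$(printf '%s\n' "$statement" | awk '{ print toupper($1); exit }')
    if [ "$verb" = "SELECT" ]; then
        sum=$(printf '%s' "$statement" | cksum | cut -d ' ' -f 1)
        set -- $SLAVE_HOSTS          # positional parameters = slave list
        shift $((sum % $#))          # skip 0 .. (number of slaves - 1) hosts
        echo "$1"
    else
        echo "$MASTER_HOST"
    fi
}

route_query "SELECT * FROM news WHERE band_id = 42"
route_query "UPDATE news SET title = 'x' WHERE id = 1"
```

Such a rule only works if slightly stale search results are acceptable,
since slaves lag behind the master by the replication delay.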
Figure 10.1: Overall Benchmarking results

Appendix A
File Sources
A.1 Benchmark Script
Listing A.1 shows the source code of the benchmarking script which was
developed for this thesis. The script takes patch files (see A.2, page 112) as
parameters.
Listing A.1: Benchmark – benchmark.sh

#!/bin/sh

# configuration

if [ -z "$EXTERNAL_CONFIG" ]; then
    PRETEST_NUMCONNS=1
    PRETEST_SLEEP=5
    NUMCONNS=1000
    CONCURR=100
    FILES="skeleton.php pres-skel.php index.php links.php"
fi
COUNT=0

# check for correct parameter count
check_params() {
    if [ -z "$1" ]; then
        echo "please specify at least one patch script"
        exit 1
    fi

    until [ -z "$1" ]; do
        if [ ! -e "$1" ]; then
            echo "patch file $1 could not be found."
            exit 1
        fi
        shift
    done
}

# all config files are prepared, run the test
run_benchmark() {
    LOGFILE=$1.log
    CHARTFILE=../$1.chart
    # empty files
    rm -f $LOGFILE $CHARTFILE

    echo ""
    echo $1

    # restart programs to have fair results (need to shut down all of
    # them first)
    sudo /etc/init.d/apache2 stop
    sudo /etc/init.d/mysql stop
    sudo /etc/init.d/squid stop
    sudo /etc/init.d/apache2 start
    sudo /etc/init.d/mysql start
    sudo /etc/init.d/squid start
    sudo rm -rf /var/www/bandnews/cache/*

    echo starting tests..

    # create a chart file as input for gnuplot
    echo $1 > $CHARTFILE
    echo File Requests_per_second >> $CHARTFILE

    # test each of these files
    for f in $FILES; do
        echo -n testing $f...
        # warm-up request(s) before the measured run
        if [ $PRETEST_NUMCONNS -gt 0 ]; then
            ab -n $PRETEST_NUMCONNS -c $CONCURR http://localhost/$f > /dev/null
            sleep $PRETEST_SLEEP
        fi

        ab -t 60 -n $NUMCONNS -c $CONCURR -H 'Accept-Encoding: gzip' http://localhost/$f >> $LOGFILE.$f
        echo finished.
        # extract the "Requests per second" value from the ab output
        REQSEQ=`grep "Requests per" $LOGFILE.$f | awk '{print $4}'`

        cat $LOGFILE.$f >> $LOGFILE
        rm -f $LOGFILE.$f

        echo $f $REQSEQ >> $CHARTFILE

    done
}

run() {
    # POSIX arithmetic instead of the bash-only "let"
    COUNT=$(($COUNT + 1))
    RUN=1
    if [ "$RUN_ONLY" ] && [ $COUNT -ne $RUN_ONLY ]; then
        RUN=0
    fi
    if [ $RUN -eq 1 ]; then
        run_benchmark $1
    fi
}

# the config files are prepared here and start the benchmark when done
test_run() {
    # the identifier is used to
    local identifier=$1
    shift
    local cur=$1
    shift

    # first pass: current patch not applied (suffix 0)
    if [ -n "$1" ]; then
        test_run ${identifier}_${cur}0 "$@"
    else
        run ${identifier}_${cur}0
    fi

    sudo patch -p0 -i $cur

    # second pass: current patch applied (suffix 1)
    if [ -n "$1" ]; then
        test_run ${identifier}_${cur}1 "$@"
    else
        run ${identifier}_${cur}1
    fi

    # undo the patch
    sudo patch -R -p0 -i $cur
}

check_params "$@"

# include the test parameters in the file name
test_run "${PRETEST_NUMCONNS}-${PRETEST_SLEEP}-${NUMCONNS}-${CONCURR}" "$@"
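The recursion in test_run runs the benchmark once for every combination of
unapplied (suffix 0) and applied (suffix 1) patches, that is 2^n runs for n
patch files. The dry run below sketches only this enumeration:
enumerate_runs is a hypothetical stand-in that prints the run identifiers
instead of patching configuration files or calling ab.

```sh
#!/bin/sh
# Dry run of the recursion in test_run: prints the identifier of every
# benchmark run. The function body is a subshell "( ... )", so its
# variables and positional parameters are local to each call.
enumerate_runs() (
    identifier=$1
    cur=$2
    shift 2
    for state in 0 1; do          # 0 = patch not applied, 1 = applied
        if [ -n "$1" ]; then
            # more patch files left: recurse with the extended identifier
            enumerate_runs "${identifier}_${cur}${state}" "$@"
        else
            echo "${identifier}_${cur}${state}"
        fi
    done
)

enumerate_runs 1-5-1000-100 squid apc
# prints:
# 1-5-1000-100_squid0_apc0
# 1-5-1000-100_squid0_apc1
# 1-5-1000-100_squid1_apc0
# 1-5-1000-100_squid1_apc1
```

Each identifier becomes the base name of a log file and a chart file, so
the file name records which patches were active during that run.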
A.2 Patch Files
Listing A.2: Squid patch file – squid

*** /etc/squid/squid.conf  2005-04-12 09:13:43.965791648 +0200
--- squid.conf  2005-04-12 09:14:07.775172072 +0200
***************
*** 50,56 ****
  # visible on the internal address.
  #
  # Default:
! # http_port 3128
  
  # TAG: https_port
  # Note: This option is only available if Squid is rebuilt with the
--- 50,56 ----
  # visible on the internal address.
  #
  # Default:
! http_port 80
  
  # TAG: https_port
  # Note: This option is only available if Squid is rebuilt with the
***************
*** 1844,1850 ****
  # of your access lists to avoid potential confusion.
  #
  # Default:
! # http_access deny all
  #
  # Recommended minimum configuration:
  #
--- 1844,1850 ----
  # of your access lists to avoid potential confusion.
  #
  # Default:
! http_access allow all
  #
  # Recommended minimum configuration:
  #
***************
*** 2182,2188 ****
  # the 'httpd_accel_with_proxy' option.
  #
  # Default:
! # httpd_accel_port 80
  
  # TAG: httpd_accel_single_host on|off
  # If you are running Squid as an accelerator and have a single backend
--- 2182,2189 ----
  # the 'httpd_accel_with_proxy' option.
  #
  # Default:
! httpd_accel_host 127.0.0.1
! httpd_accel_port 81
  
  # TAG: httpd_accel_single_host on|off
  # If you are running Squid as an accelerator and have a single backend
***************
*** 2211,2217 ****
  # setting)
  #
  # Default:
! # httpd_accel_with_proxy off
  
  # TAG: httpd_accel_uses_host_header on|off
  # HTTP/1.1 requests include a Host: header which is basically the
--- 2212,2218 ----
  # setting)
  #
  # Default:
! httpd_accel_with_proxy on
  
  # TAG: httpd_accel_uses_host_header on|off
  # HTTP/1.1 requests include a Host: header which is basically the
***************
*** 2231,2237 ****
  # require the Host: header will not be properly cached.
  #
  # Default:
! # httpd_accel_uses_host_header off
  
  # TAG: httpd_accel_no_pmtu_disc on|off
  # In many setups of transparently intercepting proxies Path-MTU
--- 2232,2238 ----
  # require the Host: header will not be properly cached.
  #
  # Default:
! httpd_accel_uses_host_header on
  
  # TAG: httpd_accel_no_pmtu_disc on|off
  # In many setups of transparently intercepting proxies Path-MTU
*** /etc/apache2/ports.conf  2005-04-12 09:13:43.963791952 +0200
--- ports.conf  2005-04-12 09:13:24.624731936 +0200
***************
*** 1 ****
! Listen 80
--- 1 ----
! Listen 81
Listing A.3: APC patch file – apc

*** /etc/php4/apache2/php.ini  2005-04-07 16:49:07.454799952 +0200
--- php.ini  2005-04-07 16:50:32.645848944 +0200
***************
*** 1077,1080 ****
  ; End:
  extension=curl.so
  extension=mysql.so
! ;extension=apc.so
--- 1077,1080 ----
  ; End:
  extension=curl.so
  extension=mysql.so
! extension=apc.so
Listing A.4: MySQL Query Cache patch file – mqc

*** /etc/mysql/my.cnf  2005-04-07 17:16:04.555963208 +0200
--- my.cnf  2005-04-07 17:27:01.430103160 +0200
***************
*** 56,62 ****
  # Query Cache Configuration
  #
  query_cache_limit = 1048576
! query_cache_size = 0
  query_cache_type = 1
  #
  # Here you can see queries with especially long duration
--- 56,62 ----
  # Query Cache Configuration
  #
  query_cache_limit = 1048576
! query_cache_size = 26214400
  query_cache_type = 1
  #
  # Here you can see queries with especially long duration
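The sizes in this patch are plain byte counts. As a quick orientation, the
following sketch converts them to MiB; the helper name bytes_to_mib is an
assumption made for this example.

```sh
#!/bin/sh
# Convert a byte count from my.cnf into MiB (1 MiB = 1048576 bytes).
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

echo "query_cache_size:  $(bytes_to_mib 26214400) MiB"
echo "query_cache_limit: $(bytes_to_mib 1048576) MiB"
```

The patch therefore raises the query cache from 0 bytes (cache disabled) to
25 MiB, while results larger than 1 MiB are still not cached.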
Listing A.5: Persistent connection patch file – persist

*** /etc/php4/apache2/php.ini  2005-04-24 13:15:06.384164796 +0200
--- php.ini  2005-04-24 13:14:15.499717762 +0200
***************
*** 595,601 ****
  
  [MySQL]
  ; Allow or prevent persistent links.
! mysql.allow_persistent = Off
  
  ; Maximum number of persistent links. -1 means no limit.
--- 595,601 ----
  
  [MySQL]
  ; Allow or prevent persistent links.
! mysql.allow_persistent = On
  
  ; Maximum number of persistent links. -1 means no limit.
*** /var/www/bandnews/db/db.php  2005-04-26 09:48:44.140531952 +0200
--- db.php  2005-04-26 09:48:52.170311240 +0200
***************
*** 12,18 ****
  
  'debug' => 0,
! 'persistent' => false,
  );
  
--- 12,18 ----
  
  'debug' => 0,
! 'persistent' => true,
  );
  
Listing A.6: Smarty Caching patch file – persist

*** /var/www/bandnews/inc/smarty.inc.php  2005-04-26 12:54:03.921068360 +0200
--- smarty.inc.php  2005-04-26 12:53:47.958495040 +0200
***************
*** 10,16 ****
  $this->config_dir = ROOT_PATH . "/config/";
  $this->register_block('dynamic', 'smarty_block_dynamic', false);
  $this->register_modifier('convert_to_class', 'smarty_modifier_convert_to_class', false);
! $this->caching = false;
  $this->cache_lifetime = -1;
  $this->use_sub_dirs = true;
  $this->security = false;
--- 10,16 ----
  $this->config_dir = ROOT_PATH . "/config/";
  $this->register_block('dynamic', 'smarty_block_dynamic', false);
  $this->register_modifier('convert_to_class', 'smarty_modifier_convert_to_class', false);
! $this->caching = true;
  $this->cache_lifetime = -1;
  $this->use_sub_dirs = true;
  $this->security = false;
B List of Figures
3.1 Screenshot of Bandnews.org . . . . . . . . . . . . . . . . . . . 20
3.2 Screenshot of myBandnews while selecting personal bands . . 22
4.1 The MVC design pattern . . . . . . . . . . . . . . . . . . . . 33
4.2 Three-tier architecture . . . . . . . . . . . . . . . . . . . . . . 34
4.3 Screenshot of the output of the alternating backgrounds example . . 36
4.4 PHP script execution . . . . . . . . . . . . . . . . . . . . . . . 40
4.5 Script execution with compiler cache . . . . . . . . . . . . . . 41
5.1 Processing a Request . . . . . . . . . . . . . . . . . . . . . . . 51
5.2 Script layers of an application script . . . . . . . . . . . . . . 56
5.3 Typical output while benchmarking . . . . . . . . . . . . . . . 61
6.1 Squid benchmark: skeleton-t.php . . . . . . . . . . . . . . . . 72
6.2 Squid benchmark: pres-skel-t.php . . . . . . . . . . . . . . . . 73
6.3 Squid benchmark: index.php . . . . . . . . . . . . . . . . . . 73
7.1 APC benchmark: skeleton-t.php . . . . . . . . . . . . . . . . 79
7.2 APC benchmark: pres-skel-t.php . . . . . . . . . . . . . . . . 80
7.3 APC benchmark: index.php . . . . . . . . . . . . . . . . . . . 80
7.4 APC benchmark: links.php . . . . . . . . . . . . . . . . . . . 81
7.5 APC benchmark: Skeletons with(out) ob_start() (10 CCR) 83
7.6 APC benchmark: Skeletons with(out) ob_start() (100 CCR) 83
8.1 MQC benchmark: skeleton-t.php . . . . . . . . . . . . . . . . 90
8.2 MQC benchmark: pres-skel-t.php . . . . . . . . . . . . . . . . 90
8.3 MQC benchmark: index.php . . . . . . . . . . . . . . . . . . 91
8.4 MQC benchmark: links.php . . . . . . . . . . . . . . . . . . . 91
8.5 Persistent connection benchmark: pres-skel-t.php . . . . . . . 93
8.6 Persistent connection benchmark: index.php . . . . . . . . . . 93
8.7 Persistent connection benchmark: links.php . . . . . . . . . . 94
9.1 Smarty Caching benchmark: skeleton-t.php . . . . . . . . . . 101
9.2 Smarty Caching benchmark: pres-skel-n.php . . . . . . . . . . 101
9.3 Smarty Caching benchmark: index.php . . . . . . . . . . . . 102
9.4 Smarty Caching benchmark: links.php . . . . . . . . . . . . . 102
10.1 Overall Benchmarking results . . . . . . . . . . . . . . . . . . 107

C List of Tables
6.1 Benchmarking results (Requests per second): Without Squid 71
6.2 Benchmarking results (Requests/s): With Squid . . . . . . . 71
6.3 Benchmarking results (Requests/s): With Squid (cont.) . . . 72
6.4 Benchmarking results (Requests per second): index.php . . . 74
7.1 APC Benchmarking results (Requests/s) . . . . . . . . . . . . 79
7.2 Output benchmarking results (Requests/s) . . . . . . . . . . 82
8.1 MQC Benchmarking results (Requests per second) . . . . . . 89
8.2 Comparison: Requests per second – Generation time . . . . . 92
8.3 Persistent connection benchmarking results (R/s) . . . . . . . 92
9.1 Smarty Caching Benchmarking results (Requests per second) 103
10.1 Overall Benchmarking results (Requests/s) . . . . . . . . . . 105
D List of Listings
2.1 Output of the uptime command . . . . . . . . . . . . . . . . 15
2.2 A part of the output of the top command . . . . . . . . . . . 15
4.1 Hello World in PHP – helloworld.php . . . . . . . . . . . . 27
4.2 Hello World in Smarty – hello.tpl . . . . . . . . . . . . . . 34
4.3 Hello World in Smarty – hello.php . . . . . . . . . . . . . . 34
4.4 Highlighting alternating lines – alternate.tpl . . . . . . . . 35
4.5 Highlighting alternating lines – alternate.php . . . . . . . . 35
5.1 A Presentation Skeleton – pres-skel.php . . . . . . . . . . . 57
5.2 Creating a patch file for the MySQL query cache . . . . . . . 60
5.3 Program versions . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.1 Last modification check . . . . . . . . . . . . . . . . . . . . . 66
6.2 Last modification check adapted – pres-skel-t.php . . . . . 67
6.3 A reduced Squid configuration file – /etc/squid/squid.conf 70
7.1 Generate lengthy random output . . . . . . . . . . . . . . . . 78
8.1 Activate MQC – /etc/mysql/my.cnf . . . . . . . . . . . . . 87
8.2 Activate persistent connections – /etc/php4/apache2/php.ini 87
8.3 Use persistent connections with PEAR::DB – db/db.php . . . 88
9.1 The load function in inc/setup.inc.php . . . . . . . . . . . 98
9.2 The adapted presentation skeleton – pres-skel-n.php . . . . 99
A.1 Benchmark – benchmark.sh . . . . . . . . . . . . . . . . . . . 109
A.2 Squid patch file – squid . . . . . . . . . . . . . . . . . . . . . 112
A.3 APC patch file – apc . . . . . . . . . . . . . . . . . . . . . . . 115
A.4 MySQL Query Cache patch file – mqc . . . . . . . . . . . . . 115
A.5 Persistent connection patch file – persist . . . . . . . . . . . 116
A.6 Smarty Caching patch file – persist . . . . . . . . . . . . . . 117

