carried out at the
Institute of Computer Languages
of the Vienna University of Technology
by
Alexander Kirk
Stolberggasse 12/12, 1050 Wien
May 9, 2005
Date Signature
Abstract
In this thesis we discuss the problem of web applications that have to work
under the heavy load caused by a large number of visitors. We evaluate the
application Bandnews.org as an example and tune it using various caching
strategies: caching by a proxy server, a compiler cache, database caching
using a query cache, and application-based caching using Smarty.
This work shows that a gain in speed is possible if the methods are applied
carefully. We compare and combine caching strategies to reach a stage where
every page is generated in reasonable time even under high load.
Contents
1 Introduction
1.1 Motivation
1.2 Method
1.3 Expected Results
1.4 Outline of the Thesis
2 Terms
2.1 Caching
2.1.1 Invalidation
2.1.2 Privacy
2.2 Load
2.2.1 Using the uptime command
2.2.2 Using the top command
2.2.3 Load Averages
I Environment
3 Application
3.1 Bandnews.org
3.1.1 Technology
3.1.2 Page Structure
3.1.3 myBandnews
3.1.4 BandnewsCMS
4 Tools
4.1 Apache
4.1.1 History
4.1.2 Features
4.1.3 Alternatives
4.2 PHP
4.2.1 History
4.2.2 Language Basics and Structure
4.2.3 Integration with the web server
4.2.4 Additional Libraries
4.2.5 Alternatives
4.3 MySQL
4.3.1 PEAR::DB
4.3.2 Query Cache
4.3.3 Alternatives
4.4 Smarty
4.4.1 Template Basics
4.4.2 Alternatives
4.5 Squid
4.5.1 Use cases
4.5.2 HTTP Acceleration
4.5.3 Alternatives
4.6 Advanced PHP Cache
4.6.1 Concept
4.6.2 Alternatives
4.7 Advanced PHP Debugger
4.7.1 Debugging
4.7.2 Profiling
4.7.3 Alternatives
4.8 ApacheBench ab
4.8.1 Alternatives
5 Evaluation
5.1 Goal definition
5.2 Processing a Request
5.3 Possible Hooking Points
5.3.1 Client Request
5.3.2 PHP Module
5.3.3 Database
5.3.4 Application
5.4 Bandnews.org
6 Squid
6.1 Considerations
6.1.1 Caching of whole pages
6.1.2 Programmer’s view
6.1.3 Expected Results
6.2 Preparation
6.2.1 Configuring Apache
6.2.2 Configuring Squid
6.3 Results
6.3.1 skeleton-t.php
6.3.2 pres-skel-t.php
6.3.3 index.php
6.4 Conclusions for Squid
7 APC
7.1 Considerations
7.1.1 Compiler Cache
7.1.2 Code Optimization
7.1.3 Outputting Data
7.1.4 Programmer’s View
7.2 Preparation
7.2.1 Output Buffering
7.3 Results
7.3.1 Results for output testing
7.4 Conclusions for APC
8 MySQL
8.1 Considerations
8.1.1 MySQL Query Cache
8.1.2 Persistent Connections
8.1.3 Query Tuning
8.2 Preparation
8.3 Results
8.3.1 Query Cache
8.3.2 Persistent Connection
8.4 Conclusions for MySQL
9 Smarty Caching
9.1 Considerations
9.1.1 Caching Page Parts
9.1.2 Database Usage
9.2 Preparation
9.3 Results
9.4 Conclusions for Smarty Caching
10 Conclusions
10.1 Further Work
References
Chapter 1
Introduction
1.1 Motivation
As the Internet, and the World Wide Web (WWW) in particular, gains more and
more popularity, servers have to handle a growing number of requests. The
more people (or simply clients) request resources (in this case files) from
web servers, the faster servers have to accept and process the requests. To
cope with these requirements, programmers as well as system administrators
must take countermeasures.
Since the very beginning of the WWW the requirements for servers have
changed not only in terms of traffic but also in the type of content
they deliver to the client. Initially static pages had to be served; today,
in 2005, content is usually taken from a database and pages are generated
dynamically before they are transferred.
This development takes the main source of load away from the operating
system, which is responsible for reading files from the hard disk or another
type of memory, and shifts it to the program that dynamically generates the
page.
Computer hardware has evolved as well, which makes it possible to generate
web pages the way they are today. Generally speaking, servers are
capable of serving most pages in a reasonable amount of time. This is
true as long as only a small number of visitors request pages to be generated.
The larger the number of clients, the more pages have to be generated
simultaneously. Multi-tasking enables servers to do so, but CPU capacity is
limited.
If it were up to system administrators alone, they would add more hardware
power (for instance by clustering servers or load balancing). Often this can
be done only to a certain extent, mainly for financial but also for logistical
reasons. From a programmer’s view, however, an application can be sped up by
optimizing algorithms (consider an algorithm in O(n²) on a fast computer, which
can easily be overtaken by a slower computer running an O(n) algorithm) and
also by caching techniques.
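The difference between a quadratic and a linear algorithm can be illustrated with a small sketch. The task (detecting a duplicate in a list of strings) and both function names are our own illustrative assumptions and are not taken from Bandnews.org.

```php
<?php
// O(n^2): compare every element with every later element.
function hasDuplicateQuadratic(array $items): bool
{
    $items = array_values($items); // ensure keys 0..n-1
    $n = count($items);
    for ($i = 0; $i < $n; $i++) {
        for ($j = $i + 1; $j < $n; $j++) {
            if ($items[$i] === $items[$j]) {
                return true;
            }
        }
    }
    return false;
}

// O(n) on average: remember every value seen so far as a hash table key.
// Assumes the items are strings, so that they can be used as array keys.
function hasDuplicateLinear(array $items): bool
{
    $seen = array();
    foreach ($items as $item) {
        if (isset($seen[$item])) {
            return true;
        }
        $seen[$item] = true;
    }
    return false;
}
```

Both functions return the same answer for a list of strings, but the number of comparisons in the first one grows with the square of the list length, which no hardware upgrade compensates for once the input is large enough.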
The basis for this diploma thesis will be the analysis of caching strategies
for this scenario. They will be used to speed up an existing application. The
combination of various methods will be tested and benchmarked to reach
a stage at which the application runs at reasonable speed even under high
load.
1.2 Method
We will explore the topic of this thesis using an existing web site
(Bandnews.org) as an example to which the caching strategies are applied.
We simulate high load on the page using a load generator which effectively
makes the server deliver pages simultaneously.
We examine single pages using a profiler – a tool that measures not only
the overall performance of page generation, but also the time consumed by
single function calls.
1.3 Expected Results
As a result of this work we expect a web application that delivers pages
several times faster than an uncached version of the site (considering repeated
calls, so that caching is taken into account).
Because methods for revealing bottlenecks within the application are used as
well, faster delivery is also expected for the first call of a web page. This
is only considered a side effect. The thesis will concentrate on caching pages
or parts of pages.
1.4 Outline of the Thesis
In the first part we present the application as well as the tools used. The
web site Bandnews.org (see Section 3.1) was chosen as the application. The
tools used are the Apache HTTP Server (Section 4.1), PHP (Section 4.2), MySQL
(Section 4.3), Smarty (Section 4.4), Squid (Section 4.5), APC (Section 4.6),
APD (Section 4.7), and ab (Section 4.8).
The second part, the central part of this diploma thesis, describes and
evaluates the caching strategies to be applied.
In Section 5 we test the original site and choose pages for later evaluation.
The following sections deal with each technique in detail and provide
benchmarking results, which are analyzed and discussed. These sections cover
Squid (Section 6), APC (Section 7), MySQL (Section 8), and Smarty Caching
(Section 9).
In the conclusion (Section 10) we review the results as a whole. Section 10.1
gives an outlook on how future work can further improve the performance.
The appendix includes source listings and lists of figures, tables and
listings.
Chapter 2
Terms
2.1 Caching
Caching (noun: cache, from the French word cacher, to hide) is the temporary
storage of data for later retrieval. This approach requires a certain
persistence of the data to be stored. The motivation for caches is a gain
in speed without delivering outdated content. The gain in speed manifests
itself in three points (compare with [Wes01]).
Quick retrieval requires fast memory. That is why hardware caches often use
small and expensive, but fast, memory.
2.1.1 Invalidation
Both strategies have their advantages. Choosing the right one depends on the
circumstances. Proxies with command-based invalidation need very little logic,
but are also very susceptible to delivering invalid data, for example if an
invalidation command gets lost for some reason.
The second approach requires more intelligence in the proxy, but minimizes
the delivery of old data. Although it is difficult to define a lifetime for a
given cached object, it is quite easy to implement a last-modification
check which compares the version available in the cache to the “real” one.
This can be done every few times the resource is requested or, if the check
is inexpensive, upon each retrieval.
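A last-modification check of this kind can be sketched in a few lines. The following is a minimal illustration, assuming that the “real” resource is a file and that the cache is a plain PHP array; the function name cachedRead and the cache layout are hypothetical.

```php
<?php
// Return the file contents, serving them from $cache as long as the
// file's modification time has not changed since the entry was stored.
function cachedRead(string $path, array &$cache): string
{
    clearstatcache(true, $path);      // make sure filemtime() is fresh
    $mtime = filemtime($path);

    if (isset($cache[$path]) && $cache[$path]['mtime'] === $mtime) {
        return $cache[$path]['value']; // cached copy is still valid
    }

    // Entry is missing or outdated: read the "real" version and store it.
    $value = file_get_contents($path);
    $cache[$path] = array('value' => $value, 'mtime' => $mtime);
    return $value;
}
```

The check is performed on every retrieval here because reading a timestamp is inexpensive. Note the weakness of the approach: a change that leaves the modification time untouched goes unnoticed and the old content is delivered.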
2.1.2 Privacy
A dangerous field for caching is the privacy of data. Often the content to be
stored includes sensitive data which has to remain uncached. Although
this issue should be avoided by using encryption, it is quite common,
especially in the WWW, to transmit user-specific data through an insecure,
plain text channel.
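One way an application can keep user-specific pages out of shared caches is the HTTP Cache-Control header. The sketch below only builds the header line; the function name and the decision rule (a single boolean flag) are simplifying assumptions, and a real application would pass the result to header().

```php
<?php
// Build a Cache-Control header line for a response.
// $userSpecific: true if the page contains data of one particular user.
// $maxAgeSeconds: how long shared caches may keep a public page.
function cacheControlFor(bool $userSpecific, int $maxAgeSeconds): string
{
    if ($userSpecific) {
        // Shared caches such as a proxy must not store this response.
        return 'Cache-Control: private, no-store';
    }
    return 'Cache-Control: public, max-age=' . $maxAgeSeconds;
}
```

For example, a personalized page would be sent with cacheControlFor(true, 0), while a public listing that may be five minutes old would use cacheControlFor(false, 300).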
2.2 Load
The topic of this diploma thesis includes the technical term “load”.
Although most IT professionals know what load is, they would not be able to
define it clearly.
In Unix the load is usually expressed by three values that can be displayed,
for example, with the uptime command.
According to the man page uptime(1) these are “the system load averages
for the past 1, 5, and 15 minutes.” These values therefore do not represent
the current load of the system but mean values over three periods of time.
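To use these three values in a script, the output of uptime has to be parsed. The following sketch assumes a Linux-style output line; the sample line and the function name parseLoadAverages are illustrative assumptions, not output taken from the test server.

```php
<?php
// Extract the 1, 5 and 15 minute load averages from a line printed by uptime.
// Returns null if the line does not contain load averages.
function parseLoadAverages(string $uptimeLine): ?array
{
    $pattern = '/load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)/';
    if (!preg_match($pattern, $uptimeLine, $m)) {
        return null;
    }
    return array(
        '1min'  => (float) $m[1],
        '5min'  => (float) $m[2],
        '15min' => (float) $m[3],
    );
}

// Hypothetical sample line in the format printed by uptime on Linux.
$sample = ' 14:02:11 up 12 days,  3:41,  2 users,  load average: 0.42, 0.36, 0.30';
$load = parseLoadAverages($sample);
```

On Unix-like systems PHP also offers the built-in function sys_getloadavg(), which returns the same three values without calling an external command.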
The current load of the system can be shown by using the top command (see
listing 2.2). It also includes the load averages, but additionally displays
percentages for the CPU load.
The current occupation of the CPU is split into 7 parts to give a more precise
overview. The abbreviations mean the following:
us is short for user and describes the amount of time the CPU spends in
user mode (a kind of safe mode for user programs, see [Arc03]).
sy is short for system – the amount of time the CPU spends in kernel
mode (e.g. for operating with hardware).
ni is short for nice, specifying the time being used for processes with
lower priority (e.g. started using the command nice).
wa is short for I/O wait – the amount of time the system is waiting for
an I/O device such as a hard disk.
si and hi are short for software and hardware interrupt: the time the processor
spends dealing with such signals.
Although the top command gives a quite intuitive overview of the current
system load, load averages are highly important for diagnosing the “working”
load of a system. The current state is not very informative when
the system does not respond due to a highly CPU-intensive process; the load
average can still be helpful, even afterwards.
Interestingly, there does not seem to be a single, valid definition of how the
load average is calculated and how it should be interpreted. According to
[Gun03] the load averages are constructed using the CPU run queue and
the number of jobs currently running on the CPU. The article describes
the calculation in more detail, and a second part goes into even greater detail.
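As a rough illustration of how such an average can be maintained, the sketch below uses an exponentially damped moving average that is updated at a fixed sampling interval. This is the scheme commonly described for the Linux kernel; the 5 second interval, the function name and the simulated workload are assumptions for this example and are not quoted from [Gun03].

```php
<?php
// One update step of an exponentially damped load average.
// $previous:      load average before this sample
// $activeTasks:   number of tasks running or waiting in the run queue
// $windowSeconds: averaging window (60, 300 or 900 for 1, 5, 15 minutes)
// $sampleSeconds: time between two samples (assumed to be 5 seconds)
function updateLoadAverage(
    float $previous,
    int $activeTasks,
    float $windowSeconds,
    float $sampleSeconds = 5.0
): float {
    $decay = exp(-$sampleSeconds / $windowSeconds);
    return $previous * $decay + $activeTasks * (1.0 - $decay);
}

// Simulate one minute with two permanently active tasks on an idle system.
$load = 0.0;
for ($t = 0; $t < 60; $t += 5) {
    $load = updateLoadAverage($load, 2, 60.0);
}
// $load is now 2 * (1 - e^-1), roughly 1.26, and still approaching 2.
```

The example shows why the 1 minute value lags behind reality: after a full minute of constant load 2 the average has only reached about 1.26.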
Part I
Environment
Chapter 3
Application
In this section we describe the application chosen for this diploma thesis. It
includes a list of important criteria which led to the decision to choose
this application. These criteria do not apply solely to this web site, so many
of the methods described will have similar effects on other pages.
3.1 Bandnews.org
This web site is very suitable for the thesis, as the following points apply:
3.1.1 Technology
News aggregation is done by a robot program that fetches news by
downloading the news pages (specified as a “feed”) from the bands’ websites
(which is done with the tool curl). Because this diploma thesis
concentrates on caching strategies for the main site, the retrieval process
is not described in detail.
To the left there is the primary navigation: the search form, the band
selector (a two-step drop-down for selecting a band), and a listing of genres
(which highlights the current genre and shows sub-genres if there are any). A
news language selector (which controls the language of the news items
displayed), a list of recently added bands and statistics can also be found
on the left.
The right bar (internally called “side bar”) is part of the myBandnews
navigation. Site news (e.g. artist of the week, new features) is also displayed
there, as well as a Top Ten list which shows the bands clicked most often.
3.1.3 myBandnews
The news is also available as an RSS (Rich Site Summary, nowadays often
called Really Simple Syndication) feed which can be used to receive news
alerts quickly: there are many third-party tools (for the client) that check
RSS feeds periodically and alert the user when new items appear.
3.1.4 BandnewsCMS
One of the bands using this feature is “When the Music’s over”
(http://www.whenthemusicsover.com/ or
http://bandnews.org/homepage/When+The+Music’s+Over/).
Chapter 4
Tools
In this section we describe the tools which will be used in this thesis. Their
order does not reflect their later use; it was chosen to aid
understanding. If not stated otherwise, the tools are Open Source and
licensed under the GNU General Public License (GPL).
4.1 Apache
The web server used in this thesis is the Apache HTTP Server. According
to [Net05] it is the web server software used on most hosts today.
4.1.1 History
Originally Apache was a group of patches for the NCSA daemon, only a few
of which were developed by the newly founded Apache Group.
Soon it became evident, though, that the basis lacked extensibility, and so the
server was redeveloped from scratch with a new design. Apache 1.0 was
released on December 1, 1995.
As early as April 1996 the Apache web server moved to first place in web
server popularity.
The versions of Apache in use today are 1.3 and 2.0. While 1.3 was an
evolution of the first version, extended with various modules, version 2.0 was
once again a new design intended to meet today’s requirements for web
servers in the World Wide Web. Among its features are better
integration of the POSIX thread system and native IPv6 support.
Apache runs on several platforms, including Unix-based systems and
Windows NT. Still, when referring to an Apache web server, one commonly refers
to a Unix or, even more often, a Linux system. The term LAMP (Linux,
Apache, MySQL, PHP; the environment used in this thesis) was coined
to describe this very common configuration.
The Apache Group has meanwhile taken on several other projects that
are very important in the field of Open Source software. Among
the most important are the Jakarta Tomcat web server (for Java-based
applications), Ant (a build system, not only for Java projects), and
Struts (a framework for Java applications).
There is also an Apache License (which also applies to the HTTP server)
that is primarily based on the BSD license. See [Hub04].
Today the market share of the Apache web server is very close to 70%.
4.1.2 Features
Basically the Apache web server is designed to serve data via the HTTP
protocol (versions 1.0 and 1.1). Its functionality can be enhanced by a great
variety of modules.
Third-party modules are supported as well, which adds a large variety of new
capabilities to the web server. For example, PHP (see 4.2) is commonly
integrated as a module, allowing higher performance than CGI.
Other important modules are all sorts of authentication modules (via LDAP,
MySQL, DBM, etc.), highly configurable logging modules, and rewriting modules
(which modify the request URI before processing).
4.1.3 Alternatives
For a Linux system there are very few alternatives. The greatest rival
according to [Net05] is Microsoft’s IIS, which is only available for
Windows (NT), the operating system that dominates the desktop.
The only real alternative to Apache under Linux is the Zeus Web Server
(developed by Zeus Technology Ltd., receiving some coverage in [Mid02]),
claiming to be the fastest web server available. As it is not available on
a free basis (and is far from Open Source) it was not taken into greater
consideration.
4.2 PHP
4.2.1 History
PHP 3 (released in June 1998) was the first version similar to the PHP most web
sites use today. It was highly extensible and provided a solid infrastructure
for many databases and protocols. At the end of 1998 hundreds of
thousands of web servers reported having PHP installed, which was
approximately 10% of the WWW’s servers.
The breakthrough for PHP came with version 4 (released in May 2000 and built on
the “Zend Engine” introduced in 1999, which is named after its developers Zeev
Suraski and Andi Gutmans). Many more web servers were supported, and
programmers gained new features such as HTTP session support and output
buffering as well as more secure methods for receiving user data (“magic
quotes”). Several million sites report today that they use PHP 4 (about 20% of
the WWW’s servers).
One of the biggest drawbacks of PHP was fixed with version 5, released
in 2004. It provides full-featured OOP, whereas earlier versions only had
rudimentary support for it, e.g. inheritance was supported but encapsulation
was not.
Code is marked as PHP by special tags (<?php and ?>; similar to
ASP’s <% and %>). The text between the opening and the closing tag is compiled
and interpreted; the HTML code is sent untouched to the client.
PHP is a scripting language. This means that there is no need for the
programmer to compile the script before it can be executed. This enables quick
prototyping, which matches the requirements of web applications:
modifications have to be integrated quickly.
Reflecting the language’s web specialization, a few predefined variables make
the processing of web pages much easier. The $_GET and $_POST variables
automatically contain (in the form of an array) the values received from a URL
or an HTML form, depending on the HTTP method with which the data was sent to
the server.
The $_COOKIE variable contains the values of HTTP cookies (small portions
of data stored on the client side). $_SERVER contains server-set data
such as the file system path of the script being executed, or the IP address
of the remote client. The $_SESSION variable is suitable for storing data
which persists across multiple requests of the same client. With
this variable in use, PHP takes care of generating a so-called session id (for
identifying the user) and setting a cookie containing this id. If clients do
not support cookies, the session id is appended to each link (URL rewriting)
and a hidden field containing this id is added to each form on the delivered
page.
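A short sketch shows how these predefined arrays are typically read. The parameter name genre and the helper function requestParam are hypothetical and only serve as an illustration; they are not part of the Bandnews.org code.

```php
<?php
// Read a single value from one of the request arrays ($_GET, $_POST,
// $_COOKIE), falling back to a default if it is missing or not a string.
function requestParam(array $source, string $name, string $default = ''): string
{
    if (isset($source[$name]) && is_string($source[$name])) {
        return $source[$name];
    }
    return $default;
}

// For a request such as index.php?genre=rock this yields "rock";
// without the parameter (or on the command line) it yields "all".
$genre = requestParam($_GET, 'genre', 'all');

// Server-set data; REMOTE_ADDR is only present when run by a web server.
$client = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : 'unknown';
```

Passing the array as an argument instead of accessing $_GET inside the function keeps the helper usable for $_POST and $_COOKIE as well.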
1 This cannot be applied to completely different data types. For example,
using an array as a string results in the text “Array”.
In PHP, arrays play an important role. Every variable mentioned so far
is by definition an array, i.e. a data structure that maintains key/value
pairs, implemented internally as a hash table. PHP provides
many functions and constructs for arrays (e.g. foreach cycles through each
element), as everyday work mostly deals with structured data. A database
query will commonly return an array representing the data in a natural
and intuitive way. Multi-dimensional arrays can be created at will (just by
specifying an array as a value), making it easy to juggle data.
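The following sketch illustrates these points with a multi-dimensional array as a database query might return it. The band names and genres are made-up sample data.

```php
<?php
// A list of rows; every row is itself an array of key/value pairs.
$bands = array(
    array('name' => 'Band A', 'genre' => 'Rock'),
    array('name' => 'Band B', 'genre' => 'Jazz'),
    array('name' => 'Band C', 'genre' => 'Rock'),
);

// Regroup the rows by genre. Nested arrays are created on first use,
// and [] appends a value under the next free numeric key.
$byGenre = array();
foreach ($bands as $band) {
    $byGenre[$band['genre']][] = $band['name'];
}

// foreach also gives access to the key of each entry.
foreach ($byGenre as $genre => $names) {
    echo $genre . ': ' . implode(', ', $names) . "\n";
}
```

The loop prints one line per genre, with the band names joined by commas, in the order in which the genres first appeared.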
What makes the language quite special is the great number of functions
provided. Compared to other languages there are few internal functions.
These are mainly used for variable manipulation (for example for cropping
strings). The majority of functions is provided by third-party libraries which
are integrated with PHP; because of this integration, their enormous
functionality can be accessed easily.
The range of functions starts with database wrappers for a large number of
database types (the most important ones are MySQL, PostgreSQL, Oracle, and
Berkeley DB), a library for image manipulation (GD), compression (such as gzip
or bzip2) and encryption libraries (mcrypt), and ends with libraries for
accessing remote services such as up-/downloading, SOAP calls, or XML-RPC. So a
large part of PHP’s role is being a framework for third-party libraries.
The CGI variant is available for all web servers that support a cgi-bin
directory, such as IIS on the Windows platform. Generally speaking, this
approach should be avoided when a web server module
is available, as the module provides an enormous gain in speed.
The popularity of PHP has given rise to a large number of third-party tools
which provide even more functionality.
In addition, many source code repositories exist (e.g. Hotscripts.com)
which are usually user-contributed. This has both its good and its bad
sides: these repositories contain thousands of pieces of source code, so there
are not too many “common problems” which have not been solved yet.
As everybody (experienced programmer or not) can contribute
anything, the quality of the code is naturally highly diverse. What is
more critical: documentation is commonly poor.
4.2.5 Alternatives
There are many projects on the market that have either developed their
own programming language or modified an existing language for use with
the WWW. The alternatives can be classified in two categories: scripted
and compiled languages.
• Java can also be used for web projects, either in the form of JSP (Java
Server Pages), which is similar to PHP in its design, or as Java Servlets,
which are “normal” Java programs that implement a certain interface.
Java also has to be compiled (the result is an intermediate language,
though) and requires a special web server, e.g. Apache Tomcat.
• Perl has been used for the generation of dynamic pages for a long time
already. Commonly it is integrated with the Apache web server by
using the module mod_perl, which embeds the script interpreter.
Perl was not primarily designed for web use, but there are modules
for Perl (e.g. CGI.pm) that provide useful functions for processing web
data. Additionally, projects such as Mason provide a whole framework
that implements templating and has many other useful features for web
development.
4.3 MySQL
An API for PHP is provided which is commonly integrated with PHP and
adds to its function pool. In fact this integration is the way MySQL is
most commonly used today; the rise of PHP also helped MySQL to emerge.
4.3.1 PEAR::DB
PEAR::DB is not a part of the MySQL distribution and does not depend solely
on MySQL either. Rather, it is an abstraction layer from the PEAR
repository (see section 4.2.4) that provides a DBMS-independent layer for
retrieving data.
Apart from the SQL, which has to be understood by the DBMS used (many
systems use their own flavour of SQL, and so does MySQL), switching the DBMS
can be done easily by just changing the DSN string (Data Source Name).
The methods for retrieving data (as associative hash, as “normal” array,
etc.) do not differ either.
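PEAR::DB expects the DSN in the form phptype://username:password@hostspec/database and receives it through DB::connect(). The sketch below only assembles such strings to show how little changes when the DBMS is switched; the helper buildDsn and all credentials are hypothetical.

```php
<?php
// Assemble a DSN string in the form used by PEAR::DB.
// No escaping is done here; the values are assumed to contain no
// special characters such as ":", "@" or "/".
function buildDsn(
    string $type,
    string $user,
    string $password,
    string $host,
    string $database
): string {
    return sprintf('%s://%s:%s@%s/%s', $type, $user, $password, $host, $database);
}

// Only the first argument differs between the two DBMS.
$mysqlDsn = buildDsn('mysql', 'bandnews', 'secret', 'localhost', 'bandnews');
$pgsqlDsn = buildDsn('pgsql', 'bandnews', 'secret', 'localhost', 'bandnews');
// $mysqlDsn is "mysql://bandnews:secret@localhost/bandnews"
```

The remaining application code would stay the same as long as the SQL statements themselves are understood by both systems.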
4.3.3 Alternatives
MySQL was chosen for its common use in Open Source projects and its
speed. Even though license problems with its use in PHP arose in 2004, it
can now be recommended, as a special license for this use case has
been published.
4.4 Smarty
Another tool used in this diploma thesis is the Smarty Template Engine. It
is a tool – written in PHP, created by Monte Ohrt and Andrei Zmievski in
2001 – to separate program logic, i.e. the PHP code, from design, stored in
so-called template files.
[Figure: Model (PHP script), View (Smarty), Controller (PHP)]
• The Model specifies the part of the application that handles the busi-
ness logic, i.e. the actual problem is solved here. In this scenario this
part is taken by the PHP scripts written by the programmer.
• Smarty is used for the View component which is responsible for han-
dling the output and its formatting.
In a scenario without Smarty, View and Model are mixed. This would
not only abandon the design pattern but also reduce the reusability of source
code [Par04]. The use of Smarty contrasts with the design goal PHP originally
pursued. In fact Smarty only acts as a layer within a PHP script, which
is quite obvious as it is written in PHP itself.
This also represents the common three-tier architecture (see Figure 4.2). It is
quite desirable (also in other parts of information engineering) to split apart
the data (first tier), the business logic (second tier) and the presentation
(third tier). The MVC model is a corresponding design pattern. More
benefits of the three-tier architecture are discussed in [Swe01].
In a company the roles of programmer and layout designer are separate. This
is supported and even pushed by Smarty because designer and programmer
can concurrently work on the same page with the designer changing the
appropriate .tpl file while the programmer makes changes to the PHP code.
Therefore, the use of Smarty is highly recommended.
Template files are quite similar to “normal” PHP files, they embed their logic
into HTML. A Hello World example using Smarty in combination with PHP
would look like this:
In this example, the variable $hello is displayed within the template file,
just by putting it into curly brackets. This is the default setting for
integrating logic and variables in .tpl files. Template variables are separate
from those of PHP. They have to be explicitly assigned to Smarty (line 5
of hello.php) to be accessible in hello.tpl. After that the Smarty
command for displaying the template file is called.
For outputting arrays assigned from PHP, Smarty provides the helper functions
foreach and section. In “sections” arrays are traversed
with keys from 0 to n. foreach acts the same way as in PHP,
providing access to key and value for each entry of the array. While looping,
the $smarty.section variable (or $smarty.foreach, respectively) is filled with
values to be used for design functionality. As an example, listings 4.4 and 4.5
show how a table with alternating background colors is generated (see Figure
4.3 for a screenshot of a web browser displaying the page).
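<?php
// Assumed setup: the opening lines are missing from the extracted listing
require 'Smarty.class.php';
$smarty = new Smarty();
// Build ten sample values to be shown in the table
$data = array();
for ($i = 0; $i < 10; $i++) {
    $data[] = "value " . $i;
}
// Hand the array to the template and render it
$smarty->assign("data", $data);
$smarty->display("alternate.tpl");
?>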
This short example shows several more aspects of Smarty and PHP.
The program logic of if/elseif/else is available in Smarty for
simple tasks (intended for design-oriented conditionals, just like in
the example above). The $smarty.section array provides common states,
for instance the current index or whether it is the first or last iteration of
the loop. Array values are accessed in a PHP-like form (index within square
brackets) when using sections.
4.4.2 Alternatives
The idea of templating PHP is quite common and various such projects
exist.
Smarty was chosen for its features and its steady improvement.
4.5 Squid
The idea of using a proxy on the same server as a web server has to do with
design goals of the two programs.
A web server has to provide several features for processing files to be served
(in the case of PHP, for example, the interpreter is commonly integrated
with the Apache web server via module). For each request a copy of the
executable must be held in memory. Therefore, the larger the executable is,
the higher the memory consumption will be for a number of requests.
Considering several clients accessing the web server at the same time, for each request an executable has to be loaded and held in memory until the request is completed. With HTTP acceleration, requests are collected by the small proxy program (which consumes particularly little memory), which can therefore take many more requests than the web server itself. Only after a request has been transmitted completely is the web server contacted to collect the page.
4.5.3 Alternatives
• The mod_proxy Apache module can also be used for proxying requests. This is only useful, though, when the proxy resides on its own machine.
Squid was chosen because it is commonly used in production environments and ships with most Linux distributions.
[Figure 4.4: flow chart of script execution. The main script is compiled and executed; each included script is compiled and executed in turn until execution is complete.]
The idea of reducing execution time is based on the way PHP executes a script (see Figure 4.4). This is basically done in two steps:
1. The script source is parsed and compiled into intermediate code.
2. PHP, i.e. the Zend Engine virtual machine, executes the intermediate code.
These two steps have to be done every time a script is requested – the
compiled result is dismissed after execution. The same goes for each file
that is included during execution.
4.6.1 Concept
In fact step 1 stays the same for most requests (except when a modification
was made). APC implements the idea of caching the compilation results
until a modification was made to (one of) the PHP source file(s).
extension = /usr/lib/php4/apc.so
[Figure 4.5: flow chart of script execution with APC. For the main script and for each included script: is the script cached? If yes, has it been modified? A cached, unmodified script is loaded from the cache; otherwise the script is compiled and inserted into the cache. Then it is executed, until execution is complete.]
It starts working immediately⁵ once PHP, or rather the web server, is restarted. The defaults reserve a total storage space of 30 megabytes for caching compiled scripts. Figure 4.5 shows how the cache is being used. Grey boxes show where the cache repository is accessed.
⁵ This is done by subclassing the file loading routines of PHP.
4.6.2 Alternatives
There are quite a few other compiler caches around that are also worth a try.
The author’s choice was APC because of its ongoing development and its PHP open source license.
4.7 Advanced PHP Debugger
The tool called APD is primarily a debugger that can be integrated with
PHP. Mainly it provides functions and tools for debugging and profiling.
APD acts as a PHP module and is activated and controlled using PHP
functions which are provided by the module.
4.7.1 Debugging
The debugging functions of this tool cover the “standard” range found in commonly used debuggers. This includes the setting of breakpoints, debugging output, printing of stacks and currently used variables, and overriding or renaming of functions.
For this diploma thesis debugging will not be used extensively, as a working and proven application is being tested; it is assumed that no bugs affect the caching procedure (and if any do, only to a minor degree).
4.7.2 Profiling
Profiling is an important tool for reaching the goal of this diploma thesis. It
can be used to spot inefficiencies in source code by measuring the amount of
time the processor spent in each function. While the script is being executed,
a trace file is generated including compiled information about the on-goings
of the current execution.
Afterwards this trace file can be processed with the included tool pprof to
gather the information recorded. The output of the tool can be customized
to the needs of analysis through several options. For example, a call tree can
be printed showing the functions called including their dependencies. The
tool is also capable of listing totals for functions, such as time and memory
consumed or the number of calls.
4.7.3 Alternatives
older version 1 has its bonus points, for example the profiling output
can be directly appended to the generated page.
Choosing the right tool for this thesis was hard, as all three of the intro-
duced tools have good and distinct features. If it was for debugging only, a
combination of all tools for different cases would have been the best choice.
As mainly profiling is done, APD provides the best functions, especially the
tool for processing trace files sets it apart from the other tools.
4.8 ApacheBench ab
ApacheBench is a tool that just does its task of load generation, not much
more. The most important settings used are the number of requests (speci-
fied by command line option -n) and the number of concurrent connections
(-c).
4.8.1 Alternatives
Chapter 5
Evaluation
Before the analysis of the web application can be started, the goal of the tuning has to be defined. This is done in order to find the points at which to install caching mechanisms.
Primarily the web application should be optimized regarding load of the web
server. That means the generation of a single web page should not be very
CPU intensive. There are mainly three spots where the CPU is involved
heavily:
• The web server (program) takes processor time for reading directories and files, forking to other processes and some other configured extensions or modules (such as mod_rewrite for manipulating the URI before the request is processed further).
• The RDBMS needs the CPU for reading files, building and processing quite complex data structures (e.g. B+ trees), searching through indices and data manipulation.
• The script interpreter uses time for lexing and interpreting files,
execution of the specified script and the handling of data structures
the language provides.
Reducing the CPU load mainly works through the following two schemes:
For web servers the caching strategy seems to be the best and biggest chance
to not only reduce the need for CPU time, but also for speeding up the
service. Still, this is not that simple and evident because there are many
points where caching can be applied.
The sources of load as well as the recurring processes can best be understood by having a closer look at how a script is requested by a client and returned by the server. This is shown in Figure 5.1.
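[Figure 5.1: path of a request between the client and the server. Components shown: Client, Server, Web Server, PHP Module, PHP Script, RDBMS; the arrows are numbered to match the steps below.]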
1. The client establishes a connection to the web server via TCP, typically
to port 80¹, where the web server is listening for connections.
2. The request for a document is sent to the server using the HTTP
protocol. This would look typically like this
(The first two lines are a minimum request for using NameVirtualHosts
– multiple web sites residing on one IP address – with HTTP/1.1)
4. If the file is of a MIME type for which there exists a responsible module, that module is loaded and processes the file. Otherwise the web server simply delivers the file (skip to step 8). Here is an example of a MIME type definition in /etc/apache2/mods-available/php4.conf (default location for Debian-based systems):
<IfModule mod_php4.c>
AddType application/x-httpd-php .php .phtml .php3
AddType application/x-httpd-php-source .phps
</IfModule>
5. The web server module (a script interpreter in this case) reads, parses
and executes the file.
7. Once the script has been executed, its output is delivered as if it was
the content of a file.
8. The file (or output of the script) is returned to the client, reusing the
TCP connection established by the client.
9. HTTP 1.1 [FGM+99] allows the client to reuse this TCP connection
for further requests (this is called “keep alive” [Mog95], go back to
step 2) if the server is configured to allow this behaviour.
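Step 2 can be made concrete with a small sketch. The function below assembles a minimal HTTP/1.1 request as a string; its first two lines (request line and Host field) are the minimum needed for name-based virtual hosts. The function name build_http_request, the host name and the Connection field are illustrative assumptions, not a reproduction of the original listing.

```php
<?php
// Sketch: assemble a minimal HTTP/1.1 GET request as a string.
// build_http_request() is a hypothetical helper used for illustration only.
function build_http_request($host, $path = '/', $keepAlive = true)
{
    $lines = array(
        "GET " . $path . " HTTP/1.1",   // request line
        "Host: " . $host,               // selects the virtual host on a shared IP address
        "Connection: " . ($keepAlive ? "keep-alive" : "close"),   // step 9: connection reuse
    );
    // Every header line ends with CRLF; an empty line terminates the header block.
    return implode("\r\n", $lines) . "\r\n\r\n";
}

echo build_http_request("www.example.org", "/index.php");
```

The string could be written to a TCP socket opened to port 80 of the server; that part is left out here so the sketch runs without a network connection.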
5.3 Possible Hooking Points
A client will most commonly request the home page (or index page) first. Usually this is the page named / or /index.php. This sounds like a good opportunity to cache those pages, not even touching the script and delivering a previously stored copy instead. Unfortunately this is not easily possible with a script-generated page. If it displays parts of the page at random or contains other highly dynamic content, the page usually cannot be cached as a whole (but in parts, see Section 9).
The requests of clients are very similar, but the most important thing they
have in common is that they have a slow connection to the server. This
does not necessarily mean they add to the server load, but they have the
web server program reside longer in memory than necessary: The program
has to wait until the TCP connection is closed which will take longer if it is
a slow one (the speed of the data to be transmitted can usually only be as
fast as the slowest part of the connection).
A proxy program consuming very little memory will act on behalf of the web server, i.e. it listens on port 80, receives the connection, and waits for the client to submit its request (however long this takes). Then it transmits the request to the web server at full speed, using the loopback interface when residing on the same machine or a fast LAN connection otherwise. The result of the request is transferred back to the proxy at high speed. The proxy then handles the transmission of the data to the client, while the web server executable can be removed from memory or handle the next request.
In the procedure necessary for executing a PHP script, the first steps are heavily recurring. The ratio between how often a re-“compile” is really needed (only when the script has been modified) and how often it is actually done (every time the script is executed) is very unfavourable. For smaller scripts the time for compilation might be longer than the time for execution. For example, a script only outputting a few lines runs considerably faster if the compilation steps are skipped.
The repeated process of compiling can be left out quite easily when its result, the runnable intermediate code, is stored in a cache. To preserve the character of a scripting language, the script file only has to be checked for modifications when it is run; this is far less expensive (in terms of time) than a re-compilation. If the script has been modified, though, a little delay is added, as the script is both checked for modification and then compiled.
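The check-then-reuse idea can be sketched in a few lines. This is a simplified model, not APC's implementation: “compiling” is simulated by tokenizing the source with token_get_all, and the class name CompileCache and its members are assumptions made for the illustration.

```php
<?php
// Sketch of a compiler cache keyed by file name and modification time.
// "Compiling" is simulated by tokenizing the source; a real compiler cache
// stores the engine's intermediate code instead. All names are hypothetical.
class CompileCache
{
    private $entries = array();   // file name => array('mtime' => int, 'code' => array)
    public $compilations = 0;     // how often the expensive step really ran

    public function load($file)
    {
        clearstatcache();             // make sure filemtime() returns fresh data
        $mtime = filemtime($file);    // the cheap check made on every request
        if (isset($this->entries[$file]) && $this->entries[$file]['mtime'] === $mtime) {
            return $this->entries[$file]['code'];   // cache hit: compilation skipped
        }
        // Cache miss or modified file: compile and store the result.
        $this->compilations++;
        $code = token_get_all(file_get_contents($file));
        $this->entries[$file] = array('mtime' => $mtime, 'code' => $code);
        return $code;
    }
}

$file = tempnam(sys_get_temp_dir(), 'cc');
file_put_contents($file, '<?php echo "hello";');
$cache = new CompileCache();
$cache->load($file);               // compiled
$cache->load($file);               // served from the cache
touch($file, time() + 60);         // simulate a later modification
$cache->load($file);               // compiled again
echo $cache->compilations, "\n";   // prints 2
unlink($file);
```

The only work done on a cache hit is one filemtime call, which mirrors the argument that checking for modification is much cheaper than recompiling.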
5.3.3 Database
When using databases in combination with web servers there are also opportunities for caching. Similar queries are executed over and over: as most of the content of web sites is not personalized, most of the queries usually do not differ. As the database changes comparatively seldom, caching the database output promises a considerable gain in speed.
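A minimal sketch of this idea follows: results are stored under the query text and dropped as soon as a table they depend on is changed. It is an application-level illustration under stated assumptions, not a description of how the MySQL query cache is implemented; the class QueryResultCache, the table name news and the callback standing in for the database are all hypothetical.

```php
<?php
// Sketch of a query result cache: results are stored under the query text and
// removed when a table they depend on changes. Hypothetical names throughout;
// the "database" is a callback so that the example runs without a server.
class QueryResultCache
{
    private $results = array();   // query text => array('tables' => array, 'rows' => array)
    public $hits = 0;
    public $misses = 0;

    public function select($sql, array $tables, $runQuery)
    {
        if (isset($this->results[$sql])) {
            $this->hits++;
            return $this->results[$sql]['rows'];
        }
        $this->misses++;
        $rows = call_user_func($runQuery, $sql);   // the expensive database call
        $this->results[$sql] = array('tables' => $tables, 'rows' => $rows);
        return $rows;
    }

    // To be called after an INSERT, UPDATE or DELETE on $table.
    public function invalidate($table)
    {
        foreach ($this->results as $sql => $entry) {
            if (in_array($table, $entry['tables'], true)) {
                unset($this->results[$sql]);
            }
        }
    }
}

$db = function ($sql) { return array(array('title' => 'news item')); };
$cache = new QueryResultCache();
$sql = "SELECT title FROM news ORDER BY date DESC";
$cache->select($sql, array('news'), $db);   // miss: the query is executed
$cache->select($sql, array('news'), $db);   // hit: the result comes from the cache
$cache->invalidate('news');                 // a new item was inserted
$cache->select($sql, array('news'), $db);   // miss again
echo $cache->hits, " hit(s), ", $cache->misses, " miss(es)\n";   // 1 hit(s), 2 miss(es)
```

The rarer the writes are compared to the reads, the more requests are answered from the cache, which is the situation described above.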
5.3.4 Application
5.4 Bandnews.org
[Figure: the script, consisting of the presentation skeleton and the application skeleton]
If only the first file was included, the script would represent the application
skeleton.
The additional files load the corresponding page parts, including database
calls when needed.
The index page belongs to a group of three page types (described below) and is a good representative for a common page of the site. In addition to the presentation skeleton, news items are displayed which are subject to constant modification. The selection of the news items (i.e. the SELECT statement and its WHERE clause for an SQL database query) is based on the page type:
• The Index Page shows the most recent news from all bands and
genres, selected by date in a descending order.
• The Genre Page displays the news for a certain genre and its sub genres, with a news item belonging to the genre of the band. As a band can be classified into more than one genre, a weak entity is used for establishing this relation. For displaying this page a join of tables is necessary.
The index page usually receives the most hits on a server and is therefore worth close consideration.
• The band search: The search query is used for finding bands which
match the expression, not only taking bands into consideration that
provide news, but also those that could not be integrated within Band-
news.org.
• In the news search, matching news items are displayed based on the query: a selection of news items, just as described in Section 5.4.2.
Both points can be included in the testing of the index page. This is due to
the MySQL query cache which will be included in testing.
The links page is somewhat unlike the other page types, and precisely because of that it is worth taking a closer look at it: For each letter of the alphabet, matching bands are displayed on one page, including all relevant information:
• Band name
• Country
• Genres
• Homepage address
The interesting thing about the page is that for database design reasons the information is split across several tables: band name and homepage address are stored in the bn_band table, the country (for translation reasons) in bn_country, and the genres in bn_band_genre and bn_genre respectively, as a band can have more than one genre assigned (an m:n relation, modeled using bn_band_genre as weak entity).
Many (expensive) queries are needed to build this page. It is therefore important to know how the various caching strategies can optimize the speed of this page.
5.5 Testing
When testing the capacity of a web server, there are several things to be
considered [BD99]. Aspects like the latency of a WAN connection are not
taken into consideration in this thesis, all tests are done via the loopback
interface.
In our scenario, we are not testing a web server program delivering text files, but the output of a script instead. This significantly decreases the speed. In a basic test on the author’s system, the Apache web server is able to serve about 3,000 pages of a 2 kilobyte document per second, while a script generating 2 kilobytes of random data only produces a throughput of about 190 requests per second (lacking any tuning).
2. The web server and database server are restarted to provide a fresh
environment.
6. The log file of the test is parsed and converted to a format used for
generating a chart.
5.5.1 Preparations
The author’s benchmarking script can automate many tests, as it automatically generates the possible test cases with each tool switched on or off. There are a few steps to take until a tool can be used with the benchmark.
Listing 5.2: Creating a patch file for the MySQL query cache
cd /tmp
cp /etc/mysql/my.cnf .
vi my.cnf    # do the necessary editing
diff -c /etc/mysql/my.cnf my.cnf > mqc
The result is a file containing only the modified lines (plus some contextual
lines, so that the file can even be patched if the line numbers do not match).
What such a file exactly looks like can be found in the appendix, Listing
A.4.
When all necessary patch files are generated, the testing can be started. The benchmark script is invoked with the patch files as command line arguments. A single option can be tested just by specifying one argument. All in all 2^n test cases (with n being the number of arguments) will be run through. The scripts to be tested (k) are hardcoded and can be overridden (see appendix A.1). As a grand total k * 2^n benchmarks are run.
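The count of k * 2^n can be illustrated by enumerating the combinations. The sketch below is not the author's benchmark script (see appendix A.1 for that); the function name tool_combinations and the tool and script names are assumptions chosen for the example.

```php
<?php
// Sketch: enumerate every on/off combination of n tools (2^n cases) for each
// of k scripts. Tool and script names below are illustrative assumptions.
function tool_combinations(array $tools)
{
    $tools = array_values($tools);   // make sure the keys are 0..n-1
    $n = count($tools);
    $cases = array();
    for ($mask = 0; $mask < (1 << $n); $mask++) {
        $case = array();
        foreach ($tools as $bit => $tool) {
            // Bit number $bit of $mask decides whether this tool is switched on.
            $case[$tool] = (($mask >> $bit) & 1) === 1;
        }
        $cases[] = $case;
    }
    return $cases;
}

$tools   = array('squid', 'apc', 'mqc');      // n = 3 patch files
$scripts = array('index.php', 'links.php');   // k = 2 scripts
$cases   = tool_combinations($tools);
echo count($cases), " cases per script, ",
     count($scripts) * count($cases), " benchmarks in total\n";
// prints: 8 cases per script, 16 benchmarks in total
```

Every additional tool doubles the number of runs, so the total benchmark time grows quickly with the number of patch files.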
$ apache2 -v
Server version: Apache/2.0.53
Server built: Apr 1 2005 18:17:53

$ php -v
PHP 4.3.10-10ubuntu4 (cli) (built: Apr 1 2005 14:16:27)
Copyright (c) 1997-2004 The PHP Group

$ mysqld -V
mysqld Ver 4.1.10a-Debian_2-log for pc-linux-gnu on i386 (Source distribution)

$ squid -v | head -1
Squid Cache: Version 2.5.STABLE8

$ ab -V
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/
Chapter 6
Squid
The Squid proxy server used here has already been introduced in Section 4.5.
6.1 Considerations
The primary idea which led the author to integrate Squid and its Server Acceleration Mode¹ was to speed up large requests by relieving the web server of transferring the data to and from the client. According to [Wes04] there are many more benefits from this mode:
• Whole pages can be delivered from Squid’s cache if they have been requested before.
The two latter points are not taken into greater consideration in this thesis.
The first point, though, fits the topic and requires a closer look.
¹ RFC 3040 [CMT01] calls the proxy server a surrogate in this mode. The term “reverse proxy” is also quite common.
The most important fields (also from the view of the web developer) are
(compare to [Wes01]):
• Date
• Last-Modified
• Expires
• Cache-Control
• Content-Length
The Date field can be used to detect clock skews that might interfere with other date-based headers. The server sends its current time, which can be compared to the time on the proxy server. In our case this is of little use, as web and proxy server reside on the same machine. If the clock is out of sync, the system might have a schizophrenic problem.
Even if the clocks of the systems are not synchronized, the Date field can
still be used to convert other fields specifying absolute timestamps to relative
time spans. The issue of wrong timestamps is also discussed in [Mog99].
Together with the Date field, the Last-Modified field serves two purposes. On the one hand, the age of the document can be determined. On the other hand, the proxy can use this timestamp to decide whether its stored copy is stale and has to be refreshed or not.
The Expires field provides the proxy with information about the lifetime
of the document. Until this timestamp is reached, the proxy can deliver
the page without revalidating it. By providing a timestamp in the past, the
proxy can be told to check the file upon the next request for sure.
By using the Cache-Control header, the proxy server can be given specific information about the caching options the document has. The most important values are public vs. private, which specify whether the document contains user-specific data. In the latter case the document may only be stored in the client’s personal cache. The no-cache value tells the proxy not to deliver a stored copy without revalidating it with the web server first; no-store forbids storing the document at all.
For a quick stale check the field Content-Length can be used. If the document size stored in the cache does not match the specified one, the document is likely to have been changed in the meantime. Decisions based on this field are rather risky, though. For example, character encodings (e.g. UTF-8 vs. ISO-8859-1) can lead to different lengths for documents with the same content. Because of this, proxy servers usually only use this value to verify that they do not store a document that has not been transferred successfully. A client must never receive incomplete data from the proxy.
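How a cache might use the Expires field can be sketched as follows. The helper is_fresh is a hypothetical, deliberately reduced illustration: it looks at Expires only, whereas a real proxy such as Squid also evaluates Cache-Control, Date and Last-Modified.

```php
<?php
// Sketch of how a cache could decide, from the Expires field alone, whether a
// stored copy may be delivered without revalidation. Hypothetical helper; a
// real proxy also evaluates Cache-Control, Date and Last-Modified.
function is_fresh($expiresHeader, $now)
{
    $expires = strtotime($expiresHeader);   // parses HTTP dates such as the RFC 1123 format
    if ($expires === false) {
        return false;                       // unparsable value: treat the copy as expired
    }
    return $now < $expires;                 // fresh until the timestamp is reached
}

$expires = "Thu, 01 Dec 2005 16:00:00 GMT";
var_dump(is_fresh($expires, strtotime("Thu, 01 Dec 2005 15:59:59 GMT")));   // bool(true)
var_dump(is_fresh($expires, strtotime("Thu, 01 Dec 2005 16:00:01 GMT")));   // bool(false)
```

Sending a timestamp in the past therefore makes the copy stale immediately, which forces a check on the next request.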
Knowing the headers the proxy servers rely on, the programmer has to
ensure that the web application provides the proxy with the correct headers.
As mentioned earlier there are two types of documents: static and dynamic
ones. While the web server commonly takes care of the correct header fields
for static data – it obtains, for instance, the last modification date from
the operating system – the programmer is fairly left alone with dynamic
documents.
Note line 9: When using PHP sessions (started with the function session_start), the header field Cache-Control is rewritten by PHP. This behaviour can be controlled by specifying a replacement value with session_cache_limiter. One has to be careful with this option because sessions can affect the user’s privacy (see Section 2.1.2).
A problem arises when either header or footer is changed. This script will still return an old header. Either the last modification dates of all three documents need to be taken into consideration (see Listing 6.2), or the modification time of the static page is simply changed by using the command touch static.html. It depends on the application (e.g. when data from a database are retrieved in either header or footer) whether or not to choose the former method.
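The first of the two options, taking all parts of the page into account, might look like the following sketch. It is not the original Listing 6.2; the function name last_modified_header and the file names header.html and footer.html are assumptions (static.html is the file named above).

```php
<?php
// Sketch: derive a Last-Modified value for a page assembled from several files
// by taking the most recent modification time. The function name and the file
// names in the usage comment are assumptions made for this illustration.
function last_modified_header(array $files)
{
    clearstatcache();
    $latest = 0;
    foreach ($files as $file) {
        $latest = max($latest, filemtime($file));   // the "youngest" part wins
    }
    // HTTP dates are always given in GMT.
    return "Last-Modified: " . gmdate("D, d M Y H:i:s", $latest) . " GMT";
}

// In a web script the value would be sent before any output, for example:
// header(last_modified_header(array("header.html", "static.html", "footer.html")));
```

With this approach a change to the header or footer file automatically yields a newer Last-Modified value, at the cost of one file system check per part on every request.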
With dynamic pages it heavily depends on the content whether a time of last
modification can be specified at all. Generally speaking the last modification
date is the date of the “youngest” part of the page. If for a single part of
the page (e.g. included content of a database) this time is unavailable, the
last modification for the whole page is unavailable. This case is not unlikely.
That is the reason why caching of whole pages is problematic for dynamically
generated pages. Therefore, the solution using a proxy does not suffice for
web applications.
It is difficult to estimate what effect turning off a cache has. This case exactly
matches the case when testing a purely dynamic script that is incapable of
delivering a last modification date. A gain of speed can be expected with
many concurrent requests or with slow clients.
6.2 Preparation
To prepare Squid for the server acceleration mode, the configuration file has
to be modified. Squid is by default configured to be a client-side caching
proxy.
This section will only give an overview of the most important options. All
options that have been modified for the benchmark tests can be found in
the patch file in the appendix, Listing A.2.
The first consideration is to have the proxy listen on the port of the web server. This is usually the well-known port 80. Before that, the web server must be told to listen on another port because only one program can occupy a port at a time. The new port of the web server is arbitrary; the proxy server needs to be configured to forward requests to this port anyway.
For the Squid proxy server some more changes are needed. As we have configured the web server to listen on another port, Squid should listen on port 80 instead. This is done via the option http_port 80 (see Listing 6.3).
Now we need to tell the proxy where the web server resides. This can be done using the httpd_accel_host and httpd_accel_port options. The values 127.0.0.1 (localhost via the loopback interface) and 81 do the correct thing.
Because the developers have taken care of security in the default configuration of Squid, there is one more option to be changed. We need to allow everyone to access the proxy accelerator (this is the usual purpose of a web server, as opposed to the audience of a proxy). The option http_access allow all does exactly that.
These options suffice for turning on server acceleration mode. The modification of both an Apache configuration file and the Squid file requires a restart of both programs (typically done via the commands² sudo /etc/init.d/apache2 restart and sudo /etc/init.d/squid restart).
When using the web server in domain virtual host mode (when more than one (sub)domain points to one IP address), the HTTP 1.1 request field Host needs to be transferred via the proxy, too. Once again, this is turned off by default for security reasons. The appropriate option is httpd_accel_uses_host_header on.
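Put together, the options discussed in this section amount to a fragment like the one below. It is a summary collected from the option names in the text for Squid 2.5, not a copy of Listing 6.3 or of the patch file in the appendix. The matching change on the Apache side would presumably be a Listen 81 directive; that detail is an assumption about the setup and is not shown in this section.

```
# squid.conf fragment for server acceleration mode (Squid 2.5 option names)
http_port 80                        # Squid takes over the public port
httpd_accel_host 127.0.0.1          # the web server, reached via the loopback interface
httpd_accel_port 81                 # the port the web server was moved to
httpd_accel_uses_host_header on     # pass the Host field on for virtual hosts
http_access allow all               # everyone may access the accelerator
```

After editing, both Apache and Squid have to be restarted for the new ports to take effect.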
² The command sudo allows a standard user to execute a command with the rights of the super user (typically root). The scripts are executed with user privileges but for certain commands more rights are required.
6.3 Results
For the tests, 10,000 documents were requested by the load generator; for the comparison the number of requests per second is taken as a measure (the total time of the test run can easily be calculated). Because static documents receive an extraordinarily high gain of speed (compare Table 6.1 with Table 6.2), that large a number was chosen to obtain representative results.
6.3.1 skeleton-t.php
6.3.2 pres-skel-t.php
This skeleton is a more realistic test candidate (see Figure 6.2). Static pages (such as the About page or the contact form of the application Bandnews.org) wrapped through a script behave very similarly to pres-skel-t.php. This is only the case if the presentation skeleton stays the same upon each request, which is a desirable state.
[Figures: charts plotted over the number of concurrent requests (1, 5, 10, 25, 50, 100, 1,000). Surviving data labels of one chart: 4.42, 5.64, 5.70, 5.99, 5.62, 5.99 and 3.44.]
6.3.3 index.php
Figure 6.3 (Table 6.4) shows benchmark results of index.php with Squid
turned on and off. There is no significant difference. In the case of 100
concurrent requests, turning Squid on adds so much overhead that it shows
up in the benchmark.
Squid can accelerate static and partially static pages massively (by a factor of up to 420) when using the caching functionality of the proxy.
For dynamic pages Squid is no solution. It can even decrease speed due to
the added overhead. This is because Squid can only cache whole pages while
often no date of the last modification can be specified.
For dynamic pages a proxy server does not suffice for acceleration. There-
fore, more testing is necessary in the following sections.
Chapter 7
APC
In this section the Advanced PHP Cache introduced in section 4.6 will be
used.
7.1 Considerations
The idea behind APC has been discussed in detail already (see sections 4.6
and 5.3.2). Nevertheless here is a short overview of what APC does:
For PHP being a scripting language, the script code has to be compiled to a
runnable intermediate format each time the script is executed. This matches
the idea of quick prototyping as a change to the script file is immediately
applied when the file is saved.
That is where compiler caches hook in. They store the result of the compilation and reuse this intermediate code for the next request. Before that, of course, a quick check is made whether the script has been modified. If it has, the cached code is invalid and a recompilation is initiated.
The magnitude of this tuning even increases when you consider that the
compilation process has to be initiated for every file that is included by the
first script.
A topic which has not yet been discussed but adds to a speed increase of the
compiled PHP scripts is the optimization of code. It is worth spending some
effort (and therefore time) on optimizing the PHP code before storing it in
the cache. The cost for doing this is minimal considering that the optimized
code will be reused a few thousand times at least.
Especially the second point in the listing above shows the need for some
more testing. Outputting is an important point for a script that is used to
return data to the browser. Therefore the quickest method for transmitting
data has to be determined.
Initially, output buffering was integrated with PHP because of the necessity
to send HTTP headers before writing any output (compare to [Sur00]). This
is because PHP instantly sends the output of the script to the browser (it
“flushes” its buffer), but this can only be done after all headers have been
sent to the client. If you wanted to set a cookie (which is done in the HTTP
header with the Set-Cookie field) after you printed some text already, PHP
returns an error message or the cookie is simply dismissed.
Now output buffering not only enables the programmer to set headers at
any stage, but also performance benefits from the concept.
Instead of sending all data instantly to the browser, the data is stored in an
internal buffer. Therefore, all headers can be modified until output buffering
is terminated or the script is finished. At this point the headers and all data
stored in the buffer are sent to the browser.
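The mechanism can be sketched with PHP's output control functions ob_start and ob_get_clean. The helper render_page is hypothetical; it returns the headers instead of sending them with header(), an assumption made so that the example also runs on the command line.

```php
<?php
// Sketch: with output buffering, text printed by the script is collected in a
// buffer, so decisions about headers can still be made afterwards.
// render_page() is a hypothetical helper; it returns the headers instead of
// sending them so that the example also runs on the command line.
function render_page($page)
{
    ob_start();                 // from here on, output goes into the buffer
    call_user_func($page);      // the page prints its content as usual
    $body = ob_get_clean();     // fetch the buffered output and end buffering

    // Only now, after the output exists, are the headers decided.
    $headers = array(
        "Content-Type: text/html",
        "Content-Length: " . strlen($body),
    );
    return array($headers, $body);
}

list($headers, $body) = render_page(function () {
    echo "<p>Hello World</p>";
});
echo implode("\n", $headers), "\n\n", $body, "\n";
```

In a web script the same effect is obtained by calling ob_start at the top of the script; header fields can then still be set after output has been produced.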
The boring thing about this section is that there is nothing to do for the
programmer.
7.2 Preparation
To activate APC, the line
extension = /usr/lib/php4/apc.so
has to be added to the PHP configuration file (see the patch file in the appendix, Listing A.3).
For testing the output behaviour, three additional scripts have been created.
Common to all of them is the generation of random output of a double
quoted string (which is examined for contained variables by PHP) as shown
in listing 7.1.
These lines were deliberately not moved into a separate include file (although that would be the obvious choice), so that the compiler cache holds them separately for each file.
7.3 Results
The effects of switching on APC are not as extraordinary as those for Squid.
They apply to all of the tested scripts.
skeleton-t.php (Figure 7.1) receives the highest benefit from APC. This is due to two points: There is no need to connect to the database and only little output is made. The speed gained from doing without the database is quite obvious, but an interesting point is the cost of outputting data. We will take a closer look at this point in Section 7.3.1.
[Figures 7.1 to 7.4: bar charts of requests per second for 10 and 100 concurrent requests, with and without APC. Surviving data labels: 409.11 (first chart), 7.27 and 7.26 (second chart), 3.69 (fourth chart).]
Scripts that need to build a connection to the database also receive a gain in speed. The difference in speed lies between 10 and 25 percent, which is quite a value for literally uncommenting a line. In the rather uncommon case of no database connection (we will see later how to make scripts independent of a database), throughput rises to almost five times the original value (103.75 vs. 506.44 requests per second for skeleton-t.php at 100 concurrent requests).
The results for index.php (Figure 7.3) and links.php (Figure 7.4) show that with APC the same performance (in terms of requests per second) can be achieved for 100 concurrent requests² as for 10 CCR without APC.
When comparing requests per second for different concurrency rates, it has to be considered that the speed from the view of a client still varies. If benchmarks A and B have the same rate of requests per second, the total time is also the same regardless of the concurrency rate. The response time “felt” by the client is therefore longer for more concurrent requests.
Consider two benchmarks with the same speed of 25 requests per second: 25 users requesting the page at the same time will have to wait one second each. If the request rate stays the same and 250 users want to access the page at the same time, every user has to wait 10 seconds. The request rate is still 25 requests per second (and therefore a “good” result).
² Throughout the thesis the abbreviation CCR will be used for concurrent requests.
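The arithmetic behind this example is a single division. The function name wait_seconds below is made up for the illustration; the numbers are the ones used in the text.

```php
<?php
// Worked arithmetic for the example above: at a fixed throughput, the time a
// client waits grows with the number of users requesting at the same time.
function wait_seconds($concurrentUsers, $requestsPerSecond)
{
    return $concurrentUsers / $requestsPerSecond;
}

echo wait_seconds(25, 25), " s\n";    // 25 users at 25 requests/s: 1 s
echo wait_seconds(250, 25), " s\n";   // 250 users at 25 requests/s: 10 s
```

A constant rate of requests per second therefore says nothing about how long a single visitor waits; the concurrency has to be considered as well.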
In the previous section a cause for the achieved results has been claimed
to be the output performance of PHP. Therefore, some extra testing was
included to check how different settings for outputting data perform.
As stated before and as can be seen in Figure 7.5, output buffering increases speed. When looking at the dark grey bars representing the results without APC, we see that for 10 concurrent requests output buffering increases speed by only about 4% (38.77 vs. 40.29 requests per second). With APC the gap increases to almost 6% (see Table 7.2).
For 100 CCR there is not a gain but a loss. This can be explained by the higher memory usage of output buffering. With many concurrent requests, much data has to be stored in buffers: for 100 concurrent requests, a mean size of 262,500 bytes sums up to 25 megabytes used only for buffered data (there is additional overhead, of course). Together with the memory needed by the Apache binaries this can quickly fill memory.
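The memory figure can be checked with a short calculation, using the mean output size from the paragraph above and 100 concurrent requests. Treating a megabyte as 1024 * 1024 bytes is an assumption of this sketch.

```php
<?php
// Worked arithmetic: memory held by output buffers when many requests are
// processed at once. One megabyte is taken as 1024 * 1024 bytes here.
$meanOutputBytes = 262500;   // mean size of the buffered output per request
$concurrent      = 100;      // concurrent requests
$totalBytes      = $meanOutputBytes * $concurrent;
printf("%d bytes = %.1f megabytes\n", $totalBytes, $totalBytes / (1024 * 1024));
// prints: 26250000 bytes = 25.0 megabytes
```

This counts the buffered data only; the memory of the web server processes themselves comes on top of it.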
Requests per second by file (.php)
CCR   APC   no.ob-start   ob-start.nogz   ob-start.gz   skel-t
10    off   38.77         40.29           28.72         112.17
10    on    48.68         51.45           35.69         409.11
100   off   37.59         35.32           26.86         103.75
100   on    46.02         43.01           33.29         506.44
The slowest output is the gzip-compressed one. This is due to the expensive
algorithm for compressing data. Still, this mode can be recommended:
because less data has to be transferred (text data is very suitable for
compression [BK93]), bandwidth becomes less of a factor for performance.
It is also a cost factor: hosting accounts usually come with a traffic
limit, so with compression more visitors can be served at the same price.
The results for 100 CCR do not show better performance for output buffering
than those for 10 CCR. Instead, there is only a (very small) gain with APC
on and gzip compression turned off.
[Figure: requests per second by file (no.ob-start, ob-start.nogz,
ob-start.gz, skeleton-t) with and without APC, one panel for 10 CCR and
one for 100 CCR.]
For documents that do not return any content (skeleton-t.php) the rate
is evidently even higher, since no output has to be stored or sent. Only the
headers are sent in this case, which reduces the used network bandwidth even
further. The headers are not compressed under any circumstances.
Although the gain in speed is not enormous, the use of output buffering
(with compression enabled) is strongly recommended. For large projects
bandwidth also plays an important role, and even a small loss of speed in
combination with a reduction of traffic can save a lot of money.
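To illustrate the trade-off, the following sketch buffers some generated output and compresses it afterwards. It is not one of the benchmark files of this thesis; the generated HTML and the compression level 6 are assumptions made for the example, and the zlib extension is required.

```php
<?php
// Capture the script output in a buffer instead of sending it piecewise.
ob_start();
for ($i = 0; $i < 200; $i++) {
    echo "<p>News item $i: some repetitive HTML text</p>\n";
}
$page = ob_get_clean();          // buffered, uncompressed page

// gzencode() (zlib extension) produces the gzip format that is used for
// "Content-Encoding: gzip"; level 6 is an arbitrary choice.
$compressed = gzencode($page, 6);

printf("plain: %d bytes, gzip: %d bytes\n", strlen($page), strlen($compressed));
```

In a web context, calling ob_start('ob_gzhandler') at the top of a script combines both steps: the output is buffered and compressed only if the client announces support for it in its Accept-Encoding header.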
APC proves to speed up every PHP script that is requested more than once.
This shows how much time is consumed by compilation when a PHP script is
executed.
Turning off APC, or not installing a compiler cache at all, is simply a
waste of CPU time; installing and activating APC should always be considered.
Chapter 8
MySQL
8.1 Considerations
Two concepts will be tested in this thesis: the MySQL Query Cache and
Persistent Connections. A third concept, Query Tuning, should also be taken
into consideration but is too extensive a topic to be tested here.
Since version 4.0.1, MySQL (the database used in this thesis, see Section
4.3) supports a caching mechanism that allows quicker retrieval of the
results of common queries.
This concept works as long as nothing is changed in the database that affects
the cached queries.
When using the MySQL query cache, we may want certain queries not to be
cached at all. There is a special command for this case that will be
discussed in Section 8.2.
A script that wants to access the database has to connect to the database
daemon first. This is called “establishing a link”. Usually this is done via a
TCP connection and an authentication mechanism.
The cost of connecting to a database and establishing a link depends heavily
on the environment. The most important factors for the connection speed
are the speed (and/or latency) of the network interface (which can also be
the very fast loopback interface if the DBMS resides on the same machine)
and the load on the database machine. Depending on the configuration, a
certain overhead for connecting will slow the script down.
Apart from caching and speeding up the environment, one also has to make
the queries behave well: take care that they are not (too) wasteful.
Often indices and a good database layout can improve speed even more
than caching techniques. Combining these techniques results in the highest
speed, of course.
Tuning queries is very application dependent, but [ZB04] gives a good
introduction and leads to good starting points for optimizing queries. A
common approach is to start with the slowest queries of the application.
MySQL can log them automatically if a threshold of x seconds is specified.
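As an illustration, slow-query logging might be enabled in the MySQL configuration file roughly as follows. The option names are those of the MySQL 4.x series; the log path and the threshold of 2 seconds are assumptions chosen for this example and are not taken from the test setup.

```ini
[mysqld]
# write every query that runs longer than long_query_time seconds to this file
log-slow-queries = /var/log/mysql/mysql-slow.log
long_query_time  = 2
```

The queries collected in this log are natural first candidates for adding indices or for rewriting.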
8.2 Preparation
The changes needed for these tests can be made in the configuration files
of MySQL (for the query cache) and PHP (for generally enabling the
persistent connection feature).
Additionally, we have to ensure that the scripts really use persistent
connections. When using plain PHP, we need to call the function
mysql_pconnect instead of mysql_connect to connect to the database.
Wrapper APIs (like PEAR::DB, see Section 4.3.1) need individual care. In
PEAR::DB, persistent connections can be requested with a construct such as
    require_once 'DB.php';

    $dsn = array(
        'phptype'  => 'mysql',
        'username' => 'bandnews_org',
        'password' => 'xyz',
        'hostspec' => 'localhost',
        'database' => 'bandnews_org',
    );

    // request a persistent connection
    $options = array(
        'persistent' => true,
    );

    // connect using the DSN and the options above
    $db = DB::connect($dsn, $options);
When using other wrapper APIs, the appropriate steps (usually well
documented) need to be taken as well, of course.
8.3 Results
For these tests, APC was turned on. This allows the MySQL query cache to
show its full potential and makes the following results the most useful
ones so far.
As can be seen in Figures 8.2–8.4 (pp. 90–91) and Table 8.1, the important
scripts now really gain speed and move into interesting regions regarding the
possible requests per second. index.php improves by 471% for 10 CCR
and even by 784% for 100 CCR. links.php also gets faster by three-digit
percentages, an increase between 104% and 178%. The underlying
pres-skel-t.php receives a similar gain.
If a test took, for any reason (e.g. an I/O event), 0.1 seconds longer than
the original one (2.5 s vs. 2.6 s), the measured request rate would already
drop from 400 to 385 requests per second. Another point is that, due to the
query cache of 25 megabytes, there is less memory available.
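The sensitivity of the measurement can be checked with a short calculation. The figures in the text are consistent with a test run of 1000 requests; this number is an assumption, since it is not stated in this passage.

```latex
\[
  \frac{1000\ \text{requests}}{2.5\ \text{s}} = 400\ \text{requests/s},
  \qquad
  \frac{1000\ \text{requests}}{2.6\ \text{s}} \approx 385\ \text{requests/s}
\]
```

Under this assumption, a delay of only 0.1 seconds lowers the measured rate by almost 4%, so small differences between very fast scripts should not be over-interpreted.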
Table 8.2 (page 92) shows, for the requests per second (rps), the
corresponding (mean) number of seconds that it takes to generate a page
(gt). This follows the simple formula $rps = \frac{1}{gt}$.
[Figures: requests per second with the MySQL query cache disabled
(mqc_disabled) and enabled (mqc_enabled) for 10 and 100 concurrent requests.]
This test was done with the MySQL query cache activated. skeleton-t.php
was not tested since it does not use the database.
The results (Table 8.3) for this test do not show any significant change in
speed. Only index.php (Figure 8.6) profits a little. This result is not
surprising, though. None of the arguments in favour of persistent
connections really fits our scenario: the database server resides on the
same machine, there is no network latency, and the system load is low.
Nevertheless, as the use of persistent connections does not slow anything
down significantly, it is a matter of choice whether to use them or not. The
author feels more comfortable with reusing stuff that “ain’t broke”
(referring to the saying, common not only among computer scientists: “If it
ain’t broke, don’t fix it”).
[Figures: requests per second with persistent connections disabled
(persist_disabled) and enabled (persist_enabled) for 10 and 100 concurrent
requests.]
The power of persistent connections did not quite show in the benchmarks.
This is primarily due to the fact that script and database run on the same
machine, so that expensive factors of link establishment, such as network
latency, do not come into play. Still, the use of persistent connections can
be recommended. Because damaged connections are detected and there are no
real limits on the number of connections, the minimal advantages of
connections that are established on demand do not carry much weight.
Chapter 9
Smarty Caching
In this chapter the PHP scripts will be tuned using the Smarty caching
feature (introduced in Section 4.4).
9.1 Considerations
This testing method is different from those described before. Based on the
knowledge of the tools used so far, we will modify the scripts to achieve
the best results.
Smarty also provides compiling and caching functionality. These abilities
will be used here.
As we defined earlier, the last modification date is the date of the
“youngest” part of the page. Furthermore, if no such date can be determined
for one part of the page, the last modification date for the whole page is
unavailable. This takes proxy servers, such as Squid, out of play.
If we descend a level and move the caching part to the script (for non-static
pages), we still cannot do anything useful about the part for which the last
modification date is unavailable. We can, however, cache all other parts
that provide such a date.
Moving the caching part to the PHP script makes it more vulnerable to bugs
regarding the delivery of outdated (stale) information. Therefore, special
care has to be taken with the implementation.
The template files, which follow their own syntax, are transformed into a
PHP script when they are first loaded. From that point on, only the compiled
template file is accessed as long as the template file is not modified. This
can already be considered a kind of caching.
When the caching feature of Smarty is enabled, loops and sections are
eliminated as well: the output of the script (corresponding to the template
file) is stored and delivered. This causes another speed increase. For the
programming constructs to be removed, further care has to be taken, which
is discussed in the next section.
As demonstrated in Section 7.3, scripts that do not even touch the database
(such as skeleton-t.php) run considerably faster. The idea of caching only
the parts of the page that allow the detection of a last modification date
does not go far enough for that.
The concept of reducing (or even eliminating) database access can be com-
pared to the MySQL query cache. The idea behind the concept is as follows:
If the database did not change, the whole application (at least the parts that
rely on database selects, commonly unpersonalized pages) can be stored on
disk. A database change usually only affects certain parts of the application.
When done carefully many dynamic pages can be made semi-static. Espe-
cially the main page (index.php) is worth the effort as it is typically the
most frequently accessed page in a web application.
The parallel to the MySQL query cache is evident (the data from SQL
queries could equally be stored in and retrieved from the query cache). If it
is turned on, the additional effort (modifying existing scripts) seems useless.
From its first generation on, the news item stays about the same (the
dynamic time display, “x hours and y minutes ago”, still needs some PHP
processing; this small function can be inserted into the cached document).
If stored at that point, the database is not needed at all to display it and
therefore is not even invoked.
For notifying the application of changes, several concepts exist. For
example:
• A file can be stored on the hard disk and carry the modification
information in its file modification time.
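The file-based notification can be sketched in a few lines of PHP. This is an illustration of the concept only and not code from Bandnews.org; the function name, the marker file and the temporary files are hypothetical.

```php
<?php
// Sketch: a "last change" marker file signals database modifications.
// A cached page is considered fresh only if it is at least as new as the marker.
function cache_is_fresh(string $cacheFile, string $markerFile): bool
{
    clearstatcache();                       // filemtime() results are cached by PHP
    if (!file_exists($cacheFile)) {
        return false;
    }
    if (!file_exists($markerFile)) {
        return true;                        // no change has ever been signalled
    }
    return filemtime($cacheFile) >= filemtime($markerFile);
}

// Demo with temporary files and explicit timestamps (no sleeping needed).
$marker = tempnam(sys_get_temp_dir(), 'chg');
$cache  = tempnam(sys_get_temp_dir(), 'pg');
$now    = time();

touch($marker, $now - 60);                  // last database change: a minute ago
touch($cache, $now - 30);                   // page was cached afterwards
$freshBefore = cache_is_fresh($cache, $marker);

touch($marker, $now);                       // a database write "touches" the marker
$freshAfter = cache_is_fresh($cache, $marker);

var_dump($freshBefore, $freshAfter);        // bool(true), bool(false)
unlink($marker);
unlink($cache);
```

Since file modification times have a resolution of one second here, a page cached in the same second as a change is still treated as fresh; a stricter comparison would trade this for occasional unnecessary regeneration.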
9.2 Preparation
To make the Smarty cache easy to use, the author proposes a function for
including files. The function is called load and introduces the following
convention (with $file as the file to be included, without extension):
The load function (see Listing 9.1) is quite universal but still adapted
to Bandnews.org. For example, there is a special branch for the news items
that dynamically replaces absolute time (e.g. “March 3, 2005 3:20 p.m.”)
with relative time (e.g. “1 hour 3 minutes ago”).
It also provides support for a caching ID: a template is often displayed with
varying parameters in different contexts. This can be handled with caching
IDs. Internally, such an ID represents a directory in the cache. Even
subdirectories can be specified, using the pipe character (|) as separator.
If $cacheid is chosen carefully, (only) related cached files are placed in
the same directory and can easily be deleted to invalidate the cache.
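The directory idea can be modelled in plain PHP. The sketch below is a model of the concept only and not Smarty's internal code; the function names, the file suffix and the example IDs (news|41, news|42, band|7) are assumptions.

```php
<?php
// Model of the idea only, not Smarty's internal code: a cache ID such as
// "news|42" is mapped to the directory news/42 below the cache root.
function cache_path(string $cacheRoot, string $cacheId, string $template): string
{
    $subDir = str_replace('|', DIRECTORY_SEPARATOR, $cacheId);
    return $cacheRoot . DIRECTORY_SEPARATOR . $subDir
         . DIRECTORY_SEPARATOR . $template . '.cache';
}

// Remove a directory tree, i.e. invalidate every cached file of one group.
function remove_tree(string $dir): void
{
    if (!is_dir($dir)) {
        return;
    }
    foreach (scandir($dir) as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $path = $dir . DIRECTORY_SEPARATOR . $entry;
        if (is_dir($path)) {
            remove_tree($path);
        } else {
            unlink($path);
        }
    }
    rmdir($dir);
}

$root = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'cache_demo_' . uniqid();
foreach (['news|41', 'news|42', 'band|7'] as $id) {
    $file = cache_path($root, $id, 'item.tpl');
    mkdir(dirname($file), 0777, true);
    file_put_contents($file, "cached output for $id");
}

remove_tree($root . DIRECTORY_SEPARATOR . 'news');   // invalidate all news items
$newsLeft = is_dir($root . DIRECTORY_SEPARATOR . 'news');
$bandLeft = file_exists(cache_path($root, 'band|7', 'item.tpl'));
var_dump($newsLeft, $bandLeft);                       // bool(false), bool(true)
remove_tree($root);
```

With Smarty itself, the same effect would be obtained by passing the ID to display, for example $smarty->display('item.tpl', 'news|42'), and clearing the whole group with $smarty->clear_cache(null, 'news').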
An important point is that the associated include file is only loaded if there
is no cached copy available. This behaviour can eliminate database calls
or the expensive execution of other PHP code. If the load function (or the
corresponding check using the $smarty->is_cached method) were not used,
the PHP code would still be executed even though Smarty does not even
touch the generated contents.
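The pattern of skipping expensive code on a cache hit can be shown without Smarty. The following sketch is a simplified stand-in, neither Listing 9.1 nor Smarty code; the function load_cached, the cache file name and the counter are hypothetical.

```php
<?php
// Simplified stand-in for the pattern described above (not Listing 9.1 and
// not Smarty): the expensive generator runs only if no cached copy exists.
function load_cached(string $cacheFile, callable $generate): string
{
    if (is_file($cacheFile)) {
        return file_get_contents($cacheFile);    // cache hit: generator is skipped
    }
    $output = $generate();                       // cache miss: e.g. database queries
    file_put_contents($cacheFile, $output);
    return $output;
}

$calls = 0;
$expensive = function () use (&$calls): string {
    $calls++;                                    // counts how often the "database" is used
    return "<ul><li>news item</li></ul>";
};

$cacheFile = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'sidebar_' . uniqid() . '.html';
$first  = load_cached($cacheFile, $expensive);
$second = load_cached($cacheFile, $expensive);

var_dump($first === $second, $calls);            // bool(true), int(1)
unlink($cacheFile);
```

In a Smarty-based script, the test for an existing cached copy corresponds to calling $smarty->is_cached with the template name and the caching ID before the include file is loaded.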
Listing 9.2 shows a modified version of the presentation skeleton. The last
modification code (see page 67, Listing 6.2) from pres-skel-t.php was not
reused, as it is assumed that no last modification date could be determined
anyway.
    load('header');
    load('menu');
    load('sidebar');

    load('footer');
    ?>
9.3 Results
The benchmarks show another large improvement in speed: the pages are now
generated really fast. The MySQL query cache and APC were also activated
for this test.
The skeleton (skeleton-t.php, see Figure 9.1) is once again the only script
to lose speed. Since the request rates are very high, this cannot really be
felt, but it is still important to our analysis: the activation of the Smarty
cache seems to add some overhead, and there are the previously observed
speed losses due to the memory occupied by the MySQL query cache.
All other scripts show high gains (compare with Table 9.1, page 103):
• For index.php (Figure 9.3, page 102) each of the previous points
applies. Dynamic fields, such as relative dates for news items, remain.
They are not very expensive, though. Additionally, 6 news items and a
band header for each item are displayed. Those news items are stored in
the cache separately in order to share them between the common pages
(band page, index page, and search page).
Usually no connection to the database is established unless it is
explicitly needed. This is the case either for a search page or when a
user is logged in. In the latter case each band header receives a plus
or minus sign for adding the band to or deleting it from the user’s band
list. There is also a custom area displayed in the sidebar that uses
the database. Overall, the majority of users are not logged in and,
therefore, do not use the database at all. This enables the highest
speed boost for them.
Gains lie between 207% and 236% (for 100 and 10 concurrent requests,
respectively).
[Figures: requests per second for 10 and 100 concurrent requests.]
Figure 9.4 shows enormous gains of 1227% and, for 100 CCR, 1070%.
The results now profit mainly from APC. The MySQL query cache is only
used for filling the cache and the dynamic fields, if the script needs any
at all.
9.4 Conclusions for Smarty Caching
This chapter shows the power of a well-written PHP script. When the cache
is used carefully and is cleared correctly whenever a manipulation
happens, another enormous speed-up is possible. This is especially true for
the pages which could not really be tuned by external means, index.php and
links.php.
An important point of this form of caching is the fact that only a part of the
previously used caching methods is active. For example, the MySQL query
cache is only used for generating and filling the cache (this applies to the
tested pages).
Chapter 10
Conclusions
While the external caching methods (Squid, APC, and MySQL query cache)
do not need very much effort by the programmer, Smarty caching requires
an application designed with this option in mind, or a redesign.
The work for this thesis was carried out on a single desktop PC only. Its
results already show what potential caching has. It is not very favourable
(but very common) to have the three main programs (proxy, web, and
database server) reside on the same machine.
File Sources
Listing A.1 shows the source code of the benchmarking script which was
developed for this thesis. The script takes patch files (see A.2, page 112) as
parameters.
    # configuration
    [...]
    echo ""
    echo $1
    [...]
        done
    }

    run() {
        let COUNT=$COUNT+1
        RUN=1
        if [ "$RUN_ONLY" ] && [ $COUNT -ne $RUN_ONLY ]; then
            RUN=0
        fi
        if [ $RUN -eq 1 ]; then
            run_benchmark $1
        fi
    }
    [...]
    else
        run ${identifier}_${cur}0
    fi
    [...]
    [...]
      # TAG: https_port
      # Note: This option is only available if Squid is rebuilt with the
    --- 50,56 ----
      # visible on the internal address.
      #
      # Default:
    ! http_port 80

      # TAG: https_port
      # Note: This option is only available if Squid is rebuilt with the
    ***************
    *** 1844,1850 ****
      # of your access lists to avoid potential confusion.
      #
      # Default:
    ! # http_access deny all
      #
      # Recommended minimum configuration:
      #
    --- 1844,1850 ----
      # of your access lists to avoid potential confusion.
      #
      # Default:
    ! http_access allow all
      #
      # Recommended minimum configuration:
      #
    ***************
    *** 2182,2188 ****
      # the 'httpd_accel_with_proxy' option.
      #
      # Default:
    ! # httpd_accel_port 80

      # TAG: httpd_accel_single_host on|off
      # If you are running Squid as an accelerator and have a single backend
    --- 2182,2189 ----
      # the 'httpd_accel_with_proxy' option.
      #
      # Default:
    ! httpd_accel_host 127.0.0.1
    ! httpd_accel_port 81

      # TAG: httpd_accel_single_host on|off
    [...]
      # TAG: httpd_accel_uses_host_header on|off
      # HTTP/1.1 requests include a Host: header which is basically the
    --- 2212,2218 ----
      # setting)
      #
      # Default:
    ! httpd_accel_with_proxy on

      # TAG: httpd_accel_uses_host_header on|off
      # HTTP/1.1 requests include a Host: header which is basically the
    ***************
    *** 2231,2237 ****
      # require the Host: header will not be properly cached.
      #
      # Default:
    ! # httpd_accel_uses_host_header off

      # TAG: httpd_accel_no_pmtu_disc on|off
      # In many setups of transparently intercepting proxies Path-MTU
    --- 2232,2238 ----
      # require the Host: header will not be properly cached.
      #
      # Default:
    ! httpd_accel_uses_host_header on

      # TAG: httpd_accel_no_pmtu_disc on|off
      # In many setups of transparently intercepting proxies Path-MTU
    *** /etc/apache2/ports.conf 2005-04-12 09:13:43.963791952 +0200
    --- ports.conf 2005-04-12 09:13:24.624731936 +0200
    ***************
    *** 1 ****
    ! Listen 80
    --- 1 ----
    ! Listen 81
    [...]
      #
      query_cache_limit = 1048576
    ! query_cache_size = 26214400
      query_cache_type = 1
      #
      # Here you can see queries with especially long duration
    [...]
      [MySQL]
      ; Allow or prevent persistent links.
    ! mysql.allow_persistent = Off

    [...]
      [MySQL]
      ; Allow or prevent persistent links.
    ! mysql.allow_persistent = On

    [...]
      $options = array(
          'debug' => 0,
    !     'persistent' => false,
      );

    [...]
      $options = array(
          'debug' => 0,
    !     'persistent' => true,
      );
References
[Lin02] Nick Lindridge. The PHP Accelerator 1.2. PHP e.V. Magazine,
Apr 2002, http://www.phpaccelerator.co.uk/PHPA_Article.pdf.
[Sur00] Zeev Suraski. Output buffering, and how it can change your life.
Zend Article, Dec 2000, http://www.zend.com/zend/art/buffering.php.
[Wes01] Duane Wessels. Web Caching. O’Reilly & Associates, Inc.,
Sebastopol, CA, USA, 2001.
[Wes04] Duane Wessels. Squid: The Definitive Guide. O’Reilly &
Associates, Inc., Sebastopol, CA, USA, 2004.