
29/5/2014

A Closer Look Into PHP Arrays: What You Dont See | Sherif's Tech Blog


A Closer Look Into PHP Arrays: What You Don't See
by GoogleGuy | posted: October 29, 2012


PHP is a unique language in that its array data type has been highly generalized to suit a very broad set of use cases. For example, in PHP you can use an array to create both ordered lists and dicts (key/value pairs or maps) with a single data type. A PHP array isn't an array in the traditional sense; it's actually implemented as an ordered hashmap. There are good reasons for this. One of those reasons is that arrays traditionally do not allow you to mix types. They also don't normally provide a simple means of random access by key, such as mapping a key to its value, at least not in the sense that we're used to in PHP. So I'm going to share with you some of the underlying details of how the PHP array data type works, why it works the way that it does, how it's different from other languages, and what behaviors the PHP array has that you may not be fully aware of.
To start off with a basic example, you can do the following in PHP:

$array = array();
$array[12] = 1;
$array[1] = 2;
$array[17] = 3;
foreach ($array as $num) {
    echo "$num\n";
}

This outputs the following

1
2
3

As you can see, despite the numbering of the keys in our array, the elements of the array remain in the same order we defined them.
You can't do the same thing in a language like Python.

array = []
array[12] = 1

We would get an index error:

IndexError: list assignment index out of range

You also can't do this in a language like C because in languages like these arrays are not made up of keys, but offsets. These offsets are sequential, so you cannot have an array of three elements whose offsets start at 12, followed by 1, and end with 17. In PHP, however, these are not offsets at all. They are, instead, referred to as keys. They map to a value, and the keys themselves do not dictate order (as opposed to offsets, which do conform to order).
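To make the distinction between keys and order concrete, here is a small sketch (the variable names are only illustrative). It shows that iteration follows insertion order, and that ordering by key is a separate, explicit step.

```php
<?php
$array = array();
$array[12] = 1;
$array[1]  = 2;
$array[17] = 3;

// Keys come back in the order they were inserted, not in numeric order.
$insertionOrder = array_keys($array);

// Ordering by key only happens if we ask for it.
$sorted = $array;
ksort($sorted);
$keyOrder = array_keys($sorted);

echo implode(', ', $insertionOrder), "\n"; // prints "12, 1, 17"
echo implode(', ', $keyOrder), "\n";       // prints "1, 12, 17"
```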

So What Are Arrays Exactly?


In order to elaborate on some of the internal workings of a PHP array, we'll first need to get a general understanding of what arrays really are and how they're seen at a very low level. I'll use C arrays to demonstrate this general understanding of what arrays are and what they look like.
In C an array is quite simple. It's just a designated block of memory that is divided up equally into pieces, where every piece holds a value of the same type. This is sometimes referred to as chunk memory. So for example, in C an int is a primitive data type that typically occupies 4 bytes (the exact size depends on the platform). That means in order to store one integer variable you would need at least that many bytes. If you wanted to store an array of 4 integers using a single variable you would need an integer array of size 4. This means we would normally get a block of memory that's 4 * 4 bytes wide (16 bytes total), where the variable then acts as a pointer to the first integer in our array. If you don't know what a pointer is, don't worry. It's not incredibly important for the purposes of our discussion, but think of a pointer as something that keeps track of which memory address we need to go to in order to find the data we're looking for. Keep in mind that memory is divided up into pieces and assigned addresses just like a neighborhood is divided up into blocks and each home is given an address (a street and a number). The same thing happens with the memory in a computer.
Now, we can access each integer in our integer array using an offset, where the first integer sits at offset 0 and the last integer sits at offset 3. The way this works is that the offset is multiplied by the size of the element type (in our case that's 4 because an integer is made up of 4 bytes) and then added to the address assigned to our variable (the pointer to the first element in the array). This allows us to seek to any integer in the array simply by indexing the variable with the desired offset in order to dereference the value we need from the block of memory where all the integers are stored.
Here's an example of this in C.
#include <stdio.h>

int main(void) {
    /* This initializes an integer array of size 4 */
    int array[4] = { 1, 2, 3, 4 };
    printf("%d\n", array[0]); /* Prints 1 */
    printf("%d\n", array[3]); /* Prints 4 */
    return 0;
}
Here the variable array is first declared with a size of 4 and then initialized with 4 different integers. Each integer is stored at its designated offset in order, starting from offset 0 all the way through offset 3. So if we picture this array as one contiguous 16-byte block of memory that starts at address 0xd4e3c8f and ends just before address 0xd4e3c9f, then we can say that the variable array is a pointer to the address 0xd4e3c8f, which is the first element in our array. That means in order for us to get array[0] we would do (0xd4e3c8f + (0 * 4)), which is really just (0xd4e3c8f + (0)). To get the second element in our array we do array[1], which is similar to (0xd4e3c8f + (1 * 4)), which equals 0xd4e3c93, and that's the address of the second integer in our array.
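The address arithmetic can be sketched for all four offsets at once. This is only an illustration of the formula (base + offset * element size), using the made-up base address from the text and assuming a 4-byte int; it does not touch real memory.

```php
<?php
// Illustrative values only: the base address comes from the example in the
// text, and 4 is the assumed size of an int in bytes.
$base = 0xd4e3c8f;
$elementSize = 4;

$addresses = array();
for ($offset = 0; $offset < 4; $offset++) {
    // address = base + (offset * element size)
    $addresses[$offset] = $base + ($offset * $elementSize);
    printf("array[%d] -> 0x%x\n", $offset, $addresses[$offset]);
}
// array[1] comes out as 0xd4e3c93, matching the calculation in the text.
```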

The above diagram illustrates what the array would look like in memory. The individual blocks (in purple) depict the bytes in memory with their designated starting addresses, and the offset depicts where each integer is stored (in blue). So as you can see, the entire 16-byte block of memory is evenly divided into four 4-byte pieces, one for each of our 4 integers, and now it's really simple to understand this array.

How Are PHP Arrays Different?


PHP arrays are very different from the simplistic concept we examined above. They are far more complex than just a contiguous block of memory that stores a single data type. PHP arrays map an integer or string key to a value of any PHP type. They also maintain order. Additionally, they use a hash in order to provide random access to their elements by their corresponding keys. This makes them ordered hashmaps. Let's see exactly what that means if you're not familiar with hashmaps in general.
A PHP array is first made up of a hashtable. That hashtable is simply a container of information about the array. To put it more precisely, it is a C struct that tells us the array size, the first element of the array, the last element of the array, the internal array pointer's position, and the next free element in the array (along with some other internal metadata we won't get into). The hashtable also stores the address in memory of the array of buckets that belong to the array. A bucket is another container that stores information about each element in the array, including its key, which value that key maps to, and some other internal metadata such as the hashed value of the key and whether any other elements share the same hashed key value. The value that any key points to is made up of a ZVAL, which is yet another container for a PHP variable. That container stores the necessary metadata that tells PHP where to find the value we need for that variable. So as you can see, we've already peeled away at least three layers of the PHP array. It's quite a complex beast and it takes on a lot of overhead. PHP arrays sacrifice memory for speed, however. You can read about how big PHP arrays are on nikic's blog, where he does a fine job of revealing all the gory details.

Differences From Other Languages

The main difference is that PHP defines arrays in a way that makes them generalized enough to suit all the major use cases instead of having multiple types. For example, in Python you have lists and you also have dicts. Not to mention you also have tuples on top of that, which a lot of people will couple with a list in order to achieve something similar to what a PHP array represents. PHP, however, only has a single type called array that behaves more like a dict would in Python, but still shares many of the characteristics of arrays in other languages. It's a hybrid of both, really.
To give you an example, in Python we still don't quite get the same behavior as we did in the PHP example earlier if we use a dict.
# The output shown below is from Python 2 (CPython)
d = {}
d[12] = 1
d[1] = 2
d[17] = 3
for key in d:
    print(d[key])

As you can see, the results aren't what you might expect:

2
1
3

At the time of writing, the order of elements in a dict was not guaranteed by the language (Python 3.7 and later do preserve insertion order). It may come out unordered or it may come out ordered. Often people will use a list of tuples in Python to maintain order, and while that may result in a similar effect to what a PHP array can do, it's still not quite the same.

How Do PHP Arrays Work?


To give you an idea of just how PHP arrays work the way they do, let's explore a very simple example in PHP and then we can break down exactly what's going on internally that makes this possible.

$array = array(
    4     => 1,
    'foo' => 'bar',
    -16   => true,
    'baz'
);
echo $array[-16]; // prints 1
Here, we've initialized an array of 4 elements. We have 1 integer, 2 strings, and 1 boolean element in the array. Every element in a PHP array is associative. There is no such thing as chunk memory in PHP arrays. That means every element has a key whether we assign it one or not. Notice we only assigned keys to 3 of our elements, yet if we look at the output of var_dump($array) here we will see that all 4 elements do indeed have a key.

var_dump($array);

array(4) {
[4]=>
int(1)
["foo"]=>
string(3) "bar"
[-16]=>
bool(true)
[5]=>
string(3) "baz"
}

Notice that the last element has a key of 5 even though we never assigned it one in the initialization of the array. Why did PHP choose 5 and not any other number? The answer lies in the hashtable!
/* Lines 66 - 82 of Zend/zend_hash.h */
typedef struct _hashtable {
    uint nTableSize;
    uint nTableMask;
    uint nNumOfElements;
    ulong nNextFreeElement;
    Bucket *pInternalPointer; /* Used for element traversal */
    Bucket *pListHead;
    Bucket *pListTail;
    Bucket **arBuckets;
    dtor_func_t pDestructor;
    zend_bool persistent;
    unsigned char nApplyCount;
    zend_bool bApplyProtection;
#if ZEND_DEBUG
    int inconsistent;
#endif
} HashTable;
Take notice of the nNextFreeElement member (line 6 in the above code). This member of the struct stores an unsigned long containing the next integer key to use when we append to this array. It starts at 0 and only gets modified when we add an element whose integer key is greater than or equal to its current value. We assigned the first element in our array the integer key 4. At that time the nNextFreeElement member of the HashTable struct was set to 4 + 1, giving us a new next free element of 5. So the next time we append an element to this array without supplying a key, PHP uses 5 as the key for the new element and increments nNextFreeElement by one again. That way we should always have a new unique key ready for any elements we append to our array.
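The effect of this counter can be observed from userland without looking at the struct at all. The following is a minimal sketch; the comments describe what the counter is presumed to hold at each step, which we can only infer from the keys PHP hands out.

```php
<?php
$array = array(4 => 1);   // explicit key 4, so the next free key becomes 5
$array[]   = 'a';         // appended with key 5
$array[10] = 'b';         // explicit key 10, so the next free key becomes 11
$array[]   = 'c';         // appended with key 11
$array[2]  = 'd';         // 2 is below the counter, so the counter is unchanged
$array[]   = 'e';         // appended with key 12

echo implode(', ', array_keys($array)), "\n"; // prints "4, 5, 10, 11, 2, 12"
```

Note that the keys are listed in insertion order, so the explicitly assigned key 2 sits between 11 and 12.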

The PHP Array Structure


Here is a diagram illustrating what this PHP array (from the example above) would look like internally to PHP.

As you can see, this is quite a complex structure despite our data appearing very simple (just by looking at our PHP array). There's also a lot here that I intentionally left out for simplicity. However, this is also what makes PHP's arrays very flexible. We just mixed both integer and string keys along with strings, ints, and bools, all in the same array and with remarkable ease. To do the same in a language like C, on the other hand, you would have to put in quite a bit more effort than the simple statement we used to initialize our array here in PHP.
In spite of the remarkable ease with which PHP arrays make compound data structures a breeze, there is an inherent flaw in their design. Not to fret, though. It's a flaw that comes with a trade-off. If you look at the diagram above, we have 4 elements in our array, but the C Bucket Array, which is the chunk memory array we described in the very first part of this article, only contains two elements and the rest are empty. Notice that the two elements are Bucket1 and Bucket3 and that they do not begin from offset 0 of our array. This is a result of hash collisions, which are a side effect of every hashing function.
A collision means that when we hashed two or more of the keys in our PHP array, they ended up resulting in the same hash. Because of this collision we end up with two or more buckets stored in the same place in our C Bucket Array (in orange). The buckets (in purple) then become a doubly linked list. Notice that Bucket1 has a Next member that points to Bucket2. Conversely, Bucket2 has a Last member that points back to Bucket1. So when a key in our PHP array produces a hash collision, we simply traverse the doubly linked list of buckets until we find the key that matches the one we're looking for. See that each of the buckets has a Key member that stores the actual key we used in our PHP array. Believe it or not, these collisions happen quite frequently, and the smaller the array the more likely it is that a collision will occur. These collisions have an adverse performance impact since they cause PHP to traverse the linked list of buckets in order to find the specific bucket we need each time. That means the cost could be as great as ((n - 1) * (n - 2) / 2) or less.
It is entirely possible to have 100% hash collision in a PHP array, and it's a lot simpler than you think. PHP sees array keys as one of two things: either it's an int or it's a string. If it's an int, producing 100% collision is a rather trivial task. You simply round the size of the array up to the nearest power of 2 and produce keys that increment in multiples of that size until you've filled the array. At this stage you have 100% hash collision, meaning you've hit the above cost, which is the worst possible scenario. To give you an idea, filling a ~65K (2^16) element array with 100% collision can take up to ~30 seconds in PHP (that's a potential denial-of-service vector).
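The recipe for colliding integer keys can be sketched in a few lines. This is only an illustration and rests on one assumption that the recipe itself implies: that an integer key selects its slot directly via (key & mask), where the mask is the table size minus one. The variable names are made up for the example.

```php
<?php
// Assumption: an integer key selects its slot via (key & mask).
$tableSize = 8;               // a power of two
$mask = $tableSize - 1;

$keys = array();
$slots = array();
for ($i = 0; $i < $tableSize; $i++) {
    $key = $i * $tableSize;   // multiples of the table size
    $keys[] = $key;
    $slots[] = $key & $mask;  // the low bits of every such key are zero
}

echo implode(', ', $keys), "\n";   // prints "0, 8, 16, 24, 32, 40, 48, 56"
echo implode(', ', $slots), "\n";  // prints "0, 0, 0, 0, 0, 0, 0, 0"
```

Every key ends up in the same slot, so under this assumption each new insert has to walk past all the buckets that came before it.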


If you're curious about what hashing function PHP uses to hash array keys, it's DJBX33A, and it's not just used for arrays. It's actually used everywhere throughout PHP. This is a very simple hash that was chosen because it's fast. It's not a cryptographically secure hash and it was never meant to be. If we were to write an implementation of this hashing function in PHP, it would look similar to the following:
/* DJBX33A Hash function implemented in PHP */
function DJBX33A($key) {
    $hash = 5381;
    if (is_int($key)) {
        $key = pack("I*", $key);
    }
    for ($i = 0, $c = strlen($key); $i < $c; $i++) {
        $hash = (($hash << 5) + $hash) + ord($key[$i]);
    }
    return $hash;
}
So if we used the above function to get the hash for each of the keys we used in our PHP array earlier, we'd see they come out to the same Hash numbers in our Buckets in the diagram above. So the question is, how do we find these buckets in our C Bucket Array?
The answer is by using what's called a hash table mask. The mask is simply the size of the hash table minus one. Every PHP array starts off with room for 8 elements and doubles every time the number of elements exceeds the size of the table. So in our example our mask is 7. We simply take the Hash produced by the DJBX33A hash function we demonstrated above and apply a bitwise AND with the mask to get its offset in the C Bucket Array. You could also just use the Hash MOD the size of the table, but a bitwise AND is cheaper than a modulus, which is the reason we use it.
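Here is a small sketch of the two ideas in that paragraph: the table size that starts at 8 and doubles, and the equivalence of AND-with-mask and MOD for power-of-two sizes. The helper table_size_for() is hypothetical and only mirrors the description above; it is not an engine function.

```php
<?php
// Hypothetical helper: the table size needed for $n elements, starting at 8
// and doubling whenever the element count exceeds the current size.
function table_size_for($n) {
    $size = 8;
    while ($n > $size) {
        $size *= 2;
    }
    return $size;
}

$size = table_size_for(4);   // our 4-element array fits in the initial table
$mask = $size - 1;           // 7, which is 111 in binary

// For a power-of-two size and a non-negative hash, AND and MOD agree.
foreach (array(5, 8, 193, 177813) as $hash) {
    printf("%d & %d = %d, %d %% %d = %d\n",
        $hash, $mask, $hash & $mask, $hash, $size, $hash % $size);
}
```

The two operations only agree because the size is a power of two, which is why the table is grown by doubling rather than by some arbitrary amount.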
So for example, here's how we got the offset for the key -16 in our PHP array.

echo DJBX33A(-16) & 7;

We get

2

And there you have it! Now we can get the pointer to the first bucket at offset 2 of our C Bucket Array, which points to Bucket3. Once we get to Bucket3 we simply check the Key member of the bucket to make sure it's the element we're looking for, and if it's not we follow the Next member to get the next bucket and keep going until we find the element we need. In our case Bucket3 does indeed have a Key member of -16, which is exactly what we want.
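The whole lookup (hash the key, mask it to find a slot, then walk the chain comparing keys) can be modeled with a toy table in plain PHP. This is a sketch, not the engine's implementation: toy_hash(), toy_insert(), and toy_find() are hypothetical names, the chain is a plain PHP list rather than a doubly linked list, and the hash follows the DJBX33A sketch from this article, so it assumes 64-bit PHP and short keys.

```php
<?php
// Toy hash following the article's DJBX33A sketch (short keys, 64-bit PHP).
function toy_hash($key) {
    $hash = 5381;
    if (is_int($key)) {
        $key = pack("I*", $key);
    }
    for ($i = 0, $c = strlen($key); $i < $c; $i++) {
        $hash = (($hash << 5) + $hash) + ord($key[$i]);
    }
    return $hash;
}

// Store a bucket in the chain that belongs to the key's slot.
function toy_insert(array &$slots, $mask, $key, $value) {
    $slot = toy_hash($key) & $mask;
    $slots[$slot][] = array('key' => $key, 'value' => $value);
    return $slot;
}

// Find the slot, then walk its chain until the stored key matches.
function toy_find(array $slots, $mask, $key) {
    $slot = toy_hash($key) & $mask;
    if (!isset($slots[$slot])) {
        return null;
    }
    foreach ($slots[$slot] as $bucket) {
        if ($bucket['key'] === $key) {
            return $bucket['value'];
        }
    }
    return null;
}

$slots = array();
$mask = 7;
foreach (array(4 => 1, 'foo' => 'bar', -16 => true, 5 => 'baz') as $k => $v) {
    toy_insert($slots, $mask, $k, $v);
}

var_dump(toy_find($slots, $mask, -16));   // bool(true)
var_dump(toy_find($slots, $mask, 'foo')); // string(3) "bar"
```

With a mask of 7, the four keys from our example land in only two slots of the toy table (slots 1 and 2), which lines up with the two occupied entries described for the C Bucket Array.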

Iterating Arrays
There are quite a few misconceptions when it comes to traversing a PHP array and what is or is not the fastest, slowest, or most efficient means of traversal. I'm going to do my very best to help debunk any myths or misconceptions you may have heard about such processes. The key thing to remember here is that for the majority of use cases no micro-optimizations are necessary, since most of the time each of the methods described here should work just fine for the bulk of the PHP user base.
First, I'd like to start by debunking, once and for all, the myth that a foreach loop is faster than a for loop. If you want the tl;dr version, it's that for loops were faster than foreach loops in every scenario I tested. However, this does not mean that one should choose a for loop over a foreach loop to iterate arrays strictly based on the performance factor. Let's examine the details a bit more closely to understand why.
Here I ran a benchmark against both a foreach loop and a for loop using the same array on both the PHP 5.3 and 5.4 release branches. The first row shows a test where all we did was iterate over the array with no statements in the body of the loop. The second row shows what happens when we make modifications to the array from within each loop.

                      PHP 5.3                              PHP 5.4
                      foreach loop      for loop           foreach loop      for loop
Empty loop body       0.025086 seconds  0.012185 seconds   0.007306 seconds  0.004201 seconds
Modifying the array   0.139499 seconds  0.027206 seconds   0.048462 seconds  0.011421 seconds

Here the tests were conducted on an array of 100,000 elements. After discarding the first and last test samples, the times were averaged out over the number of tests run. All tests were done on release branches PHP 5.3.10 and PHP 5.4.5, respectively.
In the first test foreach doesn't seem to be too far behind. In the second test we start to notice some bigger losses. Since in the foreach loop we are accessing the array directly by key, just as in the for loop, there should be no significant difference in how the element is accessed. So you would think that this means they should at the very least both perform equally. However, the actual performance loss in the foreach scenario has nothing to do with how we access the array for modification, but more to do with the fact that the foreach construct works with a copy of the array and not the original array. So this means we invoke COW (Copy On Write) behavior in the scenario where we write to the array in the foreach loop.
If you aren't familiar with Copy-On-Write behavior, let me give you a brief demonstration of how it works. Copy-On-Write just means the PHP runtime engine makes an optimization on our behalf in order to conserve as much memory as possible. Take the following example:

$str1 = "Hello World";
$str2 = $str1;

Here PHP only keeps one copy of the string "Hello World" even though two different variables are using the same value. This saves us some memory since the engine is already smart enough to do the right thing. Now what happens when we modify $str2?

$str2 .= "!";

Now PHP realizes it must stop sharing the value (breaking the shared refcount) in order to make the string $str2 uses different from the one $str1 uses. This causes PHP to only copy the string when we have actually written to the variable. Hence Copy On Write!
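The same deferred copy can be observed with an array by watching memory_get_usage() around the assignment and the first write. This is a rough sketch: the exact byte counts depend on the PHP version and build, so the example prints them without assuming particular values.

```php
<?php
$a = range(1, 100000);

$before = memory_get_usage();
$b = $a;                          // no copy yet: both names share one array
$afterAssign = memory_get_usage();
$b[0] = 42;                       // first write: now the array is duplicated
$afterWrite = memory_get_usage();

printf("cost of the assignment:  %d bytes\n", $afterAssign - $before);
printf("cost of the first write: %d bytes\n", $afterWrite - $afterAssign);

var_dump($a[0]); // int(1): the original array is untouched
```

The assignment itself should cost next to nothing, while the first write pays for a full copy of the array.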
Getting back to our benchmark, however, since the array we're looping over is pretty big, this means we're going to be copying a lot of memory. It's this copying of memory that actually causes the performance loss. Now, you might wonder, "so then why is the foreach loop slower when we're not modifying the array?" I'm going to address that question, in full detail, a little further ahead.
This is what the foreach test looks like when we attempt to modify the array inside the loop.
$array = range(1,100000);
$start = microtime(true);
foreach ($array as $key => $value) {
    $array[$key] += 1; // Invokes COW
}
$end = microtime(true);
$time = $end - $start;
printf("Completed in %.6f seconds\n", $time);

This is what the for-loop test looks like when we attempt to modify the array inside the loop.

$array = range(1,100000);
$start = microtime(true);
for ($i = 0, $c = count($array); $i < $c; $i++) {
    $array[$i] += 1;
}
$end = microtime(true);
$time = $end - $start;
printf("Completed in %.6f seconds\n", $time);
So, one solution to this performance problem with foreach loops, for when you want to make modifications to the array from inside the loop (and since they are actually more convenient to use in most cases), is to use a reference. References also solve the double memory problem. Since we invoke COW when we break the refcount of the ZVAL, we've now doubled the amount of memory necessary to iterate over the loop. With a reference there is hardly any extra memory in use.
The following chart demonstrates the differences in memory consumption between using a foreach loop where copy-on-write is invoked, a foreach loop where references are used instead, and a for loop where the array is modified directly. All tests were done using 1,000 element arrays.

[Chart: "Using 1,000 Element Arrays", comparing foreach (COW), foreach with references, and for loop; axis ticks run from 350.000 to 550.000]

$array = range(1,100000);
$start = microtime(true);
foreach ($array as $key => &$value) {
    $value += 1;
}
$end = microtime(true);
unset($value); // make sure you destroy the reference
$time = $end - $start;
printf("Completed in %.6f seconds\n", $time);
Pay special attention to line 7, where we unset the variable we used as a reference in order to destroy the reference. Otherwise, you could run into some unexpected behavior if you continue to use the same variable later on.
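To see the kind of unexpected behavior this refers to, here is a minimal sketch of what can happen when the reference is left in place and the same variable name is reused in a second loop.

```php
<?php
$array = array(1, 2, 3);

foreach ($array as &$value) {
    // Nothing to do here; we only care that $value is still a reference
    // to the last element once the loop has finished.
}
// unset($value); // left out on purpose to show the problem

foreach ($array as $value) {
    // Each assignment to $value now writes into the last element.
}

echo implode(', ', $array), "\n"; // prints "1, 2, 2"
```

With the unset($value) call restored between the two loops, the array comes out unchanged.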
Here are the results of the benchmark using a foreach loop with a reference in order to modify the array during the loop. Compare that to the use of a for loop to do the same thing and they are actually quite comparable this time. In the tests where foreach invoked COW behavior, we saw the foreach loop take roughly 4 times as long as the for loop in PHP 5.4 and more than 5 times as long in PHP 5.3 (there have been significant optimizations in the PHP engine as of 5.4 that account for the dramatic overall increase in performance). Here we've closed the gap quite a bit and can hardly see any real performance difference.

PHP 5.3                              PHP 5.4
foreach loop      for loop           foreach loop      for loop
0.034114 seconds  0.027206 seconds   0.011769 seconds  0.011421 seconds

Note that by using a for loop to iterate over the array, where you make modifications to the array from inside the loop, you stand to seriously break your loop if you happen to append or remove elements. With foreach, however, you do not have the same problem since you're only iterating over a copy. Any modifications made to the original array do not affect your copy and the loop remains intact. This key difference may actually make a foreach loop far more desirable to a developer in most scenarios than a for loop, despite any performance differences that may or may not arise. Also, consider that unless you're working with incredibly enormous arrays, the performance gains are hardly worth the extra code given what we've seen from these benchmarks. In my own personal opinion, foreach loops afford you so much more convenience in many scenarios.
For example, a foreach loop automatically resets the internal array pointer for you before it begins iteration. This ensures that we always start at the beginning of the array. It also keeps a separate copy of the internal pointer in order to prevent you from breaking the loop by moving the pointer yourself with calls to next(), prev(), or reset(). For example:
$array = array(1,2,3,4);
echo 'key($array): ' . key($array) . "\n";
/* let's move the pointer */
echo 'next($array): ' . next($array) . "\n";

foreach ($array as $value) { // foreach reset it for us
    echo "$value\n";
    /* notice foreach doesn't care about this pointer */
    if (!next($array)) reset($array); // let's keep moving the pointer
    echo 'key($array): ' . key($array) . "\n";
}
echo 'key($array): ' . key($array) . "\n";
We get

key($array): 0
next($array): 2
1
key($array): 2
2
key($array): 3
3
key($array): 0
4
key($array): 1
key($array): 1

Notice the foreach loop continues to work just fine. The same holds even if we destroy the array variable inside the loop.

$array = array(1,2,3,4);
foreach ($array as $value) {
    /* This should be pretty obvious */
    unset($array);
    echo "$value\n";
    var_dump(isset($array));
}
Look Ma no arrays!

1
bool(false)
2
bool(false)
3
bool(false)
4
bool(false)
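The earlier point, that changes made to the array inside a foreach do not disturb the loop itself, can also be sketched with appends instead of unset(). The variable names are only illustrative.

```php
<?php
$array = array(1, 2, 3);
$seen  = array();

foreach ($array as $value) {
    $array[] = $value * 10;   // grow the original array while looping
    $seen[]  = $value;        // record what the loop actually visits
}

echo implode(', ', $seen), "\n";   // prints "1, 2, 3"
echo implode(', ', $array), "\n";  // prints "1, 2, 3, 10, 20, 30"
```

The loop visits only the three elements that existed when it started. A for loop that re-evaluated count($array) in its condition while doing the same appends would never terminate.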

A Low-Level Analysis of foreach


So, I promised to address why foreach loops were still slightly slower than for loops even when we didn't make any modifications to the array (that is, without invoking COW). The answer lies in the guts of the PHP engine. It reveals itself to us when we look at the opcodes generated by the foreach construct that allow us to iterate over the array.
Your PHP script is run in two phases. The first phase is the parsing phase, where the interpreter reads your PHP code, tokenizes (lexes) it, and then parses it. The second phase is the compilation and execution phase, where the interpreter compiles your PHP code down into bytecode (called opcodes) and then executes it. During the execution phase you can hook into the Zend engine and ask it to give you the opcodes as they are generated/executed.
Here's the code we used:
<?php
$array = array(1,2,3,4);
foreach ($array as $key => $value) {
    echo "$key => $value\n";
}
?>

Here's what the opcodes for the foreach loop would look like:

Line  OPCODE             Return
0     INIT_ARRAY         IS_TMP_VAR ~0
1     ADD_ARRAY_ELEMENT  IS_TMP_VAR ~0
2     ADD_ARRAY_ELEMENT  IS_TMP_VAR ~0
3     ADD_ARRAY_ELEMENT  IS_TMP_VAR ~0
4     ASSIGN
5     FE_RESET           IS_VAR $2
6     FE_FETCH           IS_VAR $3
7     ZEND_OP_DATA       IS_TMP_VAR ~5
8     ASSIGN
9     ASSIGN
10    ADD_VAR            IS_TMP_VAR ~7
11    ADD_STRING         IS_TMP_VAR ~7
12    ADD_VAR            IS_TMP_VAR ~7
13    ADD_CHAR           IS_TMP_VAR ~7
14    ECHO
15    JMP
16    SWITCH_FREE
17    RETURN             IS_UNUSED

The key to the extra performance loss in foreach is what's happening with opcodes 6 through 9 in the above table, which all take place in the foreach construct upon every iteration. PHP has to fetch the data from the iterator and then assign it to the variables in our construct with every pass. With the 100,000-element array from our benchmark, that means we're doing this 100,000 times. Those are hundreds of thousands of extra opcodes that wouldn't be executed in our for loop. However, do not panic! PHP executes opcodes very, very fast. As you can see in our benchmark, it only takes about 7 milliseconds (that's 7 / 1000 seconds) to complete the entire loop. Granted, that's still about 3 milliseconds slower than our for loop test, but completely unnoticeable to you. If your PHP code really did have any serious performance problems, this wouldn't likely be a major one to focus on.

Arrays Within Arrays


Using multidimensional arrays is possible in PHP because, as we've seen earlier, an array is just a hashtable, right? Well, when I said I was simplifying in my diagram earlier, which demonstrated how the PHP array structure looked, I wasn't lying. To give you an even more elaborate picture (yet I'm still simplifying), let's take a look at a multidimensional array.
$array = array(
    4        => 1,
    'foo'    => 'bar',
    -16      => true,
    'baz',
    'array2' => array(
        "PHP",
        "Arrays"
    )
);

var_dump($array);
Alright, so all I did here was take our previous array and add another array to it. The above code shows us that we have a multidimensional array:

array(5) {
[4]=>
int(1)
["foo"]=>
string(3) "bar"
[-16]=>
bool(true)
[5]=>
string(3) "baz"
["array2"]=>
array(2) {
[0]=>
string(3) "PHP"
[1]=>
string(6) "Arrays"
}
}

Now just imagine what this looks like when I present it to you in a diagram similar to our first array:
Here Ive factored in how each portion of this array is


broken up into different units of memory and how they are
all related. As you can see we start at the very top with the
variable that we just defined $array, which is in blue. The
variable points to a ZVAL, which is in red (actually the
variable name is compiled out into a hashtable that points
to a ZVAL but again Im trying to simplify). The ZVAL
points to a HashTable, which is in light-blue. The
HashTable points to a BucketArray, which is in orange.
The BucketArray allows us to get to a whole bunch of
Buckets, which are in purple. The Buckets themselves
point to other ZVALs. Notice that I singled out strings in
green for the ZVALs that point to strings. The reason for
this is because the memory for the string itself can be
allocated separately from the ZVAL.
Here's one thing you should notice right away by looking at this picture. You can't follow a connection from any element of $array["array2"] back to $array directly. The reason is that big light-blue HashTable that gets in your way. Remember, the HashTable leads us to the BucketArray, which leads us to the Buckets, but there's no way to get to the Buckets without the HashTable. This is also what makes references inside arrays behave a little differently from references anywhere else.
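In userland terms, a nested array carries nothing that leads back to the array containing it, so any walk has to start at the top and work down. The sketch below is only an illustration of that one-way direction; find_path() is a hypothetical helper name, not a PHP built-in.

```php
<?php
// Hypothetical helper: search top-down for $needle and return the list of
// keys leading to it, or null if it is not found.
function find_path(array $haystack, $needle, array $path = array()) {
    foreach ($haystack as $key => $value) {
        $current = array_merge($path, array($key));
        if ($value === $needle) {
            return $current;
        }
        if (is_array($value)) {
            $found = find_path($value, $needle, $current);
            if ($found !== null) {
                return $found;
            }
        }
    }
    return null;
}

$array = array(
    4        => 1,
    'foo'    => 'bar',
    -16      => true,
    'baz',
    'array2' => array("PHP", "Arrays"),
);

// Starting from the outer array we can reach the inner element...
$path = find_path($array, "Arrays");
echo implode(' -> ', $path), "\n"; // array2 -> 1

// ...but starting from the inner array nothing leads back out to 'bar'.
$inner = $array['array2'];
var_dump(find_path($inner, 'bar')); // NULL
```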
Take the following example where we create a reference to
a string value and try to modify a copy of the variable in a
function.
$string = "Hello World!";

function modify_string($string) {
    $string = "Hello PHP!";
}

$string_reference = &$string; // Creates a reference
modify_string($string_reference);
var_dump($string_reference);
As expected, the original string is unchanged:

string(12) "Hello World!"

However, let's see what happens when we try the same thing with an array.
$string = "Hello World!";

function modify_array(array $array) {
    $array[0] = "Hello PHP!";
}

$array[0] = &$string;
modify_array($array);
var_dump($array); // WTF?

array(1) {
  [0]=>
  &string(10) "Hello PHP!"
}

If you don't understand what's happening here, I'll refer you to the diagram below (perhaps that will give you a clearer picture of what's going on). Remember that each element in the array is represented by a Bucket that points to a ZVAL. Here this Bucket just happens to point to a ZVAL that's also being used by the variable $string (that's what happens when we assign something by reference), which ultimately points to our string. So if we change one or the other ($array[0] or $string) we end up changing the same string, since they both take us to the same ZVAL, which means there's only one string. Now, as you can imagine, this might seem weird, but I assure you it's perfectly intended behavior. You might even wonder how on earth we managed to break the local scope, but
that's the problem with references. They do not abide by scope. They follow the ZVAL no matter which scope it may be in.
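A quick way to convince yourself that $string and $array[0] really do share one ZVAL is to write through each of them in turn and read the other. A minimal sketch (the replacement strings are made up for illustration):

```php
<?php
$string = "Hello World!";

$array = array();
$array[0] = &$string;   // the Bucket and $string now share one ZVAL

$array[0] = "changed through the array";
var_dump($string);      // string(25) "changed through the array"

$string = "changed through the variable";
var_dump($array[0]);    // string(28) "changed through the variable"

// Copying the array copies the Bucket, but the copied Bucket still leads
// to the same shared ZVAL, so the write is visible through $string too.
$copy = $array;
$copy[0] = "changed through a copy";
var_dump($string);
```

The last step is the same thing that happens when the array is passed by value into a function: the copy is real, yet element 0 of the copy still leads to the shared string.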

Note that here we do not use pass-by-reference, we do not return anything from the function (by reference or otherwise), and we do not use call-time pass-by-reference either. None of those behaviors are at play here. What's really happening is that the array makes it possible for references to travel with it even when it's copied, which is something you might not have expected. To prove that this variable is indeed copied into the function's local scope and not passed by reference, we can test the following code.
$string = "Hello World!";

function modify_array(array $array) {
    $array[0] = "Hello PHP!";
    $array[] = "This element only exists in the local scope";
    var_dump($array);
}

$array[0] = &$string;
modify_array($array);
var_dump($array);
As you can see, the first array dumped contains the new element we appended inside the function's local scope. But when we return from the function, the array in the global scope is left without this new element.
array(2) {
  [0]=>
  &string(10) "Hello PHP!"
  [1]=>
  string(43) "This element only exists in the local scope"
}
array(1) {
  [0]=>
  &string(10) "Hello PHP!"
}

So you are definitely making a copy of the array. It just so happens that when you copy the Bucket with the shared ZVAL, you end up at the same ZVAL for that one Bucket! By now you should be able to see why this doesn't work the other way around.
$string = "Hello World!";

function modify_array(array $array) {
    $array[0] = "Hello PHP!";
    $string = "This element only exists in the local scope";
    $array[] = &$string;
    var_dump($array);
}

$array[0] = &$string;
modify_array($array);
var_dump($array);

array(2) {
  [0]=>
  &string(10) "Hello PHP!"
  [1]=>
  &string(43) "This element only exists in the local scope"
}
array(1) {
  [0]=>
  &string(10) "Hello PHP!"
}

Remember, it's a copy of the array in the function's local scope that we modified, not the array in the global scope. Don't let the references confuse you.
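If you ever need to hand an array to a function without letting its references travel along, one option is to rebuild the array by value first. The following is only a sketch under that assumption: without_references() is a hypothetical helper name, and it is shallow, so references inside nested arrays would still be carried over.

```php
<?php
// Hypothetical helper: rebuild the array so that no top-level element is a
// reference any more. Nested arrays are copied as they are (shallow).
function without_references(array $array) {
    $clean = array();
    foreach ($array as $key => $value) { // $value is a plain copy of the element
        $clean[$key] = $value;
    }
    return $clean;
}

function modify_array(array $array) {
    $array[0] = "Hello PHP!";
}

$string = "Hello World!";
$array = array();
$array[0] = &$string;

// The function now writes to an element that no longer shares a ZVAL
// with $string, so the global string is left alone.
modify_array(without_references($array));
var_dump($string); // string(12) "Hello World!"
```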

No More Arrays!
If you feel either exasperated or really excited about PHP arrays now, then either way I've done my job! :)
Perhaps next time I'll introduce you to the innards of the PHP Object.

If you have any comments, questions, or suggestions about anything that I've explained here, or about what more you'd like me to discuss regarding the internals of the PHP array, please feel free to leave me your comments below. If you have any ideas about what you'd like me to discuss with respect to PHP objects next time, do leave those as well and I will try to make my next post as informative and entertaining as possible.

17 Responses to "A Closer Look Into PHP Arrays: What You Don't See"

Hirvine
October 29, 2012 at 3:23 pm

Awesome article. I knew about the struct and C arrays, but your images are brilliant. I'm not sure how long it took you to write this down, but it's much appreciated. Great post!

unreal4u
October 29, 2012 at 6:15 pm

Incredible post, many thanks to you, as I am just introducing myself to the C side of PHP and I don't quite understand yet many of the new things I've seen :)
Really looking forward to your next post on PHP objects!
Greetings.

michael stevens
October 29, 2012 at 9:23 pm

Very informational, thanks!

Achmad Solichin
October 29, 2012 at 10:59 pm

Nice post, a complete discussion of arrays. You could also discuss PHP's array functions.

Kate
November 6, 2012 at 8:15 am

Thanks for sharing the great news! It was included in a digest of the hottest and most interesting PHP news:
http://www.zfort.com/blog/php-digestnovember-5-2012-zfort-group/

Vladimir S.
December 19, 2012 at 12:15 pm

Thank you for the article! It's very interesting.
But I didn't understand how PHP maintains element order.
In your example the array keys are ordered in such a way that their hash offsets appear to be sequential: 1, 1, 2, 2. And everything looks simple in the figure `PHP Array Structure`. However, if I change the key order to 4, -16, foo, 5, the figure won't change, as I understand it, but PHP will preserve the new order.
Could you explain this magic? :)

Valdimir S.
December 25, 2012 at 7:00 am

Hello, again :)
I've found an answer to my previous question, so I'm posting it here in case anyone else is interested in this too.
The Bucket structure is defined as follows:

typedef struct bucket {
    ulong h;
    uint nKeyLength;
    void *pData;
    void *pDataPtr;
    struct bucket *pListNext;
    struct bucket *pListLast;
    struct bucket *pNext;
    struct bucket *pLast;
    const char *arKey;
} Bucket;

As you can see, there are two pairs of pointers: pListNext/pListLast and pNext/pLast. The pNext/pLast pair is the one mentioned in this article, which links the buckets that fall under the same slot in the hash table. The pListNext/pListLast pair is the one that maintains array order and is used for iteration over the array.

Israel Smith
January 20, 2013 at 7:14 am

I am completely blown away, it's rare to come across something this hyperinformative and well-written at the same time. This is a digital treasure, thank you!

Martin Konecny
January 23, 2013 at 8:59 pm

Awesome article, thanks!

Aura Acosta
January 31, 2013 at 12:17 pm

I have a question: how can I prevent a float value from changing its decimals? It is very important to preserve the exact value, e.g.

[3] => Array
(
    [0] => 23.00
    [stock_id] => 23.00
    [1] => HUEVO EL CALVARIO 23 KGR
    [description_item] => HUEVO EL CALVARIO 23 KGR
    [2] => 232.4
    [quantity] => 232.4
    [3] => 24.915232358003
    [unit_price] => 25
    [4] => 5790.3
    [tot_partida] => 5790.3
    [5] => A08
    [loc_code] => A08
    [6] => 2013-01-15
    [tran_date] => 2013-01-15
    [7] => 25
    [8] => 0
    [unit_tax] => 0
    [9] => 4
    [category_id] => 4
    [10] => HUEVO
    [description] => HUEVO
    [11] => kg
    [units] => kg
    [12] => 0
    [provision] => 0
)

I need these two to be equal:
[3] => 24.915232358003
[unit_price] => 25
The difference greatly affects my result. Thanks.

Rahul A
June 29, 2013 at 7:27 am

Nice blog. Awesome; through this article I understood the strength of arrays.

prabhakar n. rao
October 11, 2013 at 5:45 am

Being in software teaching, I really didn't know how the PHP array functions. I liked this blog, and thanks a lot. Please continue this yeoman's service, God bless you.

sigmato
October 17, 2013 at 7:57 am

PHP arrays are really complex in structure. This is a really great article. Most people really do not know this, even if they program a bit.

Copyright 2008-2014 Sherif Ramadan. All rights reserved.