Beruflich Dokumente
Kultur Dokumente
Simon Fuhrmann
Darmstadt University of Technology, Germany
Abstract: This paper deals with library implemented automated dynamic memory manage-
ment, also known as automated garbage collection (or just AGC or GC for short) for the C++
programming language.
Contents
1 Introduction 1
2 Motivation 2
2.1 Objects as return value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2.2 Objects in GUI environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
List of Figures 28
I
1 INTRODUCTION 1
1 Introduction
Managing dynamic memory is a complex and highly error-prone task and is one of the most
common causes of runtime errors. Such errors can cause intermittent program failures that are
difficult to find and to fix.
There are different types of errors with managing dynamic memory. One of the most common
mistakes are memory leaks. Memory leaks occur if a program fails to free objects that are no
longer in use. Typically, a program continues to leak memory, and over the time its performance
degrades and it can eventually run out of memory. Another problem are premature frees. This
is caused by freeing objects that are still in use, which can result in corrupted data and may lead
to sudden failures. Memory fragmentation is caused by frequent allocation and deallocation of
memory such that larger chunks of memory are divided into much smaller, scattered pieces, and
the memory between these pieces is too small to be reallocated again.
2 Motivation
In real life applications manual memory management is an issue and cannot be bypassed completely
because some situations require manual memory managment. There are several reasons for that
but the most important ones are:
• More important are objects that need to outlast the current function scope. There are a lot
scenarious for this issue. Most common are objects in GUI environments and objects that
sould be used as return value.
return result;
}
The function repeat string() takes a string and an amount of repetitions and returns times
concatenated copies of str. On the one hand, the code is simple and easy to understand, and if
str and times are small, copying the result should not be matter. But on the other hand if str
is long, times has a big value, and performance is critical, this code is very expensive.
Note: This very simple example may get optimized by the compiler with a technique called ”return
value optimization”. This optimization requests space for the result in the right place and skips
copying the result (skips calling the copy constructor). But more complex function calls may not
get optimized.
This approach uses a trick to prevent copying the result from a local variable. There are no more
performance considerations and memory management is not an issue here. But other reasons rates
this solution far from beeing good. Using parameters for values that should be return values is
unintuitively. But unfortunately, this is common practice in functions from libc, the C library.
2 MOTIVATION 3
Even worse, there is no control over the creation process of the result object and there is no
guarantee that result was created with appropriate constructor parameters. Quickly forgetting
this approach, it’s time to forge ahead with some better ideas.
return result;
}
Returning a pointer to the data is the most convenient way to perform a cheap copy operation.
Only few bytes (the actual size of a pointer depends on the machine) need to be copied. But there
is no way to reclaim the requested memory within the repeat string() function. The deletion of
the data need to be done outside this function, after the result has been used. Another idea would
be using a static buffer that gets overwritten on subsequent use to avoid deletion. But neither
remembering to delete the data nor supplying a static buffer is a good solution for that simple
problem.
return result;
}
The result ist created in the heap, of course, because the data need to exceed the current scope.
But instead of leaving the cleanup to the caller, a smart pointer called auto ptr takes care about
the deletion of the data. The specific semantics from auto ptr will be explained later on. The
slightly complicated code (that uses C++ templates) can be cleaned with typedef’s and results in
a code like this:
typedef auto_ptr<std::string> StrPtr;
return result;
}
As a result, the code looks identical to the first approach but some things are different: The data
needs to be created with new and the result needs to be dereferenced to access the data, that’s
because a smart pointer behaves like a dump pointer.
2 MOTIVATION 4
Ok, whats wrong with this code? Like mentioned before, the GUI toolkit has its own main loop.
Until the main loop is not active, no windows will be drawn and the GUI will not respond to
user interaction. In the example above the window and the buttons are created in the local scope
and if the scope exits, the window and the buttons will be destroyed again. It’s relatively easy
to understand that the code above does absolutely nothing but creating and destroying widgets,
that will not be drawn by the GUI toolkit.
The second scrap of code will work, but the problem is obvious: The acquired memory need to be
deleted somewhere.
After returning from the callback function the main loop can do its work and display the window
with its content.
3 UNDERSTANDING SMART POINTERS 5
Constructors
To get the dump pointer inside the wrapped pointer a few constructors are required. Null pointers
should indicate an undefined destination for smart pointers like they do for dump pointers.
• A null smart pointer can be created with the default constructor that takes no arguments.
Example: SmartPtr<std::string> ptr;
• A copy constructor need to be defined, to initialize a wrapped pointer from another
compatible wrapped pointer instance.
Example: SmartPtr<std::vector> ptr(ptr2);
• To initialize the wrapped pointer from a dump pointer a constructor is needed with the type
of the dump pointer as argument.
Example: SmartPtr<int> ptr(new int(31337));
Operators
Operators are needed to implement the syntax of a dump pointer. The operators of interest are
the following, where ptr1 and ptr2 are wrapped pointers:
• The dereference operator returns a reference to the data.
Example: Object& o = *ptr1;
• The pointer to member operator (for direct member access) returns a pointer to the data.
Example: ptr1->my_method();
• The assignment operators reassign the pointer to the data from a dump pointer or an existing
smart pointer.
Example: ptr1 = new Object();
ptr1 = ptr2;
• The comparison operators compare the data pointer with another data pointer of a given
smart pointer or dump pointer.
Example: if (ptr1 == ptr2) { ... }
if (ptr1 != ptr2) { ... }
3 UNDERSTANDING SMART POINTERS 6
Class outline
The following outline of a class for wrapped pointers declares all discussed features. The definition
of these methods is trivial and can be looked up in the appendix of this article. Note that the
definitions of the assignment operators and the destructor depend on the strategy that is used for
the smart pointer. The explicit keyword is explained just below.
public:
MyPointer (void);
explicit MyPointer (T* p);
MyPointer (const MyPointer<T>& src);
~MyPointer(void);
void f(std::string s)
that takes a string. Calling this function with f("Hello") is absolutely ok because "Hello"
(which is of type char*) gets implicitly converted to the type std::string because there is an
implicit constructor like std::string(char* str).
If such a conversion is not welcome the constructor should be defined as explicit.
In the smart pointer’s case an implicit conversion from an unmanaged pointer to a smart pointer
is to be prevented to avoid accidental deletion of the data and mental aberration.
resource is tied to the lifetime of the object. Since object lifetimes are well-defined, so is the use
of the resource.
The following example demonstrates elegant use of RAII by opening and closing a file:
This code opens the named file within myfile’s constuctor, operates on the open file in the while-
loop and closes the file in myfile’s destructor. So, the code does what it’s intended to do without
manually closing the file, which saves time and errors.
In C++, newed objects has dynamic extent and must be explicitly deleted, but in many cases the
RAII idiom can be used to automate this. This is the place where smart pointers becomes handy.
Typically, memory will be requested and passed to the constructor of a smart pointer. Later on
the managed data will be released in ”the right” destructor. The means of ”the right” are defined
in the smart pointer’s semantics. ”The right” destructor for the reference counting or reference
linking smart pointer is the one from the last referencing smart pointer.
One of the prominent examples for explicit ownership is dynamic memory, which has to be acquired
with new and need to be released with delete. But there are other explicit owned resources like
file handles, sockets and critical sections.
• File handles can be wrapped into smart file handles that close themself if there are no more
references. This may be useful for logging file handles which are scattered throughout the
whole code.
• Networking sockets can be wrapped as smart sockets. Like smart file handles, a socket can
be closed if there is no more referencing object. And in fact file handles and networking
sockets are both file descriptors (on Unix-like operating systems, at least).
• An exclusive area or critical section needs to be locked when the flow enters a specific part of
the code and unlocked again if the critical section is exited. The program tends to deadlock
if a critical section is never exited due to programming errors. In this case it’s useful to bind
the lifetime of a lock to the lifetime of a scoped object. So the critical section is guaranteed
to be released by the mechanisms of the language.
The most common strategy for resource management is reference counting and, for some resources,
reference counting is the only appropriate strategy. But there are others particularly with regard
to memory management.
3.4 Strategies
The created wrapped pointer class offers no more functionality than any dump pointer. To add
some smartness to the code skeleton a strategy needs to be implemented. Before going into an
implementation some strategies will be discussed. The list of strategies is not complete but should
be sufficient for most applications.
Every strategy is entirely defined by the characteristics of the assignment operators and the de-
structor. Let ptr1 and ptr2 be smart pointers. Possible strategies for ptr1 = ptr2 and reset()
are described for each strategy.
Scoped pointer
The scoped pointer is rarely used. It stores a pointer to a dynamically allocated object. The
object pointed to is guaranteed to be deleted, either on destruction of the scoped pointer, or
via an explicit reset(). The assignment ptr1 = ptr2 is forbidden. Since a scoped pointer is
noncopyable it is caved within its scope and cannot get out. It’s safer than the reference counting
or ownership transfer pointers for pointers which should not be copied.
Example: A local object (e.g. a class member) should be protected from being accidently passed
to the outside of the class scope.
Copied pointer
The copied pointer stores a pointer to a dynamically allocated object that is guaranteed to be
deleted on destruction of the copied pointer or via explicit reset(). The assignment ptr1 =
ptr2 creates a new copy of the object pointed by ptr2, and let ptr1 point to this copy. Creating
the new copy can be done with memcpy() (for plain old data (POD), like any C primitive) or by
using some ”clonable” concept (for user defined types (UDT), like C++ classes), which has to be
implemented by the element type.
Example: The copied pointer can be used whenever stack semantics are needed but the object
can not be created on the stack for some reason.
3 UNDERSTANDING SMART POINTERS 9
To indicate a pointer with no target (a null pointer) the shared counter pointer and data pointer
are set to null. There are only two possible cases: Data and counter pointer are both null or both
not null.
To initialize the smart pointer from a given dump pointer a new counter needs to be established.
To initialize the smart pointer from another smart pointer the data and counter pointer need to
be copied and the counter needs to be incremented.
The destructor just decrements the counter. If the counter drops to zero, data and count will be
deleted within the decrement() member function.
Dereference operators
There are two dereference operators. The first one is for accessing the dereferenced data, the
second one is the direct member access or pointer to member operator which is used for calling
member functions or accessing member variables.
// Dereference operator
T& ref_ptr::operator* (void) const
{
return *ptr;
}
Assignment operators
The assignment operators are the most complicated ones. The following objectives need to be
accomplished:
Note that the comparison between ptr and src.ptr is essential! Imagine the following code:
Without this comparision the above code would lead to an error (typically segmentation fault
on unix machines). Some STL algorithms make such assignments while sorting a container, for
example std::sort. This comparison could be left out if src is passed as copy and not as
reference, but performance would be worse.
An alternative, which relies on the constructor and destructor implementation, for the member
function above would be:
Comparison operators
Comparison can only be implemented for equality and unequality. The standard does not define
the meaning of ”one pointer is less than another” (unless both pointers point into the same array),
so a member operator< or something similar is not supported.
// Comparison with smart pointer
bool ref_ptr::operator== (ref_ptr<T>& p) const
{
return ptr == p.ptr;
}
It is always possible to store a pointer with a subclass type to a pointer variable of a base class
type using bare pointer arithmetic. Thus this is valid C++ code:
An equivalent code with dump pointers replaced by smart pointers should be possible, too. But
there is a problem:
The problem is a missing assignment operator that takes a right hand side from a different dump
or smart pointer type. Since this type is not known in advance member templates must be used.
Introducing the following new template member function is a solution that allows assignments from
other right hand sides Y instread of T (where Y can be any type but T is always the parameter
type for instances of this class).
This codes leads to another problem. Since rhs is of type ref ptr<Y> and associated objects from
this class are of type ref ptr<T>, there is no access to rhs.ptr and rhs.count because the ptr
and count members are private and can only be accessed by objects of the same type.
Two approaches will help out. The first one is to make the appropriate members public. The
second and much better approach is to introduce template friends that allow access from all
ref ptr types. In the reverse direction the type ref ptr<Y> will grant access from all ref ptr
types.
To also support all operations from other smart pointer types, all constructors, assignment and
comparison operators need to be reintoduced with a template parameter. And since some compilers
are not capable of member templates or template friends, it should be possible to easily disable
the new code. The following outline declares the new members, the definition is identical to the
definitions of the appropriate non-template member functions.
4 REFERENCE COUNTING SMART POINTERS 15
template <class Y> ref ptr<T>& operator= (const ref ptr<Y>& rhs);
template <class Y> ref ptr<T>& operator= (Y* rhs);
#endif
};
Example 1: A cyclic singly liked list: A singly linked list serves as an example. A list
consists of items and an interface that manages these items. Destroying the list interface should
destroy all the items like the STL containers used to do.
Destruction of all the list items happens because destroying the interface decrements the reference
counter of the first list item to zero. So the first list item gets destroyed, which decrements the
recerence counter of the second item to zero, and so on.
4 REFERENCE COUNTING SMART POINTERS 16
The problem arises if there is any circle within the references. For illustration purposes the last
list item references the first item now. Note that the first list item has a reference count of two,
because it is referenced by the interface and the last list item.
The destruction of the interface decrements the reference counter of the first list item to one
instead of zero. This means that the list items are not reclaimed because none of the reference
counters is zero, and all items that are part of the cycle leak.
Example 2: A graph data structure: To model a graph data structure one needs to introduce
classes for nodes and for edges. To keep it simple, the following minimal code should be enough:
class Node
{
std::list<EdgePtr> edges;
};
class Edge
{
NodePtr predecessor;
NodePtr successor;
};
4 REFERENCE COUNTING SMART POINTERS 17
Every node can have multiple edges and every edge references two nodes, a predecessor and a
successor node. This makes the references cyclic, because a node references one or more edges,
and each of the edges references the node. The illustration visualizes the cycle.
The question is how to fix these cycles or how to handle cyclic references. Some libraries offer a
solution for that problem.
Weak pointers: A new pointer type called ”weak pointer” is introduced. The already intro-
duced ref ptr can be treated as a ”strong pointer”. The idea is to let both, the strong and the
weak pointers, share a data object with some limitations to the weak pointer: The weak pointer
stores a ”weak reference” to an object that is already managed by a strong pointer. To access
the object, a weak pointer can be converted to a strong pointer. This is typically done by calling
the lock() member function or using an appropriate constructor from the strong pointer. When
the last strong pointer to the object goes away and the object is deleted, the attempt to obtain a
strong pointer from the weak pointer instances that refer to the deleted object will fail.
The implementation of weak pointers will not go into detail but the steps are relatively easy. The
first thing that needs to be changed is the shared counter. The counter now needs to keep track
of strong and weak references to the data. A single integer for the counter is insufficient and we
need to provide a counter class:
class Count
{
public:
int ref_cnt;
int weak_cnt;
};
The counter object must not be destroyed until all references (weak and strong ones) have dis-
appeared, because the weak pointers need to check if the data is still available and has not been
reclaimed (and that is, if ref cnt > 0).
Rules for destroying the data and the counter look like this:
Example 2: A graph data structure (Continued): The cycles in the graph data structure
can easily be fixed with replacing strong node references by weak references. The resulting code
looks like this:
class Node
{
std::list<EdgePtr> edges;
};
class Edge
{
WeakNodePtr predecessor;
WeakNodePtr successor;
};
There are no more cycles in the structure as the new visualization shows. Converting the weak
pointers to strong pointers in the edge will never fail because edges can semantically not exist
without the predecessor and the successor node.
4.2.2 Multi-Threading
Multi-Threading is always a problematic issue. Smart pointers suffer from some inconsistencies in
multi-threaded environments, and multi-threading issues ultimately affect smart pointers’ imple-
mentation. The introduced reference counting smart pointer is not thread safe because the critical
operations are not atomic.
If smart pointers are copied between threads, incrementing and decrementing the reference counter
happens at unpredictable times. And the most critical operation is decrementing the counter
(because incrementing never frees memory).
The following piece of code is used to illustrate the problem (it is the decrement member function
in a slightly different version):
The problem arises if two threads execute the decrement member function simultanously. A simple
example of unintended behaviour could happen like this: Two threads have each a smart pointer
to the same data, thus the reference counter has two as value.
The first thread executes the statements until and including *count -= 1;. Then this thread is
paused but the use count is already decremented to one. The other thread decrements the counter
to zero and deletes the counter and data afterwards. If the first thread continues execution, the
data and counter are about to be deleted the second time and the program will crash.
A simple solution for that problem is using a mutex to create an area with mutally exclusion of
program flow. This makes the decrement() member function atomic.
mutex.lock();
if (--(*count) == 0)
{
delete count;
delete ptr;
}
mutex.unlock();
}
Memory penalties: Every reference counting smart pointer instance takes up 2·sizeof(void*)
bytes on the stack (8 bytes an 32 bit machines). This size consists of the size for the data pointer
and the counter pointer. A bare pointer only takes up sizeof(void*) bytes, which is half the
size. In the heap the smart pointer allocates additional memory of sizeof(Count) bytes where
Count is either just an int or a counter class.
But performance is not only a matter of a few bytes of memory. The time to allocate this memory
is significant.
Performance penalties: The process of allocating memory (which is the process of requesting
memory from the operating system) is expensive. There is almost no difference between reqesting
4 or 4000 bytes, but there is a huge difference between allocating memory one or two times. Thus
the number of allocations is significant, not the size of one allocation.
4 REFERENCE COUNTING SMART POINTERS 20
• The Boost shared ptr with multi-threading support (denoted with a purple line).
The values along the x-axis are the amount of allocations and deletions while the values along the
y-axis are the chronometries for these allocations and deletions in seconds.
Left diagram: The data object that has been Right diagram: This diagram was created
used for this test was the primitive type int using a data object with a size of 4 megabytes
which has a size of 4 bytes. The counter also (one million times bigger). The difference be-
has a size of 4 bytes. Thus to create a reference tween the dump pointer and the smart pointers
counting smart pointer two allocations need to is almost zero because the additional counter al-
be done. For this reason the ref ptr needs location takes a proportionally short time.
about twice the time than the dump pointer,
which only needs to allocate 4 bytes for the data.
Figure 10: Alloc and del, 4 byte data Figure 11: Alloc and del, 4 MB data
Most of the objects are typically small and objects with a size of one megabyte and up are
unrealistic. However, 3’500’000 allocations can be done within one second (instead of 7’000’000
with a dump pointer). This should be enough for most of the cases and stable programms are
more important than slightly more CPU time.
4 REFERENCE COUNTING SMART POINTERS 21
Left diagram: The left diagram compares the Right diagram: This diagram visualizes tim-
time that is used to swap smart pointer variables ings for code that uses the swap() member func-
manually with a third temporary variable. tion instead of swapping manually with a tem-
The dump pointer nearly uses zero time for porary variable.
this test compared to the smart pointers. The Note that the values of the x-axis from the pre-
Boost smart pointer with multi-threading sup- vious test are in range from 0 to 107 . The new
port is far away from having competitive tim- diagram’s range reaches from 0 to 108 which is
ings. Thats because making increment and ten times greater. The old range is yellow col-
decrement atomic operations is expensive. ored for comparison reasons.
Figure 12: Copy operation, manual swapping Figure 13: Copy operation, smart swapping
It is easy to read off that the Boost pointer now needs about 0.4 seconds instead of 3.5 seconds
for the same amount of manual swapping operations. Both, the Boost pointer with and without
multi-threading support have equal timings now, which means that Boost undertakes no effort in
atomizing the swap() operation.
If swapping of smart pointers is an issue, it is strongly recommended to use the swap() mem-
ber function. This member function just swaps the data and counter pointers and circumvents
incrementing and decrementing the counters of the participating smart pointers.
5 SMART POINTER LIBRARIES 22
http://www.open-std.org/jtc1/sc22/WG21/
for detailed information about TR (technical report) and ”General Purpose Smart Pointers”.
The auto ptr is a smart pointer with ownership strategy. This strategy passes responsibility for
reclaiming the data or, if responsibility has not been passed, reclaims the data if the auto ptr
leaves scope or reset() is called.
scoped ptr: The scoped pointer is a lightweight and simple pointer for sole ownership of single
objects.
The scoped pointer stares a pointer to a dynamically allocated object, which is reclaimed either on
destuction of the smart pointer on via explicit reset(). It has no shared ownership or ownership
tranfer semantics and is non-copyable. The scoped pointer is safer and faster for objects that
should not be copied.
The scoped pointer does not meet the CopyConstructible and Assignable requirements for STL
containers. There is also a version called scoped array which is for sole ownership of arrays.
shared ptr: The shared pointer is an external reference counting pointer with ownership shared
among multiple pointers.
The shared pointer does meet the requirements for STL containers. There is also a version called
shared array which is for array ownership among muliple pointers.
weak ptr: The weak pointer is a non-owning observer of a shared ptr-owned object. There is
no need for an appropriate weak array variant because weak pointers never reclaim the data.
The weak pointer does meet the requirements for STL containers.
intrusive ptr: The intrusive pointer is a shared ownership pointer (like shared pointer) with
embedded reference count. It is a lightweight shared pointer for objects which already have internal
reference counting. Thus, the intrusive pointer is smaller and faster than the shared pointer.
increment() and decrement() operations are forwarded to user defined member functions.
The intrusive pointer does meet requirements for STL containers.
RefCounted: This pointer supports classic external reference counting like Boost’s shared ptr.
There is no support for weak pointers and it is not thread-safe. This pointer meets the requirements
for STL containers.
RefCountedMTAdj: This pointer is similar to RefCounted but it has support for multi-threaded
environments. It can of course also be stored in STL containers.
DeepCopy: This pointer implements deep copy semantics. It is similar to the copied pointer
strategy, but leaves the deep copy implementation to the user. For this purpose it assumes a
Clone() member function. The copied pointer strategy only implements flat copying. Since deep
copying satisfies appropriate container requirements, a pointer of this type can be stored in STL
containers.
RefLinked: This smart pointer implements the referene linking strategy. Reference linking is
similar to reference counting, thus this pointer can be stored in STL containers.
Destructive Copy: The destructive copy smart pointer implements the ownership transfer strat-
egy just like std::auto ptr. It is not possible to store this pointer in STL containers.
NoCopy: This smart pointer implements the scoped pointer strategy like boost::scoped ptr.
This smart pointer cannot be copied, thus there is no way to get it in and out of STL containers.
A REF PTR SOURCE CODE 24
/* Destructor. */
~ref ptr (void)
{ decrement(); }
/* Operations. */
void reset (void)
{
decrement();
ptr = 0; count = 0;
}
/* Dereference. */
T& operator* (void) const
{ return *ptr; }
/* Comparison. */
bool operator== (const T* p) const
{ return p == ptr; }
/* Information. */
int use count (void) const
{ return (count == 0) ? 0 : *count; }
public:
/* Ctor: From diffrent pointer. */
template <class Y>
explicit ref ptr (Y* p) : ptr(p)
{ count = (p == 0) ? 0 : new int(1); }
List of Figures
1 Smart pointer structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 Null smart pointer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3 Ordinary list structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4 Reclaimed, ordinary list structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
5 Cyclic list structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
6 Reclaimed, cyclic list structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
7 Graph structure: UML-like notation . . . . . . . . . . . . . . . . . . . . . . . . . . 17
8 Weak and strong pointers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
9 New graph structure: UML-like notation . . . . . . . . . . . . . . . . . . . . . . . . 18
10 Alloc and del, 4 byte data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
11 Alloc and del, 4 MB data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
12 Copy operation, manual swapping . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
13 Copy operation, smart swapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21