Contents: Array and string; Binary heap; Binary search tree (BST); Dynamic array; Graph; Hash table; Singly-linked list.
Abstract data types covered: Dictionary ADT; Priority queue ADT; Stack ADT (array-based stack implementation).
Advantages

There is no overhead per element. Any element of an array can be accessed in O(1) time by its index.
Drawbacks
The array data structure is not completely dynamic. Many programming languages provide a way to allocate arrays of arbitrary size (dynamically allocated arrays), but when this space is used up, a new array of greater size must be allocated and the old data copied to it. Insertion and deletion of an element in an array requires shifting O(n) elements on average, where n is the size of the array.
There are two types of arrays, which differ in the method of allocation. A static array has constant size and exists for the whole time the application is executed. A dynamically allocated array is created during the program run and may be deleted when it is no longer needed. Dynamically allocated arrays can be quite large, even bigger than the amount of physical memory. Yet a dynamically allocated array cannot be resized in place. Instead, you can expand an array as follows:
1. Create a new array of bigger size.
2. Copy the data from the old array to the new one.
3. Free the memory occupied by the old array.
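The three steps above can be sketched in Java; the grow helper below is illustrative only (in Java the garbage collector frees the old array, so step 3 is implicit):

```java
import java.util.Arrays;

public class GrowDemo {
    // Expand an array: allocate a bigger one and copy the old contents.
    // Arrays.copyOf performs the allocation and the copy in one call;
    // the extra cells are zero-filled.
    static int[] grow(int[] old, int newCapacity) {
        return Arrays.copyOf(old, newCapacity);
    }
}
```

In C++ the old storage would additionally have to be released with delete[].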
Code snippets
The sample program finds the minimal value among the entered ones. Note that Java allows only dynamically allocated arrays.
Java
import java.util.Scanner;
public class Arrays {
    public static void main(String[] args) {
        Scanner keyboard = new Scanner(System.in);
        // dynamically allocated array
        int arr[] = new int[15];
        int n = 0;
        int value = 0;
        System.out.println("Enter values. Type \"-1\" to stop: ");
        while (n < 15 && value != -1) {
            value = keyboard.nextInt();
            keyboard.nextLine();
            if (value != -1) {
                arr[n] = value;
                n++;
            }
        }
        if (n == 0) {
            System.out.println("You have entered no values, bye!");
        } else {
            int minimum = arr[0];
            for (int i = 1; i < n; i++) {
                if (arr[i] < minimum)
                    minimum = arr[i];
            }
            System.out.print("The minimal value is " + minimum);
        }
    }
}
C++
#include <iostream>
int main() {
    // static array
    int arr[15];
    int n = 0;
    int value = 0;
    std::cout << "Enter values. Type \"-1\" to stop: ";
    while (n < 15 && value != -1) {
        std::cin >> value;
        if (value != -1) {
            arr[n] = value;
            n++;
        }
    }
    if (n == 0) {
        std::cout << "You have entered no values, bye!";
    } else {
        int minimum = arr[0];
        for (int i = 1; i < n; i++) {
            if (arr[i] < minimum)
                minimum = arr[i];
        }
        std::cout << "The minimal value is " << minimum;
    }
    return 0;
}
Binary heap
There are several types of heaps, but in this article we discuss the binary heap; for short, let's call it just "heap". It is used to implement the priority queue ADT and in the heapsort algorithm. A heap is a complete binary tree that satisfies the heap property.
A binary tree is said to be complete if all its levels, except possibly the deepest, are full. Moreover, an incomplete bottom level cannot have "holes": it must be filled from the leftmost node up to some node, with no gaps. See the illustrations below.
Correct example of a complete binary tree
Heap property
There are two possible types of binary heaps: the max heap and the min heap. The difference is that the root of a min heap contains the minimal element, while the root of a max heap contains the maximal one. Priority queues often use min heaps, whereas the heapsort algorithm, when sorting in ascending order, uses a max heap.
Heap property for a min heap: for every node in the heap, the node's value is less than or equal to the values of its children.
Heap property for a max heap: for every node in the heap, the node's value is greater than or equal to the values of its children.
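The min-heap property can be checked directly on the usual array representation of a complete binary tree, where the children of index i sit at indices 2i + 1 and 2i + 2. This is a sketch for illustration, not part of the article's original code:

```java
public class HeapCheck {
    // Verify the min heap property: every node is <= its children.
    // The complete tree is stored in an array; children of index i
    // are at indices 2*i + 1 and 2*i + 2 (when those indices exist).
    static boolean isMinHeap(int[] a) {
        for (int i = 0; i < a.length; i++) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            if (left < a.length && a[i] > a[left]) return false;
            if (right < a.length && a[i] > a[right]) return false;
        }
        return true;
    }
}
```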
Binary search tree (BST)

A binary search tree is a data structure with the following properties: it is a binary tree; each node contains a value; a total order is defined on these values (every two values can be compared with each other); the left subtree of a node contains only values less than the node's value; the right subtree of a node contains only values greater than the node's value.
Implementation
See how a binary search tree is represented inside the computer.

Operations on a BST

Insertion of a new value consists of two stages: search for a place to put the new element; insert the new element into this place.
At this stage the algorithm follows the binary search tree property: if the new value is less than the current node's value, go to the left subtree, otherwise go to the right subtree. Following this simple rule, the algorithm reaches a node that has no subtree in the required direction. By the moment a place for insertion is found, we can say for sure that the new value has no duplicate in the tree. Initially the new node has no children, so it is a leaf. Let us see it in the picture. Gray circles indicate possible places for a new node.
Now, let's go down to the algorithm itself. Here, as in almost every operation on a BST, recursion is utilized. Starting from the root:
1. Check whether the value in the current node and the new value are equal. If so, a duplicate is found. Otherwise:
2. If the new value is less than the node's value:
   o if the current node has no left child, the place for insertion has been found;
   o otherwise, handle the left child with the same algorithm.
3. If the new value is greater than the node's value:
   o if the current node has no right child, the place for insertion has been found;
   o otherwise, handle the right child with the same algorithm.
Just before the code snippets, let us have a look at an example demonstrating insertion into a binary search tree.

Example
Code snippets
The only difference between the algorithm above and the real routine is that first we should check whether the root exists. If not, we just create it and don't run the common algorithm for this special case. This check is done in the BinarySearchTree class; the principal algorithm is implemented in the BSTNode class.
Java
public class BinarySearchTree {
    private BSTNode root;

    public boolean add(int value) {
        if (root == null) {
            root = new BSTNode(value);
            return true;
        } else
            return root.add(value);
    }
}

public class BSTNode {
    private int value;
    private BSTNode left;
    private BSTNode right;

    public BSTNode(int value) {
        this.value = value;
    }

    public boolean add(int value) {
        if (value == this.value)
            return false;
        else if (value < this.value) {
            if (left == null) {
                left = new BSTNode(value);
                return true;
            } else
                return left.add(value);
        } else if (value > this.value) {
            if (right == null) {
                right = new BSTNode(value);
                return true;
            } else
                return right.add(value);
        }
        return false;
    }
}
C++
bool BinarySearchTree::add(int value) {
    if (root == NULL) {
        root = new BSTNode(value);
        return true;
    } else
        return root->add(value);
}

bool BSTNode::add(int value) {
    if (value == this->value)
        return false;
    else if (value < this->value) {
        if (left == NULL) {
            left = new BSTNode(value);
            return true;
        } else
            return left->add(value);
    } else if (value > this->value) {
        if (right == NULL) {
            right = new BSTNode(value);
            return true;
        } else
            return right->add(value);
    }
    return false;
}
Now, let's see a more detailed description of the search algorithm. Like the add operation, and almost every operation on a BST, the search algorithm utilizes recursion. Starting from the root:
1. Check whether the value in the current node and the searched value are equal. If so, the value is found. Otherwise:
2. If the searched value is less than the node's value:
   o if the current node has no left child, the searched value doesn't exist in the BST;
   o otherwise, handle the left child with the same algorithm.
3. If the searched value is greater than the node's value:
   o if the current node has no right child, the searched value doesn't exist in the BST;
   o otherwise, handle the right child with the same algorithm.
Just before the code snippets, let us have a look at an example demonstrating a search for a value in a binary search tree.

Example
Code snippets
As in the add operation, check first whether the root exists. If not, the tree is empty and, therefore, the searched value doesn't exist in the tree. This check is done in the BinarySearchTree class; the principal algorithm is implemented in the BSTNode class.
Java
public class BinarySearchTree {
    public boolean search(int value) {
        if (root == null)
            return false;
        else
            return root.search(value);
    }
}

public class BSTNode {
    public boolean search(int value) {
        if (value == this.value)
            return true;
        else if (value < this.value) {
            if (left == null)
                return false;
            else
                return left.search(value);
        } else if (value > this.value) {
            if (right == null)
                return false;
            else
                return right.search(value);
        }
        return false;
    }
}
C++
bool BinarySearchTree::search(int value) {
    if (root == NULL)
        return false;
    else
        return root->search(value);
}
bool BSTNode::search(int value) {
    if (value == this->value)
        return true;
    else if (value < this->value) {
        if (left == NULL)
            return false;
        else
            return left->search(value);
    } else if (value > this->value) {
        if (right == NULL)
            return false;
        else
            return right->search(value);
    }
    return false;
}
Removal of a value consists of two stages: search for the node to remove; if the node is found, run the remove algorithm.

Now, let's see a more detailed description of the remove algorithm. The first stage is identical to the search algorithm, except that we should track the parent of the current node. The second stage is trickier. There are three cases, described below.
1. Node to be removed has no children.
This case is quite simple. The algorithm sets the corresponding link of the parent to NULL and disposes of the node.

Example. Remove -4 from the BST.
2. Node to be removed has one child.

In this case, the node is cut out of the tree and the algorithm links its single child (with its subtree) directly to the parent of the removed node.

Example. Remove 18 from the BST.
3. Node to be removed has two children.

This is the most complex case. To solve it, let us first look at one useful BST property. We are going to use the idea that the same set of values may be represented as different binary search trees. For example, these BSTs:

contain the same values {5, 19, 21, 25}. To transform the first tree into the second one, we can do the following:
o choose the minimum element of the right subtree (19 in the example);
o replace 5 with 19;
o hang 5 as a left child.
The same approach can be utilized to remove a node that has two children:
o find the minimum value in the right subtree;
o replace the value of the node to be removed with the found minimum (now the right subtree contains a duplicate!);
o apply remove to the right subtree to remove the duplicate.
Notice that the node with the minimum value has no left child and, therefore, its removal may result in the first or second case only.

Example. Remove 12 from the BST.
Find the minimum element in the right subtree of the node to be removed. In the current example it is 19.

Replace 12 with 19. Notice that only the values are replaced, not the nodes. Now we have two nodes with the same value.
Code snippets
First, check whether the root exists. If not, the tree is empty and, therefore, the value to be removed doesn't exist in the tree. Then check whether the root's value is the one to be removed. This is a special case, and there are several approaches to handle it. We propose the dummy root method: a dummy root node is created and the real root is hung onto it as a left child. When the removal is done, the root link is set to the left child of the dummy root.
In languages without automatic garbage collection (e.g., C++) the removed node must be disposed of. To this end, the remove method in the BSTNode class should return not a boolean value but a link to the disposed node, and the memory is freed in the BinarySearchTree class.
Java
public class BinarySearchTree {
    public boolean remove(int value) {
        if (root == null)
            return false;
        else {
            if (root.getValue() == value) {
                BSTNode auxRoot = new BSTNode(0);
                auxRoot.setLeftChild(root);
                boolean result = root.remove(value, auxRoot);
                root = auxRoot.getLeft();
                return result;
            } else {
                return root.remove(value, null);
            }
        }
    }
}

public class BSTNode {
    public boolean remove(int value, BSTNode parent) {
        if (value < this.value) {
            if (left != null)
                return left.remove(value, this);
            else
                return false;
        } else if (value > this.value) {
            if (right != null)
                return right.remove(value, this);
            else
                return false;
        } else {
            if (left != null && right != null) {
                this.value = right.minValue();
                right.remove(this.value, this);
            } else if (parent.left == this) {
                parent.left = (left != null) ? left : right;
            } else if (parent.right == this) {
                parent.right = (left != null) ? left : right;
            }
            return true;
        }
    }
}
C++
bool BinarySearchTree::remove(int value) {
    if (root == NULL)
        return false;
    else {
        if (root->getValue() == value) {
            BSTNode auxRoot(0);
            auxRoot.setLeftChild(root);
            BSTNode* removedNode = root->remove(value, &auxRoot);
            root = auxRoot.getLeft();
            if (removedNode != NULL) {
                delete removedNode;
                return true;
            } else
                return false;
        } else {
            BSTNode* removedNode = root->remove(value, NULL);
            if (removedNode != NULL) {
                delete removedNode;
                return true;
            } else
                return false;
        }
    }
}
BSTNode* BSTNode::remove(int value, BSTNode* parent) {
    if (value < this->value) {
        if (left != NULL)
            return left->remove(value, this);
        else
            return NULL;
    } else if (value > this->value) {
        if (right != NULL)
            return right->remove(value, this);
        else
            return NULL;
    } else {
        if (left != NULL && right != NULL) {
            this->value = right->minValue();
            return right->remove(this->value, this);
        } else if (parent->left == this) {
            parent->left = (left != NULL) ? left : right;
            return this;
        } else if (parent->right == this) {
            parent->right = (left != NULL) ? left : right;
            return this;
        }
        return NULL;
    }
}
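Both remove routines above call a minValue helper that is not listed anywhere in the article. Since the minimum of a BST subtree sits in its leftmost node, a minimal Java sketch could look like this (the value/left field names are assumed to match the snippets above):

```java
public class MinValueDemo {
    static class BSTNode {
        int value;
        BSTNode left, right;

        BSTNode(int value) { this.value = value; }

        // The minimum of a BST subtree is its leftmost node,
        // so keep walking left until there is no left child.
        int minValue() {
            return (left == null) ? value : left.minValue();
        }
    }
}
```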
To get all values of a BST in sorted order, recall its main property: the left subtree of a node contains only values less than the node's value; the right subtree of a node contains only values greater than the node's value.
The algorithm looks as follows: 1. get values in order from the left subtree; 2. get values in order from the right subtree; 3. the result for the current node is (result for left subtree) join (current node's value) join (result for right subtree). Running this algorithm recursively, starting from the root, we get the result for the whole tree. Let us see an example of the algorithm described above.
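The article gives no snippet for this traversal, so here is a minimal Java sketch of the in-order walk; the Node class here is an assumption standing in for the BSTNode of the earlier snippets:

```java
import java.util.ArrayList;
import java.util.List;

public class InOrder {
    static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Left subtree first, then the node itself, then the right subtree:
    // by the BST property this emits the values in ascending order.
    static void traverse(Node node, List<Integer> out) {
        if (node == null) return;
        traverse(node.left, out);
        out.add(node.value);
        traverse(node.right, out);
    }
}
```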
Example
Dynamic arrays
One of the problems occurring when working with the array data structure is that its size cannot be changed during the program run. There is no straightforward solution, but we can encapsulate capacity management.
Internal representation
The idea is simple. The application allocates some amount of memory and logically divides it into two parts: one part contains the data, the other is free space. Initially all the allocated space is free. As the data structure operates, the boundary between the used and free parts moves. If there is no more free space, the storage is expanded by creating a new array of larger size and copying the old contents to the new location. The dynamic array data structure has the following fields:
storage: dynamically allocated space to store data; capacity: size of the storage; size: amount of real data.
Ensure capacity
Before a value (or several values) is added, we should ensure that we have enough capacity to store it. Do the following steps:
1. check whether the current capacity is enough to store the new items; if so, nothing needs to be done;
2. otherwise, calculate the new capacity by the formula newCapacity = (oldCapacity * 3) / 2 + 1 (the algorithm keeps a free-space reserve in order not to resize the storage too often);
3. check whether the new capacity is enough to store all new items and, if not, increase it to store the exact amount of items;
4. allocate the new storage and copy the contents of the old one into it;
5. deallocate the old storage (in C++);
6. update the capacity value.
The enlargement coefficient can be chosen arbitrarily (as long as it is greater than one). The proposed value is 1.5, which works well on average.
Pack
When items are removed, the amount of free space increases. If there are too few values in the dynamic array, the unused storage becomes just a waste of space. To save space, we add a mechanism that reduces capacity when it is excessive:
1. check whether size is less than or equal to half of the capacity;
2. if so, calculate the new capacity by the formula newCapacity = (size * 3) / 2 + 1 (this leaves exactly the amount of space there would be if the storage had been trimmed to size and the ensure-capacity method then called);
3. allocate the new storage and copy the contents of the old one into it;
4. deallocate the old storage (in C++);
5. update the capacity value.
The lower boundary for size, below which packing is done, may vary; in the current example it is 0.5 of the capacity. Commonly, pack is a private method called after a removal. The dynamic array interface also provides a trim method, which reduces capacity to fit the exact number of items in the array. It is called from outside the implementation when you are sure that no more values will be added (for instance, input from the user is over).
Code snippets
Both Java and C++ provide efficient tools to copy memory, which are used in the implementations below.
Java
import java.util.Arrays;
public void ensureCapacity(int minCapacity) {
    int capacity = storage.length;
    if (minCapacity > capacity) {
        int newCapacity = (capacity * 3) / 2 + 1;
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        storage = Arrays.copyOf(storage, newCapacity);
    }
}
private void pack() {
    int capacity = storage.length;
    if (size <= capacity / 2) {
        int newCapacity = (size * 3) / 2 + 1;
        storage = Arrays.copyOf(storage, newCapacity);
    }
}
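The trim method mentioned in the Pack section is not listed; a minimal Java sketch in the same style might look as follows (the storage/size fields and the fixed initial capacity are assumptions for illustration):

```java
import java.util.Arrays;

public class DynamicArrayDemo {
    private int[] storage = new int[10];
    private int size = 0;

    public void add(int value) {
        storage[size++] = value; // growth omitted for brevity
    }

    // Shrink the capacity so the storage fits the current elements exactly.
    public void trim() {
        storage = Arrays.copyOf(storage, size);
    }

    public int capacity() {
        return storage.length;
    }
}
```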
C++
#include <cstring>
void DynamicArray::setCapacity(int newCapacity) {
    int* newStorage = new int[newCapacity];
    memcpy(newStorage, storage, sizeof(int) * size);
    capacity = newCapacity;
    delete[] storage;
    storage = newStorage;
}

void DynamicArray::ensureCapacity(int minCapacity) {
    if (minCapacity > capacity) {
        int newCapacity = (capacity * 3) / 2 + 1;
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        setCapacity(newCapacity);
    }
}

void DynamicArray::pack() {
    if (size <= capacity / 2) {
        int newCapacity = (size * 3) / 2 + 1;
        setCapacity(newCapacity);
    }
}
Range check
There is not much to say about the range check. The algorithm checks whether the index is inside the 0..size-1 range and, if not, throws an exception.
InsertAt
This operation may require expanding the array, so the algorithm invokes the ensure-capacity method first, asking for a minimal capacity of size + 1. Then it shifts all elements from i to size - 1, where i is the insertion position, one element to the right. Note that if the new element is inserted after the last element of the array, no shifting is required. After shifting, put the value into the i-th element and increase size by one.
RemoveAt
Shift all elements from i to size - 1, where i is the removal position, one element to the left. Then decrease size by one and invoke the pack operation. Packing is done if there are too few elements left after the removal.
Code snippets
Java
public class DynamicArray {
private void rangeCheck(int index) {
    if (index < 0 || index >= size)
        throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
}

public void removeAt(int index) {
    rangeCheck(index);
    int moveCount = size - index - 1;
    if (moveCount > 0)
        System.arraycopy(storage, index + 1, storage, index, moveCount);
    size--;
    pack();
}

public void insertAt(int index, int value) {
    if (index < 0 || index > size)
        throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
    ensureCapacity(size + 1);
    int moveCount = size - index;
    if (moveCount > 0)
        System.arraycopy(storage, index, storage, index + 1, moveCount);
    storage[index] = value;
    size++;
}
}
C++
#include <cstring> #include <exception>
void DynamicArray::rangeCheck(int index) {
    if (index < 0 || index >= size)
        throw "Index out of bounds!";
}

void DynamicArray::removeAt(int index) {
    rangeCheck(index);
    int moveCount = size - index - 1;
    if (moveCount > 0)
        memmove(storage + index, storage + index + 1, sizeof(int) * moveCount);
    size--;
    pack();
}

void DynamicArray::insertAt(int index, int value) {
    if (index < 0 || index > size)
        throw "Index out of bounds!";
    ensureCapacity(size + 1);
    int moveCount = size - index;
    if (moveCount != 0)
        memmove(storage + index + 1, storage + index, sizeof(int) * moveCount);
    storage[index] = value;
    size++;
}
Introduction to graphs
Graphs are a widely used structure in computer science and in many computer applications. Note that we deliberately don't say "data structure" here: graphs are meant to store and analyze metadata, the connections present in data. For instance, consider the cities in your country. The road network that connects them can be represented as a graph and then analyzed: we can examine whether one city can be reached from another, or find the shortest route between two cities. First of all, we introduce some definitions on graphs. Next, we show how graphs are represented inside a computer. Then you can turn to basic graph algorithms. There are two important sets of objects that specify a graph and its structure. The first set is V, called the vertex set. In the road network example, the cities are the vertices. Each vertex can be drawn as a circle with the vertex's number inside.
vertices

The next important set is E, called the edge set. E is a subset of V x V. Simply speaking, each edge connects two vertices, including the case when a vertex is connected to itself (such an edge is called a loop). All graphs are divided into two big groups: directed and undirected graphs. The difference is that edges in directed graphs, called arcs, have a direction. These kinds of graphs have much in common, but significant differences are present as well; we will point out which kind of graph is considered in each particular algorithm description. An edge can be drawn as a line; if the graph is directed, each line has an arrow.
undirected graph
directed graph
A sequence of vertices such that there is an edge from each vertex to the next one in the sequence is called a path. The first vertex of the path is called the start vertex; the last vertex is called the end vertex. If the start and end vertices are the same, the path is called a cycle. A path is called simple if it includes every vertex at most once. A cycle is called simple if it includes every vertex, except the start (end) one, at most once. Let's see examples of a path and a cycle.
path (simple)
cycle (simple)
The last definition we give here is a weighted graph. A graph is called weighted if every edge is associated with a real number, called the edge weight. For instance, in the road network example the weight of each road may be its length or the minimal time needed to drive along it.
weighted graph
Undirected graphs
Adjacency matrix
Each cell aij of an adjacency matrix contains 1 if there is an edge between the i-th and j-th vertices, and 0 otherwise. Before discussing the advantages and disadvantages of this kind of representation, let us see an example.
Graph
Adjacency matrix
Edge (2, 5)
Edge (1, 3)
The graph in the example is undirected, which means that its adjacency matrix is symmetric. Indeed, in an undirected graph, if there is an edge (2, 5) then there is also an edge (5, 2). This is also the reason why there are two cells for every edge in the sample. Loops, if they are allowed in a graph, correspond to the diagonal elements of the adjacency matrix.

Advantages. The adjacency matrix is very convenient to work with: adding (removing) an edge can be done in O(1) time, and the same time is required to check whether there is an edge between two vertices. It is also very simple to program, and in all our graph tutorials we work with this kind of representation.

Disadvantages.
The adjacency matrix consumes a huge amount of memory for storing big graphs. All graphs can be divided into two categories: sparse and dense. Sparse graphs contain few edges (the number of edges is much less than the squared number of vertices, |E| << |V|^2). Dense graphs, on the other hand, contain a number of edges comparable to the squared number of vertices. The adjacency matrix is optimal for dense graphs, but for sparse ones it is superfluous. The next drawback of the adjacency matrix is that in many algorithms you need to know the edges adjacent to the current vertex. To draw such information out of the adjacency matrix you have to scan the corresponding row, which results in O(|V|) complexity. For algorithms like DFS, or those based on it, use of the adjacency matrix results in an overall complexity of O(|V|^2), while it can be reduced to O(|V| + |E|) when using an adjacency list. The last disadvantage we want to draw your attention to is that the adjacency matrix requires huge effort for adding/removing a vertex. If a graph is used for analysis only this doesn't matter, but if you want a fully dynamic structure, use of an adjacency matrix makes it quite slow for big graphs.
To sum up, the adjacency matrix is a good solution for dense graphs whose number of vertices stays constant.
Adjacency list
This kind of graph representation is one of the alternatives to the adjacency matrix. It requires less memory and, in particular situations, can even outperform the adjacency matrix. For every vertex, an adjacency list stores a list of the vertices adjacent to it. Let us see an example.
Graph
Adjacency list
Advantages. The adjacency list allows us to store a graph in a more compact form than the adjacency matrix, but the difference decreases as the graph becomes denser. Another advantage is that the adjacency list allows getting the list of adjacent vertices in O(1) time, which is a big advantage for some algorithms.
Disadvantages.
Adding/removing an edge to/from an adjacency list is not as easy as for an adjacency matrix: it requires, on average, O(|E| / |V|) time, which may result in cubic complexity for dense graphs when adding all edges. Checking whether there is an edge between two vertices can be done in O(|E| / |V|) time when the list of adjacent vertices is unordered, or O(log2(|E| / |V|)) when it is sorted; this operation stays quite cheap. An adjacency list doesn't allow an efficient implementation if the number of vertices changes dynamically: adding a new vertex can be done in O(|V|), but removal results in O(|E|) complexity.
To sum up, the adjacency list is a good solution for sparse graphs and lets us change the number of vertices more efficiently than an adjacency matrix. But there are still better solutions for storing fully dynamic graphs.
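No snippet is given for the adjacency list, so here is a minimal Java sketch for an undirected graph; the class and method names are illustrative only:

```java
import java.util.ArrayList;
import java.util.List;

public class AdjListGraph {
    // One list of neighbours per vertex.
    private final List<List<Integer>> adj = new ArrayList<>();

    public AdjListGraph(int vertexCount) {
        for (int i = 0; i < vertexCount; i++)
            adj.add(new ArrayList<Integer>());
    }

    // Undirected edge: record each endpoint in the other's list.
    public void addEdge(int i, int j) {
        adj.get(i).add(j);
        adj.get(j).add(i);
    }

    // O(1): simply hand back the stored list of adjacent vertices.
    public List<Integer> neighbours(int i) {
        return adj.get(i);
    }
}
```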
Code snippets
For simplicity, we show code snippets only for the adjacency matrix, which is used throughout our graph tutorials. Notice that this is an implementation for undirected graphs.
Java
public class Graph {
    private boolean adjacencyMatrix[][];
    private int vertexCount;

    public Graph(int vertexCount) {
        this.vertexCount = vertexCount;
        adjacencyMatrix = new boolean[vertexCount][vertexCount];
    }

    public void addEdge(int i, int j) {
        if (i >= 0 && i < vertexCount && j >= 0 && j < vertexCount) {
            adjacencyMatrix[i][j] = true;
            adjacencyMatrix[j][i] = true;
        }
    }

    public void removeEdge(int i, int j) {
        if (i >= 0 && i < vertexCount && j >= 0 && j < vertexCount) {
            adjacencyMatrix[i][j] = false;
            adjacencyMatrix[j][i] = false;
        }
    }

    public boolean isEdge(int i, int j) {
        if (i >= 0 && i < vertexCount && j >= 0 && j < vertexCount)
            return adjacencyMatrix[i][j];
        else
            return false;
    }
}
C++
class Graph {
private:
    bool** adjacencyMatrix;
    int vertexCount;

public:
    Graph(int vertexCount) {
        this->vertexCount = vertexCount;
        adjacencyMatrix = new bool*[vertexCount];
        for (int i = 0; i < vertexCount; i++) {
            adjacencyMatrix[i] = new bool[vertexCount];
            for (int j = 0; j < vertexCount; j++)
                adjacencyMatrix[i][j] = false;
        }
    }

    void addEdge(int i, int j) {
        if (i >= 0 && i < vertexCount && j >= 0 && j < vertexCount) {
            adjacencyMatrix[i][j] = true;
            adjacencyMatrix[j][i] = true;
        }
    }

    void removeEdge(int i, int j) {
        if (i >= 0 && i < vertexCount && j >= 0 && j < vertexCount) {
            adjacencyMatrix[i][j] = false;
            adjacencyMatrix[j][i] = false;
        }
    }

    bool isEdge(int i, int j) {
        if (i >= 0 && i < vertexCount && j >= 0 && j < vertexCount)
            return adjacencyMatrix[i][j];
        else
            return false;
    }

    ~Graph() {
        for (int i = 0; i < vertexCount; i++)
            delete[] adjacencyMatrix[i];
        delete[] adjacencyMatrix;
    }
};
Depth-first search (DFS)

Algorithm
In DFS, each vertex has one of three possible colors representing its state:
white: the vertex is unvisited;
gray: the vertex is in progress;
black: DFS has finished processing the vertex.
NB. For most algorithms the boolean classification unvisited/visited is quite enough, but we show the general case here. Initially all vertices are white (unvisited). DFS starts in an arbitrary vertex and runs as follows:
1. Mark vertex u as gray (visited).
2. For each edge (u, v), where v is white, run depth-first search from v recursively.
3. Mark vertex u as black and backtrack to the parent.
Example. Traverse a graph shown below, using DFS. Start from a vertex with number 1.
Source graph.
There are no ways to go from the vertex 3. Mark it as black and backtrack to the vertex 5.
There are no ways to go from the vertex 5. Mark it as black and backtrack to the vertex 2.
There are no more edges, adjacent to vertex 2. Mark it as black and backtrack to the vertex 4.
There are no more edges, adjacent to the vertex 4. Mark it as black and backtrack to the vertex 1.
There are no more edges, adjacent to the vertex 1. Mark it as black. DFS is over.
As you can see from the example, DFS doesn't go through all edges. The vertices and edges that depth-first search has visited form a tree. This tree contains all vertices of the graph (if it is connected) and is called a graph spanning tree. It exactly corresponds to the recursive calls of DFS.
If a graph is disconnected, DFS won't visit all of its vertices. For details, see finding connected components algorithm.
Complexity analysis
Assume that the graph is connected. Depth-first search visits every vertex of the graph and checks every edge once. Therefore, DFS complexity is O(|V| + |E|). As mentioned before, if an adjacency matrix is used for the graph representation, the edges adjacent to a vertex can't be found efficiently, which results in O(|V|^2) complexity. You can find a rigorous treatment of DFS complexity in [1].
Code snippets
In truth, the implementation below yields nothing useful by itself; you will find actual uses of DFS in further tutorials.
Java
public class Graph {
public void DFS() {
    VertexState state[] = new VertexState[vertexCount];
    for (int i = 0; i < vertexCount; i++)
        state[i] = VertexState.White;
    runDFS(0, state);
}

public void runDFS(int u, VertexState[] state) {
    state[u] = VertexState.Gray;
    for (int v = 0; v < vertexCount; v++)
        if (isEdge(u, v) && state[v] == VertexState.White)
            runDFS(v, state);
    state[u] = VertexState.Black;
}
}
C++
enum VertexState { White, Gray, Black };
void Graph::DFS() {
    VertexState* state = new VertexState[vertexCount];
    for (int i = 0; i < vertexCount; i++)
        state[i] = White;
    runDFS(0, state);
    delete[] state;
}

void Graph::runDFS(int u, VertexState state[]) {
    state[u] = Gray;
    for (int v = 0; v < vertexCount; v++)
        if (isEdge(u, v) && state[v] == White)
            runDFS(v, state);
    state[u] = Black;
}
Hash table
A hash table (or hash map) is one of the possible implementations of the dictionary ADT: basically, it maps unique keys to associated values. In terms of implementation, a hash table is an array-based data structure that uses a hash function to convert a key into the index of the array element where the associated value is to be sought.
Hash function
The hash function is a very important part of hash table design. A hash function is considered good if it provides a uniform distribution of hash values. Other properties required of a quality hash function will be examined in detail later. The reason the hash function is of principal concern is that poor hash functions cause collisions and other unwanted effects, which badly affect overall hash table performance.
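For illustration, here is one common shape a hash function can take: a polynomial string hash reduced modulo the table size. This is a sketch only; the multiplier 31 and the table size 101 are illustrative choices, not something prescribed by this tutorial:

```java
public class SimpleHash {
    // Polynomial hash of a string, reduced modulo the table size so the
    // result is a valid array index. Multiplier 31 is an arbitrary but
    // commonly used odd constant (an assumption of this sketch).
    static int hash(String key, int tableSize) {
        int h = 0;
        for (int i = 0; i < key.length(); i++)
            h = (h * 31 + key.charAt(i)) % tableSize;
        return h;
    }

    public static void main(String[] args) {
        System.out.println(hash("apple", 101));
        System.out.println(hash("grape", 101));
    }
}
```

Reducing modulo the table size on every step keeps intermediate values small and avoids integer overflow for long keys.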
Collisions
What happens if the hash function returns the same hash value for different keys? The effect is called a collision. Collisions are practically unavoidable and must be considered when implementing a hash table. Due to collisions, the keys themselves are also stored in the table, so that key-value pairs having the same hash can be distinguished. There are various ways of collision resolution; basically, there are two different strategies:
Closed addressing (open hashing). Each slot of the hash table contains a link to another data structure (e.g. a linked list), which stores key-value pairs with the same hash. When a collision occurs, this data structure is searched for the key-value pair that matches the key.

Open addressing (closed hashing). Each slot actually contains a key-value pair. When a collision occurs, the open addressing algorithm calculates another location (e.g. the next slot) to find a free one. Hash tables based on the open addressing strategy experience a drastic performance decrease when the table is nearly full (load factor 0.7 or more).
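The closed addressing strategy can be sketched as follows. This is a minimal illustration, not a production hash table; the class and method names are made up for the example, and String keys with int values are assumed:

```java
import java.util.LinkedList;

public class ChainedHashTable {
    // One entry of the table; the key is stored alongside the value
    // so that colliding pairs can be told apart.
    private static class Entry {
        String key;
        int value;
        Entry(String key, int value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry>[] slots;

    @SuppressWarnings("unchecked")
    public ChainedHashTable(int capacity) {
        slots = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++)
            slots[i] = new LinkedList<>();
    }

    // Maps a key to a slot index; deliberately simple for this sketch.
    private int slotFor(String key) {
        return Math.abs(key.hashCode() % slots.length);
    }

    public void put(String key, int value) {
        LinkedList<Entry> chain = slots[slotFor(key)];
        for (Entry e : chain)
            if (e.key.equals(key)) { e.value = value; return; } // key present: replace
        chain.add(new Entry(key, value)); // new key (possibly a collision): append
    }

    public Integer get(String key) {
        for (Entry e : slots[slotFor(key)])
            if (e.key.equals(key))
                return e.value;
        return null; // no such key
    }
}
```

Creating the table with a small capacity (say, 4) and inserting more keys than slots forces collisions, which simply end up in the same chain.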
Singly-linked list
The linked list is a very important dynamic data structure. Basically, there are two types of linked list: singly-linked and doubly-linked. In a singly-linked list every element contains some data and a link to the next element, which holds the structure together. In a doubly-linked list every node additionally contains a link to the previous node. A linked list can be the underlying data structure for implementing a stack, a queue or a sorted list.
Example
Schematically, a singly-linked list can be pictured like this:
Each cell is called a node of the singly-linked list. The first node is called the head, and it is a dedicated node: by knowing it, we can access every other node in the list. Sometimes the last node, called the tail, is also stored in order to speed up the add operation.
Visualizers
1. Linked List in Java Applets Centre
Traversal algorithm
Beginning from the head:
1. check whether the end of the list has been reached;
2. perform the action, specific to the particular algorithm, on the current node;
3. the current node becomes the previous one, and the next node becomes the current one. Go to step 1.
Example
As an example, let us sum up the values stored in a singly-linked list.
For some algorithms tracking the previous node is essential, but for others, like this example, it is unnecessary. We show the common case here; a concrete algorithm can be adjusted to meet its individual requirements.
Code snippets
Although we have two classes for the singly-linked list, the SinglyLinkedListNode class is used as storage only. The whole algorithm is implemented in the SinglyLinkedList class.
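The node class itself is not shown in the snippets below. A minimal version, matching the two fields the algorithms rely on (value and next), might look like this; the constructor is an assumption of this sketch:

```java
public class SinglyLinkedListNode {
    public int value;                  // data stored in the node
    public SinglyLinkedListNode next;  // link to the next node; null marks the end

    public SinglyLinkedListNode(int value) {
        this.value = value;
        this.next = null;  // a freshly created node is not linked yet
    }
}
```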
Java implementation
public class SinglyLinkedList {
    public int traverse() {
        int sum = 0;
        SinglyLinkedListNode current = head;
        SinglyLinkedListNode previous = null;
        while (current != null) {
            sum += current.value;
            previous = current;
            current = current.next;
        }
        return sum;
    }
}
C++ implementation
int SinglyLinkedList::traverse() {
    int sum = 0;
    SinglyLinkedListNode *current = head;
    SinglyLinkedListNode *previous = NULL;
    while (current != NULL) {
        sum += current->value;
        previous = current;
        current = current->next;
    }
    return sum;
}
When the list is empty, which is indicated by the (head == NULL) condition, insertion is quite simple. The algorithm sets both head and tail to point to the new node.
Add first
In this case, the new node is inserted right before the current head node.
It can be done in two steps:
1. Update the next link of the new node to point to the current head node.
2. Update the head link to point to the new node.
Add last
In this case, the new node is inserted right after the current tail node.
It can be done in two steps:
1. Update the next link of the current tail node to point to the new node.
2. Update the tail link to point to the new node.
General case
In the general case, the new node is always inserted between two nodes which are already in the list. The head and tail links are not updated in this case.
Code snippets
All the cases shown above can be implemented in one function with two arguments: the node to insert after and the new node. For the add first operation, the arguments are (NULL, newNode). For the add last operation, the arguments are (tail, newNode). Still, these special operations (add first and add last) can be implemented separately in order to avoid unnecessary checks.
Java implementation
public class SinglyLinkedList {
    public void addLast(SinglyLinkedListNode newNode) {
        if (newNode == null)
            return;
        newNode.next = null;
        if (head == null) {
            head = newNode;
            tail = newNode;
        } else {
            tail.next = newNode;
            tail = newNode;
        }
    }

    public void addFirst(SinglyLinkedListNode newNode) {
        if (newNode == null)
            return;
        if (head == null) {
            newNode.next = null;
            head = newNode;
            tail = newNode;
        } else {
            newNode.next = head;
            head = newNode;
        }
    }

    // inserts newNode after previous; NULL previous means "add first"
    public void insertAfter(SinglyLinkedListNode previous, SinglyLinkedListNode newNode) {
        if (newNode == null)
            return;
        if (previous == null)
            addFirst(newNode);
        else if (previous == tail)
            addLast(newNode);
        else {
            SinglyLinkedListNode next = previous.next;
            previous.next = newNode;
            newNode.next = next;
        }
    }
}
C++ implementation
void SinglyLinkedList::addLast(SinglyLinkedListNode *newNode) {
    if (newNode == NULL)
        return;
    newNode->next = NULL;
    if (head == NULL) {
        head = newNode;
        tail = newNode;
    } else {
        tail->next = newNode;
        tail = newNode;
    }
}

void SinglyLinkedList::addFirst(SinglyLinkedListNode *newNode) {
    if (newNode == NULL)
        return;
    if (head == NULL) {
        newNode->next = NULL;
        head = newNode;
        tail = newNode;
    } else {
        newNode->next = head;
        head = newNode;
    }
}

// inserts newNode after previous; NULL previous means "add first"
void SinglyLinkedList::insertAfter(SinglyLinkedListNode *previous, SinglyLinkedListNode *newNode) {
    if (newNode == NULL)
        return;
    if (previous == NULL)
        addFirst(newNode);
    else if (previous == tail)
        addLast(newNode);
    else {
        SinglyLinkedListNode *next = previous->next;
        previous->next = newNode;
        newNode->next = next;
    }
}
Remove first
In this case, the first node (the current head node) is removed from the list.
It can be done in two steps:
1. Update the head link to point to the node next after the head.
2. Dispose of the removed node.
Remove last
In this case, the last node (the current tail node) is removed from the list. This operation is a bit more tricky than removing the first node, because the algorithm must first find the node previous to the tail.
General case
In the general case, the node to be removed is always located between two list nodes. The head and tail links are not updated in this case.
Code snippets
All the cases shown above can be implemented in one function with a single argument: the node previous to the node to be removed. For the remove first operation, the argument is NULL. For the remove last operation, the argument is the node previous to the tail. Still, it is better to implement these special cases (remove first and remove last) in separate functions. Notice that removing the first node and removing the last node have different complexity, because remove last needs to traverse the whole list to find the node previous to the tail.
Java implementation
public class SinglyLinkedList {
    public void removeFirst() {
        if (head == null)
            return;
        if (head == tail) {
            head = null;
            tail = null;
        } else {
            head = head.next;
        }
    }

    public void removeLast() {
        if (tail == null)
            return;
        if (head == tail) {
            head = null;
            tail = null;
        } else {
            SinglyLinkedListNode previousToTail = head;
            while (previousToTail.next != tail)
                previousToTail = previousToTail.next;
            tail = previousToTail;
            tail.next = null;
        }
    }

    public void removeNext(SinglyLinkedListNode previous) {
        if (previous == null)
            removeFirst();
        else if (previous.next == tail) {
            tail = previous;
            tail.next = null;
        } else if (previous == tail)
            return;
        else
            previous.next = previous.next.next;
    }
}
C++ implementation
void SinglyLinkedList::removeFirst() {
    if (head == NULL)
        return;
    SinglyLinkedListNode *removedNode = head;
    if (head == tail) {
        head = NULL;
        tail = NULL;
    } else {
        head = head->next;
    }
    delete removedNode;
}

void SinglyLinkedList::removeLast() {
    if (tail == NULL)
        return;
    SinglyLinkedListNode *removedNode = tail;
    if (head == tail) {
        head = NULL;
        tail = NULL;
    } else {
        SinglyLinkedListNode *previousToTail = head;
        while (previousToTail->next != tail)
            previousToTail = previousToTail->next;
        tail = previousToTail;
        tail->next = NULL;
    }
    delete removedNode;
}

void SinglyLinkedList::removeNext(SinglyLinkedListNode *previous) {
    if (previous == NULL)
        removeFirst();
    else if (previous->next == tail) {
        SinglyLinkedListNode *removedNode = previous->next;
        tail = previous;
        tail->next = NULL;
        delete removedNode;
    } else if (previous == tail)
        return;
    else {
        SinglyLinkedListNode *removedNode = previous->next;
        previous->next = removedNode->next;
        delete removedNode;
    }
}
Dictionary ADT
A dictionary (map, association list) is a data structure which, generally, is an association of unique keys with some values. One may bind a value to a key, delete a key (and, naturally, the associated value), and look up the value by its key. Values are not required to be unique. A simple usage example is an explanatory dictionary: in this example, words are keys and explanations are values.
Operations
Dictionary create()
    creates an empty dictionary
boolean isEmpty(Dictionary d)
    tells whether the dictionary d is empty
put(Dictionary d, Key k, Value v)
    associates key k with value v; if key k is already present in the dictionary, the old value is replaced by v
Value get(Dictionary d, Key k)
    returns the value associated with key k, or null if the dictionary contains no such key
remove(Dictionary d, Key k)
    removes key k and the associated value
destroy(Dictionary d)
    destroys dictionary d
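In Java, these operations map naturally onto java.util.HashMap (used here as a stand-in for the abstract Dictionary type); a short illustration:

```java
import java.util.HashMap;
import java.util.Map;

public class DictionaryDemo {
    public static void main(String[] args) {
        Map<String, String> d = new HashMap<>();   // create()
        System.out.println(d.isEmpty());           // true: nothing stored yet
        d.put("array", "a fixed-size sequence");   // put(d, k, v)
        d.put("array", "an indexed collection");   // same key: old value is replaced
        System.out.println(d.get("array"));        // get(d, k)
        System.out.println(d.get("list"));         // no such key: null
        d.remove("array");                         // remove(d, k)
        System.out.println(d.isEmpty());           // true again
    }
}
```

There is no explicit destroy() in Java; the garbage collector reclaims the map once it is unreachable.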
Implementations
In applications we have pairs (priority, item), where an item is some auxiliary data the priority is associated with. To keep things simple, we omit explicit priorities and consider that, for items e1 and e2, e1 < e2 means that e1 has higher priority than e2.
Operations
PriorityQueue create()
    creates an empty priority queue
boolean isEmpty(PriorityQueue pq)
    tells whether the priority queue pq is empty
insert(PriorityQueue pq, Item e)
    inserts item e into the priority queue pq
Item minimum(PriorityQueue pq)
    returns the minimal item in the priority queue pq
    Precondition: pq is not empty
removeMin(PriorityQueue pq)
    removes the minimal item from the priority queue pq
    Precondition: pq is not empty
destroy(PriorityQueue pq)
    destroys the priority queue pq
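In Java, java.util.PriorityQueue provides the same operations under different names (add for insert, peek for minimum, poll for removeMin); a short illustration:

```java
import java.util.PriorityQueue;

public class PriorityQueueDemo {
    public static void main(String[] args) {
        PriorityQueue<Integer> pq = new PriorityQueue<>(); // create()
        pq.add(5);                        // insert
        pq.add(1);
        pq.add(3);
        System.out.println(pq.peek());    // minimum(): smaller value = higher priority
        pq.poll();                        // removeMin()
        System.out.println(pq.peek());    // next minimum
        System.out.println(pq.isEmpty()); // false: two items remain
    }
}
```

Unlike the ADT above, peek() and poll() do not require a non-empty queue as a precondition; on an empty queue they return null instead.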
Implementations
Implementation
The implementation of an array-based stack is very simple. It uses a top variable to point to the topmost stack element in the array.
1. Initially top = -1;
2. the push operation increases top by one and writes the pushed element to storage[top];
3. the pop operation checks that top is not equal to -1 and decreases top by 1;
4. the peek operation checks that top is not equal to -1 and returns storage[top];
5. isEmpty returns the boolean (top == -1).
Code snippets
Java implementation
public class Stack {
    private int top;
    private int[] storage;

    Stack(int capacity) {
        if (capacity <= 0)
            throw new IllegalArgumentException("Stack's capacity must be positive");
        storage = new int[capacity];
        top = -1;
    }
    void push(int value) {
        if (top == storage.length - 1)
            throw new StackException("Stack's underlying storage overflow");
        top++;
        storage[top] = value;
    }
    int pop() {
        if (top == -1)
            throw new StackException("Stack is empty");
        return storage[top--];
    }

    int peek() {
        if (top == -1)
            throw new StackException("Stack is empty");
        return storage[top];
    }

    boolean isEmpty() {
        return (top == -1);
    }
}
C++ implementation
#include <string>

using namespace std;
class Stack {
private:
    int top;
    int capacity;
    int *storage;
public:
    Stack(int capacity) {
        if (capacity <= 0)
            throw string("Stack's capacity must be positive");
        storage = new int[capacity];
        this->capacity = capacity;
        top = -1;
    }

    void push(int value) {
        if (top == capacity - 1)
            throw string("Stack's underlying storage overflow");
        top++;
        storage[top] = value;