
Master of Computer Application (MCA) Semester 2 MC0068 Data Structures using C

Assignment Set 1

1. Describe the usage of pointers in functions with a suitable example.

Ans:- Regarding their syntax, there are two different types of function pointers. On the one hand there are pointers to ordinary C functions or to static C++ member functions. On the other hand there are pointers to non-static C++ member functions. The basic difference is that all pointers to non-static member functions need a hidden argument: the this-pointer to an instance of the class. Always keep in mind that these two types of function pointers are incompatible with each other. Since a function pointer is nothing else than a variable, it must be defined as usual. In the following example we define three function pointers named pt2Function, pt2Member and pt2ConstMember. They point to functions which take one float and two char and return an int. In the C++ example it is assumed that the functions our pointers point to are (non-static) member functions of TMyClass.

    int (*pt2Function)(float, char, char) = NULL;                      /* C   */
    int (TMyClass::*pt2Member)(float, char, char) = NULL;              /* C++ */
    int (TMyClass::*pt2ConstMember)(float, char, char) const = NULL;   /* C++ */

A function pointer can be compared with NULL and with the address of a function. The check below assumes a function DoIt with a matching signature:

    if (pt2Function != NULL) {
        if (pt2Function == &DoIt)
            printf("Pointer points to DoIt\n");
    } else
        printf("Pointer not initialized!!\n");
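As a rough illustration of both uses of pointers with functions in plain C, the sketch below passes data pointers to a function so it can modify the caller's variables, and passes a function pointer so the callee can decide at run time which function to call. The names swap_ints, add, mul and apply are hypothetical and do not come from the answer above.

```c
#include <stddef.h>

/* Hypothetical helper: swaps two ints through pointer parameters. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Two functions sharing the signature int f(int, int). */
int add(int x, int y) { return x + y; }
int mul(int x, int y) { return x * y; }

/* Takes a function pointer and calls the function it points to. */
int apply(int (*op)(int, int), int x, int y)
{
    if (op == NULL)   /* an unset pointer must not be called */
        return 0;
    return op(x, y);
}
```

A caller would write swap_ints(&a, &b) to exchange two variables, and apply(add, 3, 4) or apply(&mul, 3, 4) to select the operation; both spellings of the function address are equivalent in C.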

2. Demonstrate with your own programming example the usage of structures within an array.

Ans:- Just as arrays of basic types such as integers and floats are allowed in C, so are arrays of structures. An array of structures is declared in the usual way:

    struct personal_data my_struct_array[100];

The members of the structures in the array are then accessed by indexing the array and applying the member operator to the selected element. The value of a member of a structure in an array can be assigned to another variable, or the value of a variable can be assigned to a member. For example, the following code assigns the number 1974 to the year_of_birth member of the fourth element of my_struct_array:

    my_struct_array[3].year_of_birth = 1974;

Example:

    #include <stdio.h>

    struct matrix {
        int rows;
        int cols;
        int **val;
    } a = {
        .rows = 3, .cols = 1,
        .val = (int *[3]){ (int[1]){1}, (int[1]){2}, (int[1]){3} }
    }, b = {
        .rows = 3, .cols = 4,
        .val = (int *[3]){ (int[4]){1, 2, 3, 4}, (int[4]){5, 6, 7, 8}, (int[4]){9, 10, 11, 12} }
    };

    void print_matrix(char *name, struct matrix *m)
    {
        for (int row = 0; row < m->rows; row++)
            for (int col = 0; col < m->cols; col++)
                printf("%s[%i][%i]: %i\n", name, row, col, m->val[row][col]);
        puts("");
    }

    int main()
    {
        print_matrix("a", &a);
        print_matrix("b", &b);
    }
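The declaration and assignment described earlier in this answer can also be shown with an actual array of structures. This is a minimal sketch: only the year_of_birth member appears in the text, so the name member and the function oldest_index are assumptions made for illustration.

```c
#include <stddef.h>

/* Only year_of_birth is taken from the text; name is a hypothetical member. */
struct personal_data {
    char name[32];
    int  year_of_birth;
};

/* Returns the index of the oldest person (smallest year_of_birth),
   or -1 if the array is empty. */
int oldest_index(const struct personal_data people[], size_t count)
{
    if (count == 0)
        return -1;
    size_t best = 0;
    for (size_t i = 1; i < count; i++)
        if (people[i].year_of_birth < people[best].year_of_birth)
            best = i;
    return (int)best;
}
```

Each element people[i] is a complete structure, so people[i].year_of_birth reads one member of one element, exactly as in the assignment to my_struct_array[3].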

3. Explain the theory of non linear data structures.

Ans:- Linear data structures: A linear data structure arranges its data elements in a sequence, so that from any element only one other element can be reached directly. Traversal, insertion and deletion proceed in a linear, sequential fashion. Examples: arrays, linked lists, stacks and queues.

Non-linear data structures: In a non-linear data structure the data items are not arranged in a sequence, so purely sequential traversal and updating are not possible. Each node or item may be connected with two or more other nodes or items in a non-linear arrangement. Examples: trees, graphs and, in some treatments, multidimensional arrays. Non-linear container classes represent trees and graphs. Because the computer's memory is itself linear, these structures have to be represented on top of linear storage, for instance with arrays or with nodes linked by pointers.

4. Write a program in C showing the implementation of stack operations using structures.

Ans:- Stack operations:

Push and pop are the operations that are provided for insertion of an element into the stack and the removal of an element from the stack. The example relies on a header stack.h that declares init, full, empty, push and pop; that header is not shown in this answer.

Example:

    #include <stdio.h>
    #include <stdlib.h>
    #include "stack.h"

    #define size 3

    int main(void)
    {
        int top, element;
        int stack[size];

        init(&top);
        while (!full(&top, size)) {
            element = rand();
            printf("push element %d into stack\n", element);
            push(stack, &top, element);
            getchar();   /* wait for a key press */
        }
        printf("stack is full\n");
        while (!empty(&top)) {
            element = pop(stack, &top);
            printf("pop element %d from stack\n", element);
            getchar();
        }
        printf("stack is empty\n");
        getchar();
        return 0;
    }
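Since the question asks for structures and stack.h is not reproduced, the following is one possible sketch of a stack that bundles the array and the top index in a single struct. The type struct stack and the stack_* function names are assumptions for illustration; they are not the contents of the original stack.h.

```c
#include <stdbool.h>

#define STACK_SIZE 3   /* mirrors "size" in the program above */

/* The array and the index of the top element bundled in one structure. */
struct stack {
    int items[STACK_SIZE];
    int top;               /* -1 when the stack is empty */
};

void stack_init(struct stack *s)        { s->top = -1; }
bool stack_empty(const struct stack *s) { return s->top == -1; }
bool stack_full(const struct stack *s)  { return s->top == STACK_SIZE - 1; }

/* Returns false (and leaves the stack unchanged) on overflow. */
bool stack_push(struct stack *s, int element)
{
    if (stack_full(s))
        return false;
    s->items[++s->top] = element;
    return true;
}

/* Stores the removed element in *out; returns false on underflow. */
bool stack_pop(struct stack *s, int *out)
{
    if (stack_empty(s))
        return false;
    *out = s->items[s->top--];
    return true;
}
```

Returning a status flag instead of exiting lets the caller decide how to react to overflow and underflow; elements come back in the reverse of the order in which they were pushed.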

5. Describe the theory and applications of Double Ended Queues (Deque) and circular queues.

Ans:- Double Ended Queues (Deque):- Like an ordinary queue, a double-ended queue is a container.
It supports the following operations: enq_front, enq_back, deq_front, deq_back, and empty. By choosing a subset of these operations, you can make the double-ended queue behave like a stack or like a queue. For instance, if you use only enq_front and deq_front you get a stack, and if you use only enq_front and deq_back you get a queue. By now, the reader should be used to using header objects in order to obtain uniform reference semantics. We will therefore go directly to a version of the double-ended queue with a separate header file and implementation file.

Example:-

    #include "dqueue.h"
    #include "dlist.h"
    #include <assert.h>
    #include <stdlib.h>

    struct dqueue {
        dlist head;
        dlist tail;
    };

    dqueue dq_create(void)
    {
        dqueue q = malloc(sizeof(struct dqueue));
        q->head = q->tail = NULL;
        return q;
    }

    int dq_empty(dqueue q)
    {
        return q->head == NULL;
    }

    /* dcons(element, prev, next) is assumed to be declared in dlist.h */
    void dq_enq_front(dqueue q, void *element)
    {
        if (dq_empty(q))
            q->head = q->tail = dcons(element, NULL, NULL);
        else {
            q->head->prev = dcons(element, NULL, q->head);
            q->head->prev->next = q->head;
            q->head = q->head->prev;
        }
    }

    void dq_enq_back(dqueue q, void *element)
    {
        if (dq_empty(q))
            q->head = q->tail = dcons(element, NULL, NULL);
        else {
            q->tail->next = dcons(element, q->tail, NULL);
            q->tail->next->prev = q->tail;
            q->tail = q->tail->next;
        }
    }

    void *dq_deq_front(dqueue q)
    {
        assert(!dq_empty(q));
        {
            dlist temp = q->head;          /* a list node, not a queue */
            void *element = temp->element;
            q->head = q->head->next;
            free(temp);
            if (q->head == NULL)
                q->tail = NULL;
            else
                q->head->prev = NULL;
            return element;
        }
    }

    void *dq_deq_back(dqueue q)
    {
        assert(!dq_empty(q));
        {
            dlist temp = q->tail;
            void *element = temp->element;
            q->tail = q->tail->prev;
            free(temp);
            if (q->tail == NULL)
                q->head = NULL;
            else
                q->tail->next = NULL;
            return element;
        }
    }

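Circular queues:- A circular queue is a particular implementation of a queue in a fixed-size array whose indices wrap around to the start when they reach the end. It is very efficient: insertion and deletion each take constant time and no elements are ever shifted. It is also quite useful in low-level code, because insertion and deletion are totally independent (insertion changes only the rear index, deletion only the front index). With a single producer and a single consumer this means that you don't have to worry about an interrupt handler trying to do an insertion at the same time as your main code is doing a deletion.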

Example:

    #include <stdio.h>
    #include <stdlib.h>

    #define MAX 10

    /* One slot is always left unused so that front == rear means "empty". */
    void insert(int queue[], int *rear, int front, int value)
    {
        int next = (*rear + 1) % MAX;
        if (next == front) {
            printf("The queue is full, cannot insert a value\n");
            exit(0);
        }
        *rear = next;
        queue[*rear] = value;
    }

    void delete(int queue[], int *front, int rear, int *value)
    {
        if (*front == rear) {
            printf("The queue is empty, cannot delete a value\n");
            exit(0);
        }
        *front = (*front + 1) % MAX;
        *value = queue[*front];
    }

    int main(void)
    {
        int queue[MAX];
        int front = 0, rear = 0;
        int value;

        insert(queue, &rear, front, 1);
        insert(queue, &rear, front, 2);
        insert(queue, &rear, front, 3);
        insert(queue, &rear, front, 4);
        delete(queue, &front, rear, &value);
        printf("The value deleted is %d\n", value);
        delete(queue, &front, rear, &value);
        printf("The value deleted is %d\n", value);
        delete(queue, &front, rear, &value);
        printf("The value deleted is %d\n", value);
        return 0;
    }

6. With the help of a suitable numerical example, describe the following concepts of a Binary Search Tree:

A) Analysis of BST :- We carry out an analysis of the binary search method to determine its time complexity. Since there are no for loops, we cannot use summations to express the total number of operations. Let us examine the operations for a specific case, where the number of elements in the array n is 64.

    When n = 64 BinarySearch is called to reduce size to n = 32
    When n = 32 BinarySearch is called to reduce size to n = 16
    When n = 16 BinarySearch is called to reduce size to n = 8
    When n = 8  BinarySearch is called to reduce size to n = 4
    When n = 4  BinarySearch is called to reduce size to n = 2
    When n = 2  BinarySearch is called to reduce size to n = 1

The size is halved six times, and 6 = log2 64. In general a search space of n elements is halved about log2 n times, so the search takes O(log n) time.

    int BinarySearch(int A[], int n, int K)
    {
        int L = 0, Mid, R = n - 1;
        while (L <= R) {
            Mid = (L + R) / 2;
            if (K == A[Mid])
                return Mid;
            else if (K > A[Mid])
                L = Mid + 1;
            else
                R = Mid - 1;
        }
        return -1;   /* K is not in the array */
    }

B) Insertion of Nodes into a BST :- To insert a node into a BST
1. find a leaf at the appropriate place and
2. connect the node to the parent of the leaf.

    TREE-INSERT (T, z)
        y ← NIL
        x ← root[T]
        while x ≠ NIL
            do y ← x
               if key[z] < key[x]
                   then x ← left[x]
                   else x ← right[x]
        p[z] ← y
        if y = NIL
            then root[T] ← z
            else if key[z] < key[y]
                then left[y] ← z
                else right[y] ← z

Like other primitive operations on search trees, this algorithm begins at the root of the tree and traces a path downward. Clearly, it runs in O(h) time on a tree of height h.
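The pseudocode can be turned into C along the following lines, which also gives a small numerical example. This is a sketch under assumptions: the struct node layout and the names tree_insert and tree_height are chosen here for illustration, and duplicate keys are simply sent to the right subtree.

```c
#include <stdlib.h>

struct node {
    int key;
    struct node *left, *right, *parent;
};

/* Iterative insertion following the TREE-INSERT outline: walk down from
   the root, remember the last node seen (y), then hang the new node z
   below y. Returns the (possibly new) root, or the unchanged root if
   allocation fails. */
struct node *tree_insert(struct node *root, int key)
{
    struct node *z = malloc(sizeof *z);
    if (z == NULL)
        return root;
    z->key = key;
    z->left = z->right = NULL;

    struct node *y = NULL;
    struct node *x = root;
    while (x != NULL) {
        y = x;
        x = (key < x->key) ? x->left : x->right;
    }
    z->parent = y;
    if (y == NULL)
        return z;             /* tree was empty: z becomes the root */
    if (key < y->key)
        y->left = z;
    else
        y->right = z;
    return root;
}

/* Height counted in edges; an empty tree has height -1. */
int tree_height(const struct node *t)
{
    if (t == NULL)
        return -1;
    int l = tree_height(t->left);
    int r = tree_height(t->right);
    return 1 + (l > r ? l : r);
}
```

Inserting the keys 50, 30, 70, 20, 40 in that order gives a tree with root 50, left child 30 (with children 20 and 40) and right child 70, so the height is 2 and each insertion followed a path of at most two edges.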

7. Explain the Bellman Ford algorithm with respect to Minimum Spanning Trees.

Ans:- The program below builds a minimum spanning tree by repeatedly adding the node that is closest to the tree, which is the approach of Prim's algorithm; the Bellman-Ford algorithm itself computes single-source shortest paths rather than a spanning tree.

    #include <stdio.h>

    /* The input file (weight.txt) looks something like this
         4
         0 0 0 21
         0 0 8 17
         0 8 0 16
         21 17 16 0
       The first line contains n, the number of nodes.
       Next is an nxn matrix containing the distances between the nodes.
       NOTE: The distance between a node and itself should be 0 */

    int n; /* The number of nodes in the graph */

    int weight[100][100]; /* weight[i][j] is the distance between node i and
                             node j; if there is no path between i and j,
                             weight[i][j] should be 0 */

    char inTree[100]; /* inTree[i] is 1 if the node i is already in the
                         minimum spanning tree; 0 otherwise */

    int d[100]; /* d[i] is the distance between node i and the minimum
                   spanning tree; this is initially infinity (100000); if i
                   is already in the tree, then d[i] is undefined; this is
                   just a temporary variable. It's not necessary but speeds
                   up execution considerably (by a factor of n) */

    int whoTo[100]; /* whoTo[i] holds the index of the node i would have to
                       be linked to in order to get a distance of d[i] */

    /* updateDistances(int target)
       should be called immediately after target is added to the tree;
       updates d so that the values are correct (goes through target's
       neighbours making sure that the distances between them and the tree
       are indeed minimum) */
    void updateDistances(int target)
    {
        int i;
        for (i = 0; i < n; ++i)
            if ((weight[target][i] != 0) && (d[i] > weight[target][i])) {
                d[i] = weight[target][i];
                whoTo[i] = target;
            }
    }

    int main(int argc, char *argv[])
    {
        FILE *f = fopen("weight.txt", "r");
        if (f == NULL) {
            perror("weight.txt");
            return 1;
        }
        fscanf(f, "%d", &n);
        int i, j;
        for (i = 0; i < n; ++i)
            for (j = 0; j < n; ++j)
                fscanf(f, "%d", &weight[i][j]);
        fclose(f);

        /* Initialise d with infinity */
        for (i = 0; i < n; ++i)
            d[i] = 100000;

        /* Mark all nodes as NOT being in the minimum spanning tree */
        for (i = 0; i < n; ++i)
            inTree[i] = 0;

        /* Add the first node to the tree */
        printf("Adding node %c\n", 0 + 'A');
        inTree[0] = 1;
        updateDistances(0);

        int total = 0;
        int treeSize;
        for (treeSize = 1; treeSize < n; ++treeSize) {
            /* Find the node with the smallest distance to the tree */
            int min = -1;
            for (i = 0; i < n; ++i)
                if (!inTree[i])
                    if ((min == -1) || (d[min] > d[i]))
                        min = i;

            /* And add it */
            printf("Adding edge %c-%c\n", whoTo[min] + 'A', min + 'A');
            inTree[min] = 1;
            total += d[min];

            updateDistances(min);
        }

        printf("Total distance: %d\n", total);
        return 0;
    }

8. Explain the following graph problems:

A) Telecommunication problem :- The subscription configuration problem is taken (with slight modification) from [4]. Let $F$ denote a finite set of features. For $f_i, f_j \in F$, a precedence constraint $(f_i > f_j)$ indicates that $f_i$ is after $f_j$. An exclusion constraint $(f_i <> f_j)$ between $f_i$ and $f_j$ indicates that $f_i$ and $f_j$ cannot appear together in a sequence of features, and is equivalent to the pair $(f_i > f_j)$, $(f_j > f_i)$. A catalog is a pair $\langle F, P \rangle$ with $F$ a set of features and $P$ a set of precedence constraints on $F$. A feature subscription $S$ of a catalog $\langle F_c, P_c \rangle$ is a tuple $\langle F, C, U, W_F, W_U \rangle$ where $F \subseteq F_c$ is the set of features selected from $F_c$, $C$ is the projection of $P_c$ on $F$, $U$ is a set of user defined precedence constraints on $F$, and $W_F : F \to \mathbb{N}$ and $W_U : U \to \mathbb{N}$ are maps which assign weights to features and user precedence constraints. The value of $S$ is defined by $\mathrm{Value}(S) = \sum_{f \in F} W_F(f) + \sum_{p \in U} W_U(p)$. The weight associated with a feature or a precedence constraint signifies its importance for the user. A feature subscription $\langle F, C, U, W_F, W_U \rangle$ is consistent iff the directed graph $\langle F, C \cup U \rangle$ is acyclic. Checking for consistency is straightforward using topological sort as described in [4]. If a feature subscription is inconsistent then the task is to relax it and to generate a consistent one with maximum value. A relaxation of a feature subscription $S = \langle F, C, U, W_F, W_U \rangle$ is a consistent subscription $S' = \langle F', C', U', W_{F'}, W_{U'} \rangle$ such that $F' \subseteq F$, $C'$ is the projection of $C$ on $F'$, $U'$ is a subset of the projection of $U$ on $F'$, $W_{F'}$ is the restriction of $W_F$ to $F'$, and $W_{U'}$ is the restriction of $W_U$ to $U'$. We say that $S'$ is an optimal relaxation of $S$ if there does not exist another relaxation $S''$ of $S$ such that $\mathrm{Value}(S'') > \mathrm{Value}(S')$. In [4], the authors prove that finding an optimal relaxation of a feature subscription is NP-hard.

B) Knight Moves :- A knight is a chess piece that can move either two spaces horizontally and one space vertically or one space horizontally and two spaces vertically. That is, a knight on square (x, y) can move to any of the eight squares (x ± 2, y ± 1), (x ± 1, y ± 2), if these squares exist on the chessboard (which is normally 8 × 8). A knight's tour is a sequence of legal moves by a knight starting at some square and visiting each square exactly once. A knight's tour is called reentrant if there is a legal move that takes the knight from the last square of the tour back to where the tour began. We can model knight's tours using the graph that has a vertex for each square on the board, with an edge connecting two vertices if a knight can legally move between the squares represented by these vertices.

(a) Show that finding a reentrant knight's tour on an m × n chessboard is equivalent to finding a Hamilton circuit on the corresponding graph.
(b) Show that the graph representing the legal moves of a knight on an m × n chessboard, whenever m and n are positive integers, is bipartite.
(c) Deduce that there is no reentrant knight's tour on an m × n chessboard when m and n are both odd.


August 2010 Master of Computer Application (MCA) Semester 2 MC0068 Data Structures using C
Assignment Set 2

1. Describe the theory of circular singly linked lists.

Ans:- The linked lists that we have seen so far are often known as linear linked lists. The elements of such a linked list can be accessed by first setting up a pointer to the first node in the list and then traversing the entire list using this pointer. Although a linear linked list is a useful data structure, it has several shortcomings. For example, given a pointer p to a node in a linear list, we cannot reach any of the nodes that precede the node to which p is pointing. This disadvantage can be overcome by making a small change to the structure of a linear list such that the link field in the last node contains a pointer back to the first node rather than NULL. Such a list is called a circular linked list.
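A minimal sketch of such a list in C is given below. It assumes, as one common convention, that the list is identified by a pointer to its last node, so that the first node is always available as last->next; the names struct cnode, clist_append and clist_length are illustrative and not taken from the text.

```c
#include <stdlib.h>

struct cnode {
    int data;
    struct cnode *next;   /* in the last node this points back to the first */
};

/* Inserts a value at the end of the list. The list is identified by a
   pointer to its LAST node (NULL when empty): last is the tail and
   last->next is the head. Returns the new last node, or the old one if
   allocation fails. */
struct cnode *clist_append(struct cnode *last, int value)
{
    struct cnode *n = malloc(sizeof *n);
    if (n == NULL)
        return last;
    n->data = value;
    if (last == NULL) {
        n->next = n;            /* a single node points to itself */
    } else {
        n->next = last->next;   /* new node links back to the head */
        last->next = n;
    }
    return n;
}

/* Counts the nodes by walking once around the circle. */
int clist_length(const struct cnode *last)
{
    if (last == NULL)
        return 0;
    int count = 0;
    const struct cnode *p = last->next;
    do {
        count++;
        p = p->next;
    } while (p != last->next);
    return count;
}
```

Because no link is NULL, traversal has to stop when it returns to its starting node, which is why clist_length uses a do-while loop that compares against the head instead of testing for NULL.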

2. Describe following Binary Trees:

A) Strictly Binary trees :- A binary tree is a finite set of elements that is either empty or is partitioned into three disjoint subsets. The first subset contains a single element called the root of the tree. The other two subsets are themselves binary trees, called the left and right subtrees of the original tree. A left or right subtree can be empty. Each element of a binary tree is called a node of the tree. A binary tree is called strictly binary if every non-leaf node has non-empty left and right subtrees.

In the accompanying example figure, the tree consists of nine nodes with A as its root. Its left subtree is rooted at B and its right subtree is rooted at C. This is indicated by the two branches emanating from A to B on the left and to C on the right. The absence of a branch indicates an empty subtree. For example, the left subtree of the binary tree rooted at C and the right subtree of the binary tree rooted at E are both empty. The binary trees rooted at D, G, H and I have empty right and left subtrees. A further figure illustrates some structures that are not binary trees. Be sure that you understand why each of them is not a binary tree as just defined.

B) Complete Binary trees :- A binary tree is made of nodes, where each node contains a "left" pointer, a "right" pointer, and a data element. The "root" pointer points to the topmost node in the tree. The left and right pointers recursively point to smaller "subtrees" on either side. A null pointer represents a binary tree with no elements, the empty tree. The formal recursive definition is: a binary tree is either empty (represented by a null pointer), or is made of a single node, where the left and right pointers (recursive definition ahead) each point to a binary tree. In the terminology that goes with strictly binary and almost complete binary trees, a complete binary tree of depth d is a strictly binary tree all of whose leaves are at level d.

C) Almost Complete Binary Trees :- An almost complete binary tree is described by the following properties:

All leaves are at the lowest and next-to-lowest levels only.
Every level except the lowest is full.
There are no gaps except at the end of a level.
It is a perfectly balanced tree whose leaves at the last level are all in the leftmost positions.
It can be represented without any vacant space in the array representation.

3. With the help of a program and an example, explain Breadth First Tree Traversal Ans:- Certain programming problems are easier to solve using multiple data structures. For example, testing a sequence of characters to determine if it is a palindrome (i.e., reads the same forward and backward, like "radar") can be accomplished easily with one stack and one queue. The solution is to enter the sequence of characters into both data structures, then remove letters from each data structure one at a time and compare them, making sure that the letters match. In this palindrome example, the user (person writing the main program) has access to both data structures to solve the problem. Another way that 2 data structures can be used in concert is to use one data structure to help implement another.
tree.h
------
    typedef char treeElementT;

    typedef struct treeCDT *treeADT;

tree.c
------
    #include "tree.h"
    #include "queue.h"

    typedef struct treeNodeTag {
        treeElementT element;
        struct treeNodeTag *left, *right;
    } treeNodeT;

    typedef struct treeCDT {
        treeNodeT *root;
    } treeCDT;
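The traversal itself is not shown above, so here is a hedged sketch of how a queue helps implement breadth-first (level-order) traversal. It uses its own small node type and a fixed-size array as the queue in place of the tree.h and queue.h interfaces, whose functions are not reproduced here; struct tnode, MAX_NODES and breadth_first are assumptions.

```c
#include <stddef.h>

#define MAX_NODES 64   /* assumed upper bound on tree size for this sketch */

struct tnode {
    char element;
    struct tnode *left, *right;
};

/* Visits the tree level by level, left to right, writing each element
   into out (which must hold at least MAX_NODES + 1 chars).
   Nodes beyond the MAX_NODES limit are not visited.
   Returns the number of nodes visited. */
size_t breadth_first(const struct tnode *root, char *out)
{
    const struct tnode *queue[MAX_NODES];
    size_t head = 0, tail = 0, visited = 0;

    if (root != NULL)
        queue[tail++] = root;

    while (head < tail) {
        const struct tnode *n = queue[head++];   /* dequeue */
        out[visited++] = n->element;
        if (n->left != NULL && tail < MAX_NODES)
            queue[tail++] = n->left;             /* enqueue children */
        if (n->right != NULL && tail < MAX_NODES)
            queue[tail++] = n->right;
    }
    out[visited] = '\0';
    return visited;
}
```

For a tree with root A, children B and C, where B has children D and E and C has a right child F, the nodes are visited in the order A, B, C, D, E, F: each node's children wait in the queue until every node on the current level has been processed.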

4. Write a program to perform a binary search on an unsorted list of n integers.

Ans:- Binary search only works on sorted data, so the program first sorts the list (with bubble sort) and then searches it.

Program Example:

    #include <stdio.h>

    int main(void)
    {
        int a[20] = {0};
        int n, i, j, temp, target;
        int *beg, *end, *mid;
        int found = 0;

        printf(" enter the total integers you want to enter (make it less than 20):\n");
        if (scanf("%d", &n) != 1 || n < 1 || n >= 20)
            return 0;

        printf(" enter the integer array elements:\n");
        for (i = 0; i < n; i++) {
            scanf("%d", &a[i]);
        }

        /* bubble sort into ascending order */
        for (i = 0; i < n - 1; i++) {
            for (j = 0; j < n - i - 1; j++) {
                if (a[j + 1] < a[j]) {
                    temp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = temp;
                }
            }
        }

        printf(" the sorted numbers are:");
        for (i = 0; i < n; i++) {
            printf("%d ", a[i]);
        }

        printf("\n enter the number to be searched:");
        scanf("%d", &target);

        /* binary search over the half-open range [beg, end) */
        beg = a;
        end = a + n;
        while (beg < end) {
            mid = beg + (end - beg) / 2;
            if (*mid == target) {
                found = 1;
                break;
            }
            if (target < *mid)
                end = mid;
            else
                beg = mid + 1;
        }

        if (found) {
            printf("\n %d found!", target);
        } else {
            printf("\n %d not found!", target);
        }

        getchar();
        getchar();
        return 0;
    }

5. With the help of a numerical example, explain the working of Insertion Sort.

Ans:- When sorting an array with insertion sort, we conceptually separate it into two parts:

The list of elements already inserted, which is always in sorted order and is found at the beginning of the array.
The list of elements we have yet to insert, following.

In outline, our primary function looks like this:

    <<insertion_sort>>=
    void insertion_sort(int a[], int length)
    {
        int i;
        for (i = 0; i < length; i++) {
            insert a[i] into sorted sublist
        }
    }

To insert each element, we need to create a hole in the array at the place where the element belongs, then place the element in that hole. We can combine the creation of the hole with the searching for the place by starting at the end and shifting each element up by one until we find the place where the element belongs. This overwrites the element we're inserting, so we have to save it in a variable first:

    <<insert a[i] into sorted sublist>>=
    int j, v = a[i];
    for (j = i - 1; j >= 0; j--) {
        if (a[j] <= v) break;
        a[j + 1] = a[j];
    }
    a[j + 1] = v;
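Putting the two fragments together gives a complete function, sketched below, that can be used for the numerical example the question asks for. The loop starts at index 1 because a one-element prefix is already sorted; this is a small variation on the outline, not a change to the method.

```c
/* Sorts a[0..length-1] in ascending order using insertion sort. */
void insertion_sort(int a[], int length)
{
    for (int i = 1; i < length; i++) {
        int v = a[i];                  /* element being inserted */
        int j = i - 1;
        while (j >= 0 && a[j] > v) {   /* shift larger elements right */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = v;                  /* drop v into the hole */
    }
}
```

For the array 5 2 4 6 1 3 the sorted prefix grows as follows: after inserting 2 the array is 2 5 4 6 1 3; after 4 it is 2 4 5 6 1 3; inserting 6 changes nothing; after 1 it is 1 2 4 5 6 3; and after 3 it is 1 2 3 4 5 6.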

6. Describe Depth First search algorithm and analyze its complexity. Ans:- Formally, DFS is an uninformed search that progresses by expanding the first child node of the
search tree that appears and thus going deeper and deeper until a goal node is found, or until it hits a node that has no children. Then the search backtracks, returning to the most recent node it hasn't finished exploring. In a non-recursive implementation, all freshly expanded nodes are added to a stack for exploration. The time and space analysis of DFS differs according to its application area. In theoretical computer science, DFS is typically used to traverse an entire graph, and takes time O(|V| + |E|), linear in the size of the graph. In these applications it also uses space O(|V|) in the worst case to store the stack of vertices on the current search path as well as the set of already-visited vertices. Thus, in this setting, the time and space bounds are the same as for breadth first search and the choice of which of these two algorithms to use depends less on their complexity and more on the different properties of the vertex orderings the two algorithms produce.
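A recursive depth-first search that remembers visited vertices can be sketched in C as follows. The seven-vertex graph is an assumption: its edges (A-B, A-C, A-E, B-D, B-F, C-G, E-F) are chosen here for illustration because no graph is drawn in this answer. Note that the adjacency matrix makes this version run in O(|V|^2) time; the O(|V| + |E|) bound quoted above requires adjacency lists.

```c
#include <stdbool.h>

#define NV 7   /* vertices A..G, indexed 0..6 */

/* Assumed undirected example graph: A-B, A-C, A-E, B-D, B-F, C-G, E-F. */
static const bool adj[NV][NV] = {
    /*        A  B  C  D  E  F  G */
    /* A */ { 0, 1, 1, 0, 1, 0, 0 },
    /* B */ { 1, 0, 0, 1, 0, 1, 0 },
    /* C */ { 1, 0, 0, 0, 0, 0, 1 },
    /* D */ { 0, 1, 0, 0, 0, 0, 0 },
    /* E */ { 1, 0, 0, 0, 0, 1, 0 },
    /* F */ { 0, 1, 0, 0, 1, 0, 0 },
    /* G */ { 0, 0, 1, 0, 0, 0, 0 },
};

/* Visits v, then recursively every unvisited neighbour in index order. */
static void dfs_visit(int v, bool visited[NV], char *order, int *count)
{
    visited[v] = true;
    order[(*count)++] = (char)('A' + v);
    for (int w = 0; w < NV; w++)
        if (adj[v][w] && !visited[w])
            dfs_visit(w, visited, order, count);
}

/* Writes the visiting order as a string into order (at least NV + 1 chars). */
void dfs(int start, char *order)
{
    bool visited[NV] = { false };
    int count = 0;
    dfs_visit(start, visited, order, &count);
    order[count] = '\0';
}
```

The visited array is what prevents the search from cycling: without it, the walk would follow A, B, D, F, E and then return to A again and again.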

For example, a depth-first search starting at A, assuming that the left edges in the example graph are chosen before right edges, and assuming the search remembers previously-visited nodes and will not repeat them (since this is a small graph), will visit the nodes in the following order: A, B, D, F, E, C, G. Performing the same search without remembering previously visited nodes results in visiting nodes in the order A, B, D, F, E, A, B, D, F, E, etc. forever, caught in the A, B, D, F, E cycle and never reaching C or G.

7. Explain the following theorems of Splay trees:

A) Working Set Theorem :- A splay tree is a self-adjusting binary search tree with the additional property that recently accessed elements are quick to access again. It performs basic operations such as insertion, look-up and removal in O(log n) amortized time. For many sequences of operations, splay trees perform better than other search trees, even when the specific pattern of the sequence is unknown.

All normal operations on a binary search tree are combined with one basic operation, called splaying. Splaying the tree for a certain element rearranges the tree so that the element is placed at the root of the tree. One way to do this is to first perform a standard binary tree search for the element in question, and then use tree rotations in a specific fashion to bring the element to the top. Alternatively, a top-down algorithm can combine the search and the tree reorganization into a single phase. When a node x is accessed, a splay operation is performed on x to move it to the root. To perform a splay operation we carry out a sequence of splay steps, each of which moves x closer to the root. By performing a splay operation on the node of interest after every access, the recently accessed nodes are kept near the root and the tree remains roughly balanced, so that we achieve the desired amortized time bounds.

B) Sequential Access Theorem :- Each particular splay step depends on three factors: whether x is the left or right child of its parent node, p; whether p is the root or not; and, if not, whether p is the left or right child of its parent, g (the grandparent of x).

The three types of splay steps are:

Zig Step: This step is done when p is the root. The tree is rotated on the edge between x and p. Zig steps exist to deal with the parity issue and will be done only as the last step in a splay operation and only when x has odd depth at the beginning of the operation.

Zig-zig Step: This step is done when p is not the root and x and p are either both right children or are both left children. The tree is rotated on the edge joining p with its parent g, then rotated on the edge joining x with p. Note that zig-zig steps are the only thing that differentiates splay trees from the rotate-to-root method introduced by Allen and Munro prior to the introduction of splay trees.

Zig-zag Step: This step is done when p is not the root and x is a right child and p is a left child or vice versa. The tree is rotated on the edge between x and p, then rotated on the edge between x and its new parent g.

C) Dynamic Finger Theorem :- A simple amortized analysis of static splay trees can be carried out using the potential method. Suppose that size(r) is the number of nodes in the subtree rooted at r (including r) and rank(r) = log2(size(r)). Then the potential function P(t) for a splay tree t is the sum of the ranks of all the nodes in the tree. This will tend to be high for poorly-balanced trees, and low for well-balanced trees. We can bound the amortized cost of any zig-zig or zig-zag operation by:

$$\text{amortized cost} = \text{cost} + P(t_f) - P(t_i) \le 3\,(\mathrm{rank}_f(x) - \mathrm{rank}_i(x)),$$

where x is the node being moved towards the root, and the subscripts "f" and "i" indicate after and before the operation, respectively. When summed over the entire splay operation, this telescopes to 3(rank(root)), which is O(log n). Since there's at most one zig operation, this only adds a constant.

Dynamic Finger Theorem: Let S be a sequence of m accesses on a splay tree with n nodes, and let $i_j$ be the element accessed in the j-th access. The cost of performing S is

$$O\left(m + n \log n + \sum_{j=1}^{m-1} \log\left(|i_{j+1} - i_j| + 1\right)\right).$$

8. Explain the following in the context of files:

A) Sequential Files :- A sequential file is processed in order from the beginning, one record after another. True random access file handling, however, only accesses the file at the point at which the data should be read or written, rather than having to process it sequentially. A hybrid approach is also possible whereby a part of the file is used for sequential access to locate something in the random access portion of the file, in much the same way that a File Allocation Table (FAT) works. The three main functions used for this are:

rewind() returns the file pointer to the beginning;
fseek() positions the file pointer;
ftell() returns the current offset of the file pointer.

Each of these functions operates on the file position associated with a C stream (called the file pointer here), which is just the offset from the start of the file, and can be positioned at will. All read/write operations take place at the current position of the file pointer.

The rewind() Function

The rewind() function can be used in sequential or random access C file programming, and simply tells the file system to position the file pointer at the start of the file. Any error flags will also be cleared, and no value is returned. While useful, the companion function, fseek(), can also be used to reposition the file pointer at will, including the same behavior as rewind().

Using fseek() and ftell() to Process Files

The fseek() function is most useful in random access files where either the record (or block) size is known, or there is an allocation system that denotes the start and end positions of records in an index portion of the file. The fseek() function takes three parameters:

FILE *f, the file pointer;
long offset, the position offset;
int origin, the point from which the offset is applied.
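The following sketch shows how these three parameters are typically used with fixed-size records. The struct record layout and the helper names file_size and read_record are hypothetical; the file is assumed to be opened in binary mode so that offsets are plain byte counts.

```c
#include <stdio.h>

struct record {          /* hypothetical fixed-size record */
    int  id;
    char name[12];
};

/* Returns the size of the file in bytes, or -1 on error.
   The previous position is restored before returning. */
long file_size(FILE *f)
{
    long saved = ftell(f);
    if (saved < 0 || fseek(f, 0L, SEEK_END) != 0)
        return -1;
    long size = ftell(f);
    if (fseek(f, saved, SEEK_SET) != 0)
        return -1;
    return size;
}

/* Reads record number index (0-based) directly, without reading the
   records before it. Returns 1 on success, 0 on failure. */
int read_record(FILE *f, long index, struct record *out)
{
    if (fseek(f, index * (long)sizeof *out, SEEK_SET) != 0)
        return 0;
    return fread(out, sizeof *out, 1, f) == 1;
}
```

Here origin is SEEK_SET when jumping to an absolute record position and SEEK_END when measuring the file; SEEK_CUR would apply the offset relative to the current position. Calling rewind(f) afterwards has the same positioning effect as fseek(f, 0L, SEEK_SET).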

B) Inverted Files :- The inverted index data structure is a central component of a typical search
engine indexing algorithm. A goal of a search engine implementation is to optimize the speed of the query: find the documents where word X occurs. Once a forward index is developed, which stores lists of words per document, it is next inverted to develop an inverted index. Querying the forward index would require sequential iteration through each document and to each word to verify a matching document. The time, memory, and processing resources to perform such a query are not always technically realistic. Instead of listing the words per document in the forward index, the inverted index data structure is developed which lists the documents per word. With the inverted index created, the query can now be resolved by jumping to the word id (via random access) in the inverted index. In pre-computer times, concordances to important books were manually assembled. These were effectively inverted indexes with a small amount of accompanying commentary, that required a tremendous amount of effort to produce. There are two main variants of inverted indexes: A record level inverted index (or inverted file index or just inverted file) contains a list of references to documents for each word. A word level inverted index (or full inverted index or inverted list) additionally contains the positions of each word within a document. The latter form offers more functionality (like phrase searches), but needs more time and space to be created.
