
Master of Computer Application (MCA) Semester 2

MC0066 OOPS using C++ 4 Credits (Book ID: B0681 & B0715)

Assignment Set 1
MC0066 OOPS using C++ (Book ID: B0681 & B0715)

1. Write a program in C++ for matrix multiplication. The program should accept the dimensions of both the matrices to be multiplied and check for compatibility with appropriate messages and give the output.

#include <stdio.h>

int main(void)
{
    int m1[10][10], m2[10][10], add[10][10], mult[10][10];
    int r1, c1, r2, c2, i, j, k;

    printf("Enter number of rows and columns of first matrix (max 10)\n");
    scanf("%d%d", &r1, &c1);
    printf("Enter number of rows and columns of second matrix (max 10)\n");
    scanf("%d%d", &r2, &c2);
    if (r2 == c1) {
        printf("Enter elements of the first matrix, row wise\n");
        for (i = 0; i < r1; i++)
            for (j = 0; j < c1; j++)
                scanf("%d", &m1[i][j]);
        printf("You have entered the first matrix as follows:\n");
        for (i = 0; i < r1; i++) {
            for (j = 0; j < c1; j++)
                printf("%d\t", m1[i][j]);
            printf("\n");
        }
        printf("Enter elements of the second matrix, again row wise\n");
        for (i = 0; i < r2; i++)
            for (j = 0; j < c2; j++)
                scanf("%d", &m2[i][j]);
        printf("You have entered the second matrix as follows:\n");
        for (i = 0; i < r2; i++) {
            for (j = 0; j < c2; j++)
                printf("%d\t", m2[i][j]);
            printf("\n");
        }
        if (r1 == r2 && c1 == c2) {
            printf("The result of the addition is as follows:\n");
            for (i = 0; i < r1; i++) {
                for (j = 0; j < c1; j++) {
                    add[i][j] = m1[i][j] + m2[i][j];
                    printf("%d\t", add[i][j]);
                }
                printf("\n");
            }
        } else {
            printf("Addition cannot be done as rows or columns are not equal\n");
        }
        printf("The result of the multiplication is as follows:\n");
        for (i = 0; i < r1; i++) {
            for (j = 0; j < c2; j++) {
                mult[i][j] = 0;
                /* e.g. mult[0][0] = m1[0][0]*m2[0][0] + m1[0][1]*m2[1][0] + ... */
                for (k = 0; k < c1; k++)
                    mult[i][j] += m1[i][k] * m2[k][j];
                printf("%d\t", mult[i][j]);
            }
            printf("\n");
        }
    } else {
        printf("Matrix multiplication cannot be done: columns of the first matrix must equal rows of the second\n");
    }
    return 0;
}

2. Write a program to check whether a string is a palindrome or not. Please note that a palindrome is one which remains the same if you reverse the characters in the string. For example, MADAM.

// palindrome.cpp
#include <iostream>
#include <cstring>
using namespace std;

int main()
{
    char a[100];
    int i, j, len;
    int flag = 0;
    cout << "Please enter the text to be checked: ";
    cin >> a;
    len = strlen(a);
    for (i = 0, j = len - 1; i < len / 2; i++, j--) {
        if (a[i] != a[j]) {
            flag = 1;
            break;
        }
    }
    if (flag == 0)
        cout << "The text you have entered is a palindrome\n";
    else
        cout << "The text is not a palindrome\n";
    return 0;
}

3. What is structure in C++? Define a structure named product with elements productcode, description, unitprice and qtyinhand. Write a C++ program using the product structure.

A structure is a collection of variables under a single name. The variables can be of any type: int, float, char and so on. The main difference between a structure and an array is that an array is a collection of elements of the same data type, whereas a structure is a collection of variables of possibly different types under a single name. Suppose three variables, custnum of type int, salary of type int and commission of type float, are members of a structure named Customer. This structure is declared as follows:

struct Customer
{
    int custnum;
    int salary;
    float commission;
};

In the Customer example above, variables of different types such as int and float are grouped under a single structure name. Note that, unlike an array definition, declaring a structure does not mean that memory is allocated; the declaration only gives a skeleton or template for the structure. After declaring the structure, the next step is to define a structure variable, as in the following program:

#include <iostream>
#include <string>
#include <sstream>
using namespace std;

struct movies_t {
    string title;
    int year;
};

int main()
{
    string mystr;
    movies_t amovie;
    movies_t *pmovie;
    pmovie = &amovie;

    cout << "Enter title: ";
    getline(cin, pmovie->title);
    cout << "Enter year: ";
    getline(cin, mystr);
    stringstream(mystr) >> pmovie->year;

    cout << "\nYou have entered:\n";
    cout << pmovie->title;
    cout << " (" << pmovie->year << ")\n";
    return 0;
}

4. What is the purpose of exception handling? How do you infer from the phrase "Throwing an exception"?

Of course, the thrown exception must end up someplace. This place is the exception handler, and there's one for every exception type you want to catch. Exception handlers immediately follow the try block and are denoted by the keyword catch:

try {
    // code that may generate exceptions
} catch(type1 id1) {
    // handle exceptions of type1
} catch(type2 id2) {
    // handle exceptions of type2
} // etc.

Each catch clause (exception handler) is like a little function that takes a single argument of one particular type. The identifier (id1, id2, and so on) may be used inside the handler, just like a function argument, although sometimes there is no identifier because it's not needed in the handler: the exception type gives you enough information to deal with it. The handlers must appear directly after the try block. If an exception is thrown, the exception-handling mechanism goes hunting for the first handler with an argument that matches the type of the exception. It then enters that catch clause, and the exception is considered handled. (The search for handlers stops once a matching catch clause is found.) Only the matching catch clause executes; it's not like a switch statement, where you need a break after each case to prevent the remaining ones from executing. Notice that, within the try block, a number of different function calls might generate the same exception, but you only need one handler.

Assignment Set 2
MC0066 OOPS using C++ (Book ID: B0681 & B0715)

1. Write a program which accepts a number from the user and generates prime numbers till that number.

#include <stdio.h>

int main(void)
{
    int n, i, j, is_prime;
    printf("Enter the number\n");
    scanf("%d", &n);
    printf("Prime numbers up to %d:\n", n);
    for (i = 2; i <= n; i++) {
        is_prime = 1;
        for (j = 2; j * j <= i; j++) {
            if (i % j == 0) {
                is_prime = 0;
                break;
            }
        }
        if (is_prime)
            printf("%d\n", i);
    }
    return 0;
}
2. Implement a class stack which simulates the operations of the stack allowing LIFO operations. Also implement push and pop operations for the stack.

A FIFO is an opaque structure that allows items to be placed or pushed onto it at one end (head) and removed or pulled from it, one at a time and in sequence, from the other end (tail). A LIFO is a similarly opaque structure that allows items to be pushed onto it at one end (head or top) and removed or popped from that same end (head). This means that items of a LIFO are removed one at a time but in the reverse order of when they were entered. Let us further suppose that our only FIFO primitives are an append operator push and a pull operator shift: push adds elements to the head of the FIFO, and shift grabs (and removes) items from the tail of the structure. We want to build a stack (LIFO) using the pipe (FIFO) structure and its two primitive operators. Essentially, we wish to combine push (onto head) and shift (from tail) pipe operators in some fashion to simulate push (onto head) and pop (from head) stack operators:

class Stack
  def initialize *s          # splat operator allows a variable-length argument list
    @q1 = []
    @q2 = []
    s.each { |e| push e }
  end

  def push v
    # pure FIFO allows only 'push' (onto head) and 'shift' (from tail);
    # LIFO requires 'push' (onto head) and 'pop' (from head)
    empty_queue, full_queue = @q1.length == 0 ? [@q1, @q2] : [@q2, @q1]
    empty_queue.push v
    # shift each item from the non-empty queue, pushing it onto the
    # initially empty queue: produces a new full queue in reverse entry order
    until (v = full_queue.shift).nil?
      empty_queue.push v
    end
  end

  def pop
    # the full queue has been maintained in reverse order by the push
    # method, so a simple shift is all that is needed to simulate a
    # stack pop operation
    full_queue = @q1.length > 0 ? @q1 : @q2
    full_queue.shift
  end

  def empty?
    @q1.length == 0 && @q2.length == 0
  end
end

# test:
puts "STACK: tail/bottom -> x, y, z <- head/top"
stack = Stack.new
stack.push 'x'
stack.push 'y'
stack.push 'z'

# or equivalently, stack = Stack.new('x', 'y', 'z')
until stack.empty?
  puts "popping stack, item: #{stack.pop}"
end
# prints items in reverse order: 'z', 'y', 'x'
3. What are allocators? Describe the sequence container adapters.

The vector, list and deque sequences cannot be built from each other without loss of efficiency. On the other hand, stacks and queues can be elegantly and efficiently implemented using those three basic sequences. Therefore, stack and queue are defined not as separate containers, but as adaptors of basic containers. A container adapter provides a restricted interface to a container. In particular, adapters do not provide iterators; they are intended to be used only through their specialized interfaces. The techniques used to create a container adapter from a container are generally useful for nonintrusively adapting the interface of a class to the needs of its users.

Any class that fulfills the allocator requirements can be used as an allocator. In particular, a class A capable of allocating memory for an object of type T must provide the types A::pointer, A::const_pointer, A::reference, A::const_reference, and A::value_type for generically declaring objects and references (or pointers) to objects of type T. It should also provide the type A::size_type, an unsigned type which can represent the largest size for an object in the allocation model defined by A, and similarly a signed integral type A::difference_type that can represent the difference between any two pointers in the allocation model.
4. Write about the following with the help of suitable programming examples:

A) Throwing an Exception

B) Catching an Exception

A) Exceptions are run-time anomalies that a program may detect, such as division by 0, access to an array outside of its bounds, or the exhaustion of free store memory. Such exceptions exist outside the normal functioning of the program and require immediate handling by the program. The C++ language provides built-in language features to raise and handle exceptions. These language features activate a run-time mechanism used to communicate exceptions between two unrelated (often separately developed) portions of a C++ program.

When an exception is encountered in a C++ program, the portion of the program that detects the exception can communicate that the exception has occurred by raising, or throwing, an exception. To see how exceptions are thrown in C++, let's reimplement the class iStack presented in Section 4.15 to use exceptions to indicate anomalies in the handling of the stack. The definition of the class iStack looks like this:

#include <vector>
using namespace std;

class iStack {
public:
    iStack(int capacity) : _stack(capacity), _top(0) { }
    bool pop(int &top_value);
    bool push(int value);
    bool full();
    bool empty();
    void display();
    int size();
private:
    int _top;
    vector<int> _stack;
};

MC0067 Database Management System (DBMS and Oracle 9i) 4 Credits (Book ID: B0716 & B0717)

Assignment Set 1
MC0067 - DBMS And Oracle 9i (Book ID: B0716 & B0717)

1. Write about:

Linear Search
Collision Chain

Linear search: Linear search, also known as sequential search, means starting at the beginning of the data and checking each item in turn until either the desired item is found or the end of the data is reached. It is a search algorithm suitable for searching a list of data for a particular value: it operates by checking every element of the list, one at a time and in sequence, until a match is found. The linear search is not very efficient. If the item of data to be found is at the end of the list, then all previous items must be read and checked before the item that matches the search criteria is found. The implementation is a straightforward loop comparing every element in the array with the key: as soon as an equal value is found, it returns; if the loop finishes without finding a match, the search failed and -1 is returned. For small arrays, linear search is a good solution because it is so straightforward, but in an array of a million elements linear search will take 500,000 comparisons on average to find the key. For a much faster search, take a look at binary search.

Algorithm:

for each item in the database
    if the item matches the wanted info
        exit with this item
continue loop
wanted item is not in database

Collision Chain: In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number). Thus, a hash table implements an associative array. The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought. Ideally, the hash function should map each possible key to a unique slot index, but this ideal is rarely achievable in practice (unless the hash keys are fixed, i.e. new entries are never added to the table after it is created). Instead, most hash table designs assume that hash collisions (different keys that map to the same hash value) will occur and must be accommodated in some way.
2. Write about:

Integrity Rules
Relational Operators with examples for each
Linear Search
Collision Chain
* Integrity Rules:

These are the rules which a relational database follows in order to stay accurate and accessible. They govern which operations can be performed on the data and on the structure of the database. There are three integrity rules defined for a relational database, which are:

Distinct Rows in a Table - this rule says that all the rows of a table should be distinct, to avoid ambiguity while accessing the rows of that table. Most modern database management systems can be configured to avoid duplicate rows.

Entity Integrity (a Primary Key, or part of it, cannot be null) - this rule says that 'null' is a special value in a relational database; it does not mean blank or zero. It means the unavailability of data, and hence a 'null' primary key would not be a complete identifier.

Referential Integrity - this rule says that if a foreign key is defined on a table, then a value matching that foreign key value must exist as the primary key of a row in some other table.

The following are the integrity rules to be satisfied by any relation:

No component of the primary key can be null.

The database must not contain any unmatched foreign key values. This is called the referential integrity rule. Unlike the case of primary keys, there is no integrity rule saying that no component of the foreign key can be null. This can be logically explained with the help of the following example: consider the relations Employee and Account as given below.

Employee
Emp#   EmpName   EmpCity   EmpAcc#
X101   Shekhar   Bombay    120001
X102   Raj       Pune      120002
X103   Sharma    Nagpur    Null
X104   Vani      Bhopal    120003

Account
Acc#     OpenDate      BalAmt
120001   30-Aug-1998   5000
120002   29-Oct-1998   1200
120003   01-Jan-1999   3000
120004   04-Mar-1999   500

EmpAcc# in the Employee relation is a foreign key creating a reference from Employee to Account. Here, a Null value in the EmpAcc# attribute is logically possible if an employee does not have a bank account. If the business rules allow an employee to exist in the system without opening an account, a Null value can be allowed for EmpAcc# in the Employee relation. In the case example given, Cust# in Ord_Aug cannot accept Null if the business rule insists that the Customer No. needs to be stored for every order placed.

Relational Operators:

In the relational model, the database objects seen so far have specific names:

Name                Meaning
Relation            Table
Tuple               Record (Row)
Attribute           Field (Column)
Cardinality         Number of Records (Rows)
Degree (or Arity)   Number of Fields (Columns)
View                Query/Answer table

On these objects, a set of operators (relational operators) is provided to manipulate them:

1. Restrict
2. Project
3. Union
4. Difference
5. Product
6. Intersection
7. Join
8. Divide

Restrict: Restrict simply extracts records from a table. It is also known as Select, but this is not the same SELECT as defined in SQL.

Project: Project selects zero or more fields from a table and generates a new table that contains all of the records and only the selected fields (with no duplications).

Union: Union creates a new table by adding the records of one table to another. The tables must be compatible: they must have the same number of fields, and each of the field pairs has to have values in the same domain.

Difference: The difference of two tables is a third table which contains the records which appear in the first but not in the second.

Product: The product of two tables is a third table which contains all of the records in the first one added to each of the records in the second.

Intersection: The intersection of two tables is a third table which contains the records which are common to both.

Join: The join of two tables is a third table which contains all of the records in the first and the second which are related.

Divide: Dividing a table by another table gives all the records in the first which have values in their fields matching ALL the records in the second.

The eight relational algebra operators are:

1. SELECT To retrieve specific tuples/rows from a relation.

Ord#   OrdDate    Cust#
101    02-08-94   002
104    18-09-94   002

2. PROJECT To retrieve specific attributes/columns from a relation.

Description    Price
Power Supply   4000
101-Keyboard   2000
Mouse          800
MS-DOS 6.0     5000
MS-Word 6.0    8000

3. PRODUCT To obtain all possible combinations of tuples from two relations.

Ord#   OrdDate    O.Cust#   C.Cust#   CustName     City
101    02-08-94   002       001       Shah         Bombay
101    02-08-94   002       002       Srinivasan   Madras
101    02-08-94   002       003       Gupta        Delhi
101    02-08-94   002       004       Banerjee     Calcutta
101    02-08-94   002       005       Apte         Bombay
102    11-08-94   003       001       Shah         Bombay
102    11-08-94   003       002       Srinivasan   Madras

4. UNION To retrieve tuples appearing in either or both the relations participating in the UNION.

Eg: Consider the relation Ord_Jul, holding orders placed in July, and Ord_Aug, holding orders placed in August. Ord_Jul UNION Ord_Aug gives:

Ord#   OrdDate    Cust#
101    03-07-94   001
102    27-07-94   003
101    02-08-94   002
102    11-08-94   003
103    21-08-94   003
104    28-08-94   002
105    30-08-94   005

Note: The union operation shown above logically implies retrieval of records of orders placed in July or in August.

5. INTERSECT To retrieve tuples appearing in both the relations participating in the INTERSECT.

Eg: To retrieve Cust# of customers who've placed orders both in July and in August:

Cust#
003

6. DIFFERENCE To retrieve tuples appearing in the first relation participating in the DIFFERENCE but not in the second.

Eg: To retrieve Cust# of customers who've placed orders in July but not in August:

Cust#
001

7. JOIN To retrieve combinations of tuples in two relations based on a common field in both the relations.

Eg: ORD_AUG join CUSTOMERS (here, the common column is Cust#):

Ord#   OrdDate    Cust#   CustName     City
101    02-08-94   002     Srinivasan   Madras
102    11-08-94   003     Gupta        Delhi
103    21-08-94   003     Gupta        Delhi
104    28-08-94   002     Srinivasan   Madras
105    30-08-94   005     Apte         Bombay

Note: The above join operation logically implies retrieval of details of all orders and the details of the corresponding customers who placed the orders. Such a join operation, where only those rows having corresponding rows in both the relations are retrieved, is called the natural join or inner join. This is the most common join operation. Consider the example of the EMPLOYEE and ACCOUNT relations:

EMPLOYEE
Emp#   EmpName   EmpCity   Acc#
X101   Shekhar   Bombay    120001
X102   Raj       Pune      120002
X103   Sharma    Nagpur    Null
X104   Vani      Bhopal    120003

ACCOUNT
Acc#     OpenDate        BalAmt
120001   30. Aug. 1998   5000
120002   29. Oct. 1998   1200
120003   1. Jan. 1999    3000
120004   4. Mar. 1999    500

A join can be formed between the two relations based on the common column Acc#. The result of the (inner) join is:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X104   Vani      Bhopal    120003   1. Jan. 1999    3000

Note that, from each table, only those records which have corresponding records in the other table appear in the result set. This means that the result of the inner join shows the details of those employees who hold an account, along with the account details. The other type of join is the outer join, which has three variations: the left outer join, the right outer join and the full outer join. These three joins are explained as follows: The left outer join retrieves all rows from the left-side (of the join operator) table. If there are corresponding or related rows in the right-side table, the correspondence will be shown. Otherwise, columns of the right-side table will take null values.

EMPLOYEE left outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X103   Sharma    Nagpur    NULL     NULL            NULL
X104   Vani      Bhopal    120003   1. Jan. 1999    3000

The right outer join retrieves all rows from the right-side (of the join operator) table. If there are corresponding or related rows in the left-side table, the correspondence will be shown. Otherwise, columns of the left-side table will take null values.

EMPLOYEE right outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X104   Vani      Bhopal    120003   1. Jan. 1999    3000
NULL   NULL      NULL      120004   4. Mar. 1999    500

(Assume that Acc# 120004 belongs to someone who is not an employee and hence the details of the Account holder are not available here) The full outer join retrieves all rows from both the tables. If there is a correspondence or relation between rows from the tables of either side, the correspondence will be shown. Otherwise, related columns will take null values.

EMPLOYEE full outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X103   Sharma    Nagpur    NULL     NULL            NULL
X104   Vani      Bhopal    120003   1. Jan. 1999    3000
NULL   NULL      NULL      120004   4. Mar. 1999    500

8. DIVIDE Consider three relations R1, R2 and R3. R1 divided by R2 per R3 gives:

a

Thus the result contains those values from R1 whose corresponding R2 values in R3 include all R2 values.

* Linear Search: Linear search (sequential search) is explained under question 1 above.

* Collision Chain: Hash tables and collision chains are explained under question 1 above.

3. Discuss the correspondences between the ER model constructs and the relational model constructs. Show how each ER model construct can be mapped to the relational model, and discuss any alternative mappings.

Relational Data Model: The model uses the concept of a mathematical relation, which looks somewhat like a table of values, as its basic building block, and has its theoretical basis in set theory and first-order predicate logic. The relational model represents the database as a collection of relations. Each relation resembles a table of values or, to some extent, a flat file of records. When a relation is thought of as a table of values, each row in the table represents a collection of related data values. In the relational model, each row in the table represents a fact that typically corresponds to a real-world entity or relationship. The table name and column names are used to help in interpreting the meaning of the values in each row. In the formal relational model terminology, a row is called a tuple, a column header is called an attribute, and the table is called a relation. The data type describing the types of values that can appear in each column is represented by a domain of possible values.

ER Model: An entity-relationship model (ERM) is an abstract and conceptual representation of data. Entity-relationship modeling is a database modeling method, used to produce a type of conceptual schema or semantic data model of a system, often a relational database, and its requirements in a top-down fashion. Diagrams created by this process are called entity-relationship diagrams, ER diagrams, or ERDs.

The first stage of information system design uses these models during requirements analysis to describe information needs or the type of information that is to be stored in a database. In the case of the design of an information system that is based on a database, the conceptual data model is, at a later stage (usually called logical design), mapped to a logical data model, such as the relational model; this in turn is mapped to a physical model during physical design. Sometimes, both of these later phases are referred to as "physical design". We create a relational schema from an entity-relationship (ER) schema. Key elements of the ER model are entities, attributes, identifiers and relationships.

Correspondence between ER and Relational Models:

ER Model                         Relational Model
Entity type                      Entity relation
1:1 or 1:N relationship type     Foreign key
M:N relationship type            Relationship relation and two foreign keys
n-ary relationship type          Relationship relation and n foreign keys
Simple attributes                Attributes
Composite attributes             Set of simple component attributes
Multivalued attributes           Relation and foreign key
Value set                        Domain
Key attribute                    Primary key or secondary key

The COMPANY ER schema is below:

Result of mapping the COMPANY ER schema into a relational database schema:

EMPLOYEE
FNAME  INITIAL  LNAME  ENO  DOB  ADDRESS  SEX  SALARY  SUPERENO  DNO

DEPARTMENT
DNAME  DNUMBER  MGRENO  MGRSTARTDATE

DEPT_LOCATIONS
DNUMBER  DLOCATION

PROJECT
PNAME  PNUMBER  PLOCATION  DNUM

WORKS_ON
EENO  PNO  HOURS

DEPENDENT
EENO  DEPENDENT_NAME  SEX  DOB  RELATIONSHIP

Mapping of regular entity types: For each regular entity type E in the ER schema, create a relation R that includes all the simple attributes of E. Include only the simple component attributes of a composite attribute. Choose one of the key attributes of E as the primary key for R. If the chosen key of E is composite, the set of simple attributes that form it will together form the primary key of R. If multiple keys were identified for E during the conceptual design, the information describing the attributes that form each additional key is kept in order to specify secondary (unique) keys of relation R. Knowledge about keys is also kept for indexing purposes and other types of analyses. We create the relations EMPLOYEE, DEPARTMENT, and PROJECT to correspond to the regular entity types EMPLOYEE, DEPARTMENT, and PROJECT. The foreign key and relationship attributes, if any, are not included yet; they will be added during subsequent steps. These include the attributes SUPERENO and DNO of EMPLOYEE, MGRENO and MGRSTARTDATE of DEPARTMENT, and DNUM of PROJECT. We choose ENO, DNUMBER, and PNUMBER as primary keys for the relations EMPLOYEE, DEPARTMENT, and PROJECT, respectively. Knowledge that DNAME of DEPARTMENT and PNAME of PROJECT are secondary keys is kept for possible use later in the design. The relations that are created from the mapping of entity types are sometimes called entity relations because each tuple represents an entity instance.

4. Define the following terms: disk, disk pack, track, block, cylinder, sector, interblock gap, read/write head.

Disk: Disks are used for storing large amounts of data. The most basic unit of data on the disk is a single bit of information. By magnetizing an area on the disk in certain ways, one can make it represent a bit value of either 0 or 1. To code information, bits are grouped into bytes. Byte sizes are typically 4 to 8 bits, depending on the computer and the device. We assume that one character is stored in a single byte, and we use the terms byte and character interchangeably. The capacity of a disk is the number of bytes it can store, which is usually very large. Small floppy disks used with microcomputers typically hold from 400 Kbytes to 1.5 Mbytes; hard disks for micros typically hold from several hundred Mbytes up to a few Gbytes. Whatever their capacity, disks are all made of magnetic material shaped as a thin circular disk and protected by a plastic or acrylic cover. A disk is single-sided if it stores information on only one of its surfaces, and double-sided if both surfaces are used.

Disk Packs: To increase storage capacity, disks are assembled into a disk pack, which may include many disks and hence many surfaces. A disk pack is a layered grouping of hard disk platters (circular, rigid discs coated with a magnetic data storage surface). The disk pack is the core component of a hard disk drive. In modern hard disks, the disk pack is permanently sealed inside the drive. In many early hard disks, the disk pack was a removable unit, and would be supplied with a protective canister featuring a lifting handle.

Track and Cylinder: The (circular) area on a disk platter which can be accessed without moving the access arm of the drive is called a track. Information is stored on a disk surface in concentric circles of small width, each having a distinct diameter; each circle is called a track. For disk packs, the tracks with the same diameter on the various surfaces are together called a cylinder, because of the shape they would form if connected in space. Equivalently, the set of tracks of a disk drive which can be accessed without changing the position of the access arm is called a cylinder.

Sector: A sector is a fixed-size physical data block on a disk drive. A track usually contains a large amount of information; it is divided into smaller blocks or sectors. The division of a track into sectors is hard-coded on the disk surface and cannot be changed. One type of sector organization calls a portion of a track that subtends a fixed angle at the center a sector. Several other

sector organizations are possible, one of which is to have the sectors subtend smaller angles at the center as one moves away from it, thus maintaining a uniform density of recording.

Block and Interblock Gaps: A block is a physical data record, separated on the medium from other blocks by interblock gaps. The division of a track into equal-sized disk blocks is set by the operating system during disk formatting. Block size is fixed during initialization and cannot be changed dynamically. Typical disk block sizes range from 512 to 4096 bytes. A disk with hard-coded sectors often has the sectors subdivided into blocks during initialization. An area between data blocks which contains no data and which separates the blocks is called an interblock gap. Blocks are separated by fixed-size interblock gaps, which include specially coded control information written during disk initialization. This information is used to determine which block on the track follows each interblock gap.

Read/write Head: A tape drive is required to read data from or write data to a tape reel. Usually, each group of bits that forms a byte is stored across the tape, and the bytes themselves are stored consecutively on the tape. A read/write head is used to read or write data on tape. Data records on tape are also stored in blocks, although the blocks may be substantially larger than those for disks, and interblock gaps are also quite large. With typical tape densities of 1600 to 6250 bytes per inch, a typical interblock gap of 0.6 inches corresponds to 960 to 3750 bytes of wasted storage space.
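The capacity of a disk or disk pack follows directly from the geometry described above: it is just the product of the number of surfaces, tracks per surface, sectors per track, and bytes per sector. A minimal C sketch (the function name and the geometry figures in the usage example are ours, purely illustrative):

```c
/* Capacity of a disk (or disk pack) in bytes:
 * surfaces x tracks per surface x sectors per track x bytes per sector.
 * long long avoids overflow for multi-gigabyte packs. */
long long pack_capacity(int surfaces, int tracks_per_surface,
                        int sectors_per_track, int bytes_per_sector)
{
    return (long long)surfaces * tracks_per_surface *
           sectors_per_track * bytes_per_sector;
}
```

For example, a hypothetical pack of 8 double-sided platters (16 surfaces) with 2048 tracks per surface, 64 sectors per track and 512-byte sectors would hold pack_capacity(16, 2048, 64, 512) = 1 073 741 824 bytes, i.e. about 1 Gbyte.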

Assignment Set 2

MC0067 - DBMS And Oracle 9i (Book ID - B0716 & B0717)


1. Explain the purpose of Data Modeling. What are the basic constructs of

E-R Diagrams?

Data modeling is the process of creating a data model by applying formal data model descriptions using data modeling techniques. Data modeling is the act of exploring data-oriented structures. Like other modeling artifacts, data models can be used for a variety of purposes, from high-level conceptual models to physical data models. Data modeling is the formalization and documentation of existing processes and events that occur during application software design and development. Data modeling techniques and tools capture and translate complex system designs into easily understood representations of the data flows and processes, creating a blueprint for construction and/or re-engineering.

Basic Constructs of E-R Modeling: The ER model views the real world as a construct of entities and associations between entities. The basic constructs of ER modeling are entities, attributes, and relationships.

Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. Although the term entity is the one most commonly used, following Chen we should really distinguish between an entity and an entity type. An entity type is a category; an entity, strictly speaking, is an instance of a given entity type, and there are usually many instances of an entity type. Because the term entity type is somewhat cumbersome, most people tend to use the term entity as a synonym for it. Entities can be thought of as nouns. Examples: a computer, an employee, a song, a mathematical theorem.

Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs, linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Attributes: Entities and relationships can both have attributes. Examples: an employee entity might have a Social Security Number (SSN) attribute; the proved relationship may have a date attribute.

2. Write about: Types of Discretionary Privileges; Propagation of Privileges using the Grant Option; Physical Storage Structure of DBMS; Indexing.

Types of Discretionary Privileges: The concept of an authorization identifier is used to refer to a user account. The DBMS must provide selective access to each relation in the database based on specific accounts. There are two levels for assigning privileges to use the database system:

The account level: At this level, the DBA specifies the particular privileges that each account holds independently of the relations in the database.

The relation (or table) level: At this level, the DBA can control the privilege to access each individual relation or view in the database.

The privileges at the account level apply to the capabilities provided to the account itself and can include the CREATE SCHEMA or CREATE TABLE privilege, to create a schema or base relation; the CREATE VIEW privilege; the ALTER privilege, to apply schema changes such as adding or removing attributes from relations; the DROP privilege, to delete relations or views; the MODIFY privilege, to insert, delete, or update tuples; and the SELECT privilege, to retrieve information from the database by using a SELECT query. The second level of privileges applies to the relation level, whether they are base relations or virtual (view) relations.

The granting and revoking of privileges generally follow an authorization model for discretionary privileges known as the access matrix model, where the rows of a matrix M represent subjects (users, accounts, programs) and the columns represent objects (relations, records, columns, views, operations). Each position M(i,j) in the matrix represents the types of privileges (read, write, update) that subject i holds on object j. To control the granting and revoking of relation privileges, each relation R in a database is assigned an owner account, which is typically the account that was used when the relation was created in the first place. The owner of a relation is given all privileges on that relation. The owner account holder can pass privileges on R to other users by granting privileges to their accounts. In SQL the following types of privileges can be granted on each individual relation R:

SELECT (retrieval or read) privilege on R: Gives the account retrieval privilege. In SQL this gives the account the privilege to use the SELECT statement to retrieve tuples from R.

MODIFY privileges on R: This gives the account the capability to modify tuples of R. In SQL this privilege is further divided into UPDATE, DELETE, and INSERT privileges to apply the corresponding SQL command to R. In addition, both the INSERT and UPDATE privileges can specify that only certain attributes can be updated or inserted by the account.

REFERENCES privilege on R: This gives the account the capability to reference relation R when specifying integrity constraints. The privilege can also be restricted to specific attributes of R.

Propagation of Privileges using the GRANT OPTION: Whenever the owner A of a relation R grants a privilege on R to another account B, the privilege can be given to B with or without the GRANT OPTION. If the GRANT OPTION is given, this means that B can also grant that privilege on R to other accounts.
Suppose that B is given the GRANT OPTION by A and that B then grants the privilege on R to a third account C, also with GRANT OPTION. In this way, privileges on R can propagate to other accounts without the knowledge of the owner of R. If the owner account A now revokes the privilege granted to B, all the privileges that B propagated based on that privilege should automatically be revoked by the system.

Physical Storage Structure of DBMS: The physical design of the database specifies the physical configuration of the database on the storage media. This includes detailed specification of data elements, data types, indexing options and other parameters residing in the DBMS data dictionary. It is the detailed design of a system that includes modules and the database's hardware and software specifications. Physical structures are those that can be seen and operated on from the operating system, such as the physical files that store data on a disk.

Basic Storage Concepts (Hard Disk): disk access time = seek time + rotational delay. Disk access times are much slower than access to main memory; the overriding DBMS performance objective is to minimise the number of disk accesses (disk I/Os).

Indexing: An index is a data structure allowing a DBMS to locate particular records more quickly and hence speed up queries. A book index has an index term (stored in alphabetic order) with a page number; a database index (on a particular attribute) has an attribute value (stored in order) with a memory address. An index gives direct access to a record and prevents having to scan every record sequentially to find the one required.

Using SUPPLIER(Supp#, SName, SCity), consider the query "Get all the suppliers in a certain city" (e.g. London). There are 2 possible strategies:
a. Search the entire supplier file for records with city 'London'.
b. Create an index on cities, access it for 'London' entries and follow the pointers to the corresponding records.

[Figure: an SCity index with sorted entries (Dublin, London, London, Paris, Paris), each pointing to its row in a SUPPLIER table of five records S1 to S5 with SName values Smith, Jones, Clark, Ellis, Brown and SCity values drawn from London, Paris and Dublin.]
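Strategy (b) can be sketched in C: the index is kept as an array of (SCity, row) pairs sorted on SCity, so the scan can stop early. All names and the row layout here are our own illustration, not the book's:

```c
#include <string.h>

struct index_entry {
    const char *scity;   /* indexed attribute value, kept in sorted order */
    int row;             /* "pointer" (row number) into the SUPPLIER file */
};

/* Collect the row numbers of every supplier in the given city.
 * Because the index is sorted, we stop at the first entry that
 * compares greater than the key. Returns the number of matches. */
int index_lookup(const struct index_entry *idx, int n,
                 const char *city, int out[])
{
    int found = 0;
    for (int i = 0; i < n; i++) {
        int cmp = strcmp(idx[i].scity, city);
        if (cmp == 0)
            out[found++] = idx[i].row;
        else if (cmp > 0)
            break;            /* past the key: no more matches */
    }
    return found;
}
```

A real DBMS would use a B-tree or hash index rather than a sorted array, but the point is the same: only the matching records are touched, instead of every record in the file.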

3. What is a relationship type? Explain the differences among a relationship instance, a relationship type, and a relationship set.

There are three types of relationships: 1) one to one, 2) one to many, 3) many to many.

Say we have table1 and table2. For a one to one relationship, a record (row) in table1 will have at most one matching record (row) in table2, i.e. it must not have two matching records in table2. For one to many, a record in table1 can have more than one matching record in table2, but not vice versa.

Let's take an example. Say we have a database which saves information about guys and whom they are dating. We have two tables in our database, Guys and Girls:

Guy id  Guy name        Girl id  Girl name
1       Andrew          1        Girl1
2       Bob             2        Girl2
3       Craig           3        Girl3

Here, in the above example, Guy ID and Girl ID are the primary keys of their respective tables. Say Andrew is dating Girl1, Bob is dating Girl2 and Craig is dating Girl3. So we have a one to one relationship there.

So in this case we need to modify the Girls table to have a Guy id foreign key in it.

Girl id  Girl name  Guy id
1        Girl1      1
2        Girl2      2
3        Girl3      3

Now let's say one guy has started dating more than one girl, i.e. Andrew has started dating Girl1 and, say, a new Girl4. That takes us to a one to many relationship from the Guys table to the Girls table. To accommodate this change we can modify our Girls table like this:

Girl Id  Girl Name  Guy Id
1        Girl1      1
2        Girl2      2
3        Girl3      3
4        Girl4      1

Now say, after a few days, comes a time where girls have also started dating more than one boy, i.e. a many to many relationship. The thing to do here is to add another table, called a junction table, associate table or linking table, which will contain the primary key columns of both the Girls and Guys tables. Let us see it with an example:

Guy id  Guy name        Girl id  Girl name
1       Andrew          1        Girl1
2       Bob             2        Girl2
3       Craig           3        Girl3

Andrew is now dating Girl1 and Girl2, and Girl3 has started dating Bob and Craig,

so our junction table will look like this

Guy ID  Girl ID
1       1
1       2
2       2
2       3
3       3

It will contain the primary keys of both the Girls and Guys tables.

A relationship type R among n entity types E1, E2, ..., En is a set of associations among entities from these types. Actually, R is a set of relationship instances ri, where each ri is an n-tuple of entities (e1, e2, ..., en), and each entity ej in ri is a member of entity type Ej, 1 <= j <= n. Hence, a relationship type is a mathematical relation on E1, E2, ..., En, or alternatively it can be defined as a subset of the Cartesian product E1 x E2 x ... x En. Here, the entity types E1, E2, ..., En define a set of relationships, called a relationship set.

Relationship instance: Each relationship instance ri in R is an association of entities, where the association includes exactly one entity from each participating entity type. Each such relationship instance ri represents the fact that the entities participating in ri are related in some way in the corresponding miniworld situation. For example, the relationship type WORKS_FOR associates EMPLOYEE and DEPARTMENT, relating each employee to the department for which the employee works. Each relationship instance in the relationship set WORKS_FOR associates one EMPLOYEE and one DEPARTMENT.

4. Write about: Categories of Data Models; Schemas, Instances and Database States. With an example for each.

*Categories of Data Models: High Level (or conceptual): close to the way users perceive data. Low Level (or physical): describes the details of how data is stored in the computer.

Representational (or Implementational): data models that may be understood by the user but are not too far from the way data is physically organized.

*Database Schema: A schema is basically a description of the database, mostly given by a schema diagram. For example, a university schema:

STUDENT(Name, StudentNbr, Class, Major)
COURSE(CourseName, CourseNbr, CreditHrs, Department)
PREREQUISITE(CourseNbr, PrerequisiteNbr)
SECTION(SectionId, CourseNbr, Semester, Year, Instructor)
GRADE_REPORT(StudentNbr, SectionId, Grade)

Master of Computer Application (MCA) Semester 2

MC0068 Data Structures using C 4 Credits

Assignment Set 1
MC0068 Data Structures using C (Book ID - B0701 & B0702)

1. Describe the theory and applications of Double Ended Queues (Deque) and circular queues.

A double-ended queue (dequeue, often abbreviated to deque, pronounced "deck") is an abstract data type that implements a queue for which elements can be added to or removed from either the front (head) or the back (tail). It is also often called a head-tail linked list.

Operations: The following operations are possible on a deque (for each operation, the corresponding routine in several languages is listed):

insert element at back: Ada Append; C++ push_back; Java offerLast; Perl push; PHP array_push; Python append; Ruby push; JavaScript push
insert element at front: Ada Prepend; C++ push_front; Java offerFirst; Perl unshift; PHP array_unshift; Python appendleft; Ruby unshift; JavaScript unshift
remove last element: Ada Delete_Last; C++ pop_back; Java pollLast; Perl pop; PHP array_pop; Python pop; Ruby pop; JavaScript pop
remove first element: Ada Delete_First; C++ pop_front; Java pollFirst; Perl shift; PHP array_shift; Python popleft; Ruby shift; JavaScript shift
examine last element: Ada Last_Element; C++ back; Java peekLast; Perl $array[-1]; PHP end; Python <obj>[-1]; Ruby last; JavaScript <obj>[<obj>.length - 1]
examine first element: Ada First_Element; C++ front; Java peekFirst; Perl $array[0]; PHP reset; Python <obj>[0]; Ruby first; JavaScript <obj>[0]

Implementations
There are at least two common ways to efficiently implement a deque: with a modified dynamic array or with a doubly linked list. The dynamic array approach uses a variant of a dynamic array that can grow from both ends, sometimes called array deques. These array deques have all the properties of a dynamic array, such as constant time random access, good locality of reference, and inefficient insertion/removal in the middle, with the addition of amortized constant time insertion/removal at both ends, instead of just one end. Three common implementations include: Storing deque contents in a circular buffer, and only resizing when the buffer becomes completely full. This decreases the frequency of resizings.

Allocating deque contents from the center of the underlying array, and resizing the underlying array when either end is reached. This approach may require more frequent resizings and waste more space, particularly when elements are only inserted at one end.

Storing contents in multiple smaller arrays, allocating additional arrays at the beginning or end as needed. Indexing is implemented by keeping a dynamic array containing pointers to each of the smaller arrays.

A circular queue is a particular implementation of a queue. It is very efficient. It is also quite useful in low-level code, because insertion and deletion are totally independent, which means that you don't have to worry about an interrupt handler trying to do an insertion at the same time as your main code is doing a deletion. How does it work? A circular queue consists of an array that contains the items in the queue, two array indexes and an optional length. The indexes are called the head and tail pointers and are labelled H and T on the diagram.

The head pointer points to the first element in the queue, and the tail pointer points just beyond the last element in the queue. If the tail pointer is before the head pointer, the queue wraps around the end of the array.
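The head/tail arrangement just described can be sketched as a fixed-size circular queue in C. The struct layout and names below are ours (a sketch, not the book's code); one slot is deliberately left unused so that a full queue can be told apart from an empty one:

```c
#define QSIZE 8          /* array size; usable capacity is QSIZE - 1 */

struct cqueue {
    int item[QSIZE];
    int head;            /* index of the first element (H) */
    int tail;            /* index just past the last element (T) */
};

void cq_init(struct cqueue *q)        { q->head = q->tail = 0; }
int  cq_empty(const struct cqueue *q) { return q->head == q->tail; }
int  cq_full(const struct cqueue *q)  { return (q->tail + 1) % QSIZE == q->head; }

/* Insertion touches only tail; deletion touches only head, which is
 * why the two operations do not interfere with each other. */
int cq_insert(struct cqueue *q, int x)
{
    if (cq_full(q))
        return 0;                       /* overflow */
    q->item[q->tail] = x;
    q->tail = (q->tail + 1) % QSIZE;    /* wrap around the array end */
    return 1;
}

int cq_delete(struct cqueue *q, int *x)
{
    if (cq_empty(q))
        return 0;                       /* underflow */
    *x = q->item[q->head];
    q->head = (q->head + 1) % QSIZE;
    return 1;
}
```

The modulo arithmetic on head and tail is what makes the queue "wrap around the end of the array" as described above.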

2. Illustrate the C program to represent the Stack Implementation of the POP and PUSH operations.

#include <stdio.h>
#define MAX 5

int stack[MAX], top = -1;

void display()
{
    if (top == -1) {
        printf("\nStack is empty\n");
        return;
    }
    printf("\nStack elements are:\n");
    for (int i = top; i >= 0; i--)
        printf("%d\n", stack[i]);
}

void push()
{
    int ele;
    char ch;
    do {
        if (top >= MAX - 1) {
            printf("\nStack is full\n");
            break;
        }
        printf("\nEnter the element to be inserted: ");
        scanf("%d", &ele);
        stack[++top] = ele;
        display();
        printf("\nDo you want to add more elements? (y/n): ");
        scanf(" %c", &ch);
    } while (ch == 'y' || ch == 'Y');
}

void pop()
{
    if (top == -1) {
        printf("\nStack underflow\n");
        return;
    }
    printf("%d is deleted from stack\n", stack[top--]);
    display();
}

int main()
{
    char c;
    int choice;
    do {
        printf("\nEnter the choice\n");
        printf("1 -> push\n");
        printf("2 -> pop\n");
        scanf("%d", &choice);
        if (choice == 1)
            push();
        else if (choice == 2)
            pop();
        else
            printf("Invalid choice\n");
        printf("\nDo you want to continue? (y/n): ");
        scanf(" %c", &c);
    } while (c == 'y' || c == 'Y');
    return 0;
}

3. Show the result of inserting 3, 1, 4, 5, 2, 9, 6, 8 into: a) a bottom-up splay tree, b) a top-down splay tree.

Splay Trees: We shall describe the algorithm by giving three rewrite rules in the form of pictures. In these pictures, x is the node that was accessed (that will eventually be at the root of the tree). By looking at the local structure of the tree defined by x, x's parent, and x's grandparent, we decide which of the following three rules to follow. We continue to apply the rules until x is at the root of the tree.

Notes:

1) Each rule has a mirror-image variant, which covers all the cases.

2) The zig-zig rule is the one that distinguishes splaying from just rotating x to the root of the tree.

3) Top-down splaying is much more efficient in practice.

The Basic Bottom-Up Splay Tree: A technique called splaying can be used so that a logarithmic amortized bound can be achieved. We use rotations such as we've seen before.

The zig case:
o Let X be a non-root node on the access path on which we are rotating.
o If the parent of X is the root of the tree, we merely rotate X and the root, as shown in Figure 2.

Figure 2 Zig Case

o This is the same as a normal single rotation.

The zig-zag case:
o In this case, X has both a parent P and a grandparent G. X is a right child and P is a left child (or vice versa).

o This is the same as a double rotation. This is shown in Figure 3.

The zig-zig case:
o This is a different rotation from those we have previously seen.
o Here X and P are either both left children or both right children.
o The transformation is shown in Figure 4.
o This is different from rotate-to-root: rotate-to-root rotates between X and P and then between X and G, whereas the zig-zig splay rotates between P and G and then between X and P.

Figure 4 Zig-zig Case

Given the rotations, consider the example in Figure 5, where we are splaying c.
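All three splay cases are assembled from the single-rotation primitive used in the zig case. A hedged C sketch of that primitive (the node layout and names are ours, not the book's):

```c
#include <stddef.h>

struct node {
    int key;
    struct node *left, *right;
};

/* Rotate the left child x of p up over p and return x, the new
 * subtree root; the mirror case swaps left and right.  The zig
 * case is exactly one such rotation between X and the root; the
 * zig-zig and zig-zag cases each apply two rotations, in the
 * orders described in the text. */
struct node *rotate_right(struct node *p)
{
    struct node *x = p->left;
    p->left = x->right;     /* x's right subtree becomes p's left */
    x->right = p;           /* p becomes x's right child */
    return x;
}
```

The caller must re-attach the returned node to the grandparent (or make it the root), which is exactly the bookkeeping the splaying rules govern.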

Top-Down Splay Trees: Bottom-up splay trees are hard to implement. We look at top-down splay trees that maintain the logarithmic amortized bound. This is the method recommended by the inventors of splay trees. The basic idea is that as we descend the tree in our search for some node X, we must take the nodes that are on the access path and move them and their subtrees out of the way. We must also perform some tree rotations to guarantee the amortized time bound. At any point in the middle of a splay, we have:
o The current node X that is the root of its subtree.
o A tree L that stores nodes less than X.
o A tree R that stores nodes larger than X.
Initially, X is the root of T, and L and R are empty. As we descend the tree two levels at a time, we encounter a pair of nodes. Depending on whether these nodes are smaller than X or larger than X, they are placed in L or R along with the subtrees that are not on the access path to X.

When we finally reach X, we can then attach L and R to the bottom of the middle tree, and as a result X will have been moved to the root. We now just have to show how the nodes are placed in the different trees. This is shown below:

o Zig rotation (Figure 7)

o Zig-zig (Figure 8)

o Zig-zag

4. Discuss the techniques for allowing a hash file to expand and shrink dynamically. What are the advantages and disadvantages of each?

Dynamic Hashing: As the database grows over time, we have three options:
1. Choose a hash function based on current file size. Get performance degradation as the file grows.
2. Choose a hash function based on anticipated file size. Space is wasted initially.
3. Periodically re-organize the hash structure as the file grows. This requires selecting a new hash function, recomputing all addresses and generating new bucket assignments.
Some hashing techniques allow the hash function to be modified dynamically to accommodate the growth or shrinking of the database. These are called dynamic hash functions. Extendable hashing is one form of dynamic hashing. Extendable hashing splits and coalesces buckets as the database size changes.

Figure 11.9: General extendable hash structure.


We choose a hash function that is uniform and random and that generates values over a relatively large range. The range is the set of b-bit binary integers (typically b = 32). With b = 32 the range holds over 4 billion values, so we don't generate that many buckets! Instead we create buckets on demand, and do not use all b bits of the hash initially. At any point we use i bits, where 0 <= i <= b. The i bits are used as an offset into a table of bucket addresses. The value of i grows and shrinks with the database.

Figure 11.19 shows an extendable hash structure. Note that the i appearing over the bucket address table tells how many bits are required to determine the correct bucket. It may be the case that several entries point to the same bucket. All such entries will have a common hash prefix, but the length of this prefix may be less than i. So we give each bucket an integer giving the length of the common hash prefix. This is shown in Figure 11.9 (textbook 11.19) as i_j. The number of bucket address table entries pointing to bucket j is then 2^(i - i_j).

To find the bucket containing search key value K:

Compute h(K).
Take the first i high-order bits of h(K).
Look at the corresponding table entry for this i-bit string.
Follow the bucket pointer in the table entry.
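"Take the first i high-order bits" is a single shift in C. A small sketch, assuming b = 32 as in the text (the function name is ours):

```c
#include <stdint.h>

/* Offset into the bucket address table: the i high-order bits of a
 * 32-bit hash value.  With i == 0 the table has exactly one entry,
 * so the offset is always 0.  Doubling the table on a split simply
 * increments i and duplicates every existing entry. */
uint32_t bucket_offset(uint32_t h, int i)
{
    return i == 0 ? 0 : h >> (32 - i);
}
```

The i == 0 guard matters because shifting a 32-bit value by 32 is undefined behaviour in C.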

We now look at insertions in an extendable hashing scheme.


Follow the same procedure as for lookup, ending up in some bucket j. If there is room in the bucket, insert the information and insert the record in the file. If the bucket is full, we must split the bucket and redistribute the records. If a bucket is split, we may need to increase the number of bits we use in the hash.

Two cases exist:

1. If i = i_j, then only one entry in the bucket address table points to bucket j. Then we need to increase the size of the bucket address table so that we can include pointers to the two buckets that result from splitting bucket j. We increment i by one, thus considering more of the hash and doubling the size of the bucket address table. Each entry is replaced by two entries, each containing the original value. Now two entries in the bucket address table point to bucket j. We allocate a new bucket z, and set the second pointer to point to z. Set i_j and i_z to i. Rehash all records in bucket j, which are put in either j or z. Now insert the new record. It is remotely possible, but unlikely, that the new hash will still put all of the records in one bucket. If so, split again and increment i again.

2. If i > i_j, then more than one entry in the bucket address table points to bucket j. Then we can split bucket j without increasing the size of the bucket address table (why?). Note that all entries that point to bucket j correspond to hash prefixes that have the same value on the leftmost i_j bits. We allocate a new bucket z, and set i_j and i_z to the original i_j value plus 1. Now adjust the entries in the bucket address table that previously pointed to bucket j: leave the first half pointing to bucket j, and make the rest point to bucket z. Rehash each record in bucket j as before. Reattempt the new insert.

Note that in both cases we only need to rehash records in bucket j. Deletion of records is similar. Buckets may have to be coalesced, and bucket address table may have to be halved. Insertion is illustrated for the example deposit file of Figure 11.20.

32-bit hash values on bname are shown in Figure 11.21. An initial empty hash structure is shown in Figure 11.22. We insert records one by one. We (unrealistically) assume that a bucket can only hold 2 records, in order to illustrate both situations described. As we insert the Perryridge and Round Hill records, this first bucket becomes full. When we insert the next record (Downtown), we must split the bucket. Since i = i_j, we need to increase the number of bits we use from the hash. We now use 1 bit, allowing us 2^1 = 2 buckets. This makes us double the size of the bucket address table to two entries. We split the bucket, placing the records whose search key hash begins with 1 in the new bucket, and those with a 0 in the old bucket (Figure 11.23). Next we attempt to insert the Redwood record, and find it hashes to 1. That bucket is full, and i = i_j. So we must split that bucket, increasing the number of bits we must use to 2. This necessitates doubling the bucket address table again, to four entries (Figure 11.24). We rehash the entries in the old bucket. We continue on for the deposit records of Figure 11.20, obtaining the extendable hash structure of Figure 11.25.

Advantages:

Extendable hashing provides performance that does not degrade as the file grows. Minimal space overhead - no buckets need be reserved for future use. Bucket address table only contains one pointer for each hash value of current prefix length.

Disadvantages:

Extra level of indirection in the bucket address table Added complexity

Assignment Set 2
MC0068 Data Structures using C (Book ID - B0701 & B0702)

1. Explain the double-ended queue with the help of a suitable example.

A double-ended queue (dequeue, often abbreviated to deque, pronounced "deck") is an abstract data structure that implements a queue for which elements can be added to or removed from either the front (head) or the back (tail). It is also often called a head-tail linked list. A deque is a special type of data structure in which insertions and deletions can be done either at the front end or at the rear end of the queue. The operations that can be performed on deques are:

Insert an item at the front end
Insert an item at the rear end
Delete an item from the front end
Delete an item from the rear end
Display the contents of the queue

The three operations insert rear, delete front and display, and the associated operations to check for an underflow and overflow of the queue, have already been discussed for the ordinary queue. In this section, the other two operations, i.e., inserting an item at the front end and deleting an item from the rear end, are discussed.

a) Insert at the front end: Consider the queue shown in fig (a) above. Observe that the front end, identified by f, is 0 and the rear end, identified by r, is -1. Here, an item can be inserted by first incrementing r by 1 and then inserting the item. If the front pointer f is not equal to 0, as shown in fig (b) above, an item can be inserted by decrementing the front pointer f by 1 and then inserting the item at that position. Except for these conditions, it is not possible to insert an item at the front end. For example, consider the queue shown in figure (c) above. Here, an item 10 is already present in the first position identified by f, so it is not possible to insert an item. The complete C function to insert an item is shown in the example below.

Example 1: Function to insert an item at the front end

void insert_front(int item, int q[], int *f, int *r)
{
    if (*f == 0 && *r == -1)
        q[++(*r)] = item;
    else if (*f != 0)
        q[--(*f)] = item;
    else
        printf("Front insertion not possible\n");

}

Delete from the rear end: To delete an element from the rear end, first access the rear element and then decrement the rear end identified by r. As an element is deleted, the queue may become empty. If the queue is empty, reset the front pointer f to 0 and the rear pointer r to -1, as was done in the ordinary queue. We delete an element only if the queue is not empty. The complete C function to delete an item from the rear end is shown in the example below.

Example 2: Function to delete an item from the rear end of the queue

void delete_rear(int q[], int *f, int *r)
{
    if (qempty(*f, *r)) {
        printf("Queue underflow\n");
        return;
    }
    printf("The element deleted is %d\n", q[(*r)--]);
    if (*f > *r) {
        *f = 0;
        *r = -1;
    }
}

C program to implement a double-ended queue:

#include <stdio.h>
#include <stdlib.h>

#define QUEUE_SIZE 5

/* Include function to check for overflow (4.2.1 Eg. 1) */
/* Include function to check for underflow (4.2.1 Eg. 3) */
/* Include function to insert an item at the front end (4.2.3 Eg. 1) */
/* Include function to insert an item at the rear end (4.2.1 Eg. 2) */
/* Include function to delete an item at the front end (4.2.1 Eg. 4) */
/* Include function to delete an item at the rear end (4.2.3 Eg. 2) */
/* Include function to display the contents of queue (4.2.1 Eg. 5) */

void main()
{
    int choice, item, f, r, q[10];
    f = 0;
    r = -1;
    for (;;) {
        printf("1: Insert_front 2: Insert_rear\n");
        printf("3: Delete_front 4: Delete_rear\n");
        printf("5: Display 6: Exit\n");
        printf("Enter the choice\n");
        scanf("%d", &choice);
        switch (choice) {
        case 1:
            printf("Enter the item to be inserted\n");
            scanf("%d", &item);
            insert_front(item, q, &f, &r);
            break;
        case 2:
            printf("Enter the item to be inserted\n");
            scanf("%d", &item);
            insert_rear(item, q, &r);
            break;
        case 3:
            delete_front(q, &f, &r);
            break;
        case 4:
            delete_rear(q, &f, &r);
            break;
        case 5:
            display(q, f, r);
            break;
        default:
            exit(0);
        }
    }
}

2. Explain inserting/deleting a node at the rear and front ends of a circular singly linked list.

A singly linked circular list is a linked list where the last node in the list points to the first node in the list. A circular list does not contain NULL pointers. With a circular list, a pointer to the last node gives easy access also to the first node, by following one link. Thus, in applications that require access to both ends of the list (e.g., in the implementation of a queue), a circular structure allows one to handle the structure by a single pointer, instead of two. A circular list can be split into two circular lists, in constant time, by giving the addresses of the last node of each piece. The operation consists in swapping the contents of the link fields of those two nodes. Applying the same operation to any two nodes in two distinct lists joins the two lists into one. This property greatly simplifies some algorithms and data structures, such as the quad-edge and face-edge. The simplest representation for an empty circular list (when such a thing makes sense) is a null pointer, indicating that the list has no nodes. With this choice, many algorithms have to test for this special case, and handle it separately. By contrast, the use of null to denote an empty linear list is more natural and often creates fewer special cases.
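The constant-time split/join described above really is just a swap of two link fields. A minimal sketch (the node type and function name are ours):

```c
struct cnode {
    int info;
    struct cnode *link;
};

/* Swap the link fields of nodes a and b.  If a and b are in the
 * same circular list, this splits it into two circular lists; if
 * they are in two different circular lists, the very same swap
 * joins them into one. */
void swap_links(struct cnode *a, struct cnode *b)
{
    struct cnode *t = a->link;
    a->link = b->link;
    b->link = t;
}
```

No traversal is needed, which is why the operation runs in constant time regardless of list length.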

Insert a node at the rear end: Consider a list containing 4 nodes, where last is a pointer variable that contains the address of the last node:

50 -> 20 -> 45 -> 10

Let us insert the item 80 at the end of this list.

Step 1: Obtain a free node temp from the availability list and store the item in its info field. This can be accomplished using the following statements:
temp = getnode();
temp->info = item;

Step 2: Copy the address of the first node (i.e. last->link) into the link field of the newly obtained node temp. The statement to accomplish this task is:
temp->link = last->link;

Step 3: Establish a link between the newly created node temp and the last node. This is achieved by copying the address of the node temp into the link field of the last node. The corresponding code for this is

Last->link=temp; Step 4: The new node is made as the last node using the statement: Return temp; These steps have designed by assuming the list already exists. Function to insert an item at the rear end of the list. NODE insert_rear(int item, NODE last) { NODE temp; Temp=getnode(); Temp->info=item; If(last==NULL) Last=temp; Else Emp->link=last->link; Last->link=temp; Retrun temp;
}
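The routine can be made runnable with a minimal C++ rendering. The Node type and the heap-based getnode() are assumptions standing in for the text's availability list; freeing nodes is omitted for brevity.

```cpp
struct Node { int info; Node* link; };
typedef Node* NODE;

// Stand-in for the availability-list allocator used in the text.
NODE getnode() { return new Node{0, nullptr}; }

// Insert at the rear of a circular list; returns the new last node.
NODE insert_rear(int item, NODE last) {
    NODE temp = getnode();
    temp->info = item;
    if (last == nullptr)
        last = temp;              // empty list: temp will point to itself
    else
        temp->link = last->link;  // step 2: new node points to the first node
    last->link = temp;            // step 3: old last node points to the new node
    return temp;                  // step 4: the new node becomes the last node
}
```

Starting from last = nullptr, successive calls build the list while last always tracks the newest node.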

Insert a node at the front end: Consider a list containing 4 nodes (20 -> 45 -> 10 -> 80), with a pointer variable last holding the address of the last node.

Step 1: To insert an item 50 at the front of the list, obtain a free node temp from the availability list and store the item in its info field:
    temp = getnode();
    temp->info = item;

Step 2: Copy the address of the first node (i.e., last->link) into the link field of the newly obtained node temp:
    temp->link = last->link;

Step 3: Establish a link between the newly created node temp and the last node by copying the address of temp into the link field of the last node:
    last->link = temp;

Now the item has been successfully inserted at the front of the list. These steps assume the list already exists. If the list is empty, make temp itself the last node and establish a link between the first node and the last node. Repeatedly insert items using the above procedure to create a list.

Function to insert an item at the front end of the list:

NODE insert_front(int item, NODE last)
{
    NODE temp;
    temp = getnode();
    temp->info = item;
    if (last == NULL)
        last = temp;
    else
        temp->link = last->link;
    last->link = temp;
    return last;
}

3. a) Compare and contrast the DFS, BFS and DFS+ID approaches. b) Discuss how a splay tree differs from a binary tree. Justify your answer with an example.

a) Once you have identified a problem as a search problem, it is important to choose the right type of search. The following information may be useful when taking that decision.

Search   Time     Space    When to use
DFS      O(c^k)   O(k)     Must search the tree anyway, know the level the answers are on, or you aren't looking for the shallowest answer.
BFS      O(c^d)   O(c^d)   Know answers are very near the top of the tree, or want the shallowest answer.
DFS+ID   O(c^d)   O(d)     Want to do BFS, don't have enough space, and can spare the time.

Here d is the depth of the answer and k is the depth searched (d <= k).

Remember the ordering properties of each search. If the program needs to produce a list sorted shortest solution first (in terms of distance from the root node), use breadth-first search or iterative deepening. For other orders, depth-first search is the right strategy. If there isn't enough time to search the entire tree, use the algorithm that is more likely to find the answer. If the answer is expected to be in one of the rows of nodes closest to the root, use breadth-first search or iterative deepening. Conversely, if the answer is expected to be in the leaves, use the simpler depth-first search. Be sure to keep space constraints in mind: if memory is insufficient to maintain the queue for breadth-first search but time is available, use iterative deepening.
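A compact sketch of DFS+ID (iterative deepening) over an explicit tree may help make the table concrete. The tree representation and the function names here are assumptions, not from the text.

```cpp
#include <vector>

// Depth-limited DFS: is 'goal' reachable from 'node' within 'limit' edges?
bool dls(const std::vector<std::vector<int>>& children,
         int node, int goal, int limit) {
    if (node == goal) return true;
    if (limit == 0) return false;
    for (int c : children[node])
        if (dls(children, c, goal, limit - 1)) return true;
    return false;
}

// DFS+ID: repeat depth-limited DFS with growing limits. Space stays O(d)
// like DFS, yet the shallowest occurrence is found first, like BFS.
// Returns the depth of the goal, or -1 if not found within maxDepth.
int iterativeDeepening(const std::vector<std::vector<int>>& children,
                       int root, int goal, int maxDepth) {
    for (int limit = 0; limit <= maxDepth; ++limit)
        if (dls(children, root, goal, limit)) return limit;
    return -1;
}
```

The trade-off in the table is visible here: the shallow levels are re-searched on every iteration (costing time), but only one root-to-node path is ever held on the call stack (saving space).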

b) Binary tree: A binary tree is simply a tree in which each node can have at most two children. A binary search tree is a binary tree in which the nodes are assigned values, with the following restrictions:
- No duplicate values.
- The left subtree of a node can only have values less than the node.
- The right subtree of a node can only have values greater than the node.
And, recursively defined:
- The left subtree of a node is a binary search tree.
- The right subtree of a node is a binary search tree.

Splay trees: We shall describe the algorithm by giving three rewrite rules in the form of pictures. In these pictures, x is the node that was accessed (that will eventually be at the root of the tree). By looking at the local structure of the tree defined by x, x's parent, and x's grandparent, we decide which of the three rules to follow. We continue to apply the rules until x is at the root of the tree:

Notes: 1) Each rule has a mirror-image variant, which covers all the cases.

2) The zig-zig rule is the one that distinguishes splaying from just rotating x to the root of the tree.

3) Top-down splaying is much more efficient in practice.

Consider the following class of binary-tree-based algorithms for processing a sequence of requests: on each access we pay the distance from the root to the accessed node. Arbitrary rotations are also allowed, at a cost of 1 per rotation. For a sequence Sigma, let C(T,Sigma) be the cost of the optimal algorithm in this class for processing Sigma starting from tree T. Then the cost of splaying the sequence (starting from any tree) is O(C(T,Sigma) + n).

What about dynamic operations that change the set of elements in the tree? Here's how we can do insertion, join, and deletion.

Insertion: Say we want to insert a new item e. Proceed like ordinary binary-tree insertion, but stop short of linking the new item e into the tree. Let the node you were about to link e to be called p. We splay p to the root. Now construct a tree according to one of the following pictures:

depending on whether the item in e is before or after p. Let's analyze this when all the weights are 1. The amortized cost of the splay is O(log n) according to the access lemma. Then we have to pay the additional amortized cost, which equals the potential of the new root; this is log(n+1). So the amortized cost of the whole thing is O(log n). An alternative insertion algorithm is to do ordinary binary-tree insertion and then simply splay the item to the root. This also has efficient amortized cost, but the analysis is a tiny bit more complicated.
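The alternative insertion algorithm just mentioned (ordinary binary-tree insertion followed by bringing the new item to the root) can be sketched with single rotations. One hedge: this move-to-root scheme uses only the simple zig rotation, and as Note 2 above points out, it is the zig-zig rule that distinguishes real splaying, so this sketch shows the shape of the operation rather than the amortized bound. All names are assumptions.

```cpp
struct SNode { int key; SNode *left = nullptr, *right = nullptr; };

SNode* rotateRight(SNode* y) { SNode* x = y->left;  y->left  = x->right; x->right = y; return x; }
SNode* rotateLeft (SNode* x) { SNode* y = x->right; x->right = y->left;  y->left  = x; return y; }

// Ordinary BST insertion (duplicates ignored, per the BST restrictions).
SNode* bstInsert(SNode* root, int key) {
    if (!root) return new SNode{key};
    if (key < root->key)      root->left  = bstInsert(root->left, key);
    else if (key > root->key) root->right = bstInsert(root->right, key);
    return root;
}

// Move-to-root via single rotations; assumes 'key' is present in the tree.
// Real splaying would instead apply zig-zig / zig-zag double rotations.
SNode* moveToRoot(SNode* root, int key) {
    if (!root || root->key == key) return root;
    if (key < root->key) {
        root->left = moveToRoot(root->left, key);
        return root->left ? rotateRight(root) : root;
    }
    root->right = moveToRoot(root->right, key);
    return root->right ? rotateLeft(root) : root;
}

// "Insert, then bring the new item to the root."
SNode* insertToRoot(SNode* root, int key) {
    return moveToRoot(bstInsert(root, key), key);
}
```

After each call to insertToRoot, the returned pointer is the freshly inserted key, mimicking the access pattern that splaying exploits.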

Master of Computer Application (MCA) Semester 2 MC0069 System Analysis & Design (SAD) 4 Credits

Assignment Set 1
MC0069 System Analysis & Design (SAD) (Book ID - B0714)

1. Describe the following: A) Modeling Simple Dependencies B) Modeling Single Inheritance C) Modeling Structural Relationships

A) Modeling Simple Dependencies: While entity types describe independent artifacts, relationship types describe meaningful associations between entity types. To be precise, a relationship type describes that entities of the entity types participating in the relationship can build a meaningful association. The actual occurrence of the association between entities is called a relationship. It is important to understand that although we defined a relationship type, this does not mean that every pair of entities builds a relationship; a relationship type defines only that relationships can occur. An example of a relationship type is "Employee owns Product". In this example the relationship type is Owns. The relationship itself is: the Employee Joe Ward owns a Product, the yellow Telephone, with the Serial Number 320TS03880.

Dependency refers to a situation in which a particular entity cannot meaningfully exist without the existence of another entity as specified within a relationship. An entity type is dependent on another entity type when each entity of the dependent entity type (subtype) depends on the existence of the corresponding parent entity in the supertype. A mandatory dependency relationship has to be specified by explicitly defining a lower limit for cardinality that is not equal to 0, or by a fixed cardinality that is not equal to 0.

B) Modeling Single Inheritance: To understand OO you need to understand common object terminology. The critical terms to understand are summarized in Table 1. I present a much more detailed explanation of these terms in The Object Primer 3/e. Some of these concepts you will have seen before, and some of them you haven't. Many OO concepts, such as encapsulation, coupling, and cohesion, come from software engineering. These concepts are important because they underpin good OO design.
The main point to be made here is that you should not deceive yourself: just because you have seen some of these concepts before does not mean you were doing OO, it just means you were doing good design. While good design is a big part of object orientation, there is still a lot more to it than that.

Single inheritance represents "is a", "is like", and "is kind of" relationships. When class B inherits from class A, it automatically has all of the attributes and operations that A implements (or inherits from other classes). Associations also exist between use cases in system use-case models and are depicted using dashed lines with the UML stereotypes of <<extend>> or <<include>>. It is also possible to model inheritance between use cases, something that is not shown in the diagram. The rectangle around the use cases is called the system boundary box and, as the name suggests, it delimits the scope of your system: the use cases inside the rectangle represent the functionality that you intend to implement.

C) Modeling Structural Relationships: The class is the basic logical entity in the UML. It defines both the data and the behaviour of a structural unit. A class is a template or model from which instances or objects are created at run time. When we develop a logical model such as a structural hierarchy in UML, we explicitly deal with classes. When we work with dynamic diagrams, such as sequence diagrams and collaborations, we work with objects or instances of classes and their interactions at run time. The principle of data hiding or encapsulation is based on localisation of effect. A class has internal data elements that it is responsible for; access to these data elements should be through the class's exposed behaviour or interface. Adherence to this principle results in more maintainable code.
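The single-inheritance "is a" relationship described in part B can be sketched in C++: when Employee inherits from Person, it automatically has all of the attributes and operations that Person implements. The class and member names are illustrative assumptions.

```cpp
#include <string>

// Base class: everything Person implements is inherited by Employee.
class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }  // inherited operation
private:
    std::string name_;                                 // inherited attribute
};

// Single inheritance: exactly one base class ("an Employee is a Person").
class Employee : public Person {
public:
    Employee(std::string name, int id) : Person(std::move(name)), id_(id) {}
    int id() const { return id_; }                     // added by the subclass
private:
    int id_;
};
```

An Employee object can be used anywhere a Person is expected, which is exactly the substitutability the generalization arrow expresses.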

2. Write and explain important relationships that are used in object-oriented modeling.

A relationship is a general term covering the specific types of logical connections found on class and object diagrams. A relationship references one or more related elements. There is no general notation for a relationship; in most cases the notation is some kind of line connecting the related elements. Specific subclasses of relationship define their own notation.

Instance-Level Relationships

Association: Association is a relationship between classifiers which is used to show that instances of classifiers could be either linked to each other or combined logically or physically into some aggregation. Association defines the relationship between two or more classes in the system. These generally relate to one object holding an instance of, or a reference to, another object inside it. Associations in UML are refined in the following ways: 1) Multiplicity 2) Aggregation 3) Composition

Multiplicity in UML: Multiplicity indicates the number of instances of one class that are linked to one instance of another class. The common multiplicity values are listed below:

Notation   Description
1          Exactly one instance
0..1       Zero or one instance
*          Many instances
0..*       Zero or many instances
1..*       One or many instances
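A bi-directional association with multiplicity can be sketched as one object holding a reference to another, as described above: one Department offers 0..* Courses, and each Course refers back to exactly one Department. The class and member names are illustrative assumptions.

```cpp
#include <string>
#include <vector>

class Department;  // forward declaration for the bi-directional link

class Course {
public:
    Course(std::string title, Department* dept)
        : title_(std::move(title)), dept_(dept) {}
    Department* department() const { return dept_; }   // the "1" end
    const std::string& title() const { return title_; }
private:
    std::string title_;
    Department* dept_;   // association implemented as a reference/pointer
};

class Department {
public:
    void offer(Course* c) { courses_.push_back(c); }   // the "0..*" end
    std::size_t courseCount() const { return courses_.size(); }
private:
    std::vector<Course*> courses_;
};
```

Each pointer member is one navigable end of the association; the vector realizes the 0..* multiplicity.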

Aggregation: Aggregation represents a whole/part relationship between a main class and a sub-class in which the part (sub-class) can exist independently of the whole (main class). Aggregation (shared aggregation) is a "weak" form of aggregation, where the part instance is independent of the composite: the same (shared) part could be included in several composites, and if a composite is deleted, shared parts may still exist. Shared aggregation is shown as a binary association decorated with a hollow diamond as a terminal adornment at the aggregate end of the association line.

The diamond should be noticeably smaller than the diamond notation for N-ary associations.

Search Service has a Query Builder using shared aggregation.

Composition in UML: Composition represents a whole/part relationship in which the part cannot exist without the whole: if the main class is deleted, its parts are deleted with it, and a part can belong to only one composite at a time. Composition (composite aggregation) is a "strong" form of aggregation. The composition requirements/features listed in the UML specification are: it is a whole/part relationship; it is a binary association; a part could be included in at most one composite (whole) at a time; and if a composite (whole) is deleted, all of its parts are "normally" deleted with it. Note that UML does not define how, when, or in what specific order the parts of the composite are created. Also, in some cases a part can be removed from a composite before the composite is deleted, and so is not necessarily deleted as part of the composite. Composite aggregation is depicted as a binary association decorated with a filled black diamond at the aggregate (whole) end.

A Folder could contain many Files, while each File has exactly one parent Folder. If a Folder is deleted, all contained Files are deleted as well. When composition is used in domain models, both the whole/part relationship and the event of composite "deletion" should be interpreted figuratively, not necessarily as physical containment and/or termination. The UML specification would need to be updated to explicitly allow this interpretation.
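The contrast between shared aggregation (hollow diamond) and composition (filled diamond) can be sketched in C++ ownership terms, reusing the examples above: a SearchService shares a QueryBuilder it does not own, while a Folder owns its Files and deletes them when it is destroyed. The member names are assumptions.

```cpp
#include <memory>
#include <string>
#include <vector>

struct QueryBuilder { int queriesBuilt = 0; };
struct File { std::string name; };

// Aggregation: the part (QueryBuilder) is independent of the composite;
// the same builder may be shared by several services and outlives them.
class SearchService {
public:
    explicit SearchService(QueryBuilder* qb) : qb_(qb) {}  // no ownership taken
    QueryBuilder* builder() const { return qb_; }
private:
    QueryBuilder* qb_;
};

// Composition: the Folder owns its Files; when the Folder is destroyed,
// all contained Files are destroyed with it.
class Folder {
public:
    void addFile(std::string name) {
        files_.push_back(std::make_unique<File>(File{std::move(name)}));
    }
    std::size_t size() const { return files_.size(); }
private:
    std::vector<std::unique_ptr<File>> files_;  // owning (composite) links
};
```

The choice between a raw (non-owning) pointer and a unique_ptr is one idiomatic way the weak/strong distinction shows up in code.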

Class-Level Relationships

Generalization: The generalization relationship indicates that one of the two related classes (the subclass) is considered to be a specialized form of the other (the superclass), and the superclass is considered a generalization of the subclass. In practice, this means that any instance of the subclass is also an instance of the superclass. An exemplary tree of generalizations of this form is found in binomial nomenclature: human beings are a subclass of simian, which is a subclass of mammal, and so on. The relationship is most easily understood by the phrase "an A is a B" (a human is a mammal, a mammal is an animal). The UML graphical representation of a generalization is a hollow triangle shape on the superclass end of the line (or tree of lines) that connects it to one or more subclasses.

The generalization relationship is also known as the inheritance or "is a" relationship. The superclass in the generalization relationship is also known as the "parent", superclass, base class, or base type. The subclass is also known as the "child", subclass, derived class, derived type, inheriting class, or inheriting type. Note that this relationship bears no resemblance to the biological parent/child relationship: the use of these terms is extremely common, but can be misleading. Generalization-specialization relationship: A is a type of B, e.g. "an oak is a type of tree", "an automobile is a type of vehicle".

Realization: In UML modeling, a realization relationship is a relationship between two model elements, in which one model element (the client) realizes (implements or executes) the behavior that the other model element (the supplier) specifies. A realization is indicated by a dashed line with an unfilled arrowhead towards the supplier. Realizations can only be shown on class or component diagrams. A realization is a relationship between classes, interfaces, components, and packages that connects a client element with a supplier element. A realization relationship between classes and interfaces, and between components and interfaces, shows that the class realizes the operations offered by the interface.
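In C++ terms, realization corresponds to a class implementing the operations an abstract interface specifies: the supplier declares pure virtual operations and the client realizes them. The Printable/Report names are illustrative assumptions.

```cpp
// Supplier: an interface that only specifies behavior.
class Printable {
public:
    virtual ~Printable() = default;
    virtual int pageCount() const = 0;   // specified, not implemented
};

// Client: Report realizes (implements) the Printable interface.
class Report : public Printable {
public:
    explicit Report(int pages) : pages_(pages) {}
    int pageCount() const override { return pages_; }
private:
    int pages_;
};
```

Code that works through a Printable reference depends only on the supplier's specification, not on the realizing class.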
Dependency: A dependency relationship is used to show that some element or set of elements depends on other model element(s); on class diagrams it is usually represented by a usage dependency or an abstraction. A dependency is rendered as a dashed arrow between elements. The element at the tail of the arrow (the client) depends on the model element at the arrowhead (the supplier). The arrow may be labeled with an optional stereotype and an optional name.
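A dependency is weaker than an association: the client merely uses the supplier, for example as an operation parameter, without keeping a structural link to it. This sketch mirrors the Data Access / Connection Pool example below; all names and members are assumptions.

```cpp
// Supplier: a pool of database connections (illustrative).
struct ConnectionPool { int available = 3; };

// Client: DataAccess depends on ConnectionPool only for the duration of
// a call. The pool appears as a parameter, not as a member: a dashed
// dependency arrow rather than a solid association line.
class DataAccess {
public:
    bool runQuery(ConnectionPool& pool) {
        if (pool.available == 0) return false;  // no connection to borrow
        --pool.available;                       // borrow a connection
        bool ok = true;                         // ... execute the query ...
        ++pool.available;                       // return the connection
        return ok;
    }
};
```

Because nothing is stored, the client's structure is unaffected if the supplier changes its lifetime or identity between calls.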

Data Access depends on Connection Pool.

It is possible to have a set of elements for the client or supplier. In this case, one or more arrows with their tails on the clients are connected to the tails of one or more arrows with their heads on the suppliers. A small dot can be placed on the junction if desired, and a note on the dependency should be attached at the junction point.

3. Discuss the concept of dynamic type modeling and how it differs from static typing.

UML diagrams represent two different views of a system model:

Static (or structural) view: emphasizes the static structure of the system using objects, attributes, operations and relationships. The structural view includes class diagrams and composite structure diagrams.

Dynamic (or behavioral) view: emphasizes the dynamic behavior of the system by showing collaborations among objects and changes to the internal states of objects. This view includes sequence diagrams, activity diagrams and state machine diagrams.

Most object-oriented programming languages are statically typed, which means that the type of an object is bound at the time the object is created. Even so, that object will likely play different roles over time. This means that clients that use that object interact with the object through different sets of interfaces, representing interesting, possibly overlapping, sets of operations. The static nature of an object can be visualized in a class diagram. However, when you are modeling things like business objects, which naturally change their roles throughout a workflow, it's sometimes useful to explicitly model the dynamic nature of that object's type. In these circumstances, an object can gain and lose types during its life. To model a dynamic type:

- Specify the different possible types of that object by rendering each type as a class stereotyped as <<type>> (if the abstraction requires structure and behavior) or as <<interface>> (if the abstraction requires only behavior).
- Model all the roles the class of the object may take on at any point in time. You can do so in two ways:
1. First, in a class diagram, explicitly type each role that the class plays in its association with other classes.
Doing this specifies the face instances of that class put on in the context of the associated object.
2. Second, also in a class diagram, specify the class-to-type relationships using generalization.
- In an interaction diagram, properly render each instance of the dynamically typed class. Display the role of the instance in brackets below the object's name. To show the change in role of an object, render the object once for each role it plays in the interaction, and connect these objects with a message stereotyped as <<become>>.

For example, Figure 4-15 shows the roles that instances of the class Person might play in the context of a human resources system.

Figure 4.15: Modeling Static Types

This diagram specifies that instances of the Person class may be any of three types, namely Candidate, Employee, or Retiree. Figure 4-16 shows the dynamic nature of a person's type: in this fragment of an interaction diagram, p (the Person object) changes its role from Candidate to Employee.

Figure 4.16: Modeling Dynamic Types

4. What are the important factors that need to be considered to model a system from different views?

Modeling Different Views of a System: When we model a system from different views, we are in effect constructing our system simultaneously from multiple dimensions. By choosing the right set of views, we set up a process that forces us to ask good questions about our system and to expose risks that need to be attacked. If we do a poor job of choosing these views, or if we focus on one view at the expense of all others, we run the risk of hiding issues and deferring problems that will eventually destroy any chance of success. To model a system from different views:

- Decide which views we need to best express the architecture of our system and to expose the technical risks of our project. The five views are: use case, design, process, implementation and deployment.
- For each of these views, decide which artifacts we need to create to capture the essential details of that view.
- As part of our process planning, decide which of these diagrams we will want to put under some sort of formal or semi-formal control. These are the diagrams for which we will want to schedule reviews and which we will preserve as documentation for the project.

- Allow room for diagrams that are thrown away. Such transitory diagrams are still useful for exploring the implications of your decisions and for experimenting with changes.

Example: If we are modeling a complex, distributed system, we will need to employ the full range of the UML's diagrams in order to express the architecture of our system and the technical risks to our project:

Use case view: use case diagrams, activity diagrams for behavioral modeling
Design view: class diagrams for structural modeling, interaction diagrams, statechart diagrams
Process view: class diagrams for structural modeling, interaction diagrams for behavioral modeling
Implementation view: component diagrams
Deployment view: deployment diagrams

5. Explain the different components of states with reference to state machines.

State Machines: A state machine is a behavior that specifies the sequences of states an object goes through during its lifetime in response to events, together with its responses to those events. We use state machines to model the dynamic aspects of a system. The state of an object is a condition or situation during the life of an object during which it satisfies some condition, performs some activity, or waits for some event. An event is the specification of a significant occurrence that has a location in time and space; in the context of state machines, an event is an occurrence of a stimulus that can trigger a state transition. A transition is a relationship between two states indicating that an object in the first state will perform certain actions and enter the second state when a specified event occurs and specified conditions are satisfied. An activity is an ongoing nonatomic execution within a state machine. An action is an executable atomic computation that results in a change in state of the model or the return of a value.

A state has several parts:
- Name: a textual string that distinguishes the state from other states; a state may be anonymous, meaning that it has no name.
- Entry/exit actions: actions executed on entering and exiting the state, respectively.
- Internal transitions: transitions that are handled without causing a change in state.
- Substates: the nested structure of a state, involving disjoint or concurrent substates.
- Deferred events: a list of events that are not handled in that state but, rather, are postponed and queued for handling by the object in another state.
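The parts listed above can be made concrete with a tiny hand-rolled state machine. This door example is an illustrative assumption, not from the text; it records entry and exit actions in a log so the transition sequence is visible.

```cpp
#include <string>

// A two-state machine for a door with entry/exit actions on each state.
class Door {
public:
    enum State { Closed, Opened };

    State state() const { return state_; }
    const std::string& log() const { return log_; }

    // Events: each triggers a transition only from the matching state
    // (an event received in any other state is simply ignored here).
    void open()  { if (state_ == Closed) transitionTo(Opened); }
    void close() { if (state_ == Opened) transitionTo(Closed); }

private:
    static const char* name(State s) { return s == Closed ? "Closed" : "Opened"; }

    void transitionTo(State next) {
        log_ += std::string("exit:") + name(state_) + ";";   // exit action
        state_ = next;
        log_ += std::string("entry:") + name(state_) + ";";  // entry action
    }

    State state_ = Closed;   // initial state
    std::string log_;
};
```

Ignoring an event rather than transitioning corresponds to an internal transition or a deferred event in a fuller model; substates are omitted for brevity.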

6. Explain the different stages to model a flow of control.

Modeling a Flow of Control: To model a flow of control:

- Set the context for the interaction, whether it is the system as a whole, a class, or an individual operation.
- Set the stage for the interaction by identifying which objects play a role. Set their initial properties, including their attribute values, state, and role.
- If your model emphasizes the structural organization of these objects, identify the links that connect them, relevant to the paths of communication that take place in this interaction. Specify the nature of the links using the UML's standard stereotypes and constraints, as necessary.
- In time order, specify the messages that pass from object to object. As necessary, distinguish the different kinds of messages; include parameters and return values to convey the necessary detail of this interaction.
- Also to convey the necessary detail of this interaction, adorn each object at every moment in time with its state and role.

For example, the figure below shows a set of objects that interact in the context of a publish-and-subscribe mechanism. It includes three objects: p (a StockQuotePublisher) and s1 and s2 (both instances of StockQuoteSubscriber). The figure is an example of a sequence diagram, which emphasizes the time order of messages.

[Sequence diagram: p : StockQuotePublisher, s1 : StockQuoteSubscriber, s2 : StockQuoteSubscriber. The subscribers register via attach(s1) and attach(s2); when notify() occurs, p sends update() to each subscriber, and each subscriber calls getState(). Figure: Flow of control by time]

Assignment Set 2
MC0069 System Analysis & Design (SAD) (Book ID - B0714)

1. Describe the following with suitable examples: A) Modeling the realization of a use case B) Modeling the realization of an operation C) Modeling a mechanism

A) Modeling the realization of a use case: A relationship is a general term covering the specific types of logical connections found on class and object diagrams. In UML modeling, a realization relationship is a relationship between two model elements, in which one model element (the client) realizes (implements or executes) the behavior that the other model element (the supplier) specifies. A realization is indicated by a dashed line with an unfilled arrowhead towards the supplier. Realizations can only be shown on class or component diagrams. A realization is a relationship between classes, interfaces, components, and packages that connects a client element with a supplier element. A realization relationship between classes and interfaces, and between components and interfaces, shows that the class realizes the operations offered by the interface.

B) Modeling the realization of an operation: The class diagram is the main building block in object-oriented modelling. It is used both for general conceptual modelling of the systematics of the application and for detailed modelling, translating the models into programming code. The classes in a class diagram represent both the main objects and/or interactions in the application and the objects to be programmed. In the class diagram these classes are represented with boxes which contain three parts:
- The upper part holds the name of the class.
- The middle part contains the attributes of the class.
- The bottom part gives the methods or operations the class can take or undertake.

In the design of a system, a number of classes are identified and grouped together in a class diagram which helps to determine the static relationships between those objects.
With detailed modelling, the classes of the conceptual design are often split into a number of subclasses. In order to further describe the behaviour of systems, these class diagrams can be complemented by a state diagram or UML state machine. Alternatively, instead of class diagrams, Object Role Modeling can be used if you just want to model the classes and their relationships.

C) Modeling a Mechanism: It is important to distinguish between the UML model and the set of diagrams of a system. A diagram is a partial graphic representation of a system's model. The model also contains documentation that drives the model elements and diagrams (such as written use cases). UML diagrams represent two different views of a system model:

Static (or structural) view: emphasizes the static structure of the system using objects, attributes, operations and relationships. The structural view includes class diagrams and composite structure diagrams.

Dynamic (or behavioral) view: emphasizes the dynamic behavior of the system by showing collaborations among objects and changes to the internal states of objects. This view includes sequence diagrams, activity diagrams and state machine diagrams.

UML models can be exchanged among UML tools by using the XMI interchange format.

2. Discuss the concept of association with the help of an example.

Association defines the relationship between two or more classes in the system. These generally relate to one object holding an instance of, or a reference to, another object inside it.

Class diagram example of association between two classes.

An association represents a family of links. Binary associations (with two ends) are normally represented as a line, with each end connected to a class box. Higher-order associations can be drawn with more than two ends; in such cases, the ends are connected to a central diamond. An association can be named, and the ends of an association can be adorned with role names, ownership indicators, multiplicity, visibility, and other properties. There are five different types of association; bi-directional and uni-directional associations are the most common ones. For instance, a flight class is associated with a plane class bi-directionally. Associations can only be shown on class diagrams. Association represents the static relationship shared among the objects of two classes. Example: "department offers courses" is an association relationship.

3. Describe the following: A) State machines

State machines: UML state machine diagrams depict the various states that an object may be in and the transitions between those states. In fact, in other modeling languages it is common for this type of diagram to be called a state-transition diagram or even simply a state diagram. A state represents a stage in the behavior pattern of an object, and as in UML activity diagrams it is possible to have initial states and final states. An initial state, also called a creation state, is the one an object is in when it is first created, whereas a final state is one out of which no transitions lead. A transition is a progression from one state to another and will be triggered by an event that is either internal or external to the object.

Advanced States: A state diagram is a type of diagram used in computer science and related fields to describe the behavior of systems. State diagrams require that the system described is composed of a finite number of states; sometimes this is indeed the case, while at other times it is a reasonable abstraction. There are many forms of state diagrams, which differ slightly and have different semantics. Advanced state diagrams are used to give an abstract description of the behavior of a system. This behavior is analyzed and represented as a series of events that could occur in one or more possible states. Hereby "each diagram usually represents objects of a single class and tracks the different states of its objects through the system".

A state diagram should not be confused with a flowchart. A flowchart is analogous to an assembly line in manufacturing, because it describes the progression of some task from beginning to end (e.g., transforming source-code input into object-code output by a compiler). A state machine generally has no notion of such a progression. A door state machine, for example, is not in a more advanced stage when it is in the "closed" state than when it is in the "opened" state; it simply reacts differently to the open/close events. A state in a state machine is an efficient way of specifying a particular behavior, rather than a stage of processing.

Transitions: A transition is a progression from one state to another and will be triggered by an event that is either internal or external to the entity being modeled. For a class, transitions are typically the result of the invocation of an operation that causes an important change in state, although it is important to understand that not all method invocations will result in transitions. An action is something, in the case of a class an operation, that is invoked by/on the entity being modeled. Guidelines:
1. Name software actions using implementation-language naming conventions.
2. Name actor actions using prose.
3. Indicate entry actions only when they apply to all entry transitions.
4. Indicate exit actions only when they apply to all exit transitions.
5. Model recursive transitions only when you want to exit and re-enter the state.
6. Name transition events in the past tense.
7. Place transition labels near the source state.
8. Place transition labels based on transition direction. To make it easier to identify which label goes with a transition, place transition labels according to the following heuristics:
- Above transition lines going left-to-right
- Below transition lines going right-to-left
- Right of transition lines going down
- Left of transition lines going up

4. Define stereotype, tagged value and constraint in detail.

A stereotype is a profile class which defines how an existing metaclass may be extended as part of a profile. It enables the use of platform- or domain-specific terminology or notation in place of, or in addition to, the ones used for the extended metaclass. A stereotype cannot be used by itself, but must always be used with one of the metaclasses it extends. A stereotype cannot be extended by another stereotype.
A stereotype uses the same notation as a class, with the keyword «stereotype» shown before or above the name of the stereotype. Stereotype names should not clash with keyword names for the extended model element.

The Servlet stereotype extends Component. A stereotype can change the graphical appearance of the extended model element by using attached icons, represented by the Image class.

Stereotype Servlet with attached custom icon.

Actor is extended by stereotype Web Client with attached custom icon. Because stereotype is a class, it may have properties. Properties of a stereotype are referred to as tag definitions. When a stereotype is applied to a model element, the values of the properties are referred to as tagged values.

Device extended by Server stereotype with tag definitions and custom icon.

Tagged Value: UML 1.x defined the tagged value as one of UML's extensibility mechanisms, permitting arbitrary information (which could not otherwise be expressed in UML) to be attached to models. A tagged value is a keyword-value pair that may be attached to any kind of model element. The keyword is called a tag. Each tag represents a particular kind of property applicable to one or many kinds of model elements. Both the tag and the value are usually encoded as strings, though UML tools allow other data types for values. A tagged value specification in UML 1.x has the form name = value, where name is the name of a tag or metamodel attribute and value is an arbitrary string that denotes its value. For example, {author="Joe Smith", deadline=31-March-1997, status=analysis}

Boolean tags frequently have the form isQuality, where quality is some condition that may be true or false. In these cases, the form "quality" may usually appear by itself, without a value and defaulting to true. For example, {abstract} is the same as {isAbstract=true}. To specify a value of false, omit the name completely. Tags of other types require explicit values. Tagged value (as well as metamodel attribute) is displayed as a comma delimited sequence of properties inside a pair of curly braces "{" and "}".

Stereotype Computer applied using "traditional" tagged value notation. In UML 1.3, tagged values could extend a model element without requiring the presence of a stereotype. In UML 1.4 this capability, although still supported, was deprecated, to be used only for backward compatibility reasons. Since UML 2.0, a tagged value can only be represented as an attribute defined on a stereotype; therefore, a model element must be extended by a stereotype in order to be extended by tagged values. To support compatibility with UML 1.3, some UML tools can automatically define a stereotype to which "unattached" attributes (tagged values) will be attached. Tagged values can be shown in a class compartment under the stereotype name. An additional compartment is required for each applied stereotype whose values are to be displayed. Each such compartment is headed by the name of the applied stereotype in guillemets.

Stereotype Computer applied with tag values in a compartment. Tagged values can also be shown in an attached comment under the stereotype name.

Stereotype Computer applied with tag values in a comment note. When displayed in compartments or in a comment symbol, each name-value pair should appear on a separate line.

Constraint: A constraint is used to express a condition, restriction or assertion related to the semantics of a model element. A constraint is a condition which must evaluate to true in a correct design of the system. For example, a bank account balance constraint could be expressed as: {balance >= 0}

A constraint in UML is usually written in curly braces using any computer or natural language, including English, so that it might look like: {account balance is positive}. UML prefers OCL as the constraint language.

5. What are the important factors that need to be considered to model a system from different views?

Modeling Different Views of a System: When you model a system from different views, you are in effect constructing your system simultaneously from multiple dimensions. By choosing the right set of views, you set up a process that forces you to ask good questions about your system and to expose risks that need to be attacked. If you do a poor job of choosing these views, or if you focus on one view at the expense of all others, you run the risk of hiding issues and deferring problems that will eventually destroy any chance of success. To model a system from different views:

Decide which views you need to best express the architecture of your system and to expose the technical risks to your project. The five views of an architecture described earlier are a good starting point.

For each of these views, decide which artifacts you need to create to capture the essential details of that view. For the most part, these artifacts will consist of various UML diagrams.

As part of your process planning, decide which of these diagrams you'll want to put under some sort of formal or semi-formal control. These are the diagrams for which you'll want to schedule reviews and to preserve as documentation for the project.

Allow room for diagrams that are thrown away. Such transitory diagrams are still useful for exploring the implications of your decisions and for experimenting with changes.

For example, if you are modeling a simple monolithic application that runs on a single machine, you might need only the following handful of diagrams:
Use case view: Use case diagrams
Design view: Class diagrams (for structural modeling); Interaction diagrams (for behavioral modeling)
Process view: None required
Implementation view: None required
Deployment view: None required

If yours is a reactive system or if it focuses on process flow, you'll probably want to include statechart diagrams and activity diagrams, respectively, to model your system's behavior.

Similarly, if yours is a client/server system, you'll probably want to include component diagrams and deployment diagrams to model the physical details of your system. Finally, if you are modeling a complex, distributed system, you'll need to employ the full range of the UML's diagrams in order to express the architecture of your system and the technical risks to your project, as in the following:

Use case view: Use case diagrams; Activity diagrams (for behavioral modeling)
Design view: Class diagrams (for structural modeling); Interaction diagrams (for behavioral modeling); Statechart diagrams (for behavioral modeling)
Process view: Class diagrams (for structural modeling); Interaction diagrams (for behavioral modeling)
Implementation view: Component diagrams
Deployment view: Deployment diagrams

6. What are the necessary considerations to be taken care of to forward engineer a class diagram?

Forward engineering is the process of transforming a model into code through a mapping to an implementation language. Forward engineering results in a loss of information, because models written in the UML are semantically richer than any current object-oriented programming language. In fact, this is a major reason why you need models in addition to code. Structural features, such as collaborations, and behavioral features, such as interactions, can be visualized clearly in the UML, but not so clearly from raw code.

To forward engineer a class diagram: Identify the rules for mapping to your implementation language or languages of choice. This is something you'll want to do for your project or your organization as a whole. Depending on the semantics of the languages you choose, you may have to constrain your use of certain UML features. For example, the UML permits you to model multiple inheritance, but Smalltalk permits only single inheritance.
You can either choose to prohibit developers from modeling with multiple inheritance (which makes your models language-dependent) or develop idioms that transform these richer features into the implementation language (which makes the mapping more complex). Use tagged values to specify your target language. You can do this at the level of individual classes if you need precise control. You can also do so at a higher level, such as with collaborations or packages. Use tools to forward engineer your models.

Figure 6-3 illustrates a simple class diagram specifying an instantiation of the chain of responsibility pattern. This particular instantiation involves three classes: Client, EventHandler, and GUIEventHandler. Client and EventHandler are shown as abstract classes, whereas GUIEventHandler is concrete. EventHandler has the usual operation expected of this pattern (handleRequest), although two private attributes have been added for this instantiation.

Figure : Forward Engineering
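As a concrete illustration, here is one possible C++ mapping of that diagram — a sketch only, not the book's own output. The successor pointer and the two private attribute names (currentEventID, source) are hypothetical, since the text above does not name them.

```cpp
#include <string>

// Abstract base class from the diagram; participates in the
// chain of responsibility pattern.
class EventHandler {
public:
    explicit EventHandler(EventHandler* successor = nullptr)
        : successor(successor) {}
    virtual ~EventHandler() = default;
    virtual void handleRequest() = 0;   // operation named in the diagram
protected:
    EventHandler* successor;            // next handler in the chain
private:
    int currentEventID = 0;             // hypothetical private attribute
    std::string source;                 // hypothetical private attribute
};

// Concrete subclass from the diagram.
class GUIEventHandler : public EventHandler {
public:
    using EventHandler::EventHandler;
    void handleRequest() override {
        handled = true;                 // stand-in for real GUI event logic
        if (successor) successor->handleRequest();
    }
    bool handled = false;               // for observing the chain in tests
};
```

A tool forward engineering the diagram would emit skeletons much like this, leaving the method bodies for the developer to fill in.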

Master of Computer Application (MCA) Semester 2 MC0070 Operating Systems with Unix 4 Credits

Assignment Set 1
MC0070 Operating Systems with Unix (Book ID: B0682 & B0683)
1. Describe the concept of process control in Operating systems.

A process can be simply defined as a program in execution. A process, along with the program code, comprises the program counter value, processor register contents, values of variables, the stack, and program data. A process is created and terminated, and it follows some or all of the states of process transition, such as New, Ready, Running, Waiting, and Exit.

A process is not the same as a program. A process is more than program code: it is an active entity, as opposed to a program, which is considered a passive entity. A program is an algorithm expressed in some programming language; being passive, a program is only a part of a process. A process, on the other hand, includes:

Current value of the Program Counter (PC)
Contents of the processor's registers
Values of the variables
The process stack, which typically contains temporary data such as subroutine parameters, return addresses, and temporary variables
A data section that contains global variables

A process is the unit of work in a system. A process has certain attributes that directly affect execution. These include:

PID - The process identification: a unique number that identifies the process within the kernel.
PPID - The process's parent PID, i.e., the PID of the creator of the process.
UID - The user ID number of the user that owns this process.
EUID - The effective user ID of the process.
GID - The group ID of the user that owns this process.
EGID - The effective group ID that owns this process.
Priority - The priority that this process runs at.

To view a process you use the ps command:

# ps -l
F  S UID PID   PPID C PRI NI P SZ:RSS WCHAN    TTY   TIME COMD
30 S 0   11660 145  1 26  20 * 66:20  88249f10 ttyq6 0:00 rlogind

The F field: This is the flag field. It uses hexadecimal values which are added to show the value of the flag bits for the process. For a normal user process this will be 30, meaning it is loaded into memory.

The S field: The state of the process; the two most common values are S for Sleeping and R for Running. An important value to look for is X, which means the process is waiting for memory to become available.

The PID field: The Process ID of each process. This value should be unique. Generally PIDs are allocated lowest to highest, but wrap at some point. This value is necessary for you to send a signal to a process, such as the KILL signal.

The PRI field: This stands for the priority field; the lower the value, the higher the priority. This refers to the process NICE value. It will range from 0 to 39. The default is 20; as a process uses the CPU, the system will raise the nice value.

The P flag: This is the processor flag. On the SGI this refers to the processor the process is running on.

The SZ field: This refers to the SIZE field, the total number of pages in the process. Each page is 4096 bytes.

The TTY field: This is the terminal assigned to your process.

The TIME field: The cumulative execution time of the process in minutes and seconds.

The COMD field: The command that was executed.

In the process model, all software on the computer is organized into a number of sequential processes. A process includes its PC, registers, and variables. Conceptually, each process has its own virtual CPU. In reality, the CPU switches back and forth among processes.

2. Describe the following: A) Layered Approach B) Micro Kernels C) Virtual Machines

A) Layered Approach: With proper hardware support, operating systems can be broken into pieces that are smaller and more appropriate than those allowed by the original MS-DOS or UNIX systems. The operating system can then retain much greater control over the computer and over the applications that make use of that computer. Implementers have more freedom in changing the inner workings of the system and in creating modular operating systems.
Under the top-down approach, the overall functionality and features are determined and then separated into components. Information hiding is also important, because it leaves programmers free to implement the low-level routines as they see fit, provided that the external interface of each routine stays unchanged and that the routine itself performs the advertised task. A system can be made modular in many ways. One method is the layered approach, in which the operating system is broken up into a number of layers (levels). The bottom layer (layer 0) is the hardware; the highest (layer N) is the user interface.

A typical layered structure, from top to bottom:

Users
File Systems
Inter-process Communication
I/O and Device Management
Virtual Memory
Primitive Process Management
Hardware

B) Micro-kernels: We have already seen that as UNIX expanded, the kernel became large and difficult to manage. In the mid-1980s, researchers at Carnegie Mellon University developed an operating system called Mach that modularized the kernel using the microkernel approach. This method structures the operating system by removing all nonessential components from the kernel and implementing them as system- and user-level programs. The result is a smaller kernel. There is little consensus regarding which services should remain in the kernel and which should be implemented in user space. Typically, however, micro-kernels provide minimal process and memory management, in addition to a communication facility. In a typical microkernel structure, the device drivers, file server, client processes, and virtual memory all run above the microkernel, which in turn runs on the hardware.

C) Virtual Machine: The layered approach of operating systems is taken to its logical conclusion in the concept of the virtual machine. The fundamental idea behind a virtual machine is to abstract the hardware of a single computer (the CPU, memory, disk drives, network interface cards, and so forth) into several different execution environments, thereby creating the illusion that each separate execution environment is running its own private computer. By using CPU scheduling and virtual memory techniques, an operating system can create the illusion that a process has its own processor with its own (virtual) memory. Normally a process has additional features, such as system calls and a file system, which are not provided by the bare hardware. The virtual machine approach does not provide any such additional functionality, but rather an interface that is identical to the underlying bare hardware. Each process is provided with a (virtual) copy of the underlying computer.

3. Memory management is important in operating systems. Discuss the main problems that can occur if memory is managed poorly.

In addition to managing processes, the operating system must efficiently manage the primary memory of the computer. The part of the operating system which handles this responsibility is called the memory manager. Since every process must have some amount of primary memory in order to execute, the performance of the memory manager is crucial to the performance of the entire system.

Virtual memory refers to the technology in which some space on the hard disk is used as an extension of main memory, so that a user program need not worry if its size exceeds the size of main memory. For paging memory management, each process is associated with a page table. Each entry in the table contains the frame number of the corresponding page in the virtual address space of the process. This same page table is also the central data structure for virtual memory mechanisms based on paging, although more facilities are needed; these cover the control bits, multi-level page tables, etc. Segmentation is another popular method for both memory management and virtual memory.

Basic Cache Structure: The idea of cache memories is similar to virtual memory in that some active portion of a low-speed memory is stored in duplicate in a higher-speed cache memory. When a memory request is generated, the request is first presented to the cache memory, and if the cache cannot respond, the request is then presented to main memory. Content-Addressable Memory (CAM) is a special type of computer memory used in certain very high-speed searching applications. It is also known as associative memory, associative storage, or associative array, although the last term is more often used for a programming data structure.
Since every process must have some amount of primary memory in order to execute, the performance of the memory manager is crucial to the performance of the entire system. Nutt explains: The memory manager is responsible for allocating primary memory to processes and for assisting the programmer in loading and storing the contents of the primary memory. Managing the sharing of primary memory and minimizing memory access time are the basic goals of the memory manager.

The real challenge of efficiently managing memory is seen in the case of a system which has multiple processes running at the same time. Since primary memory can be space-multiplexed, the memory manager can allocate a portion of primary memory to each process for its own use. However, the memory manager must keep track of which processes are running in which memory locations, and it must also determine how to allocate and de-allocate available memory when new processes are created and when old processes complete execution. While various strategies are used to allocate space to processes competing for memory, three of the most popular are best fit, worst fit, and first fit. Each of these strategies is described below:

Best fit: The allocator places a process in the smallest block of unallocated memory in which it will fit. For example, suppose a process requests 12KB of memory and the memory manager currently has a list of unallocated blocks of 6KB, 14KB, 19KB, 11KB, and 13KB. The best-fit strategy will allocate 12KB of the 13KB block to the process.

Worst fit: The memory manager places a process in the largest block of unallocated memory available. The idea is that this placement will create the largest hole after the allocation, thus increasing the possibility that, compared to best fit, another process can use the remaining space. Using the same example as above, worst fit will allocate 12KB of the 19KB block to the process, leaving a 7KB block for future use.

First fit: There may be many holes in the memory, so the operating system, to reduce the amount of time it spends analyzing the available spaces, begins at the start of primary memory and allocates memory from the first hole it encounters that is large enough to satisfy the request. Using the same example as above, first fit will allocate 12KB of the 14KB block to the process.
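The three placement strategies can be sketched directly from the example above. This is a minimal illustration over a list of free block sizes, not a production allocator:

```cpp
#include <cstddef>
#include <vector>

// Each function returns the index of the free block chosen for a
// request of `size` KB, or -1 if no block fits. All three scan the
// same free list; only the selection rule differs.
int bestFit(const std::vector<int>& blocks, int size) {
    int best = -1;
    for (std::size_t i = 0; i < blocks.size(); ++i)
        if (blocks[i] >= size && (best < 0 || blocks[i] < blocks[best]))
            best = static_cast<int>(i);   // smallest block that fits
    return best;
}

int worstFit(const std::vector<int>& blocks, int size) {
    int worst = -1;
    for (std::size_t i = 0; i < blocks.size(); ++i)
        if (blocks[i] >= size && (worst < 0 || blocks[i] > blocks[worst]))
            worst = static_cast<int>(i);  // largest block that fits
    return worst;
}

int firstFit(const std::vector<int>& blocks, int size) {
    for (std::size_t i = 0; i < blocks.size(); ++i)
        if (blocks[i] >= size)
            return static_cast<int>(i);   // first block that fits
    return -1;
}
```

With the free list {6, 14, 19, 11, 13} and a 12KB request, best fit picks the 13KB block, worst fit the 19KB block, and first fit the 14KB block, matching the example above.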

Notice in the diagram above that the Best fit and First fit strategies both leave a tiny segment of memory unallocated just beyond the new process. Since the amount of memory is small, it is not likely that any new processes can be loaded here. This condition of splitting primary memory into segments as the memory is allocated and deallocated is known as fragmentation. The Worst fit strategy attempts to reduce the problem of fragmentation by allocating the largest fragments to new processes. Thus, a larger amount of space will be left as seen in the diagram above.

Another way in which the memory manager enhances the ability of the operating system to support multiple processes running simultaneously is by the use of virtual memory. According to Nutt, virtual memory strategies allow a process to use the CPU when only part of its address space is loaded in primary memory. In this approach, each process's address space is partitioned into parts that can be loaded into primary memory when they are needed and written back to secondary memory otherwise. Another consequence of this approach is that the system can run programs which are actually larger than the primary memory of the system, hence the idea of virtual memory.

Brookshear explains how this is accomplished: Suppose, for example, that a main memory of 64 megabytes is required but only 32 megabytes is actually available. To create the illusion of the larger memory space, the memory manager would divide the required space into units called pages and store the contents of these pages in mass storage. A typical page size is no more than four kilobytes. As different pages are actually required in main memory, the memory manager would exchange them for pages that are no longer required, and thus the other software units could execute as though there were actually 64 megabytes of main memory in the machine.

In order for this system to work, the memory manager must keep track of all the pages that are currently loaded into primary memory. This information is stored in a page table maintained by the memory manager. A page fault occurs whenever a process requests a page that is not currently loaded into primary memory. To handle page faults, the memory manager takes the following steps:

The memory manager locates the missing page in secondary memory.
The page is loaded into primary memory, usually causing another page to be unloaded.
The page table in the memory manager is adjusted to reflect the new state of the memory.
The processor re-executes the instructions which caused the page fault.
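The page-fault handling steps above can be sketched with a tiny simulation. The frame count, page numbers, and FIFO replacement policy here are chosen purely for illustration:

```cpp
#include <cstddef>
#include <deque>
#include <unordered_set>
#include <vector>

// Count page faults for a page reference string, given `frames`
// physical frames. On a fault with memory full, the oldest resident
// page is unloaded (FIFO replacement) before the new page is loaded.
int countPageFaults(const std::vector<int>& refs, std::size_t frames) {
    std::unordered_set<int> resident;   // pages currently in primary memory
    std::deque<int> loadOrder;          // load order, for FIFO eviction
    int faults = 0;
    for (int page : refs) {
        if (resident.count(page)) continue;   // page is loaded: no fault
        ++faults;                             // page fault occurs
        if (resident.size() == frames) {      // memory full: unload a page
            resident.erase(loadOrder.front());
            loadOrder.pop_front();
        }
        resident.insert(page);                // "load" the missing page
        loadOrder.push_back(page);            // adjust the bookkeeping
    }
    return faults;
}
```

For the reference string 1, 2, 3, 1, 4, 1, 2 with three frames, this simulation reports six page faults: three cold-start faults plus three replacements.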

4. Discuss the following: A) File Substitution B) I/O Control

A) File Substitution and I/O Control: It is important to understand how file substitution actually works. In the previous examples, the ls command doesn't do the work of file substitution; the shell does. Even though all the previous examples employ the ls command, any command that accepts filenames on the command line can use file substitution. In fact, using the simple echo command is a good way to experiment with file substitution without having to worry about unexpected results. For example,

$ echo p*

p10 p101 p11

When a metacharacter is encountered in a UNIX command, the shell looks for patterns in filenames that match the metacharacter. When a match is found, the shell substitutes the actual filename in place of the string containing the metacharacter, so that the command sees only a list of valid filenames. If the shell finds no filenames that match the pattern, it passes an empty string to the command. The shell can expand more than one pattern on a single line. Therefore, the shell interprets the command

$ ls LINES.* PAGES.*

as

$ ls LINES.dat LINES.idx PAGES.dat PAGES.idx

There are file substitution situations that you should be wary of. You should be careful about the use of whitespace (extra blanks) in a command line. If you enter the following command, for example, the results might surprise you:

$ ls LINES. *

What has happened is that the shell interpreted the first parameter as the filename LINES. with no metacharacters and passed it directly on to ls. Next, the shell saw the single asterisk (*) and matched it to any character string, which matches every file in the directory. This is not a big problem if you are simply listing the files, but it could mean disaster if you were using the command to delete data files!

Unusual results can also occur if you use the period (.) in a shell command. Suppose that you are using the

$ ls .*

command to view the hidden files. What the shell would see after it finishes interpreting the metacharacter is

$ ls . .. .profile

which gives you a complete directory listing of both the current and parent directories. When you think about how filename substitution works, you might assume that the default form of the ls command is actually

$ ls *

However, in this case the shell passes to ls the names of directories, which causes ls to list all the files in the subdirectories. The actual form of the default ls command is

$ ls .
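The pattern matching the shell performs against filenames can be illustrated with a tiny matcher for the * and ? metacharacters. This is a simplified sketch of the idea, not the shell's actual algorithm (the real shell also handles character classes, hidden-file rules, and more):

```cpp
#include <string>

// Return true if `name` matches `pattern`, where '*' matches any run
// of characters (including none) and '?' matches exactly one character.
bool matches(const std::string& pattern, const std::string& name) {
    if (pattern.empty()) return name.empty();
    if (pattern[0] == '*')
        // '*' either matches nothing, or consumes one character of name.
        return matches(pattern.substr(1), name) ||
               (!name.empty() && matches(pattern, name.substr(1)));
    if (name.empty()) return false;
    if (pattern[0] == '?' || pattern[0] == name[0])
        return matches(pattern.substr(1), name.substr(1));
    return false;
}
```

For example, matches("p*", "p101") is true while matches("p*", "q10") is false; the shell replaces p* on the command line with every filename in the directory for which the match succeeds.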

5. Discuss the concept of File substitution with respect to managing data files in UNIX.

It is important to understand how file substitution actually works. In the previous examples, the ls command doesn't do the work of file substitution; the shell does. Even though all the previous examples employ the ls command, any command that accepts filenames on the command line can use file substitution. In fact, using the simple echo command is a good way to experiment with file substitution without having to worry about unexpected results. For example,

$ echo p*
p10 p101 p11

When a metacharacter is encountered in a UNIX command, the shell looks for patterns in filenames that match the metacharacter. When a match is found, the shell substitutes the actual filename in place of the string containing the metacharacter, so that the command sees only a list of valid filenames. If the shell finds no filenames that match the pattern, it passes an empty string to the command. The shell can expand more than one pattern on a single line. Therefore, the shell interprets the command

$ ls LINES.* PAGES.*

as

$ ls LINES.dat LINES.idx PAGES.dat PAGES.idx

There are file substitution situations that you should be wary of. You should be careful about the use of whitespace (extra blanks) in a command line. If you enter the following command, for example, the results might surprise you:

$ ls LINES. *

What has happened is that the shell interpreted the first parameter as the filename LINES. with no metacharacters and passed it directly on to ls. Next, the shell saw the single asterisk (*) and matched it to any character string, which matches every file in the directory. This is not a big problem if you are simply listing the files, but it could mean disaster if you were using the command to delete data files!

Unusual results can also occur if you use the period (.) in a shell command. Suppose that you are using the

$ ls .*

command to view the hidden files. What the shell would see after it finishes interpreting the metacharacter is

$ ls . .. .profile

which gives you a complete directory listing of both the current and parent directories. When you think about how filename substitution works, you might assume that the default form of the ls command is actually

$ ls *

However, in this case the shell passes to ls the names of directories, which causes ls to list all the files in the subdirectories. The actual form of the default ls command is

$ ls .

Assignment Set 2
MC0070 Operating Systems with Unix Book ID: B0682 & B0683

1. Describe the following with respect to Deadlocks in Operating Systems: a. Deadlock Avoidance b. Deadlock Prevention

a) Deadlock Avoidance: Deadlock avoidance means granting a resource only if granting it cannot result in a deadlock situation later. However, this works only if the system knows what requests for resources a process will be making in the future, and this is an unrealistic assumption. The text describes the banker's algorithm but then points out that it is essentially impossible to implement because of this assumption.

b) Deadlock Prevention: The difference between deadlock avoidance and deadlock prevention is a little subtle. Deadlock avoidance refers to a strategy where, whenever a resource is requested, it is only granted if it cannot result in deadlock. Deadlock prevention strategies involve changing the rules so that processes will not make requests that could result in deadlock.

Here is a simple example of such a strategy. Suppose every possible resource is numbered (easy enough in theory, but often hard in practice), and processes must make their requests in order; that is, they cannot request a resource with a number lower than any of the resources that they have been granted so far. Deadlock cannot occur in this situation.

As an example, consider the dining philosophers problem. Suppose each chopstick is numbered, and philosophers always have to pick up the lower-numbered chopstick before the higher-numbered chopstick. Philosopher 5 picks up chopstick 4, philosopher 4 picks up chopstick 3, philosopher 3 picks up chopstick 2, philosopher 2 picks up chopstick 1. Philosopher 1 is hungry and, without this rule, would pick up chopstick 5, thus causing deadlock. However, if the lower-number rule is in effect, he/she has to pick up chopstick 1 first, and it is already in use, so he/she is blocked. Philosopher 5 picks up chopstick 5, eats, and puts both down, allowing philosopher 4 to eat. Eventually everyone gets to eat.
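The resource-ordering rule can be sketched with threads and mutexes: each philosopher always locks the lower-numbered chopstick first, so the circular wait described above cannot form and every thread runs to completion. A minimal sketch (not a general deadlock-prevention framework):

```cpp
#include <algorithm>
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Philosopher i needs chopsticks i and (i+1) % n. With the
// lower-number-first rule, no cycle of waiting threads can form,
// so all n philosophers eventually eat.
int dine(int n) {
    std::vector<std::mutex> chopsticks(n);
    std::atomic<int> meals{0};
    std::vector<std::thread> philosophers;
    for (int i = 0; i < n; ++i) {
        philosophers.emplace_back([&, i] {
            int first  = std::min(i, (i + 1) % n);  // lower-numbered chopstick
            int second = std::max(i, (i + 1) % n);  // higher-numbered chopstick
            std::lock_guard<std::mutex> lo(chopsticks[first]);
            std::lock_guard<std::mutex> hi(chopsticks[second]);
            ++meals;                                // "eat" holding both
        });
    }
    for (auto& p : philosophers) p.join();
    return meals;   // number of philosophers who ate
}
```

Without the min/max ordering (each thread locking its left chopstick first), the same program could deadlock if every philosopher grabbed one chopstick simultaneously.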
An alternative strategy is to require all processes to request all of their resources at once, and either all are granted or none are granted. Like the above strategy,

this is conceptually easy but often hard to implement in practice, because it assumes that a process knows what resources it will need in advance.

2. Discuss the following with respect to I/O in Operating Systems: a. I/O Control Strategies b. Program-controlled I/O c. Interrupt-controlled I/O

a. I/O Control Strategies: Several I/O strategies are used between the computer system and I/O devices, depending on the relative speeds of the computer system and the I/O devices. The simplest strategy is to use the processor itself as the I/O controller, and to require that the device follow a strict order of events under direct program control, with the processor waiting for the I/O device at each step.

Another strategy is to allow the processor to be "interrupted" by the I/O devices, and to have a (possibly different) "interrupt handling routine" for each device. This allows for more flexible scheduling of I/O events, as well as more efficient use of the processor. (Interrupt handling is an important component of the operating system.)

A third general I/O strategy is to allow the I/O device, or the controller for the device, access to the main memory. The device would write a block of information in main memory, without intervention from the CPU, and then inform the CPU in some way that that block of memory had been overwritten or read. This might be done by leaving a message in memory, or by interrupting the processor. (This is generally the I/O strategy used by the highest-speed devices: hard disks and the video controller.)

b. Program-controlled I/O: One common I/O strategy is program-controlled I/O (often called polled I/O). Here all I/O is performed under control of an "I/O handling procedure," and input or output is initiated by this procedure. The I/O handling procedure will require some status information (handshaking information) from the I/O device (e.g., whether the device is ready to receive data).
This information is usually obtained through a second input from the device; a single bit is usually sufficient, so one input "port" can be used to collect status, or handshake, information from several I/O devices. (A port is the name given to a connection to an I/O device, e.g., to the memory location into which an I/O device is mapped.) An I/O port is usually implemented as a register (possibly

a set of D flip-flops) which also acts as a buffer between the CPU and the actual I/O device. The word "port" is often used to refer to the buffer itself. Typically, there will be several I/O devices connected to the processor; the processor checks the "status" input port periodically, under program control by the I/O handling procedure. If an I/O device requires service, it will signal this need by altering its input to the "status" port. When the I/O control program detects that this has occurred (by reading the status port), the appropriate operation will be performed on the I/O device which requested the service. A typical configuration might look somewhat as shown in the figure. The outputs labeled "handshake out" would be connected to bits in the "status" port. The input labeled "handshake in" would typically be generated by the appropriate decode logic when the I/O port corresponding to the device was addressed.
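Polling can be sketched as a loop that reads a simulated status port before each transfer on a data port. The device model and register names here are invented for illustration; real polled I/O would read memory-mapped hardware registers instead:

```cpp
#include <cstdint>
#include <queue>

// A toy device: one status bit and one data register.
struct Device {
    std::queue<std::uint8_t> incoming;               // bytes the device has ready
    bool ready() const { return !incoming.empty(); } // the "status" port
    std::uint8_t read() {                            // the "data" port
        std::uint8_t b = incoming.front();
        incoming.pop();
        return b;
    }
};

// Program-controlled I/O: the processor checks the status port and
// only reads the data port when the device reports it is ready.
int pollAndRead(Device& dev, std::uint8_t* buf, int max) {
    int n = 0;
    while (n < max) {
        if (!dev.ready()) break;   // a real driver might busy-wait here
        buf[n++] = dev.read();
    }
    return n;   // number of bytes transferred
}
```

The cost of this strategy is visible in the structure: the processor spends its time checking the status port rather than doing useful work, which is exactly the inefficiency interrupt-controlled I/O addresses.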

c. Interrupt-controlled I/O:

Interrupt-controlled I/O reduces the severity of the two problems mentioned for program-controlled I/O by allowing the I/O device itself to initiate the device service routine in the processor. This is accomplished by having the I/O device generate an interrupt signal which is tested directly by the hardware of the CPU. When the interrupt input to the CPU is found to be active, the CPU itself initiates a subprogram call to somewhere in the memory of the processor; the particular address to which the processor branches on an interrupt depends on the interrupt facilities available in the processor.

The simplest type of interrupt facility is one where the processor executes a subprogram branch to some specific address whenever an interrupt input is detected by the CPU. The return address (the location of the next instruction in the program that was interrupted) is saved by the processor as part of the interrupt process. If there are several devices which are capable of interrupting the processor, then with this simple interrupt scheme the interrupt handling routine must examine each device to determine which one caused the interrupt. Also, since only one interrupt can be handled at a time, there is usually a hardware "priority encoder" which allows the device with the highest priority to interrupt the processor, if several devices attempt to interrupt the processor simultaneously. In Figure 4.8, the "handshake out" outputs would be connected to a priority encoder to implement this type of I/O; the other connections remain the same. (Some systems use a "daisy chain" priority system to determine which of the interrupting devices is serviced first. "Daisy chain" priority resolution is discussed later.)

In most modern processors, interrupt return points are saved on a "stack" in memory, in the same way as return addresses for subprogram calls are saved. In fact, an interrupt can often be thought of as a subprogram which is invoked by an external device. If a stack is used to save the return address for interrupts, it is then possible to allow one interrupt to interrupt the handling routine of another interrupt.
In modern computer systems, there are often several "priority levels" of interrupts, each of which can be disabled, or "masked." There is usually one type of interrupt input which cannot be disabled (a non-maskable interrupt) which has priority over all other interrupts. This interrupt input is used for warning the processor of potentially catastrophic events such as an imminent power failure, to allow the processor to shut down in an orderly way and to save as much information as possible.

Most modern computers make use of "vectored interrupts." With vectored interrupts, it is the responsibility of the interrupting device to provide the address in main memory of the interrupt servicing routine for that device. This means, of course, that the I/O device itself must have sufficient "intelligence" to provide this address when requested by the CPU, and also to be initially "programmed" with this address information by the processor. Although somewhat more complex than the simple interrupt system described earlier, vectored interrupts provide such a significant advantage in interrupt handling speed and ease of implementation (i.e., a separate routine for each device) that this method is almost universally used on modern computer systems. Some processors have a number of special inputs for vectored interrupts (each acting much like the simple interrupt described earlier). Others require that the interrupting device itself provide the interrupt address as part of the process of interrupting the processor.

3. Discuss various conditions to be true for deadlock to occur.

There are many resources that can be allocated to only one process at a time, and we have seen several operating system features that allow this, such as mutexes, semaphores or file locks. Sometimes a process has to reserve more than one resource. For example, a process which copies files from one tape to another generally requires two tape drives. A process which deals with databases may need to lock multiple records in a database. A deadlock is a situation in which two computer programs sharing the same resource are effectively preventing each other from accessing the resource, resulting in both programs ceasing to function.

The earliest computer operating systems ran only one program at a time. All of the resources of the system were available to this one program. Later, operating systems ran multiple programs at once, interleaving them. Programs were required to specify in advance what resources they needed so that they could avoid conflicts with other programs running at the same time. Eventually some operating systems offered dynamic allocation of resources. Programs could request further allocations of resources after they had begun running. This led to the problem of the deadlock. Here is the simplest example:

1) Program 1 requests resource A and receives it.
2) Program 2 requests resource B and receives it.
3) Program 1 requests resource B and is queued up, pending the release of B.
4) Program 2 requests resource A and is queued up, pending the release of A.

Now neither program can proceed until the other program releases a resource. The operating system cannot know what action to take. At this point the only alternative is to abort (stop) one of the programs. Learning to deal with deadlocks had a major impact on the development of operating systems and the structure of databases. Data was structured and the order of requests was constrained in order to avoid creating deadlocks.
In general, resources allocated to a process are not preemptable; this means that once a resource has been allocated to a process, there is no simple mechanism by which the system can take the resource back from the process unless the process voluntarily gives it up or the system administrator kills the process. This can lead to a situation called deadlock. A set of processes or threads is deadlocked when each process or thread is waiting for a resource to be freed

which is controlled by another process. Here is an example of a situation where deadlock can occur.

Mutex M1, M2;

/* Thread 1 */
while (1) {
    NonCriticalSection();
    Mutex_lock(&M1);
    Mutex_lock(&M2);
    CriticalSection();
    Mutex_unlock(&M2);
    Mutex_unlock(&M1);
}

/* Thread 2 */
while (1) {
    NonCriticalSection();
    Mutex_lock(&M2);
    Mutex_lock(&M1);
    CriticalSection();
    Mutex_unlock(&M1);
    Mutex_unlock(&M2);
}

Suppose thread 1 is running and locks M1, but before it can lock M2, it is interrupted. Thread 2 starts running; it locks M2, and when it tries to obtain and lock M1, it is blocked because M1 is already locked (by thread 1). Eventually thread 1 starts running again, and it tries to obtain and lock M2, but it is blocked because M2 is already locked by thread 2. Both threads are blocked; each is waiting for an event which will never occur. Traffic gridlock is an everyday example of a deadlock situation.

In order for deadlock to occur, four conditions must be true:

1) Mutual exclusion: Each resource is either currently allocated to exactly one process or it is available. (Two processes cannot simultaneously control the same resource or be in their critical section.)
2) Hold and wait: Processes currently holding resources can request new resources.
3) No preemption: Once a process holds a resource, it cannot be taken away by another process or the kernel.
4) Circular wait: Each process is waiting to obtain a resource which is held by another process.

The dining philosophers problem discussed in an earlier section is a classic example of deadlock. Each philosopher picks up his or her left fork and waits for the right fork to become available, but it never does.

Deadlock can be modeled with a directed graph. In a deadlock graph, vertices represent either processes (circles) or resources (squares). A process which has acquired a resource is shown with an arrow (edge) from the resource to the process. A process which has requested a resource which has not yet been assigned to it is modeled with an arrow from the process to the resource. If these edges create a cycle, there is deadlock. The deadlock situation in the above code can be modeled like this.

This graph shows an extremely simple deadlock situation, but it is also possible for a more complex situation to create deadlock. Here is an example of deadlock with four processes and four resources.

There are a number of ways that deadlock can occur in an operating system. We have seen some examples; here are two more.

1) Two processes need to lock two files. The first process locks one file, the second process locks the other, and each waits for the other to free up the locked file.
2) Two processes want to write a file to a print spool area at the same time and both start writing. However, the print spool area is of fixed size, and it fills up before either process finishes writing its file, so both wait for more space to become available.

5. What do you mean by a Process? What are the various possible states of Process? Discuss.

A process under UNIX consists of an address space and a set of data structures in the kernel to keep track of that process. The address space is a section of memory that contains the code to execute as well as the process stack. The kernel must keep track of the following data for each process on the system:

the address space map,
the current status of the process,
the execution priority of the process,
the resource usage of the process,
the current signal mask,
the owner of the process.

A process has certain attributes that directly affect execution. These include:

PID - The process identification. This is a unique number that defines the process within the kernel.
PPID - The process's parent PID, the creator of the process.
UID - The user ID number of the user that owns this process.
EUID - The effective user ID of the process.
GID - The group ID of the user that owns this process.
EGID - The effective group ID that owns this process.
Priority - The priority that this process runs at.

To view a process you use the ps command.

# ps -l
F S UID PID PPID C PRI NI P SZ:RSS WCHAN TTY TIME COMD
30 S 0 11660 145 1 26 20 * 66:20 88249f10 ttyq 6 0:00 rlogind

The F field: This is the flag field. It uses hexadecimal values which are added to show the value of the flag bits for the process. For a normal user process this will be 30, meaning it is loaded into memory.

The S field: The state of the process. The two most common values are S for Sleeping and R for Running. An important value to look for is X, which means the process is waiting for memory to become available.

The PID field: The Process ID of each process. This value should be unique. Generally PIDs are allocated lowest to highest, but wrap at some point. This value is necessary for you to send a signal to a process, such as the KILL signal.

The PRI field: This stands for the priority field: the lower the value, the higher the priority. This refers to the process NICE value. It will range from 0 to 39. The default is 20; as a process uses the CPU, the system will raise the nice value.

The P flag: This is the processor flag. On the SGI this refers to the processor the process is running on.

The SZ field: This refers to the SIZE field: the total number of pages in the process. Each page is 4096 bytes.

The TTY field: This is the terminal assigned to your process.

The Time field: The cumulative execution time of the process in minutes and seconds.

The COMD field: The command that was executed.

The fork() System Call

The fork() system call is the basic way to create a new process. It is also a very unique system call, since it returns twice(!) to the caller. This system call causes the current process to be split into two processes: a parent process and a child process. All of the memory pages used by the original process get duplicated during the fork() call, so both parent and child process see the exact same image. The only distinction is when the call returns. When it returns in the parent process, its return value is the process ID (PID) of the child process. When it returns inside the child process, its return value is 0. If for some reason this call failed (not enough memory, too many processes, etc.), no new process is created, and the return value of the call is -1. In case the process was created successfully, both child process and parent process continue from the same place in the code where the fork() call was used.

#include <stdio.h>      /* defines printf() and perror() */
#include <stdlib.h>     /* defines exit() */
#include <unistd.h>     /* defines fork(), and pid_t */
#include <sys/wait.h>   /* defines the wait() system call */

int main(void)
{
    /* storage place for the pid of the child process, and its exit status */
    pid_t child_pid;
    int child_status;

    child_pid = fork();          /* let's fork off a child process */
    switch (child_pid)           /* check what the fork() call actually did */
    {
    case -1:                     /* fork() failed */
        perror("fork");          /* print a system-defined error message */
        exit(1);
    case 0:                      /* fork() succeeded, we're inside the child process */
        printf("hello world\n");
        exit(0);                 /* here the CHILD process exits, not the parent */
    default:                     /* fork() succeeded, we're inside the parent process */
        wait(&child_status);     /* wait till the child process exits */
    }
    /* the parent's code may continue here */
    return 0;
}

6. Explain the working of file substitution in UNIX. Also describe the usage of pipes in UNIX Operating system.

It is important to understand how file substitution actually works. In the previous examples, the ls command doesn't do the work of file substitution; the shell does. Even though all the previous examples employ the ls command, any command that accepts filenames on the command line can use file substitution. In fact, using the simple echo command is a good way to experiment with file substitution without having to worry about unexpected results. For example,

$ echo p*
p10 p101 p11

When a metacharacter is encountered in a UNIX command, the shell looks for patterns in filenames that match the metacharacter. When a match is found, the shell substitutes the actual filename in place of the string containing the metacharacter so that the command sees only a list of valid filenames. If the shell finds no filenames that match the pattern, it passes an empty string to the command.

The shell can expand more than one pattern on a single line. Therefore, the shell interprets the command

$ ls LINES.* PAGES.*

as

$ ls LINES.dat LINES.idx PAGES.dat PAGES.idx

There are file substitution situations that you should be wary of. You should be careful about the use of whitespace (extra blanks) in a command line. If you enter the following command, for example, the results might surprise you:

$ ls LINES. *

What has happened is that the shell interpreted the first parameter as the filename LINES. with no metacharacters and passed it directly on to ls. Next, the shell saw the single asterisk (*), and matched it to any character string, which matches every file in the directory. This is not a big problem if you are simply listing the files, but it could mean disaster if you were using the command to delete data files!

Unusual results can also occur if you use the period (.) in a shell command. Suppose that you are using the

$ ls .*

command to view the hidden files. What the shell would see after it finishes interpreting the metacharacter is

$ ls . .. .profile

which gives you a complete directory listing of both the current and parent directories. When you think about how filename substitution works, you might assume that the default form of the ls command is actually

$ ls *

However, in this case the shell passes to ls the names of directories, which causes ls to list all the files in the subdirectories. The actual form of the default ls command is

$ ls .

The find Command

One of the wonderful things about UNIX is its unlimited path names. A directory can have a subdirectory that itself has a subdirectory, and so on. This provides great flexibility in organizing your data. Unlimited path names have a drawback, though. To perform any operation on a file that is not in your current working directory, you must have its complete path name. Disk files are a lot like flashlights: you store them in what seem to be perfectly logical places, but when you need them again, you can't remember where you put them. Fortunately, UNIX has the find command.

The find command begins at a specified point on a directory tree and searches all lower branches for files that meet some criteria. Since find searches by path name, the search crosses file systems, including those residing on a network, unless you specifically instruct it otherwise. Once it finds a file, find can perform operations on it. Suppose you have a file named urgent.todo, but you cannot remember the directory where you stored it. You can use the find command to locate the file.

$ find / -name urgent.todo -print
/usr/home/stuff/urgent.todo

The syntax of the find command is a little different, but the remainder of this section should clear up any questions. The find command is different from most UNIX commands in that each of the argument expressions following the beginning path name is considered a Boolean expression. At any given stop along a branch, the entire expression is true (the file is found) if all of the expressions are true, or false (the file is not found) if any one of the expressions is false. In other words, a file is found only if all the search criteria are met. For example,

$ find /usr/home -user marsha -size +50

is true for every file beginning at /usr/home that is owned by Marsha and is larger than 50 blocks. It is not true for Marsha's files that are 50 or fewer blocks long, nor is it true for large files owned by someone else.

An important point to remember is that expressions are evaluated from left to right. Since the entire expression is false if any one expression is false, the program stops evaluating a file as soon as it fails to pass a test. In the previous example, a file that is not owned by Marsha is not evaluated for its size. If the order of the expressions is reversed, each file is evaluated first for size, and then for ownership.

Another unusual thing about the find command is that it has no natural output. In the previous example, find dutifully searches all the paths and finds all of Marsha's large files, but it takes no action. For the find command to be useful, you must specify an expression that causes an action to be taken. For example,

$ find /usr/home -user me -size +50 -print
/usr/home/stuff/bigfile
/usr/home/trash/bigfile.old

first finds all the files beginning at /usr/home that are owned by me and are larger than 50 blocks. Then it prints the full path name.

The argument expressions for the find command fall into three categories: search criteria, action expressions, and search qualifiers. Although the three types of expressions have different functions, each is still considered a Boolean expression and must be found to be true before any further evaluation of the entire expression can take place. (The significance of this is discussed later.) Typically, a find operation consists of one or more search criteria, a single action expression, and perhaps a search qualifier. In other words, it finds a file and takes some action, even if that action is simply to print

the path name. The rest of this section describes each of the categories of the find options.

Search Criteria

The first task of the find command is to locate files according to some user-specified criteria. You can search for files by name, file size, file ownership, and several other characteristics.

Finding Files with a Specific Name: -name fname

Often, the one thing that you know about a file for which you're searching is its name. Suppose that you wanted to locate, and possibly take some action on, all the files named core. You might use the following command:

$ find / -name core -print

This locates all the files on the system that exactly match the name core, and it prints their complete path names. The -name option makes filename substitutions. The command

$ find /usr/home -name "*.tmp" -print

prints the names of all the files that end in .tmp. Notice that when filename substitutions are used, the substitution string is enclosed in quotation marks. This is because the UNIX shell attempts to make filename substitutions before it invokes the command. If the quotation marks were omitted from "*.tmp" and if the working directory contained more than one *.tmp file, the actual argument passed to the find command might look like this:

$ find /usr/home -name a.tmp b.tmp c.tmp -print

This would cause a syntax error to occur.

Master of Computer Application (MCA) Semester 3
MC0071 Software Engineering 4 Credits

Assignment Set 1
MC0071 - Software Engineering (Book ID: B0808 & B0809)

1. What is the importance of Software Validation in testing?

Software Validation: Also known as software quality control. Validation checks that the product design satisfies or fits the intended usage (high-level checking), i.e., you built the right product. This is done through dynamic testing and other forms of review. According to the Capability Maturity Model (CMMI-SW v1.1):

Verification: The process of evaluating software to determine whether the products of a given development phase satisfy the conditions imposed at the start of that phase. [IEEE-STD-610].

Validation: The process of evaluating software during or at the end of the development process to determine whether it satisfies specified requirements. [IEEE-STD-610]

In other words, validation ensures that the product actually meets the user's needs, and that the specifications were correct in the first place, while verification is ensuring that the product has been built according to the requirements and design specifications. Validation ensures that you built the right thing. Verification ensures that you built it right. Validation confirms that the product, as provided, will fulfill its intended use.

From a testing perspective:

Fault: a wrong or missing function in the code.
Failure: the manifestation of a fault during execution.
Malfunction: according to its specification, the system does not meet its specified functionality.

Within the modeling and simulation community, the definitions of validation, verification and accreditation are similar:

Validation is the process of determining the degree to which a model, simulation, or federation of models and simulations, and their associated data, are accurate representations of the real world from the perspective of the intended use(s).

Accreditation is the formal certification that a model or simulation is acceptable to be used for a specific purpose.

Verification is the process of determining that a computer model, simulation, or federation of models and simulations' implementations and their associated data accurately represent the developer's conceptual description and specifications.

2. Explain the following concepts with respect to Software Reliability:
A) Software Reliability Metrics
B) Programming for Reliability

Software Reliability Metrics: Metrics which have been used for software reliability specification are shown in Figure 3.1 below. The choice of which metric should be used depends on the type of system to which it applies and the requirements of the application domain. For some systems, it may be appropriate to use different reliability metrics for different sub-systems.

Reliability metrics

In some cases, system users are most concerned about how often the system will fail, perhaps because there is a significant cost in restarting the system. In those cases, a metric based on a rate of failure occurrence (ROCOF) or the mean time to failure should be used. In other cases, it is essential that a system should always meet a request for service because there is some cost in failing to deliver the service. The number of failures in some time period is less important. In those cases, a metric based on the probability of failure on demand (POFOD) should be used. Finally, users or system operators may be mostly concerned that the system is available when a request for service is made. They will incur some loss if the system is unavailable. Availability (AVAIL), which takes into account repair or restart time, is then the most appropriate metric.

There are three kinds of measurement which can be made when assessing the reliability of a system:

1) The number of system failures given a number of system inputs. This is used to measure the POFOD.
2) The time (or number of transactions) between system failures. This is used to measure ROCOF and MTTF.
3) The elapsed repair or restart time when a system failure occurs. Given that the system must be continuously available, this is used to measure AVAIL.

Time is a factor in all of these reliability metrics. It is essential that the appropriate time units should be chosen if measurements are to be meaningful. Time units which may be used are calendar time, processor time, or some discrete unit such as number of transactions.

Programming for Reliability: There is a general requirement for more reliable systems in all application domains. Customers expect their software to operate without failures and to be available when it is required. Improved programming techniques, better programming languages and better quality management have led to very significant improvements in reliability for most software. However, for some systems, such as those which control unattended machinery, these normal techniques may not be enough to achieve the level of reliability required. In these cases, special programming techniques may be necessary to achieve the required reliability. Some of these techniques are discussed in this chapter.

Reliability in a software system can be achieved using three strategies:

Fault avoidance: This is the most important strategy, which is applicable to all types of system. The design and implementation process should be organized with the objective of producing fault-free systems.

Fault tolerance: This strategy assumes that residual faults remain in the system. Facilities are provided in the software to allow operation to continue when these faults cause system failures.

Fault detection: Faults are detected before the software is put into operation. The software validation process uses static and dynamic methods to discover any faults which remain in a system after implementation.

3. Suggest six reasons why software reliability is important. Using an example, explain the difficulties of describing what software reliability means.

Six reasons why software reliability is important:

1) Computers are now cheap and fast: There is little need to maximize equipment usage. Paradoxically, however, faster equipment leads to increasing expectations on the part of the user, so efficiency considerations cannot be completely ignored.

2) Unreliable software is liable to be discarded by users: If a company attains a reputation for unreliability because of a single unreliable product, it is likely to affect future sales of all of that company's products.

3) System failure costs may be enormous: For some applications, such as a reactor control system or an aircraft navigation system, the cost of system failure is orders of magnitude greater than the cost of the control system.

4) Unreliable systems are difficult to improve: It is usually possible to tune an inefficient system because most execution time is spent in small program sections. An unreliable system is more difficult to improve as unreliability tends to be distributed throughout the system.

5) Inefficiency is predictable: Programs take a long time to execute and users can adjust their work to take this into account. Unreliability, by contrast, usually surprises the user. Software that is unreliable can have hidden errors which can violate system and user data without warning and whose consequences are not immediately obvious. For example, a fault in a CAD program used to design aircraft might not be discovered until several plane crashes occur.

6) Unreliable systems may cause information loss: Information is very expensive to collect and maintain; it may sometimes be worth more than the computer system on which it is processed. A great deal of effort and money is spent duplicating valuable data to guard against data corruption caused by unreliable software.

4. What are the essential skills and traits necessary for effective project managers in successfully handling projects?

Project management can be defined as a set of principles, methods, tools, and techniques for planning, organizing, staffing, directing, and controlling project-related activities in order to achieve project objectives within time and under cost and performance constraints. The effectiveness of the project manager is critical to project success. The qualities that a project manager must possess include an understanding of negotiation techniques, communication and analytical skills, and requisite project knowledge. Control variables that are decisive in predicting the effectiveness of a project manager include the manager's competence as a communicator, skill as a negotiator, and leadership excellence, and whether he or she is a good team worker and has interdisciplinary skills. Project managers are responsible for directing project resources and developing plans, and must be able to ensure that a project will be completed in a given period of time. They play the essential role of coordinating between and interfacing with customers and management. Project managers must be able to:

Optimize the likelihood of overall project success
Apply the experiences and concepts learned from recent projects to new projects
Manage the project's priorities
Resolve conflicts
Identify weaknesses in the development process and in the solution
Identify process strengths upon completion of the project
Expeditiously engage team members to become informed about and involved in the project

Studies of project management in Mateyaschuk (1998), Sauer, Johnston, and Liu (1998), and Posner (1987) identify common skills and traits deemed essential for effective project managers, including:

Leadership
Strong planning and organizational skills
Team-building ability
Coping skills
The ability to identify risks and create contingency plans
The ability to produce reports that can be understood by business managers
The ability to evaluate information from specialists
Flexibility and willingness to try new approaches

Feeny and Willcocks (1998) claim that the two main indicators of a project manager's likely effectiveness are prior successful project experience and the manager's credibility with stakeholders. The underlying rationale for this is that such conditions, taken together, help ensure that the project manager has the necessary skills to execute a project and see it through to completion and that the business stakeholders will continue to support the project; see also Mateyaschuk (1998) and Weston & Stedman (1998a,b). Research also suggests that the intangibility, complexity, and volatility of project requirements have a critical impact on the success of software project managers.

5. Which are the four phases of development according to Rational Unified

Process?

Rational Unified Process Model (RUP): The RUP constitutes a complete framework for software development. The elements of the RUP (not of the problem being modeled) are the workers who implement the development, each working on some cohesive set of development activities and responsible for creating specific development artifacts. A worker is like a role a member plays, and the worker can play many roles (wear many hats) during the development. For example, a designer is a worker, and the artifact that the designer creates may be a class definition. An artifact supplied to a customer as part of the product is a deliverable. The artifacts are maintained in the Rational Rose tools, not as separate paper documents. A workflow is defined as a meaningful sequence of activities that produce some valuable result (Kruchten 2003). The development process has nine core workflows: business modeling; requirements; analysis and design; implementation; test; deployment; configuration and change management; project management; and environment. Other RUP elements, such as tool mentors, simplify training in the use of the Rational Rose system. These core workflows are spread out over the four phases of development:

The inception phase defines the vision of the actual user end-product and the scope of the project.
The elaboration phase plans activities and specifies the architecture.
The construction phase builds the product, modifying the vision and the plan as it proceeds.
The transition phase transitions the product to the user (delivery, training, support, maintenance).

In a typical two-year project, the inception and transition might take a total of five months, with a year required for the construction phase and the rest of the time for elaboration. It is important to remember that the development process is iterative, so the core workflows are repeatedly executed during each iterative visitation to a phase.
Although particular workflows will predominate during a particular type of phase (such as the planning and requirements workflows during inception), they will also be executed during the other phases. For example, the implementation workflow will peak during construction, but it is also a workflow during elaboration and transition. The goals and activities for each phase will be examined in some detail. The purpose of the inception phase is achieving concurrence among all stakeholders on the objectives for the project. This includes the project boundary

and its acceptance criteria. Especially important is identifying the essential use cases of the system, which are defined as the primary scenarios of behavior that will drive the system's functionality. Based on the usual spiral model expectation, the developers must also identify a candidate or potential architecture as well as demonstrate its feasibility on the most important use cases. Finally, cost estimation, planning, and risk estimation must be done. Artifacts produced during this phase include the vision statement for the product; the business case for development; a preliminary description of the basic use cases; business criteria for success such as revenues expected from the product; the plan; and an overall risk assessment with risks rated by likelihood and impact. A throw-away prototype may be developed for demonstration purposes but not for architectural purposes. The following elaboration phase ensures that the architecture, requirements, and plans are stable enough, and the risks are sufficiently mitigated, that [one] can reliably determine the costs and schedule for the project. The outcomes for this phase include an 80 percent complete use case model, nonfunctional performance requirements, and an executable architectural prototype. The components of the architecture must be understood in sufficient detail to allow a decision to make, buy, or reuse components, and to estimate the schedule and costs with a reasonable degree of confidence. Kruchten observes that a robust architecture and an understandable plan are highly correlated [so] one of the critical qualities of the architecture is its ease of construction. Prototyping entails integrating the selected architectural components and testing them against the primary use case scenarios. The construction phase leads to a product that is ready to be deployed to the users.
The transition phase deploys a usable subset of the system at an acceptable quality to the users, including beta testing of the product, possible parallel operation with a legacy system that is being replaced, and software staff and user training.

Assignment Set 2
MC0071 - Software Engineering (Book ID: B0808 & B0809)

1. Explain the following with respect to Configuration Management: A) Change Management B) Version and Release Management

A) Change Management: The change management process should come into effect when the software or associated documentation is put under the control of the configuration management team. Change management procedures should be designed to ensure that the costs and benefits of change are properly analyzed and that changes to a system are made in a controlled way. Change management processes involve technical change analysis, cost-benefit analysis, and change tracking. Pseudo-code can be used to define a process for managing software system changes. The first stage in the change management process is to complete a change request form (CRF). This is a formal document in which the requester sets out the change required to the system. As well as recording the change required, the CRF records the recommendations regarding the change, the estimated costs of the change, and the dates when the change was requested, approved, implemented, and validated. It may also include a section where the maintenance engineer outlines how the change is to be implemented. The information provided in the change request form is recorded in the CM database.

Once a change request form has been submitted, it is analyzed to check that the change is valid. Some change requests may be due to user misunderstandings rather than system faults; others may refer to already known faults. If the analysis process discovers that a change request is invalid, duplicated, or has already been considered, the change should be rejected. The reason for the rejection should be returned to the person who submitted the change request.

For valid changes, the next stage of the process is change assessment and costing. The impact of the change on the rest of the system must be checked. A technical analysis must be made of how to implement the change. The cost of making the change, and possibly of changing other system components to accommodate the change, is then estimated. This should be recorded on the change request form. This assessment process may use the configuration database, where component interrelation is recorded, so that the impact of the change on other components may be assessed.

Unless the change involves simple correction of minor errors on screen displays or in documents, it should then be submitted to a change control board (CCB), which decides whether or not the change should be accepted. The change control board considers the impact of the change from a strategic and organizational rather than a technical point of view. It decides if the change is economically justified and if there are good organizational reasons to accept the change. The term change control board sounds very formal; it implies a rather grand group which makes change decisions. Formally structured change control boards which include senior client and contractor staff are a requirement of military projects. For small or medium-sized projects, however, the change control board may simply consist of a project manager plus one or two engineers who are not directly involved in the software development.
In some cases, there may only be a single change reviewer who gives advice on whether or not changes are justifiable.

When a set of changes has been approved, the software is handed over to the development or maintenance team for implementation. Once the changes have been completed, the revised software must be revalidated to check that they have been correctly implemented. The CM team, rather than the system developers, is responsible for building a new version or release of the software.

Change requests are themselves configuration items. They should be registered in the configuration database. It should be possible to use this database to discover the status of change requests and the change requests which are associated with specific software components. As software components are changed, a record of the changes made to each component should be maintained. This is sometimes called the derivation history of a component. One way to maintain such a record is in a standardized comment prologue kept at the beginning of the component. This should reference the change request associated with the software change.

The change management process is very procedural. Each person involved in the process is responsible for some activity. They complete this activity and then pass on the forms and associated configuration items to someone else. The procedural nature of this process means that a change process model can be designed and integrated with a version management system. This model may then be interpreted so that the right documents are passed to the right people at the right time.
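The procedural flow described above can be sketched as executable pseudo-code. This is a hedged illustration only: the function and field names (process_change_request, is_valid, ccb_accepts) are our own inventions, not taken from any real CM tool.

```python
# Illustrative sketch of the change management process described above:
# validity check, assessment and costing, CCB decision, then implementation.
def process_change_request(crf, is_valid, ccb_accepts):
    # Stage 1: analyze the submitted CRF for validity.
    if not is_valid(crf):
        return "rejected"            # invalid, duplicate, or already known
    # Stage 2: assess impact and estimate cost, recording both on the CRF.
    crf["impact"] = "assessed"
    crf["cost"] = "estimated"
    # Stage 3: the change control board decides on business, not technical, grounds.
    if not ccb_accepts(crf):
        return "rejected by CCB"
    # Stage 4: implement, revalidate, and let the CM team build the new version.
    crf["status"] = "implemented and revalidated"
    return "accepted"
```

Note that an invalid or duplicate request is rejected before any costing is done, matching the order of stages in the text.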

B) Version and Release Management: Version and release management are the processes of identifying and keeping track of different versions and releases of a system. Version managers must devise procedures to ensure that different versions of a system may be retrieved when required and are not accidentally changed. They may also work with customer liaison staff to plan when new releases of a system should be distributed.

A system version is an instance of a system that differs, in some way, from other instances. New versions of the system may have different functionality or performance, or may repair system faults. Some versions may be functionally equivalent but designed for different hardware or software configurations. If there are only small differences between versions, one of these is sometimes called a variant of the other. A system release is a version that is distributed to customers. Each system release should either include new functionality or should be intended for a different hardware platform. Normally, there are more versions of a system than

releases. Some versions may never be released to customers. For example, versions may be created within an organization for internal development or for testing. A release is not just an executable program or set of programs. It usually includes:

(1) Configuration files defining how the release should be configured for particular installations.
(2) Data files which are needed for successful system operation.
(3) An installation program which is used to help install the system on target hardware.
(4) Electronic and paper documentation describing the system.

All this information must be made available on some medium which can be read by the customers of the software. For large systems, this may be magnetic tape. For smaller systems, floppy disks may be used. Increasingly, however, releases are distributed on CD-ROM disks because of their large storage capacity. When a system release is produced, it is important to record the versions of the operating system, libraries, compilers, and other tools used to build the software. If the release has to be rebuilt at some later date, it may be necessary to reproduce the exact platform configuration. In some cases, copies of the platform software and tools may also be placed under version management. Version management is almost always supported by automated tools, which are responsible for managing the storage of each system version.
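The key distinction above, that every release is a version but not every version is released, can be captured in a short sketch. The class and method names are illustrative assumptions, not taken from any real version management tool:

```python
# Hedged sketch: versions vs. releases, per the distinction drawn above.
class VersionStore:
    def __init__(self):
        self.versions = {}           # version id -> description
        self.releases = set()        # ids of versions distributed to customers

    def add_version(self, vid, description):
        self.versions[vid] = description

    def release(self, vid):
        # Only an existing version can be distributed as a release.
        if vid not in self.versions:
            raise KeyError("unknown version " + vid)
        self.releases.add(vid)

store = VersionStore()
store.add_version("1.0", "initial functionality")
store.add_version("1.0-test", "internal testing build")   # never released
store.add_version("1.1", "fault repairs")
store.release("1.0")
store.release("1.1")
# As the text notes, there are normally more versions than releases:
assert len(store.versions) > len(store.releases)
```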

2. Discuss the Control models in detail.

The models for structuring a system are concerned with how a system is decomposed into sub-systems. To work as a system, sub-systems must be controlled so that their services are delivered to the right place at the right time. Structural models do not (and should not) include control information. Rather, the architect should organize the sub-systems according to some control model that supplements the structural model. Control models at the architectural level are concerned with the control flow between sub-systems. Two general approaches to control can be identified:

(1) Centralized control: One sub-system has overall responsibility for control and starts and stops other sub-systems. It may also devolve control to another sub-system but will expect to have this control responsibility returned to it.

(2) Event-based control: Rather than control information being embedded in a sub-system, each sub-system can respond to externally generated events. These events might come from other sub-systems or from the environment of the system.

Control models supplement structural models. All the above structural models may be implemented using either centralized or event-based control.

Centralized control: In a centralized control model, one sub-system is designated as the system controller and has responsibility for managing the execution of other sub-systems.

Event-driven systems: In centralized control models, control decisions are usually determined by the values of some system state variables. By contrast, event-driven control models are driven by externally generated events. The distinction between an event and a simple input is that the timing of the event is outside the control of the process which handles that event. A sub-system may need to access state information to handle these events, but this state information does not usually determine the flow of control. There are two event-driven control models:

(1) Broadcast models: In these models, an event is, in principle, broadcast to all sub-systems. Any sub-system which is designed to handle that event responds to it.

(2) Interrupt-driven models: These are exclusively used in real-time systems where an interrupt handler detects external interrupts. They are then passed to some other component for processing.

Broadcast models are effective in integrating sub-systems distributed across different computers on a network. Interrupt-driven models are used in real-time systems with stringent timing requirements.
The advantage of this approach to control is that it allows very fast responses to events to be implemented. Its disadvantages are that it is complex to program and difficult to validate.
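The broadcast model described above can be illustrated with a minimal event bus, in which control is not embedded in any one sub-system. The class and handler names here are our own illustration, not part of any standard:

```python
# Minimal sketch of broadcast (event-based) control: an event is delivered
# to every registered sub-system, and only interested ones respond.
class EventBus:
    def __init__(self):
        self.handlers = {}               # event name -> list of callbacks

    def subscribe(self, event, handler):
        self.handlers.setdefault(event, []).append(handler)

    def broadcast(self, event, data=None):
        # No central controller: each sub-system registered for this
        # event gets a chance to respond.
        for handler in self.handlers.get(event, []):
            handler(data)

bus = EventBus()
log = []
bus.subscribe("sensor_reading", lambda d: log.append("logger saw %s" % d))
bus.subscribe("sensor_reading", lambda d: log.append("display saw %s" % d))
bus.broadcast("sensor_reading", 42)      # both sub-systems respond
bus.broadcast("shutdown")                # no subscribers: silently ignored
```

Because sub-systems only register interest in named events, this style integrates well with sub-systems distributed across a network, as the text notes.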

3. Using examples describe how data flow diagram may be used to document a system design. What are the advantages of using this type of design model?

Data-flow models: A data-flow model is a way of showing how data is processed by a system. At the analysis level, data-flow models should be used to show the way in which data is processed in the existing system. The notations used in these models represent functional processing, data stores, and data movements between functions. Data-flow models are used to show how data flows through a sequence of processing steps. The data is transformed at each step before moving on to the next stage. These processing steps or transformations are program functions when data-flow diagrams are used to document a software design. The figure below shows the steps involved in processing an order for goods (such as computer equipment) in an organization.

Data flow diagrams of Order processing

The model shows how the order for the goods moves from process to process. It also shows the data stores that are involved in this process. There are various notations used for data-flow diagrams. In the figure, rounded rectangles represent processing steps, arrows annotated with the data name represent flows, and rectangles represent data stores (data sources). Data-flow diagrams have the advantage that, unlike some other modelling notations, they are simple and intuitive. These diagrams are not, however, a good way to describe sub-systems with complex interfaces.

The advantages of this architecture are:

(1) It supports the reuse of transformations. (2) It is intuitive in that many people think of their work in terms of input and output processing. (3) Evolving the system by adding new transformations is usually straightforward. (4) It is simple to implement either as a concurrent or a sequential system.
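When a data-flow design is implemented, each transformation becomes a program function and the design is their composition. A hedged sketch of the order-processing flow above (the function names are our own illustration, not the labels in the figure):

```python
# Sketch: each data-flow transformation is a small function; the design
# is the pipeline formed by composing them. Names are illustrative.
def complete_order_form(item):
    return {"item": item, "status": "completed"}

def validate_order(order):
    order["valid"] = bool(order["item"])
    return order

def record_order(order, order_file):
    order_file.append(order)           # the orders data store
    return order

order_file = []
order = record_order(validate_order(complete_order_form("printer")), order_file)
# Advantage (1) above: a transformation like validate_order can be reused
# in any other pipeline that passes orders along.
```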

4. Describe the Classic Invalid assumptions with respect to Assessment of Process Life Cycle Models.

Classic Invalid Assumptions: Four unspoken assumptions that have played an important role in the history of software development are considered next.

First Assumption: Internal or External Drivers. The first unspoken assumption is that software problems are primarily driven by internal software factors. Granted this supposition, the focus of problem solving will necessarily be narrowed to the software context, thereby reducing the role of people, money, knowledge, etc. in terms of their potential to influence the solution of problems. Excluding the people factor reduces the impact of disciplines such as management (people as managers); marketing (people as customers); and psychology (people as perceivers). Excluding the money factor reduces the impact of disciplines such as economics (software in terms of business value, cost, and benefit); financial management (software in terms of risk and return); and portfolio management (software in terms of options and alternatives). Excluding the knowledge factor reduces the impact of engineering; social studies; politics; language arts; communication sciences; mathematics; statistics; and application area knowledge (accounting, manufacturing, World Wide Web, government, etc.). It has even been argued that the entire discipline of software engineering emerged as a reaction against this assumption and represented an attempt to view software development from a broader perspective. Examples range from the emergence of requirements engineering to the spiral model to human-computer interaction (HCI). Nonetheless, these developments still viewed nonsoftware-focused factors as ancillary or external drivers and failed to place software development in a comprehensive, interdisciplinary context.
Because software development problems are highly interdisciplinary in nature, they can only be understood using interdisciplinary analysis and capabilities. In fact, no purely technical software problems or products exist because every software

product is a result of multiple factors related to people, money, knowledge, etc., rather than only to technology.

Second Assumption: Software or Business Processes. A second significant unspoken assumption has been that the software development process is independent of the business processes in organizations. This assumption implied that it was possible to develop a successful software product independently of the business environment or the business goals of a firm. This led most organizations and business firms to separate software development work, people, architecture, and planning from business processes. This separation not only isolated the software-related activities, but also led to different goals, backgrounds, configurations, etc. for software as opposed to business processes. As a consequence, software processes tended to be driven by their internal purposes, which were limited to product functionality and not to product effectiveness. This narrow approach had various negative side effects on software development. For example, the software process was allowed to be virtually business-free. Once the product was finalized, it was tested and validated only for functionality, as opposed to being verified for conformity to stakeholder goals. As a result, even if the product did not effectively solve the underlying business problems or create a quantifiable business value for the organization, it could still pass its test. Because software development was not synchronized with the business process, software problems could be solved without actually solving business problems.

Third Assumption: Processes or Projects. A third unspoken assumption was that the software project was separate from the software process. Thus, a software process was understood as reflecting an area of computer science concern, but a software project was understood as a business school interest.
If one were a computer science specialist, one would view a quality software product as the outcome of a development process that involved the use of good algorithms, database design, and code. If one were an MIS specialist, one would view a successful software system as the result of effective software economics and software management. This dichotomy ignored the fact that the final product was identical regardless of who produced it or how it was produced. The assumption reinforced the unwise isolation of project management from the software development process, thus increasing the likelihood of product failure. In contrast to this assumption, interdisciplinary thinking combines the process with the project; computer science with the MIS approach; and software economics with software design and implementation in a unified approach. Just as in the case of the earlier

assumptions, this assumption overlooks the role of business in the software development process.

Fourth Assumption: Process Centered or Architecture Centered. There are currently two broad approaches in software engineering; one is process centered and the other is architecture centered. In process-centered software engineering, the quality of the product is seen as emerging from the quality of the process. This approach reflects the concerns and interests of industrial engineering, management, and standardized or systematic quality assurance approaches such as the Capability Maturity Model and ISO. The viewpoint is that obtaining quality in a product requires adopting and implementing a correct problem-solving approach. If a product contains an error, one should be able to attribute and trace it to an error that occurred somewhere during the application of the process by carefully examining each phase or step in the process.

In contrast, in architecture-centered software engineering, the quality of the software product is viewed as determined by the characteristics of the software design. Studies have shown that 60 to 70 percent of the faults detected in software projects are specification or design faults. Because these faults constitute such a large percentage of all faults within the final product, it is critical to implement design-quality metrics. Implementing design-quality assurance in software systems and adopting proper design metrics have become key to the development process because of their potential to provide timely feedback. This allows developers to reduce costs and development time by ensuring that the correct measurements are taken from the very beginning of the project, before actual coding commences.
Decisions about the architecture of the design have a major impact on the behavior of the resulting software, particularly the extent of development required; reliability; reusability; understandability; modifiability; and maintainability of the final product, characteristics that play a key role in assessing overall design quality. However, an architecture-centered approach has several drawbacks. In the first place, one only arrives at the design phase after a systematic process. The act or product of design is not just a model, design architecture, or pattern, but a solution to a problem that must be at least reasonably well defined. For example, establishing a functional design can be done by defining architectural structure charts, which in turn are based on previously determined data flow diagrams, after which a transformational or transitional method can be used to convert the data flow diagrams into structure charts. The data flow diagrams are outcomes of the requirements analysis process, based on a preliminary inspection of project feasibility. Similarly, designing object-oriented architectures in UML requires first building use-case scenarios and static object models prior to moving to the design phase.

A further point is that the design phase is a process involving architectural, interface, component, data structure, and database design (logical and physical). The design phase cannot be validated or verified without correlating or matching its outputs to the inputs of the software development process. Without a process design, one could end up building a model, pattern, or architecture that was irrelevant or at least ambivalent because of the lack of metrics for evaluating whether the design was adequate. In a comprehensive process model, such metrics are extracted from predesign and postdesign phases. Finally, a process is not merely a set of documents, but a problem-solving strategy encompassing every step needed to achieve a reliable software product that creates business value. A process has no value unless it designs quality solutions.

5. Describe the concept of Software technology as a limited business tool.

Software Technology as a Limited Business Tool: What Computers Cannot Do

Software technology enables business to solve problems more efficiently than otherwise; however, as with any tool, it has its limitations. Solving business problems involves many considerations that transcend hardware or software capabilities; thus, software solutions can only become effective when they are placed in the context of a more general problem-solving strategy. Software solutions should be seen as essential tools in problem solving that are to be combined with other interdisciplinary tools and capabilities. This kind of interoperation can be achieved by integrating such tools with the software development process. Additionally, the software development process can also be used as a part of a larger problem-solving process that analyzes business problems and designs and generates working solutions with maximum business value. Some examples of this are discussed in the following sections.

People Have Different Needs That Change over Time: Software technology is limited in its ability to recognize the application or cognitive stylistic differences of individuals or to adapt to the variety of individual needs and requirements. These differences among individuals have multiple causes and include:

Use of different cognitive styles when approaching problem solving
Variations in background, experience, levels and kinds of education, and, even more broadly, diversity in culture, values, attitudes, ethical standards, and religions

Different goals, ambitions, and risk-management strategies
Assorted levels of involvement and responsibilities in the business organization's processes

A software system is designed once to work with the entire business environment all the time. However, organizational needs are not stable and can change for many reasons, even over short periods of time, due to changes in personnel, task requirements, educational or training level, or experience. Designing a software system that can adjust, customize, or personalize to such a diversity of needs and variety of cognitive styles in different organizations and dispersed locations is an immense challenge. It entails building a customizable software system and also necessitates a continuous development process to adapt to ongoing changes in the nature of the environment.

Most Users Do Not Understand Computer Languages: A software solution can only be considered relevant and effective after one has understood the actual user problems. The people who write the source code for computer applications use technical languages to express the solution and, in some cases, they do not thoroughly investigate whether their final product reflects what users asked for. The final product is expected to convert or transform the users' language and expectations in a way that realizes the system's requirements. Otherwise, the system will be a failure in terms of meeting its stated goals appropriately and will fail its validation and verification criteria. In a utopian environment, end-users could become sufficiently knowledgeable in software development environments and languages that they could write their own software, ensuring systems were designed with their real needs in mind. Of course, by the very nature of the division of expertise, this could rarely happen, and so the distance in functional intention between user languages and their translation into programming languages is often considerable.
This creates a barrier between software solutions reaching their intended market and users and customers finding reliable solutions. In many ways, the ideal scenario, in which one approached system design and development from a user point of view, was one of the driving rationales behind the original development of the software engineering discipline. Software engineering was intended as a problem-solving framework that could bridge the gap between user languages (requirements) and computer languages (the final product or source code). In software engineering, the user's linguistic formulation of a problem is first understood and then specified naturally, grammatically, diagrammatically, mathematically, or even automatically; then, it is translated into a preliminary software architecture that can be coded in a programming

language. Thus, the underlying objective in software engineering is that the development solutions be truly reflective of user or customer needs.

Decisions and Problems: Complex and Ill-Structured. The existence of a negative correlation between organizational complexity and the impact of technical change (Keen 1981) is disputed. More complex organizations have more ill-structured problems (Mitroff & Turoff 1963). Consequently, their technical requirements in terms of information systems become harder to address. On the other hand, information technology may allow a complex organization to redesign its business processes so that it can manage complexity more effectively (Davenport & Stoddard 1994). On balance, a negative correlation is likely in complex organizations for many reasons.

First, the complexity of an organization increases the degree of ambiguity and equivocality in its operations (Daft & Lengel 1986). Many organizations will not invest resources sufficient to carry out an adequately representative analysis of a problem. Therefore, requirement specifications tend to become less accurate and concise. Implementing a system based on a poor systems analysis increases the likelihood of failure, as well as the likelihood of a lack of compatibility with the organization's diverse or competing needs. A demand for careful analysis and feasibility studies to allow a thorough determination of requirements might bring another dimension of complexity to the original problem.

Second, technology faces more people-based resistance in complex organizations (Markus 1983). This can occur because a newly introduced system has not been well engineered according to accurate requirements in the first place, as well as because of the combination of social, psychological, and political factors found in complex organizations. One further factor complicating the effective delivery of computerized systems in large projects is the time that it takes to get key people involved.
Finally, there are obvious differences in the rate of growth for complex organizations and information technology. Although information technology advances rapidly, complex organizations are subject to greater inertia and thus may change relatively slowly. Subsequently, incorporating or synthesizing technical change into an organization becomes a real challenge for individuals and departments and is affected by factors such as adaptability, training, the ability to upgrade, and maintainability. For such reasons, one expects a negative correlation between organizational complexity and the impact of technical change in terms of applying software technology and achieving intended organizational outcomes.

Businesses View Software Technology as a Black Box for Creating Economic Value

Although software systems play a significant role in business organizations in terms of business added value, the traditional focus of many organizations has been on their role in cost reduction because software automation can reduce error, minimize effort, and increase productivity. Innovative applications can enable organizations to achieve more than traditional software goals, including the ability to compete more effectively, maximize profitability, and solve complex business problems. Business goals extend beyond direct financial benefits to include operational metrics involving customer satisfaction, internal processes, and an organizations innovation and improvement activities. Indeed, such operational measures drive future financial performance (Van Der Zee & De Jong 1999). Efficiency, quality, and market share and penetration are other important goals and measures of business vitality (Singleton, McLean, & Altman 1988) that can be dramatically improved by software systems. Moreover, research has shown that organizational performance can be maximized by clearly recognizing the interdependence between social and technological subsystems (Ryan & Harrison 2000). Software systems with Web capabilities can enhance business added value even more effectively through their ability to reach customers, affiliate with partners, and enrich information (Evans & Wurster 1999). Although some small organizations use software systems only as one of many tools to achieve financial goals, many organizations have become partially or totally dependent on software systems. Comprehensive software solutions are becoming the standard in many large organizations in which carefully thought out, unified software architectures are used to address business problems in levels of complexity that range from the operational to upper management and strategic levels. 
When an organization decides to assess whether it should develop a software system, a feasibility study is usually carried out to compare costs to benefits. Based on evaluating the appropriate organizational criteria and financial metrics, managers can decide whether to move affirmatively towards selecting an information system from among various alternative options. Organizations look at software as a tool that can make their businesses better, their customers happier, and their shareholders wealthier. Three criteria used in recent research on assessing business value for IT-based systems are productivity, business profitability, and consumer surplus (Hitt & Brynjolfsson 1996). However, when a software system is being developed, the effective business value that it adds to the business performance of an organization tends to be neither explicitly addressed nor adequately quantified. The focus in software development is generally on technical metrics intended to assure the quality of the software product, mainly in terms of its reliability characteristics. This is because software value is typically measured in terms of its intangible rather than tangible benefits on business. If a software system is reliable and

robust, is tested, and can be maintained efficiently, it is assumed that it has business value regardless of the resultant business outcomes. The overall business effect on value is rarely considered, nor is the distance between the potential value of a system and its realized value (Davern & Kauffman 2000). Requirements validation is also an important metric when building software systems; however, the traditional forms of requirements focus on direct users' needs and overlook business value in terms of comprehensive and quantifiable measurements. Although project management and fiscally driven factors are part of the software engineering process, they are often not integrated well into the process. Moreover, a gap remains between the discipline of management information systems and the software development disciplines: MIS looks at solutions from a managerial perspective, but technical concerns are more influential for software development. The direct connection between software development and business performance is inadequate and is not well quantified or recognized as a core set of measures: general measures and e-measures. The arrows in Figure 6.1 are bidirectional because they reflect the mutual influences between the initial two variables of this framework. Business goals should be triggered to guide an optimal software development process. Thus, this framework represents a view of the initial impact of business metrics on the development process. The effect of the development process on business performance is also a key concern. Although many problem-solving strategies are used in software process modeling, the overall software process can be viewed in terms of certain basic elements or resources, such as activities, time, people, technology, and money.
To reduce costs or increase benefits, one can think of combining activities; minimizing the cycle time; reducing the number of staff involved; maximizing profit; restructuring the composition of capital and finance; managing risk; or utilizing more technology. When the software process is reconsidered in these terms, business performance and metrics become the decisive driving force for building software process models. Consequently, the software process has two related roles. The first role is internal: to assure software project payoff with better return on the information system investment, as discussed earlier. The second is external: the software process should make an actual difference in business performance. The first role has been addressed extensively in the software development and project management literature. However, few research efforts have been dedicated to the study of the external impact of the software process on business performance. In fact, these roles should always be combined because external impacts cannot be studied without considering internal impacts. Figure 6.2 depicts this dual approach. This view represents the integration of the process and project themes and describes the evolution of software process models over the last several

decades. Business value has always been embedded implicitly or explicitly in almost every advance in software process modeling. Minimization of time was behind the Rapid Application Development (RAD) and prototyping models. Risk control and reduction were major issues behind spiral models. The efficient use of human resources lies behind the dynamic models. The impact of user involvement in software process models reflects the importance of customer influence. Achieving competitive advantage in software systems is a key business value related to users and customers. However, little empirical examination of the effect of the different problem-solving strategies adopted in software process models has taken place. The interdependencies between the software process and business performance must be a key issue: the former is driven by the need for business value, and the latter in turn depends more than ever on software. The software process operates in a human environment that encompasses users, analysts, project managers, software engineers, customers, programmers, and other stakeholders. Computer systems are human inventions and do not function or interact without human input. Some manifestations of this dependency are: software applications are produced by people and are based on people's needs; software applications that do not create value will not survive in the marketplace; computers cannot elastically adjust to real situations (they work with preexisting code and prescribed user inputs); computers do not think, and in terms of expertise they reflect if-then inputs or stored knowledge-based experiences; and the main goal of software technology is to solve the problems of people. This dependency on the human environment makes the automation that computers facilitate meaningless without human involvement and underscores the limits of computer systems. It also highlights the central role that people play in making software technology an effective tool for producing desired outcomes.

6. Describe the round-trip problem solving approach.

Round-Trip Problem-Solving Approach

The software engineering process represents a round-trip framework for problem solving in a business context in several senses. The software engineering process is a problem-solving process, which entails that software engineering should incorporate or utilize the problem-solving literature regardless of its interdisciplinary sources. The value of software engineering derives from its success in solving business and human problems. This entails establishing strong relationships between the software process and the business metrics used to evaluate business processes in general. The software engineering process is a round-trip approach: it has a bidirectional character, which frequently requires adopting forward and reverse engineering strategies to restructure and reengineer information systems. It uses feedback control loops to ensure that specifications are accurately maintained across multiple process phases; reflective quality assurance is a critical metric for the process in general. The nonterminating, continuing character of the software development process is necessary to respond to ongoing changes in customer requirements and environmental pressures.

Master of Computer Application (MCA) Semester 3 MC0072 Computer Graphics 4 Credits

Assignment Set 1
MC0072 Computer Graphics (Book ID: B0810)

1. Write a short note on: a) Replicating pixels b) Moving pen c) Filling area between boundaries d) Approximation by thick polyline e) Line style and pen style.

a) Replicating pixels: A quick extension to the scan-conversion inner loop to write multiple pixels at each computed pixel works reasonably well for lines. Here, pixels are duplicated in columns for lines with -1 < slope < 1 and in rows for all other lines. The effect, however, is that the line ends are always vertical or horizontal, which is not pleasing for rather thick lines, as shown in the figure.

Furthermore, lines that are horizontal or vertical have a different thickness from lines at an angle, where the thickness of a primitive is defined as the distance between the primitive's boundaries perpendicular to its tangent. Thus, if the thickness parameter is t, a horizontal or vertical line has thickness t, whereas one drawn at 45° has an average thickness of t/√2. This is another result of having fewer pixels in the line at an angle, as noted earlier; it decreases the brightness contrast with horizontal and vertical lines of the same thickness. Still another problem with pixel replication is the generic problem of even-numbered widths: we cannot center the duplicated column or row about the selected pixel, so we must choose a side of the primitive to have an extra pixel. Altogether, pixel replication is an efficient but crude approximation that works best for primitives that are not very thick.

b) Moving pen: Choosing a rectangular pen whose center or corner travels along the single-pixel outline of the primitive works reasonably well for lines; it produces the line shown in the figure below.

Notice that this line is similar to that produced by pixel replication but is thicker at the endpoints. As with pixel replication, because the pen stays vertically aligned, the perceived thickness of the primitive varies as a function of the primitive's angle, but in the opposite way:

The width is thinnest for horizontal segments and thickest for segments with slope of ±1. An ellipse arc, for example, varies in thickness along its entire trajectory, being of the specified thickness when the tangent is nearly horizontal or vertical, and thickened by a factor of about √2 in between. This problem would be eliminated if the square turned to follow the path, but it is much better to use a circular cross-section so that the thickness is angle-independent. Now let's look at how to implement the moving-pen algorithm for the simple case of an upright rectangular or circular cross-section (also called a footprint) so that its center or corner is at the chosen pixel; for a circular footprint and a pattern drawn in opaque mode, we must in addition mask off the bits outside the circular region, which is not an easy task unless our low-level copyPixel has a write mask for the destination region. The brute-force copy solution writes pixels more than

once, since the pen's footprints overlap at adjacent pixels. A better technique, which also handles the circular-cross-section problem, is to use the spans of the footprint to compute spans for successive footprints at adjacent pixels.

d) Approximation by thick polyline: We can do piecewise-linear approximation of any primitive by computing points on the boundary (with floating-point coordinates), then connecting these points with line segments to form a polyline. The advantage of this approach is that the algorithms for both line clipping and line scan conversion (for thin primitives), and for polygon clipping and polygon scan conversion (for thick primitives), are efficient. Naturally, the segments must be quite short in places where the primitive changes direction rapidly. Ellipse arcs can be represented as ratios of parametric polynomials, which lend themselves readily to such piecewise-linear approximation. The individual line segments are then drawn as rectangles with the specified thickness. To make the thick approximation look nice, however, we must solve the problem of making thick lines join smoothly.

e) Line style and pen style: SRGP's line-style attribute can affect any outline primitive. In general, we must use conditional logic to test whether or not to write a pixel, writing only for 1s. We store the pattern write mask as a string of 16 booleans (e.g., a 16-bit integer); it should therefore repeat every 16 pixels. We modify the unconditional WritePixel statement in the line scan-conversion algorithm to handle this. There is a drawback to this technique, however. Since each bit in the mask corresponds to an iteration of the loop, and not to a unit distance along the line, the length of dashes varies with the angle of the line; a dash on a line at an angle is longer than a horizontal or vertical dash.
For engineering drawings, this variation is unacceptable, and the dashes must be calculated and scan-converted as individual line segments of a length invariant with angle. Thick lines are created as sequences of alternating solid and transparent rectangles whose vertices are calculated exactly as a function of the line style selected. The rectangles are then scan-converted individually; for horizontal and vertical lines, the program may be able to copyPixel the rectangle. Line style and pen style interact in thick outline primitives. The line style is used to calculate the rectangle for each dash, and each rectangle is filled with the selected pen pattern.

2. What is the DDA line drawing algorithm? Explain it with a suitable example and discuss the merits and demerits of the algorithm.

DDA Line Algorithm:
1. Read the line end points (x1, y1) and (x2, y2); if they are equal, plot that point and end.
2. Δx = |x2 - x1| and Δy = |y2 - y1|
3. If Δx ≥ Δy then Length = Δx else Length = Δy
4. Δx = (x2 - x1) / Length and Δy = (y2 - y1) / Length
[This makes either Δx or Δy equal to 1, because Length is either |x2 - x1| or |y2 - y1|; the incremental value for either x or y is 1.]
5. x = x1 + 0.5 * sign(Δx) and y = y1 + 0.5 * sign(Δy)

[Here the sign function makes the algorithm work in all quadrants. It returns -1, 0 or 1 depending on whether its argument is <0, =0 or >0 respectively. The factor 0.5 makes it possible to round the

values in the integer function rather than truncating them.]
6. i = 1 [begin the loop; in this loop points are plotted]
7. while (i ≤ Length)
   {
   Plot(Integer(x), Integer(y))
   x = x + Δx
   y = y + Δy
   i = i + 1
   }
8. Stop
Let us see a few examples to illustrate this algorithm. Ex.1: Consider the line from (0,0) to (4,6). Use the simple DDA algorithm to rasterize this line. Sol: Evaluating steps 1 to 5 of the DDA algorithm we have x1 = 0, x2 = 4, y1 = 0, y2 = 6, therefore Length = |y2 - y1| = 6

Δx = (x2 - x1) / Length = 4/6 = 0.67 and Δy = (y2 - y1) / Length = 6/6 = 1

Initial values: x = 0 + 0.5 * sign(4/6) = 0.5 and y = 0 + 0.5 * sign(1) = 0.5. Tabulating the results of each iteration of the loop we get:

The results are plotted as shown in the figure. It shows that the rasterized line lies on both sides of the actual line, i.e. the algorithm is orientation dependent.

Result of the simple DDA. Ex.2: Consider the line from (0,0) to (-6,-6). Use the simple DDA algorithm to rasterize this line. Sol: x1 = 0, x2 = -6, y1 = 0, y2 = -6. Therefore Length = |x2 - x1| = |y2 - y1| = 6 and Δx = Δy = -1. Initial values: x = 0 + 0.5 * sign(-1) = -0.5 and y = 0 + 0.5 * sign(-1) = -0.5. Tabulating the results of each iteration of the loop we get:

The results are plotted as shown in the figure. It shows that the rasterized line lies on the actual line, and it is a 45° line. Advantages of DDA Algorithm: 1. It is the simplest algorithm and it does not require special skills for implementation. 2. It is a faster method for calculating pixel positions than the direct use of the equation y = mx + b. It eliminates the multiplication in the equation by making use of raster characteristics, so that appropriate increments are applied in the x or y direction to find the pixel positions along the line path.

Disadvantages of DDA Algorithm 1. Floating point arithmetic in DDA algorithm is still time-consuming. 2. The algorithm is orientation dependent. Hence end point accuracy is poor.

3. Write a short note on: A) Reflection B) Shear C) Rotation about an arbitrary axis

A) Reflection: A reflection is a transformation that produces a mirror image of an object relative to an axis of reflection. We can choose an axis of reflection in the xy plane or perpendicular to the xy plane. The table below gives examples of some common reflections.

Reflection about the y axis: x' = -x, y' = y

B) Shear: A transformation that slants the shape of an object is called a shear transformation. Two common shearing transformations are used: one shifts x coordinate values and the other shifts y coordinate values. However, in both the

cases only one coordinate (x or y) changes its values and the other preserves its values. 1. X shear: The x shear preserves the y coordinates but changes the x values, which causes vertical lines to tilt right or left, as shown in fig. 6.7. The x-shear transformation is given as x' = x + shx * y, y' = y.

2. Y shear: The y shear preserves the x coordinates but changes the y values, which causes horizontal lines to transform into lines which slope up or down, as shown in the figure.

The y-shear transformation is given as x' = x, y' = y + shy * x.

C) Rotation about an arbitrary axis: A rotation matrix for any axis that does not coincide with a coordinate axis can be set up as a composite transformation involving combinations of translation and coordinate-axis rotations. In the special case where an object is to be rotated about an axis that is parallel to one of the coordinate axes, we can obtain the resultant coordinates with the following transformation sequence:
1. Translate the object so that the rotation axis coincides with the parallel coordinate axis.
2. Perform the specified rotation about that axis.
3. Translate the object so that the rotation axis is moved back to its original position.
When an object is to be rotated about an axis that is not parallel to one of the coordinate axes, we have to perform some additional transformations. The sequence of these transformations is given below:
1. Translate the object so that the rotation axis, specified by the unit vector u, passes through the coordinate origin.
2. Rotate the object so that the axis of rotation coincides with one of the coordinate axes; usually the z axis is preferred. To make the rotation axis coincide with the z axis, we first rotate the unit vector u about the x axis to bring it into the xz plane and then rotate it about the y axis to align it with the z axis.
3. Perform the desired rotation about the z axis.
4. Apply the inverse rotation about the y axis and then about the x axis to bring the rotation axis back to its original orientation.
5. Apply the inverse translation to move the rotation axis back to its original position.
As shown in the figure, the rotation axis is defined by two coordinate points P1 and P2, and the unit vector u is defined along the rotation axis as

u = V / |V| = (a, b, c), where V is the axis vector defined by the two points P1 and P2 as V = P2 - P1 = (x2 - x1, y2 - y1, z2 - z1). The components a, b and c of the unit vector u are the direction cosines of the rotation axis: a = (x2 - x1) / |V|, b = (y2 - y1) / |V|, c = (z2 - z1) / |V|.

4. Describe the following: A) Basic Concepts in Line Drawing B) Digital Differential Analyzer C) Bresenham's Line Drawing Algorithm

A) Basic Concepts in Line Drawing:

A line can be represented by the equation: y = m * x + b; where 'y' and 'x' are the co-ordinates; 'm' is the slope i.e. a quantity which indicates how much 'y' increases when 'x' increases by one unit. 'b' is the

intercept of the line on the 'y' axis (however, we can safely ignore it for now). So what's the trick behind the equation? Would you believe it, we already have the algorithm developed! Pay close attention to the definition of 'm'. It states: a quantity that indicates by how much 'y' changes when 'x' changes by one unit. So, instead of determining 'y' for every value of 'x', we will let 'x' have certain discrete values and determine 'y' for those values. Moreover, since we have the quantity 'm', we will increase 'x' by one and add 'm' to 'y' each time. This way we can easily plot the line. The only thing we are left with now is the actual calculations. Let us say we have two endpoints (xa,ya) and (xb,yb). Let us further assume that (xb-xa) is greater than (yb-ya) in magnitude and that (xa < xb). This means we will move from (xa) to the right towards (xb), finding 'y' at each point. The first thing, however, that we need to do is to find the slope 'm'. This can be done using the formula: m = (yb - ya) / (xb - xa). Now we can follow the algorithm below to draw our line.

1. Let R represent the row and C the column
2. Set C = Round(xa)
3. Let F = Round(xb)
4. Let H = ya
5. Find the slope m
6. Set R = Round(H)
7. Plot the point at R,C on the screen
8. Increment C {C+1}
9. If C <= F continue, else goto step (12)
10. Add m to H
11. Goto step (6)
12. STOP

B) Digital Differential Analyzer :

A hardware or software implementation of a digital differential analyzer (DDA) is used for linear interpolation of variables over an interval between start and end point. DDAs are used for rasterization of lines, triangles and polygons. In its simplest implementation the DDA algorithm interpolates

values in the interval [(xstart, ystart), (xend, yend)] by computing for each xi the equations xi = xi-1 + 1 and yi = yi-1 + m, where Δx = xend - xstart, Δy = yend - ystart and m = Δy / Δx.

The DDA method can be implemented using floating-point or integer arithmetic. The native floating-point implementation requires one addition and one rounding operation per interpolated value (e.g. coordinate x, y, depth, color component etc.) and output result. This process is only efficient when an FPU with fast add and rounding operations is available. The fixed-point integer implementation requires two additions per output cycle, and in case of fractional part overflow, one additional increment and subtraction. The probability of fractional part overflow is proportional to the ratio m of the interpolated start/end values. DDAs are well suited for hardware implementation and can be pipelined for maximized throughput. In the line equation y = mx + c, m represents the slope of the line and c is the y intercept. This slope can be expressed in the DDA as

m = (yend - ystart) / (xend - xstart)

In fact, any point (x, y) lying on this line segment should satisfy the equation.
C) Bresenham's Line Drawing Algorithm:

The common conventions will be used:

the top-left is (0,0) such that pixel coordinates increase in the right and down directions (e.g. that the pixel at (1,1) is directly above the pixel at (1,2)), and

that the pixel centers have integer coordinates.

The endpoints of the line are the pixels at (x0, y0) and (x1, y1), where the first coordinate of the pair is the column and the second is the row. The algorithm will be initially presented only for the octant in which the segment goes down and to the right (x0 ≤ x1 and y0 ≤ y1), and its horizontal projection x1 - x0 is longer than the vertical projection y1 - y0 (the line has a slope whose absolute value is less than 1 and greater than 0). In this octant, for each column x between x0 and x1, there is exactly one row y (computed by the algorithm) containing a pixel of the line, while each row between y0 and y1 may contain multiple rasterized pixels. Bresenham's algorithm chooses the integer y corresponding to the pixel center that is closest to the ideal (fractional) y for the same x; on successive columns y can remain the same or increase by 1. The general equation of the line through the endpoints is given by:

(y - y0) / (y1 - y0) = (x - x0) / (x1 - x0)

Since we know the column, x, the pixel's row, y, is given by rounding this quantity to the nearest integer:

y = round( ((y1 - y0) / (x1 - x0)) * (x - x0) + y0 )

The slope (y1 - y0) / (x1 - x0) depends on the endpoint coordinates only and can be precomputed, and the ideal y for successive integer values of x can be computed starting from y0 and repeatedly adding the slope. In practice, the algorithm can track, instead of possibly large y values, a small error value between -0.5 and 0.5: the vertical distance between the rounded and the exact y values for the current x. Each time x is increased, the error is increased by the slope; if it exceeds 0.5, the rasterization y is increased by 1 (the line continues on the next lower row of the raster) and the error is decremented by 1.0.

In the following pseudocode sample plot(x,y) plots a point and abs returns absolute value:

function line(x0, x1, y0, y1)
    int deltax := x1 - x0
    int deltay := y1 - y0
    real error := 0
    // Assume deltax != 0 (line is not vertical); note that this division
    // needs to be done in a way that preserves the fractional part
    real deltaerr := abs(deltay / deltax)
    int y := y0
    for x from x0 to x1
        plot(x, y)
        error := error + deltaerr
        if error >= 0.5 then
            y := y + 1
            error := error - 1.0

Assignment Set 2
MC0072 Computer Graphics (Book ID: B0810)

1. Write a short note on the followings: A) Video mixing B) Frame buffer C) Color table
A) Video mixing

The main concept of a professional vision mixer is the bus, basically a row of buttons with each button representing a video source. Pressing such a button will select the video out of that bus. Older video mixers had two equivalent buses (called the A and B bus; such a mixer is known as an A/B mixer). One of these buses could be selected as the main out (or program) bus. Most modern mixers, however, have one bus that is always the program bus, the second main bus being the preview (sometimes called preset) bus. These mixers are called flip-flop mixers, since the selected sources of the preview and program buses can be exchanged. Both the preview and program bus usually have their own video monitor. Another main feature of a vision mixer is the transition lever, also called a T-bar or fader bar. This lever, similar to an audio fader, creates a smooth transition between two buses. Note that in a flip-flop mixer, the position of the main transition lever does not indicate which bus is active, since the program

bus is always the active or hot bus. Instead of moving the lever by hand, a button (commonly labeled "mix", "auto" or "auto trans") can be used, which performs the transition over a user-defined period of time. Another button, usually labeled "cut" or "take", directly swaps the preview to the program without any transition. The type of transition used can be selected in the transition section. Common transitions include dissolves (similar to an audio crossfade) and pattern wipes. The third bus used for compositing is the key bus. A mixer can actually have more than one key bus, but they usually share only one set of buttons. Here, one signal can be selected for keying over the program bus. The digital on-screen graphic image that will be seen in the program is called the fill, while the mask used to cut the key's translucence is called the source. This source, e.g. chrominance, luminance, pattern (the internal pattern generator is used) or split (an additional video signal similar to an alpha channel is used), can be selected in the keying section of the mixer. Note that instead of the key bus, other video sources can be selected for the fill signal, but the key bus is usually the most convenient method for selecting a key fill. Usually, a key is turned on and off the same way a transition is. For this, the transition section can be switched from program (or background) mode to key mode. Often, the transition section allows background video and one or more keyers to be transitioned separately or in any combination with one push of the "auto" button. These three main buses together form the basic mixer section called Program/Preset or P/P. Bigger production mixers may have a number of additional sections of this type, which are called Mix/Effects (M/E for short) and numbered.
Any M/E section can be selected as a source in the P/P stage, making the mixer operations much more versatile, since effects or keys can be composed "offline" in an M/E and then go "live" at the push of one button. After the P/P section, there is another keying stage called the downstream keyer (DSK). It is mostly used for keying text or graphics, and has its own "Cut" and "Mix" buttons. The signal before the DSK keyer is called clean feed. After the DSK is one last stage that overrides any signal with black, usually called Fade To Black or FTB. Modern vision mixers may also have additional functions, such as serial communications with the ability to use proprietary communications protocols, control aux channels for routing video signals to other sources than the program

out, macro programming, and DVE (Digital Video Effects) capabilities. Mixers are often equipped with effects memory registers, which can store a snapshot of any part of a complex mixer configuration and then recall the setup with one button press.
B) Frame buffer

A framebuffer is a video output device that drives a video display from a memory buffer containing a complete frame of data. The information in the memory buffer typically consists of color values for every pixel (point that can be displayed) on the screen. Color values are commonly stored in 1-bit binary (monochrome), 4-bit palettized, 8-bit palettized, 16-bit highcolor and 24-bit truecolor formats. An additional alpha channel is sometimes used to retain information about pixel transparency. The total amount of memory required to drive the framebuffer depends on the resolution of the output signal, and on the color depth and palette size. Framebuffers differ significantly from the vector displays that were common prior to the advent of the framebuffer. With a vector display, only the vertices of the graphics primitives are stored. The electron beam of the output display is then commanded to move from vertex to vertex, tracing an analog line across the area between these points. With a framebuffer, the electron beam (if the display technology uses one) is commanded to trace a left-to-right, top-to-bottom path across the entire screen, the way a television renders a broadcast signal. At the same time, the color information for each point on the screen is pulled from the framebuffer, creating a set of discrete picture elements (pixels).
2. Describe the following with respect to methods of generating characters: A) Stroke method B) Starbust method C) Bitmap method

A) Stroke method:

This method uses small line segments to generate a character. The small series of line segments are drawn like the strokes of a pen to form a character, as shown in the figure above. We can build our own stroke-method character generator by calls to the line drawing algorithm. Here it is necessary to decide which line segments are needed for each character and then to draw these segments using the line drawing algorithm.

B) Starbust method: In this method a fixed pattern of line segments is used to generate characters. As shown in fig. 5.20, there are 24 line segments. Out of these 24 line segments, the segments required to display a particular character are highlighted. This method of character generation is called the starbust method because of its characteristic appearance.

The figure shows the starbust patterns for characters A and M. The pattern for a particular character is stored in the form of a 24-bit code, each bit representing one line segment. A bit is set to one to highlight the line segment; otherwise it is set to zero. For example, the 24-bit code for character A is 0011 0000 0011 1100 1110 0001 and for character M is 0000 0011 0000 1100 1111 0011. This method of character generation has some disadvantages: 1. 24 bits are required to represent a character, hence more memory is required. 2. It requires code conversion software to display a character from its 24-bit code. 3. Character quality is poor; it is worst for curve-shaped characters.

C) Bitmap method: The third method for character generation is the bitmap method. It is also called the dot matrix method because characters are represented by an array of dots in matrix form. It is a two-dimensional array having columns and rows. A 5 × 7 array is commonly used to represent characters, as shown in fig. 5.21. However, 7 × 9 and 9 × 13 arrays are also used. Higher-resolution devices such as inkjet or laser printers may use character arrays that are over 100 × 100.

Character A in 5 × 7 dot matrix format. Each dot in the matrix is a pixel. The character is placed on the screen by copying pixel values from the character array into some portion of the screen's frame buffer. The value of the pixel controls the intensity of the pixel.
3. Discuss the homogeneous coordinates for translation, rotation and shear

In linear algebra, linear transformations can be represented by matrices. If T is a linear transformation mapping R^n to R^m and x is a column vector with n entries, then

T(x) = Ax

for some m × n matrix A, called the transformation matrix of T. There is an alternative expression of transformation matrices involving row vectors that is preferred by some authors.

Translation: Essentially, translation refers to adding a vector to each point in a shape. Imagine a point at (0, 0, 0). If you wanted to translate that point up one unit, you could convert the point to a vector and add another vector to it, like this:

    [ 0 ]   [ 0 ]   [ 0 ]
    [ 0 ] + [ 1 ] = [ 1 ]
    [ 0 ]   [ 0 ]   [ 0 ]

In essence, translating an object amounts to adding a vector to each point in the object.

Rotation: To consider 3D rotation around the Z axis, all you need to do is recognize that the z value does not change at all. So a 3D rotation around the Z axis can be calculated with three formulas:

x' = (x * cos(a)) + (y * -sin(a))
y' = (x * sin(a)) + (y * cos(a))
z' = z

Likewise, a rotation around the Y axis leaves the Y value alone, but involves trig functions on the X and Z values:

x' = (x * cos(a)) + (z * sin(a))
y' = y
z' = (x * -sin(a)) + (z * cos(a))

Rotation about the X axis uses a similar set of formulas:

x' = x
y' = (y * cos(a)) + (z * -sin(a))
z' = (y * sin(a)) + (z * cos(a))

Don't get hung up on memorizing these formulas; you can look them up when you need them. The more important thing is to understand the pattern. You'll see a technique for combining these formulas into a cleaner structure in a few minutes. For the time being, note that there is a consistent pattern emerging.

Shearing: For shear mapping (visually similar to slanting), there are two possibilities. A shear parallel to the x axis has x' = x + ky and y' = y; the shear matrix, applied to column vectors, is:

    [ 1  k ]
    [ 0  1 ]

A shear parallel to the y axis has x' = x and y' = y + kx, which has matrix form:

    [ 1  0 ]
    [ k  1 ]

4. Describe the following with respect to Projection: A) Parallel B) Types of Parallel Projections

Parallel projections have lines of projection that are parallel both in reality and in the projection plane. Parallel projection corresponds to a perspective projection with an infinite focal length (the distance from the image plane to the projection point), or "zoom". Within parallel projection there is an ancillary category known as "pictorials". Pictorials show an image of an object as viewed from a skew direction in order to reveal all three directions (axes) of space in one picture. Because pictorial projections innately contain this distortion, some liberties may be taken in drawing pictorials for economy of effort and best effect.

Types of Parallel Projections:

Axonometric projection
Axonometric projection is a type of orthographic projection where the plane or axis of the object depicted is not parallel to the projection plane, such that multiple sides of an object are visible in the same image. It is further subdivided into three groups: isometric, dimetric and trimetric projection, depending on the exact angle at which the view deviates from the orthogonal. A typical characteristic of axonometric pictorials is that one axis of space is usually displayed as vertical.

Comparison of several types of graphical projection.

Isometric projection
In isometric pictorials (for protocols see isometric projection), the most common form of axonometric projection,[3] the direction of viewing is such that the three axes of space appear equally foreshortened; the angles among them and the scale of foreshortening are universally known. However, in creating a final isometric instrument drawing, a full-size scale, i.e., without using a foreshortening factor, is in most cases employed to good effect because the resultant distortion is difficult to perceive.

Dimetric projection
In dimetric pictorials (for protocols see dimetric projection), the direction of viewing is such that two of the three axes of space appear equally foreshortened, of which the attendant scale and angles of presentation are determined according to the angle of viewing; the scale of the third direction (vertical) is determined separately. Approximations are common in dimetric drawings.

Trimetric projection
In trimetric pictorials (for protocols see trimetric projection), the direction of viewing is such that all of the three axes of space appear unequally foreshortened. The scale along each of the three axes and the angles among them are determined separately as dictated by the angle of viewing. Approximations in trimetric drawings are common, and trimetric perspective is seldom used.

Oblique projection
In oblique projections the parallel projection rays are not perpendicular to the viewing plane as with orthographic projection, but strike the projection plane at an angle other than ninety degrees. In both orthographic and oblique projection, parallel lines in space appear parallel on the projected image. Because of its simplicity, oblique projection is used exclusively for pictorial purposes rather than for formal, working drawings. In an oblique pictorial drawing, the displayed angles among the axes as well as the foreshortening factors (scale) are arbitrary. The distortion created thereby is usually attenuated by aligning one plane of the imaged object to be parallel with the plane of projection, thereby creating a true-shape, full-size image of the chosen plane. Special types of oblique projections are cavalier projection and cabinet projection.

Master of Computer Application (MCA) Semester 3 MC0073 System Programming 4 Credits

Assignment Set 1
MC0073 System Programming (Book ID: B0811)
1) Explain the following: A) Lexical Analysis B) Syntax Analysis

A) Lexical Analysis: Lexical analysis is the process of converting a sequence of characters into a sequence of tokens. A program or function which performs lexical analysis is called a lexical analyzer, lexer or scanner. A lexer often exists as a single function which is called by a parser or another function.

B) Syntax Analysis:

Parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens (for example, words), to determine its grammatical structure with respect to a given (more or less) formal grammar. Parsing can also be used as a linguistic term, for instance when discussing how phrases are divided up in garden path sentences.

Parsing is also an earlier term for the diagramming of sentences of natural languages, and is still used for the diagramming of inflected languages, such as the Romance languages or Latin. The term parsing comes from Latin pars (orationis), meaning part (of speech). Parsing is a common term used in psycholinguistics when describing language comprehension. In this context, parsing refers to the way that human beings, rather than computers, analyze a sentence or phrase (in spoken language or text) "in terms of grammatical constituents, identifying the parts of speech, syntactic relations, etc." This term is especially common when discussing what linguistic cues help speakers to parse garden-path sentences.

2. What is RISC and how is it different from CISC?

CISC means Complex Instruction Set Computer. A CISC system has complex instructions such as direct addition between data in two memory locations. Typically CISC chips have a large number of different and complex instructions. The philosophy behind it is that hardware is always faster than software, therefore one should make a powerful instruction set, which provides programmers with assembly instructions to do a lot with short programs. Common CISC chips are relatively slow (compared to RISC chips) per instruction, but use fewer instructions than RISC.

RISC means Reduced Instruction Set Computer. A RISC system has a reduced number of instructions and, more importantly, it is a load-store architecture in which pipelining can be implemented easily. The idea is that fewer, simpler and faster instructions are better than the large, complex and slower CISC instructions; however, more instructions are needed to accomplish a task. Another advantage of RISC is that, in theory, because of the simpler instructions, RISC chips require fewer transistors, which makes them easier to design and cheaper to produce.

RISC vs CISC: There is still considerable controversy among experts about which architecture is better. Some say that RISC is cheaper and faster and therefore the architecture of the future. Others note that by making the hardware simpler, RISC puts a greater burden on the software: software needs to become more complex, and developers need to write more instructions for the same tasks. They argue that RISC is not the architecture of the future, since conventional CISC chips are becoming faster and cheaper anyway.

RISC has now existed for more than 10 years and hasn't been able to push CISC out of the market. Setting the embedded market aside and looking mainly at the market for PCs, workstations and servers, at least 75% of processors are based on the CISC architecture. Most of them follow the x86 standard (Intel, AMD, etc.), and even in mainframe territory CISC is dominant via the IBM/390 chip. It looks like CISC is here to stay.

Is RISC then really not better? The answer isn't quite that simple. RISC and CISC architectures are becoming more and more alike. Many of today's RISC chips support just as many instructions as yesterday's CISC chips. The PowerPC 601, for example, supports more instructions than the Pentium, yet the 601 is considered a RISC chip while the Pentium is definitely CISC. Furthermore, today's CISC chips use many techniques formerly associated with RISC chips. Simply said, RISC and CISC are growing toward each other.

x86: An important factor is that the x86 standard, as used by for instance Intel and AMD, is based on CISC architecture. x86 is the standard for home PCs; Windows 95 and 98 won't run on any other platform. Therefore companies like AMD and Intel will not abandon the x86 market overnight even if RISC were more powerful.
Changing their chips in such a way that on the outside they stay compatible with the CISC x86 standard, but use a RISC architecture inside, is difficult and introduces all kinds of overhead that could undo the possible gains. Nevertheless, Intel and AMD are doing this more or less with their current CPUs: most acceleration mechanisms available to RISC CPUs are now available to x86 CPUs as well.

Since competition in the x86 market is fierce, prices are low, even lower than for most RISC CPUs. Although RISC prices are dropping too, a SUN UltraSPARC, for instance, is still more expensive than a PII workstation of equal integer performance. In the floating-point area RISC still holds the crown, though CISC's 7th-generation x86 chips like the K7 will catch up with that. The one exception to this might be the Alpha EV-6: those machines are overall about twice as fast as the fastest x86 CPU available, but such an Alpha chip costs about 20000, not something you're willing to pay for a home PC. It is perhaps no coincidence that AMD's K7 was developed in co-operation with Alpha and is for a large part based on the same Alpha EV-6 technology.

EPIC: The biggest threat for CISC and RISC might not be each other, but a new technology called EPIC. EPIC stands for Explicitly Parallel Instruction Computing; as the word "parallel" suggests, EPIC can execute many instructions in parallel. EPIC was created by Intel and is in a way a combination of both CISC and RISC. In theory this will allow the processing of Windows-based as well as UNIX-based applications by the same CPU.

3. Explain the following with respect to the design specifications of an Assembler: A) Data Structures B) Pass 1 & Pass 2 Assembler flowcharts

Data Structures: The second step in our design procedure is to establish the databases that we have to work with.

Pass 1 Data Structures:
1. Input source program.

2. A Location Counter (LC), used to keep track of each instruction's location.
3. A table, the Machine-Operation Table (MOT), that indicates the symbolic mnemonic for each instruction and its length (two, four, or six bytes).
4. A table, the Pseudo-Operation Table (POT), that indicates the symbolic mnemonic and action to be taken for each pseudo-op in pass 1.
5. A table, the Symbol Table (ST), that is used to store each label and its corresponding value.
6. A table, the Literal Table (LT), that is used to store each literal encountered and its corresponding assigned location.
7. A copy of the input to be used by pass 2.

Pass 2 Data Structures:
1. A copy of the source program input to pass 1.
2. The Location Counter (LC).
3. A table, the Machine-Operation Table (MOT), that indicates for each instruction its symbolic mnemonic, length (two, four, or six bytes), binary machine opcode and instruction format.
4. A table, the Pseudo-Operation Table (POT), that indicates the symbolic mnemonic and action to be taken for each pseudo-op in pass 2.
5. A table, the Symbol Table (ST), prepared by pass 1, containing each label and its corresponding value.
6. A table, the Base Table (BT), that indicates which registers are currently specified as base registers by USING pseudo-ops and what the specified contents of these registers are.
7. A work space, INST, that is used to hold each instruction as its various parts are being assembled together.
8. A work space, PRINT LINE, used to produce a printed listing.
9. A work space, PUNCH CARD, used prior to actual outputting for converting the assembled instructions into the format needed by the loader.

10. An output deck of assembled instructions in the format needed by the loader.

Fig. 1.3: Data structures of the assembler

Format of Data Structures: The third step in our design procedure is to specify the format and content of each of the data structures. Pass 2 requires a machine-operation table (MOT) containing the name, length, binary code and format; pass 1 requires only name and length. Instead of using two different tables, we construct a single MOT. The Machine-Operation Table (MOT) and Pseudo-Operation Table are examples of fixed tables: their contents are not filled in or altered during the assembly process. The following figure depicts the format of the MOT (6 bytes per entry):

Mnemonic opcode     Binary opcode    Instruction length   Instruction format   Not used
(4 bytes, chars)    (1 byte, hex)    (2 bits, binary)     (3 bits, binary)     (3 bits)
Abbb                5A               10                   001
AHbb                4A               10                   001
ALbb                5E               10                   001
ALRb                1E               01                   000

(b represents a blank)

3.3.2 The flowchart for Pass 1:

The primary function performed by the analysis phase is the building of the symbol table. For this purpose it must determine the addresses with which the symbolic names used in a program are associated. It is possible to determine some addresses directly, e.g. the address of the first instruction in the program; however, others must be inferred. To implement memory allocation a data structure called the location counter (LC) is introduced. The location counter always contains the address of the next memory word in the target program. It is initialized to a constant. Whenever the analysis phase sees a label in an assembly statement, it enters the label and the contents of the LC in a new entry of the symbol table. It then finds the number of memory words required by the assembly statement and updates the LC contents. This ensures that the LC points to the next memory word in the target program even when machine instructions have different lengths and DS/DC statements reserve different amounts of memory. To update the contents of the LC, the analysis phase needs to know the lengths of different instructions. This information depends only on the assembly language, hence the mnemonics table can be extended to include it in a new field called length. We refer to the processing involved in maintaining the location counter as LC processing.

Flowchart for Pass 2

Fig. 1.6: Pass 2 flowchart

4. Define the following: A) Parsing B) Scanning C) Token

A) Parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens (for example, words), to determine its grammatical structure with respect to a given (more or less) formal grammar. Parsing can also be used as a linguistic term, for instance when discussing how phrases are divided up in garden path sentences. Parsing is also an earlier term for the diagramming of sentences of natural languages, and is still used for the diagramming of inflected languages, such as the Romance languages or Latin. The term parsing comes from Latin pars (orationis), meaning part (of speech).

Parsing is a common term used in psycholinguistics when describing language comprehension. In this context, parsing refers to the way that human beings, rather than computers, analyze a sentence or phrase (in spoken language or text) "in terms of grammatical constituents, identifying the parts of speech, syntactic relations, etc." This term is especially common when discussing what linguistic cues help speakers to parse garden-path sentences.

B) Scanner: A scanner is usually based on a finite-state machine (FSM). It has encoded within it information on the possible sequences of characters that can be contained within any of the tokens it handles (individual instances of these character sequences are known as lexemes). For instance, an integer token may contain any sequence of numerical digit characters. In many cases, the first non-whitespace character can be used to deduce the kind of token that follows, and subsequent input characters are then processed one at a time until reaching a character that is not in the set of characters acceptable for that token (this is known as the maximal munch rule, or longest match rule). In some languages the lexeme creation rules are more complicated and may involve backtracking over previously read characters.

C) Token: A token is a string of characters, categorized according to the rules as a symbol (e.g., IDENTIFIER, NUMBER, COMMA). The process of forming tokens from an input stream of characters is called tokenization, and the lexer categorizes them according to a symbol type. A token can look like anything that is useful for processing an input text stream or text file. A lexical analyzer generally does nothing with combinations of tokens, a task left for a parser. For example, a typical lexical analyzer recognizes parentheses as tokens, but does nothing to ensure that each '(' is matched with a ')'.

Consider this expression in the C programming language: sum=3+2; It is tokenized as in the following table:

Lexeme   Token type
sum      Identifier
=        Assignment operator
3        Number
+        Addition operator
2        Number
;        End of statement

Tokens are frequently defined by regular expressions, which are understood by a lexical analyzer generator such as lex. The lexical analyzer (either generated automatically by a tool like lex, or hand-crafted) reads in a stream of characters, identifies the lexemes in the stream, and categorizes them into tokens. This is called "tokenizing". If the lexer finds an invalid token, it will report an error. Following tokenizing is parsing. From there, the interpreted data may be loaded into data structures for general use, interpretation, or compiling.

5. Describe the process of Bootstrapping in the context of Linkers.

Bootstrapping: In computing, bootstrapping refers to a process where a simple system activates another, more complicated system that serves the same purpose. It is a solution to the chicken-and-egg problem of starting a certain system without the system already functioning. The term is most often applied to the process of starting up a computer, in which a mechanism is needed to execute the software

program that is responsible for executing software programs (the operating system).

Bootstrap loading: The discussions of loading up to this point have all presumed that there's already an operating system or at least a program loader resident in the computer to load the program of interest. The chain of programs being loaded by other programs has to start somewhere, so the obvious question is how the first program is loaded into the computer. In modern computers, the first program the computer runs after a hardware reset invariably is stored in a ROM known as the bootstrap ROM, as in "pulling one's self up by the bootstraps."

When the CPU is powered on or reset, it sets its registers to a known state. On x86 systems, for example, the reset sequence jumps to the address 16 bytes below the top of the system's address space. The bootstrap ROM occupies the top 64K of the address space and the ROM code then starts up the computer. On IBM-compatible x86 systems, the boot ROM code reads the first block of the floppy disk into memory, or, if that fails, the first block of the first hard disk, into memory location zero and jumps to location zero. The program in block zero in turn loads a slightly larger operating system boot program from a known place on the disk into memory, and jumps to that program, which in turn loads the operating system and starts it. (There can be even more steps, e.g., a boot manager that decides from which disk partition to read the operating system boot program, but the sequence of increasingly capable loaders remains.)

Why not just load the operating system directly? Because you can't fit an operating system loader into 512 bytes. The first-level loader typically is only able to load a single-segment program from a file with a fixed name in the top-level directory of the boot disk.
The operating system loader contains more sophisticated code that can read and interpret a configuration file, uncompress a compressed operating system executable, and address large amounts of memory (on an x86 the loader usually runs in real mode, which means that it's tricky to address more than 1MB of memory). The full operating system can turn on the virtual memory system, load the drivers it needs, and then proceed to run user-level programs.

Many Unix systems use a similar bootstrap process to get user-mode programs running. The kernel creates a process, then stuffs a tiny program, only a few dozen bytes long, into that process. The tiny program executes a system call that runs /etc/init, the user-mode initialization program that in turn runs configuration files and starts the daemons and login programs that a running system needs.

None of this matters much to the application-level programmer, but it becomes more interesting if you want to write programs that run on the bare hardware of

the machine, since then you need to arrange to intercept the bootstrap sequence somewhere and run your program rather than the usual operating system. Some systems make this quite easy (just stick the name of your program in AUTOEXEC.BAT and reboot Windows 95, for example); others make it nearly impossible. It also presents opportunities for customized systems. For example, a single-application system could be built over a Unix kernel by naming the application /etc/init.

Software bootstrapping and compiler bootstrapping: Bootstrapping can also refer to the development of successively more complex, faster programming environments. The simplest environment might be a very basic text editor (e.g. ed) and an assembler program. Using these tools, one can write a more complex text editor, and a simple compiler for a higher-level language, and so on, until one has a graphical IDE and an extremely high-level programming language.

Compiler bootstrapping: In compiler design, a bootstrap or bootstrapping compiler is a compiler that is written in the target language, or a subset of the language, that it compiles. Examples include gcc, GHC, OCaml, BASIC, PL/I and, more recently, the Mono C# compiler.

6. Describe the procedure for design of a Linker.

A linker or link editor is a program that takes one or more objects generated by a compiler and combines them into a single executable program. In IBM mainframe environments such as OS/360 this program is known as a linkage editor. On Unix variants the term loader is often used as a synonym for linker. Other terminology was in use too; for example, on SINTRAN III the process performed by a linker (assembling object files into a program) was called loading (as in loading executable code onto a file). Because this usage blurs the distinction between the compile-time process and the run-time process, this article will use linking for the former and loading for the latter. In some operating systems, however, the same program handles both the jobs of linking and loading a program.

An illustration of the linking process. Object files and static libraries are assembled into a new library or executable.

Assignment Set 2
MC0073 System Programming (Book ID: B0811)
1. Discuss the various addressing modes for CISC.

Addressing Modes of CISC: The 68000 (Motorola) addressing modes include Register to Register, Register to Memory, Memory to Register, and

Memory to Memory. The 68000 supports a wide variety of addressing modes:

Immediate mode: the operand immediately follows the instruction.

Absolute address: the address (in either the "short" 16-bit form or "long" 32-bit form) of the operand immediately follows the instruction.

Program counter relative with displacement: a displacement value is added to the program counter to calculate the operand's address. The displacement can be positive or negative.

Program counter relative with index and displacement: the instruction contains both the identity of an "index register" and a trailing displacement value. The contents of the index register, the displacement value, and the program counter are added together to get the final address.

Register direct: the operand is contained in an address or data register.

Address register indirect: an address register contains the address of the operand.

Address register indirect with predecrement or postincrement: an address register contains the address of the operand in memory. With the predecrement option set, a predetermined value is subtracted from the register before the (new) address is used. With the postincrement option set, a predetermined value is added to the register after the operation completes.

Address register indirect with displacement: a displacement value is added to the register's contents to calculate the operand's address. The displacement can be positive or negative.

Address register relative with index and displacement: the instruction contains both the identity of an "index register" and a trailing displacement value. The contents of the index register, the displacement value, and the specified address register are added together to get the final address.

2. Write about Deterministic and Non-Deterministic Finite Automata with suitable numerical examples.

A deterministic finite automaton (DFA), also known as a deterministic finite state machine, is a finite state machine accepting finite strings of symbols. For each state, there is a transition arrow leading out to a next state for each symbol.

Upon reading a symbol, a DFA jumps deterministically from one state to another by following the transition arrow. Deterministic means that there is only one outcome (i.e. move to the next state when the symbol matches (S0 -> S1) or move back to the same state (S0 -> S0)). A DFA has a start state (denoted graphically by an arrow coming in from nowhere) where computations begin, and a set of accept states (denoted graphically by a double circle) which help define when a computation is successful. DFAs recognize exactly the set of regular languages, which are, among other things, useful for doing lexical analysis and pattern matching. A DFA can be used in either an accepting mode, to verify that an input string is indeed part of the language it represents, or a generating mode, to create a list of all the strings in the language. A DFA is defined as an abstract mathematical concept, but due to its deterministic nature, it is implementable in hardware and software for solving various specific problems. For example, a software state machine that decides whether or not online user input such as phone numbers and email addresses is valid can be modeled as a DFA. Another example in hardware is the digital logic circuitry that controls whether an automatic door is open or closed, using input from motion sensors or pressure pads to decide whether or not to perform a state transition.

Formal definition: A deterministic finite automaton M is a 5-tuple (Q, Σ, δ, q0, F), consisting of

a finite set of states (Q)
a finite set of input symbols called the alphabet (Σ)
a transition function (δ : Q × Σ → Q)
a start state (q0 ∈ Q)
a set of accept states (F ⊆ Q)

Let w = a1a2 ... an be a string over the alphabet Σ. The automaton M accepts the string w if a sequence of states r0, r1, ..., rn exists in Q with the following conditions:
1. r0 = q0
2. ri+1 = δ(ri, ai+1), for i = 0, ..., n−1
3. rn ∈ F.

In words, the first condition says that the machine starts in the start state q0. The second condition says that given each character of string w, the machine will transition from state to state according to the transition function δ. The last condition says that the machine accepts w if the last input of w causes the machine to halt in one of the accepting states. Otherwise, it is said that the automaton rejects the string. The set of strings M accepts is the language recognized by M, and this language is denoted by L(M). A deterministic finite automaton without accept states and without a starting state is known as a transition system or semiautomaton. For a more comprehensive introduction to the formal definition see automata theory.

DFAs can be built from nondeterministic finite automata through the powerset construction.

An example of a Deterministic Finite Automaton that accepts only binary numbers that are multiples of 3. The state S0 is both the start state and an accept state.

Example The following example is of a DFA M, with a binary alphabet, which requires that the input contains an even number of 0s.

The state diagram for M

M = (Q, Σ, δ, q0, F) where


Q = {S1, S2},
Σ = {0, 1},
q0 = S1,
F = {S1}, and
δ is defined by the following state transition table:

      0    1
S1    S2   S1
S2    S1   S2

The state S1 represents that there has been an even number of 0s in the input so far, while S2 signifies an odd number. A 1 in the input does not change the state of the automaton. When the input ends, the state will show whether the input contained an even number of 0s or not. If the input did contain an even number of 0s, M will finish in state S1, an accepting state, so the input string will be accepted. The language recognized by M is the regular language given by the regular expression 1*( 0 (1*) 0 (1*) )*, where "*" is the Kleene star, e.g., 1* denotes any non-negative number (possibly zero) of symbols "1".

A nondeterministic finite automaton (NFA), or nondeterministic finite state machine, is a finite state machine where from each state and a given input symbol the automaton may jump into several possible next states. This distinguishes it from the deterministic finite automaton (DFA), where the next possible state is uniquely determined. Although the DFA and NFA have distinct definitions, an NFA can be translated to an equivalent DFA using the powerset construction, i.e., the constructed DFA and the NFA recognize the same formal language. Both types of automata recognize only regular languages. Nondeterministic finite automata were introduced in 1959 by Michael O. Rabin and Dana Scott,[1] who also showed their equivalence to deterministic finite automata. Non-deterministic finite state machines are sometimes studied under the name subshifts of finite type. Non-deterministic finite state machines are generalized by probabilistic automata, which assign a probability to each state transition.

Formal definition: An NFA is represented formally by a 5-tuple (Q, Σ, δ, q0, F), consisting of

- a finite set of states Q
- a finite set of input symbols Σ
- a transition function Δ : Q × Σ → P(Q)
- an initial (or start) state q0 ∈ Q
- a set of states F distinguished as accepting (or final) states, F ⊆ Q.

Here, P(Q) denotes the power set of Q. Let w = a1a2 ... an be a word over the alphabet Σ. The automaton M accepts the word w if a sequence of states r0, r1, ..., rn exists in Q with the following conditions:
1. r0 = q0
2. ri+1 ∈ Δ(ri, ai+1), for i = 0, ..., n−1
3. rn ∈ F.

In words, the first condition says that the machine starts in the start state q0. The second condition says that given each character of string w, the machine will transition from state to state according to the transition relation . The last

condition says that the machine accepts w if the last input of w causes the machine to halt in one of the accepting states. Otherwise, it is said that the automaton rejects the string. The set of strings M accepts is the language recognized by M, and this language is denoted by L(M). For a more comprehensive introduction to the formal definition, see automata theory.

NFA-ε: The NFA-ε (also sometimes called NFA-λ, or NFA with epsilon moves) replaces the transition function with one that allows the empty string ε as a possible input, so that one has instead Δ : Q × (Σ ∪ {ε}) → P(Q). It can be shown that ordinary NFA and NFA-ε are equivalent, in that, given either one, one can construct the other, which recognizes the same language.

Example

The state diagram for M: Let M be an NFA-ε, with a binary alphabet, that determines if the input contains an even number of 0s or an even number of 1s. Note that 0 occurrences is an even number of occurrences as well. In formal notation, let M = ({S0, S1, S2, S3, S4}, {0, 1}, Δ, S0, {S1, S3}), where the transition relation Δ can be defined by this state transition table:

        0      1      ε
  S0    {}     {}     {S1, S3}
  S1    {S2}   {S1}   {}
  S2    {S1}   {S2}   {}
  S3    {S3}   {S4}   {}
  S4    {S4}   {S3}   {}

M can be viewed as the union of two DFAs: one with states {S1, S2} and the other with states {S3, S4}. The language of M can be described by the regular language given by this regular expression: (1*(01*01*)*) ∪ (0*(10*10*)*). We define M here using ε-moves, but M can also be defined without using ε-moves.

3. Write a short note on: A) C Preprocessor for GCC version 2 B) Conditional Assembly

A) The C Preprocessor for GCC version 2: The C preprocessor is a macro processor that is used automatically by the C compiler to transform your program before actual compilation. It is called a macro processor because it allows you to define macros, which are brief abbreviations for longer constructs. The C preprocessor provides four separate facilities that you can use as you see fit:
- Inclusion of header files. These are files of declarations that can be substituted into your program.
- Macro expansion. You can define macros, which are abbreviations for arbitrary fragments of C code, and then the C preprocessor will replace the macros with their definitions throughout the program.
- Conditional compilation. Using special preprocessing directives, you can include or exclude parts of the program according to various conditions.
- Line control. If you use a program to combine or rearrange source files into an intermediate file which is then compiled, you can use line control to inform the compiler of where each source line originally came from.

ANSI Standard C requires the rejection of many harmless constructs commonly used by today's C programs. Such incompatibility would be inconvenient for users, so the GNU C preprocessor is configured to accept these constructs by default. Strictly speaking, to get ANSI Standard C you must use the options `-trigraphs', `-undef' and `-pedantic', but in practice the consequences of having strict ANSI Standard C make it undesirable to do this.
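The macro-expansion facility described above can be illustrated with a short fragment. BUFSIZE and SQUARE here are hypothetical macros chosen for illustration; note the parentheses around the macro parameter, which guard against operator-precedence surprises when the argument is an expression:

```cpp
#include <cassert>

// Object-like macro: every use of BUFSIZE is replaced by 1024 before compilation.
#define BUFSIZE 1024
// Function-like macro: SQUARE(1 + 2) expands to ((1 + 2) * (1 + 2)).
#define SQUARE(x) ((x) * (x))

int bufsize() { return BUFSIZE; }       // expands to: return 1024;
int nine()    { return SQUARE(1 + 2); } // expands to: return ((1 + 2) * (1 + 2));
```

Without the inner parentheses, SQUARE(1 + 2) would expand to (1 + 2 * 1 + 2), which is 5 rather than 9.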

Conditional Assembly: This means that some sections of the program may be optional, either included or not in the final program, depending upon specified conditions. A reasonable use of conditional assembly would be to combine two versions of a program: one that prints debugging information during test executions for the developer, and another version for production operation that displays only results of interest for the average user. A typical program fragment would assemble the instructions that print the Ax register only if Debug is true. Note that true is any non-zero value.

The analogous facility in C is conditional compilation. The following preprocessor directive tests the expression `BUFSIZE == 1020', where `BUFSIZE' must be a macro:

#if BUFSIZE == 1020
  printf ("Large buffers!\n");
#endif /* BUFSIZE is large */

4. Write about different Phases of Compilation.

Phases of Compilation:
1. Lexical analysis (scanning): the source text is broken into tokens.
2. Syntactic analysis (parsing): tokens are combined to form syntactic structures, typically represented by a parse tree. The parser may be replaced by a syntax-directed editor, which directly generates a parse tree as a product of editing.
3. Semantic analysis: intermediate code is generated for each syntactic structure. Type checking is performed in this phase. Complicated features such as generic declarations and operator overloading (as in Ada and C++) are also processed.
4. Machine-independent optimization: intermediate code is optimized to improve efficiency.
5. Code generation: intermediate code is translated to relocatable object code for the target machine.
6. Machine-dependent optimization: the machine code is optimized.
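Phase 1 can be illustrated with a toy scanner. This is only an illustrative sketch (not any real compiler's scanner): it splits a source line into identifier, number, and single-character operator tokens.

```cpp
#include <string>
#include <vector>
#include <cctype>
#include <cassert>

// Toy lexical analyzer: breaks source text into tokens.
// Identifiers are letter-then-alphanumeric runs, numbers are digit runs,
// and anything else becomes a one-character operator/punctuation token.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = src[i];
        if (std::isspace(c)) { ++i; continue; }      // skip whitespace
        std::size_t start = i;
        if (std::isalpha(c)) {
            while (i < src.size() && std::isalnum(static_cast<unsigned char>(src[i]))) ++i;
        } else if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
        } else {
            ++i;                                     // single-character token
        }
        tokens.push_back(src.substr(start, i - start));
    }
    return tokens;
}
```

Given the text "sum = a1 + 42", it produces the five tokens "sum", "=", "a1", "+", "42", which a parser would then combine into an assignment node.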

5. What is MACRO? Discuss its use. A macro is a rule or pattern that specifies how a certain input sequence (often a sequence of characters) should be mapped to an output sequence (also often a sequence of characters) according to a defined procedure. The mapping process that instantiates (transforms) a macro into a specific output sequence is known as macro expansion. The term is used to make available to the programmer, a sequence of computing instructions as a single program statement, making the programming task less tedious and less error-prone. (Thus, they are called "macros" because a big block of code can be expanded from a small sequence of characters). Macros often allow positional or keyword parameters that dictate what the conditional assembler program generates and have been used to create entire programs or program suites according to such variables as operating system, platform or other factors. 6. What is compiler? Explain the compiler process. A compiler is a computer program that transforms source code written in a programming language (the source language) into another computer language (the target language, often having a binary form known as object code). The most common reason for wanting to transform source code is to create an executable program. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower level language (e.g., assembly language or machine code). If the compiled program can run on a computer whose CPU or operating system is different from the one on which

the compiler runs, the compiler is known as a cross-compiler. A program that translates from a low level language to a higher level one is a de-compiler. A program that translates between high-level languages is usually called a language translator, source to source translator, or language converter.

A compiler is likely to perform many or all of the following operations: lexical analysis, preprocessing, parsing, semantic analysis (Syntax-directed translation), code generation, and code optimization. Program faults caused by incorrect compiler behavior can be very difficult to track down and work around; therefore, compiler implementers invest a lot of time ensuring the correctness of their software. The term compiler-compiler is sometimes used to refer to a parser generator, a tool often used to help create the lexer and parser.

A diagram of the operation of a typical multi-language, multi-target compiler

Basic process of Compiler: Compilers bridge source programs in high-level languages with the underlying hardware. A compiler requires 1) determining the correctness of the syntax of programs, 2) generating correct and efficient object code, 3) run-time organization, and 4) formatting output according to assembler and/or linker conventions. A compiler consists of three main parts: the front end, the middle end, and the back end.

The front end checks whether the program is correctly written in terms of the programming language syntax and semantics. Here legal and illegal programs are recognized. Errors are reported, if any, in a useful way. Type checking is also performed by collecting type information. The front end then generates an intermediate representation, or IR, of the source code for processing by the middle end.

The middle end is where optimization takes place. Typical transformations for optimization are removal of useless or unreachable code, discovery and propagation of constant values, relocation of computation to a less frequently executed place (e.g., out of a loop), or specialization of computation based on the context. The middle end generates another IR for the back end. Most optimization efforts are focused on this part.

The back end is responsible for translating the IR from the middle end into assembly code. The target instructions are chosen for each IR instruction. Register allocation assigns processor registers to the program variables where possible. The back end utilizes the hardware by figuring out how to keep parallel execution units busy, filling delay slots, and so on. Although many optimization problems are NP-hard, heuristic techniques for them are well-developed.

Master of Computer Application (MCA) Semester 3 MC0074 Statistical and Numerical methods using C++ 4 Credits

Assignment Set 1
MC0074 Statistical and Numerical methods using C++ (Book ID: B0812)
2. Discuss and define the Correlation coefficient with the suitable example.

Correlation coefficient: Correlation is one of the most widely used statistical techniques. Whenever two variables are so related that a change in one variable results in a direct or inverse change in the other, and a greater magnitude of change in one variable corresponds to a greater magnitude of change in the other, the variables are said to be correlated, and the relationship between them is known as correlation. So far we have been concerned with associating parameters such as E(X) and V(X) with the distribution of a one-dimensional random variable. If we have a two-dimensional random variable (X, Y), an analogous problem is encountered.

Definition: Let (X, Y) be a two-dimensional random variable. We define ρxy, the correlation coefficient between X and Y, as follows:

ρxy = E[(X − E(X))(Y − E(Y))] / √(V(X) V(Y))

The numerator of ρxy, E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y), is called the covariance of X and Y.

Example: Suppose that the two-dimensional random variable (X, Y) is uniformly distributed over the triangular region

R = {(x, y) | 0 < x < y < 1}

The pdf is given as

f(x, y) = 2, (x, y) ∈ R,
        = 0, elsewhere.

Thus the marginal pdfs of X and of Y are

g(x) = ∫ from x to 1 of 2 dy = 2(1 − x), 0 ≤ x ≤ 1
h(y) = ∫ from 0 to y of 2 dx = 2y, 0 ≤ y ≤ 1

Therefore

E(X) = ∫ 2x(1 − x) dx = 1/3,    E(Y) = ∫ 2y^2 dy = 2/3
E(X^2) = ∫ 2x^2(1 − x) dx = 1/6,    E(Y^2) = ∫ 2y^3 dy = 1/2
V(X) = E(X^2) − (E(X))^2 = 1/6 − 1/9 = 1/18
V(Y) = E(Y^2) − (E(Y))^2 = 1/2 − 4/9 = 1/18
E(XY) = ∫ from 0 to 1 ∫ from 0 to y of 2xy dx dy = ∫ y^3 dy = 1/4

Hence

ρxy = (E(XY) − E(X)E(Y)) / √(V(X) V(Y)) = (1/4 − 2/9) / (1/18) = (1/36) / (1/18) = 1/2
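When the data is given as paired samples rather than a known joint pdf, the same definition applies with sample moments. A minimal C++ sketch (the function name is ours):

```cpp
#include <vector>
#include <cmath>
#include <cassert>

// Sample correlation coefficient: r = cov(X, Y) / (sd(X) * sd(Y)),
// computed from paired observations using E(XY) - E(X)E(Y) for the covariance.
double correlation(const std::vector<double>& x, const std::vector<double>& y) {
    const double n = static_cast<double>(x.size());
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx  += x[i];        sy  += y[i];
        sxx += x[i] * x[i]; syy += y[i] * y[i];
        sxy += x[i] * y[i];
    }
    double cov = sxy / n - (sx / n) * (sy / n);   // E(XY) - E(X)E(Y)
    double vx  = sxx / n - (sx / n) * (sx / n);   // V(X)
    double vy  = syy / n - (sy / n) * (sy / n);   // V(Y)
    return cov / std::sqrt(vx * vy);
}
```

Perfectly linear data gives r = 1 (or r = -1 for an inverse relationship), matching the intuition that ρ measures the strength and direction of linear association.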

3. If x is normally distributed with zero mean and unit variance, find the expectation and variance of x^2.

The equation of the normal curve is

f(x) = (1 / (σ√(2π))) e^(−(x − m)^2 / (2σ^2))

If the mean is zero and the variance is unity, then putting m = 0 and σ = 1, the above equation reduces to

f(x) = (1/√(2π)) e^(−x^2/2)

(i) Expectation of x^2:

E(x^2) = ∫ from −∞ to ∞ of x^2 (1/√(2π)) e^(−x^2/2) dx

Integrating by parts, taking x as the first function and x e^(−x^2/2) as the second, and remembering that the total probability ∫ (1/√(2π)) e^(−x^2/2) dx = 1, the boundary term vanishes and

E(x^2) = 0 + ∫ (1/√(2π)) e^(−x^2/2) dx = 1

(ii) Expectation of x^4:

E(x^4) = ∫ from −∞ to ∞ of x^4 (1/√(2π)) e^(−x^2/2) dx

Integrating by parts, taking x^3 as the first function,

E(x^4) = 3 ∫ x^2 (1/√(2π)) e^(−x^2/2) dx = 3 E(x^2) = 3(1) = 3, with the help of (i).

Hence

Variance of x^2 = E(x^4) − (E(x^2))^2 = 3 − (1)^2 = 2
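These moments can be checked numerically. The sketch below (our own verification, not part of the original solution) integrates x^k times the standard normal pdf over [−8, 8] with the trapezoidal rule; the tails beyond |x| = 8 are negligible at this accuracy.

```cpp
#include <cmath>
#include <cassert>

// k-th moment of the standard normal: integral of x^k * f(x) over [-8, 8],
// where f(x) = exp(-x^2/2) / sqrt(2*pi), via the composite trapezoidal rule.
double normalMoment(int k, int steps = 4000) {
    const double a = -8.0, b = 8.0, h = (b - a) / steps;
    const double norm = std::sqrt(2.0 * std::acos(-1.0));  // sqrt(2*pi)
    auto g = [k, norm](double x) {
        return std::pow(x, k) * std::exp(-x * x / 2.0) / norm;
    };
    double sum = 0.5 * (g(a) + g(b));
    for (int i = 1; i < steps; ++i) sum += g(a + i * h);
    return sum * h;
}
```

It returns approximately 1 for k = 2 and 3 for k = 4, so the variance of x^2 comes out as 3 − 1^2 = 2, in agreement with the derivation.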

4. The sales in a particular department store for the last five years are given in the following table:

Years:             1974  1976  1978  1980  1982
Sales (in lakhs):   40    43    48    52    57

Estimate the sales for the year 1979. Newton's backward difference table is:

x      y    ∇y   ∇²y  ∇³y  ∇⁴y
1974   40
1976   43    3
1978   48    5    2
1980   52    4   −1   −3
1982   57    5    1    2    5

We have h = 2, so

p = (1979 − 1982)/2 = −1.5, ∇yn = 5, ∇²yn = 1, ∇³yn = 2, ∇⁴yn = 5

Newton's backward interpolation formula gives

y1979 = yn + p ∇yn + p(p+1)/2! ∇²yn + p(p+1)(p+2)/3! ∇³yn + p(p+1)(p+2)(p+3)/4! ∇⁴yn
      = 57 + (−1.5)(5) + (−1.5)(−0.5)/2 (1) + (−1.5)(−0.5)(0.5)/6 (2) + (−1.5)(−0.5)(0.5)(1.5)/24 (5)
      = 57 − 7.5 + 0.375 + 0.125 + 0.1172
y1979 = 50.1172
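The backward-difference calculation above can be carried out programmatically. A minimal C++ sketch (function name ours) that builds the difference table in place and evaluates Newton's backward formula:

```cpp
#include <vector>
#include <cmath>
#include <cassert>

// Newton's backward interpolation: y is the table of values at equally
// spaced points ending at xn with spacing h; evaluates the estimate at x.
double newtonBackward(const std::vector<double>& y, double xn, double h, double x) {
    const int n = static_cast<int>(y.size());
    std::vector<double> d = y;               // holds successive differences
    double p = (x - xn) / h;
    double result = d[n - 1], term = 1.0, fact = 1.0;
    for (int k = 1; k < n; ++k) {
        // One differencing pass; afterwards d[n-1] is the k-th backward
        // difference at the last point.
        for (int i = n - 1; i >= k; --i) d[i] -= d[i - 1];
        term *= (p + k - 1);                  // p(p+1)...(p+k-1)
        fact *= k;                            // k!
        result += term / fact * d[n - 1];
    }
    return result;
}
```

Calling it with the sales table {40, 43, 48, 52, 57}, xn = 1982 and h = 2 at x = 1979 reproduces the hand result of about 50.1172.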

5. Find out the geometric mean of the following series:

Class:      0-10  10-20  20-30  30-40  40-50
Frequency:   17    10     11     15     8

Here we have:

Class    Frequency (f)   Mid value (x)   log x     f.(log x)
0-10         17               5          0.6990     11.8830
10-20        10              15          1.1761     11.7610
20-30        11              25          1.3979     15.3769
30-40        15              35          1.5441     23.1615
40-50         8              45          1.6532     13.2256
           N = 61                                 Sum = 75.408

If G is the required geometric mean, then

log G = (Sum of f.log x) / N = (1/61)(75.408) = 1.2362
G = antilog 1.2362 = 17.23
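The grouped geometric mean can be computed directly from the frequencies and class mid-values. A small C++ sketch (function name ours):

```cpp
#include <vector>
#include <cmath>
#include <cassert>

// Geometric mean of grouped data: log G = (1/N) * sum(f_i * log10(x_i)),
// where x_i are the class mid-values and N is the total frequency.
double groupedGeometricMean(const std::vector<double>& freq,
                            const std::vector<double>& mid) {
    double n = 0, s = 0;
    for (std::size_t i = 0; i < freq.size(); ++i) {
        n += freq[i];
        s += freq[i] * std::log10(mid[i]);
    }
    return std::pow(10.0, s / n);   // antilog of the mean log
}
```

For the table above (mid-values 5, 15, 25, 35, 45 with frequencies 17, 10, 11, 15, 8) it returns approximately 17.23.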

6. Find the equation of the regression line of x on y from the following data:

x:  0   1   2   3   4
y: 10  12  27  10  30

sum(X) = 0 + 1 + 2 + 3 + 4 = 10
sum(X^2) = 0 + 1 + 4 + 9 + 16 = 30
sum(Y) = 10 + 12 + 27 + 10 + 30 = 89
sum(Y^2) = 100 + 144 + 729 + 100 + 900 = 1973
sum(XY) = 0(10) + 1(12) + 2(27) + 3(10) + 4(30) = 216
n = 5
Xbar = sum(X) / n = 10 / 5 = 2
Ybar = sum(Y) / n = 89 / 5 = 17.8

For the regression line of x on y, x = b.y + a, the regression coefficient is

b = [n.sum(XY) − sum(X).sum(Y)] / [n.sum(Y^2) − (sum(Y))^2]
  = (5 × 216 − 10 × 89) / (5 × 1973 − 89^2)
  = (1080 − 890) / (9865 − 7921)
  = 190 / 1944 = 0.0977

The line passes through (Xbar, Ybar), so

a = Xbar − b.Ybar = 2 − 0.0977 × 17.8 = 0.2603

Therefore the equation of the regression line of x on y is x = 0.0977y + 0.2603.
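The least-squares sums involved are easy to automate. A minimal C++ sketch (names ours) that returns the coefficients of the regression line of x on y:

```cpp
#include <vector>
#include <cmath>
#include <cassert>

// Regression of x on y: x = b*y + a, with
// b = [n*sum(xy) - sum(x)*sum(y)] / [n*sum(y^2) - (sum(y))^2]
// and a chosen so the line passes through (xbar, ybar).
void regressionXonY(const std::vector<double>& x, const std::vector<double>& y,
                    double& b, double& a) {
    double n = static_cast<double>(x.size());
    double sx = 0, sy = 0, syy = 0, sxy = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx  += x[i];        sy  += y[i];
        syy += y[i] * y[i]; sxy += x[i] * y[i];
    }
    b = (n * sxy - sx * sy) / (n * syy - sy * sy);
    a = sx / n - b * (sy / n);
}
```

For the data above it yields b of about 0.0977 and a of about 0.2603.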

Assignment Set 2
MC0074 Statistical and Numerical methods using C++ (Book ID: B0812)

2. If 2/3 is approximated by 0.667, find the absolute and relative errors.

Absolute, relative and percentage errors: An error is usually quantified in two different but related ways. One is known as absolute error and the other is called relative error. Let us suppose that the true value of a data item is denoted by xt and its approximate value is denoted by xa. Then they are related as follows:

True value xt = Approximate value xa + Error

The error is then given by: Error = xt − xa. The error may be negative or positive depending on the values of xt and xa. In error analysis, what is important is the magnitude of the error and not the sign, and therefore we normally consider what is known as absolute error, denoted by

ea = | xt − xa |

In general, absolute error is the numerical difference between the true value of a quantity and its approximate value. In many cases, absolute error may not reflect its influence correctly, as it does not take into account the order of magnitude of the value. In view of this, the concept of relative error is introduced, which is nothing but the normalized absolute error. The relative error is defined as

er = ea / | xt | = | xt − xa | / | xt |

Here xt = 2/3 = 0.666666... and xa = 0.667, so

absolute error of 2/3: ea = | 0.666666... − 0.667 | = 0.000333...
relative error of 2/3: er = 0.000333... / 0.666666... = 0.0005
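The two definitions translate directly into code; a tiny C++ sketch (function names ours):

```cpp
#include <cmath>
#include <cassert>

// Absolute error: magnitude of the difference between true and approximate value.
double absoluteError(double xt, double xa) {
    return std::fabs(xt - xa);
}

// Relative error: absolute error normalized by the magnitude of the true value.
double relativeError(double xt, double xa) {
    return std::fabs(xt - xa) / std::fabs(xt);
}
```

With xt = 2/3 and xa = 0.667 these give 0.000333... and 0.0005 respectively.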

3. If Δ, ∇, δ denote the forward, backward and central difference operators, and E and μ are respectively the shift and average operators, in the analysis of data with equal spacing h, show that

(1) 1 + δ²μ² = (1 + δ²/2)²
(2) E^(1/2) = μ + δ/2
(3) Δ = δ²/2 + δ√(1 + δ²/4)

From the definition of the operators, we have

μ = (E^(1/2) + E^(−1/2))/2 and δ = E^(1/2) − E^(−1/2)

Therefore δμ = (E − E^(−1))/2, so that

1 + δ²μ² = 1 + (E − E^(−1))²/4 = (E² + 2 + E^(−2))/4   ..(1)

Also δ² = E − 2 + E^(−1), so that 1 + δ²/2 = (E + E^(−1))/2 and

(1 + δ²/2)² = (E² + 2 + E^(−2))/4   ..(2)

From equations (1) and (2),

1 + δ²μ² = (1 + δ²/2)²

For (2), adding the definitions of μ and δ/2 gives

μ + δ/2 = (E^(1/2) + E^(−1/2))/2 + (E^(1/2) − E^(−1/2))/2 = E^(1/2)

For (3), note that μ² = (E + 2 + E^(−1))/4 = 1 + δ²/4, so √(1 + δ²/4) = μ. Hence

δ²/2 + δ√(1 + δ²/4) = δ²/2 + δμ = (E − 2 + E^(−1))/2 + (E − E^(−1))/2 = E − 1 = Δ

Thus we get Δ = δ²/2 + δ√(1 + δ²/4).

4. Find a real root of the equation x^3 − 4x − 9 = 0 using the bisection method.

First, let x0 = 1 and x1 = 3.
f(x0) = x0^3 − 4x0 − 9 = 1 − 4 − 9 = −12 < 0
f(x1) = 27 − 12 − 9 = 6 > 0
Therefore, the root lies between 1 and 3.
Now we try x2 = 2: f(x2) = 8 − 8 − 9 = −9 < 0
Therefore, the root lies between 2 and 3.
x3 = (x1 + x2)/2 = (3 + 2)/2 = 2.5
f(x3) = 15.625 − 10 − 9 = −3.375 < 0
Therefore, the root lies between 2.5 and 3.
x4 = (x1 + x3)/2 = 2.75
Continuing to halve the bracketing interval in this way, the iterates converge to the real root x ≈ 2.7065.
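The hand iterations above can be continued in code until the bracket is as small as desired. A minimal C++ sketch of bisection for f(x) = x^3 − 4x − 9 (function name ours):

```cpp
#include <cmath>
#include <cassert>

// Bisection on f(x) = x^3 - 4x - 9 over [lo, hi], where f(lo) and f(hi)
// have opposite signs. Halves the bracket until it is shorter than tol.
double bisect(double lo, double hi, double tol) {
    auto f = [](double x) { return x * x * x - 4.0 * x - 9.0; };
    while (hi - lo > tol) {
        double mid = 0.5 * (lo + hi);
        if (f(lo) * f(mid) <= 0.0) hi = mid;  // sign change: root in [lo, mid]
        else                       lo = mid;  // otherwise: root in [mid, hi]
    }
    return 0.5 * (lo + hi);
}
```

Starting from the bracket [2, 3] found above, it converges to the root x ≈ 2.7065.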

5. Find Newton's difference interpolation polynomial for the following data:

x:    0.1   0.2   0.3   0.4   0.5
f(x): 1.40  1.56  1.76  2.00  2.28

Forward difference table:

x     y     Δy    Δ²y   Δ³y
0.1   1.40  0.16  0.04  0
0.2   1.56  0.20  0.04  0
0.3   1.76  0.24  0.04
0.4   2.00  0.28
0.5   2.28

Here

p = (x − 0.1)/0.1 = 10x − 1

We have Newton's forward interpolation formula as

y = y0 + p Δy0 + p(p − 1)/2! Δ²y0 + ...   ..(1)

From the table, substituting all the values in equation (1) (the third and higher differences vanish):

y = 1.40 + (10x − 1)(0.16) + (10x − 1)(10x − 2)/2 (0.04)
  = 1.40 + 1.6x − 0.16 + 0.02(100x² − 30x + 2)
y = 2x² + x + 1.28

This is the required Newton's interpolating polynomial.
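The same forward-difference construction can be evaluated numerically at any point. A minimal C++ sketch (function name ours), the mirror image of the backward formula:

```cpp
#include <vector>
#include <cmath>
#include <cassert>

// Newton's forward interpolation: y holds values at equally spaced points
// starting at x0 with spacing h; evaluates the interpolant at x.
double newtonForward(std::vector<double> y, double x0, double h, double x) {
    const int n = static_cast<int>(y.size());
    double p = (x - x0) / h;
    double result = y[0], term = 1.0, fact = 1.0;
    for (int k = 1; k < n; ++k) {
        // One differencing pass; afterwards y[0] is the k-th forward difference.
        for (int i = 0; i < n - k; ++i) y[i] = y[i + 1] - y[i];
        term *= (p - (k - 1));             // p(p-1)...(p-k+1)
        fact *= k;                         // k!
        result += term / fact * y[0];
    }
    return result;
}
```

Because the third and higher differences of the table vanish, the routine reproduces the quadratic 2x² + x + 1.28 exactly; for instance, at x = 0.3 it returns the tabulated 1.76.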

6. Evaluate the integral of 1/(1 + x^2) from 0 to 1 using the Trapezoidal rule with h = 0.2. Hence determine the value of π.

The trapezoidal rule uses trapezoids to approximate the curve on each subinterval. The area of a trapezoid is the width times the average height, given by the sum of the function values at the endpoints divided by two. Therefore, with f(x) = 1/(1 + x^2) and h = 0.2:

0.2( f(0) + 2f(0.2) + 2f(0.4) + 2f(0.6) + 2f(0.8) + f(1) ) / 2
= 0.2( 1 + 2(0.96154) + 2(0.86207) + 2(0.73529) + 2(0.60976) + 0.5 ) / 2
= 0.78373

The integrand is the derivative of the inverse tangent function. In particular, if we integrate from 0 to 1, the exact answer is arctan(1) = π/4. Consequently we can use this integral to approximate π. Multiplying by four, we get an approximation for π: 3.1349.
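The computation above in code, as a minimal C++ sketch (function name ours):

```cpp
#include <cmath>
#include <cassert>

// Composite trapezoidal rule for f(x) = 1/(1 + x^2) on [a, b] with n steps:
// endpoints weighted by 1/2, interior points by 1, all scaled by h.
double trapezoid(double a, double b, int n) {
    auto f = [](double x) { return 1.0 / (1.0 + x * x); };
    double h = (b - a) / n;
    double sum = 0.5 * (f(a) + f(b));
    for (int i = 1; i < n; ++i) sum += f(a + i * h);
    return sum * h;
}
```

With n = 5 (i.e. h = 0.2) on [0, 1] this returns about 0.78373, and four times that gives the approximation π ≈ 3.1349.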

Master of Computer Application (MCA) Semester 3 MC0075 Computer Networks 4 Credits

Assignment Set 1
MC0075 Computer Networks (Book ID: B0813 & B0814)

1. Discuss the advantages and disadvantages of synchronous and asynchronous transmission.

Synchronous and Asynchronous transmission:
Synchronous Transmission: Synchronous is any type of communication in which the parties communicating are "live", or present in the same space and time. A chat room where both parties must be at their computer, connected to the Internet, and using software to communicate in the chat room protocols is a synchronous method of communication. E-mail is an example of an asynchronous mode of communication, where one party can send a note to another person and the recipient need not be online to receive it. Synchronous mode of transmission is illustrated in figure 3.11.

Figure: Synchronous and Asynchronous Transmissions

The two ends of a link are synchronized by carrying the transmitter's clock information along with the data. Bytes are transmitted continuously; if there are gaps, the transmitter inserts idle bytes as padding.
Advantages: This reduces overhead bits, and it overcomes the two main deficiencies of the asynchronous method: inefficiency and lack of error detection.
Disadvantage: For correct operation the receiver must start to sample the line at the correct instant.
Application: Used in high-speed transmission, for example HDLC.
Asynchronous transmission:

Asynchronous refers to processes that proceed independently of each other until one process needs to "interrupt" the other process with a request. Using the client- server model, the server handles many asynchronous requests from its many clients. The client is often able to proceed with other work or must wait on the service requested from the server.

Figure: Asynchronous Transmissions

Asynchronous mode of transmission is illustrated in figure 3.12. Here a start and a stop signal are necessary before and after each character. The start signal is of the same length as an information bit; the stop signal is usually 1, 1.5 or 2 times the length of an information bit.
Advantages: Each character is self-contained, so the transmitter and receiver need not be synchronized; the transmitting and receiving clocks are independent of each other.
Disadvantages: Overhead of start and stop bits, and false recognition of these bits due to noise on the channel.
Application: If the channel is reliable it is suitable for high-speed transmission, otherwise for low-speed transmission. The most common use is in ASCII terminals.

2. Describe the ISO-OSI reference model and discuss the importance of every layer.
Ans: The OSI Reference Model:

This reference model was proposed by the International Standards Organization (ISO) as a first step towards the standardization of the protocols used in the various layers, and was published in 1983 by Day and Zimmermann. This model is called the Open Systems Interconnection (OSI) reference model. It is referred to as OSI because it deals with connecting open systems, that is, systems that are open for communication with other systems. It consists of seven layers.

Layers of OSI Model: The principles that were applied to arrive at the 7 layers:
1. A layer should be created where a different level of abstraction is needed.
2. Each layer should perform a well defined task.
3. The function of each layer should define internationally standardized protocols.
4. Layer boundaries should be chosen to minimize the information flow across the interface.
5. The number of layers should be neither too large nor too small.

Figure: ISO-OSI Reference Model

The ISO-OSI reference model is as shown in figure 2.5. As such, this model is not a network architecture, as it does not specify the exact services and protocols; it just tells what each layer should do and where it lies. The bottom-most layer is referred to as the physical layer. ISO has produced standards for each layer, and these are published separately. Each layer of the ISO-OSI reference model is discussed below:

1. Physical Layer
This layer is the bottom-most layer and is concerned with transmitting raw bits over the communication channel (physical medium). The design issues have to do with making sure that when one side sends a 1 bit, it is received by the other side as a 1 bit, and not as a 0 bit. It performs direct transmission of logical information, that is, digital bit streams, into physical phenomena in the form of electronic pulses. Modulators/demodulators are used at this layer. The design issues here largely deal with mechanical, electrical, and procedural interfaces, and with the physical transmission medium, which lies below this physical layer. In particular, it defines the relationship between a device and a physical medium. This includes the layout of pins, voltages, and cable specifications. Hubs, repeaters, network adapters and Host Bus Adapters (HBAs, used in Storage Area Networks) are physical-layer devices. The major functions and services performed by the physical layer are:
- Establishment and termination of a connection to a communications medium.
- Participation in the process whereby the communication resources are effectively shared among multiple users, for example contention resolution and flow control.
- Modulation, a technique of conversion between the representation of digital data in user equipment and the corresponding signals transmitted over a communications channel. These signals operate over the physical cabling (such as copper and fiber optic) or over a radio link.

Parallel SCSI buses operate in this layer. Various physical-layer Ethernet standards are also in this layer; Ethernet incorporates both this layer and the data-link layer. The same applies to other local-area networks, such as Token Ring, FDDI, and IEEE 802.11, as well as personal area networks such as Bluetooth and IEEE 802.15.4.

2.
Data Link Layer The Data Link layer provides the functional and procedural means to transfer data between network entities and to detect and possibly correct errors that may occur in the Physical layer. That is it makes sure that the message indeed reach the other end without corruption or without signal distortion and noise. It accomplishes this task by having the sender break the input data up into the frames called data frames. The DLL of transmitter, then transmits the frames sequentially, and processes acknowledgement frames sent back by the receiver. After processing acknowledgement frame, may be the transmitter needs to retransmit a copy of the frame. So therefore the DLL at receiver is required to detect duplications of frames.

The best known example of this is Ethernet. This layer manages the interaction of devices with a shared medium. Other examples of data link protocols are HDLC and ADCCP for point-to-point or packet-switched networks and Aloha for local area networks. On IEEE 802 local area networks, and some non-IEEE 802 networks such as FDDI, this layer may be split into a Media Access Control (MAC) layer and the IEEE 802.2 Logical Link Control (LLC) layer. It arranges bits from the physical layer into logical chunks of data, known as frames. This is the layer at which the bridges and switches operate. Connectivity is provided only among locally attached network nodes forming layer 2 domains for unicast or broadcast forwarding. Other protocols may be imposed on the data frames to create tunnels and logically separated layer 2 forwarding domain. The data link layer might implement a sliding window flow control and acknowledgment mechanism to provide reliable delivery of frames; that is the case for SDLC and HDLC, and derivatives of HDLC such as LAPB and LAPD. In modern practice, only error detection, not flow control using sliding window, is present in modern data link protocols such as Point-to-Point Protocol (PPP), and, on local area networks, the IEEE 802.2 LLC layer is not used for most protocols on Ethernet, and, on other local area networks, its flow control and acknowledgment mechanisms are rarely used. Sliding window flow control and acknowledgment is used at the transport layers by protocols such as TCP. 3. Network Layer The Network layer provides the functional and procedural means of transferring variable length data sequences from a source to a destination via one or more networks while maintaining the quality of service requested by the Transport layer. The Network layer performs network routing functions, and might also perform fragmentation and reassembly, and report delivery errors. 
Routers operate at this layer, sending data throughout the extended network and making the Internet possible. This is a logical addressing scheme; the values are chosen by the network engineer, and the addressing scheme is hierarchical. The best known example of a layer 3 protocol is the Internet Protocol (IP). Perhaps it is easier to visualize this layer as managing the sequence of human carriers taking a letter from the sender to the local post office, trucks that carry sacks of mail to other post offices or airports, airplanes that carry airmail between major cities, trucks that distribute mail sacks in a city, and carriers that take a letter to its destination. Think of fragmentation as splitting a large document into smaller envelopes for shipping, or, in the case of the network layer, splitting an application or transport record into packets. The major tasks of the network layer are:
- It controls routes for individual messages through the actual topology.
- It finds the best route, and finds alternate routes.
- It accomplishes buffering and deadlock handling.

4. Transport Layer
The Transport layer provides transparent transfer of data between end users, providing reliable data transfer while relieving the upper layers of that concern. The transport layer controls the reliability of a given link through flow control, segmentation/desegmentation, and error control. Some protocols are state- and connection-oriented. This means that the transport layer can keep track of the segments and retransmit those that fail. The best known example of a layer 4 protocol is the Transmission Control Protocol (TCP). The transport layer is the layer that converts messages into TCP segments, or User Datagram Protocol (UDP), Stream Control Transmission Protocol (SCTP), etc. packets. Perhaps an easy way to visualize the Transport layer is to compare it with a post office, which deals with the dispatch and classification of mail and parcels sent. Do remember, however, that a post office manages the outer envelope of mail. Higher layers may have the equivalent of double envelopes, such as cryptographic presentation services that can be read by the addressee only. Roughly speaking, tunneling protocols operate at the transport layer, such as carrying non-IP protocols such as IBM's SNA or Novell's IPX over an IP network, or end-to-end encryption with IP security (IPsec). While Generic Routing Encapsulation (GRE) might seem to be a network-layer protocol, if the encapsulation of the payload takes place only at the endpoint, GRE becomes closer to a transport protocol that uses IP headers but contains complete frames or packets to deliver to an endpoint. The major tasks of the Transport layer are:
- It locates the other party.
- It creates a transport pipe between both end-users.
- It breaks the message into packets and reassembles them at the destination.
- It applies flow control to the packet stream.

5. Session Layer
The Session layer controls the dialogues/connections (sessions) between computers. It establishes, manages and terminates the connections between the local and remote application. It provides for either full-duplex or half-duplex operation, and establishes checkpointing, adjournment, termination, and restart procedures. The OSI model made this layer responsible for "graceful close" of sessions, which is a property of TCP, and also for session checkpointing and recovery, which is not usually used in the Internet protocol suite. The major tasks of the session layer are:
- It is responsible for the relation between two end-users.
- It maintains the integrity of, and controls, the data exchanged between the end-users.
- The end-users are aware of each other when the relation is established (synchronization).
- It uses naming and addressing to identify a particular user.
- It makes sure that the lower layer guarantees delivering the message (flow control).

6. Presentation Layer
The Presentation layer transforms the data to provide a standard interface for the Application layer. MIME encoding, data encryption and similar manipulation of the presentation are done at this layer to present the data as a service or protocol developer sees fit. Examples of this layer are converting an EBCDIC-coded text file to an ASCII-coded file, or serializing objects and other data structures into and out of XML. The major tasks of the presentation layer are:
- It translates the language used by the application layer.
- It makes the users as independent as possible, so that they can concentrate on the conversation.

7. Application Layer (end users)

The application layer is the seventh level of the seven-layer OSI model. It interfaces directly to the users and performs common application services for the application processes. It also issues requests to the presentation layer. Note carefully that this layer provides services to user-defined application processes, and not to the end user. For example, it defines a file transfer protocol, but the end user must go through an application process to invoke file transfer. The OSI model does not include human interfaces. The common application services sub layer provides functional elements including the Remote Operations Service Element (comparable to Internet Remote Procedure Call), Association Control, and Transaction Processing (according to the ACID requirements). Above the common application service sub layer are functions meaningful to user application programs, such as messaging (X.400), directory (X.500), file transfer (FTAM), virtual terminal (VTAM), and batch job manipulation (JTAM).

3. Explain the following with respect to Data Communications: A) Fourier analysis B) Band limited signals C) Maximum data rate of a channel

A) Fourier analysis: In the 19th century, the French mathematician Fourier proved that any reasonably behaved periodic function of time g(t) with period T can be constructed by summing a (possibly infinite) number of sines and cosines:

g(t) = (1/2)c + Σ (n = 1 to ∞) a_n sin(2πnft) + Σ (n = 1 to ∞) b_n cos(2πnft)   (3.1)

where f = 1/T is the fundamental frequency, and a_n and b_n are the sine and cosine amplitudes of the nth harmonics. Such a decomposition is called a Fourier series.

B) Band limited signals: Consider the signal given in figure 3.1(a). The figure shows the signal that is the ASCII representation of the character "b", which consists of the bit pattern 01100010, along with its harmonics.

Figure 3.1: (a) A binary signal (b-e) Successive approximation of original signal Any transmission facility cannot pass all the harmonics and hence few of the harmonics are diminished and distorted. The bandwidth is restricted to low frequencies consisting of 1, 2, 4, and 8 harmonics and then transmitted. Figure 3.1(b) to 3.1(e) shows the spectra and reconstructed functions for these bandlimited signals. Limiting the bandwidth limits the data rate even for perfect channels. However complex coding schemes that use several voltage levels do exist and can achieve higher data rates. C) Maximum data rate of a channel : In 1924, H. Nyquist realized the existence of the fundamental limit and derived the equation expressing the maximum data for a finite bandwidth noiseless channel. In 1948, Claude Shannon carried Nyquist work further and extended it to the case of a channel subject to random noise. In communications, it is not really the amount of noise that concerns us, but rather the amount of noise compared to the level of the desired signal. That is, it is the ratio of signal to noise power that is important, rather than the noise power alone. This Signal-to-Noise Ratio (SNR), usually expressed in decibel (dB), is one of the most important specifications of any communication system. The decibel is a logarithmic unit used for comparisons of power levels or voltage levels. In order to understand the implication of dB, it is important to know that a sound level of zero dB corresponds to the threshold of hearing, which is the smallest sound that can be heard. A normal speech conversation would measure about 60 dB.

If an arbitrary signal is passed through a low-pass filter of bandwidth H, the filtered signal can be completely reconstructed from only 2H samples per second; sampling the line faster than 2H times per second is pointless. If the signal consists of V discrete levels, then Nyquist's theorem states that, for a noiseless channel,

Maximum data rate = 2H log2(V) bits per second.    (3.2)

For a noisy channel whose bandwidth is again H and whose signal-to-noise ratio is S/N, the maximum data rate according to Shannon is

Maximum data rate = H log2(1 + S/N) bits per second.    (3.3)
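Equations (3.2) and (3.3) can be checked numerically. The 3000 Hz voice-grade line and 30 dB SNR used below are illustrative assumptions, not values from the text:

```python
import math

def nyquist_rate(bandwidth_hz, levels):
    """Maximum data rate of a noiseless channel (eq. 3.2)."""
    return 2 * bandwidth_hz * math.log2(levels)

def shannon_capacity(bandwidth_hz, snr_linear):
    """Maximum data rate of a noisy channel (eq. 3.3); S/N is a linear ratio."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Assume a voice-grade telephone line of 3000 Hz bandwidth.
# Nyquist limit with binary (2-level) signalling:
print(nyquist_rate(3000, 2))                  # 6000.0
# Shannon limit at 30 dB SNR, i.e. S/N = 1000:
print(round(shannon_capacity(3000, 1000)))    # 29902
```

Note that the Shannon bound is far above the binary Nyquist rate here: to approach it, a sender would need many signal levels, which is exactly why complex multi-level coding schemes exist.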

4. Explain the facilities FTP offers beyond the transfer function. FTP offers many facilities beyond the transfer function itself:

Interactive Access: FTP provides an interactive interface that allows humans to interact easily with remote servers. For example, a user can ask for a listing of all files in a directory on a remote machine. A client also usually responds to the input "help" by showing the user information about the possible commands that can be invoked.

Format (Representation) Specification: FTP allows the client to specify the type and format of stored data. For example, the user can specify whether a file contains text or binary integers, and whether a text file uses the ASCII or EBCDIC character set.

Authentication Control: FTP requires clients to authorize themselves by sending a login name and password to the server before requesting file transfers. The server refuses access to clients that cannot supply a valid login and password.

5. What is the use of IDENTIFIER and SEQUENCE NUMBER fields of echo request and echo reply message? Explain. Echo Request and Echo Reply message format: The echo request contains an optional data area. The echo reply contains the copy of the data sent in the request message. The format for the echo request and echo reply is as shown in figure 5.2.

Figure 5.2 echo request and echo reply message format

The OPTIONAL DATA field is a variable-length field that contains the data to be returned to the original sender. An echo reply always returns exactly the same data as was received in the request. The IDENTIFIER and SEQUENCE NUMBER fields are used by the sender to match replies to requests. The value of the TYPE field specifies whether the message is an echo request (TYPE = 8) or an echo reply (TYPE = 0).
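The header layout described above (TYPE, CODE, CHECKSUM, IDENTIFIER, SEQUENCE NUMBER, optional data) can be sketched with Python's standard struct module. This is an illustration of the reply-matching rule, not production ping code; the helper names are ours:

```python
import struct

def icmp_checksum(data: bytes) -> int:
    """Internet checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length data
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def make_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Build an ICMP echo request (TYPE=8, CODE=0) with the checksum filled in."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = icmp_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

def matches(request: bytes, reply: bytes) -> bool:
    """A reply matches a request when IDENTIFIER and SEQUENCE NUMBER
    are equal and the optional data is echoed back unchanged."""
    req_id, req_seq = struct.unpack("!HH", request[4:8])
    rep_id, rep_seq = struct.unpack("!HH", reply[4:8])
    return (req_id, req_seq) == (rep_id, rep_seq) and request[8:] == reply[8:]
```

Because the sender may have several requests outstanding, the (IDENTIFIER, SEQUENCE NUMBER) pair is what lets it pair each incoming reply with the request that caused it.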

6. In what conditions is ARP protocol used? Explain. ARP protocol: In computer networking, the Address Resolution Protocol (ARP) is the standard method for finding a host's hardware address when only its network-layer address is known. ARP is primarily used to translate IP addresses to Ethernet MAC addresses. It is also used for IP over other LAN technologies, such as Token Ring, FDDI, or IEEE 802.11, and for IP over ATM. ARP is used in four cases when two hosts are communicating:

1. When two hosts are on the same network and one desires to send a packet to the other
2. When two hosts are on different networks and must use a gateway/router to reach the other host
3. When a router needs to forward a packet for one host through another router
4. When a router needs to forward a packet from one host to the destination host on the same network

The first case applies when two hosts are on the same physical network, that is, when they can communicate directly without going through a router. The last three cases are the most used over the Internet, as two computers on the Internet are typically separated by more than 3 hops. Imagine computer A sends a packet to computer D and there are two routers, B and C, between them. Case 2 covers A sending to B; case 3 covers B sending to C; and case 4 covers C sending to D. ARP is defined in RFC 826. It is a current Internet Standard, STD 37.
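The decision a sender makes before issuing an ARP request can be modelled in a small sketch (the function name and the addresses are illustrative, not from the text): ARP for the destination itself if it is on the local network, otherwise ARP for the next-hop gateway.

```python
from ipaddress import ip_address, ip_network

def arp_target(local_network: str, dest_ip: str, gateway_ip: str) -> str:
    """Return the IP address to resolve with ARP: the destination itself
    when it is on the local network (cases 1 and 4), otherwise the
    next-hop gateway/router (cases 2 and 3)."""
    net = ip_network(local_network, strict=False)
    if ip_address(dest_ip) in net:
        return dest_ip
    return gateway_ip

# A host on 192.168.1.0/24 with gateway 192.168.1.1:
print(arp_target("192.168.1.0/24", "192.168.1.20", "192.168.1.1"))  # 192.168.1.20
print(arp_target("192.168.1.0/24", "8.8.8.8", "192.168.1.1"))       # 192.168.1.1
```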

Assignment Set 2

MC0075 Computer Networks (Book ID: B0813 & B0814)

1. Discuss the physical description of the different transmission mediums. Physical description:

Guided transmission media

Twisted pair:
- Two insulated copper wires arranged in a regular spiral pattern.
- A number of pairs are bundled together in a cable.
- Twisting decreases the crosstalk interference between adjacent pairs in the cable, by using a different twist length for neighboring pairs.

Coaxial cable:
- Consists of two conductors, with a construction that allows it to operate over a wider range of frequencies than twisted pair.
- A hollow outer cylindrical conductor surrounds a single inner wire conductor.
- The inner conductor is held in place by regularly spaced insulating rings or solid dielectric material.
- The outer conductor is covered with a jacket or shield.
- Diameter from 1 to 2.5 cm.
- The shielded concentric construction reduces interference and crosstalk.
- Can be used over longer distances and supports more stations on a shared line than twisted pair.

Optical fiber:
1. Core
- Innermost section of the fiber
- One or more very thin strands (diameter 8-100 um)
2. Cladding
- Surrounds each strand
- Plastic or glass coating with optical properties different from the core
- The interface between core and cladding prevents light from escaping the core
3. Jacket
- Outermost layer, surrounding one or more claddings
- Made of plastic and other materials
- Protects from environmental elements like moisture, abrasion, and crushing

Wireless transmission

Terrestrial microwave: Physical description:
- Parabolic dish antenna, about 3 m in diameter.
- Fixed rigidly, with a focused beam along the line of sight to the receiving antenna.
- With no obstacles, the maximum distance d (in km) between antennae is d = 7.14 sqrt(Kh), where h is the antenna height in metres and K is an adjustment factor accounting for the bending of microwaves around the earth's curvature, which lets them travel beyond the optical line of sight; typically K = 4/3.
- Two microwave antennae at a height of 100 m may therefore be as far as about 82 km apart.
- Long-distance microwave transmission is achieved by a series of microwave relay towers.

Satellite microwave: Physical description:
- A communication satellite is a microwave relay station between two or more ground stations (also called earth stations).
- The satellite uses different frequency bands for incoming (uplink) and outgoing (downlink) data.
- A single satellite can operate on a number of frequency bands, known as transponder channels or transponders.
- Geosynchronous orbit (35,784 km).
- Satellites cannot be too close to each other, to avoid interference.
- The current standard requires a 4 degree displacement in the 4/6 GHz band and a 3 degree displacement in the 12/14 GHz band.

- This limits the number of available satellites.

Broadcast radio: Physical description:
- Omnidirectional transmission.
- No need for dish antennae.

2. Describe the following Medium Access Control sub-layer multiple access protocols: A) Pure ALOHA B) Slotted ALOHA

Pure ALOHA or Unslotted ALOHA: The ALOHA network was created at the University of Hawaii in 1970 under the leadership of Norman Abramson. The Aloha protocol is an OSI layer 2 protocol for LAN networks with a broadcast topology. The first version of the protocol was basic:
- If you have data to send, send the data.
- If the message collides with another transmission, try resending later.

Figure 7.3: Pure ALOHA

Figure 7.4: Vulnerable period for a frame

The Aloha protocol is an OSI layer 2 protocol used for LANs. A user is assumed to be always in one of two states: typing or waiting. The station transmits a frame and checks the channel to see if the transmission was successful. If so, the user sees the reply and continues to type. If the transmission was not successful, the user waits and retransmits the frame over and over until it has been successfully sent. Let the frame time denote the amount of time needed to transmit the standard fixed-length frame. We assume that there are infinitely many users, generating new frames according to a Poisson distribution with mean N frames per frame time. If N > 1, the users are generating frames at a higher rate than the channel can handle, and nearly every frame will suffer a collision; hence the useful range is 0 < N < 1. When collisions occur, retransmitted frames are added to the new frames awaiting transmission. Consider, then, the probability of k transmission attempts per frame time, where attempts include both new frames and frames given for retransmission. This total traffic is also Poisson-distributed, with mean G per frame time.

At low load (N approximately 0) there will be few collisions, hence few retransmissions, so G is approximately N. At high load (N >> 1) there are many retransmissions, so G > N. Under all loads, the throughput S is just the offered load G times the probability P0 of a successful transmission: S = G * P0.

The probability that k frames are generated during a given frame time is given by the Poisson distribution:

Pr[k] = (G^k * e^-G) / k!

So the probability of zero frames is just e^-G. In pure ALOHA the vulnerable period is two frame times, so the relevant arrival process has an average of 2G arrivals per two frame times; the lambda parameter in the Poisson distribution therefore becomes 2G. Hence the probability that no other frame is sent during the vulnerable period is P0 = e^-2G, and the throughput is

S = G * e^-2G

S is maximized at G = 0.5, giving a maximum throughput of 1/(2e), approximately 0.184, i.e. 18.4%. Pure Aloha thus has a maximum throughput of about 18.4%, which means that about 81.6% of the total available bandwidth is essentially wasted due to losses from packet collisions.

Slotted ALOHA: An improvement to the original Aloha protocol was Slotted Aloha. In 1972, Roberts published a method to double the throughput of pure ALOHA by using discrete timeslots. His proposal was to divide time into discrete slots, each corresponding to one frame time. This approach requires the users to agree on the frame boundaries; to achieve synchronization, one special station emits a pip at the start of each interval, like a clock. A station can send only at the beginning of a timeslot, so collisions are reduced: the vulnerable period shrinks to a single slot, the average number of aggregate arrivals in a vulnerable period is G, and the lambda parameter of the Poisson distribution becomes G. The throughput is therefore S = G * e^-G, and the maximum is reached for G = 1, raising the capacity of slotted ALOHA to a maximum throughput of 36.8%. The throughput for pure and slotted ALOHA systems is shown in figure 7.5.

Figure 7.5: Throughput versus offered load traffic

With Slotted Aloha, a centralized clock sends out small clock-tick packets to the outlying stations, which are allowed to send their packets only immediately after receiving a clock tick. If there is only one station with a packet to send, this guarantees that there will never be a collision for that packet. On the other hand, if there are two stations with packets to send, this algorithm guarantees that there will be a collision, and the whole of the slot period up to the next clock tick is wasted. With some mathematics, it is possible to demonstrate that this protocol does improve the overall channel utilization, by reducing the probability of collisions by a half. It should be noted that Aloha's characteristics are still not much different from those experienced today by Wi-Fi and similar contention-based systems that have no carrier-sense capability: there is a certain amount of inherent inefficiency in these systems, and it is typical to see their throughput break down significantly as the number of users and message burstiness increase. For these reasons, applications that need highly deterministic load behavior often use token-passing schemes (such as Token Ring) instead of contention systems; ARCNET, for instance, is very popular in embedded applications. Nonetheless, contention-based systems also have significant advantages, including ease of management and speed in initial communication. Slotted Aloha is used on low-bandwidth tactical satellite communications networks by the US military, on subscriber-based satellite communications networks, and in contactless RFID technologies.
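The two throughput formulas, S = G * e^-2G for pure ALOHA and S = G * e^-G for slotted ALOHA, can be verified numerically at their respective optimal loads:

```python
import math

def throughput_pure(g):
    """Pure ALOHA: S = G * e^(-2G); the vulnerable period is two frame times."""
    return g * math.exp(-2 * g)

def throughput_slotted(g):
    """Slotted ALOHA: S = G * e^(-G); the vulnerable period is one slot."""
    return g * math.exp(-g)

print(round(throughput_pure(0.5), 3))    # 0.184 -> 18.4% at G = 0.5
print(round(throughput_slotted(1.0), 3)) # 0.368 -> 36.8% at G = 1
```

Evaluating either function away from its optimum (say at G = 2) shows the characteristic collapse of throughput under heavy offered load visible in figure 7.5.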

3. Discuss the different types of noise with suitable examples. Noise: Noise is a third impairment. It can be defined as unwanted energy from sources other than the transmitter. Thermal noise is caused by the random motion of the electrons in a wire and is unavoidable. Consider a signal as shown in figure 3.5, to which the noise shown in figure 3.6 is added, perhaps in the channel.

Figure 3.5: Signal

Figure 3.6: Noise

Figure 3.7: Signal + Noise

At the receiver, the signal is recovered from the received waveform, as shown in figure 3.7; that is, the signal is reconstructed by sampling. An increased data rate implies "shorter" bits, with higher sensitivity to noise.

Sources of noise

Thermal: Caused by the agitation of electrons in conductors, and a function of the temperature. It is often referred to as white noise, because it affects the different frequencies uniformly. The thermal noise power in a bandwidth W is

N = k * T * W

where T is the temperature in kelvin and k is Boltzmann's constant, 1.38 x 10^-23 joules/kelvin.
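The formula N = kTW can be evaluated directly. The room temperature of 290 K and the 1 MHz bandwidth below are illustrative assumptions, not values from the text:

```python
def thermal_noise_watts(temp_kelvin: float, bandwidth_hz: float) -> float:
    """Thermal noise power N = k * T * W, with k = Boltzmann's constant."""
    k = 1.38e-23  # joules per kelvin
    return k * temp_kelvin * bandwidth_hz

# Thermal noise in a 1 MHz bandwidth at room temperature (290 K):
print(thermal_noise_watts(290, 1e6))  # about 4.0e-15 W
```

The result is tiny in absolute terms, which is why noise matters only relative to the signal level, as the SNR discussion below makes precise.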

Signal-to-noise ratio: SNR = S/N, the ratio of signal power to noise power, usually expressed in decibels as SNR(dB) = 10 log10(S/N).

It is typically measured at the receiver, because that is the point where the noise must be removed from the signal.
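The decibel figures quoted earlier follow from SNR(dB) = 10 log10(S/N); a minimal sketch, with illustrative power ratios:

```python
import math

def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels: 10 * log10(S/N)."""
    return 10 * math.log10(signal_power / noise_power)

# A signal 1000 times stronger than the noise is a 30 dB SNR:
print(round(snr_db(1000.0, 1.0), 1))  # 30.0
# Every factor of 10 in the power ratio adds 10 dB:
print(round(snr_db(100.0, 1.0), 1))   # 20.0
```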

Intermodulation: Results from the interference of different frequencies sharing the same medium. It is caused by a component malfunction or by a signal of excessive strength. For example, the mixing of signals at frequencies f1 and f2 might produce energy at the frequency f1 + f2. This derived signal could interfere with an intended signal at frequency f1 + f2.

Cross talk: Cross talk is noise in which a foreign signal enters the path of the transmitted signal; it is caused by the inductive coupling between two wires that are close to each other. Sometimes when talking on the telephone you can hear another conversation in the background: that is cross talk.

Impulse: Noise due to irregular disturbances, such as lightning or flawed communication elements. It is a primary source of error in digital data.

4. What is Non-repudiation? Define cryptanalysis. Non-repudiation: Non-repudiation, or more specifically non-repudiation of origin, is an important property of digital signatures: an entity that has signed some information cannot at a later time deny having signed it. Similarly, access to the public key alone does not enable a fraudulent party to fake a valid signature. Digital signatures and certificates provide non-repudiation because they guarantee the authenticity of a document or message; as a result, the sending party cannot deny having sent it (cannot repudiate it). Non-repudiation can also be used to ensure that an e-mail message was opened. Example: how does a merchant prove that an order was actually placed by the customer?

Cryptanalysis: From top-secret military files to the protection of private notes between friends, various entities over the years have found themselves in need of disguises for their transmissions, for many different reasons. This practice of disguising or scrambling messages is called encryption, and the main constraint on cryptography is the ability of the code to perform the necessary transformation. Cryptanalysis is the complementary art of breaking such codes: recovering the plaintext or the key by analyzing the ciphertext, without being given the key.

In cryptography, a digital signature or digital signature scheme is a type of asymmetric cryptography used to simulate the security properties of a handwritten signature in digital form. Digital signature schemes normally give two algorithms: one for signing, which involves the user's secret or private key, and one for verifying signatures, which involves the user's public key. The output of the signature process is called the "digital signature."

5. Explain mask-address pair used in update message. Discuss importance of path attributes. Update message

Figure 7.4 BGP UPDATE message format

After establishing a TCP connection, BGP peers send OPEN messages and acknowledge them. They then use UPDATE messages to advertise new destinations that are reachable, or to withdraw previous advertisements when a destination has become unreachable. The UPDATE message format is shown in Figure 7.4. Each UPDATE message is divided into two parts:
1. A list of previously advertised destinations that are being withdrawn
2. A list of new destinations being advertised
Fields labeled "variable" do not have a fixed size. An UPDATE message contains the following fields:
WITHDRAWN LEN: a 2-byte field that specifies the size of the withdrawn-destinations field. If there are no withdrawn destinations, its value is 0.
WITHDRAWN DESTINATIONS: the IP addresses of the withdrawn destinations.
PATH LEN: similar to WITHDRAWN LEN, but it specifies the length of the path attributes associated with the new destinations being advertised.

PATH ATTRIBUTES: gives additional information about the new destinations; it is discussed in detail below.
DESTINATION NETWORKS: the new destinations.

Compressed mask-address pairs: An UPDATE message lists many addresses, and each destination requires both an address and its associated mask, so the message can grow large. To reduce the size of the UPDATE message, BGP uses a compressed mask-address pair instead of a full 32-bit IP address plus a 32-bit mask: the mask is encoded into a single octet that precedes the address. The format of this compressed mask-address pair is shown in figure 7.5.

Figure 7.5 BGP compressed format

The single octet LEN specifies the number of bits in the mask, assuming a contiguous mask. The address that follows is also compressed: only those octets covered by the mask are included. The number of octets in the address field, depending on the value of LEN, is listed in table 7.1. A length of zero is used for the default route.

Table 7.1
LEN          Number of octets in address
1 to 8       1
9 to 16      2
17 to 24     3
25 to 32     4

Path attributes: Path attributes are factored out to reduce the size of the UPDATE message; that is, the attributes apply to all the destinations in the message. If some destinations have different attributes, they must be advertised in a separate UPDATE message. The path attributes field contains a list of items, each of the form given in figure 7.6(a).
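Table 7.1 amounts to ceil(LEN/8) address octets following the LEN octet. A small sketch (the helper name is ours, not BGP's):

```python
def compressed_pair_size(mask_len: int) -> int:
    """Total octets for a compressed mask-address pair:
    1 octet for LEN plus ceil(LEN/8) address octets (table 7.1)."""
    address_octets = (mask_len + 7) // 8  # integer ceiling of LEN/8
    return 1 + address_octets

print(compressed_pair_size(24))  # 4  (1 LEN octet + 3 address octets)
print(compressed_pair_size(0))   # 1  (default route: LEN octet only)
```

For a /24 destination this is 4 octets instead of the 8 an uncompressed 32-bit address plus 32-bit mask would take, which is the point of the encoding.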

(type, length, value)

Figure 7.6 (a) path attribute item

The type is 2 bytes long. The format of the type field of a path-attribute item is given in figure 7.6(b).

Figure 7.6 (b) BGP 2-octet type field of path attribute

Flag bit   Description
0          Zero for a required attribute, set to 1 if optional
1          Set to 1 for transitive, zero for non-transitive
2          Set to 1 for complete, zero for partial
3          Set to 1 if the length field is 2 octets, zero for one
5-7        Unused and must be zero

Figure 7.7 Flag bits of type field of path attribute

Type code  Description
1          Specifies the origin of the path information
2          List of ASs on the path to the destination
3          Next hop to use for the destination
4          Discriminator used for multiple AS exit points
5          Preference used within an AS
6          Indication that routes have been aggregated
7          ID of the AS that aggregated the routes

Figure 7.8 Type codes of type field

The values of the flag bits and type codes of the type field of a path-attribute item are given in figure 7.7 and figure 7.8 respectively. The type field is followed by a length field that may be either 1 or 2 bytes long, depending on flag bit 3, which specifies the size of the length field. The contents of the length field give the size of the value field.

Importance of path attributes:

1. Path information allows a receiver to check for routing loops. The sender can specify the exact path through ASs to the destination; if any AS is listed more than once, there is a routing loop.
2. Path information allows a receiver to implement policy constraints. A receiver can examine the path to ensure it does not pass through untrusted ASs.
3. Path information allows a receiver to know the source of all routes.
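Following the flag-bit assignments of figure 7.7 (taken as this text presents them, with bit 0 the most significant bit of the octet; this is a sketch, not an RFC-exact decoder), the flag portion of the type octet can be unpacked as:

```python
def parse_attr_flags(flag_octet: int) -> dict:
    """Decode the flag bits of a BGP path-attribute type field,
    with bit 0 as the most significant bit of the octet."""
    return {
        "optional":        bool(flag_octet & 0x80),  # bit 0: 1 = optional
        "transitive":      bool(flag_octet & 0x40),  # bit 1: 1 = transitive
        "complete":        bool(flag_octet & 0x20),  # bit 2: 1 = complete (per fig. 7.7)
        "extended_length": bool(flag_octet & 0x10),  # bit 3: 1 = 2-octet length field
    }

# A required (well-known), transitive attribute with a 1-octet length:
print(parse_attr_flags(0x40))
```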

6. Explain the following with respect to E-Mail: A) Architecture B) Header format
A) Architecture: An e-mail system normally consists of two subsystems:
1. the user agents
2. the message transfer agents
The user agents allow people to read and send e-mail. The message transfer agents move the messages from source to destination. The user agents are local programs that provide a command-based, menu-based, or graphical method for interacting with the e-mail system. The message transfer agents are daemons, processes that run in the background; their job is to move e-mail through the system. A key idea in e-mail systems is the distinction between the envelope and its contents. The envelope encapsulates the message: it contains all the information needed for transporting the message, such as the destination address, priority, and security level, all of which are distinct from the message itself.

B) Header format: The header is separated from the body by a blank line. Envelopes and messages are illustrated in figure 8.1.

Fig. 8.1: E-mail envelopes and messages

The message header used in the example of figure 8.1

consists of the following fields:
From: The e-mail address, and optionally the name, of the sender of the message.
To: One or more e-mail addresses, and optionally names, of the receivers of the message.
Subject: A brief summary of the contents of the message.
Date: The local time and date when the message was originally sent.
An e-mail system based on RFC 822 contains the message header shown in figure 8.2. The figure gives the fields along with their meanings.

Fig. 8.2: Fields used in the RFC 822 message header

The fields in the RFC 822 message header that relate to message transport are given in figure 8.3. The figure gives the fields along with their meanings.

Fig. 8.3: Header Fields related to message Transport
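As a sketch of the header format described above, an RFC 822-style message (header fields, then a blank line, then the body) can be parsed with Python's standard email module; the addresses and subject below are hypothetical:

```python
from email.parser import Parser

# A minimal RFC 822-style message: header, blank line, body.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Lunch\n"
    "Date: Mon, 01 Jan 2024 12:00:00 +0000\n"
    "\n"
    "See you at noon.\n"
)

msg = Parser().parsestr(raw)
print(msg["From"])        # alice@example.com
print(msg["Subject"])     # Lunch
print(msg.get_payload())  # the body text after the blank line
```

Note how the parser relies on exactly the convention stated above: everything before the first blank line is header fields, everything after it is the body.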