Network

The Core FAQ File
By:-
-- 03MX
(2003-2006)
This watermark does not appear in the registered version - http://www.clicktoconvert.com
2
Table of Contents:- Pages
I) Computer Networks - 3 - 40
II) Database Management System - 41 - 82
III) Operating System - 83 - 129
IV) Object Oriented Programming - 130 - 177
V) Data Structures and Algorithms - 178 - 216
3
Part – I
COMPUTER NETWORKS
4
Topics Covered:-
i) Introduction to Network and Physical Layer – Nigitha S J
ii) Data Link Layer and Medium Access Control Sublayer
-- Jeevan Prakaash D
iii) Network Layer and Transport Layer – Sameera S, Gohila G
iv) Application Layer & Network Security –Raghu shankar R
v) TCP/IP - Sree Vidhya B
5
(i)
INTRODUCTION TO NETWORKS AND PHYSICAL LAYER
1. What is Computer Networks?
Interconnection of autonomous computer . One system cannot force other system to shut
down.
2. What is Data Communication?
Data Communication is the exchange of data between two devices via some form of
transmission medium.
3. What is Protocol?
Set of rules that govern data communication.
4. What is Line Configuration?
Line Configuration refers to the way two or more communication devices attach to a link.
5. What is Link?
Link is the physical communication pathway that transfers data from one device to
another.
6. State the Categories of line configuration & explain them.
i. Point-to-Point
A Point-to-Point line configuration provides a dedicated link between two
devices.
ii. Multipoint
A multipoint line configuration is one in which more than two specific devices
share a single link.
7. What is Topology?
Pattern of interconnection of nodes or system..
8. What is Mesh Topology? State its advantage & disadvantage.
In a Mesh Topology, every device has a dedicated point-to-point link to every other
device.
Advantages:
6
1] The use of dedicated links, eliminates the traffic problems.
2] It is robust.
3] Another advantage is privacy or security.
Disadvantages:
The main disadvantage of a mesh is related to the amount of cabling & the number of I/O
ports required.
9. What is Star Topology?
In a star topology, each device has a dedicated point-to-point link to central controller,
called a hub.
10. Define the term Transmission mode.
It is used to define the direction of signal flow between two linked devices.
11. Explain the three types of transmission modes.
1] In Simplex mode, the communication is unidirectional. Only one of the two
stations on a link can transmit, the other can only receive.
2] In Half-duplex, each station can both transmit and receive, but not at the same
time.
3] In Full-duplex mode, both stations can transmit and receive simultaneously.
12. Connection-Oriented Service
To use connection-oriented service, the user first establishes a connection, uses the
connection and then releases the connection. The sender sends the data in one end, and
the receiver takes them out in the same order at the other end.
13. Connectionless Service
In connectionless service, each message carries the full destination address and each
one is routed through the system independent of all the others. Here the Order of injection
is not same as order of delivery.
14. Define Gateway.
7
A device used to connect two separate networks that use different communication
protocols.
15. Define Repeater.
A device used in a network to strengthen a signal as it is passed along the network
cable . It works at the Physical Layer of the OSI model.
A device that receives, amplifies, and retransmits a signal. Consists of a transceiver
(transmitter and receiver). It is used to boost signal levels to extend the system distance .
16. Define Router.
A communications device between networks that determines the best path between them
for optimal performance. Routers are used in complex networks of networks such as
enterprise-wide networks and the Internet.
17. Define Switch.
A network traffic monitoring device that controls the flow of traffic between multiple
network nodes.
18. What is the function of Physical Layer?
1] It transmits a bit stream over a physical medium.
2] It defines the characteristics of the interface between the devices and the
transmission medium.
3] It also defines the type of transmission medium.
4] It defines the direction of transmission between two devices i,e; transmission
mode.
19. Data rate.
The number of bits sent each second is known as transmission rate or data rate.
Note:
Physical Layer is one of the network support layers.
Physical Layer is reliable bit pipe.
Physical Layer -the first layer of the OSI model, responsible for the mechanical and
electrical specifications of the medium.
8
(ii)
DATA LINK LAYER AND MEDIUM ACCESS CONTROL
SUBLAYER
1) What are the functions of Data Link Layer?
The functions of Data Link Layer are to provide
1) Well define interface with Network layer
2) Deal transmission errors
3) Regulate the flow of Data
2) What is a frame? What does it compose of?
The Data link layer takes the packets it gets from the network layer and
encapsulates them into frames. Each frame consists of
1) Frame Header
2) Payload Field ( for holding the packet)
3) Trailer
3) What are the services provided by Data Link Layer to the Network Layer?
When are they used?
Three services are provided to the Network Layer which include :
1) Unacknowledged Connectionless service
->Used when error rate is low & when recovery is done by higher layers
->Used for real time traffic where late data is worse than bad data
2) Acknowledged Connectionless Service
-> Used for unreliable channels; such as wireless systems
3) Acknowledged Connection oriented systems
->Used for reliable channels
4) How is the error detected in the Data Link Layer?
The error is detected by the sequence of steps which are :
1) Break the bit stream into discrete frames and compute check sum
2) At destination recompute checksum when frame reaches
3) If both are the same, error has ocurred
5) What are the different framing methods available with Data link layer?
The different framing methods include:
1) Character Count : Uses a field in the header to specify the no of
chacracters in a frame. The destination now knows how many characters
follow and where the end of each frame is.
9
2) Flag bytes with byte stuffing: Start and end of frames are FLAG bytes. 2
consecutive FLAG bytes denote a frame.
Disadvantage : FLAG byte could also occur as data
Rectified by: Stuffing ESC character before the byte (byte stuffing)
3) Starting and ending flags with bit stuffing : Here , a unique byte
“01111110” is used as frame delimiter.
Disadvantage: The pattern of flag could also occur as data
Rectified by: Stuffing 0 after 5 consecutive 1’s (Bit Stuffing)
4) Physical layer coding violation : This is applicable only for those n/wks
where encoding in physical medium contains some redundancy
6) What are the methods of Flow Control?
The various methods of Flow Control are:
1) Feed Back based Flow Control: Receiver sends back information to
sender to permitting to send more data.
2) Rate based Flow Control: Here the protocol has a built in mechanism that
limits the rate at which sender transmits data without feedback from
receiver.
7) What are the strategies in Error handling? When are they used?
The strategies are :
1) Error Detection:
Used in highly reliable channels ( as retransmission is cheaper)
2) Error Correction:
Used in unreliable wireless links that make may errors
8) What is a codeword?
A frame of ‘m’ bits of data (message) has ‘r’ redundant bits (check bits). The bit
unit containing data and check bits is called as codeword.
Therefore, codeword n = m + r
9) What is Hamming distance? What is the distance required
i) to detect ‘d’ errors and ii) to correct ‘d’ errors?
The number of bit positions in which 2 codewords differ is called as Hamming
Distance.It is obtained by ORing 2 codewords
10) How do you find the Hamming distance of a complete code?
From the list of legal codewords, find 2 codewords whose hamming distance is
minimum. This is the hamming distance for the complete code.
10
11) Given the list of codewords:
0000000000,0000011111,1111100000,1111111111
Comment on Hamming distance and Error Control Strategy.
The Hamming distance of the complete code = 5
(ORing 0000000000 , 0000011111).
Now 2d + 1 = 5; Therefore, this can correct double errors.
Now , if the codeword 0000000111 arrives, then receiver knows that original code
word must have been 0000011111
12) How to decide on the no of redundant bits ‘r’ to be assigned to a message
‘m’?
This is decided based on formula
m+r+1 <= 2 power r;
13) What is the limitation of Hamming code?
They could correct only single bit errors. They can be used to correct burst errors but
then they can correct only one error per code word.
14) What does a burst error imply?
A burst error does not imply that all the bits are wrong but it implies that at least the
first and last are wrong.
15) Brief about Cyclic Redundancy.
This is basically a polynomial code. Here we have,
Message to be sent : M(x) called the Message polynomial
Agreed upon Polynomial : G(x) called the Generator Polynomial
Calculated value : T(x) called the Transmission Polynomial.
Error : E(x) called the Error Polynomial
Receiver Calculates : ( T(x) + E(x)) /G(x)
Results
1) If G(x) contains 2 or more terms – single bit errors are detected.
2) if there are 2 single bit errors
E(x) = x power i + x power j ; where i> j
if G(x) is not divisible by x ,then the sufficient condition that all the double bit
errors are detected is : G(x) does not divide x power k + 1 for k up to max
value of i-j .
3) If there are odd number of bits in error then E(X) contains odd number of
terms. Polynomial with odd number of terms must be divisible by x + 1.
11
4) A polynomial code with r checkbits will detect all the burst errors of
length < r;
16) What are the fields in a frame?
The frame consists of 2 fields viz., kind,seq,ack,info.
Of these the first three contain control info and are called as Frame headers.
kind – tells whether there is data in a frame.
seq – indicates the sequence number.
ack – represents the acknowledgement number
info – represents the actual data
17) What does the destination Data link layer do as son as it receives a frame?
This extracts the packet from the frame and passes the packet to destination network
layer.
18) What are the different elementary Data link layer protocols? What are the
assumptions made in each protocol?
The different elementary Data link layer protocols are :
1) Unrestricted Simplex Protocol (Unrealistic protocol)
Assumptions
1.1) Data Transmission is simplex
1.2) Sender’s and Receiver’s n/w layers are ready
1.3) Process time ignored
1.4) Infinite buffer space.
1.5) Error free communication protocol.
2) Simplex Stop and Wait Protocol
Assumptions
2.1) Restricted Buffer Space
2.2) Error prone channel
2.3) Receiver sends dummy frame as ack for synchronization.
3) Simplex Protocol for Noisy Channel
Assumptions
3.1) Restricted Buffer.
3.2) Error prone channel
3.3) Sender also adds sequence number for each frame to prevent duplication.
3.4) Retransmissions are done based on time out of the timers
19) What is a PAR or ARQ?
12
PAR stands for Positive Acknowledgement with Retransmission and
ARQ stands for Automatic Repeat Request.
Both of them actually refer to protocols in which sender waits for a Positive
Acknowledgement before advancing to next data item.
20) What is piggy backing? What is its advantage? What is the complication it
introduces?
The technique of temporarily delaying outgoing acknowledgements so that they can be
hooked to next outgoing data frame is called Piggy Backing.
The principal advantage it brings is the better use of available channel bandwidth.
The complication it introduces is how long the data link layer must wait to send piggy
backed acknowledgements. If it is more than sender’s time out period , frame
retransmitted.
21) What are Sliding Window Protocols?
They are bidirectional protocols. At any instant of time , sender maintains a set of
sequence numbers corresponding to frames it is permitted to send. The sender’s window
and the receiver’s window need not have the same lower and upper limit or even have the
same size.
22) What is a one-bit sliding window protocol?
They refer to the Sliding Window Protocol with maximum window size 1. Such protocols
uses stop and wait since sender transmits a frame and waits for its
acknowledgement before sending next one.
23) What is Pipelining?
The time required for the ack frame to come back to sender is called Round Trip Delay.
The product of Bandwidth and Round Trip delay (bandwidth * Round Trip Delay)
basically tells what the capacity of pipe is and the sender needs the ability to fill it
without stopping in order to operate at higher efficiency. This technique is known as
Pipelining.
24) What are the approaches to deal with errors in presence of Pipelining?
There are basically 2 approaches which are
a) Go Back n : This is for the receiver to simply discard all the subsequent frames,
sending no acknowledgement for discarded frames.
b) Selective Repeat: The bad frame received is discarded but the good frames
received after it is buffered.
13
25) What is NAK? What does NAK 2 signify in a) Go Back n and b) Selective
Repeat ?
NAK generally stands for Negative Acknowledgement. It is sent by the receiver to the
sender whenever the receiver encounters an error like checksum error and frame out of
sequence.
a) NAK 2 in Go Back n would cause the sender to retransmit all the frames from
frame no 2
b) NAK 2 in Selective Repeat would cause the sender to retransmit only frame no 2
26) What is a finite state model?
This is a concept n which each protocol machine (sender or receiver) is always in a
specific state at every instant of time. The state consists of all values of its variables,
including the program counter. The no of states is 2 power n , where n is the number
of bits required to represent all the variables combined.
27) When does a transition occur for a protocol machine?
Transition occurs when a frame is sent, when a frame is sent, when a frame arrives,
when timer expires , when an interrupt occurs etc.
28) What are the events associated with a channel?
Typical events are
i) Insertion of New Frame on to channel by a protocol machine
ii) Delivery of a frame by a protocol machine
iii) Loss of frame due to noise
29) What is Reachability Analysis? What is its significance?
From the initial state ( ie the state corresponding to the description of the system
when it starts running), it could be possible to determine which states are reachable
and which are not. This technique is called Reachability Analysis. Its significance is
determining whether a protocol is correct or not.
30) What is Deadlock?
This is the situation in which the protocol makes no forward progress (ie deliver
packets to n/w layer) no matter what sequence of events happen. In graph model,
deadlock is characterized by existence of subset of states that is reachable from initial
state and has 2 properties:
i) There is no transition out of subset.
ii) There is no transition in subset that cause forward progress.
14
31) What are the errors found in Reachability Analysis?
The errors found are as follows:
1) Incompleteness : It is possible for a certain frame to occur in a certain state and
finite state machine does not say what action should be taken, the specification is
an error.
2) Deadlock
3) Extraneous Transition: It says of how to handle event in a state in which event
cannot occur
32) What is Petri Net Model? What are its constituents?
This model , like finite state machine is a technique for formally specifying protocols.
A Petri Net has four elements namely :-
Place – Represents the states which the system may be in.
token – Represents in which state currently the system may be in
Transition- indicated by horizontal or vertical line
Arcs – Both the input and output arcs.
33) Brief about working of Petri Net Model.
The Working of this model is as follows:
- Transition is enabled if there is atleast 1 input token
- Any enabled transition may fire t will
- if 2 or more transition are enabled, any one of them may fire.
The choice of the transition to fire is indeterminate. That is why they are useful
for modeling protocols.
THE MEDIUM ACCESS CONTOL SUBLAYER
Broadcast channels are also called as multi-access channels or random access channels.
E.g.: A conference call involving connection of several telephone lines.
1. What do you understand by medium access control (MAC) sublayer?
The protocols used to determine who goes next on a multi-access channel belong to a
sublayer of datalink layer called the MAC.
Note:
Many of LANs use multi-access channel as basis for communication.
WAN uses point-to- point links, except for satellite networks.
15
2. What is frequency division multiplexing (FDM)?
It is a technique of multiple-access channel where if there are N users, the bandwidth is
divided into N equal sized portions, each user being assigned a portion.
What are the advantages and disadvantages of frequency division multiplexing(FDM)?
ADVANTAGES:
There is no interference between users.
When there is small and constant number of users, FDM is simple and effective
mechanism.
DISADVANTAGES:
When some users are quiescent , their bandwidth is simply lost.
They are not applicable to bursty traffics.
Permissions will be denied if there are N users.
3. What are the key assumptions in dynamic channel allocation in LANs and
MANs.
There are 5 key assumptions which are:
Station Model:
The model consists of N independent stations or terminals. Once frames are generated
here, station is blocked and does nothing until frame successfully transmitted.
Single channel assumptions.
Collision assumption:
When transmitted frames overlap in time, resulting signal is garbled. This event is called
collision.
Carrier Sense: Stations can tell if the channel is in use before using.
No carrier sense: Stations cannot sense the channel before using.
4. What is a “Carrier”?
It refers to the electric signal on the cable.
What are the multiple access protocols you are familiar with?
1. ALOHA – Abramson’s Local Hawaii Networks.
2. Carrier Sense Multiple Access Protocols.
3. Collision free protocols.
4. Limited contention protocols.
5. Wavelength division multiple access protocols.
6. Wireless LAN protocols.
5. What are the two versions of ALOHA?
16
They are pure ALOHA and slotted ALOHA.
The former requires global time synchronization and the latter doesn’t.
6. What do you understand from pure ALOHA?
Users transmit whenever there is a data to be sent. Whenever there is a collision,
colliding frames are damaged. Sender keeps listening to the channel, if frame is destroyed
sender waits a random amount of time and sends it again.
7. What are contention systems?
Systems in which multiple users share a common channel in a way that can lead to
conflicts are widely known as contention systems.
8. What is slotted ALOHA technique?
Here the time is divided into discrete interval, each interval corresponding to one frame.
Here a computer is not permitted to send whenever carriage return is typed but is
requested to wait for the beginning of next slot.
9. What are carrier sense protocols?
In local area networks, it’s possible for stations to detect what other stations are doing
and behave accordingly. protocols in which stations listen for a carrier (ie. A
transmission) and act accordingly are called carrier sense protocols.
10. What are persistent, non-persistent, p-persistent CSMA?
Persistent:
When station has data to send , it listens to channel to find whether it is busy.
If busy then station waits until it becomes idle.
When channel idle, station transmits a frame.
If collision occurs, each station waits for a random amount of time and starts all
over again.
Here the station transmits with a probability 1 when channel is idle.
Non-persistent:
Same as former but difference is that:
Whenever the channel is busy, the station does not continually monitor channel
but waits for a random period of time and repeats the algorithm.
P-persistent:
This appears to slotted channels.
When station ready to send and channel idle, transmits with a probability “p”.
With probability of q=1-p, it differs until the next slot.
17
11. What is CSMA/ CD?
This stands for carrier sense multiple access with collision detection. whenever the
collisions occur with CSMA and when they are detected, then the transmission is aborted.
The CSMA/CD protocol is widely used in LANs in the MAC sublayer.
12. How are collisions detected?
They are detected by looking at power or pulse width of the received signal and
comparing it with transmitted signal.
18
(iii)
NETWORK LAYER AND TRANSPORT LAYER
1. Which is the lowest layer that deals with end to end transmission?
Network Layer.
2. Goals of network layer:
The services should be independent of the router technology.
The transport layer should be shielded from the number, type and topology
of the router present.
The network addresses made available to the transport layer should use a
uniform numbering plan, even across LAN’s and WAN’s.
3. What is meant by Datagram Subnet?
If connectionless service is offered in n/w layer, packets are injected into the
subnet individually and routed independently of each other. No advance setup is
needed. These packets are frequently called datagram. And the subnet is called
Datagram Subnet.
4. What is known as Virtual Circuit?
In connection oriented service, a path from the source router to the destination
router must be established before any data packet can be sent. This connection is
called a Virtual Circuit.
5. What is meant by Non – Adaptive Algorithm?
They do not base their routing decisions on measurements or estimates of the
current traffic and topology. Routing information is computer in advance off –line
and downloaded to the routers when the n/w is booted. It is said to be Static
Routing.
Ex: Shortest Path routing.
6. What is meant by Adaptive Routing Algorithm?
They change their routing decisions to reflect changes in the topology. It is said to
be Dynamic Routing.
Ex: Distance Vector Routing, Link State Routing.
7. What are the key differences between Distance Vector Routing and Link
State Routing?
Distance Vector Routing:
a. It has knowledge about the whole network.
19
b. It routes only to its neighbors.
c. The information is shared at regular intervals.
d. The databases are different for each router.
Link State Routing:
a. It has knowledge about the neighboring nodes.
b. It can route to any routers.
c. The information is shared where there is a change in the neighbors.
d. The database is same for all routers.
8. What is known as congestion?
When too many packets are present in the subnet, performance degrades. It is said
to be congestion.
9. Difference between Congestion Control and Flow Control.
Congestion Control:
Congestion control has to do with making sure the subnet is able to carry the
offered traffic. It is a global issue, involving the behavior of all the hosts, all the
routers that store and forwarding processing within the routers.
Flow Control:
Flow Control relates to the pt. to pt. traffic b/w a given sender and receiver. It
makes sure that a fast sender cannot automatically transmit data faster than the
receiver is able to absorb. It gives some direct feedback from receiver to sender.
10. What are all the Congestion Prevention Policies?
Transport Layer:
Retransmission Policy.
Out – of order caching Policy.
Acknowledgement Policy.
Flow Control Policy.
Time out determination Policy.
Network Layer:
Packet queuing and service Policy.
Packets discard Policy.
Routing algorithm.
Packet lifetime management.
Data link Layer:
Retransmission Policy.
Out of order caching Policy.
Acknowledgement Policy.
Flow control Policy.
11. What is meant by Choke Packet?
Packet sent by the source to indicate that threshold had been crossed and the
source is going to be congested. That is said to be Choke Packet.
20
12. What are Internet Gateways?
Computers that inter connect two networks and pass packets from one to the other
is said to be Internet Gateway.
13. What is an IP (or) Internet Address?
A scheme analogous to physical n/w addressing in which each host on the internet
is assigned a 32 bit integer address is called IP Address.
14. What are Multi Homed Hosts?
A Conventional computer having two or more physical connection. They are
called Multi Homed Hosts.
15. Difference between ARP & RARP.
ARP: Address Resolution protocol is used to associate the 32 bit IP address with the
48 bit physical address, used by a host or a router to find the physical address of
another host. It performs Dynamic Address Resolution.
RARP: Reverse Address Resolution Protocol allows a host to discover is IP
addressing when it knows only its physical address.
16. What is all the protocol type in ARP/RARP message format?
• ARP Request.
• ARP Response.
• RARP Request.
• RARP response.
17. What is meant by Fragmentation?
The small pieces into which a datagram is divided are called fragments and the
process of dividing a datagram is known as Fragmentation.
18. What do you know about TTL?
Whenever a node injects a datagram into the internet it sets a maximum time that
the datagram should survive. Router and hosts that process datagram must
increment the TTL field as time pass. And it returns the datagram from the
internet when the time expires.
19. What is meant by Time Stamp?
Timestamps give the time and date at which a router handles the datagram,
expressed as milliseconds.
21
20. What is known as Direct and Indirect delivery?
Direct Delivery, the transmission of a datagram from one m/c across a single
physical network directly to another. Indirect Delivery occurs when the
destination is not on a directly attached n/w, forcing the sender to pass the
datagram to a router for delivery.
21. Why do we need ICMP?
A datagram travels from router to router until it reaches one that can deliver the
datagram directly into its final destination. If a router cannot route or deliver a
datagram or if the router detects an unusual condition that affects its ability to
forward the datagram, the router needs to inform the original source to take action
to avoid or correct the problem.
22. What is PING?
Packet Internet Groper, it’s a blind search. It’s a message generated by hosts
which runs IP application without knowing whether the destination is there or not.
PING is followed by an IP address.
23. What for the term Subnet Mask is used?
Subnet Mask along with the IP address is used to find the subnet id.
24. What is said to be CIDR?
Classless Inter Domain Routing, a technique followed in super netting, which treats
IP addresses as arbitrary integers and collapses a lock of contiguous class C number
to a given site or a network within a site. Hosts & Routers that use super netting
should have routing software that understands ranges of addresses.
25. What are the drawbacks of RARP?
a. Operates at low level using it requires direct access to the n/w hardware.
b. Requires a packet exchange b/w a client m/c and a computer that answers
to its request.
c. RARP uses a computer’s hardware address to identify the machine.
26. What is said to be Retransmission Policy?
In BOOTP data are carried in IP, hence data may be lost, corrupted, and out of order.
IP does not provide a checksum for data. To avoid corruption, BOOTP requires that
UDP uses checksum and set don’t fragment bit. Hence it uses the technique Time Out
and Retransmission.
22
27. Why do we go for Dynamic Host configuration Protocol?
If the number of physical computers exceeds the number of available IP host
addresses, static assignment incurs excessive overhead. Hence we go for DHCP
allows a computer to acquire all the configuration information to obtain an IP address
quickly and dynamically.
DHCP allows three types of address assignment
Manual Configuration.
Automatic Configuration.
Dynamic Configuration.
28. What are the DHCP Address Acquisition States?
• Initialize: When a client first boots.
• Select: After collecting a request message to the server.
• Request: After sending a request message to the server.
• Bound: When a client typically uses the IP address it has acquired.
• Renew: When 50% of the lease period expires, client into the renew
layer.
• Rebind: when 87.5% of the lease period expires.
29. By what characteristics we say that a system as an autonomous system?
1). An Autonomous System consists of a group of routers exchanging
information via a common routing protocol.
2). An Autonomous System is a set of routers and networks managed by a
single organization.
3). Except in times of failures an Autonomous System is connected that is,
there is a path between any two pair of nodes.
30. What is known as Interior Router Protocol & Exterior Router Protocol?
An Interior Router protocol passes routing information between routers with in an
autonomous system. Border Gateway Protocol is used to exchange routing
information among participating routers in multiple Autonomous Systems.
31. What are the services of Transport Layer Protocol?
End – To – End Delivery.
Addressing.
Reliable Delivery.
Flow Control.
Multiplexing.
23
32. Differentiate between Transport Layer and Data link Layer.
Data Link Layer Transport Layer
1. Explicit addressing of destination is not
needed.
1. Explicit addressing of destination is
needed.
2. No idea of storing the data packets. 2. The subnet can store packets for some
times and then delivered later.
3. Provides services with in single network. 3. Provides services across an inter network
made of many networks.
33. What is said to be End – To – End Delivery?
Network Layer does not see any relationship between the packets of data. It treats
each as an independent entity where as transport layer, packs all the incoming
packets properly (Concatenation) and makes sure that the delivery of an entire
message from source destination. It is said to be End – To – End Delivery.
34. What is called TSAP?
In Transport layer, the communication is not just from one end machine to other
but from one end application to other end application. Data generated by an
application in one network must be received not just by the other machine but by
the correct application in some other network. So we need to define transport
addresses to which processes can listen for this particular application. It is said to
be Service Access Points or Ports. This kind of addressing is done at the
Transport Layer. It is said to be Transport Service Access Points (TSAP).
35. What are the aspects of reliable delivery of Transport Layer?
a. Error Control: Data must be delivered to their destination exactly as they
originated from the source. Mechanisms for error handling at this layer are
called as error detection and retransmission with the error handling.
b. Sequence Control: It packs all the incoming packets properly according to
the sequence number and then combines them into a single data unit.
c. Loss Control: The Transport Layer ensures that all pieces of a
transmission at the destination. Sequence numbers allow the receiver’s
transport layer protocol to identify any missing segments and requester
delivery.
d. Duplication Control: Transport layer guarantee that number of pieces of
data arrive at the receiving system duplicated. Sequence numbers allow
the receiver to identify and discard duplicate segments.
36. What are the commonly used protocols in the Transport Layer?
TCP: Transmission Control Protocol
1. It is connection oriented.
2. Designed to provide reliable communication between pairs of processes
across a variety of reliable and unreliable networks.
24
UDP: User Datagram Protocol
1. It is connectionless.
2. Designed to service for application level procedures. Hence it is an
unreliable service; delivery and duplicate protection are not guaranteed.
37. What is said to be Active and Passive machines?
Clients that take service from the service from the server are said to be Active
Machines. They advertise their routes to others.
Server, that waits for the connection from the other side is said to be Passive
Machine. They listen and update their routes based on advertisement, but they do
not advertise.
38. Define Hop Count.
The number of hops along a path from a given source to the destination refers to
the number of routers that datagram encounters along the path.
39. What is a Segment? Why they are exchanged?
The unit of transfer between the TCP software on two machines is called a
Segment.
They are exchanged to
i. Establish connections.
ii. To transfer data.
iii. To send acknowledgement.
iv. To advertise window size.
v. To close connections.
40. What is the purpose of Pseudo Header in the TCP/UDP?
Pseudo Header allows the receiver to verify that the segment has reached its
correct destination, which includes both a host IP address as well as a protocol
number. Protocol number for TCP is 6 and UDP is 17. Pseudo header is used at
the time of check sum calculation.
41. How many hand shakes are used in TCP to establish and close connection?
To establish a connection: 3 way hand shake.
To close a connection: 2 way hand shake.
42. What is known as Window Advertisement?
Window Advertisement specifies how many octets of data the receiver is prepared
to accept. If the receiver’s buffer begins to become full, it sends a smaller window
advertisement. Hence the size of the window changes with time, it is also said to be
Zero Window Advertisement. If the buffer is full in receiver, then the size of the
25
window is 0, said to be Zero Window Advertisement, The advantage of this
advertisement is to provide flow control as well as reliable transfer.
43. What is “Urgent Mode”?
It is sometimes important that one end of a connection to send data out of band,
without waiting for the program at the other end of the connection to consume
octets already in the stream. The TCP allows the sender to specify data as urgent,
meaning that the receiving program should be notified of its arrival. It is said to be
“Urgent Mode”. When the Urgent Bit is set, the urgent pointer specifies the
position in the segment where urgent data ends.
44. What are the three types of addresses allowed by IPV6?
UniCast: An identifier for a single interface. A packet sent to a unicast
address is delivered to the interface identified by that address.
AnyCast: An identifier for a set of interfaces. A packet sent to any cast
address is delivered to one of the interfaces identified by that address.
MultiCast: An identifier for a set of interfaces. A packet sent to a multicast
address is delivered to all interfaces identified by that address.
46. What is a Link State Packet?
In the Link State Routing, when a router floods the network with the information
about its neighborhood, it is said to be advertising. The packet send during the
advertisement is said to be Link State Packet.
47. When a HELLO Protocol is used?
It uses time instead of distance to determine optimal routing. It is an alternative to
Routing Information Protocol (RIP).
48. What is a Sliding Window?
A sliding window is used to make data transmission more efficient as well as to
control the flow of data so that the receiver does not become overwhelmed.
Sliding Window used at the transport layer are usually byte oriented rather than
frame oriented.
26
(iv)
APPLICATION LAYER & NETWORK SECURITY
1. What is DNS?
DNS : Domain Name System
2. When many computers were connected together often host name conflicts were
used to occur.
To solve this problem domain name system was invented.
3. What is Resolver?
It is used to map the name of a system to an IP address. It is a library procedure.
4. What is domain?
Domain contains many hosts.
Domains can be sub divided in to sub- domains.
5. Explain some domain names and their usage ?
Com - commercial
Edu - educational institutions.
Gov - Us federal goverment
Mil - armed forces
Int – international organizations.
6. What are Resource records?
Every domain have resource records associated with it.
One example for such an resource record is the IP address.
Name servers:
A single name server can contain the entire DNS data base.
7. What is Recursive query?
When one server donesnt have the required information it searches for other servers and
gets back the needed information.
8. What are sub systems of Emails?
27
It contains mainly two sub systems
1) user agents :
which allows people to read and send email
2) message transfer agents:
which move the messages from the source to the destination.
9. What is MIME ?
Multi pur pose Internet mail extension.
Explain some MIME types?
10. Some of the mime types are,
Text
Image
Audio
Video
Application
Message
Multipart.etc
11. What is Message transfer?
Simple Mail Transfer Protocol is used for the mail transfer.
It is ASCII based.
12. What is POP?
Post office protocol
It is used to contact the message transfer agent and allow the email to be copied form the
ISP to the user.
13. Explain the actions of POP?
3 actions of pop are,
1> authorization
2> transactions
3> update.
14. What is IMAP?
Internet Message access protocol.
28
It doesnot delete the mails from the mail box after download instead it assumes that the
mail will exist indefinitely in multiple mail boxes.
15. What are Filters?
To filter the incoming mails based on the sender .
16. Explain www.
world wide web.
It is an architectural frame work for accessing linked documents spread out over millions
of machines all over the internet.
17. Details of W3c?
World wide web consortium
It was formed by CERN and MIT to for further developing the web, standizing the
protocols and encourage the inter operability between sites.
18. What is URL?
Uniform resource locators.
Web pages are named using resource locators.
19. Explain HTTP?
Hyper text transfer protocol.
It is the webs native protocol . only webs servers use this protocol.
20. What is URN?
Uniform resource names.
A urn is a taught of generalized URL.
21. Explain Cookies?
It is a small text file stored by the server containing the information about that of the
client with respect to the server.
22. What is a persistent cookie and non persistent ?
When a cookie expires in an certain amount of time specified in the date field then it is
said to be persistent cookie.
29
If the field is null then it is non persistent cookie.
23. Explain HTML?
Hypertext mark up language
It is an language used to produce web pages that include text, graphics and pointers and
other web pages.
24. What is Caching?
When one page is viewed often then the some amount of information about the page is
stored in the client. This is used for easy access in the subsequent hits.
It is called as caching.
25. What is Mirroring?
The server replicates its contents to improve the access. This technique is called as
mirroring.
26. Explain about Audio compression?
Mp3
MPEG audio layer 3 .
It is an very popular audio compression technique.
27. What is Cipher?
A cipher is a character for character or bit for bit transformation, without regard to the
linguistic structure of the message.
The general symbol for symbol substitution is called a mono alphabetic substitution.
Substitution ciphers preserve the order of the plaintext symbols but disguise them.
Transposition ciphers in contrast, re order the letters but do not disguise them.
28. What is One-time pad?
Choose an random bit string as key. Then convert the plain text in to bit string. Compute
xor between the two strings. The resulting cipher text cannot be broken because in an
large amount of cipher text each letter will occurs equally often, as will every diagram
and every tri gram and so on.
30
This is called as one – time pad.
29. What is DES?
Data encryption standard.
Cipher modes.
The different types of cipher modes available are,
Electronic code book mode
Cipher block chaining mode
Cipher feed back mode
Stream cipher mode
Counter mode
30. What is Usage of Digital signature?
Using Digital signature
The receiver can verify the claimed identity of the sender.
The sender cannot later repudiate the contents of the message.
The receiver cannot possibly have concocted the message himself.
31. Explain Message digest?
This scheme is based on the idea of one way hash function that takes an arbitrarily long
piece of plain text and from it computes a fixed length bit string.
The hash function is called as message digest.
Example
Md5
SHA-1
31
(v)
TCP/IP
1. What is an IP address?
An IP address is a unique 32-bit integer address assigned to each host on the
internet. Each address is a pair consisting of netted and the hostid. IP addresses
stand for connection to a network rather than specifying an individual computer.
2. IP Address:
starts with netid (bits) hostid (bits)
Class A 0 7 24
Class B 10 14 16
Class C 110 21 8
Class D 1110 Multicast address
Class E 11110 Reserved for future use
3. Can a host be part of more than one physical network?
Yes. Such hosts are called as Multi-homed hosts. Routers belong to this category.
These multi-homed hosts require multiple IP addresses, where each address
corresponds to one of the machine's network connections.
4. What are Limited broadcast and directed broadcast addresses?
A directed broadcast address uniquely identifies the target network in addition to
specifying broadcast on that network. It contains both the valid networkid and the
broadcast hostid. It allows a remote system to send a single packet that will be
broadcasted on the specified network..
A Limited broadcast address consists of all 1’s (255.255.255.255) and provides a
broadcast address for the local network independent of the assigned IP addresses.
A host uses this address as a part of start up procedure before it learns its IP
address.
NOTE:
A broadcast address has hostid with all bits set to 1.A network address has all bits
of hostid equal to 0.
5. What is address resolution protocol (ARP) and reverse address resolution
protocol (RARP)?
The address resolution protocol (ARP) allows a host to find the physical address
of a target host on the same physical network, given only the target's IP address.
32
ARP request is broadcasted to all the hosts in the network whereas ARP response
is sent only to the host who broadcasted the request.
Given a physical network address, to map it to an internet address RARP is used.
A diskless machine uses RARP to obtain its IP address from a server. The RARP
mechanism supplies the target machine's physical hardware address to uniquely
identify the processor and broadcasts the RARP request. The RARP servers
process the request and send a reply containing the target IP address.
6. What is an IP datagram?
The fundamental unit of information passed across any network utilizing Internet
protocol. An IP datagram contains source and destination addresses along with
data and a number of fields that define such things as the length of the datagram,
the header checksum, and flags that indicate whether the datagram can be (or has
been) fragmented.
7. What do you mean by tunneling?
A method of transmission over internet which works based on differing
protocols. In tunneling, a packet based on one protocol is wrapped, or
encapsulated, in a second packet based on whatever differing protocol is needed
in order for it to travel over an intermediary network. In effect, the second
wrapper "insulates" the original packet and creates the illusion of a tunnel through
which the wrapped packet travels across the intermediary network. In real-life
terms, tunneling is comparable to "encapsulating" a present (the original packet)
in a box (the secondary wrapper) for delivery through the postal system.
8. What is a gateway?
A gateway is a computer that lies at the intersection of two networks and routes
traffic correctly from one network to another, while keeping traffic internal to the
two networks separated.
9. What are internet routers?
A router is a network device that transmits message packets, routing them over
the best route available at the time. Routers are used to connect multiple network
segments, including those based on differing architectures and protocols. Routers
operate on network layer information and participate in running one or more
network layer routing protocols. Routers use routing tables in determining where
to forward transmissions. Tables for routers include information about different
paths between other routers.
10. What is a subnet?
33
A portion of an IP network that shares a common address component and defined
by a subnet mask.. On TCP/IP networks, subnets are defined as all devices whose
IP addresses have the same prefix. For example, all devices with IP addresses that
start with 100.100.100 would be part of the same subnet. IP networks are divided
using a subnet mask. Each subnet has its own unique submitted network ID.
Devices on the same subnet have the same subnet mask.
11.What is the significance of TTL field in an IP datagram?
TTL-Time to Live. It specifies the time in seconds for which the datagram is
allowed to stay in the internet. When a router process the datagram header, the
TTL field is decremented by 1.When TTL reaches 0 before the datagram reaches
its destination , the datagram is discarded and an error message is sent to the
source.
12.What is ICMP?
ICMP-Internet Control Message Protocol.
ICMP allow routers in an internet to report errors or control messages to other
routers or hosts. It is an error reporting mechanism. They require two levels of
encapsulation. ICMP depends on IP to move packets around the network on its
behalf.
ICMP can be characterized as follows:
ICMP uses IP as if ICMP were a higher level protocol. However, ICMP is
an integral part of IP and must be implemented by every IP module.
ICMP is used to report errors, not to make IP reliable. Datagrams may
still be undelivered without any report on their loss.
For fragmented datagrams, ICMP messages are only sent about errors
with the first fragment.
ICMP messages are never sent in response to datagrams with a broadcast
or a multicast destination address.
ICMP messages are never sent in response to a datagram that does not
have a source IP address representing a unique host.
13.What is IP?
Internet Protocol (IP) is the central, unifying protocol in the TCP/IP suite. It
provides the basic delivery mechanism for packets of data sent between all
systems on an internet, regardless of whether the systems are in the same room or
on opposite sides of the world. All other protocols in the TCP/IP suite depend on
IP to carry out the fundamental function of moving packets across the internet.
IP does not guarantee to actually deliver the data to the destination, nor does it
guarantee that the data will be delivered undamaged, nor does it guarantee that
data packets will be delivered to the destination in the order in which they were
34
sent by the source, nor does it guarantee that only one copy of the data will be
delivered to the destination.
14.What is TCP?
Transmission Control Protocol (TCP) provides a reliable byte-stream transfer
service between two endpoints on an internet. TCP depends on IP to move
packets around the network on its behalf. IP is inherently unreliable, so TCP
protects against data loss, data corruption, packet reordering and data duplication
by adding checksums and sequence numbers to transmitted data and, on the
receiving side, sending back packets that acknowledge the receipt of data.
Before sending data across the network, TCP establishes a connection with the
destination via an exchange of management packets. The connection is destroyed,
again via an exchange of management packets, when the application that was
using TCP indicates that no more data will be transferred.
15.What is UDP?
User Datagram Protocol (UDP) provides an unreliable packet zed data transfer
service between endpoints on an internet. UDP depends on IP to move packets
around the network on its behalf. UDP does not guarantee to actually deliver the
data to the destination, nor does it guarantee that data packets will be delivered to
the destination in the order in which they were sent by the source, nor does it
guarantee that only one copy of the data will be delivered to the destination. UDP
does guarantee data integrity, and it does this by adding a checksum to the data
before transmission.
16. What is DHCP?
Dynamic Host Configuration Protocol (DHCP) allows IP addresses to be
allocated to hosts on an as-needed basis. DHCP lets a host 'borrow' an IP address
from a pool of IP addresses; when the address is no longer needed it is recycled
and made available for use by some other host. DHCP also allows a host to
retrieve a variety of configuration information at the same time as it acquires an
IP address. DCHP depends on UDP to carry packets between the client and server
tasks.
17. What is DNS ?
The Domain Name System (DNS) provides dynamic on-demand translation
between human-readable names and the numeric addresses actually used by IP
(like 192.112.170.243). DNS uses both UDP and TCP. It used UDP to carry
simple queries and responses but depends on TCP to guarantee the correct and
orderly delivery of large amounts of bulk data across the network.
35
18.What is SNMP ?
Simple Network Management Protocol (SNMP) provides a means of monitoring
and managing systems over a network. SNMP defines a method of sending
queries (the GET and GET-NEXT primitives) and commands (the SET
primitive) from a management station client to an agent server running on the
target system, and collecting responses and unsolicited event notifications (the
TRAP primitive).The various things that can be monitored and managed by
SNMP are collectively called as the Management Information Base (MIB).
SNMP sends traffic through UDP because of its relative simplicity and low
overhead.
19.What is Telnet ?
Telnet provides a network terminal or "remote login" capability. The Telnet
server accepts data from the telnet client and forwards them to the operating
system in such a way that the received characters are treated as though they had
been typed at a terminal keyboard. Responses generated by the server operating
system are passed back to the Telnet client for display. The Telnet protocol
provides the ability to negotiate many kinds of terminal-related behavior
between the client and server.
20.What are sockets?
A socket is an abstraction that represents an endpoint of communication. Most
applications that consciously use TCP and UDP do so by creating a socket of
the appropriate type and then performing a series of operations on that socket.
The operations that can be performed on a socket include control operations
(such as associating a port number with the socket, initiating or accepting a
connection on the socket, or destroying the socket) data transfer operations (such
as writing data through the socket to some other application, or reading data
from some other application through the socket) and status operations (such as
finding the IP address associated with the socket).The complete set of operations
that can be performed on a socket constitutes the Sockets API (Application
Programming Interface).
21.What do you mean by source quench?
A source quench message is a request for the host to reduce its current state of
datagram transmission. It is an ICMP message reporting congestion to the source.
The routers send one source quench message for every datagram that they discard.
22.What is a PING?
36
The ICMP echo request messages are named as ping. A host sends an ICMP echo
request message to a specified destination. The machine that receives the echo
request formulates an echo reply and returns it to the original sender. This
mechanism is used to test whether a destination is reachable and responding.
23.What is a port? What do the end points specify?
A port identifies the ultimate destination within a machine. Each port is assigned a
small integer value. A port is what an Internet server uses to distinguish between
requests for different service uses.
An endpoint specifies a pair(host, port)which is the point of connection for a tcp
application where host is the IP address of for a host and port is a TCP port on that
host.
24. What is the difference between TCP and UDP if they both operate at the
Transport layer?
Although both TCP and UDP are Transport layer protocols and provide the same
basic function, TCP is a connection-oriented protocol, which means a session is
established before data is transmitted, and acknowledgments are sent back to the
sending computer to verify that the data did arrive and was accurate and complete.
UDP is connectionless; no session, or one-to-one connection, is established prior
to data transmission. This makes UDP the faster of the two, and TCP the more
reliable.
25.What is IPv6?
IP Version 6 (IPv6) is the newest version of IP, sometimes called Ping for "IP,
Next Generation". IPv6 is fairly well defined but is not yet widely deployed. The
main differences between IPv6 and the current widely-deployed version of IP
(which is IPv4) are:
IPv6 uses larger addresses (128 bits instead of 32 bits in IPv4) and so can support
many more devices on the network, and IPv6 includes features like
authentication and multicasting that had been bolted on to IPv4 over the years.
26. How do applications coexist over TCP and UDP?
Each application running over TCP or UDP distinguishes itself from other
applications using the service by reserving and using a 16-bit port number.
Destination and source port numbers are placed in the UDP and TCP headers by
the originator of the packet before it is given to IP, and the destination port
number allows the packet to be delivered to the intended recipient at the
destination system. So, a system may have a Telnet server listening for packets on
TCP port 23 while an FTP server listens for packets on TCP port 21 and a DNS
server listens for packets on port 53. TCP examines the port number in each
37
received frame and uses it to figure out which server gets the data. UDP has its
own similar set of port numbers.
27.What is FTP?
FTP stands for File Transfer Protocol. FTP is the best means for moving large
files across the Internet. FTP is a client/server protocol that enables a user with an
FTP client to log on to a remote machine, navigate the file system of that remote
machine, and upload and download files from that machine. There are two basic
types of FTP on the Internet: Anonymous FTP and Private FTP. With
Anonymous FTP, one logs in as user "anonymous," giving one's email address as
a password. With Private FTP, one logs in with the username and password one
has established on that particular system.
28.What is the World Wide Web?
The World Wide Web, which uses Hyper Text Transfer Protocol (HTTP), is a
connectionless client/server protocol that was invented in 1993 by Tim Burners
Lee at CERN. Web servers can deliver a wide variety of media files using MIME.
Web clients, like Netscape and MS Internet Explorer, make requests (i.e., they
send URL's to servers) of Web Servers, to which the Web server responds by
delivering the requested file, running the requested script, or generating an
appropriate error message
29. What is URL ?
A URL is a Uniform Resource Locator. It is a general purpose Internet addressing
protocol, used in WWW (HTTP) service. It is of the form:
<protocol>://<hostname.domain.name>/<directory or file names>/<moredirectory
or file names>.<file extension>
30.What do you mean by Domain name resolution?
The domain name resolution process can be summarized in the following
steps: A user program issues a request such as the gethostbyname() system call.
The resolver formulates a query to the name serverThe name server
checks to see if the answer is in its local authoritative database or cache,
and if so, returns it to the client.
Otherwise, it will query other available name server(s), starting down
from the root of the DNS tree or as high up the tree as possible.
The user program will finally be given a corresponding IP address (or
hostname, depending on the query) or an error if the query could not be
answered.
31.What is an Autonomous system (AS) ?
38
An AS is defined as a logical portion of a larger IP network. An AS is normally
comprised of an internetwork within an organization. It is administered by a
single management authority. An AS may connect to other autonomous systems
managed by the same organization. Alternatively, it may connect to other
public or private networks.
32. What are the types of IP routing and IP routing algorithms ?
Routing algorithms are used to build and maintain the IP routing table on a evice.
There are two primary methods used to build the routing table:
• Static routing: Static routing use preprogrammed definitions representing paths
through the network.
• Dynamic routing: Dynamic routing algorithms allow routers to automatically
discover and maintain awareness of the paths through the network. This automatic
discovery can use a number of currently available dynamic routing protocols. The
difference between these protocols is the way they discover and calculate new
routes to destination networks. They
can be classified into three broad categories:
- Distance vector protocols
- Link state protocols
- Hybrid protocols
33. What is Exterior Gateway Protocol (EGP) ?
EGP is an exterior gateway protocol , one of the first protocols developed for
communication between autonomous systems. EGP assumes the network
contains a single backbone and a single path exists between any two autonomous
systems. Due to this limitation, the current use of EGP is minimal. In practice,
EGP has been replaced by BGP.
EGP is based on periodic polling using a hello/I-hear-you message exchange.
These are used to monitor neighbor reachability and solicit update responses. The
gateway connecting to an AS is permitted to advertise only those destination
networks reachable within the local AS. It does not advertise reachability
information about its EGP neighbors outside the AS.
34.What is Border Gateway Protocol (BGP) ?
The Border Gateway Protocol (BGP) is an exterior gateway protocol. It was
originally developed to provide a loop-free method of exchanging routing
information between autonomous systems. BGP has since evolved to support
aggregation and summarization of routing information.
39
35.What are the BGP packet types ?
All BGP packets contain a standard header. The header specifies the BGP packet
type. The valid BGP packet types include:
• OPEN2: This message type is used to establish a BGP session between two peer
nodes.
• UPDATE: This message type is used to transfer routing information between
BGP peers.
• NOTIFICATION: This message is send when an error condition is detected.
• KEEPALIVE: This message is used to determine if peers are reachable.
36. What are the criteria for Routing protocol selection ?
The protocol chosen for one type of network may not be appropriate for other
types of networks.
• Scalability to large environments: If support is needed for large, highlyredundant
networks, link state or hybrid algorithms should be considered.
Distance vector algorithms do not scale into these environments.
• Stability during outages: The counting to infinity problems may cause routing
loops or other non-optimal routing paths. Link state or hybrid algorithms reduce
the potential for these problems.
• Speed of convergence: Triggered updates provide the ability to immediately
initiate convergence when a failure is detected. One contributing factor to
convergence is the time required to detect a failure. In OSPF and EIGRP
networks, a series of hello packets must be missed before convergence begins. In
RIP environments, subsequent route advertisements must be missed before
convergence in initiated. These detection times increase the time required to
restore communication.
Others in the list are Ease of implementation, vendor interoperability, metrics etc.
37. What is cryptography?
Cryptography is the science of keeping your data and communications secure. To
achieve this goal, techniques such as encryption, decryption and authentication
are used.
The key factor to strong cryptography is the difficulty of reverse engineering.
Strong cryptography means that the computational effort needed to retrieve clear
text messages without knowing the proper keys makes the retrieval infeasible.
38.Explain : Authentication, integrity, and non-repudiation
• Authentication - A method for verifying that the sender of a message is really
who he or she claims to be. Any intruder masquerading as someone else is
detected by authentication.
• Integrity checking - A method for verifying that a message has not been altered
along the communication path. Any tampered message sent by an intruder is
40
detected by an integrity check. As a side effect, communication errors are also
detected.
• Non-repudiation - The possibility to prove that the sender has really sent the
message. When algorithms providing non-repudiation are used, the sender is not
able to later deny the fact that he or she sent the message in question.
39.What is a firewall?
A firewall is a system (or group of systems) that enforces a security policy
between a secure internal network and an untrusted network such as the Internet.
Firewalls tend to be seen as a protection between the Internet and a private
network. But generally speaking, a firewall should be considered as a means to
divide the world into two or more networks: one or more secure networks and one
or more non-secure networks.
40.Explain HTTP
The hypertext transfer protocol is a protocol designed to allow the transfer of
ypertext Markup Language (HTML) documents. HTTP is based on requestresponse
activity. A client, running an application called a browser, establishes a
connection with a server and sends a request to the server in the form of a request
method. The server responds with a status line, including the message's protocol
version and a success or error
code, followed by a message containing server information, entity information
and possible body content.
An HTTP transaction is divided into four steps:
1. The browser opens a connection.
2. The browser sends a request to the server.
3. The server sends a response to the browser.
4. The connection is closed.
41
Part – II
DATABASE
MANAGEMENT
SYSTEM
42
Topics Covered:-
i) Introduction to Database, ERD, Relational Model, Relational Algebra &
Calculus, SQL concepts & definitions(Constraints and triggers etc.) -
Hemasai B
ii) Storage, File Structure, Indexing (Hash based & Tree structure)
- Geethavahini N
iii) Transaction Management (Concurrency & Crash Recovery) -Russia T
iv) Database Design & Tunning (Normalization & deNormalization)
-Vijith
PV
v) SQL Refresher – Sandip Roy
43
(i)
INTRODUCTION TO DATABASE, ERD, RELATIONAL MODEL,
RELATIONAL ALGEBRA & CALCULUS, SQL CONCEPTS &
DEFINITIONS
1. What is a Data?
Data is a known fact and that have an implicit meaning.
2. What is a Database?
Database is a collection of related data.
3. What is DBMS?
DBMS - DataBase Management System
It is a general purpose software system that facilitates the process of defining,
constructing, and manipulating databases for various applications.
Defining - involves specifying the datatypes, structures and constraints for the data
to be stored in the database.
Constructing - is the process of storing the data itself on some storage medium that is
controlled by the DBMS.
Manipulating - includes such functions as querying the database to retrieve specific
data, updating the database to reflect changes in the miniworld and generating reports
from the data.
4. Differentiate between File system and Database Management system?
File system DBMS
Redundancy of data exists Controlled redundancy exists
Data Inconsistency exists. Change in
structure of data results in change in related
applications
Data Independency exists.
Physical data independency – Change in
physical schema does not affect the
application.
Logical data independency - Change in
logical schema does not affect the
application.
Contains the data Contains the data and the structure in
which it is stored. The structure
information is known as meta data.
File processing software can access only
specific databases
DBMS software can access diverse
databases by extracting the database
definitions from the meta data and then
using the definitions.
44
5. What are the characteristics of database?
Data independence.
Speedy handling of spontaneous request – real time
Controlled redundancy
Versatility in representing relationships between data items
Security protection
Real time accessibility
6. What are the advantages of DBMS?
Reduction of redundancy
Shared data
Integrity
Security
Conflict resolution
Data isolation
Data independence
7. What are the disadvantages of DBMS?
Centralization of data – sense of ownership lost
Inaccurate data – Extensive data integrity/validation
Target of security breaches
Increases politics/ organizational struggles-conflicts of interest
A threat to privacy
Improper design leads to problems
8. When not to use a DBMS?
It may be more desirable to use regular files in the following circumstances:
The database and applications are simple, well defined and not expected to
change.
There are stringent real-time requirements for some programs that may not be met
because of DBMS overhead.
Multiple-user access to data is not required.
9. Define the "integrity rules".
There are two Integrity rules:
Entity Integrity: States that “Primary key cannot have NULL value”
Referential Integrity: States that “Foreign Key can be either a NULL value
or should be Primary Key value of other relation”.
45
10. What is a data model? What are the different kinds of data models?
Data model is a type of abstraction that models data and their relationship. Data
model hides storage and implementation details (hence abstraction).
The different kinds of data models are
Primitive data model – File based
Traditional data model – Hierarchical data model
Network data model
Relational data model
Semantic data model - AI based
11. Differentiate the three Traditional data models.
Data model Inventor Basic data structure
Hierarchical IBM Tree structure
Network CODASYL (Conference On
DAta SYstem Language)
Network/ Graph structure
Relational E.F.Codd (IBM labs) Relation/ Table structure
12. List out the categories of End users.
There are several categories of end users,
Casual end users - They occasionally access the database and use sophisticated
query language to specify the request. They are middle or high level managers or other
occasional browsers.
Naïve or Parametric end users – They make up a sizable portion of the database
end users. Their main job function revolves around constantly querying and updating the
database, using standard types of queries and updates – called ‘Canned transaction ‘-
that have been carefully programmed and tested.
Sophisticated end users - They include engineers, scientists, business analyst,
and others who thoroughly familiarize themselves with the facilities of the DBMS so as
to implement their applications to meet their complex requirement.
Stand-alone users – They maintain personal databases by using ready-made
program packages that provide easy-to-use or graphics-based interfaces.
13. Explain the Three-Schema Architecture.
The goal of the three-schema architecture is to separate the user application and the
physical database. In this architecture, schemas can be defined at the following three
levels:
The Internal level – Has an internal schema, which describes the physical storage
structure of the database.
The Conceptual level – Has a conceptual schema, which describes the structure
of the whole database for a community of users.
46
The External or View level – Includes a number of external schemas or user
views. each external schema describes the part of the database that a particular user
group is interested in and hides the rest of the database from that user group.
14. What is Database Tuning?
Adjusting a database to improve the performance is referred to as tuning the
database. The database administrator or his group is responsible for the tuning and
operation of the database, and it is important that he/she should be free to make what
changes he/she feels necessary without playing havoc with the application programs.
Effective tuning has two requirements.
Physical data independence
Automatic monitoring of the usage of database so that appropriate
adjustments can be made.
15. What is Data Migration?
Some data are referred very frequently and others only occasionally. The process
of adjusting the storage of data to suit its popularity rating is called Data Migration.
In some cases the physical location of data is changed to archive storage. In some
cases the reference to them in the indices which are used for addressing them are changed
so that hey can be found more quickly.
16. Who are the ‘Actors on the Scene’?
Actors on the Scene are the people who use and are interested in the Database.
The different actors are
Database Administrator (DBA) – The DBA is responsible for authorizing
access to the database, for coordinating and monitoring its use and for acquiring software
and hardware resources as needed.
Database Designer – They are responsible for identifying the data to be store in
the database and for choosing appropriate structures to represent and store this data.
End users – End users are the people whose job requires access to the database
for querying, updating, and generating reports. The database primarily exists for them.
17. Who are the ‘Workers behind the Scene’?
People who are associated with the design, development and operation of the DBMS
software and system environment but not interested in the database itself are known as
‘Workers behind the Scene’.
Some of them are,
DBMS system designers and implementers – persons who design and implement
the DBMS modules and interfaces as a software package.
Tool developers – The persons who design and implement tools – the software
packages that facilitates database system design and use, and help improve performance.
47
Operators and maintenance personnel – They are the system administration
personnel who are responsible for the actual running and maintenance of the hardware
and software environment for the database system.
18. What are Virtual Data and Transparent Data?
The Data referred which is not in the main memory at that time of reference is Virtual
data.
Transparent data appears not to exist but in fact it does exist. (Many of the complex
mechanisms used in the data storage and data transmission can be hidden from the
programmer so that he does not have to understand or even know about them.)
19. What are the various DBMS languages?
The various DBMS languages are:
DDL – Used by the DBA and by database designer to define both internal and
conceptual schema
SDL – Used to specify internal schema (where there is a strict/clear separation
between internal and conceptual schema)
VDL – Used to specify user views and their mapping to the conceptual schema
DML – Used for manipulating the data in the database (insert, delete, update,
retrieve)
20. What is Entity-Relationship model?
It is a high-level conceptual model used for conceptual design of database
applications.
21. How data is represented in ER model?
In ER model the data is represented as entities, relationships and attributes.
Entity – A “thing” in the real world with an independent existence
Attribute – The particular properties that describe entity. Each entity should have
a key attribute whose values are unique for an entity
Relationship – Implicit relationship among the various entity types
22. Mention and explain the different types of attributes in ER model.
The various types of entities are,
Simple or Atomic attributes – Attributes that are not further divisible
Composite attributes – Attributes that can be divide into smaller parts, which
represent more basic attributes with independent meaning. Ex: Address
Single valued attributes – Attributes that have a single value for a particular
entity Ex: Age of a person
48
Multi-valued attributes – Attributes that can have a set of values for the same
entity Ex: Color of car painted in two colors
Derived attribute - Attribute whose value can be derived from another attribute
Ex: Age can be derived from date of birth
Stored attribute – Attribute from which other attribute values can be derived Ex:
Date of birth from which age can be derive
Complex attributes – Attributes in which composite and multi-valued attributed
are nested
Ex: A person having more than one residence and each have more than one phone
number
23. Define Entity types and entity sets.
Entity types – An entity type defines a collection of entities that have the same
attributes
Entity sets – The collection of all entities of a particular entity type in the
database at any point in time
24. What are value sets or domains of an attribute?
An attribute of an entity type is associated with value set (domain of values), which
specifies the set of values that may be assigned to that attribute for each individual entity.
25. What is Relationship set and Relationship type?
Relationship set – The collection (or set) of similar relationship.
Relationship type – Relationship type defines a set of associations or relationship
set among a given set of entity types.
26. What is degree of relationship type?
It is the number of entity types participating.
27. What are the constraints on the Relationship types? Explain.
Relationship types usually have certain constraints that limit the possible
combinations of entities that may participate in the corresponding relationship set.
Two main types of relationship constraints are:
• Cardinality Ratio
• Participation
Cardinality for Binary Relationships: The cardinality ratio for a binary
relationship specifies the number of relationship instances that an entity can participate
in.
49
Participation constraints and Existence dependencies: The participation
constraint specifies whether the existence of an entity depends on its being related to
another entity via the relationship type.
28. What are the types of Participation constraints? Explain.
Two types of participation constraints are:
• Total participation (or Existence dependency)
• Partial participation
29. What are Weak Entity types?
Entity types that do not have key attributes of their own are called weak entity
types. In contrast, regular entity types that do have a key attribute are called strong
entity types.
Weak entity types are identified by being related to specific entities from another
entity (identifying or owner entity type) in combination with some of their attribute
values.
A weak entity type always has a total participation constraints with respect to its
identifying relationship, because a weak entity cannot be identified without an owner
entity.
30. What is a key?
An entity type usually has an attribute whose values are distinct for each
individual entity in the collection. Such an attribute is called a key attribute and its
values can be used to identify each entity uniquely.
31. What are different types of keys? Explain.
(1) Candidate key - If a relation schema has more than one key, each is a candidate key.
(2) Primary key - One of candidate keys is arbitrarily designated to be the primary key.
(3) Superkey – A superkey of a relation schema R = { A1,A2,….An } is a set of
attributes
S ⊆R with the property that no two tuples t1 ant t2 in any legal relation state r of
R will have t1[S] = t2[S]. A key K is a superkey with the additional property that removal
of any attribute from K will cause K not to be a superkey anymore.
(4) Foreign key – A foreign key specifies a referential integrity constraints between the
two relation schema R1 and R2.
32. What is a Discreminent?
A group of attributes in a weak entity which helps to identify it is called a
Discreminent.
33. What is a relational model?
50
The usage of pointers and high complexity are demerits of hierarchical and network
model. The relational model represents the database as a collection of relations. Each
relation resembles a table of values or, to some extent, a flat file of records.
34. What are the different aspects of a relation?
Attribute: Columns in a table are called attributes.
Tuple: Rows are called tuples.
Degree: The number of attributes is called as degree of the relation or arity of the
relation.
Cardinality: The number of rows is called cardinality of relation.
Domain: Set of all values that an attribute can have.
35. What is Domain contraint?
Domain contraint specify that the value of each attribute A must be an atomic value
from the domain of dom(A ).
36. What is a Relation schema?
A Relation schema R, denoted by R(A1, A2, A3..,An), is made up of a relation name
R and a list of attributes A1, A2,..,An.
37. What are the constraints of a relational model?
The two constraints of a relational model are unique identification and nonredundancy.
Unique Identification:
Each tuple of the relation should be uniquely identifiable by means of a special
attribute known as key(s). No two rows can have the same key.
Non-Redundancy:
No attribute that a part of a key can be removed without destroying property of
unique identification. The key should be minimal.
38. Explain Integrity contraint.
Entity Integrity contraint states that no primary key value can be null.
Referential Integrity contraint states that a tuple in one relation that refers to another
relation must refer to an existing tuple in that relation.
39. What is Relational Algebra?
It is procedural query language. It consists of a set of operations that take one or two
relations as input and produce a new relation.
51
40. What are the basic operations of relational algebra?
Union
A binary operator for which arity of both the relations must be same. It gives a
relation of the same arity with set of all tuples in both the operand relations. Ex: RUS
Difference
A binary operator for which arity of both the relations must be same. Ex: R-S. It
gives a relation of the same arity that contains all tuples in R but not in S.
Cartesian product
A binary operator. Ex: RXS. It is a typical cross product o two relations R, S.
Thus, if arity( R ) = K1, arity ( S ) = K2, then arity ( RXS ) = K1+K2.
Projection
The project operation selects certain columns from the table and discards the
other columns.
Selection
The select operator is used to select a subset of the tuples from a relation that
satisfy a selection condition.
41. What are the advanced level operations in Relational algebra? Explain.
The advanced level operations are :
Intersection
A binary operator for which arity of both the relations must be same. It gives a
relation of the same arity with set of tuples which are comman in both the operand
relations. Ex: RnS
Quotient
A binary operator. Let R be of arity r and S be of arity s where r>s (pre requisite).
Then R÷S is a set of all (r-s) tuples t such that for all s tuples u in S, the tuple tu is in R.
Join
A join operation is used to combine related tuples from two relations into single
tuple.
Theta join – A join operation with general join condition. [ ó (R X S) ]
( ó – Condition)
Equi-join – A join operation with only comparison operators in
condition.
Natural join – A join operation based on a common column that have
same values in both the relations. [ ó <R.b=S.b>(R X S)]
52
42.What is relational calculus?
Relational calculus is a formal query language where we write one declarative
expression to specify a retrieval request and hence there is no description of how to
evaluate a query; a calculus expression specifies what is to be retrieved rather than how
to retrieve it, whereas in relational algebra we write a sequence of operations to specify a
retrieval request.
Relational calculus is considered to be non-procedural language whereas
relational algebra is a procedural language.
43. What are the two types of Relational Calculus? Explain.
Tuple Relational Calculus - A Query framed in TRC has the following syntax.
{ t/ p(t) } i.e, the formula retrieves the set of tuples formula p(t) is satisfied.
Domain Relational Calculus - DMC has the following syntax
{ <x1,x2,x3….,xn> | p(x1,x2,x3….xu) }
44. What is Outer join? Explain.
A varient of join that keeps all pieces of information from the operands.
It “pads with nulls” the tuples that have no counterpart
Three pads:
left : only tuples of the first operand are padded
right : only tuples of the second operand are padded
full : tuples of both operands are padded
45. How does Tuple-oriented relational calculus differ from domain-oriented
relational calculus?
The tuple-oriented calculus uses a tuple variables i.e., variable whose only permitted
values are tuples of that relation. Ex: QUEL
The domain-oriented calculus has domain variables i.e., variables that range over the
underlying domains instead of over relation. Ex: ILL, DEDUCE.
46. Explain existential and universal quantifiers.
The ∃ quantifier is called an existential quantifier because a formula (∃ t)(F) is
true if “ there exists “ tuple that makes F true.
Ex: { t / ∃ s (s ∈ BORROW ∧ s[AMOUNT] > 75000) ∧ t[ CUST_BNAME] =
s[CUST_NAME] }
The universal quantifier, (∀t)(F) is true if every possible tuple that can be
assigned to free occurrences of t in F is substituted for t and F is true for every such
substitution. It is called universal quantifier because every tuple in “the universe of “
tuples must make F true to make the quantified formula true.
47. Transforming the universal and existential quantifiers.
53
It is possible to transform a universal quantifier into an existential quantifier and
vice-versa and to get an equivalent expression.
(∀x) (P(x) ) ≡ not ( ∃ x) ( not (P(x)))
(∃ x) ( P(x)) ≡ not (∀x) (not (P(x)))
(∀x) ( P(x) and Q(x)) ≡ not ( ∃ x) ( not (P(x)) or not (Q(x)) )
(∀x) ( P(x) or Q(x)) ≡ not ( ∃ x) ( not (P(x)) and not (Q(x)) )
( ∃ x ) ( P(x) or Q(x)) ≡ not (∀x) ( not (P(x)) and not (Q(x)) )
(∃ x ) ( P(x) and Q(x)) ≡ not (∀x) ( not (P(x)) or not (Q(x)) )
48. What is a safe expression?
An expression is said to be safe only when it guarantees to yield a finite number
of tuples as its result. Otherwise, it is said to be unsafe. All expressions used in tuple
relational calculus need to be safe expressions.
Ex: { t / ¬(EMPLOYEE(t))} – infinite or unsafe expression
49. How domain relational calculus is different from tuple relational calculus?
The domain calculus differs from the tuple calculus in the type of variables used
in formulae: rather than having variables range over tuples, the variables range over
single values from domains of attributes.
50. What is oracle data dictionary?
The oracle data dictionary is read-only set of tables that keeps the metadata- i.e ,
the schema description for a database.
51. What is a schema?
The term schema refers to a collection of data definition objects. Schema objects
are the individual objects that describe tables, views etc. There is a distinction between
the logical schema objects and the physical storage components called tablespaces.
52. What are Dynamic Performance tables?
Oracle constantly monitors database activity and records in tables called Dynamic
Performance tables.
53. Explain SQL as a query language.
• SQL expresses query in a declarative way
- queries specify the properties of the result, not the way to obtain it.
54
• Queries are translated by the query optimizer into the procedural language
internal to the DBMS
• The programmer should focus on reliability, not on effiency.
[ include SQL basics here ]
54. What are assertions?
Assertions are used to specify more general constraints. Each assertion is a given
a constrained name and is specified via a condition similar to where clause in an SQL
query.
Ex:
CREATE ASSERTIONS SALARY_CONSTRAINT
CHECK ( NOT EXISTS (SELECT * FROM EMPLOYEE E, EMPLOYEE
M, DEPARTMENT D WHERE E.SALARY>M.SALARY AND
E.DNO=D.DNUMBER AND D.MGRSSN=M.SSN));
55. How access control to resources is given to other users?
The owner of the resource assigns previleges to other users.
A previlege is chraterized by:
• the resource
• the user who grants the previlege
• the user who receives the previege
• the action that is allowed on the resource
• whether or not the previlege can be passed to other users
56. What are the various types of previleges?
Insert – to insert a new object into the resource
Update – to modify the resource contents
Delete – to remove an object from the resource
Select – to access a resource contents in a query
References – to build a referential integrity constraint with the resource (may limt the
ability to modify the resource)
Usage – to use the resource in a schema definition
57. How to grant and revoke a previlege?
To grant a previlege to a user:
grant <previleges | all previleges> on resource to users [wit grant option]
- grant option specifies whether the previlege of propagating the previlege to other users
must be granted.
55
Ex: grant select on DEPARTMENT to stefano
To take away previleges:
revoke previleges on resources from users
[restrict | cascade]
58. Explain Embedded SQL.
A. Traditional applications often need to “embed” SQL statements inside the
instructions of a procedural programming language ( C, COBOL, etc.)
B. Programs with embedded SQL use a precompiler to manage SQL statements
C. Embedded statements are preceded by ‘$’ or ‘EXEC SQL’
D. Program variables may be used as parameters in SQL statements (preceded by ‘:’)
E. select preducing a single row and update commands may be embedded easily
F. The SQL environment offers a predefined variable ‘sqlcode’ which describes the
status of the execution of the SQL statements ( zero if the SQL statement
executed successfully)
Ex:
void DisplayDepartmentSalaries(char DeptName[])
{
char FirstName[20], Surname[20];
long int Salary;
$ declare DeptEmp cursor for
select FirstName, Surname, Salary
from Employee
where Dept = :DeptName;
$ open DeptEmp;
$ fetch DeptEmp into :FirstName, :Surname, :Salary;
printf(“Department %s\n”,DeptName);
while (sqlcode == 0)
{
printf(“Name: %s %s ”,FirstName,Surname);
printf(“Salary: %d\n”,Salary);
$ fetch DeptEmp into :FirstName, :Surname, :Salary;
}
$ close DeptEmp;
}
59. Explain Dynamic SQL
• When applications do not know at compile-time the SQL
statement to execute, they need dynamic SQL
• Major problem: managing the transfer of parameters between
the program and the SQL environment
• For direct execution:
56
execute immediate SQLStatement
• For execution preceded by the analysis of the statement:
prepare CommandName from SQLStatement
– followed by:
execute CommandName [ into TargetList ]
[ using ParameterList ]
60. What are program units? Explain.
The program units are: Functions, Stored procedures and Packages.
Stored procedure – It refers to a procedure that is considered to be a part of the data
definition and implements some integrity rule or business rule or a policy when it is
invoked. Functions return single values.
Package – It provides a method of encapsulating and storing related procedures for easier
management and control.
61. What is cluster and hash cluster?
Cluster – A group of records from one or more tables physically stored in a
mixed file. Tables are clustered by the cluster key. Related rows from multiple tables are
physically stored together on disk blocks to improve performance.
Hash Cluster – Hash clusters also group records; however the cluster key value
is hashed first and all rows belonging to this hash value are stored under the same hash
bucket address.
62. What are methods in Oracle8 ?
A method is a procedure or a function that is part of the definition of a user
defined abstract datatype. Methods are written in PL/SQL and stored in the database or
written in a language like C and stored externally. Method differ from stored procedure in
the following ways:
A program invokes a method by referring to an object of its associated type.
An Oracle method has complete access to the attributes of its associated object
and to the information about its type.
63. What are triggers?
A database trigger is a stored procedure ( or rule) that is implicitly executed ( or
fired) when the table with which it is associated has an insert, delete or update performed
on it. Triggers can be used to enforce additional constraints or to automatically perform
additional actions that are required by business rules or policies that go beyond the
standard key, entity integrity and referential integrity constraints imposed by the system.
57
64. What is Rollback segment?
Each database must contain one or more rollback segments, which are used for
“undoing” transactions. A rollback segment records old values of data that are used to
provide read consistency to rollback a transaction and for recovering a database.
65. What is PL/SQL?
PL/SQL is a block - structured language. That is , the basic units- procedures,
functions and anonymous blocks – that make up a PL/SQL program are logical blocks
which can contain any number of nested sub-blocks. A block or sub-block groups
logically related declarations and statements. The declarations are local to the block and
cease to exist when the block completes.
The three parts of PL/SQL block are
• Declaration part
• Executable part
• Exception part.
66. What is a cursor in PL/SQL?
A cursor is similar to a file variable or file pointer, which points to a single row
from the result of a query. Cursors are controlled by three commands – OPEN, FETCH
and CLOSE.
67. Explain Stored procedures.
• SQL-2 allows for the definition of procedures, also known as
stored procedures
• Stored procedures are part of the schema
procedure AssignCity(:Dep char(20),
:City char(20))
update Department
set City = :City
where Name = :Dep
• SQL-2 does not handle the writing of complex procedures
• Most systems offer SQL extensions that permit to write complex
procedures (e.g., Oracle PL/SQL)
Procedure in Oracle PL/SQL
Procedure Debit(ClientAccount char(5),Withdrawal integer) is
OldAmount integer;
NewAmount integer;
Threshold integer;
begin
select Amount, Overdraft into OldAmount, Threshold
from BankAccount
58
where AccountNo = ClientAccount
for update of Amount;
NewAmount := OldAmount - WithDrawal;
if NewAmount > Threshold
then update BankAccount
set Amount = NewAmount
where AccountNo = ClientAccount;
else
insert into OverDraftExceeded
values(ClientAccount,Withdrawal,sysdate);
end if;
end Debit;
59
(ii)
STORAGE, FILE STRUCTURE, INDEXING (HASH BASED &
TREE
STRUCTURE)
1. Mention & brief categories in storage media.
a) Primary storage and b) Secondary storage.
a) - can be directly operated on by the CPU. eg., Main memory, Cache memory
b) - can't be directly operated on by the CPU hence data has to be first copied onto
primary storage. e.g., magnetic disks, optical disks & tapes.
2. What are Dynamic and Static RAM?
Dynamic RAM is Main memory whereas Static RAM is Cache memory.
3. What are off-line and on-line data?
Off-line data: data stored on tapes, on-line data: data stored in disk.
4. What is a tape juke box?
It contains a bank of tapes that are catalogued and can be automatically loaded onto
tape drives and are becoming as a Tertiary Storage.
For Example: NASA's EOS (Earth Observation Satellite) system stores archived
data in this fashion.
5. What is Volatile & Non-volatile storage?
Main memory is a Volatile and Secondary memory is a Non-volatile storage.
6. Expand BLOB, SCSI, and RAID.
BLOB: Binary Large Object
SCSI: Small Computer Storage Interface
RAID: Redundant Array of Inexpensive/Independent Disks
7. What is Seek time?
“Time taken to position the read/write head on the track containing data".
8. What is Latency time?
"The beginning of desired block rotates into position under the read/write head."
9. What is Data Striping?
60
"The concept that makes use of parallelism to improve disk performance".
10. Explain RAID.
Its Goal: To make the wide different rates of performance improvements of disks
even against those in memory and micro processor.
11. what is an index?
An index is an access structure which is used to speed up the retrieval of records in
response to certain search conditions.
12. What is a Hash function?
Given a key, the hash function resolves and returns the address of a record.
13. What is an indexing field?
An index access structure is usually defined on a single field of a file called an
Indexing field.
14. Mention the type of indices?
Primary, Secondary, Clustered, Dense, Sparse indices.
15. Explain the type of indices.
Primary index: defined on ordering field of an ordered file of records.
Secondary index: defined on non-ordering field of an ordered file of records.
Clustered index: defined on records having same value for ordering field of an
ordered file.
Dense index: defined on every search key value in the data file.
16. What is block anchor?
The first record in each block of the data file is called the anchor record or Block
anchor.
17. What is a multi level index?
The multi level index is one which uses a blocking factor to reduce the search on
an index and the blocking factor to be larger than 2 and it is referred to as the fan out of
the multi level index.
18. What is an Indexed sequential file?
61
It is an ordered file with a multi level primary index on its ordering key field
19. How dynamic multilevel indexing is achieved?
B-trees and B+ -trees.
20. What is a search tree?
It is slightly different from a multilevel index with order p such that each node contains
at most p-1 search values and p pointers.
21. Define B-tree.
B-tree when used as an access structure on a key field to search records in a data
file can be defined as follows:
Each node has almost p tree pointers
Each node is of the form <P1, <K1,Pr1>,P2,<K2,Pr2>………….< Kq-
1,Pprq-1>,Pq> where Pi is a tree pointer and Pri is a data pointer, Ki is
the node.
Each node, except the root and leaf nodes has at least p/2 tree pointers.
All leaf nodes are at the same level.
22. Define B+-tree.
B+-tree when used as an access structure on a key field to search records in a
data file can be defined as follows:
Each node has almost p tree pointers
Each internal node of the form <P1,K1,P2,K2,…..Pq-1,> where Pi is a
tree pointer.
Each node, except the root and leaf nodes has at least p/2 tree pointers.
All leaf nodes are at the same level.
23. Why to use B-trees in particular for dynamic multilevel indexing.
B-tree is a balanced tree structure in which each node is atleast half full.
24. what is the popular file organization introduced by IBM?
ISAM: Indexed Sequential Access Method
25. What is a fully inverted file?
A file that has a secondary index on every one of its fields is often called a fully
inverted file.
(iii)
62
TRANSACTION MANAGEMENT (CONCURRENCY & CRASH
RECOVERY)
1) What is meant by transaction?
Collection of operations that for a single logical unit of work are called
transactions. In other words, a transaction is a unit of program execution that
accesses and possibly updates various data items.
2) Which property of transaction used to ensure integrity of the data?
ACID property, where
A=Atomicity: either all the operations of the transaction are reflected properly in
the db or none.
C=Consistency: execution of a transaction in isolation preserves the consistency
of the db.
I=Isolation: Even though multiple transactions may execute concurrently each
transaction is unaware of other transactions executing concurrently in the system.
D=Durability: After a transaction completes successfully, the changes it has made
to the db persist, even if there are system failures.
3) What is roll back?
Any changes that the aborted transaction made to the database will be
undone is called roll back. This undo will be done by compensative
transaction.
4) Name the five states of transaction.
Active
Partially committed
Failed
Aborted
Committed
During the transaction failure, either we can restart the transaction or kill the
transaction.
5) What is shadow copy?
In a shadow copy scheme, a transaction that wants to update the db first
creates a complete copy of the db. A pointer called db pointer is maintained, it
points to the current copy of db. During the transaction, all updates are done on
the new db copy, leaving the original copy (shadow copy), untouched.
If at any pt of transaction has to be aborted, the system merely deletes the
new copy.
63
If the transaction completes, the db system updates the db-pointer to pt to
the new copy of the db, the new copy then becomes the current copy of te
db. The old copy of the db is then deleted.
6) What is the roll of recovery management component in db system?
Recovery management component provides support for atomicity and
durability by a variety of schemes.
7) What is the use of concurrent transaction over serial transaction?
Improved throughput and resource utilization
Reduced waiting time
8) What is the need of concurrency-control schemes?
The db system must control the interaction among the concurrent
transactions o prevent them from destroying the consistency of the db.
9) What is recoverable schedule?
A recoverable schedule is one where, for each pair of transaction T1 and
T2 such that T2 reads a data item previously written by T1 the commit operation
of T1 appears before the commit operation of T2.
10) What is cascading rollback?
Cascading rollback is a phenomenon, in which a single transaction failure leads to
a series of transaction rollbacks.
11) What is a lock in concurrency control?
When a transaction needs to update the db, then it can set the lock, and proceed.
When the db is locked be one transaction, the other can’t access the db. Once the
transaction gets over, the lock will be released and the other transactions can
access it.
12) List the lock modes.
Shared: If a transaction T has obtained a shared-mode lock on item Q,
then T can read but can’t write Q.
Exclusive: If a transaction T has obtained an exclusive-mode lock on item
T can both read and write Q.
13) What is a locking protocol?
Each transaction in the system follows a set of rules called locking protocol,
indicating when a transaction may lock and unlock each of the data items.
64
14) List the type of protocols used in concurrency control.
Locking protocol
Graph-based protocol
Timestamp-based protocol
Validation-based protocol
15) What is a deadlock state in Transaction?
A system is in a deadlock state if there exists a set of transactions such that
every transaction in the set is waiting for another transaction in the set.
16) List the two approaches to deadlock prevention in transaction.
One approach ensures that no cyclic waits can occur by ordering the
request for locks or requiring all locks to be acquired together. The other approach
is closer to deadlock recovery, and performs transaction rollback instead of
waiting for a lock, whenever the wait could potentially result in a deadlock.
17) How the deadlock detection and recovery handled in transaction?
The transactions are sketched as a wait-for-graph and check for a presence of a
cycle.
Recovery from Deadlock involves the following steps:
Selection of a victim (the transaction which is responsible for deadlock)
Rollback
18) List the classification of a db system failure.
• Transaction Failure
o Logical Error: The transaction can no longer continue with its normal
executing because of some internal conditions, such as bad input, data not
found, overflow or resource limit exceeded.
o System Error: The system has entered an undesirable state like deadlock.
• System Crash:
o There is a hardware malfunction or a bug in the db software or the os, that
causes the lost of the content of volatile storage and brings transaction
processing to a halt.
• Disk Failure:
o A disk block loses its content as a result of either a head crash or failure
during a data transfer operation.
19) Which are all the storage types used for db?
Based on the relative speed, capacity and resilience to failure I is classified
as follows:
• Volatile storage
• Non volatile storage
• Stable storage
65
(iv)
DATABASE DESIGN & TUNNING (NORMALIZATION &
DENORMALIZATION)
1).What are the Phases involved Database Application System Life Cycle?
a. System definition: the scope of database system, its users, and its applications are
defined.
b. Database design: At the end of this phase, a complete logical and physical design of
the database system on the chosen DBMS is ready.
c. Database implementation: specify the conceptual, external, and internal database
definitions, creating empty database files, and implementing the software applications.
d. Loading or data conversion: the database is populated either by loading the data
directly or by converting existing files into the database system format.
e. Application conversion: any software applications from a previous system are
converted to the new system.
f. Testing and validation: the new system is tested and validated.
g. Operation: the database system and its applications are put into operation.
h. Monitoring and maintenance: the system is constantly monitored and maintained.
Growth and expansions can occur in both data content and software applications.
2).What are the Database Design Phases?
Six main phases of the database design are:
Requirements collection and analysis.
Conceptual database design.
Choice of a DBMS.
Data model mapping (logical database design).
Physical database design.
Database system implementation and tuning.
3).What is a Conceptual Database Design?
This phase of database design involves two parallel activities:
1. Conceptual Schema Design: examines the data requirements resulting from Phase 1
and produces a conceptual database schema.
2. Transaction and Application Design: examines the database applications analyzed in
Phase 1 and produces highlevel specifications for these applications.
4).What are the factors considered for discussing the economic and organizational
factors that affect the choice of DBMS?
a. Software acquisition cost: This is the "up-front" cost of buying the software,
including language options, different interfaces such as forms and screens,
recovery/backup option, special access methods, and documentation. The correct DBMS
for a specific operating system must be selected.
66
b. Maintenance cost: This is the recurring cost of receiving standard maintenance service
from the vender and for keeping the DBMS version up to date.
c. Hardware acquisition cost: New hardware may be needed, such as additional memory,
terminals, disk units, or special DBMS storage.
d. Database creation and conversion cost. This is the cost of either creating the database
system from scratch or converting an existing system to the new DBMS software.
e. Personnel cost: Acquisition of DBMS software for the first time is often accompanied
by hiring new database people.
f. Training cost. Because DBMSs are often complex systems, personnel must often be
trained to use and program the DBMS.
g. Operating cost. The cost of continued operation of the database system is typically not
worked into an evaluation of alternatives because it is incurred regardless of the DBMS
selected.
5).What is Logical database design?
Logical database design is called the Data Model Mapping. The Mapping can be carried
out in two phases.
System independent mapping: the mapping does not consider any specific characteristics
or special cases that apply to the DBMS implementation of the data model.
Tailoring the schemas to a specific DBMS: we may have to adjust the schemas obtained
in stage 1 to conform to the specific implementation features of a data model as used in
the selected DBMS.
6).What is a Physical database design?
Physical database design is the process of choosing specific storage structures and access
paths for the database files to achieve good performance for the various database
applications. Each DBMS offers a variety of options for file organization and access
paths. These usually include various types of indexing, clustering of related records on
disk blocks, linking related records via pointers. Once a specific DBMS is chosen, the
physical database design process is restricted to choosing the most appropriate structures
for the database from among the options offered by that DBMS.
7).What are the Elements of a Good Database Design ?
The Elements of a good Database design are
• Relation design
• Data Model
• Processing Rules
• Additional System requirements
These models contribute to a proper Database Design
8).List the Database Design Process?
Create tables and columns from entities and attributes
Select primary keys
Represent relationships
Specify constraints
Re-examine normalization criteria
67
9).Define Response time.
This is the elapsed time between submitting a database transaction for execution and
receiving a response. A major influence on response time is under the control of the
DBMS is the database access time for data items referenced by the transaction. Response
time is also influenced by factors not under DBMS control, such as system load,
operating system scheduling, or communication delays.
10).Define Space Utilization.
Space utilization: This is the amount of storage space used by the database files and their
access path structures.
11).Define Transaction throughput.
This is the average number of transactions that can be processed per minute by the
database system; it is a critical parameter of transaction system such as those used for
airline reservations or banking. Transaction throughput must be measured under peak
conditions on the system. Typically, average and worst-case limits on the preceding
parameters are specified as part of the system performance requirements. Analytical or
experimental techniques, which can include prototyping and simulation, are used to
estimate the average and worst-case values under different physical design decisions, to
determine whether they meet the specified performance requirements.
12).What are surrogate Keys?
A surrogate key is a unique, DBMS-supplied identifier used as the primary key of a
relation. The values of a surrogate key have no meaning to the users and are normally
hidden on forms and reports. DBMS does not allow the value of a surrogate key to be
changed.
13).Disadvantages of the Surrogate Keys
Disadvantages are
i. Foreign keys that are based on surrogate keys have no meaning to the users.
ii. When data shared among different databases contain the same ID, merging
those tables might yield unexpected results.
14).How Relationships using he Surrogate keys are represented?
If the parent in an ID-dependent relationship has a surrogate key as its primary key, but
the child has a data key, use the parent’s surrogate key as a primary key. A mixture of a
surrogate key with a data key does not create the best design as the composite key will
have no meaning to the users. Therefore, whenever any parent of an ID-dependent
relationship has a surrogate key, the child should have a surrogate key as well. By using
68
surrogate keys in the child table, the relationship type has changed to
1:N non-identifying relationship
15).Name the 3 principles of Relationship representation
Preservation of referential integrity constraints, Specification of referential integrity
actions, Representation of minimum cardinality.
16).How the Referential Integrity actions are specified?
If default referential integrity constraint is too strong, overriding the default referential
integrity enforcement could be defined during database design. The policy will be
programmed into triggers during implementation.
17).Two referential integrity overrides
1).Cascading updates: automatically change the value of the foreign key in all related
child rows to the new value
2).Cascading deletions: automatically delete all related child rows
18).What type of Referential Integrity actions must be declared during database
design?
When enforcing the minimum cardinality between the child and the parent, a required
child can be represented by creating update and delete referential integrity actions on the
child and insert referential integrity actions on the parent. Such referential integrity
actions must be declared during database design and trigger codes must be written during
implementation
19).How ID-Dependent Relationships are represented?
To represent ID-dependent relationships, primary key of the parent relation is added to
the child relation. The new foreign key attribute becomes part of the child’s composite
primary key.
Referential integrity actions should be carefully determined
– For cascading updates, data values are updated to keep child rows consistent with
parent rows
– If the entity represents multi-value attributes, cascading deletions are appropriate
– Check user requirements when designing more complex situation
20).How is the Ternary Relationships represented?
Ternary and higher-order relationships can be treated as combinations of binary
relationships. There are three types of binary constraints:
MUST, MUST NOT, and MUST COVER
69
i. MUST NOT constraint: the binary relationship indicates combinations that are
not allowed to occur in the ternary relationship
ii. MUST COVER constraint: the binary relationship indicates all combinations
that must appear in the ternary relationship
iii. Because none of these constraints can be represented in the relational design,
they must be documented as business rules and enforced in application programs
or triggers
DATABASE TUNING
1).What is Database Tuning?
It is the activity of making a database application run faster.
– Faster means higher throughput (or response time)
– Avoiding transactions that create bottlenecks or avoiding queries that run for hours
unnecessarily is a must.
– A 5% improvement is significant.
2).What is the need for Database Tuning?
Nearly 80 to 85 percent of database performance problems arise from the application
database's design or the application's own code. Good transaction throughput requires an
application designed from the database up, with performance and scalability in mind.
With optimum design and tuning, all applications' performance will soar.
3).What are the three main reasons for Database Tuning?
• Troubleshooting:
Make managers and users happy given an application and a DBMS
•Capacity Sizing:
Buy the right DBMS given application requirements
• Application Programming:
Coding your application for performance
4).What are the principles for a Database Tuning?
• Think globally, fix locally
– Localizing the problems
• Partitioning breaks bottlenecks (temporal and spatial)
– ONE part of the system limits the overall performance
– Two approaches:
• Fix locally
• Partitioning the LOAD
– eg. Free list, lock contention due to long transactions
• Partitioning in space/logical resources/time.
70
5).What is achieved by tuning a database? Specify the Goals.
Improvement in database and application performance helps to support the database
application. The overall performance of the Database server is improved by tuning its
components.
Tuning is begun with clear, quantifiable, real-world performance objectives; for example,
the objective to process 10,000 orders per day. The tuning that we implement for
response time, reading and writing, or other operations should be geared toward your
real-world objectives.
Database server is tuned to achieve certain goals, such as reducing response time,
minimizing contention, and optimizing the utilization of CPU and memory resources.
6).What are the Goals of Tuning an Oracle 8i Server?
• One of the goals of tuning an Oracle8i server is to ensure that SQL statements
access the smallest possible number of Oracle blocks needed to complete their
work. This speeds up the response time for the statement and reduces the impact
on other queries and data manipulation language (DML) statements being
executed in the database.
• Another goal of tuning is to ensure that frequently used data blocks are cached in
memory. Data blocks cached in memory can be readily accessed. The availability
of needed data in memory reduces the number of overall physical reads.
• An additional goal of tuning is to ensure that users share code, such as frequently
used SQL statements, within memory areas. This helps to avoid reparsing and
duplicate storage of such SQL statements.
• A further goal of tuning the Oracle8i server is to ensure that reading and writing
are performed as rapidly as possible. For example, separating different segment
types across tablespaces can minimize write contention. Maximizing the number
of blocks that are obtained during a read operation minimizes the number of read
operations required.
• Minimizing or preventing contention for resources is another goal of Oracle8i
server tuning. Good application tuning and database design help ensure that users
have access to data when required.
• Finally, the database should be tuned to optimize the speed in which backup
procedures and housekeeping tasks are performed. This protects data and
minimizes the impact of those tasks on database performance.
7).What are the steps involved in tuning the Oracle 8i Server?
• Six steps are involved in tuning the Oracle8i database. In the first step, you
tune the architecture and design of the data model.
• The second step involves tuning the database applications. The most common
applications in Oracle8i are Decision Support Systems and Online
Transaction Processing.
• Decision Support Systems (DSS) applications distill large amounts of information
into understandable reports. Decision makers in an organization use this data to
determine what strategies the organization should follow.
71
• Online Transaction Processing (OLTP) applications are high-throughput,
insert/update intensive systems. They involve large volumes of data that grow
continuously.
• To tune the applications, it is important to design efficient SQL statements.
Efficient SQL statements take less time to process. You must also consider the
data access and manipulation operations in detail.
• The next step in tuning the Oracle8i server is to tune memory structures.
Memory tuning involves tuning the System Global Area (SGA) and user
process memory.
• Input/output (I/O) tuning is the next step in tuning the Oracle8i server. I/O
tuning involves configuring the distribution of data files, segment types, and
tablespaces. I/O tuning also involves monitoring the interaction between data
files and memory during reads and writes.
• The interaction between files can be tuned by separating the most frequently
accessed data into different tablespaces and separating tables from their
related indexes. You can also separate rollback segments from tables and data
dictionary objects from all other objects.
• Another step in tuning the Oracle8i server involves reducing contention
between actions in the database. Since an Oracle database is normally used by
multiple users, tuning is needed to minimize the time that processes have to wait
until a resource is available. Resources that can involve contention include blocks,
shared pool, locks, and latches.
• The final step in tuning is to tune the operating system to handle the current
demands of the database server.
8).What is performance tuning?
It is the methodology of making small adjustments that affect database performance. By
tuning the system, can tailor its performance to best meet every needs. Performance must
be built in! Performance tuning cannot be performed optimally after a system is put into
production. To achieve performance targets of response time, throughput, and constraints,
application analysis, design, and implementation must be tuned in.
9).Who tunes the System performance?
Everyone involved with the system has some role in the tuning process. When people
communicate and document the systems characteristics, tuning becomes significantly
easier and faster.
• Business executives must define and then reexamine business rules and
procedures to provide a clear and adequate model for application design. They
must identify the specific kinds of rules and procedures that can influence the
performance of the whole system.
• Application designers must design around potential performance bottlenecks.
They must communicate the system design so that everyone can understand the
flow of data in an application.
72
• Application developers must communicate the implementation strategies they
choose so that modules and SQL statements can be quickly and easily identified
during statement tuning.
• Database administrators (DBAs) must carefully monitor and document system
activity so that they can identify and correct unusual system performance.
• The hardware and software administrators must document and communicate the
configuration of the system so that everyone can design and administer the system
effectively.
Decisions made in application development and design have the most impact on
performance. Once the application is deployed the database administrator usually has the
primary responsibility for tuning--within the limitations of the design.
10).Is it necessary for setting the performance goal of the system while tuning?
Brief.
Whether we are designing or maintaining a system, we should set specific performance
goals so that we know when to tune. We can needlessly spend time tuning our system
without significant gain if we attempt to alter initialization parameters or SQL statements
without a specific goal.
When designing the system, we should set a specific goal: for example, an order entry
response time of less than three seconds. If the application does not meet that goal,
identify the bottleneck causing the slowdown (for example, I/O contention), determine
the cause, and take corrective action. During development, you should test the application
to determine if it meets the designed performance goals before deploying the application.
11).How is the performance of the system is evaluated?
With clearly defined performance goals, we can readily determine when performance
tuning has been successful. Success depends on the functional objectives we have
established with the user community, our ability to objectively measure whether or not
the criteria are being met, and our ability to take corrective action to overcome any
exceptions. The rest of this tuning manual describes the tuning methodology in detail,
with information about diagnostic tools and the types of corrective actions we can take.
DBAs who are responsible for solving performance problems must keep a wide view of
the all the factors that together determine response time. The perceived area of
performance problems is frequently not the actual source of the problem. Users in the
preceding example might conclude that there is a problem with the database, whereas the
actual problem is with the network. A DBA must monitor the network, disk, CPU, and so
on, to find the actual source of the problem--rather than simply assume that all
performance problems stem from the database. Ongoing performance monitoring enables
you to maintain a well tuned system. Keeping a history of the application's performance
over time enables you to make useful comparisons. With data about actual resource
73
consumption for a range of loads, you can conduct objective scalability studies and from
these predict the resource requirements for load volumes you may anticipate in the future.
12).What are the basic methodology involved in tuning the Database Server and
Clients?
Determine the level of tuning - Component-level tuning or system-level tuning? Do you
want to tune the database server as an isolated component or as part of a larger
application?
Understand the end-user community - Gather metrics regarding the manner in which
the database will be accessed. What SQL queries will be executed? What business
transactions will be executed? How often are transactions executed?
Gather performance requirements - Determining the exit criteria for tuning needs to be
established in order to know when sufficient testing has occurred.
Automate test scripts - Create automated test scripts that issue the necessary SQL
queries, updates and deletes. Generate automated test scripts that emulate the business
scenarios.
Execute & analyze tests - Run the planned tests and collect metrics, such as response
times, transaction volumes, operating system statistics, database server statistics.
Application Profilers - Implement ancillary tools to profile transaction characteristics.
Determine the network characteristics of a transaction, such as bandwidth utilization and
conversational chattiness. Ascertain the CPU utilization on the database server and client,
memory utilization, query compilation and execution times.
13).How is the database server capability and the scalability is increased?
As a result of Database Tuning the database server capability and scalability is increased
by addressing:
a. The use of a small packet size between the client and the server.
b. Chatty conversation over high latency network links.
c. Large amounts of unused data returned to the client.
d. Redundant database queries. additional tuning methods.
14).What are the roles of a System Architect involved in Database Tuning?
The system architect communicates information about the Oracle8i database design to the
database administrator (DBA) and the application developer. The database design
includes the logical structure of the database along with database size information. The
database must be designed correctly to ensure that database performance does not
deteriorate. It is difficult to tune an Oracle8i database that is based on a poor design.
Often, an application designer performs some of the system architect's duties, focusing
only on the logical structures and sizing involved with the applications as they are
74
designed. By communicating the database design information, the system architect
ensures that everyone using the database understands the flow of data and the processes
involved in the execution of applications.
15).How an Application Developer ensures application tuning?
An application developer communicates the implementation strategies to the system
architect and to the DBA. The implementation strategies include information about the
effect of applications on databases and the relationship between diverse application
processes. Application tuning begins with the design and development of the application
using the implementation strategies, modules, and SQL statements that are part of the
application programs. Much of performance tuning is achieved through effective
application design and development.
16).What are the jobs of a DBA and a h/w, s/w Administrator?
A DBA's tuning responsibility is to configure the database and monitor and document
system activity, so that system performance problems can be identified and corrected as
quickly as possible.
A hardware/software administrator documents and communicates the configuration of the
system environment in which the database and application reside. The configuration is
communicated to the system architect, the DBA, and the application developer. This
helps in designing and administering the system effectively.
17).How Queries and Views can be tuned.
If a query runs slower than expected, check if an index needs to be re-built, or if statistics
are too old.
Sometimes, the DBMS may not be executing the plan
you had in mind. Common areas of weakness:
–Selections involving null values.
–Selections involving arithmetic or string expressions.
–Selections involving OR conditions.
–Lack of evaluation features like index-only strategies orcertain join methods or poor size
estimation
Check the plan that is being used then adjust the choices of indexes or rewrite the
Query/Views.
18).Mention the guidelines for Query Tuning.
a).Minimize the use of GROUP BY and HAVING
75
b).Consider DBMS use of index when writing arithmetic expressions: E.age=2*D.age
will benefit from index on E.age, but might not benefit from index on D.age!
c).Minimize the use of DISTINCT:
Don’t need it if duplicates are acceptable, or if answer contains a key.
Avoid using intermediate relations:
SELECT E.dno, AVG(E.sal) FROM Emp E, Dept D WHERE E.dno=D.dno
AND
D.mgrname=‘Joe’
GROUP BY E.do
SELECT * INTO Temp FROM Emp E, Dept D WHERE E.dno=D.dno
AND
D.mgrname=‘Joe’ and
SELECT T.dno, AVG(T.sal) FROM Temp T GROUP BY T.dno
Does not materialize the Intermediary relations:
d).If there is a dense B+ tree index on <dno, sal>, an index-only plan can be used to
avoid retrieving Emp tuples in the second query.
19).What are the ways of achieving refined conceptual schema in terms of Tuning?
The conceptual schema should be refined by considering performance criteria and
workload:
– May choose 3NF or lower normal form over BCNF.
– May choose among alternative decompositions into BCNF (or 3NF) based upon the
workload.
– May denormalize, or undo some decompositions.
– May decompose a BCNF relation further!
– May choose a horizontal decomposition of a relation.
– Importance of dependency-preservation based upon the dependency to be preserved,
and the cost of the IC check.
Can add a relation to ensure dep-preservation (for 3NF, not BCNF!); or else, can
check dependency using a join.
20).What is Functional Dependency?
A Functional dependency is denoted by X -> Y between two sets of attributes X and Y
that are subsets of R specifies a constraint on the possible tuple that can form a relation
state r of R. The constraint is for any two tuples t1 and t2 in r if t1[X] = t2[X] then they
have t1[Y] = t2[Y]. This means the value of X component of a tuple uniquely determines
the value of component Y.
21).What is Multivalued Dependency?
76
Multivalued dependency denoted by X -> Y specified on relation schema R, where X and
Y are both subsets of R, specifies the following constraint on any relation r of R: if two
tuples t1 and t2 exist in r such that t1[X] = t2[X] then t3 and t4 should also exist in r with
the following properties:
t3[X] = t4[X] = t1[X] = t2[X]
t3[Y] = t1[Y] and t4[Y] = t2[Y]
t3[Z] = t2[Z] and t4[Z] = t1[Z]
where [Z = (R-(X U Y)) ]
22).What is Normalization?
Normalization is a refined process of removing anomalies from the table. With
Normalization redundancy is minimized, insertion, deletion and updation anomalies are
also minimized.
23).What are the Three Normal Forms?
1NF: The domain of attribute must include only atomic (simple, indivisible) values.
2NF: A relation schema R is in 2NF if it is in 1NF and every non-prime attribute A in R
is fully functionally dependent on primary key.
3NF: A relation schema R is in 3NF if it is in 2NF and for every FD X A either of the
following is true,
X is a Super-key of R.
A is a prime attribute of R.
In other words, if every non prime attribute is non-transitively dependent on primary key.
24).What is BCNF?
A relation schema R is in BCNF if it is in 3NF and satisfies an additional constraint that
for every FD X -> A, X must be a candidate key.
25).What is 4NF?
A relation schema R is said to be in 4NF if for every Multivalued dependency X -> Y
that holds over R, one of following is true
X is subset or equal to (or) XY = R.
X is a super key.
26).What is 5NF?
A Relation schema R is said to be 5NF if for every join dependency {R1, R2, ..., Rn}
that holds R, one of the following is true:
77
Ri = R for some i.
The join dependency is implied by the set of FD, over R in which the left side is
key of R.
27).What is Domain-Key Normal Form?
A relation is said to be in DKNF if all constraints and dependencies that should hold on
the the constraint can be enforced by simply enforcing the domain constraint and key
constraint on the relation.
28).When do we settle for 3NF against BCNF?
If the Decomposition is performed but there is no dependency-preserving we go for
settling it with 3NF.
For example: A Query may require a join on the decomposed schema, but it can be
answered by a scan of the original relation got by the 3NF.
29).What is Horizontal Decomposition?
Relation is replaced by a collection of relations that are projections. In most important
case, sometimes, might want to replace relation by acollection of relations that are
selections.
–Each new relation has same schema as the original, but a subset of the rows.
–Collectively, new relations contain all rows of the original. Typically, the new relations
are disjoint.
30).How will you Denormalize Contracts (Cid, Sid, Jid, Did, Pid, Qty, Val) into
BCNF.
There are 2 ways to decompose CSJDPQV into BCNF:
–SDP and CSJDQV; lossless-join but not dependent-preserving.
–SDP, CSJDQV and CJP; dependent-preserving as well
78
(v)
SQL REFRESHER
1. What is a concatenation operator?
‘||’
eg: select lastname || ‘works in ‘ || compname from employees.
It’s same as concat.
2. What is the difference between SQL and SQL plus?
Sql – language, has functions
Sql plus – environment, has commands
3. What is a membership operator?
IN (set)
4. How to select all names whose pattern in SA_ ?
% zero or more character
_ single character
eg: select name from employees where name like ‘%sa\_%’ escape ‘\’;
‘ \’ -escape sequence
5. How to check for null values?
IS NULL
Logical operators:
AND,OR,NOT
Sorting:
ORDER BY [DESC}
79
Single row functions:
Character Functions:
o Case Manipulation:
LOWER (‘attr’)
UPPER(‘attr’)
INITCAP(‘attr’)
o Character manipulations
CONCAT
SUBSTR
LENGTH
INSTR-instr(column,string,[m],[n])
LPAD/RPAD –lpad/rpad(column,n,string)
TRIM-trim(‘H’ from “hello world”)
REPLACE – replace(text,search-string,replacement-string)
Number functions:
o Round(column, n)
o Trunk(column, n)
o Mod(m, n)
6. What is sysdate - variable/command/function?
Sysdate is a function that returns date and time.
Date functions:
o months_between (date1,date2)
o add_months (date,n)
o next_day(date,’char’)
o last_day(date)
o round(date,[format])
o trunc(date,[format])
80
Conversion functions:
o To_char (date,’format_model)
Format model: {yyyy,year,month,mon,dy,day,dd}
o to_char(number,’format_model’)
Format model: {9 L 0 $ . ,}
o To_number(char,’format_model’)
o To_date(char,’format_model’)
General functions:
o NVL(expr1,expr2)
o NVL2(expr1,expr2,expr3)
o NULLIF(expr1,expr2)
o Coalesce(expr1,expr2…,exprn) -- return 1st not
null in expr list.
o NVL(jobid,’No job yet’)
Advantage of coalesce over NVL:
Can take multiple values.
• Case:
Eg: case expr when comp_expr1 then return expr1
[when..
when
else else – expr]
• Decode:
Eg: select name,salary decode(trunk(salary,
OUTER JOIN:
Put (+) on the side that is deficient in information.
Eg:
Select tab1.col, tab2.col from table1, table2 where tab1.col=tab2.col(+)
SELF JOIN:
Eg:
select work.last_name ||’works for’|| manager.lastname from employee
work,employee manager where work.manager_id=manager.employee_id.
NATURAL JOIN:
All column with same name.
USING clause – to specify some columns.
Eg:
-- Select deptid,deptname,locid,city from department NATURAL JOIN locations
81
-- Select e.employee_id,e.last_name,d.location_id from employee e JOIN department d
USING (dept_id);
Multiple row or Group Functions:
• AVG
• COUNT
• MAX
• MIN
• SUM
• STDDEV
• VARIANCE
Having and Where:
Having – having is a clause to restrict groups.
Where – clause to restrict in selection of rows and not groups.
7. What are the forms that a transaction can be of?
- many DML’s
- One DDL
- One DCL
Rules for naming tables and columns: (pertaining to oracle)
- must begin with
- can be of length up to 30 chars
- allowed characters: a-z,A-Z,0-9,_,$,#
Contents of a database object:
Table
View
Sequence
Index
Synonym
The above details can be got by using the query:
Select * from user_catalog – display objects specific to user
All_catalog – for all
Dba_catalog - for dba info
U$_catalog - to provide details regarding lock…
8. Difference between date and time stamp data types:
Date date type provides us only with the date.
Time stamp data type provides with date as well as time.
82
Some commands:
Forms of alter command:
Alter table table_name - add
- Modify
- Drop
- Set unused { may be used to set a column unused for a
period of time if available in the table but not required
currently}
- Drop unused { to drop a column marked as unused
permanantely}
rename table1 to table2
truncate – removes all rows from the table.
DDL commands;
- create
- alter
- drop
- rename
- truncate
- comment
Constraints:
Not null
Unique
Primary key
Foreign key
Check
For altering constraints:
Alter table employee
Add constraint
Drop constraint
Drop primary key cascade
Disable emp_pk cascade
Enable emp_pk cascade
9. How to find the nth maximum in the given relation?
create table stud(rollno number,marks number);
select distinct(a.marks) from stud a
where &N =
(select count(distinct(b.marks))from stud b
where b.marks >= a.marks)
83
Part – III
OPERATING
SYSTEM
84
Topics Covered:-
i) Introduction to Operating System, Comparison and File System
–Deepti S Babu
ii) Process Management –Dinesh K
iii) Memory Management –Kamalesh R S
iv) Input/Output & Disc Management(Misc) –Vaibhav Dutt Sharma
v) Unix FAQs - Kalamadhuri Ch & Aranganathan K P
85
(i)
INTRODUCTION TO OPERATING SYSTEM , COMPARISON
AND
FILE SYSTEM
1. What are the basic functions of an operating system?
Operating system controls and coordinates the use of the hardware among the various
applications programs for various uses. Operating system acts as resource allocator and
manager. Since there are many possibly conflicting requests for resources the operating
system must decide which requests are allocated resources to operating the computer
system efficiently and fairly. Also operating system is control program which controls the
user programs to prevent errors and improper use of the computer. It is especially
concerned with the operation and control of I/O devices.
2. What is a file management system?
A file management system is a set of system software that provides services to users and
applications related to the use of files.
3. What are the objectives of a file management system?
• To meet the data management needs and requirements of the user and the ability
to perform various operations.
• To guarantee, to the extent possible that the data in the files are valid.
• To optimize performance
• Tp provide I/O support for a variety of types of storage deivces.
• To minimize or eliminate the potential for lost or destroyed data.
• To provide a standardized set of I/O interface routines.
• To provide support for multiple users.
4. Discuss the 3 methods of record blocking.
Fixed blocking: Records of fixed length are used. An integral number of records are
stored in a block. There may be unused space at the end of each block.
Variable-length spanned blocking: Variable length records are used and are packed
into blocks with no unused space. Thus, some records may span 2 blocks.
Variable-length unspanned blocking: Here the records are of variable length but the
records cannot span over a block. Thus, the length of a record cannot be greater than 1
block.
86
5. Differentiate pre-allocation and dynamic allocation.
A pre-allocation policy would require that the maximum size of the file be declared at the
time of the file creation request. A dynamic allocation policy allocates space to a file in
portions as and when required.
6. What is disk inter-leaving?
Multiple disks can be used to store a single file by a technique called disk inter-leaving or
disk stripping. A group of disks are said inter-leaved if successive portions of a file are
stored on different disks. Since a file is stored across many disks, any request for file I/O
tends to be uniformly distributed across all the disks, hence yielding a better throughput
and improved response time.
7. How is a file system mounted?
Just as a file should be opened before it is used, a file system must be mounted before it
can be available to the processes on the system. To mount a file system, the name of the
device and the location within the file system is given to the operating system.
8. What is a mount point?
The location within the file structure at which we attach the file system is called as the
mount point. Typically, a mount point is an empty directory at which the mounted file
system will be attached.
9. What is Input spooling?
Input spooling takes the information from the input device, prepares the job for
scheduling, and places an entry in a job queue. Using input spooling, you can usually
shorten job run time, increase the number of jobs that can be run sequentially, and
improve device throughput.
The main elements of input spooling are:
Job queue
An ordered list of batch jobs submitted to the server for running and from which
batch jobs are selected to run.
Reader
A function that takes jobs from an input device or a database file and places them
on a job queue.
When a batch job is read from an input source by a reader, the commands in the input
stream are stored in the server as requests for the job, the inline data is spooled as inline
data files, and an entry for the job is placed on a job queue. The job information remains
87
stored in the server where it was placed by the reader until the job entry is selected from
the job queue for processing by a subsystem.
10. What is Output spooling?
Output spooling allows the server to produce output on multiple output devices, such as
printer and diskette devices, in an efficient manner. It does this by sending the output of a
job destined for a printer or diskette to disk storage. This process breaks a potential job
limitation imposed by the availability or speed of the output devices.
Spooling is especially important in a multiple-user environment where the number of jobs
running often exceeds the number of available output devices. Using output spooling, the
output can be easily redirected from one device to another.
The main elements of output spooling are:
Device description
A description of the printer or diskette device
Spooled file
A file containing spooled output records that are to be processed on an output
device
Output queue
An ordered list of spooled files
Writer
A program that sends files from an output queue to a device
Application program
A high-level language program that creates a spooled file using a device file with
the spooling attribute specified as SPOOL(*YES)
Device file
A description of the format of the output, and a list of attributes that describe how
the server should process the spooled file
11. What is a Job queue?
A job queue is an ordered list of jobs waiting to be processed by a particular subsystem.
Jobs will not be selected from a job queue by a subsystem unless the subsystem is active
and the job queue is not held. You can use job queues to control the order in which jobs
are run.
A base set of job queues is provided with your server. In addition, you may create
additional job queues that you need.
12. Name five major categories of system calls
System calls can be roughly grouped into five major categories:
88
• Process & Job Control: with end, abort, load, execute, create/fork or submit,
get/set attributes, terminate/exit, wait, dump, trace, etc. system calls which
perform controlling over processes, process status, and memory.
• File manipulation: with open, create, delete, read, write, reposition, close, and
get/set file attributes, etc..
• Device Management: with request, release, read, write, and reposition, which
provide additional resources needed by processes.
• Information Maintenance: with time, date, version, memory, chmod, get/set
process attributes, etc..
• Communication: with get hostid, get processid, open/close connection, accept
connection, wait for connection, read message, write message, map memory, etc..
13. What does the API abbreviation stand for?
API, standing for Application Program Interfaces, is the interface between a process and
the operating system; which provides convenience of interactions between users and an
operating system, and is implemented by a bunch of system calls.
14. Explain Priority inversion.
The issue of a higher priority thread being blocked from execution because a lower
priority thread is holding a resource (e.g., lock) needed to run is described as priority
inversion. The dispatcher should address this problem through priority inheritance.
15. Explain the features of some common operating systems.
• DOS
o Disk Operating System
o only really provides a files system – as name implies
o virtually no resource protection
o no multi-tasking
o little fault monitoring
o little convenience in input/output
o no built-in networking
o commercial product
• Windows 3.x
o added Graphic User Interface (GUI) to DOS
o basic operating system is DOS with all its imperfections
o improved user convenience
o some multitasking
o little protection
o no built-in networking
89
• Windows 95 (98)
o different OS from DOS but provides some DOS and Win 3.x compatibility
o improved convenience – better GUI
o easier addition and management of resources like devices and drivers
o integrates some basic networking support
built in Internet clients
support for Internet protocols – TCP/IP
o still basically single user
o some crude multi-tasking
o little process protection
• Windows NT
o all the advantages of Windows 95 including GUI
o different operating system from Windows 95
o fully multi-tasking
o more advanced networking than Windows 95
incorporates Internet client/server capabilities
supports Internet protocols – TCP/IP
networking applications based on NETBIOS and the associated
NETBUI protocol
BIOS – basic input/output system available on PC’s, DOS etc
NETBIOS – extends input/output over a network
NETBUI
NETBIOS Extended User Interface
includes NETBIOS plus additions done by IBM and
Microsoft
o comes as
single user for high end PCs
multi-user on network servers – limitations – security problems
• UNIX – LINUX
o LINUX is a UNIX clone running on PC and workstations
o multi-tasking
o multi-user
o better user process and resource protection
o many arguable advantages over Windows NT
o more robust system
o better process logging
o built in networking – TCP/IP
o better device drivers
o FREE and down-loadable from Internet
o several freely available applications
o ideal for research & development environments
o disadvantage – less friendly user interface but GUIs under development
90
• we use LINUX extensively for both servers and clients
16. What is a device file?
Device files are files that provide access to attached devices such as tapes, diskettes,
printers, displays, spools, and other systems that are attached by a communications line.
Example- Tape files, which allow access to data files on tape devices.
17. What Is Multics?
Multics (Multiplexed Information and Computing Service) is a timesharing operating
system begun in 1965 and used until 2000. The system was started as a joint project by
MIT's Project MAC, Bell Telephone Laboratories, and General Electric Company's Large
Computer Products Division. Prof. Fernando J. Corbató of MIT led the project. Bell Labs
withdrew from the development effort in 1969, and in 1970 GE sold its computer
business to Honeywell, which offered Multics as a commercial product and sold a few
dozen systems.
18. Operating systems can be classified based on the following criteria:
GUI - Short for Graphical User Interface a GUI Operating System contains graphics and
icons and is commonly navigated using by using a computer mouse Below are some
examples of GUI Operating Systems.
System7.x
Windows98
Windows CE
Multi-user - A multi-user Operating System allows for multiple users to use the same
computer at the same time and/or different times. Below are some examples of multi-user
Operating Systems.
Linux
UNIX
Windows 2000
Multiprocessing - An Operating System capable of supporting and utilizing more than
one computer processor. Below are some examples of multiprocessing Operating
Systems.
Linux
UNIX
Windows 2000
91
Multitasking - An Operating systems that is capable of allowing multiple software
processes to be run at the same time. Below are some examples of multitasking Operating
Systems.
UNIX
Windows 2000
Multithreading - Operating systems that allow different parts of a software program to
run concurrently. Operating systems that would fall into this category are:
Linux
UNIX
Windows 2000
19. What is the Zeroth Generation of computers?
The term zeroth generation is used to refer to the period of development of computing,
which predated the commercial production and sale of computer equipment. The period
might be dated as extending from the mid-1800s, and Charles Babbage’s Analytical
Engine, to the development of the fist commercial computer in 1951.
20. Features of Microsoft Windows 95
Some of the new features that Windows 95 have which Windows 3.x Does not have are:
Plug and Play Allows hardware devices to be automatically installed into the computer
with the proper software. Does not require jumpers to be played with.
32 Bit 32-Bit operating system allowing the computer to run faster and more efficiently.
Registry Combines the power of multiple configuration files into two files allowing the
system configurations to be located easier.
Memory Windows 95 has an improved memory handling processes compared to
Windows 3.11.
Right mouse click Allows you new access and text manipulation by utilizing both
buttons instead of one.
CD-Player Enhanced CD-Player with improved usability and AutoPlay feature.
21. Features of Microsoft Windows 98
92
Microsoft Windows 98 is the upgrade to Microsoft Windows 95. While this was not as
big as release as Windows 95, Windows 98 has significant updates, fixes and support for
new peripherals. The following is a list of some of its new features.
Protection - Windows 98 includes additional protection for important files on your
computer such as backing up your registry automatically.
Improved support - Improved support for new devices such as AGP, Direct
X,DVD,USB,MMX.
FAT32 - Windows 98 has the capability of converting your drive to FAT32 without
loosing any information.
Interface - Users of Windows 95 and NT will enjoy the same easy interface.
PnP - Improved PnP support, to detect devices even better then Windows 95.
Internet Explorer 4.0 - Included Internet Explorer 4.0
Customizable Taskbar - Windows adds many nice new features to the taskbar which 95
and NT do not have.
Includes Plus! - Includes features only found in Microsoft Plus! free.
Active Desktop - Includes Active Desktop which allows for users to customize their
desktop with the look of the Internet.
22. Features of Microsoft Windows Me
Windows Millennium also known as Windows ME was introduced September 14, 2000
to the general public as the upgrade for Windows 95 and Windows 98 users and is
designed for the end-users. Overall Windows ME has the look and feel of Windows 98
with some additional fixes and features not available in previous operating systems.
While Windows ME includes some of the latest fixes and updates and some enticing new
features we recommend this update only for users that may find or want some of the new
features listed below or for users who are purchasing a new computer with this operating
system included on it.
93
NEW FEATURES:
Some of the most noticeable new features include:
Revert back to backup of computer - Windows Me allows the user to automatically
restore an old backup incase files are corrupted or deleted.
Protect important system files - Windows Me allows the user to protect important system
files and will not allow these files to be modified by any type of other software.
Movie editor - Allows users to edit and or combine Microsoft movie files. Importing
movies requires additional hardware.
Windows Media Player - Includes Media Player 7, which enables users a more advanced
way of listening and organizing their media files.
23. Features of Microsoft Windows NT
Windows NT 4.0 is the look and feel of Windows 95 however is a completely different
Operating System. Windows NT contains advanced security features, advanced network
support, Full 32-bit operating system, advanced multitasking, user administration and
much more. While NT is a very advanced operating system it does lack the support of
drivers, features, and gaming support when compared to Windows 95 / Windows 98 and
is why even today Windows NT is still used primarily by businesses and technical users.
24. Features of Microsoft Windows XP
Codenamed Whistler, Microsoft Windows XP is short for Windows Experienced and is
the convergence of the two major Microsoft operating systems into one.
Windows XP is available in the following versions:
HomeEdition
Professional
Windows XP includes various new features not found in previous versions of Microsoft
Windows. Below is a listing of some of these new features.
• New interface - a completely new look and ability to change the look.
• Updates - new feature that automatically obtains updates fro the Internet.
• Internet Explorer 6 - Includes internet explorer 6 and new IM.
• Multilingual support - added support for different languages.
In addition to the above features, Windows XP does increase reliability when compared
to previous versions of Microsoft Windows
94
25. What are the advantages and disadvantages of Monolithic and microkernel?
• Advantages and disadvantages
– monolithic:
» faster switching to kernel functions
» simple interfaces between functions i.e. procedure calls
» easier access to data structures shared between functions
» larger memory resident core
» may become too large to maintain easily
– micro-kernel
» better partitioning - should be easier to implement and maintain
» smaller memory resident core
» slower switching to system processes
» inter-system-process communications may be slower
26. Mention a few differences between Windows and linux.
Topic Linux Windows
PRICE The majority of Linux
variants are available for
free or at a much lower
price than Microsoft
Windows.
Microsoft Windows can run
between $50.00 - $150.00
US dollars per each license
copy.
EASE Although the majority
Linux variants have
improved dramatically in
ease of use, Windows is still
much easier to use for new
computer users.
Microsoft has made several
advancements and changes
that have made it a much
easier to use Operating
System, and although
arguably it may not be the
easiest Operating System, it
is still Easier than Linux.
RELIABILITY The majority of Linux
variants and versions are
notoriously reliable and can
often run for months and
years without needing to be
rebooted
Although Microsoft
Windows has made great
improvements in reliability
over the last few versions of
Windows, it still cannot
match the reliability of
Linux.
SOFTWARE Linux has a large variety of
available software
programs, utilities, and
games. However, Windows
has a much larger selection
of available software.
Because of the large amount
of Microsoft Windows
users, there is a much larger
selection of available
software programs, utilities,
and games for Windows.
SOFTWARE COST Many of the available
software programs, utilities,
Although Windows does
have software programs,
95
and games available on
Linux are freeware and/or
open source. Even such
complex programs such as
Gimp, OpenOffice,
StarOffice, and wine are
available for free or at a low
cost.
utilities, and games for free,
the majority of the
programs will cost
anywhere between $20.00 -
$200.00+ US dollars per
copy.
HARDWARE Linux companies and
hardware manufacturers
have made great
advancements in hardware
support for Linux and today
Linux will support most
hardware devices. However,
many companies still do not
offer drivers or support for
their hardware in Linux.
Because of the amount of
Microsoft Windows users
and the broader driver
support, Windows has a
much larger support for
hardware devices and a
good majority of hardware
manufacturers will support
their products in Microsoft
Windows.
SECURITY Linux is and has always
been a very secure
Operating System.
Although it still can be
attacked when compared to
Windows, it much more
secure
Although Microsoft has
made great improvements
over the years with security
on their Operating System,
their Operating System
continues to be the most
vulnerable to viruses and
other attacks.
OPEN SOURCE Many of the Linux variants
and many Linux programs
are open source and enable
users to customize or
modify the code however
they wish to.
Microsoft Windows is not
open source and the
majority of Windows
programs are not open
source.
SUPPORT Although it may be more
difficult to find users
familiar with all Linux
variants, there are vast
amounts of available online
documentation and help,
available books, and
support available for Linux.
Microsoft Windows
includes its own help
section, has vast amount of
available online
documentation and help, as
well as books on Each of
the versions of Windows.
27. Differentiate command line and GUI.
Topic Command line GUI
Ease Because of the Although new users may
96
memorization and
familiarity needed to
operate a command line
interface new users find it
much more difficult to
successfully navigate and
operate a command line
interface.
have a difficult at time
learning to use the mouse to
operate and use a GUI most
users pick up this interface
much easier when
compared to a command
line interface
Control Users have much more
control of their file system
and operating system in a
command line interface. For
example users can easily
copy a specific type of file
from one location to another
with a one-line command.
Although a GUI offers
plenty of control of a file
system and operating
system often advance users
or users who need to do
specific task may need to
resort to a command line to
complete that task.
Multitasking Although many command
line environments are
capable of multitasking they
do not offer the same ease
and ability to view multiple
things at once on one
screen.
GUI users have windows
that enable a user to easily
view, control, and
manipulate multiple things
at once and is commonly
much faster to do when
compared to a command
line.
Scripting Because command line
users only need to use their
keyboards to navigate a
command line interface and
often only need to execute a
few lines to perform a task
an advanced command line
interface user would be able
to get something done faster
then an advance GUI user.
Although A GUI enables a
user to create shortcuts,
tasks, or other similar
actions to complete a task
or run a program it doesn't
even come close in
comparison to what is
available through a
command line.
97
(ii)
PROCESS MANAGEMENT
1. What is a process?
A program under execution that competes for CPU time and other resources. It is
an active entity where as program is a passive entity.
2. What is degree of multiprogramming?
The number of process running simultaneously and competing for the CPU is
known as the degree of multiprogramming.
3. What is a context switch?
Switching from one process to another process without changing the state of
process that is currently in the running state.
4. What are process states?
The state of a process is defined in part by the current activity of that process.
New, Running, Ready, Waiting and Terminated.
5. Difference between multiprogramming & multiprocessing.
More than one processes running on a processor is former and the later is more
than one processes running on more than one processors.
6. Can a multitasking OS also a multiprogramming OS?
No, a multiprogramming OS must be a multitasking but not vice versa.
7.How is the concurrent processes related?
By two factors
1) Competition
2) Co-operation
8.What does a PCB consist of?
It is also called as a task control block. It has
Process state
Program counter
CPU registers
CPU scheduling information
98
Memory management information
Accounting information
I/O status information
9.What is job queue, ready queue, device queue?
Processes entering the system are put into job queue.
Processes that are ready to run are put into ready queue.
List of processes waiting for a particular I/O device is called a
device queue.
10. What are types of schedulers?
Long term/job scheduler: selects processes from mass storage device,
where processes are spooled for later execution and loads into memory for
execution.
Short term/CPU scheduler: selects from among the processes that are
ready to run, and allocates the CPU to one of them. (Frequent)
Medium term scheduler: does swapping.
11. What is cascading termination?
If a process terminates then all its children must also be terminated. This
phenomenon, referred to as cascading termination, is normally initiated by the
operating system.
12. What are benefits of multithreaded programming?
Responsiveness, resource sharing, economy, utilization of multiprocessor
architecture.
13. What are user level and kernel level threads?
User threads are supported above the kernel and are implemented by a thread
library at the user level. They are fast to create and manage. If kernel is single
threaded then a thread performing a blocking system call will cause the entire
process to block.
Kernel threads are supported by operating system.
14. What are different multithreading models?
Many-to-one: maps many user level threads to one kernel thread. Concurrency is
not achieved.
One-to-one: maps each user thread to a kernel thread. Has the burden of creating a
kernel thread for each user thread.
99
Many-to-many: maps many user threads to a small or equal number of kernel
threads.
15. What is thread cancellation?
Task of terminating a thread before it has completed. A thread that is to be
cancelled is often referred to as the target thread.
16. What are types of cancellation?
Asynchronous: one thread immediately terminates the target thread.
Deferred: the target thread periodically check if it should terminate.
17. What is a thread pool and what are its benefits?
Threads are created during process startup and placed into a pool called thread
pool, where they wait for work.
i. Faster to service a request
ii. Limits no of threads
18. What are two types of scheduling?
Preemptive and non-preemptive scheduling.
19. What are functions involved in dispatching?
Dispatcher is the module that gives control of the CPU to the process selected by
the short-term scheduler. This involves
Switching context
Switching to user mode
Jumping to the proper location in the user program
20. What dispatch latency?
The time taken by the dispatcher to stop one process and start another running is
known as the dispatch latency.
21. What are scheduling criteria?
CPU utilization,
Throughput,
Turnaround time,
Waiting time,
Response time
22. Give some scheduling algorithms?
100
FCFS, SJF, priority scheduling, round-robin, multilevel feedback queue.
23. What is convoy effect?
All small process waiting for one big process to get off the CPU. It is usually
experienced in FCFS scheme.
24. What is time quantum in RR scheduling?
The constant time allocated for every process to stay in CPU.
25. Brief about multilevel queue scheduling.
It partitions the ready queue into several queues. The processes are permanently
assigned to one queue, generally based on some property of the process, such as
memory size, process priority, or process type. Each queue has its own scheduling
algorithm.
26. What is an idle thread?
If no ready thread is found, the dispatcher will execute a special thread called the
idle thread.
27. What is a race condition?
A situation where several processes access and manipulate the same date
concurrently and the outcome of the execution depends on the particular order in
which the access takes place, is called a race condition.
28. What is critical section?
Each process has a segment of code, in which the process may be changing
common variables, updating a table, writing a file, and so on. This section can be
executed only by one process at a time.
29. What is mutual exclusion?
If one process is in the critical section no other process can enter the critical
section. This property is known as mutual exclusion.
30. What is the solution to critical solution problem?
Mutual exclusion,
Progress,
Bounded waiting.
101
31. What is a semaphore?
It is a synchronization tool used to solve critical section problem. It is an integer
variable and can be accessed only through two atomic operations: wait and signal.
32. What is busy wait?
Process checks for CPU for every clock cycle or often when it is waiting to
resume its execution.
33. What is deadlock?
Two or more processes waiting indefinitely for an event that can be caused only
by one of the waiting process leads to dead lock.
34. Counting semaphore Vs Binary semaphore.
Counting semaphore: its integer value can range over an unrestricted domain.
Binary semaphore: can take values 0 and 1 only.
35. What is a monitor?
High level synchronization construct is the monitor type. It is characterized by a
set of programmer-defined operators. The representation of a monitor type
consists of declarations of variables whose values define the state of an instance
of the type, as well as the bodies of procedures or functions that implement
operations on the type.
36. What are the conditions for deadlock?
i. Mutual exclusion
ii. Hold and wait
iii. No preemption
iv. Circular wait
37. What is deadlock prevention and deadlock avoidance
Deadlock prevention is a set of methods for ensuring that at least one of the
necessary conditions cannot hold.
Deadlock avoidance on the other hand, requires that the operating system be
given in advance additional information concerning which resources a process
will request and use during its life time so that avoiding the deadlock well before
scheduling the processes.
38. How process termination is done to avoid deadlock?
102
Abort all deadlocked processes
Abort one process at a time until the deadlock cycle is eliminated.
39. What are issues to be considered for resource preemption?
Selecting a victim process to preempt
Rollback the process to a safe state
Ensure that starvation does not occur for any process
40. What for the Banker’s algorithm is used?
It is a deadlock avoidance algorithm.
103
(iii)
MEMORY MANAGEMENT
• Selection of memory management system depends mainly on the hardware
design of the system.
• Input queue:
The collection of processes on the disk that is waiting to be brought into memory
for execution are placed in the input queue.
• Ways of binding instructions and data to memory addresses can be done at
- compile time:
Deals with absolute code i.e., the compiler code will always start from
that location. Eg: MS_DOS .com format programs are absolute code
bound at compile time
- load time:
not known at compile time regarding the location where the process will
reside in memory, these codes are relocatable called as relocatable code.
Binding is done only during load time.
- execution time :
binding is done only during run time. Requires special hardware for this
scheme to work..
• Logical Address space: (Virtual Address space with regards to execution time
binding) Address generated by the CPU is commonly referred to as logical
address. The set of all logical addresses generated by a program is a logical
address space.
• Physical Address Space:
104
• Addresses seen by the memory unit i.e., the one loaded into the memory address
register is called the physical address and the set of all physical addresses
corresponding to the logical addresses is a physical address space.
• Memory management unit, a hardware device unit is responsible for mapping
from virtual to physical addresses.(A relocation register is used during mapping)
• What is a stub? And in what way its related to memory management?
A stub is a small piece of code that indicates how to locate the appropriate
memory-resident library routine, or how to load the library if the routine is not
already present.
• Overlays:
The idea of overlays is to keep in memory only those instructions and data that are
needed at any given time. When other instructions are needed, they are loaded into space
occupied previously by instructions that are no longer needed.
Generally, the memory required by the bigger process is allocated so that the other
processes can be overlaid at the same allocated memory.
Ex: in a 2 pass assembler the contents used during pass1 may not be required
during pass2 so the memory allocated for pass1 can be used to place the instructions of
pass2.
• Dynamic linking requires support from OS.
• Dynamic ling and overlays doesn’t require support from OS.
Roll in and Roll out:
The task of swapping process in and out of the memory can be termed as roll in and roll
out.
Ex: A higher priority process may be given service by swapping out (rolling out) a lower
priority process under execution. When the higher priority process finishes execution the
lower priority process may be swapped back in (rolled in).
105
Work of a dispatcher:
- check whether the next process in the queue is in the memory.
- If not and if there is no free memory region, the dispatcher swaps out a
process currently in memory and swaps in the desired process.
- Reload registers as normal and transfer control to the selected process.
1. Where is the swap space actually allocated?
Swap space is allocated as a chunk of disk, separate from the file system, so that its use is
as fast as possible.
2. When is swapping done?
swapping is done only when there are many processes running and were using a threshold
amount of memory. The process of swapping will be halted once the load on the system
is reduced.
In Microsoft windows 3.1 it is the user who decides on whether to swap a process (in or
out), rather than the scheduler deciding it.
3. What are the 2 main partitions of memory for?
- one for the resident operating system.
- one for the user processes.
4. Where is the OS normally placed and why?
The operating system is placed normally in the low memory since this is where the
interrupt vector is normally located. Interrupt vector is the one that mainly helps in
deciding the location of the os.
5. Types of memory protection:
- protecting the OS from the user process.
- protecting user processes from one another.
106
6. Registers involved in providing memory protection:
- Relocation register: contains the value of the smallest physical address.
- Limit register: contains the range of logical address.
Always the logical address must be less than that of the (upper bound of the) limit
register.
7. Transient operating system code:
There may be some code loaded in to the memory that may not be required
always and which can be allotted for some other processes when not required. So this
type of code that may be loaded and removed as needed are called transient
operatingsystem
code.
Eg: operating system may contain code and buffers for device drivers, if a device driver
is not commonly used then there is no need to keep the code and data in memory.
8. Memory Allocation:
Degree of multiprogramming is bounded by the number of partitions.
Partitioning is one of the major techniques used in memory allocation.
Discuss the memory partitioning scheme?
Fixed size partitioning
- equal
- unequal
• using single queue
• using multiple queue
Dynamic (variable size partitioning)
- first fit
- best fit
- next fit
- worst fit
107
- block list
- block header
hole: Available block of memory.
Memory allocation can also be done using the Buddy System technique.
Here division of memory in to two consecutively in the powers of 2 is done
during allocation and the parts are recombined during de-allocation if their adjacent node
is free.
9. What is 50-percent rule?
Statistical analysis of first fit, for instance reveals that, even with some optimization,
given N allocated blocks, another 0.5N blocks will be lost due to fragmentation. That is
one-third of memory may be unusable. This property is known as 50-percent rule.
10. Fragmentation and its types?
Inability to use the available memory is generally termed as fragmentation.
Internal Fragmentation:
Fragment (part) of Memory that is internal to a partition but is not being used.
External Fragmentation:
External fragmentation exists when enough total memory space exists to satisfy a request
but it is not contiguous; storage is fragmented into a large number of small holes.
Solutions:
- Compaction: shuffling and placing all free memory together in one large
block.
- Permitting the logical address space of a process to be non-contiguous.
- Paging
- Segmentation
- Combination of segmentation and paging
108
11. Is compaction always possible?
Compaction is only possible if relocation is dynamic, and is done at execution time.
12. What is paging?
Paging is a memory management scheme that permits the physical address space of a
process to be non-contiguous.
Here in paging technique the address generated by the CPU i.e., the logical address is
divided into 2 parts
- page number : used as index in to page table.
- page offset
If the size of the logical address space is and the page size is addressing
units (bytes or words), then the high-order m-n bits of logical address designate the page
number and the n lower-order bits designate the page offset.
13. Fragmentation in paging:
No external fragmentation but may have internal fragmentation.
User view: user program views that memory as one single contiguous space, containing
only that program.
System view: the user program is scattered throughout physical memory, which also
holds other programs.
Contents of the frame table (data structure used in the paging technique):
- Allocated frames
- Available frames
- Total no of frames
109
An entry for each page is made in the frame table.
Page Table Base Register is used to point to the page table.
While using PTBR two-memory access is required, this is time consuming so to
overcome the extra memory access we go for TLB.
What is a TLB?
TLB stands for Translation Look-aside Buffer.
- it’s a small ,special, fast lookup hardware cache.
TLB entries:
- A key (tag)
- Value
14. What is Wired Down TLB entries ?
Those entries that cannot be removed from the TLB are called as wired down entries.
15. What is an ASID?
ASID stands for Address-Space Identifiers, which is sometimes stored in each entry of
TLB.
An ASID uniquely identifies each process and is used to provide address space protection
for that process.
16. What is a hit ratio?
The percentage of times that a particular page number is found in the TLB is called the
hit ratio.
17. What is a Reentrant code or pure code or non-self modifying code?
This code never changes during execution, thus 2 or more processes can execute the same
code at the same time.
18. What is segmentation?
110
Segmentation is a memory management scheme that supports user view of memory. A
logical address space in segmentation is a collection of segments.
Each segment has
– name
– length (offset)
(In contrast to paging where the user specified only a single address, which was
partitioned by the hardware into a page number and an offset, all invisible to the
programmer)
19. Data structure used in segmentation:
Segment table.
Entries of a segment table are,
- Segment base – contains the physical address where the segment resides I
memory.
- Segment limit – specifies the length of the segment.
20. What is virtual memory?
Virtual memory is one that allows the execution of processes that may not be completely
in memory.
Advantages:
- programs can be larger than the physical memory.
- Frees users from the concern of memory storage limitations.
- Allows sharing of files and address spaces.
- Provides an efficient mechanism for process creation.
Disadvantage:
- may decrease the performance if used carelessly.
21. Virtual memory is commonly implemented by demand paging.
Demand paging:
111
Demand paging is one where a page is loaded in to main memory from the secondary
memory only on demand.
Lazy swapper:
A lazy swapper never swaps a page into memory unless that page will be needed.
22. What is thrashing?
Generally high paging activity is called thrashing. A process is thrashing if it
spends more time in paging than executing.
112
(iv)
INPUT/OUTPUT & DISC MANAGEMENT(MISC)
1) What do you mean by buffer swapping ?
Buffer swapping is also know as double buffering and it smoothes out the flow of data
between the I/O device and a process. It is an improvement over the single buffering
technique and involves assigning two system buffers to the operation. A process now
transfers data to(and from ) one buffer while the operating system empties(or fills) the
other.
2) What is a circular buffer and when do we use it ?
Double buffering is inadequate if the process performs rapid bursts of I/O. In this case
the problem can be alleviated by using more than two buffers. When more than two
buffers are used , the collection of buffers is itself referred to as circular buffer.
3) What is a device controller ?
I/O devices are not directly used to communicate with the computer system. Instead,
there is an intermediate electronic device called the device controller. This device
controller is really a special purpose computer with a limited instruction set, limited to
input/output control. The processor gives it one instruction at a time. However some
device controllers do know how to execute programs of more than one instruction. Then
they are called as I/O processors (IOPs) or sometimes channels.
4) Which algorithm is called the Elevator algorithm ?
The SCAN algo is called the Elevator algo since it behaves like an elevator and goes in
one direction until there are no more requests in that direction, and then reverse and go
the other direction.
5) Define the following terms :
Seek time : The seek time is the time for the disk arm to move the heads to the cylinder
containing the desired sector.
Rotational latency : rotational latency is the additional time waiting for the disk to rotate
the desired sector to the disk head.
Access time : The sum of seek time, if any, and the rotational delay is called as access
time.
113
Disk bandwith = total no of bytes transferred
--------------------------------
( Time of completion of last transfer -
Time of first req for service )
6) List the various disk scheduling algorithms :
FCFS ( first come,first served )
SSTF ( Shortest-seek-time-first )
SCAN algorithm
C-SCAN (Circular Scan algorithm )
LOOK Scheduling
C-LOOK Scheduling
N-STEP SCAN
F-SCAN
7) What is " arm stickiness " and how is avoided ?
With SSTF, SCAN and C-SCAN it is possible that the arm may not move for a
considerable peroid of time. Arm stickiness occurs when one or more processes have
high access rates and monopolise the entire device by repeated request to that track . To
avoid this arm stickiness the disk request queue must be segmented. Two examples of
this approach is N-step-SCAN and FSCAN.
8) Explain the N-STEP SCAN policy and the F-SCAN policy?
The n step scan policy segments the disk request into subqueues of the len L . Sub
quesues are processed one at at a time , using the SCAN . While a queue is being
processed , new requests must be added to some other queue.
If fewer than N requests are avilable at the ned of the scan , then all of them are
processed with the next scan.FSCAN policy uses two subqueues. When the scan begins,
all new request are in one of the queues , with the other empty. During the scan , all the
new requets are put in the other queue. Thus , the serveice of the new request is deferred
until all of the old processes have been processed.
114
9) Explain the difference betwen Physical formatting and Logical formatting ?
A new disk is simply is just a platter of magnetic material. Before we can store data , it
must be divided into sectors that the disk controller can read and write. This process is
called low-level formatting or Physical formatting. Logical formatting corressponds to "
making the file system " . In this step the operating system stores the initial file system
data structures onto the disk.
10) What is "sector sparing" ? Explain !
When the disk is intialised during the low level format at the factory , some spare sectors
are set aside and are not visible even by the operating system. When a bad sector is
encounterd , the controller replaces each bad sector logically with one of the spare
sectors. This scheme is called as sector sparing / Fowarding .
11) What is DRAM? In which form does it store data?
DRAM is not the best, but it’s cheap, does the job, and is available almost everywhere
you look. DRAM data resides in a cell made of a capacitor and a transistor. The capacitor
tends to lose data unless it’s recharged every couple of milliseconds, and this recharging
tends to slow down the performance of DRAM compared to speedier RAM types.
12) What is Dispatcher?
Dispatcher module gives control of the CPU to the process selected by the short-term
scheduler; this involves: Switching context, Switching to user mode, Jumping to the
proper location in the user program to restart that program, dispatch latency – time it
takes for the dispatcher to stop one process and start another running.
13) Is disk scheduling, other than FCFS scheduling, useful in a
singleuserenvironment?
Explain your answer.
In a single-user environment, the I/O queue usually is empty.Requests generally arrive
from a single process for one block or for a sequence of consecutive blocks. In these
cases, FCFS is an economical method of disk scheduling. But LOOK is nearly as easy to
program and will give much better performance when multiple processes are performing
concurrent I/O, such as when aWeb browser retrieves data in the background while the
operating system is paging and another application is active in the foreground.
14) Why is rotational latency usually not considered in disk scheduling? How would
you modify SSTF, SCAN, and C-SCAN to include latency optimization?
115
Most disks do not export their rotational position informationto the host. Even if they did,
the time for this information to reach the scheduler would be subject to imprecision and
the time consumed by the scheduler is variable, so the rotational position information
would become incorrect. Further, the disk requests are usually given in terms of logical
block numbers, and the mapping between logical blocks and physical locations is very
complex.
15) State three advantages of placing functionality in a device controller, rather
than in the kernel. State three disadvantages.
Three advantages:
Easy debugging : Bugs are less likely to cause an operating system crash.
Performance Optimization : Performance can be improved by utilizing
dedicated hardware and hard-coded algorithms.
Thin kernel : The kernel is simplified by moving algorithms out of it.
Three disadvantages:
Bugs are harder to fix.
a new firmware version or new hardware is needed Improving algorithms
likewise require a hardware update rather than just a kernel or device-driver
update.
Embedded algorithms could conflict with application’s use of the device,causing
decreased performance.
16) Why might a system use interrupt-driven I/O to manage a single serial port, but
polling I/O to manage a front-end processor, such as a terminal concentrator?
Polling can be more efficient than interrupt-driven I/O. This is the case when the I/O is
frequent and of short duration. Even though a single serial port will perform I/O
relatively infrequently and should thus use interrupts, a collection of serial ports such as
those in a terminal concentrator can produce a lot of short I/O operations, and
interrupting for each one could create a heavy load on the system. A well-timed polling
loop could alleviate that load without wasting many resources through looping with no
I/O needed.
17) Distinguish between a STREAMS driver and a STREAMS module.
The STREAMS driver controls a physical device that could be involved in a STREAMS
operation. The STREAMS module modifies the flow of data between the head (the user
interface) and the driver.
18) How does DMA increase system concurrency? How does it complicate hardware
design?
116
DMA increases system concurrency by allowing the CPU to perform tasks while the
DMA system transfers data via the system and memory buses. Hardware design is
complicated because the DMA controller must be integrated into the system, and the
system must allow the DMA controller to be a bus master. Cycle stealing may also be
necessary to allow the CPU and DMA controller to share use of the memory bus.
19) What problems could occur if a system allowed a file system to be mounted
simultaneously at more than one location?
There would be multiple paths to the same file, which could confuse users or encourage
mistakes (deleting a file with one path deletes the file in all the other paths).
20) Consider a system that supports the strategies of contiguous, linked, and
indexed allocation.What criteria should be used in decidingwhich strategy is best
utilized for a particular file?
• Contiguous—if file is usually accessed sequentially, if file is relatively small.
• Linked—if file is large and usually accessed sequentially.
• Indexed—if file is large and usually accessed randomly.
117
(v)
UNIX FAQs
1) What does fork() do?
The fork() function is used to create a new process from an existing process. The new
process is called the child process, and the existing process is called the parent. You can
tell which is which by checking the return value from fork(). The parent gets the child's
pid returned to him, but the child gets 0 returned to him. Thus this simple code illustrate's
the basics of it.
Inherited by the child from the parent:
* process credentials (real/effective/saved UIDs and GIDs)
* environment
* stack
* memory
* open file descriptors (note that the underlying file positions are shared between the
parent and child, which can be confusing)
* close-on-exec flags
* signal handling settings
* nice value
* scheduler class
* process group ID
* session ID
* current working directory
* root directory
* file mode creation mask (umask)
* resource limits
* controlling terminal
Unique to the child:
* process ID
* different parent process ID
* Own copy of file descriptors and directory streams.
* process, text, data and other memory locks are NOT inherited.
* process times, in the tms struct
* resource utilizations are set to 0
* pending signals initialized to the empty set
* timers created by timer_create not inherited
* asynchronous input or output operations not inherited
2) What's the difference between fork() and vfork()?
118
Some systems have a system call vfork(), which was originally designed as a
loweroverhead
version of fork(). Since fork() involved copying the entire address space of the
process, and was therefore quite expensive, the vfork() function was introduced (in
3.0BSD).
However, since vfork() was introduced, the implementation of fork() has improved
drastically, most notably with the introduction of `copy-on-write', where the copying of
the process address space is transparently faked by allowing both processes to refer to the
same physical memory until either of them modify it. This largely removes the
justification for vfork(); indeed, a large proportion of systems now lack the original
functionality of vfork() completely. For compatibility, though, there may still be a vfork()
call present, that simply calls fork() without attempting to emulate all of the vfork()
semantics.
As a result, it is very unwise to actually make use of any of the differences between fork()
and vfork(). Indeed, it is probably unwise to use vfork() at all, unless you know exactly
why you want to.
The basic difference between the two is that when a new process is created with vfork(),
the parent process is temporarily suspended, and the child process might borrow the
parent's address space. This strange state of affairs continues until the child process either
exits, or calls execve(), at which point the parent process continues.
This means that the child process of a vfork() must be careful to avoid unexpectedly
modifying variables of the parent process. In particular, the child process must not return
from the function containing the vfork() call, and it must not call exit() (if it needs to exit,
it should use _exit(); actually, this is also true for the child of a normal fork()).
3) Why use _exit rather than exit in the child branch of a fork?
There are a few differences between exit() and _exit() that become significant when
fork(), and especially vfork(), is used.
The basic difference between exit() and _exit() is that the former performs clean-up
related to user-mode constructs in the library, and calls user-supplied cleanup functions,
whereas the latter performs only the kernel cleanup for the process.
In the child branch of a fork(), it is normally incorrect to use exit(), because that can lead
to stdio buffers being flushed twice, and temporary files being unexpectedly removed. In
C++ code the situation is worse, because destructors for static objects may be run
incorrectly. (There are some unusual cases, like daemons, where the parent should call
_exit() rather than the child; the basic rule, applicable in the overwhelming majority of
cases, is that exit() should be called only once for each entry into main.)
119
In the child branch of a vfork(), the use of exit() is even more dangerous, since it will
affect the state of the parent process.
4) How can I get/set an environment variable from a program?
Getting the value of an environment variable is done by using getenv().
#include <stdlib.h>
char *getenv(const char *name);
Setting the value of an environment variable is done by using putenv().
#include <stdlib.h>
int putenv(char *string);
The string passed to putenv must not be freed or made invalid, since a pointer to it is kept
by putenv(). This means that it must either be a static buffer or allocated off the heap. The
string can be freed if the environment variable is redefined or deleted via another call to
putenv().
Now suppose you wanted to create a new environment variable called MYVAR, with a
value of MYVAL. This is how you'd do it.
static char envbuf[256];
sprintf(envbuf,"MYVAR=%s","MYVAL");
if(putenv(envbuf))
{
printf("Sorry, putenv() couldn't find the memory for %s\n",envbuf);
/* Might exit() or something here if you can't live without it */
}
5) How can I read the whole environment?
If you don't know the names of the environment variables, then the getenv() function isn't
much use. In this case, you have to dig deeper into how the environment is stored.
A global variable, environ, holds a pointer to an array of pointers to environment strings,
each string in the form "NAME=value". A NULL pointer is used to mark the end of the
array. Here's a trivial program to print the current environment (like printenv):
extern char **environ;
120
6) How can I sleep for less than a second?
The sleep() function, which is available on all Unixes, only allows for a duration
specified in seconds. If you want finer granularity, then you need to look for alternatives:
* Many systems have a function usleep()
* You can use select() or poll(), specifying no file descriptors to test; a common
technique is to write a usleep() function based on either of these (see the
comp.unix.questions FAQ for some examples)
* If your system has itimers (most do), you can roll your own usleep() using them (see
the BSD sources for usleep() for how to do this)
* If you have POSIX realtime, there is a nanosleep() function
Of the above, select() is probably the most portable (and strangely, it is often much more
efficient than usleep() or an itimer-based method). However, the behaviour may be
different if signals are caught while asleep; this may or may not be an issue depending on
the application.
Whichever route you choose, it is important to realise that you may be constrained by the
timer resolution of the system (some systems allow very short time intervals to be
specified, others have a resolution of, say, 10ms and will round all timings to that). Also,
as for sleep(), the delay you specify is only a minimum value; after the specified period
elapses, there will be an indeterminate delay before your process next gets scheduled.
7) How can I get a finer-grained version of alarm()?
Modern Unixes tend to implement alarms using the setitimer() function, which has a
higher resolution and more options than the simple alarm() function. One should
generally assume that alarm() and setitimer(ITIMER_REAL) may be the same
underlying timer, and accessing it both ways may cause confusion.
Itimers can be used to implement either one-shot or repeating signals; also, there are
generally 3 separate timers available:
ITIMER_REAL
counts real (wall clock) time, and sends the SIGALRM signal
ITIMER_VIRTUAL
counts process virtual (user CPU) time, and sends the SIGVTALRM signal
ITIMER_PROF
counts user and system CPU time, and sends the SIGPROF signal; it is intended for
interpreters to use for profiling.
Itimers, however, are not part of many of the standards, despite having been present since
4.2BSD. The POSIX realtime extensions define some similar, but different, functions.
121
8) How can a parent and child process communicate?
A parent and child can communicate through any of the normal inter-process
communication schemes (pipes, sockets, message queues, shared memory), but also have
some special ways to communicate that take advantage of their relationship as a parent
and child.
One of the most obvious is that the parent can get the exit status of the child.
Since the child inherits file descriptors from its parent, the parent can open both ends of a
pipe, fork, then the parent close one end and the child close the other end of the pipe.
This is what happens when you call the popen() routine to run another program from
within yours, i.e. you can write to the file descriptor returned from popen() and the child
process sees it as its stdin, or you can read from the file descriptor and see what the
program wrote to its stdout. (The mode parameter to popen() defines which; if you want
to do both, then you can do the plumbing yourself without too much difficulty.)
Also, the child process inherits memory segments mmapped anonymously (or by
mmapping the special file `/dev/zero') by the parent; these shared memory segments are
not accessible from unrelated processes.
9) What is a zombie?
When a program forks and the child finishes before the parent, the kernel still keeps some
of its information about the child in case the parent might need it -- for example, the
parent may need to check the child's exit status. To be able to get this information, the
parent calls wait(); when this happens, the kernel can discard the information.
In the interval between the child terminating and the parent calling wait(), the child is
said to be a `zombie'. (If you do `ps', the child will have a `Z' in its status field to indicate
this.) Even though it's not running, it's still taking up an entry in the process table. (It
consumes no other resources, but some utilities may show bogus figures for e.g. CPU
usage; this is because some parts of the process table entry have been overlaid by
accounting info to save space.)
This is not good, as the process table has a fixed number of entries and it is possible for
the system to run out of them. Even if the system doesn't run out, there is a limit on the
number of processes each user can run, which is usually smaller than the system's limit.
This is one of the reasons why you should always check if fork() failed, by the way!
If the parent terminates without calling wait(), the child is `adopted' by init, which
handles the work necessary to cleanup after the child. (This is a special system program
with process ID 1 -- it's actually the first program to run after the system boots up).
122
10) How do I prevent them from occuring?
You need to ensure that your parent process calls wait() (or waitpid(), wait3(), etc.) for
every child process that terminates; or, on some systems, you can instruct the system that
you are uninterested in child exit states.
Another approach is to fork() twice, and have the immediate child process exit straight
away. This causes the grandchild process to be orphaned, so the init process is
responsible for cleaning it up. For code to do this, see the function fork2() in the
examples section.
The other technique is to catch the SIGCHLD signal, and have the signal handler call
waitpid() or wait3(). See the examples section for a complete program.
11) How do I get my program to act like a daemon?
A daemon process is usually defined as a background process that does not belong to a
terminal session. Many system services are performed by daemons; network services,
printing etc.
Simply invoking a program in the background isn't really adequate for these long-running
programs; that does not correctly detach the process from the terminal session that started
it. Also, the conventional way of starting daemons is simply to issue the command
manually or from an rc script; the daemon is expected to put itself into the background.
Here are the steps to become a daemon:
1. fork() so the parent can exit, this returns control to the command line or shell
invoking your program. This step is required so that the new process is guaranteed not to
be a process group leader. The next step, setsid(), fails if you're a process group leader.
2. setsid() to become a process group and session group leader. Since a controlling
terminal is associated with a session, and this new session has not yet acquired a
controlling terminal our process now has no controlling terminal, which is a Good Thing
for daemons.
3. fork() again so the parent, (the session group leader), can exit. This means that we, as
a non-session group leader, can never regain a controlling terminal.
4. chdir("/") to ensure that our process doesn't keep any directory in use. Failure to do
this could make it so that an administrator couldn't unmount a filesystem, because it was
our current directory. [Equivalently, we could change to any directory containing files
important to the daemon's operation.]
5. umask(0) so that we have complete control over the permissions of anything we
write. We don't know what umask we may have inherited. [This step is optional]
6. close() fds 0, 1, and 2. This releases the standard in, out, and error we inherited from
our parent process. We have no way of knowing where these fds might have been
redirected to. Note that many daemons use sysconf() to determine the limit
_SC_OPEN_MAX. _SC_OPEN_MAX tells you the maximun open files/process. Then
123
in a loop, the daemon can close all possible file descriptors. You have to decide if you
need to do this or not. If you think that there might be file-descriptors open you should
close them, since there's a limit on number of concurrent file descriptors.
7. Establish new open descriptors for stdin, stdout and stderr. Even if you don't plan to
use them, it is still a good idea to have them open. The precise handling of these is a
matter of taste; if you have a logfile, for example, you might wish to open it as stdout or
stderr, and open `/dev/null' as stdin; alternatively, you could open `/dev/console' as stderr
and/or stdout, and `/dev/null' as stdin, or any other combination that makes sense for your
particular daemon.
Almost none of this is necessary (or advisable) if your daemon is being started by inetd.
In that case, stdin, stdout and stderr are all set up for you to refer to the network
connection, and the fork()s and session manipulation should not be done (to avoid
confusing inetd). Only the chdir() and umask() steps remain as useful.
12) How can I look at process in the system like ps does?
You really don't want to do this.
The most portable way, by far, is to do popen(pscmd, "r") and parse the output. (pscmd
should be something like `"ps -ef"' on SysV systems; on BSD systems there are many
possible display options: choose one.)
In the examples section, there are two complete versions of this; one for SunOS 4, which
requires root permission to run and uses the `kvm_*' routines to read the information
from kernel data structures; and another for SVR4 systems (including SunOS 5), which
uses the `/proc' filesystem.
It's even easier on systems with an SVR4.2-style `/proc'; just read a psinfo_t structure
from the file `/proc/PID/psinfo' for each PID of interest. However, this method, while
probably the cleanest, is also perhaps the least well-supported. (On FreeBSD's `/proc',
you read a semi-undocumented printable string from `/proc/PID/status'; Linux has
something similar.)
13) Given a pid, how can I tell if it's a running program?
Use kill() with 0 for the signal number.
There are four possible results from this call:
* kill() returns 0
o this implies that a process exists with the given PID, and the system would allow
you to send signals to it. It is system-dependent whether the process could be a zombie.
* kill() returns @math{-1}, errno == ESRCH
o either no process exists with the given PID, or security enhancements are causing
the system to deny its existence. (On some systems, the process could be a zombie.)
124
* kill() returns @math{-1}, errno == EPERM
o the system would not allow you to kill the specified process. This means that
either the process exists (again, it could be a zombie) or draconian security enhancements
are present (e.g. your process is not allowed to send signals to anybody).
* kill() returns @math{-1}, with some other value of errno
o you are in trouble!
The most-used technique is to assume that success or failure with EPERM implies that
the process exists, and any other error implies that it doesn't.
An alternative exists, if you are writing specifically for a system (or all those systems)
that provide a `/proc' filesystem: checking for the existence of `/proc/PID' may work.
14) What's the return value of system/pclose/waitpid?
The return value of system(), pclose(), or waitpid() doesn't seem to be the exit value of
my process... or the exit value is shifted left 8 bits... what's the deal?
The man page is right, and so are you! If you read the man page for waitpid() you'll find
that the return code for the process is encoded. The value returned by the process is
normally in the top 16 bits, and the rest is used for other things. You can't rely on this
though, not if you want to be portable, so the suggestion is that you use the macros
provided. These are usually documented under wait() or wstat.
Macros defined for the purpose (in `<sys/wait.h>') include (stat is the value returned by
waitpid()):
WIFEXITED(stat)
Non zero if child exited normally.
WEXITSTATUS(stat)
exit code returned by child
WIFSIGNALED(stat)
Non-zero if child was terminated by a signal
WTERMSIG(stat)
signal number that terminated child
WIFSTOPPED(stat)
non-zero if child is stopped
WSTOPSIG(stat)
number of signal that stopped child
WIFCONTINUED(stat)
non-zero if status was for continued child
WCOREDUMP(stat)
If WIFSIGNALED(stat) is non-zero, this is non-zero if the process left behind a core
dump.
15) How do I find out about a process' memory usage?
125
Look at getrusage(), if available.
16) Why do processes never decrease in size?
When you free memory back to the heap with free(), on almost all systems that doesn't
reduce the memory usage of your program. The memory free()d is still part of the
process' address space, and will be used to satisfy future malloc() requests.
If you really need to free memory back to the system, look at using mmap() to allocate
private anonymous mappings. When these are unmapped, the memory really is released
back to the system. Certain implementations of malloc() (e.g. in the GNU C Library)
automatically use mmap() where available to perform large allocations; these blocks are
then returned to the system on free().
Of course, if your program increases in size when you think it shouldn't, you may have a
`memory leak' -- a bug in your program that results in unused memory not being freed.
17) How do I change the name of my program (as seen by `ps')?
On BSDish systems, the ps program actually looks into the address space of the running
process to find the current argv[], and displays that. That enables a program to change its
`name' simply by modifying argv[].
On SysVish systems, the command name and usually the first 80 bytes of the parameters
are stored in the process' u-area, and so can't be directly modified. There may be a system
call to change this (unlikely), but otherwise the only way is to perform an exec(), or write
into kernel memory (dangerous, and only possible if running as root).
Some systems (notably Solaris) may have two separate versions of ps, one in `/usr/bin/ps'
with SysV behaviour, and one in `/usr/ucb/ps' with BSD behaviour. On these systems, if
you change argv[], then the BSD version of ps will reflect the change, and the SysV
version won't.
Check to see if your system has a function setproctitle().
18) How can I find a process' executable file?
This would be a good candidate for a list of `Frequently Unanswered Questions', because
the fact of asking the question usually means that the design of the program is flawed. :-)
You can make a `best guess' by looking at the value of argv[0]. If this contains a `/', then
it is probably the absolute or relative (to the current directory at program start) path of the
executable. If it does not, then you can mimic the shell's search of the PATH variable,
looking for the program. However, success is not guaranteed, since it is possible to
invoke programs with arbitrary values of argv[0], and in any case the executable may
have been renamed or deleted since it was started.
126
If all you want is to be able to print an appropriate invocation name with error messages,
then the best approach is to have main() save the value of argv[0] in a global variable for
use by the entire program. While there is no guarantee whatsoever that the value in
argv[0] will be meaningful, it is the best option available in most circumstances.
The most common reason people ask this question is in order to locate configuration files
with their program. This is considered to be bad form; directories containing executables
should contain nothing except executables, and administrative requirements often make it
desirable for configuration files to be located on different filesystems to executables.
A less common, but more legitimate, reason to do this is to allow the program to call
exec() on itself; this is a method used (e.g. by some versions of sendmail) to completely
reinitialise the process (e.g. if a daemon receives a SIGHUP).
19) So where do I put my configuration files then?
The correct directory for this usually depends on the particular flavour of Unix you're
using; `/var/opt/PACKAGE', `/usr/local/lib', `/usr/local/etc', or any of several other
possibilities. User-specific configuration files are usually hidden `dotfiles' under $HOME
(e.g. `$HOME/.exrc').
From the point of view of a package that is expected to be usable across a range of
systems, this usually implies that the location of any sitewide configuration files will be a
compiled-in default, possibly using a `--prefix' option on a configure script (Autoconf
scripts do this). You might wish to allow this to be overridden at runtime by an
environment variable. (If you're not using a configure script, then put the default in the
Makefile as a `-D' option on compiles, or put it in a `config.h' header file, or something
similar.)
User-specific configuration should be either a single dotfile under $HOME, or, if you
need multiple files, a dot-subdirectory. (Files or directories whose names start with a dot
are omitted from directory listings by default.) Avoid creating multiple entries under
$HOME, because this can get very cluttered. Again, you can allow the user to override
this location with an environment variable. Programs should always behave sensibly if
they fail to find any per-user configuration.
20) Why doesn't my process get SIGHUP when its parent dies?
Because it's not supposed to.
SIGHUP is a signal that means, by convention, "the terminal line got hung up". It has
nothing to do with parent processes, and is usually generated by the tty driver (and
delivered to the foreground process group).
127
However, as part of the session management system, there are exactly two cases where
SIGHUP is sent on the death of a process:
* When the process that dies is the session leader of a session that is attached to a
terminal device, SIGHUP is sent to all processes in the foreground process group of that
terminal device.
* When the death of a process causes a process group to become orphaned, and one or
more processes in the orphaned group are stopped, then SIGHUP and SIGCONT are sent
to all members of the orphaned group. (An orphaned process group is one where no
process in the group has a parent which is part of the same session, but not the same
process group.)
21) How can I kill all descendents of a process?
There isn't a fully general approach to doing this. While you can determine the
relationships between processes by parsing ps output, this is unreliable in that it
represents only a snapshot of the system.
However, if you're lauching a subprocess that might spawn further subprocesses of its
own, and you want to be able to kill the entire spawned job at one go, the solution is to
put the subprocess into a new process group, and kill that process group if you need to.
The preferred function for creating process groups is setpgid(). Use this if possible rather
than setpgrp() because the latter differs between systems (on some systems `setpgrp();' is
equivalent to `setpgid(0,0);', on others, setpgrp() and setpgid() are identical).
Putting a subprocess into its own process group has a number of effects. In particular,
unless you explicitly place the new process group in the foreground, it will be treated as a
background job with these consequences:
* it will be stopped with SIGTTIN if it attempts to read from the terminal
* if tostop is set in the terminal modes, it will be stopped with SIGTTOU if it attempts
to write to the terminal (attempting to change the terminal modes should also cause this,
independently of the current setting of tostop)
* The subprocess will not receive keyboard signals from the terminal (e.g. SIGINT or
SIGQUIT)
In many applications input and output will be redirected anyway, so the most significant
effect will be the lack of keyboard signals. The parent application should arrange to catch
at least SIGINT and SIGQUIT (and preferably SIGTERM as well) and clean up any
background jobs as necessary.
22) Which address spaces would be accessed by the 2 modes or execution levels ?
User mode -> User stack and Kernel mode -> kernel stack
128
23) The swapper is sometimes called a 'scheduler' why ?
Swapping contributes to scheduling sometimes.
24) what does 'i' stands for in inode ?
Index
25) what does a fork system call do?
It duplicates parent process and gives a new PID to child, returned to calling process.
26) How many u areas can kernel access at any point of time ? why ?
1 uarea,using the only 1 register triplet for the currently running process.
27) Relate critical region and processor execution levels ?
When critical regions/shared regions are executed, processor execution level is RAISED.
28) "No process can put another process to sleep" True/False
True
29) In buffer caching, it's a physical or conceptual cache ? why required ?
Conceptual cache. For Faster access of info. in RAM.
30) why is a block read ahead left Unlocked ?
It is an advance read that may/may not be required.
31) Difference between unnamed regular file and simple pipe ?
For simple pipes: Only direct blocks are used, and data is transient here
32) what is a mount point ?
The point in the tree, where a new path/tree is attached to.
33) what are the 3 logical sections of a process in the UNIX system ?
Data,Text/Code,Stack.
34) what are user register triples and kernel register triples ?
129
It is used to store Page Table Info, virtual address for kernel & user processes.
35) when and how are signals for a process handled ?
When a process gets its slot, the signal bits are checked for setting.
36) "Round robin with multilevel feedback" what does it mean ?
It's a mechanism that takes care of levels (or) process groups and an arithmetic for
priority that maintains an order, between processes from different groups also.
130
Part – IV
OBJECT ORIENTED
PROGRAMMING
131
Topics Covered:-
i) Introduction to OOP, Friend function - Soumya Ann Jose
ii) Inheritance - Ivy R
iii) Polymorphism , STL , Misc -Mukesh Kumar Singh
iv) Operator Overloading -Rathika N
v) Introduction to Java, Comparison -Vijayarani K
132
(i)
INTRODUCTION TO OOP, FRIEND FUNCTION
1. Difference between Object-Based And Object-Oriented.
Object-Based Programming usually refers to objects without inheritance
and hence without polymorphism, as in '83 Ada and Modula-2.These languages support
abstract data types (ADT) and not classes, which provide inheritance and polymorphism.
Object Oriented : a language is object-oriented if and only if it satisfies the following
requirements:
1.It supports objects that are data abstractions with an interface
of named operations and a hidden local state.
2.Objects have an associated type.
3.Types may inherit attributes from supertypes.
object-oriented = data abstractions + object types + type inheritance
or
Object-Oriented = Classes and Objects
+ Inheritance
+ Communication with messages
2. Benefits of object-orientation?
Reuse
Increased quality
Emphasis on modeling the real world
Faster development
Increased Quality
Easier maintenance
Enhanced modifiability
Systems more change resilient, evolvable
Reduced development risks for complex systems
3. Differences between C and C++.
133
1. In C, when a function takes no parameters, its prototype has the word void inside
its function parameter list. Eg char f1(void).In C++, the void is optional ie char
f1()
2. In C an empty parameter list would mean nothing is said about the parameters to
the function. In C++, it means that the function has no parameters.
3. In C prototypes are optional, in C++ they are required.
4. In C++ if a function is declared as returning a value, it must return a value ie if a
function has a return type other than void, any return statement within that
function must contain a value. In C, a non-void function is not required to actually
return a value.
5. In C, if you don’t explicitly specify the return type of a function, an integer type is
assumed. In C++ you must explicitly declare the return type of all functions.
6. In C, local variables can be declared only at the start of a block. In C++, local
variables can be declared anywhere.
4. What is a class?
"A class is a set of objects that share a common structure and a common behavior."
(Booch)
general format:
class class_name{
//private functions and variables
public:
//public functions and variables
} object-list;
5. What Is An Object?
Some Of the common definitions are :
An object has state, behavior, and identity; the structure and behavior of similar objects
are defined in their common class; the terms instance and object are interchangeable.
(Booch)
An object represents an individual, identifiable item,unit, or entity, either real or abstract,
with a well-defined role in the problem domain. ( Smith and Tockey )
An object is an abstraction of a set of real-world things such that:
* all of the real-world things in the set - the instances - have the same characteristics
* all instances are subject to and conform to the same rules. (Shlaer)
6. What is a method?
134
A method is a function or procedure which is defined in a class and typically can access
the internal state of an object of that class to perform some operation. It can be thought
of as a procedure with the first parameter as the object to work on. This object is called
the receiver, which is the object the method operates on. An exception exists with C++'s
static member functions which do not have a receiver, or "this" pointer.
7. Major Object Oriented Features
Abstraction:
Encapsulation : mechanism that binds together code and the data
It manipulates, and keeps both safe from outside interference and misuse.
Polymorphism: Polymorphism(many forms) is the quality that allows one name to be
used for two or more related but technically different purposes. Polymorphism allows one
name to specify a general class of actions. Within a general class of actions, the specific
action to be applied is determined by the type of data.
Inheritance : Inheritance is the process by which one object can acquire the
properties of another. More specifically, an object can inherit a general set of properties
to which it can add those features that are specific only to itself.
8. What is object encapsulation (Or Protection)?
Encapsulation is the process of hiding all of the details of an object that do not
contribute to its essential characteristics.(Booch)
C++ has a very general encapsulation/protection mechanism with public,
private and protected members. Public members (member
data and member functions) may be accessed from anywhere. A Stack's Push and Pop
methods will be public. Private members are only accessible from within a class. A
Stack's representation, such as a list or array, will usually be private. Protected members
are accessible from within a class and also from within subclasses (also called derived
classes). A Stack's representation could be declared protected allowing subclass access.
C++ also allows a class to specify friends (other (sub)classes and functions), that can
access
all members (its representation).
9. Constructor
A constructor is a function that is applied automatically to each class object prior to
the first use of that object within the program.
135
a class’s constructor is called each time an object of that class is created.
Has the same name as the class of which it is a part and has no return type.
For global objects, an object’s constructor is called once, when the program first
begins execution. For local objects, the constructor is called each time the
declaration statement is executed.
A constructor can take parameters.
10. Destructor
complement of a constructor is the destructor.
called when an object is destroyed.
Name of the destructor is the name of its class , preceded by a ~.
Local objects are destroyed when they go out of scope. Global objects are destroyed
when the program ends.
Destructor functions cannot have parameters.
11. Difference between structure and a class
By default, the members of a class are private but the members of a structure
are public.
Syntax of a structure
Struct type_name
{
// public function and data members
private:
// private function and data members
} object-list;
12. Relation between union and class
A union defines a class type that can contain both functions and data as members.
Union similar to structure in that by default all members are public until the private
specifier is used. In union however , all data members share the same memory
location
The unions ability to link code and data allows you to create class types in which all
data uses a shared location.
They cannot inherit any other class and they cannot be used as a base class for any
other type.
They also must not contain any object that has a constructor or destructor.
The union itself can have a constructor and destructor.
Unions cannot have virtual member functions.
13. Friend Functions
136
a friend function is not a member of a class but still has access to its private
members.
Enables a function to have access to the private members of two or more different
classes.
A friend function is not a member and cannot be qualified by an object name. It
must be called just like a normal function.
Although a friend function has knowledge of the private elements of the class for
which it is a friend, it can only access them through an object of the class. A friend
can access the variables only in conjunction with an object that is declared within or
passed to the friend function.
A friend function is not inherited.
A friend function can be friends with more than one class.
An example of a friend function
class myclass{
int n,d;
public:
myclass(int i, int j) { n = i ; d = j; }
// declare a friend of myclass
friend int isfactor(myclass ob);
};
/* friend function definition. It returns true if d is a factor of n. the keyword
friend is not used in the definition of isfactor() */
int isfactor(myclass ob)
{
if(!(ob.n % ob.d)) return 1;
else return 0;
}
int main()
{
myclass ob1(10, 2);
if(isfactor(ob1)) cout<< “ 2 is a factor of 10”;
else cout<<”2 is not a factor of 10”;
return 0;
}
Here , myclass declares its constructor function and the friend isfactor()
inside its class declaration. Because isfactor() is a friend of myclass, isfactor() has access
137
to its private members. That is why, within isfactor(), it is possible to directly refer to
ob.n and ob.d.
A function can be a member of one class and a friend of another.
Example
class B; //forward declaration
class A
{
public:
void function_name(B obj) { cout<< a; }
};
class B
{
public:
friend void A:: function_name(B obj);
};
int A::function_name(B obj)
{
// function definition
}
int main()
{
A obj1;
B obj2;
obj1.function_name(obj2);
return 0;
}
138
(ii)
INHERITENCE
1.What is Inheritance?
The mechanism of deriving a new class from an old one is called Inheritance. The old
class is called as base class and the new class is called as derived class.
2.Why do we go for inheritance?
• Reusabilty
3.Types of Inheritance:
• Single Inheritance
• Multiple Inheritance
• Multilevel Inheritance
• Hybrid Inheritance
B
D
B1 B2
D
S
B
D
S
B1 B2
139
• Hierarchical Inheritance
4.Define Derived Class:
A derived class can be defined by specifying its relationship with the base class in
addition to its own details.
Class derieved_class_name : visibility_mode base_class_name
{
----------
----------
};
visibility_mode:
• specifies Public, Private, Protected….
• if not specified .. taken as private
• is optional
VISIBILITY MODES:
5.What happens when a base class is privately inherited?
‘Public’ members of the base class become ‘Private’ members of derived class. Therefore
public members of base class can be accessed only through member functions of the
derived class. They are not accessible to the objects of the derived class.
6.What happens when a base class is publicly inherited?
‘Public ‘ members of the base class become ‘Public’ members of the derived class and
therefore are accessible to the objects of the derived class.
7. NOTE: In both private and public inheritance the private members of the base
class are not inherited.
8.How to make a Private member inheritable?
• Inherit as Protected
• A member function declared as Protected is accessible by the member
functions within the class and any class immediately derived from it. It
cannot be accessed by functions outside these two classes.
D
B
D1 D2 D3
140
9. NOTE1: When a protected member is inherited in Public mode, it becomes
protected in the derived class too and is therefore accessible only by the member
functions of the derived class.
NOTE2: When a protected member is inherited in Private mode, it becomes Private
int the derived class.
10. Protected Derivation:
Both Public and Protected members of the base class become protected members of the
derived class.
11. Visibility of Inherited members:
Base Class Visibility Derived Class Visibility
Public Derivation Private Derivation Protected Derivation
Private
Not inherited Not inherited Not inherited
Protected
Protected Private Protected
Public
Public Private Protected
12. What are the various functions that can access the private and protected members
of a class?
• Member functions of the class.
• A function that is a friend of the class.
• A member function of a class that is a friend of the class.
• A member function of a derived class.
13. Difference between Friend functions and member functions of derived class in
accessing the private and protected members of a class?
* The Friend functions and the member functions of a friend class can access both
the private and protected data.
* The member functions of a derived class can directly access only the protected
data. However, they can access the private data through the member functions of the base
class.
14. An object of a derived class can be treated as an object of the base class when
manipulated with pointers and references.
141
Base Derived
Employee: Manager:
/*
Manager is an employee. So a Manager* can be used as an Employee*
However all employee cannot be managers, therefore an employee* cannot be used as
manager*
*/
If a class Derived has a public base class Base, then a Derived* can be assigned to a
variable of type Base* without the use of explicit type conversion. The reverse needs
explicit type conversion.
Eg:
Void fun(Manager mm, Employee ee)
{
Employee* pe = &mm; // ok : base class pointer pointing to
derived class object
Manager* pm = &ee; // error : derived class pointer cannot point
to base class object….need explicit type
conversion
………..
………..
}
The explicit type cast can be done by:
Manager* pm=static_cast<Manager*>(pe); // ok… since ‘pe’ points to the
manager mm;
15. Member Functions:
consider the example……..:
class Employee
{
char *first_name, *family_name;
char middle_initial;
….
Protected:
Char *str;
….
first_name
family_name
…
first_name
family_name
…
group
level
….
142
Public:
Int I;
……
Void print() const;
……
Char * fam_name() const
{
return family_name;
}
……
};
class manager : public employee
{
int level;
public:
…….
void print() const;
….
}
A member function of the derived class can use the public and protected members of its
base class as if it where declared in the derived class it self
Void member :: print() const
{
cout<<”level”<<level<<endl; //ok derived class - member
cout<<”name is “<< fam_name()<<endl;//ok base class members accessed
through base class member functions
cout<<”I :”<<I<<endl; //ok base class-public member
cout<<”str is”<<str<<endl; //ok base class-protected member
cout<<”family_name: “<<family_name<<endl;//error base class-private
member
// a derived class cannot use the base class private names directly
}
16. Constructors and Destructors:
Default constructors can be invoked implicitly.
However if all constructors for a base class require arguments, then the constructor for
that base class should be explicitly called.
This is done using constructer header initialization list.
class Employee
{
143
char *first_name,;
int dept;
Public:
Employee(char* fname, int dept);
};
{
int level;
manager(char * fname, int dep, int lev);
}
employee:: employee(char * fname, int dep)
: first_name(fname), dept(dep)
{}
manager :: manager(char * fname,int dep, int lev)
: employee(fname,dep),
level(lev)
{}
17. Order of invoking constructors:
1. base class constructor
2. members
3. derived class constructor
18. They are destroyed in the opposite order.
Virtual Functions:
Class employee
{
….
public:
virtual void print() const; // virtual function
virtual void display()=0; // pure virtual function
};
void employee:: print() const
{
cout<<family_name<<”\t”<<dept<<endl;
}
{
int level;
144
public:
void print () const;
};
void manager::print() const
{
employee::print();
cout<<”\t”<<level;<<endl;
}
19. What does the keyword virtual mean?
The keyword VIRTUAL indicates that the print() can act as an interface to the print()
function defined in the base class and the print() functions derived in classes derived from
it.
NOTE:
* The virtual function must be defined for the class in which it is first declared (unless it
is declared to be pure virtual function).
* A class with one or more pure virtual functions is called as abstract class, and no
objects can be created for the abstract class.
MULTIPLE INHERITANCE:
Class task
{
delay() { … }
virtual void pending()=0;
virtual int fun();
};
class Displayed
{
draw() { … }
virtual void draw()=0;
virtual int fun();
};
class Satellite : public task, public Displayed
{
transmit() { … }
void pending(); //override task::pending
void draw(); //override displayed::draw
void f(satellite * sp)
{
int g=sp->fun(); // error: ambiguous
g=sp->task::fun(); //ok
g=sp->displayed::fun(); //ok}};
145
(iii)
POLYMORPHISM , STANDARD TEMPLATE LIBRARY AND
MISC
Polymorphism
Hiding many implementation behind a common interface.
Two types of polymorphism
1. state poplymorphism (compile time)
2. dynamic polymorphism (run time)
In c++ overloading of function is an example of static polymorphism. e.g.
int A(int);
char A(char *);
Virtual functions are example for dynamic polymorphism.
Virtual Function
• A virtual function is sometimes called method also. ( Only with C++ )
• Virtual function is used with the base and derived class.(it's not necessary but
without it, not useful).
• The declaration of virtual function should be same in base and derived class.
• A virtual function must be defined for the class in which it is first declared
(exception a pure virtual function) .
• To get runtime polymorphism, member function call must be virtual and object
must be manipulated through pointers and references.
• A virtual function can be inline.
• The class in which a virtual function is declared stores one pointer (hidden).
• The traditional and obvious implementation of virtual function is simply an
indirect function call.
class A{
...
...
virtual print(char *) =0 ; // Pure virtual function
};
class A{
...
...
virtual int print(char *);
...
...
};
146
class B:public class A
{
...
...
int print(char *); // in derive class, no need to write virtual keyword
...
...
}
It is not necessary that a virtual function should be defined in derived class.
Class Hierarchy
Inheritance
simple inheritance
multiple inheritance
Inheritance and using declaration.
if A & B are two base classes for C and both contains int f(int) and char f(char)
respectively. Now with the help of using ( Keyword ) we can declare both.
e.g.
class A{
...
...
int f(int);
char f(char);
...
...
};
class B{
public:
double f(double);
...
...
};
class AB:public A, public B{
publi:
using A::f;
using B::f;
char f(char); hides A::f(char);
AB f(AB);
147
};
Virtual Base Class:
AA
BD
C
Now C has two copy of members and attributes of A.
Let in A we have f();
Now,
C.f() // error
C.B::f() // ok
C.A::f()// > error
Solution is Virtual Base Class.
It should be defined like this.
class A{
...
...
};
class B:public virtual A{
...
...
};
class D:public virtual A{
...
...
};
class C:public B, public D{
...
...
};
Standard Template Library
148
The following function and descriptions are common for all containers differences are
specified separately.
• The facilities of the Standard Template Library are defined in the Standard
namespace(std)
• A container is an object which holds other objects. e.g. : list, vector , stack, queue
etc.
• There are 2 ways to define container
o Define a base class container and extend it.( problem is in two containers
here may not be any common interface).
o Define them separately ( It follows in c++)
• Fat interface means a lot of methods in which all are not useful and necessary
present in the interface e.g. : subscripting in list is not necessary.
• In c++ all containers implements all the standard container interface. (but there is
no common iterator base class)
• To be used generally the containers use allocators which is passed as template
argument. It is necessary because a vector can contain numbers or a vector of
objects of a user defined class.
e.g.
Template <class T,class A : Allocator <T>>
Class ::vector{
----
----
}
• Iterator:
Iterator is used to navigate containers
Key member functions of iterator are:
1. iterator begin(); points to first element
2. const-iterator begin() const;
3. iterator end(); points to one past last element
4. const-iterator end() const;
5. reverse-iterator rbegin() points to 1st element of the reverse sequence.
6. reverse-iterator rend(); points to one past last element of reverse sequence
7. const for both also exists.
149
How to use find ?
r1= find (c.rbegin (), c.rend (), v);
It checks the value v from last to first and if found returns a pointer to that objects *r1
gives object. If not found returns c.rend();
Similar for find(c.begin(),c.end(),v);
Vector also provides [ ](script) and at() but at() checks for rangewhere as [ ]
doesn’t (like c);
It is defined as, reference operator [] (size-type n); // operator overloading
Reference at (size-type n);
/* Const versions available. */
Reference front(); // first element
Reference back(); // last element
Only vector and dequeue support subscripting.
• begin() returns pointer to first element.
• constructors are ‘explicit’ in vector. // explicit doesn’t support = in initializing
e. g. : vector <record> vr (100);
If record has default constructor then vr is a vector of 100 records, if not then
error we must explicitly call constructor of record.
Eg: vector<record. vr(1000,record(“n”)); // vr doesn’t have default constructor
Vector can’t be initialized with negative number. It depends on the compiler to
convert negative to positive or to say error.
Vector <int> v1(10); ok
Vector<int> v2=vector<int>(10); ok
150
Vector <int>v3=v2; ok
Vector <int>v4=10; error; explicit not allowed.
How to pass containers into a function …the 3 methods are here 3rd shouldn’t be
used..
Void f (vector<int>v2& ); // pass by reference
Void f (const vector<int>v2& ); // pass by reference
Void f (vector <int>v ); creates and copy another vector( pass by value)
Use of assign.
Assign() fnction is used to assign values. It also changes size.
e. g. : vn.assign(10,num(0)); // now size is 10
Char s[]=”abcd”;
vc.assign(s,&s[sizeof(s)-1]); size=4. // kind of assignment from start address to
end address
List <book> lb;
Vb.assign (lb.begin (), lb.end ()); // kind of assignment
Some others functions.
Void push_back(const t &x); // add to end
Void pop_back(); // remove last element
After each push_back() vector size grows by 1.
s.back() gives most recently pushed element but doesn’t pops.
Size(); // returns the size.
Overflow, underflow is undefined.
As a list , vector also provides facility to add and delete element at any position.
Iterator insert(iterator pos,const T &x);
Void insert(iterator pos,size_type n,const T &x); // add n elements
Iterator erase(iterator first,iterator last); // erase sequence.
It also supports
Bool empty();
Sizetype max_size();
Size_type capacity();
resize(size_type sz,T val=t()); // like realloc
151
//added elements only initialized
void reserve(size_type n) ; // only reserve not initialized.
When vector grows till range of reserve it needn’t be relocated but reserve
has no impact on size.
Capacity gives the current no of reserved memory slot.
To know the limit to which vector can grow without reallocation.
v.capacity () - r.size ();
List doesn’t have capacity and reserve.
HELPER FUNCTIONS
Vectors can be compared using = = and < .
= = menas size and contents equal
< means lexicographically less.
STL also provides <=, !=, >=, > ,=
Vector <bool> is implemented separately. // doesn’t follow the rules discussed
above since it need only 1 bit to represent a element.
TABLE from book(pg no: 462-464 stroustrup)
Representation: it specify how the containers will be implemented. It is left to
implementers but with some constraint.
Generally,
Vector uses arrays
Lists uses link lists
Map uses balance tree
String uses sequence of array of some chars.
Element: the object in container.
Comparison: we can write our own comparison function or by default < is used.
e.g. : template <class A > void sort(A a,A b); // default < used here
Template <class A> void sort(A a,A b,Cmp cmp); // cmp( ) will specify the
comparison rule.
Now how a and b i.e., 2 objects of A will be compared is specified in function cmp.
it must return bool.
152
To sort names we must write cmp ( ) , if we want alphabetic (dictionary) order. i.e., a A
b B. By default (ASCII) AB ab
Relation operation !=,<=,>,>= are not included by default. It is defined in std:
reloperators
namespace.
Sequence: Fundemental sequences are vector,list, dequeue.
A list provides all the member types and operations offered by vector except
subscripting, capacity() and reserve(),at()
LIST’s operations:
Void splice(iterator pos, list &x’) moves all element from x to before pos in this list
without copying.
Void splice(iterator pos, list &x,iterator p); moves *p before pos from list x without
copying.
Similarly
Void splice(iterator pos,list x,iterator first,iterator last)
Void merge(list &) merge sorted list.
Template < class Cmp> void merge (list&,Cmp);
Void sort();
Template<class Cmp> void sort(Cmp);
Above list operations are stable i.e., relative order of equal elements doesn’t change.
Before using merge we must use sort();
Reference front();
Void push_front(const t&) ;
Reference front()const;
Void push_front(const t&) const ;
Void remove(const t& val) removes val from list.
Template<class pred> void remove_if(pred p); removes on condition.
Void unique();
Template <class BinPred> void unique(Binpredb);
Void reverse(); reverses the list.
Eg., let fruit is a list of fruit names.
Fruit.remove(“orange’);
153
Fruit.remove_if(initial(‘l’));
Unique() works only on sorted list because it checks consecutive elements.
Unique(initial (‘P’)); will give pear apple apple if list contain peer pear apple apple.
A dequeue (it rhymes with check) is double ended queue.
operation on ends are efficient like list.
subscripting is efficiend like vector.
ADAPTOR:
Stack, queue are adaptor of containers vector , dequeue and list. Because they can be
defined efficiently in terms of 3 containers.
By default stack uses dequeue.
Eg. Stack,char> s1; dequeue container
Stack <int,vector<int> s2; vector container.
• Stack relies on the iterator of container used to implement stack.
• queue by default uses dequeue as container to implement.
• vector doesn’t provide pop_front() so it cant be used for queue.
Associative array also called map or dictionary keeps pairs of keys, given only we
can get other.
Map is a sequence of (key,value) pair.
Miscellaneous:-
1. What is an object in C++?
An object is a package that contains related data and instructions. The data relates to
what the object represents, while the instructions define how this object relates to
other objects and itself.
2. What is a message?
A message is a signal from one object to another requesting that a computation take
place. It is roughly equivalent to a function call in other languages.
3. What is a class?
154
A class defines the characteristics of a certain type of object. It defines what its
members will remember, the messages to which they will respond, and what form the
response will take.
4. What is an instance?
An individual object that is a member of some class.
5. What is a super-class?
Given a class, a super-class is the basis of the class under consideration. The given
class is defined as a subset (in some respects) of the super-class. Objects of the given
class potentially posses all the characteristics belonging to objects of the super-class.
6. What is inheritance?
Inheritance is property such that a parent (or super) class passes the characteristics of
itself to children (or sub) classes that are derived from it. The sub-class has the option
of modifying these characteristics in order to make a different but fundamentally
related class from the super-class.
7. To what does message protocol refer?
An object's message protocol is the exact form of the set of messages to which the
object can respond.
8. What is polymorphism?
Polymorphism refers to the ability of an object to respond in a logically identical
fashion to messages of the same protocol, containing differing types of objects.
Consider 1 + 5 and 1 + 5.1. In the former, the message "+ 5" is sent to an object of
class integer (1). In the later, the message "+ 5.1" is sent to the same integer object.
The form of the message (its protocol) is identical in both cases. What differs is the
type of object on the right-hand side of these messages. The former is an integer
object (5) while the later is a floating point object (5.1). The receiver (1) appears (to
other objects) to respond in the same way to both messages. Internally, however, it
knows that it must treat the two types of objects differently in order to obtain the
same overall response.
9. What are instance variables?
These represent an object's private memory. They are defined in an object's class.
10. What are class variables?
These represent a class's memory which it shares with each of its instances.
155
11. What is a method?
A method is a class's procedural response to a given message protocol. It is like the
definition of a procedure in other languages.
12. In C++ what is a constructor? A destructor?
A constructors and destructors are methods defined in a class that are invoked
automatically when an object is created or destroyed. They are used to initialize a
newly allocated object and to cleanup behind an object about to be removed.
13. Compare and contrast C and C++.
Comparison: C++ is an extension to the C language. When C++ is used as a
procedural language, there are only minor syntactical differences between them.
Contrast: When used as a procedural language, C++ is a better C because:
o It vigorously enforces data typing conventions.
o It allows variables to be defined where they are used.
o It allows the definition of real (semantically significant) constants.
o It allows for automatic pointer dereferencing.
o It supports call-by-reference in addition to call-by-value in functions.
o It supports tentative variable declarations (when the type and location of a
variable cannot be known before hand.
As an object oriented language, C++ introduces much of the OOP paradigm
while allowing a mixture of OOP and procedural styles.
14. What is operator overloading?
It is the process of, and ability to redefine the way an object responds to a C++
operator symbol. This would be done in the object's class definition.
15. What is cin and cout?
They are objects corresponding to a program's default input and output files.
16. Contrast procedural and object oriented programming.
The procedural paradigm performs computation through a step-by-step manipulation
of data items. Solving problems this way is akin to writing a recipe. ie: All the
ingredients (data items) are defined. Next a series of enumerated steps (statements)
are defined to transform the raw ingredients into a finished meal. The object oriented
model, in contrast, combines related data and procedural information into a single
156
package called an object. Objects are meant to represent logically separate entities
(like real world objects). Objects are grouped together (and defined by) classes. (This
is analogous to user defined data types in procedural languages.) Classes may pass-on
their "makeup" to classes derived from them. In this way, Objects that are of a similar
yet different nature need not be defined from scratch. Computation occurs though the
intercommunication of objects. Programming this way is like writing a play. First the
characters are defined with their attributes and personalities. Next the dialog is
written so that the personalities interact. The sum total constitutes a drama.
17. How do you link a C++ program to C functions?
By using the extern "C" linkage specification around the C function declarations. You
should know about mangled function names and type-safe linkages. Then you should
explain how the extern "C" linkage specification statement turns that feature off
during compilation so that the linker properly links function calls to C functions.
Another acceptable answer is "I don't know. We never had to do that." Merely
describing what a linker does would indicate to me that you do not understand the
issue that underlies the question.
18. Explain the scope resolution operator.
The scope resolution operator permits a program to reference an identifier in the
global scope that has been hidden by another identifier with the same name in the
local scope. The answer can get complicated. It should start with "colon-colon,"
however. (Some readers had not heard the term, "scope resolution operator," but they
knew what :: means. You should know the formal names of such things so that you
can understand all communication about them.) If you claim to be well into the
design or use of classes that employ inheritance, you tend to address overriding
virtual function overrides to explicitly call a function higher in the hierarchy. That's
good knowledge to demonstrate, but address your comments specifically to global
scope resolution. Describe C++'s ability to override the particular C behavior where
identifiers in the global scope are always hidden by similar identifiers in a local
scope.
19. What are the differences between a C++ struct and C++ class?
The default member and base class access specifiers are different. This is one of the
commonly misunderstood aspects of C++. Believe it or not, many programmers think
that a C++ struct is just like a C struct, while a C++ class has inheritance, access
specifiers, member functions, overloaded operators, and so on. Some of them have
even written books about C++. Actually, the C++ struct has all the features of the
class. The only differences are that a struct defaults to public member access and
public base class inheritance, and a class defaults to the private access specifier and
private base class inheritance. Getting this question wrong does not necessarily
157
disqualify you because you will be in plenty of good company. Getting it right is a
definite plus.
20. How many ways are there to initialize an int with a constant?
Two.
There are two formats for initializers in C++ as shown in Example 1. Example 1(a)
uses the traditional C notation, while Example 1(b) uses constructor notation. Many
programmers do not know about the notation in Example 1(b), although they should
certainly know about the first one. Many old-timer C programmers who made the
switch to C++ never use the second idiom, although some wise heads of C++ profess
to prefer it.
A reader wrote to tell me of two other ways, as shown in Examples 2(a) and 2(b),
which made me think that maybe the answer could be extended even further to
include the initialization of an int function parameter with a constant argument from
the caller.
21. How does throwing and catching exceptions differ from using setjmp and
longjmp?
The throw operation calls the destructors for automatic objects instantiated since
entry to the try block. Exceptions are in the mainstream of C++ now, so most
programmers, if they are familiar with setjmp and longjmp, should know the
difference. Both idioms return a program from the nested depths of multiple function
calls to a defined position higher in the program. The program stack is "unwound" so
that the state of the program with respect to function calls and pushed arguments is
restored as if the calls had not been made. C++ exception handling adds to that
behavior the orderly calls to the destructors of automatic objects that were instantiated
as the program proceeded from within the try block toward where the throw
expression is evaluated. It's okay to discuss the notational differences between the
two idioms. Explain the syntax of try blocks, catch exception handlers, and throw
expressions. Then specifically address what happens in a throw that does not happen
in a longjmp. Your answer should reflect an understanding of the behavior described
in the answer just given.
One valid reason for not knowing about exception handling is that your experience is
exclusively with older C++ compilers that do not implement exception handling. I
would prefer that you have at least heard of exception handling, though.
It is not unusual for C and C++ programmers to be unfamiliar with setjmp/ longjmp.
Those constructs are not particularly intuitive. A C programmer who has written
recursive descent parsing algorithms will certainly be familiar with setjmp/ longjmp.
Others might not, and that's acceptable. In that case, you won't be able to discuss how
setjmp/longjmp differs from C++ exception handling, but let the interview turn into a
158
discussion of C++ exception handling in general. That conversation will reveal to the
interviewer a lot about your overall understanding of C++.
22. What is your reaction to this line of code?
delete this; It's not a good practice.
A good programmer will insist that the statement is never to be used if the class is to
be used by other programmers and instantiated as static, extern, or automatic objects.
That much should be obvious.
The code has two built-in pitfalls. First, if it executes in a member function for an
extern, static, or automatic object, the program will probably crash as soon as the
delete statement executes. There is no portable way for an object to tell that it was
instantiated on the heap, so the class cannot assert that its object is properly
instantiated. Second, when an object commits suicide this way, the using program
might not know about its demise. As far as the instantiating program is concerned, the
object remains in scope and continues to exist even though the object did itself in.
Subsequent dereferencing of the pointer can and usually does lead to disaster.
A reader pointed out that a class can ensure that its objects are instantiated on the
heap by making its destructor private. This idiom necessitates a kludgy DeleteMe
kind of function because the instantiator cannot call the delete operator for objects of
the class. The DeleteMe function would then use "delete this."
I got a lot of mail about this issue. Many programmers believe that delete this is a
valid construct. In my experience, classes that use delete this when objects are
instantiated by users usually spawn bugs related to the idiom, most often when a
program dereferences a pointer to an object that has already deleted itself.
23. What is a default constructor?
A constructor that has no arguments or one where all the arguments have default
argument values. If you don't code a default constructor, the compiler provides one if
there are no other constructors. If you are going to instantiate an array of objects of
the class, the class must have a default constructor.
24. What is a conversion constructor?
A constructor that accepts one argument of a different type. The compiler uses this
idiom as one way to infer conversion rules for a class. A constructor with more than
one argument and with default argument values can be interpreted by the compiler as
a conversion constructor when the compiler is looking for an object of the type and
sees an object of the type of the constructor's first argument.
159
25. What is the difference between a copy constructor and an overloaded
assignment operator?
A copy constructor constructs a new object by using the content of the argument
object. An overloaded assignment operator assigns the contents of an existing object
to another existing object of the same class.
First, you must know that a copy constructor is one that has only one argument,
which is a reference to the same type as the constructor. The compiler invokes a copy
constructor wherever it needs to make a copy of the object, for example to pass an
argument by value. If you do not provide a copy constructor, the compiler creates a
member-by-member copy constructor for you.
You can write overloaded assignment operators that take arguments of other classes,
but that behavior is usually implemented with implicit conversion constructors. If you
do not provide an overloaded assignment operator for the class, the compiler creates a
default member-by-member assignment operator.
This discussion is a good place to get into why classes need copy constructors and
overloaded assignment operators. By discussing the requirements with respect to data
member pointers that point to dynamically allocated resources, you demonstrate a
good grasp of the problem.
26. When should you use multiple inheritance?
There are three acceptable answers: "Never," "Rarely," and "When the problem
domain cannot be accurately modeled any other way."
There are some famous C++ pundits and luminaries who disagree with that third
answer, so be careful.
Let's digress to consider this issue lest your interview turn into a religious debate.
Consider an Asset class, Building class, Vehicle class, and CompanyCar class. All
company cars are vehicles. Some company cars are assets because the organizations
own them. Others might be leased. Not all assets are vehicles. Money accounts are
assets. Real-estate holdings are assets. Some real-estate holdings are buildings. Not
all buildings are assets. Ad infinitum. When you diagram these relationships, it
becomes apparent that multiple inheritance is an intuitive way to model this common
problem domain. You should understand, however, that multiple inheritance, like a
chainsaw, is a useful tool that has its perils, needs respect, and is best avoided except
when nothing else will do. Stress this understanding because your interviewer might
share the common bias against multiple inheritance that many object-oriented
designers hold.
27. What is a virtual destructor?
160
The simple answer is that a virtual destructor is one that is declared with the virtual
attribute.
The behavior of a virtual destructor is what is important. If you destroy an object
through a pointer or reference to a base class, and the base-class destructor is not
virtual, the derived-class destructors are not executed, and the destruction might not
be complete.
28. Explain the “IS A” and “HAS A” class relationships. How would you
mplement each in a class design?
A specialized class "is a" specialization of another class and, therefore, has the ISA
relationship with the other class. An Employee ISA Person. This relationship is best
implemented with inheritance. Employee is derived from Person. A class may have
an instance of another class. For example, an Employee "has a" Salary, therefore the
Employee class has the HASA relationship with the Salary class. This relationship is
best implemented by embedding an object of the Salary class in the Employee class.
The answer to this question reveals whether you have an understanding of the
fundamentals of object-oriented design, which is important to reliable class design.
There are other relationships. The USESA relationship is when one class uses the
services of another. The Employee class uses an object (cout) of the ostream class to
display the employee's name onscreen, for example. But if you get ISA and HASA
right, you usually don't need to go any further.
29. When is a template a better solution than a base class?
When you are designing a generic class to contain or otherwise manage objects of
other types, when the format and behavior of those other types are unimportant to
their containment or management, and particularly when those other types are
unknown (thus the genericity) to the designer of the container or manager class. Prior
to templates, you had to use inheritance; your design might include a generic List
container class and an application-specific Employee class. To put employees in a
list, a ListedEmployee class is multiply derived (contrived) from the Employee and
List classes. These solutions were unwieldy and error-prone. Templates solved that
problem.
30. What is the result of compiling this program in a C compiler? in a C++
compiler?
int __ = 0;
main() {
int ___ = 2;
printf("%d\n", ___ + __);
}
161
On GNU compiler both C and C++ give output as 2. I believe other C++ compiler
will complain "syntax error", whereas C compiler will give 2.
31. What is the difference between C and C++ ? Would you prefer to use one
over the other ?
C is based on structured programming whereas C++ supports the object-oriented
programming paradigm. Due to the advantages inherent in object-oriented programs
such as modularity and reuse, C++ is preferred. However almost anything that can be
built using C++ can also be built using C.
32. What are the access privileges in C++ ? What is the default access level ?
The access privileges in C++ are private, public and protected. The default access
level assigned to members of a class is private. Private members of a class are
accessible only within the class and by friends of the class. Protected members are
accessible by the class itself and it's sub-classes. Public members of a class can be
accessed by anyone.
33. What is data encapsulation ?
Data Encapsulation is also known as data hiding. The most important advantage of
encapsulation is that it lets the programmer create an object and then provide an
interface to the object that other objects can use to call the methods provided by the
object. The programmer can change the internal workings of an object but this
transparent to other interfacing programs as long as the interface remains unchanged.
34. What is inheritance ?
Inheritance is the process of deriving classes from other classes. In such a case, the
sub-class has an 'is-a' relationship with the super class. For e.g. vehicle can be a
super-class and car can be a sub-class derived from vehicle. In this case a car is a
vehicle. The super class 'is not a' sub-class as the sub- class is more specialized and
may contain additional members as compared to the super class. The greatest
advantage of inheritance is that it promotes generic design and code reuse.
35. What is multiple inheritance ? What are it's advantages and disadvantages ?
Multiple Inheritance is the process whereby a sub-class can be derived from more
than one super class. The advantage of multiple inheritance is that it allows a class to
inherit the functionality of more than one base class thus allowing for modeling of
complex relationships. The disadvantage of multiple inheritance is that it can lead to a
lot of confusion when two base classes implement a method with the same name.
162
36. What is polymorphism?
Polymorphism refers to the ability to have more than one method with the same
signature in an inheritance hierarchy. The correct method is invoked at run-time
based on the context (object) on which the method is invoked. Polymorphism allows
for a generic use of method names while providing specialized implementations for
them.
37. What do the keyword static and const signify ?
When a class member is declared to be of a static type, it means that the member is
not an instance variable but a class variable. Such a member is accessed using
Classname.Membername (as opposed to Object.Membername). Const is a keyword
used in C++ to specify that an object's value cannot be changed.
38. How is memory allocated/deallocated in C ? How about C++ ?
Memory is allocated in C using malloc() and freed using free(). In C++ the new()
operator is used to allocate memory to an object and the delete() operator is used to
free the memory taken up by an object.
39. What is UML ?
UML refers to Unified Modeling Language. It is a language used to model OO
problem spaces and solutions.
40. What is the difference between a shallow copy and a deep copy?
A shallow copy simply creates a new object and inserts in it references to the
members of the original object. A deep copy constructs a new object and then creates
in it copies of each of the members of the original object.
41. What is a mutable member?
One that can be modified by the class even when the object of the class or the
member function doing the modification is const. Understanding this requirement
implies an understanding of C++ const, which many programmers do not have. I have
seen large class designs that do not employ the const qualifier anywhere. Some of
those designs are my own early C++ efforts. One author suggests that some
programmers find const to be such a bother that it is easier to ignore const than to try
to use it meaningfully. No wonder many programmers don't understand the power
and implications of const. Someone who claims to have enough interest in the
163
language and its evolution to keep pace with the ANSI deliberations should not be
ignorant of const, however.
42. What is an explicit constructor?
A conversion constructor declared with the explicit keyword. The compiler does not
use an explicit constructor to implement an implied conversion of types. Its purpose is
reserved explicitly for construction.
43. Describe run-time type identification.
The ability to determine at run time the type of an object by using the typeid operator
or the dynamic_cast operator.
44. What problem does the namespace feature solve?
Multiple providers of libraries might use common global identifiers causing a name
collision when an application tries to link with two or more such libraries. The namespace
feature surrounds a library's external declarations with a unique namespace that
eliminates the potential for those collisions. This solution assumes that two library
vendors don't use the same namespace, of course.
45. Are there any new intrinsic (built-in) data types?
Yes. The ANSI committee added the bool intrinsic type and its true and false value
keywords and the wchar_t data type to support character sets wider than eight bits.
Other apparent new types (string, complex, and so forth) are implemented as classes
in the Standard C++ Library rather than as intrinsic types.
164
(iv)
OPERATOR OVERLOADING
1. Why Operator Overloading?
It allows us to provide an intuitive interface to users of our class, plus makes it possible
for templates to work equally well with classes and built-in/intrinsic types. Operator
overloading allows C/C++ operators to have user-defined meanings on user-defined types
(classes). Overloaded operators are syntactic sugar for function calls.
2.What are the benefits of operator overloading?
By overloading standard operators on a class, we can exploit the intuition of the users of
that class. This lets users program in the language of the problem domain rather than in
the language of the machine.The ultimate goal is to reduce both the learning curve and
the defect rate.
3. What are some examples of operator overloading?
A few of the many examples of operator overloading:
• myString + yourString might concatenate two std::string objects
• myDate++ might increment a Date object
• a * b might multiply two Number objects
• a[i] might access an element of an Array object
• x = *p might dereference a "smart pointer" that "points" to a disk record — it
could seek to the location on disk where p "points" and return the appropriate
record into x.
4. Operator Overloading makes classes look clumsy.Comment on this.
Operator overloading makes life easier for the users of a class, not for the developer of
the class!
Consider the following example.
class Array {
public:
int& operator[] (unsigned i); // Some people don't like this syntax
...
};
inline
int& Array::operator[] (unsigned i) // Some people don't like this syntax
165
{
...
}
Some people don't like the keyword operator or the somewhat funny syntax that goes
with it in the body of the class itself. But the operator overloading syntax isn't supposed
to make life easier for the developer of a class. It's supposed to make life easier
for the users of the class:
int main()
{
Array a;
[3] = 4; // User code should be obvious and easy to understand...
}
5.What operators can/cannot be overloaded?
Most can be overloaded. The only C operators that can't be are . and ?: (and sizeof,
which is technically an operator). C++ adds a few of its own operators, most of
which can be overloaded except :: and .*.
6. Is it possible to overload operator== so it lets us compare two char[] using a string
comparison?
No: at least one operand of any overloaded operator must be of some user-defined type
(most of the time that means a class).But even if C++ allowed you to do this, which it
doesn't, you wouldn't want to do it anyway since you really should be using a
std::stringlike
class rather than an array of char in the first place since arrays are evil.
7.Can we create a operator** for "to-the-power-of" operations?
No. The names of, precedence of, associativity of, and arity of operators is fixed by the
language. There is no operator** in C++, so we cannot create one for a class type.
Consider that x ** y is the same as x * (*y) (in other words, the compiler assumes y is a
pointer). Besides, operator overloading is just syntactic sugar for function
calls. This particular syntactic sugar doesn't add anything fundamental. It is
suggested to overload pow(base,exponent) (a double precision version is in <cmath>).By
the way, operator^ can work for to-the-power-of, except it has the wrong precedence and
associativity.
8.What are some guidelines / "rules of thumb" for overloading operators?
If the overloaded operator makes the output easier and safer for the users, do it;
otherwise don't. This is the most important guideline. In fact it is, in a very real sense, the
only guideline; the rest are just special cases.
While defining arithmetic operators, maintain the usual arithmetic identities.
166
You should provide mixed-mode arithmetic operators only when they make logical sense
to users. For example, it makes sense to add a duration (say 35 days) to a date (say July
4, 1776), so you might define date + duration to return a Date. Similarly date - duration
could also return a Date. But duration - date does not make sense at the conceptual level
(what does it mean to subtract July 4, 1776 from 35 days?) so you should not define that
operator.
9.What is the rule to be followed when using constructive operators?
If you provide constructive operators, they should return their result by value.If it returns
by reference, that will probably run into lots of problems figuring out who owns the
referent and when the referent will get destructed. Doesn't matter if returning by reference
is more efficient; it is probably wrong. If you provide constructive operators, they should
not change their operands and allow promotion of the left hand operand.
10.While using x==y, what does it mean by ‘the objects need be behaviourally
equivalent’?
If you define x == y, then x == y should be true if and only if the two objects are
behaviorally equivalent. The term "behaviorally equivalent" means the observable
behavior of any operation or sequence of operations applied to x will be the same as
when applied to y. The term "operation" means methods, friends, operators, or just about
anything else you can do with these objects (except, of course, the address-of operator).
You won't always be able to achieve that goal, but you ought to get close, and you ought
to document any variances (other than the address-of operator).
11.Is it recommended to overload short-circuiting operatos?
Avoid overloading short-circuiting operators: x || y or x && y. The overloaded versions
of these do not short-circuit — they evaluate both operands even if the left-hand operand
"determines" the outcome, so that confuses users.
12.Why should we avoid overloading the comma operator?
Avoid overloading the comma operator: x, y. The overloaded comma operator does not
have the same ordering properties that it has when it is not overloaded, and that confuses
users.
13.How do I create a subscript operator for a Matrix class?
Use operator() rather than operator[].
When there are multiple subscripts, the cleanest way to do it is with operator() rather
than with operator[]. The reason is that operator[] always takes exactly one
parameter, but operator() can take any number of parameters (in the case of a rectangular
matrix, two parameters are needed).
14.Why shouldn't any Matrix class's interface look like an array-of-array?
The operator() approach is never worse than, and sometimes better than, the [][]
approach.
• The operator() approach is never worse because it is easy to implement the dense,
row-major physical layout using the operator() approach, so when that
167
configuration happens to be the optimal layout from a performance standpoint, the
operator() approach is just as easy as the [][] approach (perhaps the operator()
approach is a tiny bit easier, but I won't quibble over minor nits).
• The operator() approach is sometimes better because whenever the optimal layout
for a given application happens to be something other than dense, row-major, the
implementation is often significantly easier using the operator() approach
compared to the [][] approach.
15.How can I overload the prefix and postfix forms of operators ++ and --?
Via a dummy parameter.
Since the prefix and postfix ++ operators can have two definitions, the C++ language
gives us two different signatures.Both are called operator++(), but the prefix
version takes no parameters and the postfix version takes a dummy int.(Although
this discussion revolves around the ++ operator, the -- operator is completely
symmetric, and all the rules and guidelines that apply to one also apply to the
other.) However we must *not* make the postfix version return the 'this' object by
reference.
168
(v)
INTRODUCTION TO JAVA, COMPARISON
1. what is self assignment?
Self assignment is when someone assigns an object to itself. For example,
void userCode(Fred& x)
{
x = x; // Self-assignment
}
Obviously no one ever explicitly does a self assignment like the above, but since more
than one pointer or reference can point to the same object (aliasing), it is possible to have
self assignment without knowing it:
2. Is Encapsulation a Security device?
No.
Encapsulation != security.
Encapsulation prevents mistakes, not espionage.
3. What's the difference between the keywords struct and class?
The members and base classes of a struct are public by default, while in class, they
default to private. struct and class are otherwise functionally equivalent.
A struct simply feels like an open pile of bits with very little in the way of encapsulation
or functionality. A class feels like a living and responsible member of society with
intelligent services, a strong encapsulation barrier, and a well defined interface. Since
that's the connotation most people already have, you should probably use the struct
keyword if you have a class that has very few methods and has public data.
4.Is there any difference between List x; and List x();?
A big difference!
Suppose that List is the name of some class. Then function f() declares a local List object
called x:
void f()
{
List x; // Local object named x (of class List)
...
}
But function g() declares a function called x() that returns a List:
void g()
{
List x(); // Function named x (that returns a List) ... }
169
5. Should the constructors use "initialization lists" or "assignment"?
Initialization lists. In fact, constructors should initialize as a rule all member objects in
the initialization list. One exception is discussed further down.
Consider the following constructor that initializes member object x_ using an
initialization list: Fred::Fred() : x_(whatever) { }. The most common benefit of doing this
is improved performance. For example, if the expression whatever is the same type as as
member variable x_, the result of the whatever expression is constructed directly inside
x_ - the compiler does not make a separate copy of the object. Even if the types are not
the same, the compiler is usually able to do a better job with initialization lists than with
assignments.
The other (inefficient) way to build constructors is via assignment, such as: Fred::Fred() {
x_ = whatever; }. In this case the expression whatever causes a separate, temporary
object to be created, and this temporary object is passed into the x_ object's assignment
operator. Then that temporary object is destructed at the ;. That's inefficient.
As if that wasn't bad enough, there's another source of inefficiency when using
assignment in a constructor: the member object will get fully constructed by its default
constructor, and this might, for example, allocate some default amount of memory or
open some default file. All this work could be for naught if the whatever expression
and/or assignment operator causes the object to close that file and/or release that memory
(e.g., if the default constructor didn't allocate a large enough pool of memory or if it
opened the wrong file).
Conclusion: All other things being equal, your code will run faster if you use
initialization lists rather than assignment
6.Should the this pointer can be used in the constructor?
Here is something that always works: the {body} of a constructor (or a function called
from the constructor) can reliably access the data members declared in a base class and/or
the data members declared in the constructor's own class. This is because all those data
members are guaranteed to have been fully constructed by the time the constructor's
{body} starts executing.
Here is something that never works: the {body} of a constructor (or a function called
from the constructor) cannot get down to a derived class by calling a virtual member
function that is overridden in the derived class. If your goal was to get to the overridden
function in the derived class, you won't get what you want. Note that you won't get to the
override in the derived class independent of how you call the virtual member function:
explicitly using the this pointer (e.g., this->method()), implicitly using the this pointer
(e.g., method()), or even calling some other function that calls the virtual member
function on your this object. The bottom line is this: even if the caller is constructing an
object of a derived class, during the constructor of the base class, your object is not yet of
that derived class. You have been warned.
Here is something that sometimes works: if you pass any of the data members in this
object to another data member's initializer, you must make sure that the other data
member has already been initialized. The good news is that you can determine whether
the other data member has (or has not) been initialized using some straightforward
170
language rules that are independent of the particular compiler you're using. The bad news
it that you have to know those language rules (e.g., base class sub-objects are initialized
first (look up the order if you have multiple and/or virtual inheritance!), then data
members defined in the class are initialized in the order in which they appear in the class
declaration). If you don't know these rules, then don't pass any data member from the this
object (regardless of whether or not you explicitly use the this keyword) to any other data
member's initializer! And if you do know the rules, please be careful.
7.Why are classes with static data members getting linker errors?
Because static data members must be explicitly defined in exactly one compilation unit.
If you didn't do this, you'll probably get an "undefined external" linker error. For
example:
// Fred.h
class Fred {
public:
...
private:
static int j_; // Declares static data member Fred::j_
...
};
1 The linker will holler at you ("Fred::j_ is not defined") unless you define (as
opposed to merely declare) Fred::j_ in (exactly) one of your source files:
// Fred.cpp
#include "Fred.h"
int Fred::j_ = some_expression_evaluating_to_an_int;
// Alternatively, if you wish to use the implicit 0 value for static ints:
// int Fred::j_;
The usual place to define static data members of class Fred is file Fred.cpp (or Fred.C or
whatever source file ext
8.What's the order that local objects are destructed?
In reverse order of construction: First constructed, last destructed.
9.Can I overload the destructor for my class?
No.
You can have only one destructor for a class Fred. It's always called Fred::~Fred(). It
never takes any parameters, and it never returns anything.
You can't pass parameters to the destructor anyway, since you never explicitly call a
destructor
171
10.What is a friend?
Something to allow your class to grant access to another class or function.
Friends can be either functions or other classes. A class grants access privileges to its
friends. Normally a developer has political and technical control over both the friend and
member functions of a class (else you may need to get permission from the owner of the
other pieces when you want to update your own class).
11.Do friends violate encapsulation?
No! If they're used properly, they enhance encapsulation.
12.What are some advantages/disadvantages of using friend functions?
They provide a degree of freedom in the interface design options.
Member functions and friend functions are equally privileged (100% vested). The major
difference is that a friend function is called like f(x), while a member function is called
like x.f(). Thus the ability to choose between member functions (x.f()) and friend
functions (f(x)) allows a designer to select the syntax that is deemed most readable,
which lowers maintenance costs.
The major disadvantage of friend functions is that they require an extra line of code when
you want dynamic binding. To get the effect of a virtual friend, the friend function should
call a hidden (usually protected) virtual member function. This is called the Virtual
Friend Function
13.What's the deal with inline functions?
When the compiler inline-expands a function call, the function's code gets inserted into
the caller's code stream (conceptually similar to what happens with a #define macro).
This can, depending on a zillion other things, improve performance, because the
optimizer can procedurally integrate the called code - optimize the called code into the
caller.
There are several ways to designate that a function is inline, some of which involve the
inline keyword, others do not. No matter how you designate a function as inline, it is a
request that the compiler is allowed to ignore: it might inline-expand some, all, or none of
the calls to an inline function.
14.What are the benefits of operator overloading?
By overloading standard operators on a class, you can exploit the intuition of the users of
that class. This lets users program in the language of the problem domain rather than in
the language of the machine.
The ultimate goal is to reduce both the learning curve and the defect rate.
15.What operators can/cannot be overloaded?
172
Most can be overloaded. The only C operators that can't be are . and ?: (and sizeof, which
is technically an operator). C++ adds a few of its own operators, most of which can be
overloaded except :: and .*.
16.Can I overload operator== so it lets me compare two char[] using a string
comparison?
No: at least one operand of any overloaded operator must be of some user-defined type
(most of the time that means a class).
But even if C++ allowed you to do this, which it doesn't, you wouldn't want to do it
anyway since you really should be using a std::string-like class rather than an array of
char in the first place since arrays are evil.
17.Which is more efficient: i++ or ++i?
++i is sometimes faster than, and is never slower than, i++. .
18.Is the type of "pointer-to-member-function" different from "pointer-
tofunction"?
Yes.
Consider the following function:
int f(char a, float b);
The type of this function is different depending on whether it is an ordinary function or a
non-static member function of some class:
• Its type is "int (*)(char,float)" if an ordinary function
• Its type is "int (Fred::*)(char,float)" if a non-static member function of class Fred
19.What is a reference?
An alias (an alternate name) for an object.
References are frequently used for pass-by-reference:
20.What does object.method1().method2() mean?
It chains these method calls, which is why this is called method chaining.
The first thing that gets executed is object.method1(). This returns some object, which
might be a reference to object (i.e., method1() might end with return *this;), or it might
be some other object. Let's call the returned object objectB. Then objectB becomes the
this object of method2().
The most common use of method chaining is in the iostream library. E.g., cout << x << y
works because cout << x is a function that returns cout.
21.When should references, and pointers are used?
173
Use references when you can, and pointers when you have to.
References are usually preferred over pointers whenever you don't need "reseating". This
usually means that references are most useful in a class's public interface. References
typically appear on the skin of an object, and pointers on the inside.
The exception to the above is where a function's parameter or return value needs a
"sentinel" reference. This is usually best done by returning/taking a pointer, and giving
the NULL pointer this special significance .
22.What's the idea behind templates?
A template is a cookie-cutter that specifies how to cut cookies that all look pretty much
the same (although the cookies can be made of various kinds of dough, they'll all have the
same basic shape). In the same way, a class template is a cookie cutter for a description
of how to build a family of classes that all look basically the same, and a function
template describes how to build a family of similar looking functions.
23.What is a "parameterized type"?
Another way to say, "class templates."
A parameterized type is a type that is parameterized over another type or some value.
List<int> is a type (List) parameterized over another type (int).
24.What is a "virtual member function"?
From an OO perspective, it is the single most important feature of C++
A virtual function allows derived classes to replace the implementation provided by the
base class. The compiler makes sure the replacement is always called whenever the
object in question is actually of the derived class, even if the object is accessed by a base
pointer rather than a derived pointer. This allows algorithms in the base class to be
replaced in the derived class, even if users don't know about the derived class.
The derived class can either fully replace ("override") the base class member function, or
the derived class can partially replace ("augment") the base class member function. The
latter is accomplished by having the derived class member function call the base class
member function, if desired.
25.How can C++ achieve dynamic binding yet also static typing?
When you have a pointer to an object, the object may actually be of a class that is derived
from the class of the pointer (e.g., a Vehicle* that is actually pointing to a Car object; this
is called "polymorphism"). Thus there are two types: the (static) type of the pointer
(Vehicle, in this case), and the (dynamic) type of the pointed-to object (Car, in this case).
Static typing means that the legality of a member function invocation is checked at the
earliest possible moment: by the compiler at compile time. The compiler uses the static
type of the pointer to determine whether the member function invocation is legal. If the
type of the pointer can handle the member function, certainly the pointed-to object can
174
handle it as well. E.g., if Vehicle has a certain member function, certainly Car also has
that member function since Car is a kind-of Vehicle.
Dynamic binding means that the address of the code in a member function invocation is
determined at the last possible moment: based on the dynamic type of the object at run
time. It is called "dynamic binding" because the binding to the code that actually gets
called is accomplished dynamically (at run time). Dynamic binding is a result of virtual
functions.
26.what is the difference between overloading and overriding?
overloading occurs when two or more methods use the same name (but different
parameter lists),such that both methods are available side by side in the sam
eclass;overriding occurs when a subclass method and a superclass method use yje same
name (and matching parameter lists),such that the subclass mathod blocks out the
superclass method.
27.what is the syper keyword is used for?
The super keyword gives a class explicit access to the constructors ,methods ,and
variables of its superclass.
28.what is a final class?
A final class is a classs that is declared with final keyword so that it can never be
subclassed.
using final keyword in thwe java language has the general meaning that you define an
entity once and cannot change it or derve from it later.more specifically:
>a final class cannot be subclassed.
>a final method cannot be overridden.
>a final variable cannot change from its initialized value.
29.what is an interface?
In java language,an interface is like a stripped-down class:it specifies a set of methods
that an instance must handle,but it omits inheritance relations and method
implementations.
30.How is an abtract class different from concrete class and interface?
concrete class:
i)specifies the full set of methods for an object.
ii)implements all of its methods.
iii)can have instances.
iv)can have subclasses.
175
Abstract class:
i)specifies the full set of methods for an object
ii)implements none,some ,or all of its methods.
iii)can't have instances.
iv)must have subclasses;useless without them
Interface:
i)specifies a subset of methods for an object.
ii)implements none of its methods.
iii)can't have instances
iv)can't have subclasses;must have classes that implement it;useless without them.
31.what is the instanceof keyword ,and what does it do?
The instanceof keyword is a two -argument operator that tests whether the run time type
of its first argument is assignment compatible with its second argument.
32.what are packages and what are they used for?
A package is a collection of classes and inetrfaces that provides a high-level layer of
access protection and name space management.
33.why f is put after a floating point constant?
The f suffix directs the compiler to create a float value from a sequence of characters
representing a floating point number (a float literal) - otherwise the compiler would by
default create either a double value or an int values.
34.Are java objects are pointers?
no:java objects are objects (typed containers for data plus associated methods ) but the
java language lets you access objects only via references.
35.In a method invocation,does java pass arguments by reference or by value?
java method invocations pass arguments by value.
java methods invocations pass all arguments by value-primitive data types and reference
data types alike.This means that the value of each argument is copied to a corresponding
method-internal variable before the method's code is executed.The method 's code can act
only on this internal copy of the data.
36.what is an exception?
An exception is a condition (typically an error condition ) that transfers program
execution from a thrower ( at the source of the condition) to a catcher ( handler for the
condition ); information about the condition is passed as an exeception or error object.
176
37.what is the difference between a runtime exception and a plain exception -why
don't runtime exceptions have to be declared?
The java laguage specifies that all runtime exceptions are exempted from the standard
method declarations and compiler checks.such exceptions belong more to the system as a
whole than to the method that happens to be executing when the exception is thrown.
38.when,and by whom ,is the main method of a class invoked?
The java virtua machine invokes the main method of java class as its starting point of
execution.
39.what are bytecodes?
Java bytecodes encode are a language of machine instructions understood by the java
virtual machine and usually generated (compiled) from java language source code.
40.what is an applet?
An applet is a java -compatible program that you can embed in a web page. it is usually
composed of several pieces : 1.code :-the java class files that represent the executable
code of the applet. 2.other resources:-data needed by the applet,including images and
sounds.
41.How do applets differ from applications?
Applications are stand-alone,full-featured programs ,whereas applets are
embeddable,almost full-featured programs.
42.what is an abstract class?
An abstract class is a class designed with implementation gaps for subclasses to fill in.an
abstract class is deliberately incomplete. it defines a skeleton that various subclasses can
flesh out with custoomized implementation details.The different concrete subclasses
provide a family of variants.
43.what is an object reference?
An object reference is essentially an object pointer with strong restrictions for integrity
and safety.An object reference provides two kinds of information to be the run-time
system:
1.a pointer to the object's instance information -its data.
2.a pointer to the object's class information -its run-time type and its table of methods.
44. What is the difference between an instance variable and class variable?
177
An instance variable represents a separate data item for each instance of a class, whereas
a class variable represents a single data item shared by the class and all its instances.
45. What are the component and container classes?
Component and container are the two fundamental classes in the abstract window
toolkit(AWT).component instances represent individual user-interface elements and
container instances represent groups of components.
178
Part – V
DATA
STRUCTURES
AND
ALGORITHM
179
Topics Covered:-
Data Structures and Algorithm - Kiran Thomas and Sumitra K S
180
1. Overview of Data Structure
Data Structure encompasses the study in the following areas:
Machines that holds data / executing algorithms
Languages for describing data manipulation / algorithms
Foundation of algorithms
Structures for representing data &
Analysis of algorithms
A data type is a well-defined collection of data with a well-defined set of operations on it.
A data structure is an actual implementation of a particular abstract data type. Abstraction
can be thought of as a mechanism for suppressing irrelevant details while at the same
time emphasizing relevant ones.
2. What is an algorithm?
An algorithm is a finite set of instructions, which accomplish a particular task. Every
algorithm must satisfy the following criteria:
Input – Zero or more quantities which are externally supplied
Output – at least one quantity is produced
Definiteness – Each instruction must be clear and unambiguous
Finiteness – If we trace out the instructions of an algorithm, then for all cases the
algorithm will terminate after a finite number of steps
Effectiveness – Every instruction must be sufficiently basic that a person using
only pencil and paper can in principle carry it out. If is not enough that each
operation be definite but it must also be feasible. The logical or mathematical
model of organization of data is called a data structure. A data structure describes
not just a set of data but also how they are related. Data Structures and algorithms
should be thought of as a unit, neither one makes sense without the other.
3. Types of Data Structures
There are several data structures that are available and the choice of an appropriate one
depends on the requirement. Some of the data structures that are commonly used include:
Arrays
Linked Lists
Stacks
Queues
Trees
Graphs
4. Static vs. Dynamic Data Structures
A static data structure has a fixed size. Arrays are static; once you define the number of
elements it can hold, the number doesn’t change. A dynamic data structure grows and
181
shrinks at execution time as required by its contents. A dynamic data structure is
implemented using linked lists.
5. Introduction to Arrays
The simplest type of data structure is a linear array and most often it is the only data
structure that is provided in any programming language. An array can be defined as a
collection of homogenous elements, in the form of index/value pairs, stored in
consecutive memory locations. An array always has a predefined size and the elements of
an array are referenced by means of an index / subscript.
6. Memory Organization for an array
If A is an array of 5 integer elements declared as:
int A[5] = {1, 2, 3, 4, 5};
Then the memory organization for this array is as shown below:
The elements of the array are always stored in consecutive memory locations. The
elements are accessed by adding the index with the base address and this being a pointer
arithmetic will always fetch the appropriate address depending on the type of data.
7. Multi Dimensional Arrays
An array can be of more than one dimension. There are no restrictions to the number of
dimensions that we can have. But as the dimensions increase the memory requirements
increase drastically which can result in shortage of memory. Hence a multidimensional
array must be used with utmost care.
For example the following declaration
int a[50][50][50] will result in occupying 50 * 50 * 50 = 1,25,000 * 4 = 5,00,000 bytes of
memory.
8. Memory Organization
The memory arrangement for a multidimensional array is very similar to linear arrays.
Only the way in which we access the data varies i.e. if it is a 2 dimensional array for
example, then it can be accessed by using 2 indexes one, which represents the row and
the other representing the column. If A is a 2 dimensional array defined like int arr[3][3];
The maximum elements that this array can hold is 3 * 3 which is 9 elements and the
182
elements of this array can be referred as arr[0][0], arr[0][1], arr[0][2], arr[1][0],
arr[1][1]….arr[2][2].
9. Advantages
Searching is faster as elements are in continuous memory locations
Memory management is taken care of by the compiler itself
Limitations
Number of elements must be known in advance
Inserting and Deletions are costlier as it involves shifting the rest of the elements
10. Introduction to Pointers
A pointer is nothing but a variable, which can hold the address of another
variable.
Memory Representation of a variable
int * p;
p = &a;
This stores the address of the variable ‘a’ in the pointer ‘p’, 1000 in our example.
11. De-referencing a Pointer
A pointer is a datatype whose value is used to refer to ("points to") another value stored
elsewhere in the computer memory. Obtaining the value that a pointer refers to is called
Memory Representation of a Pointer Variable
dereferencing the pointer. In the above example the * operator is used to get the value of
the pointer p.
12. Assigning a value through pointer
183
Whenever a value is assigned to a pointer the * operator has to be used so that it gets
stored in the address pointed by p. Whenever an address is assigned do not use the * as
we need to store an address only inside p and not at where p is pointing to.
*p = 50;
Observe the results of the following printf statements
printf(“P = %u”, p);
printf(“&P = %u”, &p);
printf(“*p = %d”, *p);
printf(“a = %d”, a);
The results of the first two printf statements are exactly similar to the previous ones. But
the third printf statement prints the value 50. Surprisingly the last printf statement also
prints 50 and not 10 which was assigned earlier to a. This is because *p refers to the place
where p is pointing. In other words *p denotes that the value will be stored on the address
pointed by p which is the same as the address of the variable a. Thus with the help of the
pointer it is possible to change the value of the variable/memory location it is pointing to.
If p is a pointer and a is a variable defined as
int *p;
int a;
then the following assignments are invalid :
*p = &a; // as *p can hold only a value and not an address
p = a; // as P can hold only an address and not a value
Note: *(&a) is equal to a. [ *(&a) = a ] this means the content of the address of ‘a’ is
nothing but ’a’ itself.
If p is a pointer then,
Whenever we refer the pointer as p we are referring to the address that p is
pointing to.
Whenever we refer the pointer as &p we are referring to the address of the pointer
variable.
Whenever we refer the pointer as *p we are referring to the value pointed by the
address that the pointer contains.
Note: When more than one value needs to be returned from a method, pointers are the
only way through which it is possible. So from the above example let us look into the
definition of Passing by value and passing by reference
13. Pointers and Arrays
Having seen how to use pointer with variables let’s see how pointers can be used with
arrays. Just like a pointer can point to a variable, it can also point to an array. It really
doesn’t matter as to what it is pointing to as long as a valid address is assigned to a
pointer.
14. Pointer Arithmetic
184
Just like a normal variable, increment / decrement of a pointer variable is a valid
operation. That is if we say P++ or P += 2 what does that it mean? Whenever a pointer
variable is incremented the number of bytes that gets incremented depends on the data
type. If it is an integer pointer and if we say P++, then the pointer moves by 2 bytes in a
turbo C compile and 4 bytes in Microsoft Visual Studio. i.e. it points to the next valid
address. Similarly if it is a character pointer, then incrementing the pointer will increment
it by one byte. This is referred to as pointer arithmetic. Having understood pointer
arithmetic how do we print the subsequent elements from an array? There are various
ways for doing it. One way is to increment the pointer and then print the value again.
This we can repeat till the end of the array is reached i.e.
p++;
printf(“%d”, *p);
This time it prints the second element of the array, as the pointer is incremented. For
printing subsequent elements the above 2 steps can be repeated. The more easier and
elegant way of writing it is as follows:
for(i = 0; i< 5; i++)
{
printf(“Value = %d\n”, *(p+i));
}
The above loop will print the values of all the elements in the array. Here we keep the
base address of the pointer as constant and adding the value of i to p (pointer arithmetic)
and hence we will be referring to the elements of the array properly. It is also possible to
write the above block as follows
for(i =0; i< 5; i++)
{
printf(“Value = %d\n”, p[i]);
}
Yes. We can access a pointer just like an array. Similarly we can access an array like a
pointer i.e. the following reference to the array is perfectly valid.
for(i =0; i< 5; i++)
{
printf(“Value = %d\n”, *(arr + i));
}
In other words, arrays are nothing but constant pointers. The difference between arrays
and pointers are:
An array has a fixed base address whereas a pointer’s can point to different
addresses at different points of time
A pointer can be incremented / decremented where as an array cannot be
Incremented/decremented.
15. Pointers and multi dimensional arrays
As the internal representation of a multi dimensional array is also linear. A pointer can
point to an array of any dimension. Only the way in which we access the pointer varies
185
according to the dimension. For example if arr is a2 dimensional array of 3 rows and 3
cols
int arr[3][3] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
then the following assignment is valid to make
int* p = arr;
The elements can be accessed using the following loop
for(i =0; i< 9; i++)
{
printf(“Value = %d\n”, *(p+i));
}
16. Character Pointers
Just like an integer pointer a character pointer is a pointer, which is capable of holding
the address of a character variable. Suppose if we have a character array declared as
char name[30] = {“Data Structures”};
and if we need to store the address of this array we need a character pointer declared as
follows
char *p = NULL;
Once the pointer is declared we need to assign the address of the array to this pointer. We
know that when an array is referenced by its name we are referring to the address of the
0th element i.e. p = name; will assign the address of the 0th element to p.
Pointers can be used interchangeably with arrays.
17. Memory Allocation
So far we have been discussing about a pointer pointing to another variable or array.
Can’t we directly make use of pointers? i.e.
int *p;
followed by
scanf(“%d”, p);
Note that there is no ‘&’ in the scanf for the pointer as referring a pointer by its name
actually refers to the address it points to. The above statements are treated valid by the
compiler and the code might execute properly as well.
But this is really a dangerous programming. Because p is a pointer and we have not
assigned anything to it, it might be having some value in it. Trying to store a value in that
address can cause serious damage to the system, as we are not aware of what p will be
pointing to. So it is always a better programming practice to initialize any pointer
declaration to NULL i.e. int *p = NULL;
This ensures that the pointer is not pointing to anywhere. Even after this we should not
accept values from the user because there is no valid memory location for storing this
value. In such cases before we start using the pointer we should allocate memory to it.
Malloc() is the function used for memory allocation in C. malloc() takes one argument
that is the number of bytes to allocate. Suppose P is a pointer then
int *p;
p = (int *)malloc(sizeof(int));
186
The above assignment will allocate memory for holding an integer value. Remember
whenever malloc is used for memory allocation the system allocates memory from the
heap and not from the stack. So it is the responsibility of the user to free the memory that
is allocated. For freeing the allocated memory the function used is, free(<pointer
variable>);
18. Allocating memory for arrays
int *p = (int*)malloc(10 * sizeof(int));
The above statement will allocate memory for 10 integer elements.
19. Pointer To Pointers
Most often in real time situations the number of rows as well as columns might vary
dynamically. This being the case the above approach cannot be used. In such situations
we can make use of pointer to pointers. A pointer to pointer is nothing but a pointer,
which is capable of holding the address of another pointer variable in it. As discussed
earlier a pointer is also a variable and hence it is quite possible to hold the address of the
pointer also in another pointer. A pointer to pointer can be declared as,
int **p;
The above declaration means that p is a pointer, which can hold the address of another
integer pointer ( a pointer which can hold the address of an integer ).
As such there is no restriction imposed by the compiler as to how many levels we can go
about in using a pointer. i.e. the following declaration is perfectly valid
int *****p; However beyond 3 levels if we make use of pointers then it will make the
code highly complex and un-maintainable.
20. Pointer and structures
A pointer can be used with a structure just like any other data type. The only difference is
in the way we access the data members of the structure. Suppose if we have a structure as
typedef struct player{
Memory Organization of Array of Pointers
187
char name[30];
int age;
}player;
player p1;
Then we can access the members as,
p1.name and p1.age
Suppose if we change the declaration as
Player* p1;
then for accessing the members of the structure we can follow either one of the following
approaches
p1->name or (*p1).name
p1->age (*p1).age
21. Function Pointers
Just like variables have address functions also have address. This means that we can
make use of a pointer to represent the address of a function as well. If we use a pointer to
represent the address of an integer we declare it as int* p. Similarly if it is a character
pointer we declare it as char* p. But if it is a function pointer how can we declare it.
Suppose we have a function as
void add(int x, int y)
{
printf(“Value = %d”, x + y);
}
if we want to declare a pointer to represent this function we can declare it as
void (*p)(int x, int y);
The above declaration means that p is a function pointer that can point to any function
whose return type is void and takes two integer arguments. The general syntax for a
function pointer declaration will look like,
<return type> (<Pointer Variable>) (<data type> [variable name], <data type> [variable
name]); Variable names in the argument list are optional. It is only the data type that
matters. Having seen how we can declare a function pointer the next step is to assign the
address of a function to this pointer. But how do we get the address of a function? Just
like an array whenever we refer a function by its name we actually refer to the address of
the function. This means that we can assign the function address as, p = add;
Note that we are not specifying the function parenthesis for add. If we put a function
parenthesis it means that we are calling a function. Now how can we use this function
pointer? Wherever we can call add we can use this function pointer interchangeably i.e.
we can say,
(*p)(10, 20);
Which will make a call to add and print the result. Since it is a pointer it can be made to
point to different functions at different places. i.e. if we have another method sub
declared as void sub(int x., int y)
{
188
printf(“Value = %d”, x - y);
}
Then we can say,
p = sub
and call it as
(*p)(20, 10);
This time it will make a call to sub and print the result as 10.
22. Advantages
It gives direct control over memory and thus we can play around with memory by
all
possible means. For example we can refer to any part of the hardware like
keyboard,
video memory, printer, etc directly.
As working with pointers is like working with memory it will provide enhanced
performance
Pass by reference is possible only through the usage of pointers
Useful while returning multiple values from a function
Allocation and freeing of memory can be done wherever required and need not be
done in advance
23. Limitations
If memory allocations are not freed properly it can cause memory leakages
If not used properly can make the program difficult to understand
24. What is Recursion?
Recursion is one of the wonderful features provided by C. A function making a call to
itself is called as recursion. This can be helpful in a lot of situation making the logic
compact and easy to understand. When a function is going to call itself how will this
recursive call get terminated? There must always be one condition based on which the
recursion should terminate, otherwise it will go in an infinite loop resulting in a stack
overflow error.
Recursion is typically meant for programs where
The problem itself is recursively defined, for example factorial
The data structure that the algorithm operates is recursively defined, for example
binary trees
Most importantly whenever we need to describe a backtracking procedure
25. Back Tracking Algorithms
Back tracking algorithms are those places where we need to repeat a process going back
and forth. One famous example for back tracking is the eight-queen problem. The
problem is to place 8 queens on a chessboard without clashing each other. If we do it
189
normally on chessboard how we will go about doing it. First we can place up to 5 queens
easily. When we are about to place the 6th queen we might not be having any place for it
to be placed. In such situations we will take the 5th queen and place it on some other
place and we will try to place the 6th one again. If we still couldn’t locate a place then we
will replace the 4th queen also on a different place and then we will be trying all possible
permutation combinations. Similarly we will be trying going back and forth till we
successfully place all the queens. Such kinds of problems are called as back tracking
problems. It is on these places recursion can play its role wonderfully and make the
algorithm look compact without much of memory or performance overhead.
26. Introduction To Stacks
A stack is an ordered list in which items are inserted and removed at only one end called
the TOP. There are only 2 operations that are possible on a stack. They are the ‘Push’ and
the ‘Pop’ operations. A Push operation inserts a value into the stack and the Pop
operation retrieves the value from the stack and removes it from the stack as well.
An example for a stack is a stack of plates arranged on a table. This means that
the last item to be added is the first item to be removed. Hence a stack is also
called as Last-In-First-Out List or LIFO list. To understand the usage of a stack
let us see what does the system does when a method call is executed:
Whenever a method call is executed the compiler needs to know the return
address for it to resume from the place where it left off in the calling method.
27. Graphical Representation
A graphical representation of a stack is shown below:
Graphical Representation of a Stack
28. Memory Organization of a Stack
190
The memory organization of stack is very similar to that of a linear list.
29. Programmatic representation of a stack
typedef struct node
{
int iData;
struct node* pNext;
}node;
node* pTop = NULL;
The node of a stack is very similar to the node in a list. The difference is only in the way
the data is organized. As discussed there are only 2 operations permitted on a stack: the
Push and the Pop
Memory organization of a Stack
30. Push Operation
A push operation is for inserting an element in to a stack. Elements are always inserted in
the beginning of a stack.
31. Steps for the push operation
Create a new node
Make the new node’s next point to the node pointed by Top
Make the top point to the new node
void push(int iData)
{
node* pNewNode = (node *)malloc(sizeof(node));
pNewNode->iData = iData;
pNewNode->pNext = pTop;
pTop = pNewNode;
}
191
If you had a careful look over this code this is nothing but equal to inserting an element at
the beginning of a list.
32. Pop Operation
A pop operation is basically used for retrieving a value from the stack. This also removes
the element from the Stack.
int pop()
{
int iData;
node* pTempNode = NULL;
if (pTop == NULL)
{
iData = -1;
printf(“Stack Empty”);
}
else
{
pTempNode = pTop;
iData = pTop->iData;
pTop = pTop->pNext;
free(pTempNode);
}
return iData;
}
33. Application of Stacks
Evaluating Expressions
In normal practice any arithmetic expression is written in such a way that the operator is
placed between its operands. For example (A + B) * (C + D) Such kind of expressions is
called as infix notation. Polish notation refers to the notation in which the operator is
placed before the operands. For example the above expression can be written as
*+AB+CD The idea is, whenever we write expressions in this notation, parenthesis are
not required for determining the priority of the expressions. Let us see the steps involved
in the conversion (A+B)*(C+D) = (+AB)*(+CD) = *+AB+CD Reverse polish notation is
exactly the opposite of polish notation i.e. the operator is always placed after the
operands. For example, the above expression can be written in Reverse Polish Notation
as
(A+B) * (C+D) = (AB+) * (CD+) = AB+CD+* Whenever an expression is evaluated by
the system it usually performs it by means of 2 steps. First it converts any expression into
prefix or postfix notation and then it evaluates the expression as it makes the job much
simpler. Converting an infix expression to postfix or prefix expression makes use of
192
stacks extensively. The following is the algorithm for doing this conversion. A complete
program requires a lot more than what is described in the algorithm.
PostFixExpression ConvertToPolishNotation(InfixExpression)
{
1. Push “(“ onto the STACK and add “)” to the end of the InfixExpression
2. Scan the InfixExpression from left to right and repeat steps 3 to 6 for each element of
InfixExpression until the stack is empty
3. If an operand is encountered, add it to Postfix Expression
4. If a left parenthesis is encountered, push it to STACK
5. If an operator XX is encountered
a. Pop from STACK repeatedly and add it to Postfix expression which has the
same/ higher precedence than XX.
b. Add XX to STACK
6. If a right parenthesis is encountered, then
a. Pop from STACK repeatedly and add it to Postfix Expression until a left
parenthesis is encountered.
b. Remove the left parenthesis
7. Exit
}
34. Introduction To Queues
A Queue is an ordered list in which all insertions can take place at one end called the rear
and all deletions take place at the other end called the front. The two operations that are
possible in a queue are Insertion and Deletion. A real time example for a queue is people
standing in a queue for billing on a shop. The first
person in the queue will be the first person to get the service. Similarly the first element
inserted in the queue will be the first one that will be retrieved and hence a queue is also
called as First In First Out or FIFO list.
Graphical Representation of a Queue
193
36. Programmatic representation of a Queue
The node of a queue will have 2 parts. The first part contains the data and the second part
contains the address of the next node. This is pretty much similar to a linear list
representation. But in a queue insertions and deletions are going to occur on two different
ends. If we have only
one pointer called START then for every insertion we need to traverse the complete
queue as insertions are always on the end and hence will be time consuming. To avoid
this we are going to have 2 pointers to represent a queue, one pointer is called the
FRONT which will always be pointing to the first element and is used for deletions, and
the other pointer called REAR which will always be pointing to the last element in the
queue and is used in insertions.
typedef struct node
{
int iData;
struct node* pNext;
}node;
node* pFRONT = NULL, *pREAR = NULL;
When the queue is empty, both FRONT and REAR will be pointing to NULL. When
there is only one element in the queue, then both FRONT and REAR will be pointing to
the same element.
37. Steps for inserting an element into a Queue
Inserting into a queue can happen only at the REAR end.
1. Create a new node
2. Make the next of new node as NULL as it is the last element always
3. If the queue is empty, then make FRONT point to new node
4. Otherwise make the previous node’s next ( the node pointed by REAR is always the
previous
node ) point to this node
5. Make Rear point to new node
void QInsert(int iData)
{
194
node* pTempNode = getNode(iData);
if ( pFRONT == NULL )
{
//If the queue is empty
pFRONT = pTempNode;
}
else
{
pREAR->pNext = pTempNode;
}
pREAR = pTempNode;
}
38. Steps for deleting an element from the queue
The deletion is the only way through which we can retrieve values from a queue. Along
with retrieving the value this will remove the entry from the queue. Deletions always
happen in the FRONT end.
1. Store the node pointed by FRONT in a temporary pointer
2. Make Front as Front’s next
3. Delete the node pointed by temporary pointer
4. If all the nodes are deleted from the queue make the FRONT and REAR as NULL
int QDelete()
{
int iData = -1;
node* pDelNode = NULL;
if (pFRONT == NULL)
{
printf(“Queue Empty”);
}
else
{
iData = pFRONT->iData;
pDelNode = pFRONT;
pFRONT = pFRONT->pNext;
if (pFRONT == NULL)
{
pREAR = NULL;
}
free(pTempNode);
}
return iData;
}
39. Deque
195
A Deque is a queue in which insertions and deletions can happen at both ends of a queue.
A deque, or double-ended queue is a data structure, which unites the properties of a
queue and a stack. Like the stack, items can be pushed into the deque; once inserted into
the deque the last item pushed in may be extracted from one side (popped, as in a stack),
and the first item pushed in may be pulled out of the other side (as in a queue).
40. Implementation of Deque
The push (insert/assign) and pop operation is done at both the end that is start and end of
the deque. The following pictures show how a deque is formed based on this change in
algorithm. Initially the base and end pointer will be pointing to NULL or 0 (zero). We
define two pointers, p_base and p_end to keep track of front and back of the deque.
Initially when the deque object is created, both p_base and p_end would point to NULL.
When the first node is created, p_base assumes the position and p_end starts pointing to
p_base. The next and previous pointer of p_base assumes NULL or 0(zero). Further
insertions happen at p_end.
41. Priority Queues
A priority queue is the one in which each element will have a priority associated with it.
The element with the highest priority is the one that will be processed/deleted first. If two
or more nodes have the same priority then they will be processed in the same order as
they were entered in to the queue.
42. Application of Queues
The most common occurrence of a queue in computer applications is for scheduling of
print jobs. For example if several people are giving print requests, all the request get
queued up in the printer and is processed on first come first serve basis. Priority queues
are used for job scheduling by the operating system. The operating system will assign a
priority for every process running currently in the system. The jobs with the highest
priory are the ones which are taken up by the operating system for processing first and
once all the jobs in that priority are over it will try to find the job with the next priority
and so on.
43. Evolution of Linked Lists
In real time systems most often the number of elements will not be known in advance.
The major drawback of an array is that the number of elements must be known in
advance. Hence an alternative approach was required. This give rise to the concept called
linked lists. A linear list is a linear collection of data elements called nodes, where the
linear order is given by means of pointers. The key here is that every node will have two
parts: first part contains the information/data and the second part contains the link/address
of the next node in the list. Memory is allocated for every node when it is actually
required and will be freed when not needed.
196
44. What is a Structure in C?
Aggregate data types built using elements of other types.
Example:
struct Time {
int hour;
int minute;
int second;
};
45. Self-referential structure
Self-referential structures contain pointers within the structs that refer to another identical
structure. Linked lists are the most basic self-referential structures. Linked lists allow you
to have a chain of structs with related data.
46. Graphical Representation of Linked List
A graphical representation of a node will look like
This means that for traversing through the list we need to know the starting point always.
This is achieved by having a START pointer always pointing to the first element in the
list.
Programmatic representation of a node
typedef struct node
197
{
int iData;
struct node* pNext;
}node;
node* pSTART = NULL;
The node structure contains the data and the pointer for the next node. The address of the
next node must again be of type node and hence we have a pointer to the node within
node itself. Such kind of structures which refers to itself are called as Re-Directions or
Self Referential1 Structures.
47. Inserting a node into a list
Insertion logic varies depending on where in the list the element is going to be inserted,
first, middle or at the end. In all situations it is as simple as making the pointer point to
different nodes and hence insertion at any place will be much faster.
48. Steps for inserting an element in the beginning of a list
2. Make the next part of new node point to the node pointed by START
3. Make the START pointer point to new node
void add_begin(int iData)
{
node* pNewNode = (node*)malloc(sizeof(node));
pNewNode->pNext = pSTART;
pSTART = pNewNode;
}
49. Steps for inserting an element in the middle
2. Make the next part of new node point to the next of previous node
3. Make the previous node point to new node
void add_Middle(int iData, int iLoc)
{
int iPos = 1;
node* pNewNode = NULL;
if (iLoc == 1)
{
add_begin(iData);
}
else
{
198
for (pTempNode = pSTART; pTempNode->pNext != NULL && iPos < iLoc;
pTempNode = pTempNode->pNext, iPos++);
if (iPos == iLoc)
{
pNewNode = (node*)malloc(sizeof(node));
pNewNode->pNext = pTempNode->pNext;
pTempNode->pNext = pNewNode;
}
}
}
50. Steps for inserting an element in the end
2. Make the next of new node point to NULL
3. Make the previous node point to new node
void add_End(int iData)
{
int iPos = 0;
for (pTempNode = pSTART; pTempNode->pNext != NULL;
pTempNode = pTempNode->pNext);
pNewNode = (node*)malloc(sizeof(node));
pNewNode->pNext = NULL;
}
51. Steps for deleting a node in the beginning
The next step is deleting a node from a list. Again deletion logic varies depending on
from where the node is getting deleted, beginning, middle or at the end.
1. Store the node to be deleted in a temporary pointer
2. Make START point to next node in the list
52. Steps for deleting a node from the middle / end
1. Store the node to be deleted in a temporary pointer
2. Make the previous node’s next point to the next of the node that is being deleted
void delete(int iData)
{
199
node* pPrevNode = NULL;
for(pTempNode = pPrevNode = pSTART; pTempNode != NULL;
pPrevNode = pTempNode, pTempNode =
pTempNode->pNext)
{
if (iData == pTempNode->iData)
{
pDelNode = pTempNode;
pPrevNode->pNext = pDelNode->pNext;
if (pDelNode == pSTART)
{
pSTART = pSTART->pNext;
}
free(pDelNode);
}
}
}
53. Applications of Linked Lists
A linked list is used typically in computer applications for memory allocations. The
system keeps tracks of all the memory available in the system with the help of a list. This
area of memory is called as the memory pool. As and when the user requests for memory
the system allocates memory to the user from this available pool. Once allocated these
blocks will be deleted from the available list. Once the user frees the memory again the
memory will be added to the list. In general linked lists are used in almost all places
whenever a collection of elements are required and when the number of elements in the
collection is not known in advance.
Advantages
The number of elements in the linked lists need not be known in advance. As and
when required elements can be added to a list
Insertions and deletions are much faster in a linked list as there is no physical
movement of elements of a list.
Limitations
Searching a list is always sequential and hence it is a time consuming operation
Traversal is always possible in only one direction.
Circular Linked Lists
Circular linked lists are very similar to a linear list except that the last node will be
pointing to the first node again instead of NULL. So care should be taken while doing the
200
traversal to avoid infinite loops. A graphical representation of a circular linked list is as
follows
Assuming that someNode is some node in the list, this code iterates through that list
starting with someNode:
Forwards
node := someNode
do
<do something with node.value>
node := node.next
while node not someNode
54. Introduction To Doubly Linked Lists
With the linked list traversal is possible in only one direction. This limitation is overcome
with the help of doubly linked lists. A doubly linked list is a list in which each node will
have 3 parts: one part containing the information / data, one part containing the
pointer/address of next node and one part containing the pointer / address of previous
node. In addition to the START pointer, which contains the first node there will also be a
TAIL pointer pointing to the last node in the list.
Memory organization of a Doubly Linked List
Programmatic Representation of a node in Doubly Linked Lists
typedef struct node
{
201
int iData;
struct node* pNext;
struct node* pPrev;
}node;
node* pSTART = NULL, *pTAIL = NULL;
The node of a doubly linked list will have 2 pointers apart from the data, one for pointing
to the previous element and the other for pointing to the next element in the list. There
will also be a TAIL pointer pointing to the last node in the list. It is possible for us to
traverse from the beginning of a list till the end or from the end of the list to the
beginning. The end of the list is identified with the help of a NULL value in the next /
previous parts.
55. Inserting an element in a doubly linked list
Inserting an element in a doubly linked list involves just changing the pointers unlike
arrays where every element after the inserted element needs to be shifted once. Hence
inserting an element in a doubly linked list is much faster than inserting a node in an
array.
56. Steps for inserting in the beginning of a doubly linked list
2. Make the next part of new node equal to START
3. Make the previous part of new node equal to NULL
4. Make the previous part of the node pointed by START to new node
5. Make START point to new node
node* getNode(int iData)
{
pNewNode->pNext = NULL;
pNewNode->pPrev = NULL;
}
void addHead(int iData)
{
node* pNewNode = getNode(iData);
pNewNode->pNext = pSTART;
pSTART->pPrev = pNewNode;
pSTART = pNewNode;
}
57. Steps for inserting an element in the middle of a doubly linked list
2. Make the next part of new node equal to next part of the previous node
3. Make the previous part of new node equal to previous part of next node
4. Make the next part of previous node point to new node
202
5. Make the previous part of next node point to new node
void insertAt(int iLoc, int iData)
{
int iPos = 1;
if (iLoc == 1)
{
addHead(iData);
}
else
{
for(pTempNode = pSTART; pTempNode->pNext != NULL && iLoc < iPos;
if (iLoc == iPos)
{
pNewNode = getNode(iData);
pNewNode->pNext = pTempNode->pNext;
pNewNode->pPrev = pTempNode;
if (pTempNode->pNext != NULL)
{
pTempNode->pNext->pPrev = pNewNode;
}
else // If it is the last node
{
pTAIL = pNewNode;
}
}
else
{ printf(“Invalid Position”);
}
}
}
58. Steps for inserting an element at the end of a doubly linked list
2. Make the next part of the new node equal to NULL
3. Make the previous part of the new node equal to TAIL
4. Make the next part of the previous node equal to new node
5. Make TAIL equal to new node
void addTail(int iData)
{
203
pNewNode = getNode(iData);
pNewNode->pPrev = pTAIL;
pTail->pNext = pNewNode;
pTAIL = pNewNode;
}
59. Deleting an element from a doubly linked list
Deleting a node from a list is as simple as changing the links. Hence deleting a node from
a list is much faster when compared to arrays. Like insertion the deletion logic also varies
depending on from where in the list we are going to delete the node.
60. Deletion in the beginning of a doubly linked list
1. Make the temporary pointer point to the node to be deleted
2. Make the START point to the next node of START
3. Make the previous of the next node equal to previous of the node to be deleted
4. Delete the node pointed to by temporary pointer
void removeHead()
{
pDelNode = pSTART;
pSTART = pSTART->pNext;
if (pDelNode == pTAIL) // If it is the last element in the list
{
pTAIL = NULL;
}
else
{
pSTART->pPrev = NULL;
}
free(pTempNode);
}
61. Deletion in the middle of a doubly linked list
2. Make the next part of the previous node equal to next of the node to be deleted
3. Make the previous part of the next node equal to previous part of the node to be
deleted
void deleteAt(int iLoc)
{
204
int iPos = 1;
if (iLoc == 1)
{
removeHead();
}
else
{
for (pTempNode = pSTART; iPos < iLoc && pTempNode->pNext != NULL;
if (iLoc == iPos)
{
pDelNode = pTempNode->pNext;
pTempNode->pNext = pDelNode->pNext;
if (pDelNode == pTAIL)
{
pTAIL = pTAIL->pNext;
}
else
{
pDelNode->pNext->pPrev = pTempNode;
}
free(pDelNode);
}
}
}
62. Deletion at the end of a doubly linked list
2. Make the next part of the previous node equal to next of the node to be deleted
3. Make the TAIL equal to the previous part of the node to be deleted
void removeTail()
{
struct node* pTempNode = NULL;
struct node* pDelNode = NULL;
if (pSTART == NULL)
{
printf(“List is Empty”);
}
else
{
for (pTempNode = pSTART; pTempNode->pNext != NULL;
pTempNode = pTempNode->pNext);
pDelNode = pTempNode->pNext;
pTempNode->pNext = NULL;
205
pTAIL = pTAIL->pNext;
if (pTAIL == NULL)
{
pSTART = NULL;
}
free(pDelNode);
}
}
63. Traversing a doubly linked list
A doubly linked list can be traversed in both directions.
void displayFromStart()
{
for(pTempNode = pSTART; pTempNode != NULL; pTempNode = pTempNode-
>pNext)
{
printf(“Data = %d\n”, pTempNode->iData);
}
}
void displayFromTail()
{
for(pTempNode = pTAIL; pTempNode != NULL; pTempNode = pTempNode-
>pPrev)
{
printf(“Data = %d\n”, pTempNode->iData);
}
}
64. Application of Doubly Linked Lists
A doubly linked list can be used in all the places where we use a linear list if the
traversal happens frequently in both the directions.
Advantages
The number of elements in a doubly linked list need not be known in advance. As
and when required elements can be added to a list
Insertions and deletions are much faster as there is no physical movement of
elements of a list.
Traversal is possible in both the directions unlike a linear list where the traversal
is possible in only one direction
Limitations
206
Takes up additional memory for storing the previous pointer in each node
Makes the insertion and deletion logic a bit complex
65. Introduction To Trees
A tree is a finite set of one or more nodes such that
i) There is a specially designated node called the root
ii) The remaining nodes are partitioned into n >= 0 disjoint sets T1, ..., Tn where each of
these sets is a tree. T1, ..., Tn are called sub trees of the root
A tree is a non-linear data structure. This structure is mainly used to represent data
containing a hierarchical relationship between elements like family tree, organization
chart etc. Some of the commonly used terms and their definitions with respect to a tree
are as follows:
67. Definitions
Degree of a node
The number of sub trees of a node is called its degree. In the diagram shown above the
degree of A is 3, B is 3, C is 0 , D is 2, ...
Terminal or Leaf Nodes
Nodes that have degree 0 are called as Terminal or Leaf nodes. In other words the nodes
that does not have any sub trees are called Terminal or Leaf nodes
Siblings
The children of the same parent are called as siblings
207
Level
The level of the root node is 1. If a node is at level L then its children are at level L + 1.
Height or Depth of a Tree
The height or depth of a tree is the maximum level of any node in the tree.
Forest
A forest is a set of n>= disjoint trees.
68. Introduction to Binary Trees
A binary tree T is defined as a finite set of elements, called nodes, such that:
T is empty or
T contains a distinguished node R, called the root of T, and the remaining nodes
of T forms an ordered pair of disjoint binary trees T1 and T2.
If T does contain a root R, then the two trees T1 and T2 are called the left and
right sub trees of R
Graphical Representation
A binary tree is said to be complete if all its levels except the last level have the
maximum number of possible nodes and if all the nodes at the last level appear as far left
as possible.
69. Binary Search Trees
208
A binary tree T is called a binary search tree if each node of T has the property: The
value at N is greater than every value in the left sub tree of N and is less than every value
in the right sub tree of N.
Programmatic Representation of a Binary Tree
typedef struct node
{
int iData;
struct node* pLeft;
struct node* pRight;
}node;
node* pROOT = NULL;
70. Inserting an element in a binary search tree
Inserting an element in a binary search tree involves creating a new node and
reorganizing
the parent nodes link to accommodate the new node. Typically insertions will
always be on a terminal node. We will never be inserting a node in the middle of a binary
search tree.
71. Steps for inserting an element in a binary search tree
1. If the tree is empty, then make the root node point to this node
2. If the tree contains nodes then compare the data with node’s data and take appropriate
path till the terminal node is reached
3. If data < node’s data then take the left sub tree otherwise take the right sub tree
4. When there are no more nodes add the node to the appropriate place of the parent
node.
node* getNode(int iData)
{
pNewNode->m_pLeft = NULL;
pNewNode->m_pRight = NULL;
return pNewNode;
}
void insert(node* pParent, int iData)
{
if (pParent == NULL)
{
pParent = getNode(iData);
if (pRoot == NULL)
209
{
pRoot = pParent;
}
}
else
{
if (iData < pParant->iData)
{
insert(pParent->pLeft, iData);
}
else
{
insert(pParent->pRight, iData);
}
} } // end of function …
72. Deleting an element from a binary tree
The logic for deleting an element from a binary tree varies depending on from where we
are going to delete the node, whether it is a leaf node, a node with a single child or a node
in the middle.
If the node being deleted is a terminal node then deletion is a straightforward
procedure of just making the parent node point to NULL.
If the node being deleted has only one child then the parent node will have to
point to the child node of the node being deleted.
If the node being deleted has both the children then we first need to find the
inorder successor of the node being deleted and replace the node being deleted
with this node. (The inorder successor of a node can be obtained by taking the
right node of the current node and traversing in the left till we reach the left most
nodes.)
73. Steps for deleting an element from a binary search tree
If the node N has no children, then change the location of this node in the parent
node to NULL.
If the node N has exactly one child, then change the location of this node, in the
parent, to child of N.
If the node N has 2 children, then
a. Store N in a temporary pointer
b. Determine the inorder successor of N ( an inorder successor is obtained by
moving to the right node and finding the left most node of that node )
c. Make the left of this node point to left of N
d. Make the right of this node point to right of N
210
e. Make the parent of N point to this node.
f. Delete the node pointed by temporary pointer
node* delete(node* pParent, int iData)
{
node* pTempNode1, *pTempNode2;
if (pROOT == NULL)
{
printf(“Tree is Empty”);
return NULL;
}
else
{
if (iData == pParent->iData)
{
if (pParent->pLeft == pParent->pRight) // if it is a terminal node
{
free(pParent);
return NULL;
}
else if (pParent->pLeft == NULL) // if there is only right child
{
pTempNode1 = pParent->pRight;
free(pParent);
return pTempNode1;
}
else if (pParent->pRight == NULL) // if there is only left child
{
pTempNode1 = pParent->pLeft;
free(pParent);
return pTempNode1;
}
else //if it has both the child
{
pTempNode2 = pParent->pRight;
// Get the inorder successor of pParent
for (pTempNode1 = pParent->pRight; pTempNode1->pLeft != NULL;
pTempNode1 = pTempNode1-
>pLeft);
pTempNode1->pLeft = pParent->pLeft;
free(pParent);
return pTempNode2;
}
}
if (iData < pParent->iData)
{
211
pParent->pLeft = delete(pParent->pLeft, iData);
}
else
{
pParent->pRight = delete(pParent->pRight, iData);
}
return pParent;
}
} // end of function delete …
74. Binary Tree Traversal
A binary tree can be traversed in 3 ways namely
Preorder Traversal / NLR Traversal
a) Process the root
b) Traverse the left sub tree in preorder
c) Traverse the right sub tree in preorder
Inorder Traversal / LNR Traversal
a) Traverse the left sub tree in inorder
b) Process the root
c) Traverse the right sub tree in inorder
Postorder Traversal / LRN Traversal
a) Traverse the left sub tree in postorder
b) Traverse the right sub tree in postorder
c) Process the root
From the definition itself it is very clear that the implementation of this traversal
algorithm involves recursion.
75. Code Snippet for Inorder Traversal
void InOrder(struct node* pParent)
{
if ( pParent != NULL )
{
InOrder(pParent->pLeft);
printf("Data = %d\n", pParent->iData );
InOrder(parent->pRight);
}
}
212
76. Code Snippet for Preorder Traversal
void PreOrder(struct node* pParent)
{
{
PreOrder(pParent->pLeft);
PreOrder(parent->pRight);
}
}
77. Code Snippet for Postorder Traversal
void PostOrder(struct node* pParent)
{
{
PostOrder(pParent->pLeft);
PostOrder(parent->pRight);
}
}
78. Application of trees
A tree can be used in a wide variety of applications, which includes set representations,
decisionmaking, game trees, etc. Within the context of programming language execution,
compilers utilize tree structures to obtain forms of an arithmetic expression, which can be
evaluated efficiently. The in-order traversal of the binary tree for an arithmetic expression
produces the infix form of the expression, while the pre-order and post-order traversal
lead to the prefix and postfix (reverse Polish) forms of the expression respectively. The
advantage of reverse Polish notation is that arithmetic expressions can be represented in a
way, which can be simply read from left to right without the need for parentheses. For
example, consider the expression a * (b + c). This expression can be represented by the
binary tree in the following way,
*
/\
a+
/\
bc
If we perform a post-order traversal in which we traverse the left branch of the tree, the
right branch, followed by the node, we immediately recover the reverse Polish form of
the expression, namely abc+*. This expression can then be evaluated using a stack.
Starting from the left of the expression, each time an operand (one of the numerical
values a, b or c in this example) is found, it is placed on the top of the stack. When an
213
operator (* or +) is read, the top two elements are removed form the stack and the
appropriate operator applied to them and the result placed on the stack. When the
complete expression has been evaluated, the stack will contain a single item, which is the
result of evaluating the expression.
79. Introduction To Graph Theory
A graph G consists of two things:
a) A set of elements called vertices / nodes / points V.
b) A set of edges E, such that each edge e in E is identified with a unique pair [u,v] of
nodes in V, denoted by e = [u, v]
A graph can also be represented as G =(V, E)
Before getting into the details of a Graph let’s see few of the commonly used terms with
respect to graph and their definitions:
80. Adjacent Nodes
If e = [u, v] then the nodes u and v are called endpoints of e and u and v are said to be
adjacent nodes or neighbors.
81. Definitions
Degree of a node
The degree of a node u is the number of edges containing u. If deg(u) = 0, then the node
is called as an isolated node.
Path
A Path P of length n from node u to v is defined as a sequence of n + 1 nodes. P = (V0,
V1,....Vn)
Closed Path
A path P is said to be closed if V0 = Vn
Simple Path
The path P is said to be simple, if all the nodes are distinct, with the exception that v0
may equal v1, that is, P is simple if the nodes v0, v1, ... vn-1 are distinct and the nodes
v0, v1, ...vn are distinct.
Cycle
A cycle is a closed simple path.
Connected Graph
A graph G is connected if and only if there is a simple path between any two nodes in G.
Complete Graph A graph G is said to be complete if every node u in G is adjacent to
every other node v in G. A graph G is said to be labeled, if its edges are assigned data. A
214
graph is said to be weighted if each edge e in G is assigned a non-negative numerical
value called the weight or length of e.
Multiple edges
Distinct edges e and e’ are called multiple edges if they connect the same end points. i.e.
if e = [u, v] and e’ = [u, v].
Loops
An edge e is called a loop if it has identical endpoints. i.e. e = [u, u]
Multi Graph
A graph, which contains multiple edges or Loops, is called as Multigraph M.
Directed Graphs
A directed graph G is the same as a Multigraph except that each edge e in G is assigned a
direction.
Out Degree
The number of edges beginning at a node u is the outdegree of that node and the number
of edges ending at the node e is the indegree of e. A node is called a Source if it has
positive outdegree and zero indegree. A node is called as a Sink if it has a positive
indegree and zero outdegree.
82. Memory Organization of a Graph
A graph is represented in memory with the help of a linear list of all the nodes in the
graph called as the node list. The adjacent nodes for every node are again maintained by
means of a list, 1which is called as the adjacency list. Consider the following graph:
215
Graphical Representation of Graph
First prepare an adjacency list for every node which is as shown in the table:
Table - Adjacency List for the graph
Another way of representing graphs is by means of adjacency matrix
Structural Representation of a Graph
typedef struct listNode
{
struct node* pData;
struct listNode* pNext;
}listNode;
The above structure is for the adjacency list. The data part represents a pointer to a node
in the node list and the next part contains the link for the subsequent node in the
adjacency list
typedef struct node
{
int iData;
struct node* pNext;
struct listNode* pAdjNode;
}node;
The above structure is for the node list. It contains 3 parts. The first part contains the data,
the second part contains a pointer to the next node in the node list and the third part
contains a pointer to the adjacency list.
83. Inserting into graph
We can insert either a node or an edge in a graph. Inserting a node involves inserting an
element in the node list and the adjacent part has to be set NULL as this will be an
216
isolated node. Inserting an edge involves inserting an element in to the adjacency list of
all the nodes, which are adjacent to that node.
84. Deleting from a graph
Deleting an element from a graph again involves deleting an edge in which case the
adjacency list of all the nodes adjacent to this node will have to be deleted and the node
as such will have to be deleted from the node list. Deleting an edge involves deleting an
element from the adjacency list of the node that is getting deleted.
85. Traversing a graph
There are 2 ways by means of which a graph can be traversed namely:
Breadth First Search
Check the starting node A. The check all the neighbors of A. Then examine the neighbors
of neighbors of A and so on.
Depth First Search
Check the starting node A. Then search each node N along the path P, which begins at
node A. i.e., search neighbor of A, then neighbor of neighbor of A and so on. Once we
reach the dead end then we backtrack on P until we can continue along another path P’
and so on.
86. Application of Graphs
Graphs are used in a wide variety of applications. A graph can be used by an airlines for
maintaining the flight information such as the various flights, their routes and the distance
between the places, etc. With the help of such a graph one will be easily able to find out
whether there is any flight for a particular destination and if there are several routes to
reach that destination which one is having the optimum distance or cost, etc.
COLLECTED BY --B.RAJ--

Network

Hochgeladen von

Dokumentinformationen

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

Network

Hochgeladen von

Copyright:

Verfügbare Formate

The Core FAQ File

Das könnte Ihnen auch gefallen