Chapter 1-3 :

Q. What is Kernel in Linux ?

Ans : The kernel is the core program of the system : it runs programs and manages hardware
devices such as disks and printers. It executes the commands passed to it by the
environment (the shell) and so provides the interface between the shell and the hardware.

Q2. Define the features of Linux ?

Ans : Multi-tasking :

Linux supports true preemptive multi-tasking. All processes run entirely independently of
each other. No process needs to be concerned with making processor time available to
other processes.

Multi-user access :

A multi-user system is a computer that is able to concurrently and independently execute
several applications belonging to two or more users.

Multi-processing :

Linux also runs on multi-processor architectures. This means that the O. S. can
distribute several applications across several processors.

Architecture independence (Portability) :

Linux runs on several hardware platforms, from the Amiga to the PC to DEC Alpha
workstations. Such hardware independence is achieved by no other serious O. S.

Demand load executables :

Only those parts of a program actually required for execution are loaded into memory.
When a new process is created using fork(), memory is not requested immediately, but
instead the memory for the parent process is used jointly by both processes.
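
The sharing described above can be observed from user space. The following is a minimal sketch, assuming a POSIX system; the function name child_write_is_private is hypothetical and chosen only for illustration. After fork() the two processes use the same memory, and a page is copied only when one of them writes to it, so a write in the child never shows up in the parent.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks, lets the child overwrite a variable, and
 * checks what each process sees afterwards.
 * Returns 1 if the child saw its own write and the parent's copy is
 * untouched, 0 otherwise, -1 if fork() or waitpid() failed. */
int child_write_is_private(void)
{
    volatile int value = 1;      /* shared with the child right after fork() */
    pid_t pid = fork();

    if (pid < 0)
        return -1;               /* fork failed */

    if (pid == 0) {
        value = 2;               /* first write: the child gets a private copy */
        _exit(value == 2 ? 0 : 1);
    }

    assert(pid > 0);             /* only the parent reaches this point */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    /* The parent's variable must still hold the original value. */
    return WIFEXITED(status) && WEXITSTATUS(status) == 0 && value == 1;
}
```

Calling child_write_is_private() from a test program should return 1 on a system that behaves as described.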

Paging :

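Linux provides demand paging. Despite the best efforts to use physical memory
efficiently, it can happen that the available memory is fully taken up. In that case,
pages that are not currently needed are written out to secondary storage (swap space)
or, if their contents already exist on disk, simply discarded; they are read back in
when they are accessed again.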

Dynamic cache for hard disk :

Linux dynamically adjusts the size of cache memory in use to suit the current memory
usage situation.

Shared Libraries :

Libraries are collections of routines needed by a program for processing data. There are
a number of standard libraries used by more than one process at the same time.

Memory protected mode :

Linux uses the processor’s memory protection mechanisms to prevent a process from
accessing memory allocated to the system kernel or to other processes.

Support for national keyboards and fonts :

Under Linux, a wide range of national keyboards and character sets can be used : for
example, the Latin1 set defined by the International Organization for Standardization
(ISO) which also includes European special characters.

Different file systems :

Linux supports a variety of file systems. The most commonly used file system at present
is the Second Extended (Ext2) File system. This supports filenames of up to 255
characters and has a number of features making it more secure than conventional Unix
file systems.

Q. Define the file structure of Linux ?

Ans: The file structure of any O. S. describes the arrangement of files & folders. Linux
organizes files into a hierarchically connected set of directories. Each directory may
contain either files or other directories. Because of the similarities to a tree, such a
structure is often referred to as a tree structure, also called a parent-child structure.

The Linux file structure branches into several directories beginning with a root
directory, /. Within the root directory several system directories contain files and
programs that are features of the Linux system. These system directories are as follows :-

/ : The root directory, where the file system structure begins
/home : Contains users’ home directories
/bin : Holds all the standard commands and utility programs
/usr : Holds those files and commands used by the system; this
breaks down into several sub-directories
/usr/bin : Holds user-oriented commands and utility programs
/usr/sbin : Holds system administration commands
/usr/lib : Holds libraries for programming languages
/usr/doc : Holds Linux documentation
/usr/man : Holds the online manual (man) files
/usr/spool : Holds spooled files, such as those generated for printing jobs and
network transfers
/sbin : Holds system administration commands for booting the system
/var : Holds files that vary, such as mailbox files
/dev : Holds file interfaces for devices such as the terminals and printers
/etc : Holds system configuration files and any other system files

The following entries are not system directories of the root file system; they are
subdirectories of the Linux kernel source tree :

fs : Holds the virtual file system interface. The implementations of the
various file systems supported by Linux are held in the respective
subdirectories.
init : Contains all the functions needed to start the kernel
net : Contains the implementations of various network protocols and the
code for sockets to the UNIX and Internet domains
arch : Holds the architecture-dependent code in its subdirectories
mm : Contains the memory management sources for the kernel

Q. Define the Kernel Architecture ?

Ans : Most Unix kernels are monolithic : each kernel layer is integrated into the whole kernel
program and runs in Kernel Mode on behalf of the current process. Microkernel
operating systems demand a very small set of functions from the kernel, generally
including a few synchronization primitives, a simple scheduler, and an interprocess
communication mechanism. Microkernel-oriented operating systems are generally slower
than monolithic ones, since the explicit message passing between the different layers of
the O. S. has a cost. However, they might have some theoretical advantages over
monolithic ones.

Q. Define the process and the task_struct structure ?

Ans : The concept of a process is fundamental to any multiprogramming operating system. A
process is usually defined as an instance of a program in execution; thus, if 16 users are
running vi at once, there are 16 separate processes ( although they can share the same
executable code).

Every process has its own set of information, which is stored in a process descriptor of
type struct task_struct.

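struct task_struct {
    volatile long state;            /* current process state */
    long counter;                   /* ticks left before rescheduling */
    long priority;                  /* priority of the process */
    unsigned long signal;           /* bit mask of signals received */
    unsigned long blocked;          /* bit mask of signals to be handled later */
    unsigned long flags;            /* system status flags */
    int errno;                      /* error code, if one was generated */
    int debugreg[8];                /* x86 debugging registers */
    struct task_struct *next_task;  /* links of the doubly linked task list */
    struct task_struct *prev_task;
    struct mm_struct mm;            /* memory management data */
    int pid, uid, gid;              /* process, user and group IDs */
    struct fs_struct fs;            /* file-system-specific data */
    long utime, stime, cutime, cstime, start_time;  /* timing data */
};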

The state field of the task_struct describes what is currently happening to the process. The
following are the possible process states :

TASK_RUNNING : The process is either executing on the CPU or waiting to be
executed.

TASK_INTERRUPTIBLE : The process is suspended (sleeping) until some condition
becomes true. Raising a hardware interrupt, releasing a system resource the process is
waiting for, or delivering a signal are examples of conditions that might wake up the
process, that is, put its state back to TASK_RUNNING.

TASK_UNINTERRUPTIBLE : Like TASK_INTERRUPTIBLE, except that delivering a signal to
the sleeping process leaves its state unchanged.

TASK_STOPPED : Process execution has been stopped : the process enters this state
after receiving a SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU signal.

TASK_ZOMBIE : Process execution is terminated, but the parent process has not yet
collected its exit status with a wait()-like system call. The kernel cannot discard the data
contained in the dead process’s task_struct because the parent could need it.

The counter variable holds the time in ‘ticks’ for which the process can still run before a
mandatory scheduling action is carried out. The scheduler uses the counter value to
select the next process.

The priority holds the priority of a process.

The signal variable contains a bit mask for signals received for the process.

The blocked variable contains a bit mask for all the signals the process plans to handle later.

flags contains the system status flags.

errno contains the error code, if one was generated.

debugreg[8] holds the debugging registers of the x86 processor for this process.

next_task and prev_task link all processes into a doubly linked list.

mm holds the data needed for the memory management of the process, collected in an
mm_struct.

Every process has its own process ID number, pid, user ID, uid, and group ID, gid.

The file-system-specific data are stored in fs_struct fs.

The utime and stime variables hold the time the process has spent in User Mode and
System Mode, cutime and cstime contain the totals of the corresponding times for all
child processes, and start_time contains the time at which the current process was
created.

Q. What is the process table in the Linux kernel ?

Ans : Every process occupies exactly one entry in the process table. In Linux, this is statically
organized and restricted in size to NR_TASKS. NR_TASKS denotes the maximum
number of processes.

struct task_struct *task[NR_TASKS];

In older versions of the Linux kernel, all the processes present could be traced by
searching the task[ ] process table for entries. In the newer versions this information is
stored in the linked lists next_task and prev_task, which can be found in the task_struct
structure. The external variable init_task points to the start of the doubly linked
circular list.

The entry task[0] has a special significance in Linux. task[0] is the INIT_TASK mentioned
above, which is the first to be generated when the system is booted and has something of a
special role to play.
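
The circular, doubly linked organization of the task list can be sketched in ordinary user-space C. This is only an illustration under stated assumptions : struct task is a cut-down stand-in for task_struct, and task_list_init(), task_list_add(), task_find() and task_count() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for task_struct: only the pid and the two list
 * links are modelled. */
struct task {
    int pid;
    struct task *next_task;
    struct task *prev_task;
};

/* Turn `head` into a one-element circular list (the role of init_task). */
void task_list_init(struct task *head, int pid)
{
    head->pid = pid;
    head->next_task = head;
    head->prev_task = head;
}

/* Link `t` in just before `head`, i.e. at the tail of the circular list. */
void task_list_add(struct task *head, struct task *t)
{
    assert(head != NULL && t != NULL);
    t->next_task = head;
    t->prev_task = head->prev_task;
    head->prev_task->next_task = t;
    head->prev_task = t;
}

/* Walk the list once and return the task with the given pid, or NULL. */
struct task *task_find(struct task *head, int pid)
{
    if (head->pid == pid)
        return head;
    for (struct task *p = head->next_task; p != head; p = p->next_task)
        if (p->pid == pid)
            return p;
    return NULL;
}

/* Count every task on the list, including the head. */
int task_count(struct task *head)
{
    int n = 1;
    for (struct task *p = head->next_task; p != head; p = p->next_task)
        n++;
    return n;
}
```

Because the list is circular, a traversal that starts at the head ends when it reaches the head again; no NULL terminator is needed.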

Q. What is inode? How it is used for storage of regular files?

Ans : All entities in Linux are treated as files. The information related to all these files (not the
contents) is stored in an Inode Table on the disk. For each file, there is an inode entry in
the table. Inodes contain information such as the file’s owner and access rights.

The inode structure :

struct inode {
    dev_t i_dev;
    unsigned long i_ino;
    umode_t i_mode;
    uid_t i_uid;
    gid_t i_gid;
    off_t i_size;
    time_t i_mtime;
    time_t i_atime;
    time_t i_ctime;
};

The components :

i_dev : describes the device on which the file is located
i_ino : identifies the file within the device
i_mode : file type and access rights
i_uid : user ID of the owner
i_gid : group ID of the owner
i_size : size in bytes
i_mtime : time of the last modification
i_atime : time of the last access
i_ctime : time of the last modification to the inode

The (i_dev, i_ino) pair thus identifies the file uniquely across the entire system.

Q. What are interrupts ? Define the slow and fast interrupts ?

Ans: An interrupt is a special signal generated by a hardware device. Interrupts
are used to allow the hardware to communicate with the O. S.

There are two types of interrupt in Linux, slow and fast :

Slow Interrupts :

Slow interrupts are the usual kind. After a slow interrupt has been processed, additional
activities requiring regular attention are carried out by the system - for example, the timer

Fast Interrupts :

Fast interrupts are used for short, less complex tasks. While they are being handled, all
other interrupts are blocked, unless the handling routine involved explicitly enables them.
A typical example is the keyboard interrupt.

Q. What is the Booting process of Linux system?

Ans : There is something magical about booting a Linux system. First of all LILO (the LInux
LOader) finds the Linux kernel and loads it into memory. Execution then begins at the entry
point start: ; as the name suggests, this is assembler code responsible for initializing the
hardware. Once the essential hardware parameters have been established, the processor
is switched into Protected Mode by setting the protected mode bit in the machine status
word. The code then jumps to the start address of the 32 bit code for the actual operating
system kernel and continues from startup_32: . Once initialization is complete, the first
C function, start_kernel(), is called.

This function first saves all the data the assembler code has found about the hardware up
to that point. All areas of the kernel are then initialized. The process now running is
process 0. It now generates a kernel thread which executes the init() function.

The init() function carries out the remaining initialization. It starts the bdflush and
kswap daemons which are responsible for synchronization of the buffer cache contents
with the file system and for swapping.

Then the system call setup is used to initialize file systems and to mount the root file
system. Then an attempt is made to execute one of the programs /etc/init, /bin/init or
/sbin/init. These usually start the background processes running under Linux and make
sure that the getty program runs on each connected terminal - thus a user can log in to
the system.

If none of the above-mentioned programs exists, an attempt is made to process /etc/rc
and subsequently start a shell so that the superuser can repair the system.

Q Define the system calls getpid, nice, pause, fork, execve, exit, wait.

Ans : getpid:

The getpid call is a very simple system call - it merely reads a value from the task
structure and returns it :

asmlinkage int sys_getpid(void)
{
    return current->pid;    /* pid field of the calling process's task structure */
}

nice :

The system call nice is a little more complicated : nice expects as its argument a number
by which the static priority of the current process is to be modified. Only the superuser is
allowed to raise his/her own priority. Note that a large argument for sys_nice() indicates
a lower priority.

pause :

A call to pause interrupts the execution of the program until the process is reactivated by
a signal. This merely amounts to setting the status of the current process to
TASK_INTERRUPTIBLE and then calling the scheduler. This results in another task
becoming active.
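
The behaviour of pause can be tried out from user space with a signal that arrives after a delay. The following is a sketch under stated assumptions : it uses the standard alarm(), sigaction() and pause() calls, while the names on_alarm and pause_until_alarm are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* make sigaction() visible under strict C modes */

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t got_alarm = 0;

/* Signal handler: only records that SIGALRM was delivered. */
static void on_alarm(int signo)
{
    (void)signo;
    got_alarm = 1;
}

/* Sleep in pause() until SIGALRM arrives after `seconds` seconds.
 * Returns 1 if pause() was ended by the alarm, 0 if it returned for
 * another reason, -1 if the handler could not be installed. */
int pause_until_alarm(unsigned seconds)
{
    struct sigaction sa;

    assert(seconds > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    got_alarm = 0;
    alarm(seconds);          /* ask the kernel to send SIGALRM later */
    int rc = pause();        /* sleep until a signal handler has run */

    return rc == -1 && errno == EINTR && got_alarm == 1;
}
```

While the process sleeps in pause() it uses no CPU time; the scheduler runs other tasks until the signal makes it runnable again.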


fork :

The system call fork is the only way of starting a new process. This is done by creating an
identical copy of the process that has called fork. Fork is a very demanding system call :
all the data of the process have to be copied, and the amount can be considerable.
execve :

The system call execve enables a process to change its executing program. Linux
permits a number of formats for executable files. Linux supports the widely used
executable file format COFF(Common Object File Format) and ELF(Executable and
Linkable Format).

exit :

A process is always terminated by calling the kernel function do_exit. This is done either
directly by the system call _exit or indirectly on the occurrence of a signal which cannot
be intercepted. It merely has to release the resources claimed by the process and, if
necessary, inform other processes.

wait :

The system call wait enables a process to wait for the end of a child process and
interrogate the exit code supplied. Depending on the argument given, wait4 will wait for a
specified child process, a child process in a specified process group or any child
process.
Q. What is the output of command ps ?

Ans : The ps command shows which processes are running at any instant. Linux assigns a unique
number to every process running in memory. This number is called the process ID or simply
PID.

PID TTY TIME COMMAND
2269 tty01 0:05 sh
2396 tty01 0:00 ps

PID : Process ID
TTY : Terminal from which the process was launched
TIME : CPU time the process has used so far
COMMAND : Name of the process

Q. What are links ? What is the difference between Hard links & Symbolic links ?

Ans : If you want to reference a file under different filenames, or to access it from
different directories, you create a link to that file with the help of the ln command.

$ ln original-file-name link-name

Hard links & Symbolic links :

A hard link is an additional name (directory entry) for the same file, and it works only
within one file system. A hard link may in some situations fail when you try to link to a
file in some other user’s directory : a file in one file system can’t be linked by a hard
link to a file in another file system, so if the other user’s directory is located on
another file system, your hard link will fail. To overcome this restriction, you use
symbolic links (ln -s). A symbolic link holds the pathname of the file to which it is linking.
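
The difference can be made visible with the link(), symlink(), stat() and lstat() calls. This is a minimal sketch, assuming a POSIX system and file names on one file system; make_links, link_count and is_symlink are hypothetical helper names.

```c
#define _XOPEN_SOURCE 700   /* make lstat() and symlink() visible under strict C modes */

#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create an empty file `orig`, then a hard link and a symbolic link to
 * it. Returns 0 on success, -1 on the first failure. */
int make_links(const char *orig, const char *hard, const char *sym)
{
    assert(orig != NULL && hard != NULL && sym != NULL);

    int fd = open(orig, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    if (link(orig, hard) != 0)      /* second name for the same inode */
        return -1;
    if (symlink(orig, sym) != 0)    /* new file that stores a pathname */
        return -1;
    return 0;
}

/* Number of names (hard links) the file behind `path` has, -1 on error. */
long link_count(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_nlink;
}

/* 1 if `path` itself is a symbolic link, 0 if not, -1 on error. */
int is_symlink(const char *path)
{
    struct stat st;
    if (lstat(path, &st) != 0)
        return -1;
    return S_ISLNK(st.st_mode) ? 1 : 0;
}
```

After the original name is removed, the hard link still reaches the data, whereas the symbolic link dangles because the pathname it stores no longer exists.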

Chapter 4 : Memory Management

Q Define the architecture - independent memory model in Linux ?

Ans : Memory management is primarily concerned with the allocation of main memory to requesting
processes. Two important features of the memory management function are protection and
sharing. Some of the main issues related to memory management in the Linux kernel are :

Pages of Memory :

The physical memory is divided into pages. The size of a memory page is defined by the
PAGE_SIZE macro. For the x86 processor, the size is set to 4 KB, while the Alpha
processor uses 8 KB.
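
In a user program the page size is not hard-coded but queried at run time. The following sketch assumes a POSIX system; system_page_size and pages_needed are hypothetical helper names.

```c
#include <assert.h>
#include <unistd.h>

/* Page size of the running system in bytes, as reported by sysconf(). */
long system_page_size(void)
{
    long size = sysconf(_SC_PAGESIZE);
    assert(size > 0);
    return size;
}

/* Number of pages needed to hold `bytes` bytes, rounded up. */
long pages_needed(long bytes, long page_size)
{
    assert(bytes >= 0 && page_size > 0);
    return (bytes + page_size - 1) / page_size;
}
```

With 4 KB pages, an object of 4097 bytes needs two pages; with 8 KB pages, an object of 10000 bytes also needs two.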

Virtual address space :

A process is run in a virtual address space. In the abstract memory model, the virtual
address space is structured as a kernel segment plus a user segment. Code and data
for the kernel can be accessed in the kernel segment, and code and data for the process
in the user segment. A virtual address is given by reference to a segment selector and
the offset within the segment. When code is being processed, the segment selector is
already set and only offsets are used. In the kernel, however, access is needed not only
to data in the kernel segment but also to data in the user segment, for the passing of
parameters. For this purpose, the put_user() and get_user() functions are defined.

Programmers casually refer to a memory address as the way to access the contents of a
memory cell. In x86 microprocessors, we have three kinds of addresses.

(i) Logical Address :

Included in the machine language instructions to specify the address of an
operand or of an instruction. Each logical address consists of a segment and
an offset that denotes the distance from the start of the segment to the actual
address.

(ii) Linear Address :

A single 32 bit unsigned integer that can be used to address up to 4 GB, that is
up to 2^32 memory cells. Linear addresses are usually represented in hexadecimal
notation; their values range from 0x00000000 to 0xffffffff.

(iii) Physical Address :

Physical addresses are used to address memory cells included in memory chips.
They correspond to the electrical signals sent along the address pins of the
microprocessor to the memory bus. Physical addresses are represented as
32 bit unsigned integers.

Converting the Linear address:

Linux adopted a three-level paging model so that paging is feasible on 64 bit architectures.
The x86 processor supports only a two-level conversion of the linear address, whereas the
Alpha processor, which uses linear addresses with a width of 64 bits, supports a three-level
conversion. On x86 the page middle directory is therefore effectively folded away, so that
the three-level model maps onto two hardware levels.
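
As an illustration of the two-level conversion on x86, the sketch below splits a 32 bit linear address into its three fields. It assumes the classic x86 layout with 4 KB pages (10 bits of page directory index, 10 bits of page table index, 12 bits of offset); the macro names are local to this example, and struct linear_parts and split_linear are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Classic 32 bit x86 paging with 4 KB pages: 10 + 10 + 12 bits. */
#define EX_PAGE_SHIFT   12
#define EX_PGDIR_SHIFT  22
#define EX_ENTRIES      1024u

struct linear_parts {
    uint32_t dir_index;    /* index into the page directory */
    uint32_t table_index;  /* index into the selected page table */
    uint32_t offset;       /* byte offset inside the 4 KB page */
};

/* Split a linear address into directory index, table index and offset. */
struct linear_parts split_linear(uint32_t addr)
{
    struct linear_parts p;

    p.dir_index   = addr >> EX_PGDIR_SHIFT;
    p.table_index = (addr >> EX_PAGE_SHIFT) & (EX_ENTRIES - 1);
    p.offset      = addr & ((1u << EX_PAGE_SHIFT) - 1);

    assert(p.dir_index < EX_ENTRIES);  /* sanity check: 10 bit index */
    return p;
}
```

For the address 0xC0101234 this gives directory index 768, table index 257 and offset 0x234.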

The three-level paging model defines three types of paging table :

Page (Global) Directory
Page Middle Directory
Page Table

Page Global Directory :

Page Global Directory includes the addresses of several page middle directory. It is of
12 bit length. Different functions available for modification of Page Global directory are :

(i) pgd_alloc() : Allocates a Page Directory and fills it with 0.
(ii) pgd_bad() : Can be used to test whether the entry in the Page Directory is
valid.
(iii) pgd_clear() : Deletes the entry in the Page Directory.
(iv) pgd_free() : Releases the page of memory allocated to the Page Directory.
(v) pgd_none() : Tests whether the entry has been initialized.

Page Middle Directory :

It includes the address of several Page Tables. It is of 13 bit length. Functions used for
handling Page Middle directory are :

(i) pmd_alloc() : Allocates a Page Middle Directory to manage memory in the
user area.
(ii) pmd_bad() : Tests whether the entry in the Page Middle Directory is valid.
(iii) pmd_clear() : Deletes the entry in the Page Middle Directory.
(iv) pmd_free() : Releases a Page Middle Directory for memory in the user
segment.
(v) pmd_offset() : Returns the address of an entry in the Page Middle Directory to
which the address in the argument is allocated.
(vi) pmd_none() : Tests whether the entry in the Page Middle Directory has been
initialized.

Page Table :

Each Page Table entry points to a page frame. It is of 25 bits length. The ‘dirty’ attribute
is set when the contents of the memory page have been modified. A page table entry
contains a number of flags which describe the legal access modes to the memory page
and its state :

PAGE_NONE : No physical memory page is referenced by the page table entry.
PAGE_SHARED : All types of access are permitted.
PAGE_COPY : This macro is historical & identical to PAGE_READONLY.
PAGE_READONLY : Only read and execute access is allowed to this page of memory.
PAGE_KERNEL : Access to this page of memory is only allowed in the kernel segment.

The following functions have been defined to manipulate the page table entries
and their attributes :

(i) mk_pte() : Returns a page table entry generated from the memory address
of a page and a variable of the pgprot_t type.
(ii) pte_alloc() : Allocates a new page table.
(iii) pte_clear() : Clears the page table entry.
(iv) pte_dirty() : Checks whether the ‘dirty’ attribute is set.
(v) pte_free() : Releases the page table.

Q Define the Virtual Address Space for a process in LINUX ?

Ans : The Virtual Address Space of a Linux process is segmented : a distinction is made
between the kernel segment and the user segment. For the x86 processor, two selectors
along with their descriptors must be defined for each of these segments. The data
segment selector only permits data to be read or modified, while the code segment
selector allows code in the segment to be executed and data to be read. The user
process can modify its local descriptor table, which holds the segment descriptors.

The user segment :

In User Mode, a process can access only the user segment. As the user segment
contains the data and code for the process, this segment needs to be different from
those belonging to other processes, and this means in turn that the page directories, or
at least the individual page tables for the different processes, must also be different. In
the system call fork, the parent process’s page directories and page tables are copied
for the child process. An exception to this is the kernel segment, whose page tables are
shared by all the processes.

The system call fork has an alternative : clone. Both system calls generate a new thread,
but in clone the old thread and the thread generated by clone can fully share the
memory. Thus, Linux regards threads as tasks which share their address space with
other tasks. The handling of additional task-specific resources, such as the stack, can
be controlled via parameters of the system call clone.

Virtual memory :

All Linux systems provide a useful abstraction called virtual memory. Virtual memory
acts as a logical layer between the application memory requests and the hardware
Memory management Unit (MMU). Virtual memory has many purposes and advantages:
•Several processes can be executed concurrently.
•It is possible to run applications whose memory needs are larger than the available
physical memory.
•Processes can execute a program whose code is only partially loaded in memory.
•Each process is allowed to access a subset of the available physical memory.
•Processes can share a single memory image of a library or program.
•Programs can be relocatable, that is, they can be placed anywhere in physical memory.
•Programmers can write machine-independent code, since they do not need to be
concerned about physical memory organization.

A virtual memory area is defined by the data structure vm_area_struct. The structure
vm_operations_struct defines the possible function pointers enabling different
operations to be assigned to different areas.

System call brk :

At the start of a process, the brk field in the process table entry points to the end
of the BSS segment, which holds the data that are not statically initialized. By
modifying this pointer the process can allocate and release dynamic memory.

The system call brk can be used to find the current value of the pointer or to set it to a
new value. If the argument is smaller than the pointer to the end of process code, the
current value of brk will be returned. Otherwise an attempt will be made to set a new
value.
The kernel function sys_brk() calls do_mmap() to map a private and anonymous area
between the old and new values of brk, corrected to the nearest page boundary and
returns new brk value.
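
The correction to the nearest page boundary is plain address arithmetic. The sketch below is an illustration only and not the kernel's code; page_align_up and pages_to_map are hypothetical names, and the page size is passed in as a parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Round `addr` up to the next multiple of `page_size`, which must be a
 * power of two. */
uintptr_t page_align_up(uintptr_t addr, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (addr + page_size - 1) & ~(page_size - 1);
}

/* Number of whole pages that have to be newly mapped when the break
 * moves from `old_brk` up to `new_brk`. */
uintptr_t pages_to_map(uintptr_t old_brk, uintptr_t new_brk, uintptr_t page_size)
{
    uintptr_t from = page_align_up(old_brk, page_size);
    uintptr_t to   = page_align_up(new_brk, page_size);

    return to > from ? (to - from) / page_size : 0;
}
```

If both the old and the new break fall into the same page, nothing has to be mapped, which is why small increases of brk are cheap.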

The kernel segment :

A Linux system call is generally initiated by the software interrupt 0x80 being triggered.
The processor then reads the gate descriptor stored in the interrupt descriptor table. The
processor jumps to this address with the segment descriptor in the CS register pointing
to the kernel segment. The assembler routine then sets the segment selectors in the DS
and ES registers in such a way that memory accesses will read or write to data in the
kernel segment.

As the page tables for the kernel segment are identical for all processes, this ensures
that any process in system mode will encounter the same kernel segment. In the kernel
segment, physical addresses and virtual addresses are the same except for the virtual
memory areas mapped by vmalloc().

In an x86 processor, the next step involves loading to the segment register FS a data
segment selector pointing to the user segment. Accesses to the user segment can then
be made using the put_user() and get_user() functions mentioned earlier. Such an access
may cause a general protection error if the referenced address is protected, or a
page fault if the page cannot be accessed. To avoid these problems, system routines have
to call the verify_area() function before they access the user segment. This checks
whether read or write access to the given area of the user segment is permitted,
investigating all the virtual memory areas affected by the area involved.

Q Define the static & Dynamic memory allocation in the kernel segment ?

Ans : Static memory allocation in the kernel segment :

In the system kernel, it is often necessary to allocate memory for kernel process. Before
a kernel generates its first process when it is run, it calls initialization routines for a range
of kernel components. These routines are able to reserve memory in the kernel
segment. The initialization routine is start_kernel(). The initialization function reserves
memory by returning a value higher than the parameter memory_start.

Dynamic memory allocation in the kernel segment :

The functions used for dynamic memory allocation are kmalloc() and kfree(). The
kmalloc() function attempts to reserve the extent of memory specified by size. The
memory that has been reserved can be released again by the function kfree(). In the
process, __get_free_pages() may be called; if no free pages are available and other
pages therefore need to be copied to secondary storage, this call may block.

In the Linux kernel, the __get_free_pages() function can only be used to reserve
contiguous areas of memory. As kmalloc() can reserve far smaller areas of memory,
however, the free memory in these areas needs to be managed. The central data
structure for this is the table sizes[ ], which contains descriptors for different sizes of
memory area.

One page descriptor manages each contiguous area of memory. This page descriptor is
stored at the beginning of every memory area reserved by kmalloc(). Within the page
itself, all the free blocks of memory are managed in a linear list. All the blocks of memory
in a memory area collected into one list are the same in size.

The block itself has a block header, which in turn holds a pointer to the next element if
the block is free, or else the actual size of the memory area allocated in the block.

(Figure : Structures for kmalloc)

Originally, kmalloc() provided the only facility for dynamic allocation of memory in the
kernel. In addition, the amount of memory that could be reserved was restricted to the size
of one page of memory. The situation was improved by the function vmalloc() and its
counterpart vfree(). The advantage of the vmalloc() function is that the size of the area
of memory requested can be better adjusted to actual needs than when using kmalloc(),
which requires 128 KB of consecutive physical memory to reserve just 64 KB. Besides
this, vmalloc() is limited only by the size of free physical memory and not by its
segmentation, as kmalloc() is. Since vmalloc() does not return any physical addresses
and the reserved areas of memory can be spread over non-consecutive pages, this
function is not suitable for reserving memory for DMA.
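
Why a 64 KB request can cost 128 KB becomes clear with a simplified model of size classes. The sketch below assumes power-of-two size classes and a fixed per-block header; the values HEADER_BYTES and MIN_BUCKET and the function bucket_for are hypothetical and do not reproduce the real sizes[ ] table.

```c
#include <assert.h>
#include <stddef.h>

#define HEADER_BYTES 16u   /* hypothetical bookkeeping overhead per block */
#define MIN_BUCKET   32u   /* hypothetical smallest size class */

/* Smallest power-of-two size class that can hold `size` bytes plus the
 * block header. */
size_t bucket_for(size_t size)
{
    size_t need = size + HEADER_BYTES;
    size_t bucket = MIN_BUCKET;

    assert(size > 0);
    while (bucket < need)
        bucket <<= 1;      /* try the next larger size class */
    return bucket;
}
```

In this model a request of exactly 64 KB no longer fits into the 64 KB class once the header is added, so it is served from the 128 KB class.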

Q Define the update and bdflush processes ?

Ans : The update process is a Linux process which at periodic intervals calls the system call
bdflush with an appropriate parameter. All modified buffer blocks that have not been
used for a certain time are written back to disk, together with all superblock and inode
information. The interval used by update as a default under Linux is five seconds.

bdflush is implemented as a kernel thread and is started during kernel initialization. In an
endless loop, it writes back the number of block buffers marked ‘dirty’ given in the
bdflush parameter (default is 500). Once this is completed, a new loop starts
immediately if the proportion of modified block buffers to the total number of buffers in
the cache becomes too high. Otherwise, the process switches to the
TASK_INTERRUPTIBLE state.

The kernel thread can be woken up using the wakeup_bdflush() function.

Q Define the paging under Linux ?

Ans : The RAM memory in a computer has always been limited and, compared to fixed disks,
relatively expensive. Particularly in multi-tasking operating systems, the limit of working
memory is quickly reached. Thus it was not long before someone hit on the idea of
offloading temporarily unused areas of primary storage(RAM) to secondary storage.

The traditional procedure for this used to be the so-called ‘swapping’ which involves
saving entire processes from memory to a secondary medium and reading them in
again. This approach does not solve the problem of running processes with large
memory requirements in the available primary memory. Besides this, saving and reading
in whole processes is very inefficient.

When new hardware architectures (VAX) were introduced, the concept of demand
paging was developed. Under the control of a memory management unit (MMU) the
entire memory is divided up into pages, with only complete pages of memory being read
in or saved as required. As all modern processor architectures, including the x86
architecture, support the management of paged memory, demand paging is employed
by Linux. Pages of memory which have been mapped directly to the virtual address area
of a process using do_mmap() without write authorization are not saved, but simply
discarded. Their contents can be read in again from the files which were mapped.
Modified memory pages, in contrast, must be written into swap space.

Pages of memory in the kernel segment cannot be saved, for the simple reason that
routines and data structures which read memory pages back from secondary storage
must always be present in primary memory.

Linux can save pages to external media in two ways. In the first, a complete block device
is used as the external medium. This will typically be a partition on a hard disk. The
second uses fixed-length files in a file system for its external storage. The term ‘swap
space’ may refer to either a swap device or a swap file.

Using a swap device is more efficient than using a swap file. In a swap device, a page is
always saved to consecutive blocks, whereas in a swap file, the individual blocks may be
given various block numbers depending on how the particular file system fragmented the
file when it was set up. These blocks then need to be found via the swap file’s inode. On
a swap device, the first block is given directly by the offset for the page of memory to be
saved or read in.

Q Define the IPC ?

Ans : There are many applications in which processes need to cooperate with each other. The
Linux IPC (Inter Process Communication) facility provides many methods for multiple
processes to communicate with each other.

A variety of forms of inter-process communication can be used under Linux. These
include :

•resource sharing
•connectionless data exchange and
•connection-oriented data exchange

Resource sharing :

If processes have to share a resource (such as a printer), it is important to make sure that
no more than one process is accessing the resource (that is, sending data to the printer)
at any given time. If different processes send data at the same time, a race condition
arises, and communication between the processes must prevent it. Eliminating race
conditions is only one possible use of inter-process communication.

Synchronization in the kernel :

As the kernel manages the system resources, access by processes to these resources
must be synchronized. A process will not be interrupted by the scheduler so long as it is
executing a system call. This only happens if it blocks or itself calls schedule() to allow the
execution of other processes. While a process is running in its critical section, no other
process may run in its critical section; to achieve this, Linux IPC provides different
synchronization methods.

Connectionless data exchange :

In connectionless data exchange a process simply sends data packets, which may be
given a destination address or a message type, and leaves it to the infrastructure to
deliver them. For example : when we send a letter, we rely on a connectionless model.

Connection oriented data exchange :

In connection-oriented data exchange, the two parties to the communication must set up
a connection before communication can start. For example : when we make a telephone call,
or when a client application sends a request through its client socket and the server socket
receives the request and creates the connection, we are using connection-oriented data
exchange.

Q How does Linux implement the forms of interprocess communication ? Explain.

Ans : Linux implements interprocess communication in different forms :-

Communication by files :

Communication via files is in fact the oldest way of exchanging data between programs.
Program A writes data to a file and program B reads the data out again. In a
multi-tasking system, however, both programs could be run as processes at least
quasi-parallel to each other. Race conditions then usually produce inconsistencies in the file
data, which result from one program reading a data area before the other has completed
modifying it, or both processes modifying the same area of memory at the same time.
To avoid race conditions on files, Linux provides different types of locking
mechanisms :-

1. Mandatory Locking : -

Mandatory locking blocks read and write operations throughout the entire area.

There are two methods for locking entire files.

In the first method, an auxiliary file known as a lock file is created in addition to the
file to be locked; it refuses access to the file when it is present. The system calls link,
creat and open are used for this locking. link creates the lock file if the lock file does
not yet exist. creat aborts with an error code if the calling process does not
possess the appropriate access rights. open (with O_CREAT and O_EXCL) creates the lock
file only if it does not already exist.

The drawback to all three of these is that after a failure the process must repeat its
attempt to set up a lock file. Usually, the process will call sleep() to wait for one
second and then try again.
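
The lock file method with open can be sketched as follows. This is only a minimal illustration, not code from the kernel: the function names acquire_lock_file() and release_lock_file() and the retry count are assumptions chosen for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to create the lock file atomically.
 * O_CREAT | O_EXCL makes open() fail with EEXIST if the file is present.
 * Returns 0 when the lock was obtained, -1 otherwise. */
int acquire_lock_file(const char *lock_path, int max_tries)
{
    for (int attempt = 0; attempt < max_tries; attempt++) {
        int fd = open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd >= 0) {
            close(fd);              /* the existence of the file is the lock */
            return 0;
        }
        if (errno != EEXIST)
            return -1;              /* a real error, e.g. missing access rights */
        if (attempt + 1 < max_tries)
            sleep(1);               /* lock is held: wait one second, try again */
    }
    return -1;
}

/* Hypothetical helper: release the lock by removing the lock file. */
int release_lock_file(const char *lock_path)
{
    return unlink(lock_path);
}
```

A process would call acquire_lock_file() before touching the protected file and release_lock_file() afterwards. If the process dies in between, the lock file stays behind, which is a further weakness of this method.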

In the second method, the entire file is locked by means of the fcntl system call. This
function can also be invoked through flock() or lockf().

2. Advisory Locking : -

With advisory locking, all processes accessing the file for read or write operations have
to set the appropriate lock and release it again.

Locking file areas is usually referred to as record locking. Advisory locking of file areas can
be achieved with the system call fcntl. The prototype of fcntl() is

int sys_fcntl(unsigned int fd, unsigned int cmd, unsigned long arg);

fd : The parameter fd is used to pass a file descriptor.

cmd : The command for locking purposes; it can be F_GETLK, F_SETLK or F_SETLKW.
arg : arg must be a pointer to an flock structure, which stores the lock type,
position, length and process id.

Semantics of fcntl locks :

Existing locks             Set read lock    Set write lock

None                       Possible         Possible
More than one read lock    Possible         Not legal
One write lock             Not legal        Not legal

The list of locked files is managed in a doubly linked list, file_lock_table.
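
From user space, record locking through fcntl might look like the sketch below. The helper name lock_region() is an assumption made for this example; the flock structure fields and the F_SETLK command are those of the standard fcntl interface.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set or clear an advisory lock on the bytes
 * [start, start + len) of the open file fd.
 * type is F_RDLCK (read lock), F_WRLCK (write lock) or F_UNLCK (release).
 * Returns 0 on success, -1 if the lock conflicts or another error occurs. */
int lock_region(int fd, short type, off_t start, off_t len)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;     /* l_start is counted from the start of the file */
    fl.l_start = start;
    fl.l_len = len;

    /* F_SETLK returns at once on a conflict; F_SETLKW would wait instead. */
    return fcntl(fd, F_SETLK, &fl);
}
```

The table above describes conflicts between different processes. Within one process a new lock on the same area simply replaces the old one, so a conflict can only be observed from a second process.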

Pipes : -

A pipe is a one-way flow of data between processes : all the data written by a
process to the pipe is routed by the kernel to another process, which can thus read it.

In UNIX shells, pipes can be created by means of the | operator. For example, the following
statement instructs the shell to create two processes connected by a pipe.

$ ls | more

The standard output of the first process, which executes the ls program, is redirected to
the pipe; the second process, which executes the more program, reads its input from
the pipe.

Another variant of pipes consists of named pipes, also known as FIFOs. They can be
set up in a file system using the command

$ mkfifo filename

Pipes are a special type of file in Linux; their file type is shown as p.

The system call pipe creates a pipe, which involves setting up a temporary inode and
allocating a page of memory. The call returns one file descriptor for reading and one for
writing.
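
The two descriptors returned by pipe can be seen in the following sketch. To keep it short, the same process writes and reads; normally the two ends belong to different processes after a fork(). The function name demo_pipe_round_trip() is an assumption for this example.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: send msg through a new pipe and read it back into out.
 * Returns the number of bytes read, or -1 on error.
 * msg must be small enough to fit into the pipe buffer, because nobody reads
 * while we write. */
ssize_t demo_pipe_round_trip(const char *msg, char *out, size_t out_len)
{
    int fds[2];                     /* fds[0]: read end, fds[1]: write end */

    if (pipe(fds) == -1)
        return -1;

    ssize_t written = write(fds[1], msg, strlen(msg));
    close(fds[1]);                  /* closing the write end lets read() see EOF */

    ssize_t got = (written < 0) ? -1 : read(fds[0], out, out_len);
    close(fds[0]);
    return got;
}
```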

System V IPC : -

IPC is an abbreviation that stands for interprocess communication. The classical forms
of inter-process communication-semaphores, message queues and shared memory-
were implemented in a special variant of UNIX. These were later integrated into System
V and are now known as System V IPC. It denotes a set of system calls that allows a
user mode process to :
•Synchronize itself with other processes by means of semaphores.
•Send messages to other processes or receive messages from them.
•Share a memory area with other processes.

IPC data structures are created dynamically when a process requests an IPC resource
(a semaphore, a message queue, or a shared memory segment). An IPC resource may
be used by any process, including those that do not share the ancestor that created the
resource.

Since a process may require several IPC resources of the same type, each new resource is
identified by a 32 bit IPC key, which is similar to the file pathname in the system’s
directory tree. IPC identifiers are assigned to IPC resources by the kernel and are
unique within the system, while IPC keys can be freely chosen by programmers.

Access permissions are managed by the kernel in the structure ipc_perm .

Semaphores :

Semaphores are counters used to provide controlled access to shared data structures
for multiple processes. The semaphore value is positive if the protected resource is
available, and negative or zero if the protected resource is currently not available. A
process that wants to access the resource decrements the semaphore value by 1. It is
allowed to use the resource only if the old value was positive; otherwise the process
waits until the semaphore becomes positive. Depending on the number of resources, an
array of semaphores can be set up using system calls.

struct semaphore {
    int count;
    struct wait_queue *wait;
};

A semaphore is taken to be occupied if count has a value less than or equal to 0. All the
processes wishing to occupy the semaphore enter themselves in the wait queue. They are
then notified when it is released by another process. There are two auxiliary functions to
occupy or release a semaphore, down() and up() respectively.
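
A user mode process reaches semaphores through the System V calls semget, semctl and semop. The sketch below assumes Linux with glibc, where the program has to define union semun itself; the demo_ function names are assumptions for this example. Note that a System V semaphore value never drops below zero: a decrement that would do so blocks, or fails at once when IPC_NOWAIT is given as here.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the calling program must define this union for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical helper: create a private set holding one semaphore with the
 * given start value. Returns the IPC identifier or -1. */
int demo_sem_create(int initial)
{
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    union semun arg;
    arg.val = initial;
    if (semctl(id, 0, SETVAL, arg) == -1) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }
    return id;
}

/* Hypothetical helper: delta = -1 occupies, delta = +1 releases.
 * Returns 0 on success, -1 if the operation would have to wait. */
int demo_sem_change(int id, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = IPC_NOWAIT };
    return semop(id, &op, 1);
}

/* Hypothetical helper: remove the semaphore set from the system. */
int demo_sem_remove(int id)
{
    return semctl(id, 0, IPC_RMID);
}
```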

Message queues :

Processes can communicate with each other by means of IPC messages. Each message
generated by a process is sent to an IPC message queue where it stays until another
process reads it.

A message is composed of a fixed sized header and a variable length text; it can be
labeled with an integer value ( the message type), which allows a process to selectively
retrieve messages from its message queue. Once a process has read a message from
the IPC message queue, the kernel destroys it; therefore, only one process can retrieve
a given message.
In order to send a message, a process invokes the msgsnd() function, passing as
parameters :

•The IPC identifier of the destination message queue
•The size of the message text
•The address of a user mode buffer that contains the message type immediately
followed by the message text.

To retrieve a message, a process invokes the msgrcv() function, passing to it :

•The IPC identifier of the IPC message queue resource.
•The pointer to a user mode buffer to which the message type and message text should
be copied
•The size of this buffer
•A value t that specifies what message should be retrieved
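
The parameters listed above map onto the calls as in the following sketch. The structure demo_msg, the constant DEMO_TEXT_MAX and the demo_ function names are assumptions for this example; msgsnd() and msgrcv() are the standard System V calls. IPC_NOWAIT is used so that the calls return at once instead of waiting.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define DEMO_TEXT_MAX 64

/* User mode buffer: the message type immediately followed by the text. */
struct demo_msg {
    long mtype;                     /* message type, must be greater than 0 */
    char mtext[DEMO_TEXT_MAX];      /* message text */
};

/* Hypothetical helper: send text with the given type. Returns 0 or -1. */
int demo_send(int qid, long type, const char *text)
{
    struct demo_msg m;

    m.mtype = type;
    strncpy(m.mtext, text, DEMO_TEXT_MAX - 1);
    m.mtext[DEMO_TEXT_MAX - 1] = '\0';
    /* The size counts only the text (here with its final NUL), not mtype. */
    return msgsnd(qid, &m, strlen(m.mtext) + 1, IPC_NOWAIT);
}

/* Hypothetical helper: fetch the first message of the given type into out,
 * which must hold DEMO_TEXT_MAX bytes. Returns the text size or -1. */
ssize_t demo_receive(int qid, long type, char *out)
{
    struct demo_msg m;
    ssize_t n = msgrcv(qid, &m, DEMO_TEXT_MAX, type, IPC_NOWAIT);

    if (n >= 0)
        memcpy(out, m.mtext, (size_t)n);
    return n;
}
```

Because a message is destroyed once it has been read, a second demo_receive() for the same type fails when no further message of that type is waiting.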

Shared Memory :

The most useful IPC mechanism is shared memory, which allows two or more processes
to access some common data structures by placing them in a shared memory segment.
Each process that wants to access the data structures included in a shared memory
segment must add to its address space a new memory region, which maps the page
frames associated with the shared memory segment. Such page frames can thus be
easily handled by the kernel through demand paging.

The shmget() function is invoked to get the IPC identifier of a shared memory segment,
optionally creating it if it does not already exist.

The drawback to shared memory is that the processes need to use additional
synchronization mechanisms to ensure that race conditions do not arise.
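
A small sketch of shared memory use follows. To stay within one program, the same segment is attached twice, which stands in for two processes each adding the segment to its own address space. The function name demo_shm_share() and the segment size are assumptions for this example.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define DEMO_SEG_SIZE 4096

/* Hypothetical helper: write text through one mapping of a new segment and
 * check that a second mapping of the same segment sees it.
 * Returns 0 if the data is visible through both mappings, -1 otherwise. */
int demo_shm_share(const char *text)
{
    size_t len = strlen(text) + 1;
    if (len > DEMO_SEG_SIZE)
        return -1;

    int id = shmget(IPC_PRIVATE, DEMO_SEG_SIZE, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    char *a = shmat(id, NULL, 0);   /* first memory region */
    char *b = shmat(id, NULL, 0);   /* second region, same page frames */
    int ok = -1;

    if (a != (char *)-1 && b != (char *)-1) {
        memcpy(a, text, len);
        if (a != b && memcmp(b, text, len) == 0)
            ok = 0;
    }
    if (a != (char *)-1)
        shmdt(a);
    if (b != (char *)-1)
        shmdt(b);
    shmctl(id, IPC_RMID, NULL);     /* remove the segment again */
    return ok;
}
```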

Q Define the system call ptrace ?

Ans : Execution Tracing is a technique that allows a program to monitor the execution of
another program. The traced program can be executed step-by-step, until a signal is
received, or until a system call is invoked. Execution tracing is widely used by
debuggers, together with other techniques like the insertion of breakpoints in the
debugged program and run-time access to its variables. In Linux, execution tracing is
performed through the ptrace() system call, which can handle the following commands :

Command            Description

PTRACE_TRACEME     Start execution tracing for the current process
PTRACE_ATTACH      Start execution tracing for another process
PTRACE_DETACH      Terminate execution tracing
PTRACE_KILL        Kill the traced process
PTRACE_PEEKTEXT    Read a 32 bit value from the text segment
PTRACE_PEEKDATA    Read a 32 bit value from the data segment
PTRACE_POKETEXT    Write a 32 bit value to the text segment
PTRACE_POKEDATA    Write a 32 bit value to the data segment
PTRACE_CONT        Resume execution

Several monitored events can be associated with a traced program :
•End of execution of a single assembly instruction
•Entering a system call
•Exiting from a system call
•Receiving a signal

When a monitored event occurs, the traced program is stopped and a SIGCHLD signal
is sent to its parent. When the parent wishes to resume the child’s execution, it can use
the PTRACE_CONT command.
A process can also be traced using some debugging features of the Intel Pentium
processors. For example, the parent could set the values of the dr0,….dr7 debug
registers for the child by using the PTRACE_POKEUSR command. When a monitored
event occurs, the CPU raises the “Debug” exception; the exception handler can then
suspend the traced process and send the SIGCHLD signal to the parent.
Chapter 6 : The Linux file system

Q Explain the representation of file systems in the Linux kernel ?

Ans: The file system is the most visible aspect of an operating system. It provides the
mechanism for on-line storage of and access to both data and programs of the
operating system. A central demand made of a file system is the purposeful structuring
of data. When selecting a purposeful structure, however, two factors not to be
neglected are the speed of access to data and a facility for random access.

Each file system starts with a boot block. This block is reserved for the code required
to boot the operating system.

The range of file systems supported is made possible by the unified interface to the
Linux kernel. This is the Virtual File System Switch (VFS). The virtual file system is a
kernel software layer that handles all system calls related to a standard Linux
filesystem. Its main strength is providing a common interface to several kinds of
filesystems.

For instance, let us assume that a user issues the shell command:

$ cp /mnt/floppy/TEST /tmp/test

Where /mnt/floppy is the mount point of an MS-DOS diskette and /tmp is a normal
EXT2 directory. The cp program is not required to know the filesystem types of
/mnt/floppy/TEST and /tmp/test. Instead, cp interacts with the VFS by means of
generic system calls well known to anyone who has done Linux programming.

Whenever a different filesystem is to be used, it must first be registered. This is the
responsibility of the VFS, which calls register_filesystem(). This function fills in the
file_system_type structure, which stores the information about the
filesystem.

Once a file system implementation has been registered with the VFS, file system of this
type can be administered.

The common file model consists of the following structure types :

•The superblock structure
•The inode structure
•The file structure

Mounting :

Before a file can be accessed, the file system containing the file must be mounted. This
can be done using either the system call mount or the function mount_root(). The
mount_root function takes care of mounting the first file system. It is called by the system
call setup after all the file system implementations permanently included in the kernel
have been registered. The setup call itself is called just once, immediately after the init
process is created by the kernel function init().

The superblock :

All the information which is essential for managing the file system is held in the
superblock. Every mounted file system is represented by a super_block structure.
These structures are held in the static table super_block[ ]. The superblock is
initialized by the function read_super() in the Virtual File System. The superblock
contains information on the entire file system, such as block size, access rights and
time of the last change. The superblock also holds references to the file system’s root
inode.

Some important possible operations on super_block structure are as follows :

write_super() : The write_super function is used to save the information of the
superblock.

put_super() : The VFS calls this function when unmounting file systems, when it
should also release the superblock and other information buffers.

read_inode() : The inode structure is initialized by this function, just as read_super() fills
the super_block structure.

notify_change() : The changes made to the inode via system calls are acknowledged
by notify_change().
write_inode() : This function saves the inode structure, analogous to write_super().

The inode :

Some important possible operations on the inode structure are as follows :

create() : creates a new disk inode for a file.
lookup() : searches for the inode of a given file.
link() : sets up a hard link.
unlink() : deletes the specified file in the directory specified.
symlink() : creates a symbolic link.

The file structure :

The file structure describes how a process interacts with a file it has opened. The
structure is created when the file is opened. It
contains information on a specific file’s access rights f_mode, the current file position
f_pos, the type of access f_flags and the number of accesses f_count. The file
structures are managed in a doubly linked list via the pointers f_next and f_prev. This
file table can be accessed via the pointer first_file.

Some important possible operations on the file structure are as follows :

lseek() : The job of the lseek function is to deal with positioning within the file.
read() : This function copies count bytes from the file into the buffer buf in the
user address space.
write() : The write function operates in an analogous manner to read() and copies
data from the user address space to the file.
select() : This function checks whether data can be read from a file or written to it.
ioctl() : The ioctl() function sets device-specific parameters.

Q Explain the proc filesystem ?

Ans : Linux supports different filesystems; here we explain the process file system
(proc), which is based on that of System V Release 4. Each process in the system which is currently
running is assigned a directory /proc/pid, where pid is the process identification number
of the relevant process. This directory contains files holding information on certain
characteristics of the process.

When the Proc file system is mounted, the VFS function read_super() is called by
do_mount(), and in turn calls the function proc_read_super() for the Proc file system in
the file_system list.

iget() generates the inode for the proc root directory, which is entered in the superblock.
The parse_options() function then processes the mount options data that have been
provided and sets the owner of the root inode.

Accessing the file system is always carried out by accessing the root inode of the file
system. The first access is made by calling iget(). If the inode does not exist, this
function then calls the proc_read_inode() function entered in the proc_sops structure.

This inode describes a directory with read and execute permissions for all processes.
The proc_root_inode_operations only provides two functions: the component readdir
in the form of the proc_readroot() function and the component lookup as the
proc_lookuproot() function. Both functions operate using the table root_dir[ ], which
contains the different entries for the root directory.

The individual structures contain the inode number, the length of the filename, and the
name itself. proc_lookuproot() determines the inode of a file by reference to the
inode for the directory and the name of a file contained in it.

In the function proc_read_inode(), the inode for most normal files is assigned the function
vector proc_array_inode_operations. All that is implemented in this, however, is the
function array_read() in the standard file operations to read the files.

Q Explain the Linux file system (ext2)?

Ans : As Linux was initially developed under MINIX, it is hardly surprising that the first LINUX
file system was the MINIX file system. However, this file system restricts partitions to a
maximum of 64 MB and filenames to no more than 14 characters, so the search for a
better file system was not long in starting. The result was the Ext file system, the first to
be designed especially for LINUX. Although this allowed partitions of up to 2 GB and
filenames of up to 255 characters and included several significant extensions, it offered
unsatisfactory performance. The Second Extended Filesystem (Ext2) was introduced in
1994 : besides including several new features, it is quite efficient and robust and has
become the most widely used LINUX file system.

The most significant features are :

Block fragmentation :

System administrators usually choose large block sizes for accessing recent disks. As a
result, small files stored in large blocks waste a lot of disk space. This problem can be
solved by allowing several files to be stored in different fragments of the same block.

Access Control Lists :

Instead of classifying the users of a file under three classes - owner, group, and others -
an access control list (ACL) is associated with each file to specify the access rights for
any specific users or combinations of users.

Handling of compressed and encrypted files :

The new option, which must be specified when creating a file, will allow users to store
compressed and / or encrypted versions of their files on disk.

Logical deletion :

An undelete option will allow users to easily recover, if needed, the contents of
previously removed file.

The structure of the Ext2 file system :

The first block in any Ext2 partition is never managed by the Ext2 filesystem, since it is
reserved for the partition boot sector. The rest of the Ext2 partition is split into block groups.
Block groups reduce file fragmentation, since the kernel tries to keep the data blocks
belonging to a file in the same block group if possible. Each block in a block group contains
one of the following pieces of information :
•A copy of the filesystem’s superblock
•A copy of the group of block group descriptors
•A data block bitmap
•A group of inodes
•An inode bitmap
•A chunk of data belonging to a file; that is, a data block

An Ext2 disk superblock is stored in an ext2_super_block structure, which contains the total
number of inodes, the filesystem size in blocks, the number of reserved blocks, the free blocks
counter, the free inodes counter, the block size, the fragment size and other important information.

Each block group has its own group descriptor, an ext2_group_desc structure, and contains
the inode table.

Directories in the Ext2 file system

In the Ext2 file system, directories are administered using a singly linked list. Ext2 implements
directories as a special kind of file whose data blocks store filenames together with the
corresponding inode numbers. In particular, such data blocks contain structures of type
ext2_dir_entry_2. The structure has a variable length, since the last field, name, is a variable
length array of up to EXT2_NAME_LEN characters (usually 255). The name_len field stores
the actual file name length. The rec_len field may be interpreted as a pointer to the next valid
directory entry : it is the offset to be added to starting address of the directory entry to get the
starting address of the next valid directory entry.
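
How rec_len links the entries of a directory block can be sketched in user space as follows. The structure demo_dir_entry_head is a simplified stand-in for the fixed part of the directory entry (inode number, rec_len, name_len and a file type byte, with the name following directly); the structure and the demo_ functions are assumptions for this example and the code does not read a real disk.

```c
#include <stdint.h>
#include <string.h>

/* Fixed part of one directory entry; the name bytes follow it directly. */
struct demo_dir_entry_head {
    uint32_t inode;      /* inode number, 0 marks an unused entry */
    uint16_t rec_len;    /* offset from this entry to the next one */
    uint8_t  name_len;   /* actual length of the name */
    uint8_t  file_type;
};

/* Hypothetical helper: write one entry into the block at offset off. */
void demo_dir_put(unsigned char *block, size_t off, uint32_t inode,
                  uint16_t rec_len, const char *name)
{
    struct demo_dir_entry_head e = { inode, rec_len, (uint8_t)strlen(name), 0 };

    memcpy(block + off, &e, sizeof e);
    memcpy(block + off + sizeof e, name, e.name_len);
}

/* Hypothetical helper: search a directory data block for name.
 * Returns the inode number, or 0 if the name is not present. */
uint32_t demo_dir_lookup(const unsigned char *block, size_t block_size,
                         const char *name)
{
    size_t want = strlen(name);
    size_t off = 0;

    while (off + sizeof(struct demo_dir_entry_head) <= block_size) {
        struct demo_dir_entry_head e;

        memcpy(&e, block + off, sizeof e);
        if (e.rec_len < sizeof e + e.name_len || off + e.rec_len > block_size)
            break;                          /* damaged entry: stop the walk */
        if (e.inode != 0 && e.name_len == want &&
            memcmp(block + off + sizeof e, name, want) == 0)
            return e.inode;
        off += e.rec_len;                   /* rec_len is the "next" pointer */
    }
    return 0;
}
```

The last entry of a block usually has a rec_len that reaches to the end of the block, which is how the walk terminates.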

Block allocation in the Ext2 file system

A problem commonly encountered in all file systems is the fragmentation of files, that is, the
‘scattering’ of files into small pieces as a result of the constant deleting and creating of new files.
The Ext2 file system uses two algorithms to limit the fragmentation of files.

Target-oriented allocation :

This algorithm always looks for space for new data blocks in the area of a ‘target block’. If this
block is itself free, it is allocated. Otherwise, a free block is sought within 32 blocks of the target
block, and if found, is allocated. If this fails, the block allocation routine tries to find a free block
which is at least in the same block group as the target block. Only after these avenues have
been exhausted are other block groups investigated.
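
The search order of target-oriented allocation can be modelled as below. This is a simplified sketch and not the kernel routine: one byte per block replaces the real bitmap, "within 32 blocks" is assumed to mean the 32 blocks after the target, and the search of other block groups is left out. The name demo_alloc_block() is an assumption for this example.

```c
#include <stddef.h>

#define DEMO_NEAR_WINDOW 32

/* Hypothetical helper: allocate a block in one block group.
 * used[i] != 0 means block i is occupied.
 * Returns the number of the allocated block, or -1 if the group is full. */
long demo_alloc_block(unsigned char *used, size_t nblocks, size_t target)
{
    size_t i;

    if (target >= nblocks)
        return -1;

    /* 1. The target block itself. */
    if (!used[target]) {
        used[target] = 1;
        return (long)target;
    }

    /* 2. A free block close to the target. */
    for (i = target + 1; i <= target + DEMO_NEAR_WINDOW && i < nblocks; i++) {
        if (!used[i]) {
            used[i] = 1;
            return (long)i;
        }
    }

    /* 3. Any free block in the same block group. */
    for (i = 0; i < nblocks; i++) {
        if (!used[i]) {
            used[i] = 1;
            return (long)i;
        }
    }

    return -1;      /* the caller would now try other block groups */
}
```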

Pre-allocation :

If a free block is found, up to eight following blocks are reserved (if they are free). When the file
is closed, the remaining blocks still reserved are released. This also guarantees that as many
data blocks as possible are collected into one cluster.
Chapter 7 : Device drivers under Linux

A device driver is an interface between a device and the O. S. It is the software
which operates the hardware. There is a wide variety of hardware available for LINUX
computers. Each piece of hardware has its own device driver. Without these, an operating
system would have no means of input or output and no file system. Device drivers are
uniquely identified by their major numbers. A device driver may be controlling a number
of physical and virtual devices, for example a number of hard disks and partitions; thus,
the individual device is accessed via its minor number, an integer between 0 and 255.
Each individual device can thus be uniquely identified by the device type (block or
character), the major number of the device driver and its minor number.
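
From user space, the three identifying values can be read from a device file with stat(). The sketch assumes Linux with glibc, whose major() and minor() macros come from <sys/sysmacros.h>; the structure demo_devinfo and the function demo_devinfo_get() are assumptions for this example.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

struct demo_devinfo {
    char kind;                  /* 'b' block, 'c' character, '-' no device */
    unsigned int major_nr;      /* identifies the device driver */
    unsigned int minor_nr;      /* identifies the individual device */
};

/* Hypothetical helper: fill out with the device type and numbers of path.
 * Returns 0 on success, -1 if the file cannot be examined. */
int demo_devinfo_get(const char *path, struct demo_devinfo *out)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;

    out->kind = S_ISBLK(st.st_mode) ? 'b' : S_ISCHR(st.st_mode) ? 'c' : '-';
    out->major_nr = major(st.st_rdev);
    out->minor_nr = minor(st.st_rdev);
    return 0;
}
```

For example, calling demo_devinfo_get() on /dev/null reports a character device. Current kernels allow minor numbers larger than 255; the range given above is that of the older kernels these notes describe.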

Q Explain character and block devices under Linux ?

Ans : Block devices :

Block devices are those which transfer data block-wise and provide the facility
of random access. A block device is divided into a specific number of equal-sized
blocks, and each block has a unique number. The file system defines its addressing scheme
with the help of these block numbers. Using such an address, any data can be accessed
directly at any location, in any order. For reading from and writing to block devices,
Linux maintains a buffer area in RAM. Random access is an absolute necessity for file
systems, which means that they can only be mounted on block devices. RAM disks, hard
disks, floppy disks and CD-ROMs are all block devices.

Character devices :

Character devices, on the other hand, process data character by character and
sequentially, and Linux does not maintain a buffer area for them. Some character
devices maintain their own buffer for internal block transfers, but these
blocks are sequential in nature and cannot be accessed randomly. For example, an ink
printer and a laser printer print line by line and page by page respectively, so the
characters are stored in a buffer and, when the required limit is reached, the device sends
the whole block of data for printing. Some character devices are : printers, scanners,
sound cards, monitors and the PC speaker.

Q In the context of LINUX device drivers, write short notes on the following :

Polling Interrupt
Interrupt Sharing Bottom Halves
Task Queues DMA
Ans : Polling :

In polling, the driver constantly checks the hardware. The driver defines a timeout (jiffies
+ waiting time) and continuously checks the hardware until the timeout limit is
reached. Once the timeout has expired, the timeout error handling gives the
appropriate error message, for example that a printer is out of paper or offline.
Polling results in pointless wasting of processor time, but it is sometimes the fastest
way of communicating with the hardware. The device driver for the parallel interface
works by polling as the default option.

Interrupt :

The use of interrupts, on the other hand, is only possible if these are supported by the
hardware. Here, the device informs the CPU via an interrupt channel (IRQ) that it has
finished an operation. This breaks into the current operation and carries out an interrupt
service routine (ISR). Further communication with the device then takes place within the
ISR.

An example is the serial mouse, every movement of which sends data to the serial port,
triggering an IRQ. The data from the serial port is read first by the handling ISR, which
passes it through to the application program.

IRQs are installed using the function request_irq(), to which different parameters are
passed : the IRQ number, the address of the handling routine, the device name, the device
id and irqflags.

irqflags specifies the type of interrupt. If irqflags is not set (NULL) the interrupt is a slow
interrupt, if it is set to SA_INTERRUPT the interrupt is a fast interrupt, and if it is set to
SA_SHIRQ it is a sharable interrupt.

Interrupt sharing :

Several hardware devices may use the same IRQ number. If different devices which use the
same interrupt are fitted on the same PCI board, they conflict with each other. In this case
interrupt sharing provides the facility to use both devices on the same PCI board : while
one device is using the PCI bus, the second device waits for the bus to become free. If an
ISR capable of interrupt sharing is installed, this must be communicated to the
request_irq() function by setting the SA_SHIRQ flag. If another ISR also capable of
interrupt sharing was already installed on this interrupt, a chain is built.

Bottom Halves :

It frequently happens that not all the functions need to be performed immediately after
an interrupt occurs; although ‘important’ actions need to be taken care of at once, others
can be handled later or would take a relatively long time and it is preferable not to block
the interrupt. A bottom half is a low-priority function, usually related to interrupt handling,
that is waiting for the kernel to find a convenient moment to run it.

Before invoking a bottom half for the first time, it must be initialized. This is done by
invoking the init_bh() function, which inserts the routine address in the nth entry of
bh_base. The bh_base table groups all bottom halves together. It is an array of pointers to
bottom halves and can include up to 32 entries, one for each type of bottom half.

Some Linux Bottom Halves are as follows:

CONSOLE_BH : Virtual console

KEYBOARD_BH : Keyboard
NET_BH : Network Interface
SCSI_BH : SCSI interface
SERIAL_BH : Serial port
TIMER_BH : Timer
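
The idea of a fixed table of 32 bottom halves can be modelled in user space as follows. This is only a sketch of the concept: the table demo_bh_base, the pending mask demo_bh_active and all demo_ functions are assumptions for this example and are not the kernel's own code.

```c
#include <stddef.h>

#define DEMO_BH_MAX 32

void (*demo_bh_base[DEMO_BH_MAX])(void);  /* one slot per type of bottom half */
unsigned long demo_bh_active;             /* bit n set: slot n waits to be run */
int demo_bh_runs;                         /* counter used by the sample routine */

/* Sample bottom half: only counts how often it has been run. */
void demo_count_bh(void)
{
    demo_bh_runs++;
}

/* Enter the routine address in slot nr of the table. */
void demo_init_bh(int nr, void (*routine)(void))
{
    demo_bh_base[nr] = routine;
}

/* Called by the "interrupt handler": request that slot nr is run later. */
void demo_mark_bh(int nr)
{
    demo_bh_active |= 1UL << nr;
}

/* Called at a convenient moment: run every pending bottom half once.
 * Returns the number of routines that were run. */
int demo_run_bottom_halves(void)
{
    int ran = 0;

    for (int nr = 0; nr < DEMO_BH_MAX; nr++) {
        unsigned long bit = 1UL << nr;

        if ((demo_bh_active & bit) && demo_bh_base[nr] != NULL) {
            demo_bh_active &= ~bit;     /* clear first so it can be marked again */
            demo_bh_base[nr]();
            ran++;
        }
    }
    return ran;
}
```

Marking the same slot twice before it runs still results in a single run, which shows why a bottom half must be written to handle all the work that has piled up.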

Task Queues :

Task queues are a dynamic extension of the concept of bottom halves. Use of bottom
halves is somewhat difficult because their number is limited to only 32, and some tasks
are already assigned to fixed numbers. Task queues allow a number of functions to be
entered in a queue and processed one after another at a later time.

A queue element is described by the tq_struct which holds :

- the pointer to next entry in *next

- synchronization flag sync
- function to be called
- argument passed to the function at call time in *data.

Before a function can be entered in a task queue, a tq_struct structure must be created
and initialized.
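
A task queue with the four fields named above can be modelled in user space like this. The structure demo_tq and the demo_ functions are assumptions for this example; in particular, the model appends new elements at the end of the queue, which need not match the order used by the kernel.

```c
#include <stddef.h>

/* Model of a queue element with the fields described in the text. */
struct demo_tq {
    struct demo_tq *next;       /* pointer to the next entry */
    int sync;                   /* set while the element is in a queue */
    void (*routine)(void *);    /* function to be called */
    void *data;                 /* argument passed to the function */
};

/* Sample functions operating on an int passed through data. */
void demo_add_one(void *data) { *(int *)data += 1; }
void demo_double(void *data)  { *(int *)data *= 2; }

/* Enter an element in the queue. Returns 1 if it was entered and 0 if it
 * was already waiting in a queue (sync flag set). */
int demo_queue_task(struct demo_tq *t, struct demo_tq **queue)
{
    if (t->sync)
        return 0;
    t->sync = 1;
    t->next = NULL;
    while (*queue != NULL)
        queue = &(*queue)->next;
    *queue = t;
    return 1;
}

/* Process all queued functions one after another and empty the queue.
 * Returns the number of functions called. */
int demo_run_task_queue(struct demo_tq **queue)
{
    struct demo_tq *t = *queue;
    int ran = 0;

    *queue = NULL;
    while (t != NULL) {
        struct demo_tq *next = t->next;

        t->sync = 0;            /* the element may be queued again from now on */
        t->routine(t->data);
        t = next;
        ran++;
    }
    return ran;
}
```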

DMA mode :

Direct memory access or DMA, is the hardware mechanism that allows peripheral
components to transfer their I/O data directly to and from main memory without the need
for the system processor to be involved in the transfer. Use of this mode is ideal for
multi-tasking, as the CPU can take care of other tasks during the data transfer. The
device will generally trigger an IRQ after the transfer, so that the next DMA transfer can
be prepared in the ISR handling the procedure.
In a DMA operation the data transfer takes place without CPU intervention : the data bus
is directly driven by the I/O device and the DMAC(Direct Memory Access controller).
Therefore, when the kernel sets up a DMA operation, it must write the bus address of
the memory buffer involved in the proper I/O ports of the DMAC or I/O device.

Q How can a driver be implemented ? Explain with the following functions :

setup init
open release
read write
IOCTL select

Ans : setup () :

The setup() function must initialize the hardware devices in the computer and set up the
environment for the execution of the kernel program. Although the BIOS has already
initialized most hardware devices, Linux does not rely on it but reinitializes the devices
in its own manner to enhance portability and robustness. Sometimes it is desirable to
pass parameters to a device driver or to the Linux kernel in general. These parameters
will come in the form of a command line from the Linux loader LILO. This command line
will be analyzed into its component parts by the function parse_options(). The
checksetup() function is called for each of the parameters and compares the beginning
of the parameter with the string stored in the bootsetups[ ] field, calling the
corresponding setup( ) function whenever these match. The checksetup() function will
attempt to convert the first ten parameters into integer numbers. If this is successful,
they will be stored in a field.

Init() :

The init() function is only called during kernel initialization, but is responsible for
important tasks. This function tests for the presence of a device, generates internal
device driver structures and registers the device.

The call to the init function must be carried out in one of the following functions,
depending on the type of device driver:


Character devices : chr_dev_init()

Block devices : blk_dev_init()
SCSI devices : scsi_dev_init()
Network devices : net_dev_init()
Before Linux can make use of the driver, it must be registered using the functions

The init() function is also the right place to test whether a device supported by the driver
is present at all. This applies especially for devices which cannot be connected or
changed during operation, such as hard disks.

Open () :

The open function is responsible for administering all the devices and is called as soon
as a process opens a device file. If only one process can work with a given device,
-EBUSY should be returned if another process wants to open the device. If a device can be
used by a number of processes at the same time, open() should set up the necessary
wait queues. If no device exists it should return -ENODEV. The open() function is also
the right place to initialize the standard settings needed by the driver.

Release() :

The release() function is only called when the file descriptor for the device is released.
The tasks of this function comprise cleaning-up activities global in nature, such as
clearing wait queues. For some devices it can also be useful to pass through to the
device all the data still in the buffers.

Read() & write() :

The read() and write() functions perform a similar task, that is, copying data from and to
application code. read() is called for input devices and write() for output devices:
input devices such as a mouse or keyboard can only be read from, and output devices
such as a printer can only be written to.

Ioctl() :

Each device has its own characteristics, which may consist of different operation modes
and certain basic settings. Device parameters such as IRQs, I/O addresses and so on may
also need to be set at run-time. ioctl() calls usually only change variables
global to the driver or global device settings.

Select() :

The select() function checks whether data can be read from the device or written to it. If
the device is free or the argument wait is NULL, the device is only checked. If it is ready
for the function concerned, select() returns 1, otherwise 0. If wait is not NULL, the
process must be held up until the device becomes available.

Chapter 8 : Network Implementation

Q Define The Socket Structure ?

Ans : Sockets are used to handle communication links between applications over the network.
Communication between the client and the server takes place through sockets. To
communicate, the client and server programs bind a socket and establish a connection. The
socket programming interface provides for communication via a network as well as
locally on a single computer. The client socket sends a request to the server socket,
the server socket receives this request and sends an acknowledgement to the client, and
the client receives this ACK and sends a concluding ACK to the server. The connection is
now established.

A socket is represented in the kernel by the data structure socket.

struct socket {
    short type;
    socket_state state;
    long flags;
    struct proto_ops *ops;
    void *data;
    struct inode *inode;
    struct fasync_struct *fasync_list;
    struct file *file;
};

type determines the type of protocol used in connection. Valid entries for type are
SOCK_STREAM, SOCK_DGRAM and SOCK_RAW. Sockets of the type
SOCK_STREAM are used for TCP connections, SOCK_DGRAM for the UDP protocol
and SOCK_RAW for sending and receiving IP packets.
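
The relation between the socket type and the protocol can be sketched with the constants from the standard socket headers. The function name default_protocol() is hypothetical, and returning IPPROTO_RAW for SOCK_RAW is an assumption made for this illustration; a real raw socket is given its protocol explicitly by the caller.

```c
#include <assert.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Maps a socket type to the IP protocol number usually associated with it.
 * Returns -1 for a type that is not covered by the text. */
int default_protocol(int type)
{
    switch (type) {
    case SOCK_STREAM:
        return IPPROTO_TCP;   /* connection-oriented byte stream */
    case SOCK_DGRAM:
        return IPPROTO_UDP;   /* connectionless datagrams */
    case SOCK_RAW:
        return IPPROTO_RAW;   /* assumption: raw IP packets */
    default:
        return -1;
    }
}
```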

In state, the current state of the socket is stored. The most important states are

flags is used to store additional values for the socket, such as SYN_SENT when a client
sends a SYN to the server.

The ops pointer points to the operation vector proto_ops, where the specific operations
for this protocol are entered.

The data pointer points to the substructure of the socket corresponding to the protocol.

There is also an inode for each BSD socket. A reference to the corresponding inode is
stored in inode, whereas file holds a reference to the primary file structure associated
with this inode.

fasync_list holds the list of processes that have asked to be notified asynchronously
about events on this socket.

Q Define the Network devices under Linux?

Ans : There is a great variety of hardware that can be used to connect computers. The data
structure device controls an abstract network device; it describes
the hardware device. Some of the devices used in networks are as follows :

Ethernet :
Linux supports two groups of adaptors for Ethernet. These include on the one hand the
classic Ethernet cards connected to the PC bus, and on the other adaptors linked to the
PC via the PCMCIA bus.

The network devices for Ethernet cards are named eth0, ..., eth3. Whenever the computer
is started, the network cards are detected. The kernel outputs a message on the cards
detected and their allocation to the network devices. Two types of card are popular in
networks, the WD8013 and the NE2000. WD8013 cards are not compatible with
some hardware, whereas NE2000 cards are generally supported by most hardware.

Every Ethernet adaptor has a completely unique address. These addresses are 6 bytes
long. Ethernet can carry packets of various protocols, such as IP, ARP and IPX. The
type field determines which type of packet is being sent or received.

Q Difference between SLIP and PLIP devices?


The difference between SLIP and PLIP is that SLIP uses the computer’s
serial interface for data transfer, while PLIP transfers data via the parallel port. A
SLIP device sends data one bit at a time, whereas a PLIP device sends several bits at a
time in parallel, the number depending on the device. PLIP
enables a very powerful link to be set up between two computers. SLIP is the simplest
way of connecting a computer or a local network to the Internet via a serial link, for
example a modem connection to the telephone network. SLIP and PLIP both differ from
Ethernet in that they can only transmit IP packets.

The loopback device :

The loopback device allows applications on the local computer to communicate with
each other using sockets. It is comparable to a null modem: suppose you have written an
application that communicates with another computer via a modem. To check the
application without a second computer you can use a null modem, which returns the
data sent to the sending computer. The loopback device does the same for network
packets.

The dummy device :

In the case of the dummy device no real hardware device is present. To use the
network facilities, it is enough to load the device driver for this network device. For
example, if you wish to connect to the Internet, loading the driver provides a network
device on which the network services can be enabled.

Q Define the following :


Ans : IP (Internetwork Protocol) :

The IP layer provides a packet transfer service - that is, it can be given a packet and the
address of the recipient and it will take care of the transfer. It is an unreliable and connectionless
datagram protocol- a best-effort delivery service. The term best-effort means that IP
provides no error checking or tracking. IP assumes the unreliability of the underlying
layers and does its best to get a transmission through to its destination, but with no
guarantees. Transmissions along physical networks can be destroyed for a number of
reasons. Noise can cause bit errors during transmission across a medium; a congested
router may discard a datagram if it is unable to relay it before a time limit runs out.

The following much simplified picture describes the tasks of the IP layer.

The schematic flow of the outgoing packet stream of IP is as follows :

•Receipt of a packet.

•Option handling.

•Routing to the destination address.

•Generating the hardware header.

•Creating the IP packet. This involves generating an IP header, which is simply added to
the hardware header along with the data packet.

•Fragmenting the IP packet, if the IP packet is too large for the device.

•Passing the IP packet to the appropriate network device.

The schematic flow of the incoming packet stream of IP is :

•Checking the IP header.

•Comparing the destination address with the local address.

•Decrementing the ttl (time to live) field, which limits the number of hops a packet may still travel.

•Defragmenting the IP packet.

•Forwarding the packet to the next protocol.
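
Two of the incoming steps, checking the IP header and decrementing the ttl, can be sketched in plain C. This is a simplified user-space illustration, not the kernel's code: the function names ip_checksum() and ip_forward_check() are hypothetical, the header is treated as a raw byte array, and the checksum is the standard Internet checksum (one's complement sum of 16-bit words). The byte offsets 8 (ttl) and 10-11 (checksum) follow the IPv4 header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over len bytes (len must be even).
 * A header whose checksum field is correct yields 0. */
uint16_t ip_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    assert(len % 2 == 0);
    for (size_t i = 0; i < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Returns 1 if the packet may be passed on (ttl decremented and checksum
 * refreshed), 0 if it must be dropped. */
int ip_forward_check(uint8_t *hdr, size_t len)
{
    if (len < 20 || ip_checksum(hdr, len) != 0)
        return 0;                           /* damaged header */
    if (hdr[8] <= 1)
        return 0;                           /* ttl used up */
    hdr[8]--;                               /* one hop consumed */
    hdr[10] = hdr[11] = 0;                  /* recompute the checksum */
    uint16_t c = ip_checksum(hdr, len);
    hdr[10] = (uint8_t)(c >> 8);
    hdr[11] = (uint8_t)(c & 0xff);
    return 1;
}
```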

Q Define the FIB (Forwarding Information Base) ? (5)

A route must be established by IP for every packet that is sent. The decision on
whom the packet is sent to, and via which network device, is made by reference to the
Forwarding Information Base (FIB). The FIB uses structures of type struct fib_zone, each
of which is responsible for one zone. A zone denotes all routes that have the same route
mask. Thus, all host routes are in the same zone.

The fib_node and fib_info structures hold all the information for a given route. The
information is divided into two structures because much of the information for different
routes is identical. For fast access, there is in addition a hash table of
struct rtable entries, which holds references to all routes.

When a network device is deactivated, the transfer of packets via this device is no longer
possible. This means that routes in the table which refer to this device are no longer
operable, and they are therefore automatically deleted from the table when a device is
taken off the network.
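
The idea of zones can be illustrated with a much simplified lookup. The structure demo_route and the function demo_lookup() are hypothetical and far smaller than the real fib_zone, fib_node and fib_info structures. The sketch assumes that zones are searched from the most specific mask (host routes) down to the default route, so that the most specific matching route wins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified route entry. */
struct demo_route {
    uint32_t dest;      /* network address in host byte order */
    int prefix_len;     /* zone: number of leading 1 bits in the route mask */
    const char *dev;    /* name of the outgoing network device */
};

/* Route mask that belongs to a zone. */
static uint32_t mask_of(int prefix_len)
{
    return prefix_len == 0 ? 0 : 0xffffffffu << (32 - prefix_len);
}

/* Searches zone 32 (host routes) first and zone 0 (default route) last. */
const struct demo_route *demo_lookup(const struct demo_route *tab, size_t n,
                                     uint32_t addr)
{
    assert(tab != NULL || n == 0);
    for (int zone = 32; zone >= 0; zone--)
        for (size_t i = 0; i < n; i++)
            if (tab[i].prefix_len == zone &&
                (addr & mask_of(zone)) == tab[i].dest)
                return &tab[i];
    return NULL;        /* no route to this address */
}
```

Deleting the routes of a deactivated device then amounts to removing every entry whose dev field names that device.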

Q Define the IP packet filters & IP accounting and IP firewalling ? (5)

With IP packet filters, a very powerful tool has been placed in the hands of network
administrators. Using these filters, they can specify very precisely which IP packets may
be sent or received. In a big organization with a large number of computers, the
administrator can use IP packet filters to stop users from sending or receiving
unnecessary data. A filter consists of a list of packet patterns. If a packet matches a
pattern in the list it will be recognized by the corresponding filter.

The characteristics of IP packet filters are used by IP accounting and IP firewalling. In IP
accounting, the complete network traffic is traced, so the administrator can check which data is
sent or received over the network. In firewalling, since a firewall machine is always located at a
gateway, the checking mechanisms can be implemented relatively easily.
call_in_firewall() and call_out_firewall() restrict the receiving and sending of IP packets
respectively.
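
A filter as a list of packet patterns can be sketched as follows. All names here (demo_pattern, demo_filter(), DEMO_ACCEPT, DEMO_DENY) are hypothetical and do not reproduce the kernel's firewall structures. The sketch assumes that the first matching pattern decides, that a default policy applies when nothing matches, and that each pattern keeps a packet counter of the kind IP accounting would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_verdict { DEMO_ACCEPT, DEMO_DENY };

/* Hypothetical packet pattern: masked addresses plus destination port. */
struct demo_pattern {
    uint32_t src, src_mask;
    uint32_t dst, dst_mask;
    uint16_t dport;             /* 0 matches any port */
    enum demo_verdict verdict;
    unsigned long packets;      /* accounting: packets that matched */
};

/* Walks the list; the first matching pattern decides, else the policy. */
enum demo_verdict demo_filter(struct demo_pattern *list, size_t n,
                              uint32_t src, uint32_t dst, uint16_t dport,
                              enum demo_verdict policy)
{
    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        struct demo_pattern *p = &list[i];
        if ((src & p->src_mask) == p->src &&
            (dst & p->dst_mask) == p->dst &&
            (p->dport == 0 || p->dport == dport)) {
            p->packets++;
            return p->verdict;
        }
    }
    return policy;
}
```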

TCP (Transmission Control Protocol) :

The Transmission Control Protocol (TCP) provides full transport layer services to
applications. TCP is a reliable stream transport port-to-port protocol. The term stream, in
this context, means connection-oriented : a connection must be established between
both ends of a transmission before either may transmit data. By creating this connection,
TCP generates a virtual circuit between sender and receiver that is active for the
duration of a transmission. Reliability is ensured by provision for error detection and
retransmission of damaged frames; all segments must be received and acknowledged
before the transmission is considered complete and the virtual circuit is discarded.

The TCP protocol has to be implemented with correct timing behavior. For this purpose the
reset_timer(), delete_timer() and net_timer() functions are used.

In a TCP connection, the client uses the function connect() to set up a connection to the
server. The function sends a SYN to the server and then goes over to the SYN_SENT
state. The process is now blocked until it receives the SYN/ACK from the server. The server,
which is waiting after a call to listen(), receives the SYN and sends a SYN/ACK to the
client. The client receives this SYN/ACK and sends a concluding ACK, and the connection
is now established.

By calling the close() function the client sends a FIN to the server. The server receives this FIN,
releases all the information related to that particular client and sends an ACK to the client,
and the connection is now terminated.
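
The connection set-up can be written as a small state machine. This is only a sketch of the handshake: the enum and function names are hypothetical, the states are a subset of the real TCP states, and closing the connection is left out.

```c
#include <assert.h>

enum tcp_state { ST_CLOSED, ST_LISTEN, ST_SYN_SENT, ST_SYN_RECV, ST_ESTABLISHED };
enum tcp_event { EV_CONNECT, EV_LISTEN, EV_RCV_SYN, EV_RCV_SYNACK, EV_RCV_ACK };

/* Returns the state that follows state s when event e occurs.
 * Events that are not expected in a state leave it unchanged. */
enum tcp_state tcp_step(enum tcp_state s, enum tcp_event e)
{
    assert(s >= ST_CLOSED && s <= ST_ESTABLISHED);
    switch (s) {
    case ST_CLOSED:
        if (e == EV_CONNECT) return ST_SYN_SENT;       /* client sends SYN */
        if (e == EV_LISTEN)  return ST_LISTEN;         /* server waits */
        break;
    case ST_LISTEN:
        if (e == EV_RCV_SYN) return ST_SYN_RECV;       /* server sends SYN/ACK */
        break;
    case ST_SYN_SENT:
        if (e == EV_RCV_SYNACK) return ST_ESTABLISHED; /* client sends ACK */
        break;
    case ST_SYN_RECV:
        if (e == EV_RCV_ACK) return ST_ESTABLISHED;    /* handshake complete */
        break;
    default:
        break;
    }
    return s;
}
```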

UDP (User Datagram Protocol) :

UDP is not reliable and is therefore used only when there is little data to be
transmitted and there is not much distance between the sender and the receiver. In
UDP, there is no guarantee that a data packet sent will reach its destination. If the
network traffic is high, or the receiving program is handling multiple requests from other
programs, there is a chance of the datagram being lost.

The UDP protocol does not make the use of a checksum mandatory. It does not provide
any sequencing or reordering functions and cannot specify the damaged packet when
reporting an error. Nor can it specify which packet has been lost.

ARP (Address Resolution Protocol) :

The address resolution protocol (ARP) associates an IP address with the hardware
address. The task of the ARP is to convert the abstract IP addresses into real hardware
addresses. This conversion is required because a hardware network cannot do anything
with IP addresses. The Linux ARP is capable of mapping Ethernet addresses, arcnet
addresses and AX.25 addresses to the corresponding IP addresses.
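
The mapping from an IP address to a hardware address can be pictured as a small table. The sketch below is a user-space illustration with hypothetical names (demo_arp_table, demo_arp_set(), demo_arp_lookup()); it uses a flat array of Ethernet addresses only and is not the kernel's ARP implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ARP_SIZE 16

/* Hypothetical table entry: one IP address and its 6-byte Ethernet address. */
struct demo_arp_entry {
    int used;
    uint32_t ip;
    unsigned char hw[6];
};

struct demo_arp_table {
    struct demo_arp_entry e[DEMO_ARP_SIZE];
};

/* Inserts or updates a mapping; returns 0, or -1 if the table is full. */
int demo_arp_set(struct demo_arp_table *t, uint32_t ip, const unsigned char hw[6])
{
    struct demo_arp_entry *slot = NULL;
    assert(t != NULL && hw != NULL);
    for (int i = 0; i < DEMO_ARP_SIZE; i++) {
        if (t->e[i].used && t->e[i].ip == ip) {
            slot = &t->e[i];            /* existing entry: update it */
            break;
        }
        if (!t->e[i].used && slot == NULL)
            slot = &t->e[i];            /* remember the first free slot */
    }
    if (slot == NULL)
        return -1;
    slot->used = 1;
    slot->ip = ip;
    memcpy(slot->hw, hw, 6);
    return 0;
}

/* Returns the hardware address for ip, or NULL if it is unknown.
 * A real ARP implementation would now send a request on the network. */
const unsigned char *demo_arp_lookup(const struct demo_arp_table *t, uint32_t ip)
{
    for (int i = 0; i < DEMO_ARP_SIZE; i++)
        if (t->e[i].used && t->e[i].ip == ip)
            return t->e[i].hw;
    return NULL;
}
```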

The reverse function is handled by RARP (reverse ARP). Unlike ARP, the RARP in
Linux can at present only convert Ethernet addresses into IP addresses.

The central element in address resolution is the ARP table, which consists of an array of
pointers to structures of the type arp_table. A further facility offered by Linux is ‘proxy’
ARP. This enables subnetworks which should really be directly interconnected by
hardware to be separated.

Chapter 9 : Modules and debugging

Q What are modules? How are they implemented in the kernel ?

Ans : Modules are components of the Linux kernel that can be loaded and attached to it as
needed. To add support for a new device, you can simply instruct the kernel to load
its module. In some cases, you may have to recompile only that module to provide
support for your device. The use of modules has the added advantage of reducing the
size of the kernel program. The kernel can load modules into memory only as they are
needed. For example, the modules for a block device and a file system are loaded
when you use that device and that file system.

Implementation in the kernel :

Linux provides three system calls, create_module, init_module and delete_module,
for the implementation of Linux modules. A further system call is used by the user process to
obtain a copy of the kernel’s symbol table.

The administration of modules under Linux makes use of a list in which all the modules
loaded are included. This list also administers the modules’ symbol tables and

As far as the kernel is concerned, modules are loaded in two steps corresponding to the
system calls create_module and init_module. For the user process, this procedure
divides into four phases.

The process fetches the content of the object file into its own address space. To get the
code and data into a form in which they can actually be executed, the actual load
address must be added at various points. This process is known as relocating.

The system call create_module is now used, firstly to obtain the final address of the
object module and secondly to reserve memory for it. To do this, a structure
module is entered for the module in the list of modules and the memory is
allocated. The return value gives the address to which the module will later be
loaded.

The load address received from create_module is used to relocate the object file. This
procedure takes place in a memory area belonging to the process: for a user process
in the user area, and for a kernel process in the kernel area.

When a module is already in use by one process and another process wishes to use
it, the module loaded earlier is used. This mechanism is known as
module stacking.

Once the preliminary work is complete, the object module can be loaded. This uses the
system call init_module. The cleanup() function is called when the module is
removed.

By using the system call delete_module, a module that has been loaded can be
removed again. Two preconditions need to be met for this: there must be no
references to the module and the module’s use counter must hold a value of
zero.
Q Define the Kernel Daemon ?

Ans : The kernel daemon is a process which automatically carries out the loading and removing of
modules without the system user noticing it. For example, whenever a file on a floppy disk
is accessed, the kernel daemon loads the block device module for handling the device
and the file system module for the particular file system. But how does the kernel
daemon know that modules need to be loaded ?

Communication between the Linux kernel and the kernel daemon is carried out by
means of IPC. The kernel daemon opens a message queue with the new flag
IPC_KERNELD. The kernel sends messages to the kernel daemon using the
kerneld_send function. A request is stored in a kerneld_msg struct, which includes
the following information :

mtype : contains the message type
id : indicates whether the kernel expects an answer
pid : holds the PID of the process that triggered the kernel request

Responsibility for loading and releasing modules lies with the functions :

request_module : the kernel requests the loading of a module and waits until
the operation has been carried out
release_module : removes a module
delayed_release_module : allows a module to be removed after a specified delay
cancel_release_module : cancels a delayed removal of a module

Q Define the Debugging ?

Ans : Debugging is the process of finding the errors in a program and correcting them, including
errors that only occur at run time. Only in a few cases will a section of
program code be free of bugs as soon as it is written. Usually the program will need
debugging, for which it will be loaded into a debugger such as gdb and run step by step
until the error has been found.

The most common debugging technique is monitoring. When you are debugging kernel
code, you can accomplish this goal with printk.

Printk :

With printk, check points are created in the code, and an appropriate
message is printed when an error occurs. For example, whenever kernel code
wishes to access the data or code of a user segment process, the verify_area() function
is called, which checks the area concerned; if an error is found, printk
can be called to print an appropriate message.

Gdb - GNU debugger :

Execution tracing is a technique that allows a program to monitor the execution of
another program. The traced program can be executed step by step, until a signal is
received, or until a system call is invoked. Execution tracing is widely used by
debuggers, together with other techniques like the insertion of breakpoints in the
debugged program and run-time access to its variables. In Linux, execution tracing is
performed through the ptrace() system call, and the gdb debugger is built on ptrace().
With gdb the code and data can be inspected; when an error is found it can be corrected
and execution resumed, otherwise an appropriate message is printed.

Chapter 10 : Multi-processing

Q Define the SMP ?

Ans : Most systems are single-processor systems; that is, they have only one main CPU. But
sometimes applications require more processing power. In this situation
multiple processors are used in close communication, sharing the computer bus, the clock, and
sometimes memory and peripheral devices. The most common multiple-processor
systems now use the symmetric-multiprocessing (SMP) model, in which each
processor runs an identical copy of the operating system, and these copies
communicate with one another as needed.

Most of the currently available multi-processor main boards for PCs use i486, Pentium or
Pentium Pro processors. The Pentium already has some internal functions which
support multi-processor operation, such as cache synchronization, inter-processor
interrupt handling.

It defines a highly symmetrical architecture in terms of :

Q Difference between Memory symmetry and I/O symmetry ?

Memory Symmetry :

All processors share the same main memory; in particular, all physical addresses are the
same. This means that all processors execute the same operating system, and all data and
applications are visible to all processors and can be used or executed on every
processor.

I/O Symmetry :

All processors share the same I/O subsystem (including the I/O ports and the interrupt
controller). I/O symmetry allows reduction of a possible I/O bottleneck. However, some
MP systems assign all interrupts to one single processor, while others use the
I/O APIC (Advanced Programmable Interrupt Controller) to distribute them. All CPUs are
connected by the ICC (Interrupt Controller Communications) bus.

One processor is chosen by the BIOS; it is called the boot processor (BSP) and is
used for system initialization. All other processors are called application processors
(AP) and are initially halted by the BIOS.

Problems with multi-processor systems :

For the correct functioning of a multi-tasking system it is important that data in the kernel
can only be changed by one processor at a time, so that identical resources cannot be allocated
twice. One solution is coarse-grained locking; sometimes even the whole kernel is locked so
that only one process can be present in the kernel. The alternative is finer-grained
locking, which is normally used only for multi-processor and real-time operating systems.

In the Linux kernel implementation, various rules were established :

No process running in kernel mode is interrupted by another process running in kernel
mode, except when it releases control and sleeps.

Interrupt handling can interrupt a process running in kernel mode, but in the end
control is returned to this same process. A process can block interrupts and
thus make sure that it will not be interrupted.

Interrupt handling cannot be interrupted by a process running in kernel mode. This
means that the interrupt handling will be processed completely, or at most be
interrupted by another interrupt of higher priority.

In the development of the multi-processor LINUX kernel a decision was made to
maintain these three basic rules. All processes use one single semaphore to monitor
the transition to kernel mode. This semaphore is used to ensure that no process running in
kernel mode can be interrupted by another process. Furthermore, it guarantees that only
a process running in kernel mode can block the interrupts without another process taking
over the interrupt handling.

Changes to the Kernel :

In order to implement SMP in the LINUX kernel, changes have to be made :

Kernel Initialization :

The first problem with the implementation of multi-processor operation arises when
starting the kernel. Initially the BIOS runs the boot processor and halts all APs. Only
this processor enters the kernel starting function start_kernel(). After it has executed
the normal LINUX initialization, smp_init() is called. This function activates all other
processors by calling smp_boot_cpus().

Scheduling :

The LINUX scheduler is responsible for allocating the processor to the running
processes. It shows only slight changes. First of all, the task
structure now has a processor component which contains the number of the running
processor. The last_processor component contains the number of the processor
which processed the task last.

Message exchange between processors :

Messages in the form of inter-processor interrupts are handled via interrupts 13 and
16. Interrupt 13 is defined as a fast interrupt which does not need the
kernel lock and can thus always be processed. Interrupt 16 is a slow interrupt which
waits for the kernel lock and can trigger scheduling. It is used to start the schedulers
on the other processors.

Entering kernel mode :

The kernel is protected by a single semaphore. All interrupt handlers, syscall routines
and exception handlers need this semaphore and wait in a processor loop until the
semaphore is free.
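
The idea of one lock guarding entry into kernel mode, with waiters spinning in a loop, can be sketched with C11 atomics. This is a user-space illustration only: the names demo_lock_kernel(), demo_try_lock_kernel() and demo_unlock_kernel() are hypothetical, and the real kernel lock also has to deal with interrupts, which this sketch ignores.

```c
#include <assert.h>
#include <stdatomic.h>

/* The single lock that protects the whole (imaginary) kernel. */
static atomic_flag kernel_flag = ATOMIC_FLAG_INIT;
static int holders;   /* sanity counter: never more than one holder */

/* Spins in a processor loop until the lock is free, then takes it. */
void demo_lock_kernel(void)
{
    while (atomic_flag_test_and_set_explicit(&kernel_flag, memory_order_acquire))
        ;   /* busy-wait */
    holders++;
}

/* Takes the lock only if it is free; returns 1 on success, 0 otherwise. */
int demo_try_lock_kernel(void)
{
    if (atomic_flag_test_and_set_explicit(&kernel_flag, memory_order_acquire))
        return 0;
    holders++;
    return 1;
}

/* Leaves kernel mode and lets the next waiter in. */
void demo_unlock_kernel(void)
{
    assert(holders == 1);
    holders--;
    atomic_flag_clear_explicit(&kernel_flag, memory_order_release);
}
```

A system call routine or interrupt handler would call demo_lock_kernel() on entry and demo_unlock_kernel() on exit, so that at most one of them runs inside the protected region at any time.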

Interrupt Handling :

Interrupts are distributed to the processors by the I/O APIC. At system start,
however, all interrupts are forwarded only to the BSP. Each SMP operating system
must therefore switch the APIC into SMP mode, so that other processors too can
handle interrupts.

Linux does not use this operating mode, that is, during the whole time the system is
operating, interrupts are only delivered to the BSP. This compromises the latency