
UNIT VI

FILE MANAGEMENT

File System
Disks are the most common medium for the long-term storage of information.
The file system provides a convenient mechanism to store and retrieve the data
and programs from this medium.
A file is a collection of related information, the meaning of which is defined by
its creator or author.
The OS maps these files onto disks or other storage media.
Files themselves are organized into directories.
A file is a named collection of related information that is recorded on secondary
storage such as magnetic disks, magnetic tapes and optical disks. In general, a
file is a sequence of bits, bytes, lines or records whose meaning is defined by the
file's creator and user.
A file allows one to write, read, save and retrieve the program and data on any
type of storage device.

11/10/2014

ChonuChinnus

However, a file is a logical concept; it does not by itself exist on permanent media.
Therefore, when the work is saved, the file must be mapped onto the storage device.
(Note: This mapping is background work that the user does not see.)

The system performs all the work required to map the logical file to secondary
storage.
The OS hides the actual storage of programs and data from the user and
provides a logical and convenient file concept.


File Structure
The basic element of data is the field, a single-valued item such as a name, a date
or an employee ID.
A field is characterized by its length and data type.
When multiple fields are combined to form a meaningful collection, the result is
known as a record.
For example, a student's record can be one record, consisting of fields such as
roll number, name, qualification and so on.
A collection of such similar records is known as a file.
Like a field or a record, a file has a name. Thus, a file is treated as a single
entity that may be used by a programmer or an application.
A file can also be flat, that is, an unstructured sequence of bytes with no fields
or records. Such a file is seen simply as a sequence of bytes by the programs that
use it.
UNIX and Windows use this kind of file structure.
However, the OS must support a defined structure for certain types of file. For
example, an executable file must have a defined structure, so that the OS can
determine where in memory to load the file and where to find the first instruction
to be executed.
[Figure: the fields Roll No, Name, Class and Address together form a record; a collection of such records forms a file.]

File Structure:

Internal Structure and Record Blocking

Locating an offset within a logical file may be difficult for an OS. Since the logical
file is mapped to secondary storage, it is better to define the internal
structure of a file in terms of the units of secondary storage.

In general, the disk is used for secondary storage, and hence the block is taken as
the unit of storage.
The block unit needs to be mapped to the logical file structure as well.
For example, a file is considered a stream of bytes in UNIX. Each byte in the file
can be located by its offset from the start of the file.
Consider a logical record size of 1 byte, and assume that bytes or records of the
file are packed into disk blocks, for example, 512 bytes to one disk block. In this
way, a file may be considered a sequence of blocks, and all basic I/O functions are
performed in terms of blocks. This is known as Record Blocking.
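The mapping from a byte offset in the logical file to a disk block can be sketched in a few lines. This is only an illustration using the 512-byte block size from the example above; the function names locate and blocks_needed are hypothetical, not part of any OS interface.

```python
BLOCK_SIZE = 512  # bytes per disk block, as in the example above


def locate(offset, block_size=BLOCK_SIZE):
    """Map a byte offset in a logical file to (relative block, offset inside that block)."""
    return divmod(offset, block_size)


def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Number of blocks needed to hold file_size bytes (ceiling division)."""
    return -(-file_size // block_size)


print(locate(0))            # (0, 0)
print(locate(1300))         # (2, 276)
print(blocks_needed(1300))  # 3
```

Byte 1300 therefore lives in the third block of the file (relative block 2), 276 bytes from the start of that block.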

The larger the block, the more records are mapped onto each disk block and, in
turn, the more bytes are transferred in one I/O operation. This is advantageous
when the file is searched sequentially, because it reduces the number of I/O
operations.
Another issue in record blocking is whether the blocks should be of fixed or
variable size.
Based on this issue, there are three methods of blocking:
Fixed Blocking
Variable-length Spanned Blocking
Variable-length Unspanned Blocking


Fixed Blocking: In this method, fixed-size records are used. There may therefore
be a mismatch between the sizes of records and blocks, leaving some unused
space in the blocks. This causes internal fragmentation. In the figure below, R1, R2
and R3 fit in the fixed blocks but R4 does not, leaving some space in the last block.
Space may also be left unused if there is not enough room for a whole block at
the end of a track of the disk. This method is advantageous when
sequential files are used.
[Figure: fixed blocking of records R1, R2, R3 and R4.]
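The internal fragmentation caused by fixed blocking is simple arithmetic. The sketch below assumes that records are not split across blocks; the sizes (512-byte blocks, 120-byte records) and the function name fixed_blocking are chosen only for illustration.

```python
def fixed_blocking(block_size, record_size):
    """Return (records per block, unused bytes per block) for fixed-size records."""
    if record_size > block_size:
        raise ValueError("a fixed-size record must fit in one block in this sketch")
    per_block = block_size // record_size
    wasted = block_size - per_block * record_size   # internal fragmentation
    return per_block, wasted


print(fixed_blocking(512, 120))  # (4, 32)
print(fixed_blocking(512, 128))  # (4, 0)
```

With 120-byte records, four records fit in a block and 32 bytes of every block stay unused; with 128-byte records the fit is exact.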

Variable-length Spanned Blocking: In this method, variable-length records are
used instead of fixed-size ones. Therefore, some records may span more than
one block, continuing from one block into the next. This happens when a block is
smaller than the record. For example, in the figure below, R3 spans two
variable-sized blocks.
[Figure: variable-length spanned blocking of records R1 to R4, with R3 split across two blocks.]


Variable-length Unspanned Blocking: In this method, records of variable length
are used, but a record is never split across blocks. A block that is too small for
the next record is left unused, and the record is allocated to a bigger block. This
wastes block space.
[Figure: variable-length unspanned blocking of records R1 to R4.]
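The difference between the spanned and unspanned methods can be seen by packing the same variable-length records both ways. The sketch below makes a simplifying assumption that all blocks have the same size (the slides speak of variable-sized blocks); the record sizes and the function names are illustrative only.

```python
def pack_spanned(records, block_size):
    """Pack records back to back; a record may continue into the next block."""
    total = sum(records)
    blocks = -(-total // block_size)           # ceiling division
    wasted = blocks * block_size - total       # slack appears only in the last block
    return blocks, wasted


def pack_unspanned(records, block_size):
    """Pack records in order; a record that does not fit starts a new block."""
    blocks, free, wasted = 0, 0, 0
    for size in records:
        if size > block_size:
            raise ValueError("record larger than a block cannot be stored unspanned")
        if size > free:
            wasted += free                     # the rest of the current block is abandoned
            blocks += 1
            free = block_size
        free -= size
    return blocks, wasted + free


records = [300, 300, 300, 100]                 # record lengths in bytes
print(pack_spanned(records, 512))              # (2, 24)
print(pack_unspanned(records, 512))            # (3, 536)
```

Spanning uses fewer blocks and wastes less space, at the cost of sometimes needing two block reads to fetch one record.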

File Naming and File Types


A file needs to be named, so that a user can store and retrieve the information
from the storage device.
The rules for naming files vary from system to system. In
general, the file name has two parts, separated by a period (.).
The first part is the name of the file, defined by the user, and the second part is
known as the extension. The extension usually indicates what type of file
has been created. The extension is generally composed of three letters, as in
MS-DOS.
An OS must recognize the type of the file, because the operations performed on it
depend on its type.
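Splitting a file name into its two parts can be sketched with Python's standard os.path.splitext. The helper name split_name and the choice to lower-case the extension are assumptions made for this illustration.

```python
import os.path


def split_name(filename):
    """Split a file name into (name, extension without the period)."""
    name, ext = os.path.splitext(filename)
    return name, ext[1:].lower()


print(split_name("report.txt"))   # ('report', 'txt')
print(split_name("PROG.C"))       # ('PROG', 'c')
print(split_name("README"))       # ('README', '')
```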



Based on the extension of the file name, there are several types of files, such as
the following:
Source Code File: It contains a program written in the chosen language. For
example, a C language program is written in a file named filename.c.
Object File: It is a file in compiled, machine-language format, generated when
the source code file has been compiled successfully. Its extension may be .obj
or .o.

Executable File: When an object file has been linked properly and is ready to
run, it is known as an executable file. Its extension may be .exe, .com, .bin, etc.
Text File: A general plain-text document is known as a text file. Its
extension is .txt.
Batch File: It is a file consisting of commands to be executed by the
command interpreter. Its extension is .bat.


Archive File: When a group of files is compressed into a single file, the result is known
as an archive file. Its extension may be .zip, .rar, etc.
Multimedia File: It is a file containing audio, video or image information, with
extensions such as .mpeg, .mov, .mp3, .jpg, etc.
Many other types of files may also exist today, depending on the necessity and
support provided by the OS.
Regular File: Regular files contain user information, which may be either ASCII
or binary.
Directory: A directory is a file type used to organize a list of files into a group, i.e., it
organizes the files in a hierarchy. It is an ordinary file that any user can read,
but only the file system is permitted to write to it.
Special: This file contains no data but provides a mechanism that maps physical
devices to file names, i.e., it is used to access I/O devices. Character special
files model serial I/O devices such as terminals and printers, and block special
files model devices such as disks.
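On systems that expose these file types, a program can ask the OS which kind a path is. The sketch below uses Python's standard os.stat and stat modules; the function name file_kind and the temporary file are assumptions for the demonstration.

```python
import os
import stat
import tempfile


def file_kind(path):
    """Classify a path as regular, directory, character special or block special."""
    mode = os.stat(path).st_mode
    if stat.S_ISREG(mode):
        return "regular"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    return "other"


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "notes.txt")
    with open(path, "w") as f:
        f.write("hello")
    kinds = (file_kind(path), file_kind(d))

print(kinds)  # ('regular', 'directory')
```

On a typical UNIX-like system, a terminal device would be expected to report as character special and a disk device as block special, but device paths differ between systems, so none is hard-coded here.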

File Attributes
Besides its name and data, a file has other attributes as well. These attributes
vary from system to system.
Some attributes, however, are common to nearly every system, for example, the date
and time of creation of a file. Some attribute types are as follows:
General Information: Some attributes of a file are general, for example, name,
type, location, size, and time and date of creation.
Protection-related Attributes: A file may have access protection, so users
cannot access it however they like; they may use it only according to its access
rights, for example, read, write and execute permissions. A password for the file
and the creator/owner of the file also count as protection attributes.
Flags: Some flags control or enable a specific property of a file. Some of them
are:
1. Read-only Flag: It is used for making a file read-only. It is 0 for read/write and
1 for read-only.
2. Hidden Flag: It is used to hide a file in file listings. When it is set, the file is
hidden; otherwise, the file is displayed.
3. System Flag: It is used to designate a file as a system file. When it is set, the
file is a system file; otherwise, it is a normal one.
4. Archive Flag: It is used to keep track of whether the file has been backed up.
The OS sets it whenever the file is changed. The flag is 0 once the changed
file has been backed up.

5. Access Flag: It is used to convey how the file is accessed. It is set when the file
is accessed randomly; otherwise, the file is accessed sequentially.
Time of Last Change and Last Access: It is used to provide information about the
time when the file was last modified, and when it was last accessed.
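Several of these attributes can be inspected from a program. The sketch below uses Python's standard os.stat to read the size, the permission bits and the two timestamps of a temporary file; the file name and contents are made up for the demonstration, and which attributes exist depends on the OS.

```python
import os
import stat
import tempfile
import time

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "demo.txt")
    with open(path, "w") as f:
        f.write("twelve bytes")
    os.chmod(path, stat.S_IRUSR)                  # mark the file read-only for its owner
    info = os.stat(path)                          # fetch the attributes
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)   # restore write access so cleanup succeeds

size = info.st_size                               # general information: size in bytes
read_only = not (info.st_mode & stat.S_IWUSR)     # protection: owner write bit is cleared

print("size:", size)
print("permissions:", stat.filemode(info.st_mode))
print("read-only:", read_only)
print("last modified:", time.ctime(info.st_mtime))
print("last accessed:", time.ctime(info.st_atime))
```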


File Operations
A file is an abstract data type, so the operations that can be performed on
it must be defined. The OS provides a system call for each operation
on a file. The following operations are performed on a
file:
Create a File: Two steps are necessary to create a
file:
1. Space in the file system must be found for the file.
2. An entry for the new file must be made in the directory.

Write a File: The write operation needs the name of the file and the data to be
written. The OS must keep a pointer into the file for reading and writing. The
system must keep a write pointer to the location in the file where the next write
is to take place. The write pointer must be updated whenever a write occurs.

Saving a File: The contents of the file must be saved on the disk. For this, the OS
must find space on the disk and then write the contents there. The appropriate entry
is also made in the directory where the file is created.

Deleting a File: To delete a file, we search the directory for the named file.
Having found the associated directory entry, we release all file space, so that it
can be reused by other files, and erase the directory entry.
Open a File: Before a process uses a file for any operation, the file must be
opened. To open the file, the OS fetches its attributes and list of disk addresses
into main memory for quick access.
Close a File: A file that is no longer needed for any access may be closed. The close
operation frees the memory space used for its attributes and disk addresses. Closing
the file does not mean deleting it; the file is not removed from the disk.

Read a File: To read from a file, we use a system call that specifies the name of
the file and where (in memory) the next block of the file should be put. The
system needs to keep a read pointer to the location in the file where the next
read is to take place.
1. Since a process is usually either reading from or writing to a file, the current operation
location can be kept as a per-process current-file-position pointer.
2. Both the read and write operations use this same pointer, saving space and reducing system
complexity.

Append a File: This is similar to writing a file, except that the
operation is performed only at the end of the file. The OS locates the end of the
file, using the pointer, and then appends the data to be written.
Repositioning the Current Position Pointer: The directory is searched for the
appropriate entry, and the current-file-position pointer is repositioned to a given
value. Repositioning within a file need not involve any actual I/O. This file
operation is also known as a file seek.
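The operations described above map closely onto the low-level calls that Python's standard os module passes through to the OS. The following is a minimal sketch on a temporary file (the file name ops.dat and its contents are made up); it shows one position pointer being shared by reads and writes.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "ops.dat")

    # Create and open: a directory entry is made and a descriptor is returned.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)

    os.write(fd, b"hello world")                  # write advances the position pointer
    after_write = os.lseek(fd, 0, os.SEEK_CUR)    # query the pointer without moving it

    os.lseek(fd, 6, os.SEEK_SET)                  # reposition (seek): no data is transferred
    tail = os.read(fd, 5)                         # read uses the same pointer

    os.lseek(fd, 0, os.SEEK_END)                  # locate the end of the file, then append
    os.write(fd, b"!")
    final_size = os.fstat(fd).st_size

    os.close(fd)                                  # close: the file still exists on disk
    still_there = os.path.exists(path)
    os.remove(path)                               # delete: the directory entry is erased
    deleted = not os.path.exists(path)

print(after_write, tail, final_size, still_there, deleted)
# 11 b'world' 12 True True
```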

Implementation of File Operations


To perform any operation on a file, the file needs to be opened. The following
data structures are used for open files:
Open File Table: As the open operation fetches the attributes of the file to be
opened, the OS uses a data structure known as the open file table (OFT) to keep the
information about an open file. When an operation needs to be performed on the
file, the file is specified through an index into this table, so its directory entry
need not be searched every time.
The OFT is maintained per process, i.e., it holds the details of every file opened by
a process. The OFT stores the attributes of the open file.
The OFT maintains a counter known as open_count that keeps the count of the
opened files. Each open operation increments this counter, and each
close operation decrements it. When the counter reaches zero, the file
is no longer in use, and is closed.


System-wide Open File Table (SOFT): This is used to maintain the status of open
files for the system as a whole rather than for a single process.
An entry in the OFT points to an entry in the SOFT, where
the process-independent information about the open file is maintained, for
example, the location of the file on the disk, the file size and so on. Thus, once a
process opens a file, the file gets an entry in both the OFT and the SOFT.
Address of the File on Disk: The address at which the file resides on the disk
may be needed for each operation on the file. Therefore, to avoid reading the
address from the disk repeatedly, it is kept in memory.
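The interplay of the two tables can be illustrated with a toy simulation. Everything below is hypothetical: the class names, the fields and, in particular, the choice to keep open_count in the system-wide entry (a common textbook arrangement) are assumptions of this sketch, not a description of any particular OS.

```python
class SystemOpenFileTable:
    """Toy system-wide table: one entry per open file, shared by all processes."""

    def __init__(self):
        self.entries = {}   # file name -> process-independent information

    def open(self, name, location, size):
        entry = self.entries.setdefault(
            name, {"location": location, "size": size, "open_count": 0})
        entry["open_count"] += 1
        return entry

    def close(self, name):
        entry = self.entries[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:     # no process is using the file any more
            del self.entries[name]


class Process:
    """Toy process with its own open file table (OFT)."""

    def __init__(self, soft):
        self.soft = soft
        self.oft = []                    # the index into this list acts as the file handle

    def open(self, name, location=0, size=0):
        entry = self.soft.open(name, location, size)
        self.oft.append({"name": name, "position": 0, "soft_entry": entry})
        return len(self.oft) - 1

    def close(self, handle):
        name = self.oft[handle]["name"]
        self.oft[handle] = None          # free the per-process slot
        self.soft.close(name)


soft = SystemOpenFileTable()
p1, p2 = Process(soft), Process(soft)
h1 = p1.open("a.txt", location=14703, size=2048)
h2 = p2.open("a.txt")
print(soft.entries["a.txt"]["open_count"])  # 2
p1.close(h1)
print(soft.entries["a.txt"]["open_count"])  # 1
p2.close(h2)
print("a.txt" in soft.entries)              # False
```

Each process keeps its own position pointer in its OFT entry, while the location and size are stored once in the shared entry.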


File Access
Users need to retrieve the files stored on the disk, and there are
several ways to access a file. File access depends on the blocking strategy on
the disk and the logical structuring of records. The following are some file access
methods:
Sequential File Access: The file is accessed sequentially, i.e., the information in
the file is processed in the order in which it is stored, one record after the other.
Editors and compilers usually access files sequentially.
Reads and writes make up the bulk of the operations on a file.
The read operation, readnext(), reads the next portion of the file and automatically
advances a file pointer, which tracks the I/O location.
Similarly, the write operation, writenext(), appends to the end of the file and
advances to the end of the newly written material (the new end of file).
Such a file can be reset to the beginning, and on some systems, a program may
be able to skip forward or backward n records for some integer n, perhaps
only for n = 1.

Sequential access, which is depicted in the figure below, is based on a tape model of
a file and works as well on sequential-access devices as it does on random-access ones.

Direct Access: A file is made up of fixed-length logical records that allow
programs to read and write records rapidly in no particular order.
The direct-access method is based on a disk model of a file, since disks allow
random access to any file block.
For direct access, the file is viewed as a numbered sequence of blocks or records.
Thus, we may read block 14, then read block 53, and then write block 7. There
are no restrictions on the order of reading or writing for a direct-access file.

Direct-access files are of great use for immediate access to large amounts of
information. Databases are often of this type. When a query concerning a
particular subject arrives, we compute which block contains the answer and
then read that block directly to provide the desired information.

For the direct-access method, the file operations must be modified to include the
block number as a parameter.
Thus, we have read(n), where n is the block number, rather than readnext(), and
write(n) rather than writenext().
An alternative approach is to retain readnext() and writenext(), as with sequential
access, and to add an operation positionfile(n), where n is the block number.
Then, to effect a read(n), we would positionfile(n) and then readnext().

The block number provided by the user to the operating system is normally a
relative block number. A relative block number is an index relative to the
beginning of the file. Thus, the first relative block of the file is 0, the next is 1,
and so on, even though the absolute disk address may be 14703 for the first
block and 3192 for the second. The use of relative block numbers allows the
operating system to decide where the file should be placed and helps to prevent
the user from accessing portions of the file system that may not be part of her
file.
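Direct access by relative block number can be sketched with an ordinary seek followed by a read or write. The block size of 4 bytes, the in-memory stand-in for a file, and the names read_block and write_block are assumptions made to keep the example small.

```python
import io

BLOCK_SIZE = 4   # tiny block size so that the output is easy to follow


def read_block(f, n, block_size=BLOCK_SIZE):
    """Read relative block n of a file opened in binary mode."""
    f.seek(n * block_size)          # relative block number -> byte offset in the file
    return f.read(block_size)


def write_block(f, n, data, block_size=BLOCK_SIZE):
    """Overwrite relative block n with exactly one block of data."""
    if len(data) != block_size:
        raise ValueError("data must be exactly one block long")
    f.seek(n * block_size)
    f.write(data)


f = io.BytesIO(b"AAAABBBBCCCCDDDD")   # in-memory stand-in for a 4-block file
print(read_block(f, 2))    # b'CCCC'
print(read_block(f, 0))    # b'AAAA'
write_block(f, 1, b"xxxx")
print(f.getvalue())        # b'AAAAxxxxCCCCDDDD'
```

The program works only with relative block numbers; translating them to absolute disk addresses is left to the operating system.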

Simulation of Sequential Access on a Direct Access File
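One way to simulate sequential access on a direct-access file is to keep a current-position variable and express the sequential operations through the direct ones. The class below is a toy sketch: the name DirectFile, the variable cp and the list-of-blocks representation are all assumptions of this illustration.

```python
class DirectFile:
    """Toy direct-access file: a list of blocks addressed by relative block number."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.cp = 0                      # current position, used only by the sequential calls

    # Direct-access operations.
    def read(self, n):
        return self.blocks[n]

    def write(self, n, data):
        if n == len(self.blocks):
            self.blocks.append(data)     # writing one past the end extends the file
        else:
            self.blocks[n] = data

    # Sequential operations expressed in terms of the direct ones.
    def reset(self):
        self.cp = 0

    def read_next(self):
        data = self.read(self.cp)
        self.cp += 1
        return data

    def write_next(self, data):
        self.write(self.cp, data)
        self.cp += 1


f = DirectFile(["b0", "b1", "b2"])
print(f.read_next(), f.read_next())   # b0 b1
f.reset()
print(f.read_next())                  # b0
```

The reverse direction, simulating direct access on a purely sequential file, would be far less efficient, since reaching block n means reading past all the blocks before it.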

Indexed Access: The index, like an index in the back of a book,
contains pointers to the various blocks.
To find a record in the file, we first search the index and then use the pointer to
access the file directly and to find the desired record.
For example, IBM's indexed sequential-access method (ISAM) uses a small
master index that points to disk blocks of a secondary index. The secondary
index blocks point to the actual file blocks. The file is kept sorted on a defined
key.
To find a particular item, we first make a binary search of the master index,
which provides the block number of the secondary index.
This block is read in, and again a binary search is used to find the block
containing the desired record. Finally, this block is searched sequentially.

In this way, any record can be located from its key by at most two direct-access
reads.
The figure below shows a similar situation as implemented by VMS index and
relative files.
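The two-level lookup described for ISAM can be sketched with in-memory lists standing in for disk blocks. The keys, values and layout below are made up for illustration, and the sketch assumes the master index is already in memory, so that only the secondary index block and the data block count as direct-access reads.

```python
from bisect import bisect_right

# Data blocks, sorted by key; each holds (key, value) records.
data_blocks = [
    [(3, "c"), (7, "g")],
    [(12, "l"), (15, "o")],
    [(21, "u"), (26, "z")],
    [(30, "A"), (34, "E")],
]
# Secondary index blocks: (first key in data block, data block number).
secondary = [
    [(3, 0), (12, 1)],
    [(21, 2), (30, 3)],
]
# Master index: (first key covered by a secondary block, secondary block number).
master = [(3, 0), (21, 1)]


def find_slot(index, key):
    """Binary search for the last index entry whose first key is <= key."""
    pos = bisect_right([k for k, _ in index], key) - 1
    return index[pos][1] if pos >= 0 else None


def lookup(key):
    sec_no = find_slot(master, key)                # binary search of the master index
    if sec_no is None:
        return None                                # key is smaller than every stored key
    block_no = find_slot(secondary[sec_no], key)   # read 1: secondary index block
    for k, value in data_blocks[block_no]:         # read 2: data block, scanned sequentially
        if k == key:
            return value
    return None


print(lookup(15))   # o
print(lookup(21))   # u
print(lookup(8))    # None
```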


Directory and Disk Structure


A storage device can be used in its entirety for a file system. It can also be
subdivided for finer-grained control. For example, a disk can be partitioned into
quarters, and each quarter can hold a separate file system.

Storage devices can also be collected together into RAID sets that provide
protection from the failure of a single disk. Sometimes, disks are subdivided and
also collected into RAID sets.
Partitioning is useful for limiting the sizes of individual file systems, putting
multiple file-system types on the same device, or leaving part of the device
available for other uses, such as swap space or unformatted (raw) disk space.
A file system can be created on each of these parts of the disk. Any entity
containing a file system is generally known as a volume. The volume may be a
subset of a device, a whole device, or multiple devices linked together into a
RAID set.


Each volume can be thought of as a virtual disk. Volumes can also store multiple
operating systems, allowing a system to boot and run more than one operating
system, for example, a dual-boot PC or laptop.

Each volume that contains a file system must also contain information about the
files in the system. This information is kept in entries in a device directory or
volume table of contents.
The device directory (or directory) records information such as name, location,
size, and type for all files on that volume.


Directory and Disk Structure:

Storage Structure

A general-purpose computer system has multiple storage devices, and those
devices can be sliced up into volumes that hold file systems. Computer systems
may have zero or more file systems, and the file systems may be of varying types.

For example, a typical Solaris system may have dozens of file systems of a dozen
different types.
Consider the types of file systems in the Solaris OS:
tmpfs: A temporary file system that is created in volatile main memory and has its contents
erased if the system reboots or crashes.
objfs: A virtual file system (essentially an interface to the kernel that looks like a file system)
that gives debuggers access to kernel symbols.
ctfs: A virtual file system that maintains contract information to manage which processes start
when the system boots and must continue to run during operation.
lofs: A loop back file system that allows one file system to be accessed in place of another one.
procfs: A virtual file system that presents information on all processes as a file system.
ufs, zfs: General-purpose file systems.

Directory and Disk Structure:

Directory Overview

The directory can be viewed as a symbol table that translates file names into
their directory entries.
A symbol table is a data structure used by a language translator such as a
compiler or interpreter, where each identifier in a program's source code is
associated with information relating to its declaration or appearance in the
source, such as its type, scope level and sometimes its location.
The organization must allow us to insert entries, to delete entries, to search for a
named entry, and to list all the entries in the directory.
The operations that are to be performed on a directory:
Search for a File: We need to be able to search a directory structure to find the
entry for a particular file. Since files have symbolic names, and similar names
may indicate a relationship among files, we may want to be able to find all files
whose names match a particular pattern.

Create a File: New files need to be created and added to the directory.
Delete a File: When a file is no longer needed, we want to be able to remove it
from the directory.
List a Directory: We need to be able to list the files in a directory and the
contents of the directory entry for each file in the list.
Rename a File: Since the name of a file represents its contents to its users, we
must be able to change the name when the contents or use of the file changes.
Renaming a file may also allow its position within the directory structure to be
changed.
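Viewed as a symbol table, a directory is essentially a mapping from names to entries that supports the operations listed above. The class below is a toy, single-level sketch; the class name, the method names and the use of shell-style patterns for searching are assumptions of this illustration.

```python
from fnmatch import fnmatchcase


class Directory:
    """Toy single-level directory: maps file names to directory entries."""

    def __init__(self):
        self.entries = {}

    def create(self, name, **attrs):
        if name in self.entries:
            raise FileExistsError(name)
        self.entries[name] = dict(attrs)

    def delete(self, name):
        del self.entries[name]

    def search(self, pattern):
        """Return the sorted names that match a shell-style pattern such as '*.c'."""
        return sorted(n for n in self.entries if fnmatchcase(n, pattern))

    def list_entries(self):
        """Return (name, entry) pairs sorted by name."""
        return sorted(self.entries.items())

    def rename(self, old, new):
        if new in self.entries:
            raise FileExistsError(new)
        self.entries[new] = self.entries.pop(old)


d = Directory()
d.create("main.c", size=120)
d.create("util.c", size=80)
d.create("notes.txt", size=10)
print(d.search("*.c"))                          # ['main.c', 'util.c']
d.rename("notes.txt", "readme.txt")
d.delete("util.c")
print([name for name, _ in d.list_entries()])   # ['main.c', 'readme.txt']
```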


Directory Structure:

Common Schemes for Defining the Logical Structure of a Directory

The most common schemes for defining the logical structure of a directory are:

Single-Level Directory
Two-Level Directory
Tree-Structured Directory
Acyclic-Graph Directory
General Graph Directory
