
Understanding Operating Systems

Sixth Edition

Chapter 8
File Management
Learning Objectives

After completing this chapter, you should be able to describe:
• The fundamentals of file management and the
structure of the file management system
• File-naming conventions, including the role of
extensions
• The difference between fixed-length and variable-
length record format

Understanding Operating Systems, Sixth Edition 2


Learning Objectives (cont'd.)

• The advantages and disadvantages of contiguous,
noncontiguous, and indexed file storage techniques
• Comparisons of sequential and direct file access
• Access control techniques and how they compare
• The role of data compression in file storage



The File Manager
• The File Manager controls every file in the system.
• The File Manager (File Management System) is the
software responsible for creating, deleting,
modifying, and controlling access to files, as well as
for managing the resources used by the files.
• Provides support for:
– Libraries of programs and data to online users;
– Spooling operations;
– Interactive computing.
• These functions are performed in collaboration with
the Device Manager.



Responsibilities of the File Manager
• The File Manager is in charge of the system’s
physical components, its information resources, and
the policies used to store and distribute the files.
• To carry out its responsibilities, it must perform four
tasks:
– Keep track of where each file is stored.
– Use a policy that will:
• Determine where and how the files will be stored;
• Use the available storage space efficiently;
• Provide efficient access to the files.
– Allocate each file when a user has been cleared for
access to it;
• Record its use.
Responsibilities of the File Manager
(cont’d)
– Deallocate the file when the file is to be returned to
storage;
– Communicate the file availability to others who may
be waiting for it.
• The File Manager keeps track of its files with
directories that contain:
– the filename;
– its physical location in secondary storage;
– Other important information about each file.



Responsibilities of the File Manager
(cont'd.)
• The File Manager’s policy determines:
– Where each file is stored;
– How the system, and its users, will be able to access
them.
• Via device-independent commands.
– The policy must determine who will have access to
what material.
• This involves two factors:
– Flexibility of access to the information;
– Its subsequent protection.



Responsibilities of the File Manager
(cont'd.)
• The File Manager does this by:
– Allowing access to shared files;
– Providing distributed access;
– Allowing users to browse through public directories.
• The OS must:
– Protect its files against system malfunctions;
– Provide security checks via account numbers and
passwords to preserve the integrity of the data;
– Safeguard against tampering.



Responsibilities of the File Manager
(cont'd.)
• The computer system allocates a file by activating
the appropriate secondary storage device and
loading the file into memory while updating its
records of who is using what file.
• The File Manager deallocates a file by updating the
file tables and rewriting the file (if revised) to the
secondary storage device.
• Any processes waiting to access the file are then
notified of its availability.



Responsibilities of the File Manager
Definitions
• Field
– A group of related bytes that can be identified by the
user with a name, type, and size.
• Record
– A group of related fields.
• File
– A group of related records that contains information to
be used by specific application programs to generate
reports.
– Flat file
• No connections to other files, no dimensionality



Responsibilities of the File Manager
Definitions
• Database
– Groups of related files that are interconnected at
various levels to give users flexibility of access to the
stored data.
– If the user’s database requires a specific structure,
the File Manager must be able to support it.
• Program files and data files
– Program files contain instructions;
– Data files contain data;
– As far as storage is concerned, the File Manager
treats them the same way.



Responsibilities of the File Manager
Definitions
• Directories
– Special files with listings of filenames and their
attributes.
– Data collected to monitor system performance and
provide for system accounting is collected into files.
– Every program and data file accessed by the
computer system, as well as every piece of computer
software, is treated as a file.



Interacting with the File Manager
• The user communicates with the File Manager,
which responds to specific commands.
– OPEN, DELETE, RENAME, COPY
• Files can also be created with other system-specific
terms:
– The first time a user gives the command to save a file,
it’s actually the CREATE command.
– In other OSs, the OPEN NEW command within a
program indicates to the File Manager that a file must
be created.
• These commands and many more were designed to
be very simple to use.
• They are device independent.
Interacting with the File Manager
(cont’d)
• Device Independent
– To access a file, the user doesn’t need to know its
exact physical location on the disk pack
• Cylinder
• Surface
• Sector
– The medium in which it’s stored
• Archival tape
• Magnetic disk
• Optical disc
• Flash Storage
– The network specifics.


Interacting with the File Manager
(cont'd.)
• File access is a complex process.
• Each logical command is broken down into a
sequence of signals that trigger the step-by-step
actions performed by the device and supervise the
progress of the operation by testing the device’s
status.



Interacting with the File Manager
(cont'd.)
• For example, when a user’s program issues a
command to read a record from a disk, the READ
instruction has to be decomposed into the following
steps:
– Move the read/write heads to the cylinder or track
where the record is to be found.
– Wait for the rotational delay until the sector containing
the desired record passes under the read/write head.
– Activate the appropriate read/write head and read the
record.
– Transfer the record to main memory.
– Send a flag to indicate that the device is free to satisfy
another request.
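
The decomposition above can be sketched as an ordered list of device-level steps. This is only an illustration: the function name `decompose_read` and its parameters are hypothetical, and a real device driver would issue hardware signals and check status after each step.

```python
def decompose_read(cylinder, sector, current_cylinder=0):
    """Return the ordered low-level steps for one logical READ (illustrative only)."""
    steps = []
    if cylinder != current_cylinder:
        # Seek is needed only when the heads are not already on the cylinder.
        steps.append(f"seek: move heads from cylinder {current_cylinder} to {cylinder}")
    steps.append(f"wait: rotational delay until sector {sector} is under the head")
    steps.append("read: activate the head and read the record")
    steps.append("transfer: copy the record to main memory")
    steps.append("signal: flag the device as free")
    return steps


for step in decompose_read(cylinder=12, sector=5):
    print(step)
```

The user's program issues only the logical READ; every line printed here is work the File Manager and Device Manager do on its behalf.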
Interacting with the File Manager
(cont'd.)
• While all of this is going on, the system must check
for possible error conditions.
• The File Manager does all of this, freeing the user
from including in each program the low-level
instructions for every device to be used.
• Without the File Manager, every program would
need to include instructions to operate all of the
different types of devices and every model within
each type.
• It would be impractical to require each program to
include these minute operational details.
– The advantage of device independence.



Interacting with the File Manager
Typical Volume Configuration
• Normally the active files for a computer system
reside on secondary storage:
– CDs
– DVDs
– Floppy disks
– USB devices
– Hard disks
– Nonremovable disk packs
– Other removable media
• Files that aren’t frequently used can be stored offline
and mounted only when the user specifically
requests them.
Interacting with the File Manager
Typical Volume Configuration
• Each storage unit (removable or not) is considered a
volume.
• Each volume can contain several files.
– Multifile volumes.
• Some files are extremely large and are contained in
several volumes.
– Multivolume files.
• Each volume in the system is given a name.
• The File Manager writes this name and other
descriptive information on an easy-to-access place
on each unit.



Interacting with the File Manager
Typical Volume Configuration
– The innermost part of the CD or DVD;
– The beginning of the tape;
– The first sector of the outermost track of the disk pack
• Once identified, the OS can interact with the storage
unit.



Interacting with the File Manager
Typical Volume Configuration
• Master file directory (MFD)
– Is stored immediately after the volume descriptor.
– Lists the names and characteristics of every file
contained in that volume.
– The filenames in the MFD can refer to:
• Program files;
• Data files;
• System files;
• Subdirectories;
– If the File Manager supports subdirectories.
– The remainder of the volume is used for file storage.



Interacting with the File Manager
Typical Volume Configuration
• Master File Directory (MFD)
– The first OSs supported only a single directory per
volume.
• This directory was created by the File Manager and
contained the names of files, usually organized in
alphabetical, spatial, or chronological order.
– Major Disadvantages:
• It would take a long time to search for an individual file.
• If the user had more than 256 small files stored in the
volume, the directory space (with a 256 filename limit)
would fill before the disk storage space filled up. The
user would then receive a message of “disk full” when
only the directory itself was full.
Interacting with the File Manager
Typical Volume Configuration
• Master File Directory (MFD) (cont’d)
– Major Disadvantages (cont’d):
• Users couldn’t create subdirectories to group the files
that were related.
• Multiple users couldn’t safeguard their files from other
users because the entire directory was freely made
available to every user in the group on request.
• Each program in the directory needed a unique
name, even in directories serving many users.
– Only one person using that directory could have a
program named Program1.
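
Two of the disadvantages listed above (the fixed number of directory entries and the need for unique names) can be seen in a minimal sketch of a single-level directory. The class `FlatDirectory` and its method names are hypothetical; only the 256-entry limit comes from the text.

```python
class FlatDirectory:
    """A single-level directory with a fixed number of entries (256, as in the text)."""

    def __init__(self, max_entries=256):
        self.max_entries = max_entries
        self.entries = {}  # filename -> storage location

    def create(self, name, location):
        if name in self.entries:
            # One shared namespace: only one user can own "Program1".
            raise FileExistsError(f"{name} already exists; names must be unique")
        if len(self.entries) >= self.max_entries:
            # The disk may still have free space; only the directory is full.
            raise OSError("disk full (directory table exhausted)")
        self.entries[name] = location


directory = FlatDirectory()
for i in range(256):
    directory.create(f"file{i}", location=i)
try:
    directory.create("one_more", location=256)
except OSError as err:
    print(err)
```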



Interacting with the File Manager
Introducing Subdirectories
• File Managers create an MFD for each volume that
can contain entries for both files and subdirectories.
• A Subdirectory is created when a user opens an
account to access the computer system.
• Although the user directory is treated as a file, its
entry in the MFD is flagged to indicate to the File
Manager that this file is really a subdirectory and
has unique properties:
– Its records are filenames pointing to files.



Interacting with the File Manager
Introducing Subdirectories (cont’d)
• Although this was an improvement from the single
directory scheme, it didn’t solve the problems
encountered by prolific users who wanted to group
their files in a logical order to improve the
accessibility and efficiency of the system.
• Today’s File Managers encourage users to create
their own subdirectories, so related files can be
grouped together.
– Folders
• This structure is an extension of the previous
two-level directory organization, and it’s
implemented as an upside-down tree (Figure 8.4)


Interacting with the File Manager
Introducing Subdirectories (cont'd.)
• Tree structures allow the system to efficiently search
individual directories because there are fewer
entries in each directory.
– However, the path to the requested file may lead
through several directories.
• For every file request, the MFD is the point of entry.
• It’s accessible only by the OS.
• When the user wants to access a specific file:
– The filename is sent to the File Manager.
– The File Manager first searches the MFD for the
user’s directory.
– It then searches the user’s directory and any
subdirectories for the requested file and its location.
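
The lookup steps above can be sketched with nested dictionaries standing in for the MFD and its subdirectories. This is a simplified model under stated assumptions: the function `find_file`, the sample directory contents, and the "block N" location strings are all hypothetical.

```python
def find_file(mfd, user, path):
    """Resolve path (a list of names) starting at the user's directory in the MFD."""
    node = mfd[user]                 # step 1: search the MFD for the user's directory
    for name in path:                # step 2: descend through the subdirectories
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError("/".join([user, *path]))
        node = node[name]
    return node                      # the file's location (or a subdirectory)


mfd = {
    "flynn": {
        "reports": {"inventory_cost.doc": "block 4120"},
        "notes.txt": "block 77",
    }
}
print(find_file(mfd, "flynn", ["reports", "inventory_cost.doc"]))
```

Each directory searched is small, but the number of directories visited grows with the depth of the path, which is the trade-off noted above.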
Interacting with the File Manager
Introducing Subdirectories (cont'd.)
• Each file entry in every directory contains
information describing the file.
• File Descriptor:
– Filename – Within a single directory, filenames must
be unique; in some OSs, the filenames are case
sensitive.
– File type – The organization and usage that are
dependent on the system (files and directories).
– File size – Although it could be computed from other
information, the size is kept here for convenience.
– File location – Identification of the first physical block
(or all blocks) where the file is stored.
– Date and time of creation
Interacting with the File Manager
Introducing Subdirectories (cont'd.)
• File Descriptor:
– Owner
– Protection information - Access restrictions, based on
who is allowed to access the file and what type of
access is allowed.
– Record size – Its fixed size or its maximum size,
depending on the type of record.
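
The file descriptor fields listed above map naturally onto a record type. The sketch below is one possible layout, not the layout of any particular OS; the field names, types, and sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FileDescriptor:
    filename: str        # unique within its directory
    file_type: str       # system dependent, e.g. "file" or "directory"
    size: int            # kept for convenience, in bytes
    location: int        # first physical block of the file
    created: datetime    # date and time of creation
    owner: str
    protection: dict     # who may access the file, and how
    record_size: int     # fixed size, or maximum size for variable-length records


entry = FileDescriptor(
    filename="inventory_cost.doc",
    file_type="file",
    size=2048,
    location=4120,
    created=datetime(2011, 1, 15, 9, 30),
    owner="flynn",
    protection={"flynn": "rw", "others": "r"},
    record_size=128,
)
print(entry.filename, entry.location)
```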



Interacting with the File Manager
File-Naming Conventions
• A file’s name can be much longer than it appears.
• Depending on the File Manager, it can have from
two to many components.
• The two components common to many filenames
are:
– Relative filename
– An extension
• The relative filename is the name that differentiates
it from other files in the same directory.
– Can vary in length from one to many characters and
can include letters of the alphabet, as well as digits.



Interacting with the File Manager
File-Naming Conventions (cont’d)
• Relative Filename
– Every OS has specific rules that affect the length of
the relative name and the types of characters
allowed.
– Most OSs allow names with dozens of characters
including spaces, hyphens, underlines, and certain
other keyboard characters.



Interacting with the File Manager
File-Naming Conventions (cont'd.)
• Extensions
– Some OSs require an extension that’s appended to the
relative filename.
– It’s usually two or three characters long and is
separated from the relative name by a period.
– Its purpose is to identify the type of file or its contents.
– Example
• BASIA_TUNE.MP3
• TAKE OUT MENU.RTF
• TAKE OUT MENU.DOC



Interacting with the File Manager
File-Naming Conventions (cont'd.)
• Extensions
– If an extension is incorrect or unknown
• Requires user intervention (Figure 8.5).
– Some extensions (EXE, BAT, COB, FOR) are
restricted by certain OSs because they serve as a
signal to the system to use a specific compiler or
program to run these files.



Interacting with the File Manager
File-Naming Conventions (cont'd.)
• A file’s complete name may require other
components, and these are OS specific.
• Example:

INVENTORY_COST.DOC



Interacting with the File Manager
File-Naming Conventions (cont'd.)
• INVENTORY_COST.DOC
– Windows:
• The file’s complete name is composed of its relative
name and extension, preceded by the drive label and
directory name:

– C:\IMFST\FLYNN\INVENTORY_COST.DOC

» This indicates to the system that the file
INVENTORY_COST.DOC requires a word
processing application program, and it can be found
in the directory IMFST, subdirectory FLYNN, in the
volume residing on drive C.
Interacting with the File Manager
File-Naming Conventions (cont'd.)
• INVENTORY_COST.DOC
– UNIX/Linux
/usr/imfst/flynn/inventory_cost.doc
• The first entry is represented by the forward slash (/).
– This represents a special master directory called the root.
• Next is the name of the first subdirectory:
usr/imfst
• Followed by a sub-subdirectory:
/flynn
• The final entry is the file’s relative name:
inventory_cost.doc
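
The components of the two complete names above can be pulled apart with Python's standard `pathlib` module, which parses Windows and POSIX paths without touching the disk. This is only a way to inspect the names; it says nothing about how either File Manager stores them.

```python
from pathlib import PurePosixPath, PureWindowsPath

win = PureWindowsPath(r"C:\IMFST\FLYNN\INVENTORY_COST.DOC")
print(win.drive)             # C:
print(win.parts)             # ('C:\\', 'IMFST', 'FLYNN', 'INVENTORY_COST.DOC')
print(win.stem, win.suffix)  # INVENTORY_COST .DOC

nix = PurePosixPath("/usr/imfst/flynn/inventory_cost.doc")
print(nix.parts)             # ('/', 'usr', 'imfst', 'flynn', 'inventory_cost.doc')
print(nix.name)              # inventory_cost.doc
```

In both cases the last component is the relative filename, and its suffix is the extension.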



Interacting with the File Manager
File-Naming Conventions (cont'd.)
• The names tend to grow in length as the File
Manager grows in flexibility.
• The folders on a system with a GUI (Windows or
Macintosh) are actually directories or subdirectories.
– When someone creates a folder, the system creates a
subdirectory in the current directory or folder.



File Organization

• The arrangement of records within a file.
– All files are composed of records.
• When a user gives a command to modify the
contents of a file, it’s actually a command to access
records within the file.



File Organization
Record Format
• All files are composed of records.
• Within each file, the records are all presumed to
have the same format:
– Fixed length
– Variable length
• The records, regardless of their format, can be
blocked or unblocked.



File Organization
Record Format
• Fixed-Length Records
– Are the most common because they’re the easiest to
access directly.
• Ideal for data files.
– The critical aspect of fixed-length records is the size
of the record.
• If the record is too small, smaller than the number of
characters to be stored in the record, the leftover
characters are truncated.
• If the record is too large, larger than the number of
characters to be stored, storage space is wasted.
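
Both problems with a badly chosen record size, and the reason fixed-length records are easy to access directly, show up in a short sketch. The 10-character record size and the helper names `to_fixed` and `record_offset` are assumptions chosen for illustration.

```python
RECORD_SIZE = 10  # hypothetical fixed record length, in characters


def to_fixed(text, size=RECORD_SIZE):
    """Truncate text that is too long; pad text that is too short with spaces."""
    return text[:size].ljust(size)


def record_offset(n, size=RECORD_SIZE):
    """Offset of record n (0-based) from the start of the file."""
    return n * size


short = to_fixed("Dan")
long_ = to_fixed("Whitestone, Dan")
print(len(short), short.count(" "))  # 7 of the 10 characters are wasted padding
print(repr(long_))                   # the leftover characters are truncated
print(record_offset(3))              # one multiplication locates any record
```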



File Organization
Record Format
• Variable-Length Records
– Don’t leave empty storage space and don’t truncate
any characters
• Eliminates the two disadvantages of fixed-length
records.
– While they can be easily read (one after the other),
they’re difficult to access directly because it’s hard to
calculate exactly where the record is located.



File Organization
Record Format
• Variable-Length Records
– Used most frequently in files that are likely to be
accessed sequentially
• Text files
• Program files
• Files that use an index to access their records.
– The record format, how it’s blocked, and other related
information is kept in the file descriptor.
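
One common way to store variable-length records, assumed here for illustration, is to precede each record with a small field giving its length. The sketch uses a 2-byte prefix; the function names are hypothetical. It shows why sequential reading is easy while direct access is not: reaching record n means walking past every earlier record.

```python
import struct


def pack_records(records):
    """Store each record as a 2-byte length prefix followed by its bytes."""
    out = bytearray()
    for rec in records:
        data = rec.encode("utf-8")
        out += struct.pack(">H", len(data)) + data
    return bytes(out)


def read_record(buf, n):
    """Return record n (0-based) by walking the length prefixes from the start."""
    pos = 0
    for _ in range(n):
        (length,) = struct.unpack_from(">H", buf, pos)
        pos += 2 + length            # skip this record's prefix and its data
    (length,) = struct.unpack_from(">H", buf, pos)
    return buf[pos + 2 : pos + 2 + length].decode("utf-8")


buf = pack_records(["Dan", "Whitestone", "Li"])
print(len(buf))             # no padding and no truncation
print(read_record(buf, 1))
```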



File Organization
Physical File Organization
• Describes the way records are arranged and the
characteristics of the medium used to store them.
• On magnetic disks (hard drives), files can be
organized in one of several ways:
– Sequential
– Direct
– Indexed sequential.



File Organization
Physical File Organization
• To select the best of these file organizations, the
programmer considers these practical
characteristics:
– Volatility of data
• The frequency with which additions and deletions are
made.
– Activity of the file
• The percentage of records processed during a given
run.
– Size of the file



File Organization
Physical File Organization
• Practical Characteristics (cont’d):
– Response time
• The amount of time the user is willing to wait before the
requested operation is complete.
– Especially crucial when doing time-sensitive searches.



File Organization
Physical File Organization
• Sequential Record Organization
– The easiest to implement because records are stored
and retrieved serially, one after the other.
– To find a specific record, the file is searched from its
beginning until the requested record is found.



Physical File Organization (cont’d)
• Sequential Record Organization (cont’d)
– To speed the process, some optimization features
may be built into the system.
• Select a key field from the record and then sort the
records by that field before storing them.
• Later, when a user requests a specific record:
– The system searches only the key field of each record in
the file.
– The search is ended when either an exact match is found
or the key field for the requested record is smaller than
the value of the record last compared, in which case the
message “record not found” is sent to the user and the
search is terminated.
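
The optimized search described above can be sketched as a scan over records sorted by key that stops early once the requested key has been passed. The function name and the sample records are hypothetical.

```python
def sequential_search(records, key):
    """records: list of (key, data) tuples, sorted by key in ascending order."""
    for rec_key, data in records:
        if rec_key == key:
            return data              # exact match
        if key < rec_key:
            break                    # passed the spot where the key would be
    return None                      # "record not found"


records = [(101, "Avery"), (205, "Brook"), (310, "Chen"), (477, "Diaz")]
print(sequential_search(records, 310))
print(sequential_search(records, 250))  # stops at key 310 without reading 477
```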



Physical File Organization (cont’d)
• Sequential Record Organization (cont’d)
– Although this technique aids the search process, it
complicates file maintenance because the original
order must be preserved every time records are
added or deleted.
– To preserve the physical order, the file must be
completely rewritten or maintained in a sorted fashion
every time it’s updated.



Physical File Organization (cont'd.)
• Direct Record Organization:
– Uses direct access files, which can be implemented
only on direct access storage devices.
– These files give users the flexibility of accessing any
record in any order without having to begin a search
from the beginning of the file to do so.
• Random organization
• Random access files
– Records are identified by their relative addresses
• Their addresses relative to the beginning of the file
– These logical addresses are computed when the
records are stored, and then again when the records
are retrieved.
Physical File Organization (cont'd.)
• Direct Record Organization:
– The user identifies a field (or combination of fields) in
the record format and designates it as the key field
because it uniquely identifies each record.
– The program used to store the data follows a set of
instructions, called a hashing algorithm, that
transforms each key into a number – the record’s
logical address.
– This is given to the File Manager, which takes the
necessary steps to translate the logical address into a
physical address (cylinder, surface, and record
numbers), preserving the file organization.
– The same procedure is used to retrieve a record.
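
The two translations described above can be sketched as follows. The text does not name a specific hashing algorithm, so division hashing (key modulo table size) is an assumption, as are the function names and the disk geometry used in the example.

```python
def logical_address(key, table_size):
    """Division hashing (an assumed algorithm): key -> relative record number."""
    return key % table_size


def physical_address(rel, records_per_track, surfaces_per_cylinder):
    """Translate a relative record number to (cylinder, surface, record)."""
    track, record = divmod(rel, records_per_track)
    cylinder, surface = divmod(track, surfaces_per_cylinder)
    return cylinder, surface, record


rel = logical_address(152364, table_size=101)
print(rel)                                      # 56
print(physical_address(rel, records_per_track=10, surfaces_per_cylinder=4))  # (1, 1, 6)
```

The first function is the program's job and the second is the File Manager's; running both again with the same key retrieves the record.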
Physical File Organization (cont'd.)
• Direct Record Organization:
– A direct access file can also be accessed sequentially,
by starting at the first relative address and going to
each record down the line.
– Direct access files can be updated more quickly than
sequential files because the records can be quickly
rewritten to their original addresses after modifications
have been made.
• There’s no need to preserve the order of the records so
adding or deleting them takes very little time.



Physical File Organization (cont'd.)
• Direct Record Organization:
– The problem with hashing algorithms is that several
records with unique keys may generate the same
logical address and then there’s a collision.
– When that happens, the program must generate
another logical address before presenting it to the File
Manager for storage.
– Records that collide are stored in an overflow area
that was set aside when the file was created.
– Although the program does all the work of linking the
records from the overflow area to their corresponding
logical address, the File Manager must handle the
physical allocation of space.
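
A minimal sketch of collision handling with an overflow area follows. It simplifies the scheme above: colliding records go into the first free overflow slot and are found again by scanning the overflow area, rather than by following links from the home address. The class `DirectFile`, the modulo hash, and the slot counts are assumptions.

```python
class DirectFile:
    def __init__(self, main_slots, overflow_slots):
        self.main = [None] * main_slots
        self.overflow = [None] * overflow_slots   # set aside when the file is created

    def store(self, key, data):
        addr = key % len(self.main)               # assumed hashing algorithm
        if self.main[addr] is None or self.main[addr][0] == key:
            self.main[addr] = (key, data)
            return ("main", addr)
        for i, slot in enumerate(self.overflow):  # collision: use the overflow area
            if slot is None or slot[0] == key:
                self.overflow[i] = (key, data)
                return ("overflow", i)
        raise OSError("file full: reorganization required")

    def fetch(self, key):
        addr = key % len(self.main)
        if self.main[addr] and self.main[addr][0] == key:
            return self.main[addr][1]
        for slot in self.overflow:                # slower path for collided records
            if slot and slot[0] == key:
                return slot[1]
        return None


f = DirectFile(main_slots=7, overflow_slots=2)
print(f.store(10, "a"))   # ('main', 3)
print(f.store(17, "b"))   # ('overflow', 0), since 17 also hashes to 3
print(f.fetch(17))        # b
```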


Physical File Organization (cont'd.)
• Direct Record Organization:
– The maximum size of the file is established when it’s
created, and eventually either the file might become
completely full or the number of records stored in the
overflow might become so large that the efficiency of
retrieval is lost.
• At that time, the file must be reorganized and rewritten,
which requires intervention by the File Manager.



Physical File Organization (cont'd.)
• Indexed Sequential Record Organization
– Combines the best of sequential and direct access.
– It’s created and maintained through an Indexed
Sequential Access Method (ISAM) application.
– Relieves the programmer of the burden of handling
overflows and preserving record order.
– Doesn’t create collisions because it doesn’t use the
results of the hashing algorithm to generate a
record’s address.



Physical File Organization (cont'd.)
• Indexed Sequential Record Organization
– It generates an index file through which the records
are retrieved.
– This organization divides an ordered sequential file
into blocks of equal size.
• Their size is determined by the File Manager to take
advantage of physical storage devices and to optimize
retrieval strategies.
– Each entry in the index file contains the highest
record key and the physical location of the data
block where this record, and the records with smaller
keys are stored.



Physical File Organization (cont'd.)
• Indexed Sequential Record Organization
– To access any record in the file, the system begins
by searching the index file and then goes to the
physical location indicated by that entry.
– The index file acts as a pointer to the data file.
– An indexed sequential file also has overflow areas
spread throughout the file so expansion of existing
records can take place and new records can be
located in close physical sequence as well as in
logical sequence.
– Another overflow area is located apart from the main
data area but is used only when the other overflow
areas are completely filled.
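
The index lookup described above can be sketched with an in-memory list of blocks. Each index entry holds the highest key in its block, so a binary search over the index picks the one block to scan. Overflow areas are left out, and the block contents and the name `isam_lookup` are hypothetical.

```python
from bisect import bisect_left

# blocks[i] holds (key, data) pairs in key order; all blocks are the same size.
blocks = [
    [(3, "c"), (8, "h"), (12, "l")],
    [(15, "o"), (21, "u"), (27, "z")],
    [(30, "A"), (42, "B"), (55, "C")],
]
# One index entry per block: the highest record key stored in that block.
index = [block[-1][0] for block in blocks]   # [12, 27, 55]


def isam_lookup(key):
    i = bisect_left(index, key)      # first block whose highest key >= key
    if i == len(index):
        return None                  # key is larger than every key in the file
    for k, data in blocks[i]:        # short sequential scan inside one block
        if k == key:
            return data
    return None


print(isam_lookup(21))   # u
print(isam_lookup(13))   # None
```

No hashing is involved, so two keys can never compete for the same address.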
Physical File Organization (cont'd.)
• Indexed Sequential Record Organization
– When retrieval time becomes too slow, the file has to
be reorganized.
• Usually performed by maintenance software.
• For most dynamic files, indexed sequential is the
organization of choice because it allows both direct
access to a few requested records and sequential
access to many.



Physical Storage Allocation
• The File Manager must work with files not just as
whole units but also as logical units.
• Records within a file must have the same format but
they can vary in length.
• Records are subdivided into fields.
• Their structure is managed by application programs
and not the OS.
– An exception is made for those systems that are
heavily oriented to database applications, where the
File Manager handles field structure.
• When we talk about file storage, we are actually
referring to record storage.
• The File Manager and the Device Manager have to cooperate
to ensure successful record storage and retrieval.


Physical Storage Allocation
Contiguous Storage
• Records stored one after another.
• Used in early OS
– Advantages:
• Any record can be found and read once its starting
address and size are known (streamlined directory).
• Ease of direct access because every part of the file is
stored in the same compact area.



Physical Storage Allocation
Contiguous Storage
– Disadvantages:
• A file can’t be expanded unless there’s empty space
available immediately following it (Figure 8.9).
– Room for expansion must be provided when the file is
created.
– If there’s not enough room, the entire file must be
recopied to a larger section of the disk every time
records are added.
• Fragmentation
– Can be overcome by compacting and rearranging files.
– Files can’t be accessed while compaction is taking
place.



Physical Storage Allocation
Contiguous Storage
– The File Manager keeps track of the empty storage
areas by treating them as files.
• They’re entered in the directory but are flagged to
differentiate them from real files.
– Usually the directory is kept in order by sector
number, so adjacent empty areas can be combined
into one large free space.
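
Keeping the free areas in sector order makes combining adjacent ones a single pass, as this sketch shows. The representation of a free area as a `(start_sector, length)` pair and the name `coalesce` are assumptions.

```python
def coalesce(free_areas):
    """free_areas: list of (start_sector, length). Merge areas that touch."""
    merged = []
    for start, length in sorted(free_areas):      # order by sector number
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_len = merged[-1]
            merged[-1] = (prev_start, prev_len + length)  # extend the previous area
        else:
            merged.append((start, length))
    return merged


print(coalesce([(40, 10), (10, 5), (15, 5), (60, 2)]))
# [(10, 10), (40, 10), (60, 2)]
```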



Physical Storage Allocation
Noncontiguous Storage
• Allows files to use any storage space available on
the disk.
• A file’s records are stored contiguously only if
there’s enough empty space.
• Any remaining records and all other additions to the
file are stored in other sections of the disk.
– Extents of the file:
• Are linked together with pointers;
• The physical size of each extent is determined by the
OS, usually 256 bytes or another power of two.
Physical Storage Allocation
Noncontiguous Storage
• File extents are usually linked in one of two ways:
– Storage Level
• Each extent points to the next one in the sequence
(Figure 8.10).
• The Directory entry consists of the:
– Filename
– Storage location of the first extent
– Location of the last extent
– Total number of extents (not counting first).



Physical Storage Allocation
Noncontiguous Storage
• File extents are usually linked in one of two ways
(cont’d):
– Directory Level
• Each extent is listed with its physical address, its size,
and a pointer to the next extent.
• A null pointer indicates that it’s the last one (Shown in
Figure 8.11 as a hyphen (-)).
• Although both noncontiguous allocation schemes
eliminate external storage fragmentation and the
need for compaction, they don’t support direct
access because there’s no easy way to determine
the exact location of a specific record.
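
The lack of direct access can be seen in a sketch of storage-level linking, where each extent carries a pointer to the next. The `disk` dictionary, its addresses, and the name `nth_extent` are hypothetical.

```python
# Hypothetical disk: extent address -> (data, address of the next extent or None)
disk = {
    20: ("rec0-rec3", 5),
    5: ("rec4-rec7", 31),
    31: ("rec8-rec9", None),   # a null pointer marks the last extent
}


def nth_extent(first, n):
    """Follow n pointers from the first extent; the cost grows with n."""
    addr, hops = first, 0
    for _ in range(n):
        addr = disk[addr][1]
        if addr is None:
            raise IndexError("file has fewer extents")
        hops += 1
    return disk[addr][0], hops


print(nth_extent(20, 2))   # ('rec8-rec9', 2)
```

There is no formula that yields address 31 from the file's starting address; the only way to reach it is to read the extents in front of it.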


Physical Storage Allocation
Noncontiguous Storage
• Files are usually declared to be either sequential or
direct when they’re created,
– so the File Manager can select the most efficient
method of storage allocation:
• Contiguous for direct files;
• Noncontiguous for sequential.
– OSs must have the capability to support both
storage allocation schemes.
• Advantages of noncontiguous storage:
– Eliminates external storage fragmentation
– Eliminates need for compaction



Physical Storage Allocation
Noncontiguous Storage
• Disadvantage of noncontiguous storage:
– No direct access support
• Cannot determine a specific record’s exact location



Physical Storage Allocation
Indexed Storage
• Allows direct record access by bringing together the
pointers linking every extent of that file into an index
block.
• Every file has its own index block, which consists of:
– The address of each disk sector that makes up the
file.
• The index lists each entry in the same order in which
the sectors are linked (Figure 8.12).
– The third entry in the index block corresponds to the
third sector making up the file.



Physical Storage Allocation
Indexed Storage
– When a file is created, the pointers in the index block
are all set to null.
– As each sector is filled, the pointer is set to the
appropriate sector address.
• The address is removed from the empty space list and
copied into its position in the index block.
• The scheme supports both sequential and direct
access.
– It doesn’t necessarily improve the use of storage
space, because each file must have an index block,
which is usually the size of one disk sector.
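The life cycle of an index block can be sketched as follows. This is an illustration under stated assumptions: `SECTOR_SLOTS`, the `IndexedFile` class, and the list used as the empty space list are hypothetical names, not part of any real file system.

```python
SECTOR_SLOTS = 8  # assumed number of pointers that fit in one index block


class IndexedFile:
    def __init__(self):
        # When the file is created, every pointer in the index block is null.
        self.index_block = [None] * SECTOR_SLOTS
        self.used = 0

    def append_sector(self, free_list):
        """Take an address off the empty space list and record it in the index."""
        if self.used == SECTOR_SLOTS:
            raise OverflowError("index block is full")
        address = free_list.pop(0)
        self.index_block[self.used] = address
        self.used += 1
        return address

    def sector_of(self, n):
        """Direct access: the n-th entry (1-based) is the n-th sector of the file."""
        if not 1 <= n <= self.used:
            raise IndexError("sector number is outside the file")
        return self.index_block[n - 1]
```

Because the n-th entry maps straight to the n-th sector, a lookup needs no chain traversal, which is what makes direct access possible.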



Physical Storage Allocation
Indexed Storage
– For larger files with more entries, several levels of
indexes can be generated. In that case, to find a
desired record:
• The File Manager accesses the first (highest-level)
index,
– which points to a second, lower-level index,
– which points to an even lower-level index,
– which eventually points to the data record.
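A two-level version of this lookup might look like the sketch below. `ENTRIES_PER_INDEX` and both helper functions are assumptions made for illustration; real systems choose their own index sizes and may use more levels.

```python
ENTRIES_PER_INDEX = 4  # assumed capacity of each index block


def build_two_level(sectors):
    """Split a flat list of sector addresses into lower-level index blocks.

    The returned list plays the role of the highest-level index: each of
    its entries points to one lower-level index block.
    """
    return [
        sectors[i:i + ENTRIES_PER_INDEX]
        for i in range(0, len(sectors), ENTRIES_PER_INDEX)
    ]


def lookup(top_index, logical_sector):
    """Resolve a 0-based logical sector number through both index levels."""
    high, low = divmod(logical_sector, ENTRIES_PER_INDEX)
    return top_index[high][low]
```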



Access Methods
• Access methods are dictated by a file’s organization.
– The most flexibility is allowed with indexed sequential
files;
– The least flexibility is with sequential files.
• A file that has been organized in sequential fashion
can support only sequential access to its records.
• These records can be of either fixed or variable length.
• The File Manager uses the address of the last byte
read to access the next sequential record.
– The current byte address (CBA) must be updated
every time a record is accessed,
• such as when a READ command is executed.
• Figure 8.13 shows the difference between storage
of fixed-length and of variable-length records.


Access Methods
Sequential Access
• Sequential Access of Fixed-Length Records:
– The CBA is updated simply by incrementing it by the
record length (RL), which is constant.
• CBA = CBA + RL
• Sequential Access of Variable-Length Records:
– The File Manager adds the length of the record (RL)
plus the number of bytes used to hold the record
length (N, which holds the constant shown as m, n, p,
or q in Figure 8.13) to the CBA.
• CBA = CBA + N + RL
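Both update rules can be tried out with a short sketch. The 2-byte big-endian length prefix (so N = 2) and the helper names are assumptions chosen for the example; the slides do not specify how the record length is stored.

```python
import struct

N = 2  # assumed: bytes used to hold each record's length


def next_cba_fixed(cba, rl):
    """Fixed-length records: CBA = CBA + RL."""
    return cba + rl


def pack_records(records):
    """Store each record as an N-byte length followed by the record itself."""
    return b"".join(struct.pack(">H", len(r)) + r for r in records)


def read_sequential(data):
    """Variable-length records: read RL, then apply CBA = CBA + N + RL."""
    cba = 0
    records = []
    while cba < len(data):
        (rl,) = struct.unpack_from(">H", data, cba)
        records.append(data[cba + N:cba + N + rl])
        cba = cba + N + rl
    return records
```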



Access Methods
Direct Access
• If a file is organized in direct fashion, it can be
accessed in either direct or sequential order if the
records are of fixed length.
• In the case of direct access with fixed-length
records:
– The CBA can be computed directly by multiplying the
record length (RL) by the desired record number (RN,
provided through the READ command) minus 1.
• CBA = (RN – 1) * RL
– If we’re looking for the beginning of the eleventh record
and the fixed record length is 25 bytes, the CBA would
be:
» (11 – 1) * 25 = 250
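The same calculation, written as a small sketch. The function names are hypothetical, and an in-memory `io.BytesIO` object stands in for the file.

```python
import io


def cba_for_record(rn, rl):
    """CBA = (RN - 1) * RL for 1-based record numbers."""
    if rn < 1:
        raise ValueError("record numbers start at 1")
    return (rn - 1) * rl


def read_record(f, rn, rl):
    """Jump straight to the record without reading the ones before it."""
    f.seek(cba_for_record(rn, rl))
    return f.read(rl)
```

For the example on the slide, `cba_for_record(11, 25)` gives 250.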
Access Methods
Direct Access
• If the file is organized for direct access with variable-
length records:
– It’s virtually impossible to access a record directly
because the address of the desired record cannot be
easily computed.
• To access a record:
– The File Manager must do a sequential search
through the records.
– It becomes a half-sequential read through the file
because the File Manager could save the address of
the last record accessed.



Access Methods
Direct Access
• To access a record (cont’d):
– When the next request arrives, it could search
forward from the CBA if the address of the desired
record was between the CBA and the end of the file.
– Otherwise, the search would start from the beginning
of the file.
• An alternative is for the File Manager to keep a table
of record numbers and their CBAs.
– To fill a request:
• The table is searched for the exact storage location of
the desired record.
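A table of record numbers and CBAs could be built in one sequential pass, as in this sketch. The `build_cba_table` helper and the choice of N are assumptions for illustration.

```python
def build_cba_table(record_lengths, n):
    """Map each 1-based record number to the CBA where the record begins.

    record_lengths lists RL for every record in file order; n is the
    number of bytes used to hold each record's length.
    """
    table = {}
    cba = 0
    for rn, rl in enumerate(record_lengths, start=1):
        table[rn] = cba
        cba += n + rl  # same update rule as sequential access
    return table
```

Once the table exists, a request for any record becomes a single table lookup instead of a search through the file, at the cost of the memory needed to hold the table.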



Access Methods
Direct Access
• Records in an indexed sequential file can be
accessed either sequentially or directly.
– Either of the procedures presented for computing the
CBA applies, with one extra step:
• The index file must be searched for the pointer to the
block where the data is stored.
– Because the index file is smaller than the data file, it
can be kept in main memory, and a quick search can
be performed to locate the block where the desired
record is located.
– The block can then be retrieved from secondary
storage, and the beginning byte address of the record
can be calculated.
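The extra index step can be sketched as follows, assuming (this layout is not spelled out on the slides) that each index entry holds the highest key stored in a block together with that block's number. The sample keys and block numbers are invented.

```python
from bisect import bisect_left

# Assumed in-memory index: index_keys[i] is the highest key stored in
# block index_blocks[i]; keys are kept in ascending order.
index_keys = [120, 250, 400]
index_blocks = [7, 3, 12]


def block_for_key(key):
    """Search the small in-memory index for the block that may hold the key."""
    pos = bisect_left(index_keys, key)
    if pos == len(index_keys):
        raise KeyError(key)  # key is larger than anything in the file
    return index_blocks[pos]
```

Only after this search does the File Manager fetch the block from secondary storage and compute the record's beginning byte address.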
Levels in a File Management System

• The efficient management of files can’t be separated
from the efficient management of the devices that
house them.
• The highest-level module is called the “basic file
system”.
– It passes information through the access control
verification module to the logical file system,
• which notifies the physical file system,
• which works with the Device Manager.



Levels in a File Management System

• Each level of the file management system is
implemented using structured and modular
programming techniques that also set up a
hierarchy.
• The higher positioned modules pass information to the
lower modules.
• They, in turn, can perform the required service and
continue the communication down the chain to the
lowest module.
• Which communicates with the physical device and
interacts with the Device Manager.
• Only then is the record made available to the user.
Levels in a File Management System

• Each of the modules can be further subdivided into
more specific tasks, as this command shows:

READ RECORD NUMBER 7 FROM FILE CLASSES INTO STUDENT

• CLASSES is the name of a direct access file
previously opened for input.
• STUDENT is a data record previously defined within
the program and occupying specific memory
locations.



Levels in a File Management System

• Because the file has already been opened, the file
directory has already been searched to verify the
existence of CLASSES, and pertinent information
about the file has been brought into the OS’s active
file table.
• This information includes:
– Record Size;
– The address of its first physical record;
– Its protection;
– Access control information.



Levels in a File Management System

• This information is used by the basic file system,
which activates the access control verification
module to verify that this user is permitted to
perform this operation with this file.
– If access is allowed, information and control are
passed along to the logical file system.
– If access is not allowed, a message saying “access
denied” is sent to the user.
• The logical file system transforms the record number
to its byte address:
CBA = (RN-1) * RL
Levels in a File Management System

• This result, together with the address of the first
physical record and, in the case where records are
blocked, the physical block size, is passed down to
the physical file system, which computes the
location where the desired record physically resides.

Block number = INTEGER(byte address / physical block size)
               + address of the first physical record

Offset = REMAINDER(byte address / physical block size)
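The two formulas map directly onto integer division and remainder. In the sketch below, the function name and the sample values (100-byte records, 256-byte blocks, first physical record at block 40) are assumptions for illustration.

```python
def locate_record(rn, rl, block_size, first_block):
    """Return (block number, offset) for a 1-based record number.

    byte address = (RN - 1) * RL
    block number = INTEGER(byte address / block size) + first_block
    offset       = REMAINDER(byte address / block size)
    """
    byte_address = (rn - 1) * rl
    quotient, offset = divmod(byte_address, block_size)
    return quotient + first_block, offset
```

With those sample values, record 7 starts at byte address 600, which falls 88 bytes into the third block of the file, that is, block 42.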



Levels in a File Management System

• This information is passed on to the device interface
module, which transforms the block number to the
actual cylinder/surface/record combination needed
to retrieve the information from the secondary
storage device.



Levels in a File Management System

• The information is retrieved using the
device-scheduling algorithms and placed in a buffer.
Control then returns to the physical file system, which
copies the information into the desired memory
location.
• When the operation is complete, the “all clear”
message is passed on to all other modules.



Levels in a File Management System
• A WRITE command is handled in exactly the same
way until the process reaches the device handler.
• At that point, the portion of the device interface
module that handles allocation of free space, the
allocation module, is called into play because it’s
responsible for keeping track of unused areas in
each storage device.
• Verification, the process of making sure that a
request is valid, occurs at every level of the file
management system.



Levels in a File Management System
• Verification process:
– The first verification occurs at the directory level when
the file system checks to see if the requested file
exists.
– The second occurs when the access control
verification module determines whether access is
allowed.
– The third occurs when the logical file system checks
to see if the requested byte address is within the file’s
limits.
– Finally, the device interface module checks to see
whether the storage device exists.
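The four checks can be pictured as a chain in which each level rejects a request before it reaches the next. The sketch below is a simplification: the dictionary layout, the device names, and every message except “access denied” are hypothetical.

```python
def verify_read(file_name, user, byte_address, directory, devices):
    """Run the four verification steps in order and report the first failure."""
    entry = directory.get(file_name)
    if entry is None:                                # 1. directory level
        return "file not found"
    if "R" not in entry["rights"].get(user, ""):     # 2. access control
        return "access denied"
    if not 0 <= byte_address < entry["size"]:        # 3. logical file system
        return "address out of range"
    if entry["device"] not in devices:               # 4. device interface
        return "device not available"
    return "ok"


# Hypothetical directory with one file and one authorized user.
directory = {
    "CLASSES": {"rights": {"ana": "R---"}, "size": 700, "device": "disk0"},
}
```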
Access Control Verification Module
• The earliest operating systems couldn’t support file
sharing among users.
– Early systems needed 10 copies of a compiler to
serve 10 users.
– Today’s systems require only a single copy to serve
everyone.
• Any file can be shared
– Data files, user-owned program files, system files.
• Advantages:
– It saves space;
– It allows for the synchronization of data updates,
• as when two application programs are updating the
same data file.
Access Control Verification Module
• Advantages (cont’d):
– It improves the efficiency of the system’s resources.
• If files are shared in main memory, then there’s a
reduction of I/O operations.
• Disadvantage:
– The integrity of each file must be safeguarded.
• This calls for control over who is allowed to access
the file and what type of access is permitted.
• Five possible actions that can be performed on a
file:
– READ only, WRITE only, EXECUTE only, DELETE
only;
– Some combination of the four.
Access Control Verification Module
Access Control Matrix
• Easy to implement.
• Because of its size, it only works well for systems with
a few files and a few users.
• In the matrix:
– Each column identifies a user and each row identifies
a file.
– The intersection of the row and column contains the
access rights for that user to that file.



Access Control Verification Module
Access Control Matrix
• In the matrix (cont’d):
– Access rights are identified by the letters RWED:
• R = Read access
• W = Write access
• E = Execute access
• D = Delete access
• (-) = Access not allowed
• Table 8.2 illustrates.



Access Control Verification Module
Access Control Matrix
• In the actual implementation, the letters RWED are
represented by the bits 1 and 0:
– 1 indicates that access is allowed;
– 0 indicates that access is denied.
• As the numbers of files and users increase, the
matrix becomes extremely large, sometimes too
large to store in main memory.
• A lot of space is wasted because many of the
entries are null.
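A bit-level version of the matrix can be sketched like this. The file and user names and the rights assigned to them are invented for the example and are not taken from Table 8.2.

```python
# One bit per right, in RWED order.
R, W, E, D = 0b1000, 0b0100, 0b0010, 0b0001

# Hypothetical matrix: one row per file, one column per user.
matrix = {
    "File 1": {"User 1": R | W | E | D, "User 2": R | E, "User 3": 0},
    "File 2": {"User 1": 0, "User 2": R | W, "User 3": E},
}


def allowed(file_name, user, right):
    """Check a single bit at the intersection of the file row and user column."""
    return bool(matrix[file_name][user] & right)


def as_letters(bits):
    """Render four bits as the RWED notation, with '-' for a denied right."""
    return "".join(
        letter if bits & mask else "-"
        for letter, mask in zip("RWED", (R, W, E, D))
    )
```

Every user needs an entry for every file, even when the entry is all zeros, which is where the wasted space comes from.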



Access Control Verification Module
Access Control Lists
• Some systems shorten the access control list even
more by putting every user into a category:
– SYSTEM or ADMIN
• System personnel who have unlimited access to all files
in the system.
– OWNER
• Has absolute control over all files created in the
owner’s account.
– GROUP
• All users belonging to the appropriate group have
access.
– WORLD
• All other users in system.
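Resolving a user to one of the four categories might look like the following sketch. The `file_info` layout, the user names, and the rights strings are hypothetical; real systems store this information in their own formats.

```python
def category_of(user, file_info, admins, groups):
    """Place a user in the first category that applies."""
    if user in admins:
        return "SYSTEM"
    if user == file_info["owner"]:
        return "OWNER"
    if user in groups.get(file_info["group"], set()):
        return "GROUP"
    return "WORLD"


def rights_for(user, file_info, admins, groups):
    """Look up the RWED string stored for the user's category."""
    return file_info["rights"][category_of(user, file_info, admins, groups)]


# Hypothetical file entry: only four rights strings are stored per file,
# regardless of how many users the system has.
payroll_file = {
    "owner": "ana",
    "group": "payroll",
    "rights": {"SYSTEM": "RWED", "OWNER": "RWED", "GROUP": "R-E-", "WORLD": "----"},
}
```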
Access Control Verification Module
Capability Lists
• A capability list shows every user and the files each
has access to.
• In operating systems such as Linux or UNIX, capability
lists can control access to devices as well as to files.
• Of the three access control methods, the most
commonly used is the access control list.
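An access control list is organized by file, while a capability list holds the same information organized by user. The sketch below converts one into the other; the sample data and the function name are hypothetical.

```python
# Hypothetical access control lists: file -> {user: rights}.
acl = {
    "File 1": {"User 1": "RWED", "User 2": "R-E-"},
    "File 2": {"User 2": "RW--"},
}


def to_capability_lists(access_control_lists):
    """Regroup per-file entries into per-user entries: user -> {file: rights}."""
    capabilities = {}
    for file_name, users in access_control_lists.items():
        for user, rights in users.items():
            capabilities.setdefault(user, {})[file_name] = rights
    return capabilities
```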



Data Compression
• Data compression algorithms fall into two types:
– Lossless algorithms, typically used for text or
arithmetic files.
• Retain all the data in the file throughout the
compression-decompression process.
– Lossy algorithms, typically used for image and sound
files.
• Remove data permanently, for example:
– Unwanted noise;
– Tones beyond a human’s ability to hear;
– Parts of the light spectrum that we can’t see.



Data Compression
Text Compression
• Three methods to compress text in a database:
– Records with repeated characters;
– Repeated terms;
– Front-end compression.



Data Compression
Text Compression
• Records with repeated characters:
– Data in a fixed-length field might include a short name
followed by many blank characters.
– This can be replaced with a variable-length field and a
special code to indicate how many blanks were
truncated.

ADAMSbbbbbbbbbb

Encoded looks like this:

ADAMSb10
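The encoding above can be reproduced with a short sketch. It follows the slide in using the letter b as the blank marker, and it assumes the stored data never ends in a lowercase b followed by digits; the function names are hypothetical.

```python
def encode_trailing_blanks(field):
    """Replace a run of trailing blanks with 'b' plus the number of blanks."""
    stripped = field.rstrip(" ")
    blanks = len(field) - len(stripped)
    return f"{stripped}b{blanks}" if blanks else stripped


def decode_trailing_blanks(encoded):
    """Rebuild the fixed-length field from the 'b<count>' suffix, if present."""
    text, marker, count = encoded.rpartition("b")
    if not marker or not count.isdigit():
        return encoded  # nothing was truncated
    return text + " " * int(count)
```

The 15-character field shrinks to 8 characters, and decoding restores it exactly, so the scheme is lossless.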



Data Compression
Text Compression
• Records with repeated characters:
– Likewise, numbers with many zeroes can be
shortened, with a code (#) to indicate how many
zeroes must be added to recreate the original
number.

300000000

Encoded looks like this:

3#8
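The same idea applied to trailing zeroes, as a minimal sketch. The numbers are handled as digit strings, and the function names are hypothetical.

```python
def encode_trailing_zeroes(digits):
    """Replace trailing zeroes with '#' plus the number of zeroes removed."""
    stripped = digits.rstrip("0")
    zeroes = len(digits) - len(stripped)
    if not stripped or zeroes == 0:
        return digits  # leave "0" and numbers without trailing zeroes alone
    return f"{stripped}#{zeroes}"


def decode_trailing_zeroes(encoded):
    """Add the counted zeroes back to recreate the original number."""
    digits, marker, count = encoded.partition("#")
    return digits + "0" * int(count) if marker else encoded
```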



Data Compression
Text Compression
• Repeated terms:
– Compressed by using symbols to represent each of
the most commonly used words in the database.
– In a university student database, common words such
as student, course, grade, and department could
each be represented with a single character.
– Of course, the system must be able to distinguish
between compressed and uncompressed data.
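A sketch of repeated-term substitution follows. The choice of control characters as symbols is an assumption: because they would not appear in ordinary text, they let the system tell compressed terms from uncompressed ones. The matching is deliberately simple (whole lowercase words separated by single spaces).

```python
# Assumed symbol table: each common word maps to one control character.
CODES = {"student": "\x01", "course": "\x02", "grade": "\x03", "department": "\x04"}
WORDS = {symbol: word for word, symbol in CODES.items()}


def compress(text):
    """Swap each common word for its one-character symbol."""
    return " ".join(CODES.get(word, word) for word in text.split(" "))


def expand(text):
    """Swap each symbol back for the word it stands for."""
    return " ".join(WORDS.get(word, word) for word in text.split(" "))
```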



Data Compression
Text Compression
• Front-end compression:
– Builds on the previous data element.
• A student database in which the students’ names are
kept in alphabetical order could be compressed
this way (Table 8.6).
– Each entry records how many leading characters it
shares with the previous entry, followed by the
characters that differ.
– Storage space is gained, but processing time is lost.
– The system must be able to distinguish between
compressed and uncompressed data.
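Front-end compression can be sketched as below. Each entry is stored as a (shared character count, remaining characters) pair rather than a single string, which is an assumption made to keep decoding unambiguous. The function names are hypothetical, and the names used to try it out are examples rather than a copy of Table 8.6.

```python
def front_compress(sorted_names):
    """Store each name as (characters shared with the previous name, rest)."""
    entries = []
    previous = ""
    for name in sorted_names:
        common = 0
        for a, b in zip(previous, name):
            if a != b:
                break
            common += 1
        entries.append((common, name[common:]))
        previous = name
    return entries


def front_expand(entries):
    """Rebuild each name from the previous one, which costs processing time."""
    names = []
    previous = ""
    for common, rest in entries:
        previous = previous[:common] + rest
        names.append(previous)
    return names
```

Note that expanding entry n requires expanding every entry before it, which is the processing cost mentioned above.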



Data Compression
Other Compression Schemes
• Lossy compression allows a loss of data from the
original file to allow significant compression.
– The compression process is irreversible as the
original file cannot be reconstructed.
• The specifics of the compression algorithm are
highly dependent on the type of file being
compressed.
– JPEG – a popular option for still images
– MPEG – for video images



Data Compression
Other Compression Schemes
• The International Organization for Standardization
(ISO) has issued MPEG standards that “are
international standards dealing with the
compression, decompression, processing, and
coded representation of moving pictures, audio, and
their combination.”
• ISO is the world’s leading developer of international
standards.



Summary
• The File Manager controls every file in the system
and processes the user’s commands to interact
with any file on the system:
– Read, write, modify, create, delete, etc.
• It also manages the access control procedures to
maintain the integrity and security of the files under
its control.
• The File Manager must be able to accommodate a
variety of file organizations, physical storage
allocation schemes, record types, and access
methods.



Summary (cont’d)
• In this chapter we discussed:
– Sequential, direct, and indexed sequential file
organization;
– Contiguous, noncontiguous, and indexed file storage
allocation;
– Fixed-length versus variable-length records;
– Three methods of access control;
– Data compression techniques.



Summary (cont’d)
• To get the most from a File Manager, it’s important
for users to realize:
– The strengths and weaknesses of its segments;
– Which access methods are allowed on which devices;
– With which record structures;
– The advantages and disadvantages of each in overall
efficiency.
