
● Files

● Directories
● Files are used to provide permanent storage for your data.
– Consider your music files which are only used when you want to play
them
● Files provide a way to store data that cannot fit in memory
– Databases consist of gigabytes of information
– Database server only loads necessary information into memory
● Files serve as a less complicated way to share information
between processes
– Direct memory access between processes is often restricted
– Files allow a way around this restriction.
● Because of the importance of files, operating systems provide
a file system.
● The file system
– dictates how your data is saved in secondary memory
– restricts who can access it
– determines how the user retrieves it
– sets many other policies that regulate file I/O and access.

File Concept
● The filename acts as an interface between the user and the
operating system with regard to files
– Users view files as information
– The operating system views files as a continuous stream of bytes
– Both refer to files by their filenames
● A user referring to a file via its filename is directed by the file
system to the relevant sequence of bytes that stores the
information the user wants.
● Different operating systems have different rules on
what constitutes a valid filename.
– MS-DOS (File Allocation Table file system) filenames have up to 8
characters plus up to 3 additional characters for the filename extension.
– Windows systems using the NTFS file system allow filenames of up to
255 characters in length
– UNIX filenames are case sensitive
● Operating systems consider letters and numbers to be valid
characters in filenames (e.g. atlas, doc1), but have different
rules on punctuation marks.
● For example, NTFS allows the exclamation point but does not
allow *, <, >, or | to be part of the filename.
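The naming rules above can be sketched as two small checks. This is only an illustration under stated assumptions: the function names are hypothetical, the forbidden-character set contains just the four characters named in the text (it is not the complete NTFS list), and the MS-DOS check looks only at the 8-plus-3 shape, not at which characters are legal.

```python
# Characters the text lists as forbidden in NTFS filenames (not the complete set).
NTFS_FORBIDDEN = set("*<>|")


def is_valid_dos_name(name: str) -> bool:
    """Check the MS-DOS 8.3 shape: up to 8 characters, one optional period,
    then up to 3 characters of extension."""
    base, dot, ext = name.partition(".")
    if not base or len(base) > 8 or len(ext) > 3:
        return False
    if "." in ext:  # a second period is not allowed in this scheme
        return False
    return True


def is_valid_ntfs_name(name: str) -> bool:
    """Check the length limit and the forbidden characters named in the text."""
    return 0 < len(name) <= 255 and not (set(name) & NTFS_FORBIDDEN)


for candidate in ["Myresume.pdf", "MyResume2024.pdf", "hello!.txt", "a|b.txt"]:
    print(candidate, is_valid_dos_name(candidate), is_valid_ntfs_name(candidate))
```
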
● Filenames often include a filename extension.
● This filename extension indicates additional information about
that file, such as if the file is executable or what application is
used to open the file.
● The following are some typical filename extensions:
– .exe – executable file
– .java – Java source file
– .class – Java class file
– .txt – text file
● The file's type determines how the operating system is to
handle the file.
● Windows, for example, considers two distinct file types
– an executable file in binary format
– a data file in whatever format specified by the application that is used
to interpret that file.
● Executable files are run directly by the operating system.
● Data files, on the other hand, are interpreted by an
assigned application
– text editors for .txt files
– image editors for .gif and .jpeg
– the Java Virtual Machine for .class files
● What users commonly think of as file types (text files as .txt,
image files as .gif or .jpeg) are all considered by the
Windows operating system to be data files.
● UNIX based systems consider executable files and data files
to be regular files.
● The directory file type is used by the OS to describe
directories.
● Devices in UNIX are represented as special files.
– character special files represent character based I/O devices such as
keyboards
– block special files represent devices which store data in blocks such as
hard disks.
● In MS-DOS, the period character can be used only once as it
is used to separate the filename from the filename extension
(ex. Myresume.pdf).
● In NTFS, the period can be used multiple times, but the
filename extension is considered to be the characters that
come after the last period.
● UNIX systems also allow for file extensions but do not enforce
them. A text file in UNIX does not have to end with a .txt, while
an executable file does not have to have a .exe extension.
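The "characters after the last period" rule can be tried out with Python's standard os.path.splitext. This is a minimal sketch: the helper name extension and the sample filenames are made up for illustration.

```python
import os.path


def extension(filename: str) -> str:
    """Return the characters after the last period, or '' if there is none."""
    _root, ext = os.path.splitext(filename)
    return ext[1:]  # drop the leading period


for name in ["Myresume.pdf", "backup.2024.tar.gz", "README"]:
    print(name, "->", repr(extension(name)))
```

A name without any period, such as README, simply has an empty extension, which matches the UNIX view that extensions are optional.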

File Structure
● For the operating system to properly interpret a file, the file must
conform to a specified structure.
● For example, the following shows the header structure of an
MS-DOS executable file
– All executable files should start with this 64-byte sequence or the
OS will not recognize them as executable
[Figure: (a) An executable file (b) An archive]
typedef struct _IMAGE_DOS_HEADER {  // DOS .EXE header
    USHORT e_magic;      // Magic number
    USHORT e_cblp;       // Bytes on last page of file
    USHORT e_cp;         // Pages in file
    USHORT e_crlc;       // Relocations
    USHORT e_cparhdr;    // Size of header in paragraphs
    USHORT e_minalloc;   // Minimum extra paragraphs needed
    USHORT e_maxalloc;   // Maximum extra paragraphs needed
    USHORT e_ss;         // Initial (relative) SS value
    USHORT e_sp;         // Initial SP value
    USHORT e_csum;       // Checksum
    USHORT e_ip;         // Initial IP value
    USHORT e_cs;         // Initial (relative) CS value
    USHORT e_lfarlc;     // File address of relocation table
    USHORT e_ovno;       // Overlay number
    USHORT e_res[4];     // Reserved words
    USHORT e_oemid;      // OEM identifier (for e_oeminfo)
    USHORT e_oeminfo;    // OEM information; e_oemid specific
    USHORT e_res2[10];   // Reserved words
    LONG   e_lfanew;     // File address of new exe header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;
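To see how a program might check this structure, the sketch below unpacks the 64 bytes with Python's struct module. It rests on a few assumptions: the value 0x5A4D (the bytes "MZ") for e_magic is the well-known DOS magic number and is not stated in the text, the function name parse_dos_header is hypothetical, and the header is built synthetically rather than read from a real .exe file.

```python
import struct

# 30 unsigned 16-bit words followed by one signed 32-bit value, little-endian:
# the same 64-byte layout as the IMAGE_DOS_HEADER structure.
DOS_HEADER = struct.Struct("<30Hl")
MZ_MAGIC = 0x5A4D  # the bytes "MZ" read as a little-endian 16-bit word


def parse_dos_header(data: bytes):
    """Return (e_magic, e_lfanew), or raise ValueError if this is not an MZ file."""
    if len(data) < DOS_HEADER.size:
        raise ValueError("file too short for a DOS header")
    fields = DOS_HEADER.unpack_from(data)
    e_magic, e_lfanew = fields[0], fields[-1]
    if e_magic != MZ_MAGIC:
        raise ValueError("missing MZ magic number")
    return e_magic, e_lfanew


# Synthetic header for the demonstration: magic, 29 zeroed words, e_lfanew = 0x80.
fake = DOS_HEADER.pack(MZ_MAGIC, *([0] * 29), 0x80)
print(DOS_HEADER.size, parse_dos_header(fake))
```
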
● For practical purposes, operating systems do not define any
file structures aside from those files that are used directly by
the operating system.
– Doing this reduces the complexity of the operating system by leaving
interpretation of the file to the applications that request for the file.
● Any file structure is determined by the application
programmers.
– The file system acts as nothing more than an entity that sends and
receives a stream of bytes and does not have to check for proper
structure.
● UNIX, for example, aside from its special files, considers all
other files to be nothing more than a continuous stream of
bytes.
● The Windows operating system only describes a format for
executable files. The only structure required for data files is
that they are divided into blocks 512 bytes long.
● There are two ways to access files:
– Sequential access
– Random (direct) access
● Sequential access requires reading through everything stored
before the required data in order to get to it
– Consider a video tape: to get to a particular scene, you would have to
go through all of the scenes before it.
● Sequential access still has its use today as a way to backup
data.
● Backups, more often than not, would be read sequentially
from start to finish in order to replace the original file which
was damaged in some way.
● Also, it is currently cost-effective to store large backups in tape
drives, which are sequential by nature.
● Direct access files allow immediate access to the required data
– Consider how you can jump to the scene you want on a DVD
● Direct access is very advantageous for large files.
● Consider a phone book database.
– A sequential access phone book would require searching through all
names A to Y to get to a name starting with letter Z.
– A direct access file can immediately go to the target name.
● Most operating systems have random access as the default file
access.
● Sequential Access
– read next
– write next
– reset
– no read after last write (rewrite)
● Direct Access – file is fixed length logical records
– read n
– write n
– position to n
– read next
– write next
– rewrite n
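The difference between the two operation sets can be sketched with fixed-length records in an ordinary file. The record size, the file name phonebook.dat and the function names are all assumptions made for this illustration.

```python
import os
import tempfile

RECORD_SIZE = 16  # fixed-length logical records, padded with spaces


def write_records(path, names):
    with open(path, "wb") as f:
        for name in names:  # "write next"
            f.write(name.encode("ascii").ljust(RECORD_SIZE))


def read_sequential(path, n):
    """Reach record n by reading every record stored before it."""
    with open(path, "rb") as f:  # opening the file acts as "reset"
        for _ in range(n):
            f.read(RECORD_SIZE)  # "read next", result discarded
        return f.read(RECORD_SIZE).decode("ascii").rstrip()


def read_direct(path, n):
    """Reach record n by positioning straight to its byte offset."""
    with open(path, "rb") as f:
        f.seek(n * RECORD_SIZE)  # "position to n"
        return f.read(RECORD_SIZE).decode("ascii").rstrip()


def demo():
    names = ["Adams", "Baker", "Young", "Zimmer"]
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "phonebook.dat")
        write_records(path, names)
        return read_sequential(path, 3), read_direct(path, 3)


print(demo())
```

Both calls return the same record. The sequential version performs n extra reads to get there, which is the cost the phone book example describes.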
● In addition to the filename and the data the file contains, the
operating system keeps track of additional data about the file
which are called file attributes.
● Name – only information kept in human-readable form
● Identifier – unique tag (number) identifies file within file system
● Type – needed for systems that support different types
● Location – pointer to file location on device
● Size – current file size
● Protection – controls who can do reading, writing, executing
● Time, date, and user identification – data for protection, security,
and usage monitoring
● Information about files is kept in the directory structure, which is
maintained on the disk
● Many variations, including extended file attributes such as file
checksum
● One particular file attribute is the access control list frequently
found on UNIX systems
– As was discussed in an earlier chapter, the access control list
determines whether a file can be read, written, or executed by the file
owner, the owner's group, or other users
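On a UNIX-like system, several of these attributes can be inspected through the stat interface. The sketch below assumes a POSIX platform. The helper name describe, the file name notes.txt and the chosen permission mode are illustrative only.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Collect a few of the attributes the operating system keeps for a file."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "identifier": info.st_ino,                  # inode number on UNIX-like systems
        "size": info.st_size,                       # current size in bytes
        "protection": stat.filemode(info.st_mode),  # e.g. '-rw-r--r--'
        "owner_uid": info.st_uid,
        "modified": time.ctime(info.st_mtime),
    }


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "notes.txt")
        with open(path, "w") as f:
            f.write("hello")
        os.chmod(path, 0o640)  # owner: read/write, group: read, others: none
        return describe(path)


print(demo())
```
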
● A file is an abstract data type with the following operations:
● Create
● Write – at write pointer location
● Read – at read pointer location
● Reposition within file - seek
● Delete
● Truncate
● Open(Fi) – search the directory structure on disk for entry Fi, and
move the content of entry to memory
● Close (Fi) – move the content of entry Fi in memory to directory
structure on disk
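The operations listed above map closely onto the low-level calls in Python's os module. This is a minimal sketch of that mapping, with an arbitrary file name and arbitrary contents.

```python
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.bin")
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)  # create and open
        os.write(fd, b"hello world")   # write at the current file pointer
        os.lseek(fd, 6, os.SEEK_SET)   # reposition within the file (seek)
        tail = os.read(fd, 5)          # read at the current file pointer
        os.ftruncate(fd, 5)            # truncate the file to 5 bytes
        size = os.fstat(fd).st_size
        os.close(fd)                   # close
        os.remove(path)                # delete
        return tail, size, os.path.exists(path)


print(demo())
```
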
● Several pieces of data are needed to manage open files:
– Open-file table: tracks open files
– File pointer: pointer to last read/write location, per process that has the
file open
– File-open count: counter of the number of times a file is open – to allow
removal of data from the open-file table when the last process closes it
– Disk location of the file: cache of data access information
– Access rights: per-process access mode information
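The bookkeeping can be pictured with a toy model. The sketch below is not how any real kernel implements it: the class and method names are hypothetical, and only the open count and the per-process file pointer are modelled.

```python
class OpenFileTable:
    """Toy model: one open count per file, one file pointer per (process, file)."""

    def __init__(self):
        self.open_count = {}  # filename -> number of current opens
        self.pointers = {}    # (pid, filename) -> current offset

    def open(self, pid, name):
        self.open_count[name] = self.open_count.get(name, 0) + 1
        self.pointers[(pid, name)] = 0

    def read(self, pid, name, nbytes):
        # Each process advances only its own pointer.
        self.pointers[(pid, name)] += nbytes
        return self.pointers[(pid, name)]

    def close(self, pid, name):
        del self.pointers[(pid, name)]
        self.open_count[name] -= 1
        if self.open_count[name] == 0:  # last close removes the table entry
            del self.open_count[name]


table = OpenFileTable()
table.open(1, "report.txt")
table.open(2, "report.txt")
table.read(1, "report.txt", 100)
print(table.open_count, table.pointers)
table.close(1, "report.txt")
table.close(2, "report.txt")
print(table.open_count)
```
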
● File locking is provided by some operating systems and file systems
– Similar to reader-writer locks
– Shared lock similar to reader lock – several processes can acquire
concurrently
– Exclusive lock similar to writer lock
● Mediates access to a file
● Mandatory or advisory:
– Mandatory – access is denied depending on locks held and requested
– Advisory – processes can find status of locks and decide what to do
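Advisory locking can be tried on UNIX-like systems through Python's fcntl.flock. This is a sketch under the assumption of a Linux-style flock, where two separate opens of the same file conflict even inside one process. The file name is arbitrary.

```python
import fcntl
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "shared.txt")
        with open(path, "w") as first, open(path, "r") as second:
            fcntl.flock(first, fcntl.LOCK_EX)  # exclusive (writer-style) lock
            try:
                # Non-blocking request for a shared (reader-style) lock.
                fcntl.flock(second, fcntl.LOCK_SH | fcntl.LOCK_NB)
                blocked = False
            except OSError:
                blocked = True                 # the lock is held through 'first'
            second.read()                      # advisory: an unlocked read still works
            fcntl.flock(first, fcntl.LOCK_UN)  # release the exclusive lock
            fcntl.flock(second, fcntl.LOCK_SH | fcntl.LOCK_NB)  # now succeeds
            return blocked


print(demo())
```

The plain read in the middle goes through even though the lock request was refused, which is what makes the lock advisory: cooperating processes have to check the lock themselves.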
Directories
● Directories were created to keep files organized
– Most users have a documents directory, with subdirectories for
fine-grain organization such as per subject or task
– Most users would also have a music directory where music files are
categorized into subdirectories per artist and/or album
● Directories are often called folders
– Similar files are gathered into the same folder, both in the real world
and in the computer
● In the early days of computers, users had a single place to
store files

● Problems
– A single user already has difficulty keeping track of his or her own
files; sharing one space with other users makes this worse
– Common file names collide (MyResume.txt for example)
● To solve these problems, schemes were developed to keep files
organized
– Append the username to the filename
– alice_myresume.txt is different from bob_myresume.txt
● These, however, lead to long filenames which are
unwieldy to use
● Directories first came out to provide a separation between files
from different users.
● With user directories, computer users do not have to worry
about other people having the same file names.
● Having user directories also adds a level of security: users are
not allowed to access the contents of other user directories.
– However, a secure user directory may be a hindrance to users who
want to share files with other users.
– A solution to this is to have a directory with open access to everyone,
or to a select number of users.
● An additional benefit is being able to place disk quotas on
user directories.
● The system administrator, to prevent abuse of the system, can
place a maximum limit on the file space allocated to the user.
● A user would have to request the system administrator directly
for additional space, which may or may not be granted.
● Files belonging to an application should also be placed in their
own directory to prevent accidental overwriting by users.
● To access these files a system must be devised wherein a
user would be able to access files belonging to another
directory.
● There are different notations used to separate directory names
from filenames.
● UNIX systems use the / character (e.g. /alice/newdoc.txt)
● Windows systems, on the other hand, use the \ character
(\alice\newdoc.txt),
– which leads to problems because \ is treated as an escape
character in most programming languages
– (e.g. \n means new line).
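The backslash problem is easy to reproduce. The sketch below uses Python, where both \a and \n happen to be escape sequences, and shows two common ways around it. The pathlib classes are used only to split the example paths and do not touch the disk.

```python
from pathlib import PurePosixPath, PureWindowsPath

escaped = "\alice\newdoc.txt"    # \a and \n are interpreted as escape sequences
raw = r"\alice\newdoc.txt"       # a raw string keeps the backslashes literally
doubled = "\\alice\\newdoc.txt"  # doubling each backslash also works

print(repr(escaped))
print(raw == doubled, raw == escaped)

# Each path flavour splits on its own separator.
print(PureWindowsPath(raw).name)
print(PurePosixPath("/alice/newdoc.txt").name)
```
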
● The hierarchical directory system is the natural extension of the
two-level directory system.
● By allowing directories to contain directories, users can now
organize their individual files.
● The base level of the hierarchical directory system is what is
referred to as the root directory.
● Since the graph of the hierarchical directory system is a tree,
each file has a unique path towards the root.
● The listing of all the directories from the root to the file is its
path name.
● For example, if hello.txt is stored in alice's home directory
/home/alice, then its pathname is /home/alice/hello.txt.
● A point of distinction must be made between the
Windows operating system and UNIX operating systems.
● The highest level of the UNIX file system is a single root
directory, while a Windows system seemingly has a top level
directory for each hard disk or data device on the system (also
called drives).
● The UNIX file system does not have drives.
– Each hard disk or data device is mounted as a directory in the file
system (often in the /mnt directory).
– This provides a seamless single file system view.
● The Windows system still follows the single root directory
system if you consider each drive to be a directory in a virtual
root directory.
● There are two ways to refer to a file: using its absolute path
name or its relative path name.
– The absolute path name is the file's path all the way from the root.
– The relative path is the path starting from the current active directory.
● For example, if alice stored her MyResume.pdf in a
subdirectory called documents in her home directory then
– If the current directory is /home/alice then the relative pathname of her
file is documents/MyResume.pdf
– The absolute pathname of her file is
/home/alice/documents/MyResume.pdf
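The relationship between the two kinds of path name can be checked with Python's posixpath module, which works on the path strings only and never looks at the disk. The variable names are illustrative.

```python
import posixpath

home = "/home/alice"                 # the current directory in the example
relative = "documents/MyResume.pdf"

# Joining the current directory and the relative path gives the absolute path.
absolute = posixpath.join(home, relative)
# Going the other way recovers the relative path.
back = posixpath.relpath(absolute, start=home)

print(absolute, posixpath.isabs(absolute))
print(back, posixpath.isabs(back))
```
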
● Some systems allow for a special subdirectory found in all
directories that references the parent directory.
● In both Windows and UNIX systems this is called the ..
(double dot) directory.
● For example, if the active directory is /home/bob/music, then
you can refer to Alice's MyResume.pdf via the relative
pathname ../../alice/documents/MyResume.pdf.
– The first .. goes up to /home/bob.
– The next .. goes up to /home
– And alice/documents goes down the tree to MyResume.pdf
● Another special directory is the . (dot) directory, which is a
reference for the current directory
● This is used as a shortcut to refer to the current directory, as
for example with copying files
● For example, the following command copies Alice's
MyResume.pdf to the current directory.
– cp /home/alice/documents/MyResume.pdf .
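How . and .. resolve against the active directory can be sketched with posixpath.normpath. Note the assumption: normpath works purely on the text of the path, so it ignores any links that might exist on a real system. The helper name resolve and the file song.mp3 are made up for the example.

```python
import posixpath


def resolve(current_dir: str, path: str) -> str:
    """Combine the active directory with a relative path and collapse . and .. parts."""
    return posixpath.normpath(posixpath.join(current_dir, path))


print(resolve("/home/bob/music", "../../alice/documents/MyResume.pdf"))
print(resolve("/home/bob/music", "."))
print(resolve("/home/bob/music", "./song.mp3"))
```
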
● UNIX systems allow for the creation of special kinds of files
that serve as a link between files.
● There are two kinds of links, symbolic links (also called
symlinks) and hard links.
● Both kinds of links are used to refer to files or directories
elsewhere in the directory tree (most UNIX systems allow only
symbolic links to point to directories).
● Since links create a connection between formerly disjoint parts
of the file tree, our directory tree becomes a general graph
directory
– Trees do not allow cycles, which links can create
● Links are one way by which users can share files
● A symlink only serves as a way to refer to the file via another
pathname
– Deleting the symlink only deletes the link and not the file itself
● Deleting a hard link also means deleting the file it is pointing
to, if the file no longer has other hard links pointing to it.
● To create a hard link, you can use the ln command
– ln <filename> <linkname>
● For example, if you want to create a hard link to Alice's
MyResume.pdf, you can run the command
– ln /home/alice/documents/MyResume.pdf mylink.pdf
● To create a symlink, simply run ln with the -s option
– ln -s /home/alice/documents/MyResume.pdf mylink.pdf
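The deletion behaviour of the two kinds of links can be observed with os.link and os.symlink on a UNIX-like system. This is a sketch in a temporary directory, and the link names are arbitrary.

```python
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "MyResume.pdf")
        hard = os.path.join(d, "hardlink.pdf")
        soft = os.path.join(d, "symlink.pdf")
        with open(original, "w") as f:
            f.write("resume")

        os.link(original, hard)     # hard link: a second name for the same file
        os.symlink(original, soft)  # symlink: a small file holding the pathname
        links_before = os.stat(original).st_nlink

        os.remove(soft)             # removing the symlink leaves the file intact
        still_there = os.path.exists(original)

        os.remove(original)         # one name gone; the data survives via 'hard'
        with open(hard) as f:
            content = f.read()
        links_after = os.stat(hard).st_nlink
        return links_before, still_there, content, links_after


print(demo())
```

The link count drops from 2 to 1 when the original name is removed. Only when it reaches 0 is the file itself deleted.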
● Two different names (aliasing)
● If dict deletes list ⇒ dangling pointer
● Solutions:
– Backpointers, so we can delete all pointers
(variable size records a problem)
– Backpointers using a daisy chain organization
– Entry-hold-count solution
● New directory entry type
– Link – another name (pointer) to an existing file
– Resolve the link – follow pointer to locate the file
● Allowing directories also means providing commands for the
user to manipulate directories. Here are the common
directory-related commands.
– Create directory
– Delete directory
– Rename directory
– Change current directory
● An issue with deletion is how to handle deletion of non-empty
directories
– MS-DOS does not allow the deletion of non-empty directories.
– Windows, particularly with the GUI interface, automatically deletes
directory contents.
– UNIX's rmdir command deletes only empty directories. Its delete
command (rm) provides a -r (recursive) option to delete a directory
together with its contents and any subdirectories inside, and a -f
(force) option to do so without prompting.
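The directory commands, and the non-empty deletion issue, can be sketched with Python's os and shutil modules. The directory and file names are arbitrary. Changing the current directory (os.chdir) is left out so the example does not disturb the running process.

```python
import os
import shutil
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as base:
        docs = os.path.join(base, "docs")
        os.mkdir(docs)                # create directory
        papers = os.path.join(base, "papers")
        os.rename(docs, papers)       # rename directory
        with open(os.path.join(papers, "a.txt"), "w") as f:
            f.write("x")
        try:
            os.rmdir(papers)          # plain delete refuses a non-empty directory
            refused = False
        except OSError:
            refused = True
        shutil.rmtree(papers)         # recursive delete of directory and contents
        return refused, os.path.exists(papers)


print(demo())
```
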
● A file system must be mounted before it can be available to
processes on the system

● An unmounted file system is mounted at a mount point

Operating System Concepts

