
Lecture 5

Chapter 5 of Robbins Book

Files and Directories

BIL 244 – System Programming


UNIX File System Navigation

• Operating systems organize physical disks into file systems to
provide high-level logical access to the actual bytes of a file.
• A file system is a collection of files and attributes such as location
and name. Instead of specifying the physical location of a file on
disk, an application specifies a filename and an offset. The
operating system makes a translation to the location of the physical
file through its file systems.
• A directory is a file containing directory entries that associate a
filename with the physical location of a file on disk.
• When disks were small, a simple table of filenames and their
positions was a sufficient representation for the directory. Larger
disks require a more flexible organization, and most file systems
organize their directories in a tree structure. This representation
arises quite naturally when the directories themselves are files.



UNIX File System Navigation

• The absolute or fully qualified pathname specifies all of
the nodes in the file system tree on the path from the root
to the file itself. The absolute path starts with a slash (/) to
designate the root node and then lists the names of the
nodes down the path to the file within the file system tree.



The current working directory

• A program does not always have to specify files by fully
qualified pathnames. At any time, each process has an associated
directory, called the current working directory, that it uses for
pathname resolution.
• If a pathname does not start with /, it is resolved as if the
fully qualified path of the current working directory were prepended. Hence,
pathnames that do not begin with / are sometimes called relative
pathnames because they are specified relative to the fully
qualified pathname of the current directory.
• A dot (.) specifies the current directory, and a dot-dot (..)
specifies the directory above the current directory.
• The root directory has both dot and dot-dot pointing to itself.



The current working directory

• The PWD environment variable specifies the current working directory of a
process. Do not directly change this variable, but rather use the getcwd
function to retrieve the current working directory and use the chdir function
to change the current working directory within a process.
• The chdir function causes the directory specified by path to become the
current working directory for the calling process.
• The getcwd function returns the pathname of the current working directory.
The buf parameter of getcwd represents a user-supplied buffer for holding
the pathname of the current working directory. The size parameter specifies
the maximum length pathname that buf can accommodate, including the
trailing string terminator.
#include <unistd.h>

int chdir(const char *path);
char *getcwd(char *buf, size_t size);
• If successful, chdir returns 0. If unsuccessful, chdir returns –1 and sets
errno. If successful, getcwd returns a pointer to buf. If unsuccessful,
getcwd returns NULL and sets errno.
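The two calls are often used together. The following is a minimal sketch, not taken from the slides: change_and_report is a hypothetical helper name, and the PATH_MAX fallback mirrors the assumption made elsewhere in this lecture.

```c
#include <limits.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 255   /* assumed fallback when limits.h omits PATH_MAX */
#endif

/* Hypothetical helper: make dir the current working directory, then
   store the resulting fully qualified pathname in buf.
   Returns 0 on success, -1 on failure (chdir or getcwd has set errno). */
int change_and_report(const char *dir, char *buf, size_t size) {
    if (chdir(dir) == -1)
        return -1;
    if (getcwd(buf, size) == NULL)
        return -1;
    return 0;
}
```

Calling change_and_report("/", buf, sizeof(buf)) and then change_and_report("..", buf, sizeof(buf)) leaves "/" in buf both times, since dot-dot in the root directory refers to the root itself.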
The current working directory

• If buf is not NULL, getcwd copies the name into buf. If buf is
NULL, POSIX states that the behavior of getcwd is undefined.
• In some implementations, getcwd uses malloc to create a buffer
to hold the pathname. (Do not rely on this behavior!)
• You should always supply getcwd with a buffer large enough to
fit a string containing the pathname.
• The PATH_MAX constant may or may not be defined in limits.h.
The optional POSIX constants can be omitted from limits.h if
their values are indeterminate but larger than the required POSIX
minimum. For PATH_MAX, the _POSIX_PATH_MAX constant
specifies that an implementation must accommodate pathname
lengths of at least 255.
• A vendor might allow PATH_MAX to depend on the amount of
available memory space on a specific instance of a specific
implementation.
A complete program to output the current working directory

#include <limits.h>
#include <stdio.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 255
#endif

int main(void) {
    char mycwd[PATH_MAX];

    if (getcwd(mycwd, PATH_MAX) == NULL) {
        perror("Failed to get current working directory");
        return 1;
    }
    printf("Current working directory: %s\n", mycwd);
    return 0;
}
The current working directory

• A more flexible approach uses the pathconf function to determine the real
value for the maximum path length at run time. The pathconf function is
one of a family of functions that allows a program to determine system and
runtime limits in a platform-independent way.
• The sysconf function takes a single argument, which is the name of a
configurable systemwide limit such as the number of clock ticks per second
(_SC_CLK_TCK) or the maximum number of processes allowed per user
(_SC_CHILD_MAX).
• The pathconf and fpathconf functions report limits associated with a
particular file or directory.
• The fpathconf function takes a file descriptor and a limit designator as
parameters, so the file must be opened before a call to fpathconf.
• The pathconf function takes a pathname and a limit designator as
parameters, so it can be called without the program actually opening the file.
• The sysconf function returns the current value of a configurable system
limit that is not associated with files. Its name parameter designates the limit.



The current working directory

#include <unistd.h>

long fpathconf(int fildes, int name);
long pathconf(const char *path, int name);
long sysconf(int name);

• If successful, these functions return the value of
the limit. If unsuccessful, these functions return –1
and set errno.
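A return value of -1 does not always signal an error: POSIX also returns -1, leaving errno unchanged, when a limit is indeterminate. The sketch below clears errno first to tell the two cases apart. The helper name show_limit is an assumption for illustration only.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: print one systemwide limit and return its value.
   A return of -1 means either an error (errno set) or no definite limit. */
long show_limit(const char *label, int name) {
    long value;

    errno = 0;                        /* so an error can be told from "no limit" */
    value = sysconf(name);
    if (value == -1 && errno != 0)
        perror(label);
    else if (value == -1)
        printf("%s: no definite limit\n", label);
    else
        printf("%s: %ld\n", label, value);
    return value;
}
```

For example, show_limit("clock ticks per second", _SC_CLK_TCK) and show_limit("processes per user", _SC_CHILD_MAX) report the two limits mentioned above.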



A program that uses pathconf to output the current working directory

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    long maxpath;
    char *mycwdp;

    if ((maxpath = pathconf(".", _PC_PATH_MAX)) == -1) {
        perror("Failed to determine the pathname length");
        return 1;
    }
    if ((mycwdp = (char *) malloc(maxpath)) == NULL) {
        perror("Failed to allocate space for pathname");
        return 1;
    }
    if (getcwd(mycwdp, maxpath) == NULL) {
        perror("Failed to get current working directory");
        free(mycwdp);
        return 1;
    }
    printf("Current working directory: %s\n", mycwdp);
    free(mycwdp);   /* release the buffer allocated above */
    return 0;
}
Directory Access

• Directories should not be accessed with the ordinary open, close
and read functions. Instead, they require specialized functions
whose corresponding names end with "dir": opendir, closedir
and readdir.
• The opendir function provides a handle of type DIR * to a
directory stream that is positioned at the first entry in the
directory.
• The readdir function reads a directory by returning successive
entries in a directory stream pointed to by dirp. It returns a
pointer to a struct dirent structure containing information
about the next directory entry and moves the stream to the next
position after each call. At the end of the directory, readdir returns NULL.
• The closedir function closes a directory stream, and the
rewinddir function repositions the directory stream at its
beginning. Each function has a dirp parameter that corresponds
to an open directory stream.



Directory Access

• opendir provides a handle for the other functions.
• readdir gets the next entry in the directory.
• rewinddir restarts from the beginning.
• closedir closes the handle.
Note that, like strtok, these functions are not reentrant.

#include <dirent.h>

DIR *opendir(const char *filename);
struct dirent *readdir(DIR *dirp);
void rewinddir(DIR *dirp);
int closedir(DIR *dirp);
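rewinddir makes a second pass over a directory possible without reopening it. The sketch below counts the entries first and then lists them. The names count_entries and list_with_count are hypothetical, and error handling is reduced to a -1 return.

```c
#include <dirent.h>
#include <stdio.h>

/* Count the entries that remain in an open directory stream. */
static int count_entries(DIR *dirp) {
    int n = 0;

    while (readdir(dirp) != NULL)
        n++;
    return n;
}

/* Make two passes over dirname with a single opendir call.
   Returns the number of entries found, or -1 if dirname cannot be opened. */
int list_with_count(const char *dirname) {
    struct dirent *direntp;
    DIR *dirp;
    int n;

    if ((dirp = opendir(dirname)) == NULL)
        return -1;
    n = count_entries(dirp);                    /* stream is now at the end */
    printf("%s has %d entries:\n", dirname, n);
    rewinddir(dirp);                            /* back to the first entry */
    while ((direntp = readdir(dirp)) != NULL)   /* second pass */
        printf("  %s\n", direntp->d_name);
    closedir(dirp);
    return n;
}
```

On typical systems the count includes the dot and dot-dot entries.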



A program to list files in a directory.

#include <dirent.h>
#include <errno.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    struct dirent *direntp;
    DIR *dirp;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s directory_name\n", argv[0]);
        return 1;
    }
    if ((dirp = opendir(argv[1])) == NULL) {
        perror("Failed to open directory");
        return 1;
    }
    while ((direntp = readdir(dirp)) != NULL)
        printf("%s\n", direntp->d_name);
    while ((closedir(dirp) == -1) && (errno == EINTR)) ;
    return 0;
}
Accessing file status information

• This section describes three functions for retrieving file status
information. The fstat function accesses a file with an open
file descriptor. The stat and lstat functions access a file by
name.

#include <sys/stat.h>

int lstat(const char *restrict path, struct stat *restrict buf);
int stat(const char *restrict path, struct stat *restrict buf);
int fstat(int fildes, struct stat *buf);

• stat is given the name of a file.
• fstat is used for open files.
• lstat does the same thing as stat except that if the file is a
symbolic link, it gives information about the link, rather than
the file it is linked to.
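The difference between stat and lstat shows up when testing file types. The sketch below uses two hypothetical helpers, is_symlink and resolves_to_directory, and assumes the standard S_ISLNK and S_ISDIR macros from sys/stat.h. The _XOPEN_SOURCE definition is an assumption that makes lstat visible on strictly conforming compilers.

```c
#define _XOPEN_SOURCE 700   /* assumed: exposes lstat and S_ISLNK under strict modes */
#include <sys/stat.h>

/* Returns 1 if path itself is a symbolic link, 0 if not, -1 on error.
   lstat examines the link rather than the file it names. */
int is_symlink(const char *path) {
    struct stat statbuf;

    if (lstat(path, &statbuf) == -1)
        return -1;
    return S_ISLNK(statbuf.st_mode) ? 1 : 0;
}

/* Returns 1 if path, after following any links, is a directory,
   0 if not, -1 on error. stat follows symbolic links. */
int resolves_to_directory(const char *path) {
    struct stat statbuf;

    if (stat(path, &statbuf) == -1)
        return -1;
    return S_ISDIR(statbuf.st_mode) ? 1 : 0;
}
```

For a symbolic link that names a directory, both functions would return 1: lstat sees the link, and stat sees the directory behind it.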
Accessing file status information

• The contents of the struct stat are system dependent, but
the standard says that it must contain at least the following
fields:
dev_t st_dev; /* device ID of device containing file */
ino_t st_ino; /* file serial number */
mode_t st_mode; /* file mode */
nlink_t st_nlink; /* number of hard links */
uid_t st_uid; /* user ID of file */
gid_t st_gid; /* group ID of file */
off_t st_size; /* file size in bytes (regular files) */
/* path size (symbolic links) */
time_t st_atime; /* time of last access */
time_t st_mtime; /* time of last data modification */
time_t st_ctime; /* time of last file status change */
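A short sketch of reading several of these fields follows. The function name printstatus is hypothetical, and the casts are an assumption made because the exact integer types behind ino_t, nlink_t, uid_t and off_t vary between systems.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: print a few struct stat fields for path.
   Returns 0 on success, -1 if stat fails. */
int printstatus(const char *path) {
    struct stat statbuf;

    if (stat(path, &statbuf) == -1) {
        perror("Failed to get file status");
        return -1;
    }
    /* Cast to a known width before printing; the underlying types vary. */
    printf("inode number: %lu\n", (unsigned long) statbuf.st_ino);
    printf("hard links:   %lu\n", (unsigned long) statbuf.st_nlink);
    printf("owner uid:    %ld\n", (long) statbuf.st_uid);
    printf("size:         %lld bytes\n", (long long) statbuf.st_size);
    printf("permissions:  %lo\n", (unsigned long) (statbuf.st_mode & 07777));
    return 0;
}
```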



The following function displays the time that the file path was last accessed

#include <stdio.h>
#include <time.h>
#include <sys/stat.h>

void printaccess(char *path) {
    struct stat statbuf;

    if (stat(path, &statbuf) == -1)
        perror("Failed to get file status");
    else
        printf("%s last accessed at %s", path, ctime(&statbuf.st_atime));
}
The isdirectory function returns true (nonzero) if path is a directory, and false (0) otherwise.

#include <stdio.h>
#include <time.h>
#include <sys/stat.h>

int isdirectory(char *path) {
    struct stat statbuf;

    if (stat(path, &statbuf) == -1)
        return 0;
    else
        return S_ISDIR(statbuf.st_mode);
}
Unix File System Implementation

Structure of a typical UNIX file system

• Disk formatting divides a physical disk into regions called partitions.
• Each partition can have its own file system associated with it. A particular
file system can be mounted at any node in the tree of another file system.
• The topmost node in a file system is called the root of the file system.
• The root directory of a process (denoted by /) is the topmost directory that
the process can access.
• All fully qualified paths in UNIX start from the root directory /.



UNIX file implementation

• POSIX does not mandate any particular representation of files on
disk, but traditionally UNIX files have been implemented with a
modified tree structure.
• Directory entries contain a filename and a reference to a
fixed-length structure called an inode.
• The inode contains information about the file size, the file
location, the owner of the file, the time of creation, time of last
access, time of last modification, permissions and so on.
• In addition to descriptive information about the file, the inode
contains pointers to the first few data blocks of the file. If the file
is large, the indirect pointer is a pointer to a block of pointers that
point to additional data blocks. If the file is still larger, the double
indirect pointer is a pointer to a block of indirect pointers. If the
file is really huge, the triple indirect pointer contains a pointer to a
block of double indirect pointers.
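The pointer scheme determines how large a file can grow. The sketch below works through the arithmetic for one inode with direct, single, double and triple indirect pointers. The function name and the sample layout (10 direct pointers, 8192-byte blocks, 4-byte block pointers) are assumptions chosen for illustration; the slides do not specify them.

```c
#include <stdio.h>

/* Number of data blocks reachable from one inode that has `direct` direct
   pointers plus one single, one double and one triple indirect pointer.
   The layout parameters are hypothetical inputs, not fixed by POSIX. */
unsigned long long max_blocks(unsigned long long direct,
                              unsigned long long block_size,
                              unsigned long long ptr_size) {
    unsigned long long n = block_size / ptr_size;   /* pointers per block */

    return direct          /* direct pointers                */
         + n               /* single indirect: n data blocks */
         + n * n           /* double indirect: n blocks of n */
         + n * n * n;      /* triple indirect                */
}

/* Print the block count and the resulting maximum file size in bytes. */
void print_capacity(unsigned long long direct,
                    unsigned long long block_size,
                    unsigned long long ptr_size) {
    unsigned long long blocks = max_blocks(direct, block_size, ptr_size);

    printf("%llu blocks, %llu bytes\n", blocks, blocks * block_size);
}
```

With the assumed layout, each block holds 2048 pointers, so max_blocks(10, 8192, 4) is 10 + 2048 + 2048² + 2048³ = 8,594,130,954 blocks. Nearly all of that capacity comes from the triple indirect pointer.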
Inodes

Schematic structure of a traditional UNIX file.



Directory implementation

• A directory is a file containing a correspondence between
filenames and file locations.
• UNIX has traditionally implemented the location specification
as an inode number, but as noted above, POSIX does not
require this.
• The inode itself does not contain the filename. When a
program references a file by pathname, the operating system
traverses the file system tree to find the filename and inode
number in the appropriate directory.
• Once it has the inode number, the operating system can
determine other information about the file by accessing the
inode.



Directory Implementation

• A directory implementation that contains only names
and inode numbers has the following advantages.
1. Changing the filename requires changing only the directory
entry. A file can be moved from one directory to another just by
moving the directory entry, as long as the move keeps the file on
the same partition or slice.
2. Only one physical copy of the file needs to exist on disk, but the
file may have several names or the same name in different
directories. Again, all of these references must be on the same
physical partition.
3. Directory entries are of variable length because the filename is
of variable length. Directory entries are small, since most of the
information about each file is kept in its inode. Manipulating
small variable-length structures can be done efficiently. The
larger inode structures are of fixed length.
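The first advantage can be observed from a program: renaming a file changes the directory entry but leaves the inode number alone. This is a minimal sketch using the standard rename function from stdio.h, which the slides do not cover. The helper name rename_keeps_inode is hypothetical.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Rename oldpath to newpath and compare inode numbers before and after.
   Returns 1 if the inode (and device) are unchanged, 0 if they differ,
   -1 on error. rename fails if the two paths are on different file systems. */
int rename_keeps_inode(const char *oldpath, const char *newpath) {
    struct stat before, after;

    if (stat(oldpath, &before) == -1)
        return -1;
    if (rename(oldpath, newpath) == -1)   /* only the directory entry moves */
        return -1;
    if (stat(newpath, &after) == -1)
        return -1;
    return (before.st_ino == after.st_ino) && (before.st_dev == after.st_dev);
}
```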



Hard Links and Symbolic Links

UNIX directories have two types of links: links and symbolic links.
– A link, sometimes called a hard link, is an association between a
filename and an inode.
– A symbolic link, sometimes called a soft link, is a file that stores a
string used to modify the pathname when it is encountered during
pathname resolution.
• Each inode contains a count of the number of hard links to the
inode.
• When a file is created, a new directory entry is created and a new
inode is assigned.
• Additional hard links can be created with
ln oldname newname
or with the link system call.



Links

• A new hard link to an existing file creates a new
directory entry but uses no additional disk space
for the file itself.
• A new hard link increments the link count in the
inode.
• A hard link can be removed with the rm command or
the unlink system call.
• These decrement the link count.
• The inode and associated disk space are freed when
the count is decremented to 0.
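The link count can be watched through the st_nlink field of struct stat. The sketch below is illustrative only: link_count and link_count_demo are hypothetical names, and path1 is assumed to be an existing file on the same file system as path2.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the number of hard links to path, or -1 if stat fails. */
long link_count(const char *path) {
    struct stat statbuf;

    if (stat(path, &statbuf) == -1)
        return -1;
    return (long) statbuf.st_nlink;
}

/* Add a second name for path1, then remove it again, printing the
   link count at each step. Returns 0 on success, -1 on failure. */
int link_count_demo(const char *path1, const char *path2) {
    printf("before link:  %ld\n", link_count(path1));
    if (link(path1, path2) == -1)       /* new directory entry, same inode */
        return -1;
    printf("after link:   %ld\n", link_count(path1));
    if (unlink(path2) == -1)            /* remove the second name */
        return -1;
    printf("after unlink: %ld\n", link_count(path1));
    return 0;
}
```

For a freshly created file, the three counts would be 1, 2 and 1.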



Symbolic Links

• A symbolic link is a special type of file that contains
the name of another file.
• A reference to the name of a symbolic link causes the
operating system to use the name stored in the file,
rather than the name itself.
• Symbolic links are created with the command:
ln -s oldname newname
• Symbolic links do not affect the link count in the inode.
• Unlike hard links, symbolic links can span file systems.



Simple File

• Assume that the directory entry of a file name1 in
directory /dirA is as shown below.

A directory entry, inode, and data block for a simple file


The following shell command creates an entry name2 in /dirB that refers to the same inode as /dirA/name1:
>ln /dirA/name1 /dirB/name2
Equivalently, the following code segment performs the same action (the figure illustrates the result):

#include <stdio.h>
#include <unistd.h>

....
if (link("/dirA/name1", "/dirB/name2") == -1)
    perror("Failed to make a new link in /dirB");
.....
link and unlink

• The link function creates a new directory entry for the existing file
specified by path1 in the directory specified by path2.
#include <unistd.h>
int link(const char *path1, const char *path2);

• If successful, the link function returns 0. If unsuccessful, link returns –1
and sets errno.
• Similarly, the unlink function removes the directory entry specified by
path. If the file's link count drops to 0 and no process has the file open,
unlink frees the space occupied by the file.
#include <unistd.h>
int unlink(const char *path);

• If successful, the unlink function returns 0. If unsuccessful, unlink returns
–1 and sets errno.
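The condition that no process has the file open matters in practice: a file whose last name has been removed remains usable through an open descriptor until that descriptor is closed. This is a minimal sketch, with the hypothetical helper name unlink_while_open and a caller-supplied path that must not already exist.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create path, remove its name while it is still open, and keep using it.
   Returns the link count seen through the descriptor after unlink
   (typically 0), or -1 if any step fails. */
long unlink_while_open(const char *path) {
    struct stat statbuf;
    long count = -1;
    int fd;

    if ((fd = open(path, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR)) == -1)
        return -1;
    if (unlink(path) == 0 &&            /* the name is gone ...            */
        write(fd, "x", 1) == 1 &&       /* ... but the file is still there */
        fstat(fd, &statbuf) == 0)
        count = (long) statbuf.st_nlink;
    close(fd);                          /* last reference: space can be freed */
    return count;
}
```

Programs sometimes use this pattern for temporary files, since the space is reclaimed even if the process terminates abnormally.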



Creating and removing symbolic links

• Create a symbolic link by using the ln command with
the -s option or by invoking the symlink function.
• The path1 parameter of symlink contains the string
that will be the contents of the link, and path2 gives
the pathname of the link. (path2 is the newly created
link and path1 is what the new link points to).
#include <unistd.h>
int symlink(const char *path1, const char *path2);

• If successful, symlink returns 0. If unsuccessful,
symlink returns –1 and sets errno.
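Because path1 is stored as a plain string, it can be read back unchanged, and it need not name an existing file. The sketch below uses the POSIX readlink function, which these slides do not cover. The helper name make_and_read_link is hypothetical, and the _XOPEN_SOURCE definition is an assumption for strictly conforming compilers.

```c
#define _XOPEN_SOURCE 700   /* assumed: exposes symlink and readlink under strict modes */
#include <unistd.h>

/* Create linkpath as a symbolic link whose contents are the string target,
   then read the stored string back into buf (size must be at least 1).
   Returns the length of the stored string, or -1 on failure. */
long make_and_read_link(const char *target, const char *linkpath,
                        char *buf, size_t size) {
    ssize_t len;

    if (symlink(target, linkpath) == -1)
        return -1;
    len = readlink(linkpath, buf, size - 1);   /* readlink adds no '\0' */
    if (len == -1)
        return -1;
    buf[len] = '\0';
    return (long) len;
}
```

The call succeeds even when target does not exist. The link is then dangling, and a later stat on it fails while lstat still succeeds.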



The following command creates a symbolic link /dirB/name2:
>ln -s /dirA/name1 /dirB/name2
Similarly, the following code segment performs the same action:

#include <stdio.h>
#include <unistd.h>
....
if (symlink("/dirA/name1", "/dirB/name2") == -1)
    perror("Failed to create symbolic link in /dirB");
.....
