The term filesystem has two somewhat different meanings, both of which are commonly used.

This can be confusing to novices, but after a while the meaning is usually clear from the context.

One meaning is the entire hierarchy of directories (also referred to as the directory tree) that is used to organize files on a computer system. On Linux and Unix, the directories start with the root directory (designated by a forward slash), which contains a series of subdirectories, each of which, in turn, contains further subdirectories, etc. A variant of this definition is the part of the entire hierarchy of directories or of the directory tree that is located on a single partition or disk. (A partition is a section of a hard disk that contains a single type of filesystem.)

The second meaning is the type of filesystem, that is, how the storage of data (i.e., files, folders, etc.) is organized on a computer disk (hard disk, floppy disk, CDROM, etc.) or on a partition on a hard disk. Each type of filesystem has its own set of rules for controlling the allocation of disk space to files and for associating data about each file (referred to as metadata) with that file, such as its filename, the directory in which it is located, its permissions and its creation date.

An example of a sentence using the word filesystem in the first sense is: "Alice installed Linux with the filesystem spread over two hard disks rather than on a single hard disk." This refers to the fact that [the entire hierarchy of directories of] Linux can be installed on a single disk or spread over multiple disks, including disks on different computers (or even disks on computers at different locations).

An example of a sentence using the second meaning is: "Bob installed Linux using only the ext3 filesystem instead of using both the ext2 and ext3 filesystems." This refers to the fact that a single Linux installation can contain one or multiple types of filesystems. One hard disk can contain one or multiple types of filesystems (each on at least one separate partition), and a filesystem of a single type can be spread across multiple hard disks.

This article is concerned primarily with filesystems in the second sense. However, because of the intimate relationship between the structure of filesystems and types of filesystems, the next section provides a quick review of (or introduction to) Linux filesystems in the first sense.

Filesystem Structure

In Linux, everything is configured as a file. This includes not only text files, images and compiled programs (also referred to as executables), but also directories, partitions and hardware device drivers.

Each filesystem (used in the first sense) contains a control block, which holds information about that filesystem. The other blocks in the filesystem are inodes, which contain information about individual files, and data blocks, which contain the information stored in the individual files.

There is a substantial difference between the way the user sees the Linux filesystem (first sense) and the way the kernel (the core of a Linux system) actually stores the files. To the user, the filesystem appears as a hierarchical arrangement of directories that contain files and other directories (i.e., subdirectories). Directories and files are identified by their names.

This hierarchy starts from a single directory called root, which is represented by a "/" (forward slash). (The meanings of root and "/" are often confusing to new users of Linux. This is because each has two distinct usages. The other meaning of root is a user who has administrative privileges on the computer, in contrast to ordinary users, who have only limited privileges in order to protect system security. The other use of "/" is as a separator between directories or between a directory and a file, similar to the backward slash used in MS-DOS.)

The Filesystem Hierarchy Standard (FHS) defines the main directories and their contents in Linux and other Unix-like operating systems. All files and directories appear under the root directory, even if they are stored on different physical devices (e.g., on different disks or on different computers). A few of the directories defined by the FHS are /bin (command binaries for all users), /boot (boot loader files such as the kernel), /home (users' home directories), /mnt (for mounting a CDROM or floppy disk), /root (home directory for the root user), /sbin (executables used only by the root user) and /usr (where most application programs get installed).

To the Linux kernel, however, the filesystem is flat. That is, it does not (1) have a hierarchical structure, (2) differentiate between directories, files or programs or (3) identify files by names. Instead, the kernel uses inodes to represent each file. An inode is actually an entry in a list of inodes referred to as the inode list. Each inode contains information about a file including (1) its inode number (a unique identification number), (2) the owner and group associated with the file, (3) the file type (for example, whether it is a regular file or a directory), (4) the file's permission list, (5) the file's access, data modification and inode change times, (6) the size of the file and (7) the disk address (i.e., the location on the disk where the file is physically stored).

The inode numbers for the contents of a directory can be seen by using the i option with the familiar ls (i.e., list) command in a terminal window:
ls -i
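The inode fields listed above can also be read from a program. The following is a minimal Python sketch, not part of the original article: it uses os.stat to report the fields and creates a hard link to show that two names can refer to one inode, which is why the name is not among the inode's fields. The file names report.txt and alias.txt and the helper describe_inode are made up for illustration.

```python
import os
import stat
import tempfile

def describe_inode(path):
    """Collect the inode fields that os.stat() exposes for one path."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                         # inode number
        "owner_uid": st.st_uid,                     # owner
        "group_gid": st.st_gid,                     # group
        "is_dir": stat.S_ISDIR(st.st_mode),         # file type
        "permissions": stat.filemode(st.st_mode),   # e.g. '-rw-r--r--'
        "size": st.st_size,                         # size in bytes
        "mtime": st.st_mtime,                       # last data modification
    }

with tempfile.TemporaryDirectory() as directory:
    original = os.path.join(directory, "report.txt")   # hypothetical name
    alias = os.path.join(directory, "alias.txt")       # hypothetical name
    with open(original, "w") as handle:
        handle.write("hello")
    os.link(original, alias)    # a second directory entry for the same inode

    info_original = describe_inode(original)
    info_alias = describe_inode(alias)
    same_inode = info_original["inode"] == info_alias["inode"]

    print(info_original)
    print("both names share one inode:", same_inode)
```

The sketch assumes a filesystem that supports hard links, as the native Linux filesystems discussed in this article do.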

The df command is used to show information about each of the filesystems which are currently mounted on (i.e., connected to) a system, including their allocated maximum size, the amount of disk space they are using, the percentage of their disk space they are using and where they are mounted (i.e., the mountpoint). (Here filesystems is used as a variant of the first meaning, referring to the parts of the entire hierarchy of directories.) df can be used by itself, but it is often more convenient to add the -m option to show sizes in megabytes rather than in the default kilobytes:
df -m
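In a script, the totals that df prints can be obtained without parsing its output. The following is a minimal Python sketch, not part of the original article, built on the standard library function shutil.disk_usage; the helper name usage_in_mib is made up for illustration, and the mountpoint "/" is only an example.

```python
import shutil

MIB = 1024 * 1024

def usage_in_mib(mountpoint="/"):
    """Return total, used and free space of one mounted filesystem in MiB."""
    usage = shutil.disk_usage(mountpoint)
    return {
        "total": usage.total // MIB,
        "used": usage.used // MIB,
        "free": usage.free // MIB,
    }

report = usage_in_mib("/")
print(report)
```

On filesystems that reserve blocks for the root user, total may be larger than used plus free, because free counts only the space available to ordinary users.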

A column showing the type of each of these filesystems can be added to the table that df -m produces by using the --print-type option:
df -m --print-type

This command generates a column labeled Type. For a Red Hat Linux installation on a home computer most of the entries in this column will probably be ext3 and/or ext2.

Linux Native Filesystems

Every native Linux filesystem implements a basic set of common concepts that were derived from those originally developed for Unix. (Native means that the filesystems were either developed originally for Linux or were first developed for other operating systems and then rewritten so that they would have functions and performance on Linux comparable or superior to those of filesystems originally developed for Linux.)

Several Linux native filesystems are currently in widespread use, including ext2, ext3, ReiserFS, JFS and XFS. Additional native filesystems are in various stages of development. These filesystems differ from the DOS/Windows filesystems in a number of ways including (1) allowing important system folders to span multiple partitions and multiple hard drives, (2) adding additional information about files, including ownership and permissions and (3) establishing a number of standard folders for holding important components of the operating system.

Linux's first filesystem was minix, which was borrowed from the Minix OS. Linus Torvalds adopted this filesystem because it was an efficient and relatively bug-free piece of existing software that postponed the need to design a new filesystem from scratch. However, minix was not well suited for use on Linux hard disks for several reasons, including its maximum partition size of only 64MB, its short filenames and its single timestamp. But minix can be useful for floppy disks and RAM disks because its low overhead can sometimes allow more files to be stored than is possible with other Linux filesystems.

The Extended File System, ext, was introduced in April, 1992. With a maximum partition size of 2GB and a maximum file name size of 255 characters, it removed the two biggest minix limitations. However, there still was no support for the separate access, inode modification and data modification timestamps. Also, its use of linked lists to keep track of free blocks and inodes caused the lists to become unsorted and the filesystem to become fragmented.

The Second Extended File System (ext2) was released in January, 1993. It was a rewrite of ext which features (1) improved algorithms that greatly increased its speed, (2) additional date stamps (such as date of last access, date of last inode modification and date of last data modification) and (3) the ability to track the state of the filesystem. Ext2 maintains a special field in the superblock that indicates the status of the filesystem as either clean or dirty. A dirty filesystem will trigger a utility to scan the filesystem for errors. Ext2 also features support for a maximum filesystem size of 4TB (1 terabyte is 1024 gigabytes). Consequently, it has completely superseded ext, support for which has been removed from the Linux kernel.

Ext2 is the most portable of the native Linux filesystems because drivers and other tools exist that allow accessing ext2 data from a number of other operating systems.
However, as useful as these tools are, most of them have limitations, such as being access utilities rather than true drivers, not working with the most recent versions of ext2, not being able to write to ext2 or posing a risk of causing filesystem corruption when writing to ext2.

Journaling Filesystems

The lack of a journaling filesystem was often cited as one of the major factors holding back the widespread adoption of Linux at the enterprise level. However, this objection is no longer valid, as there are now four such filesystems from which to choose.

Journaling filesystems offer several important advantages over static filesystems, such as ext2. In particular, if the system is halted without a proper shutdown, they guarantee consistency of the data and eliminate the need for a long and complex filesystem check during rebooting. The term journaling derives its name from the fact that a special file called a journal is used to keep track of the data that has been written to the hard disk.

In the case of conventional filesystems, disk checks during rebooting after a power failure or other system crash can take many minutes, or even hours for large hard disk drives with capacities of hundreds of gigabytes. Moreover, if an inconsistency in the data is found, human intervention is sometimes necessary to answer complicated questions about how to fix certain filesystem problems. Such downtime can be very costly with big systems used by large organizations.

In the case of a journaling filesystem, if power supply to the computer is suddenly interrupted, a given set of updates will have either been fully committed to the filesystem (i.e., written to the hard disk), in which case there is not a problem, and the filesystem can be used immediately, or the updates will have been marked as not yet fully committed, in which case the file system driver can read the journal and fix any inconsistencies that occurred. This is far quicker than a scan of the entire hard disk, and it guarantees that the structure of the filesystem is always self-consistent. With a journaling filesystem, a computer can usually be rebooted in just a few seconds after a system crash, and although some data might be lost, at least it will not take many minutes or hours to discover this fact.

Ext3 has been integrated into the Linux kernel since version 2.4.16 and has become the default filesystem on Red Hat and some other distributions. It is basically an extension of ext2 to which a journaling capability has been added, and it provides the same high degree of reliability because of the exhaustively field-proven nature of its underlying ext2.
Also featured is the ability for ext2 partitions to be converted to ext3 and vice-versa without any need for backing up the data and repartitioning. If necessary, an ext3 partition can even be mounted by an older kernel that has no ext3 support; this is because it would be seen as just another normal ext2 partition and the journal would be ignored. ReiserFS, developed by Hans Reiser and others, was actually the first journaling filesystem added to the Linux kernel. As was the case with ext2, it was designed from the ground up for use in Linux. However, unlike ext3, it was also designed from the ground up as a journaling filesystem rather than as an add-on to an existing filesystem, and thus it is widely considered to be the most advanced of the native Linux journaling filesystems. Features include high speed, excellent stability and the ability to pack small files into less disk space than is possible with many other filesystems. A new version of ReiserFS, designated Reiser4, was scheduled for release in the first half of 2004. It is a complete rewrite from version 3 and is said to result in major improvements in performance, including higher speeds, the ability to accommodate more CPUs, built-in encryption and ease of customization.

JFS was originally developed by IBM in the mid-1990s for its AIX Unix operating system, and it was later ported to the company's OS/2 operating system. IBM subsequently changed the licensing of the OS/2 implementation to open source, which led to its support on Linux. JFS is currently used primarily on IBM enterprise servers, and it is also a good choice for systems that multiboot Linux and OS/2.

XFS was developed in the mid-1990s by Silicon Graphics (SGI) for its 64-bit IRIX Unix servers. These servers were designed with advanced graphics processing in mind, and they feature the ability to accommodate huge file sizes. The company likewise converted XFS to open source, after which it was also adopted by Linux. Because it is a 64-bit filesystem, XFS features size limitations in the millions of terabytes (in contrast to the still generous 4TB limit of ext2).

Most Linux distributions that ship with 2.4.x and later kernels support ext2, ext3 and ReiserFS. Support for JFS has been added to the 2.4.20 and 2.5.6 kernels, and XFS was added to the 2.5.36 kernel. JFS and XFS support can be added to earlier kernels by downloading the appropriate patches from the respective websites and compiling as a module or into the kernel. Partitions can then be converted by backing up the data, creating the new filesystem and then restoring the data.

Supported Foreign Filesystems

Unlike most other operating systems, Linux supports a large number of foreign filesystems in addition to its native filesystems. This is possible because of the virtual file system layer, which was incorporated into Linux from its infancy and makes it easy to mount other filesystems. In addition to reading, foreign filesystem support also often includes writing, copying, erasing and other operations.

Among the most commonly used PC filesystems is FAT (File Allocation Table). This is the primary filesystem for MS-DOS and Microsoft Windows 95, 98 and ME, and it is also supported by Windows NT, 2000 and XP and most other operating systems. The first variant, FAT16, was Microsoft's standard filesystem until Windows 95, and the subsequent FAT32 is the standard for Windows 98 and Windows ME. Linux supports both reading from and writing to FAT16 and FAT32, and their main use on Linux is to share files with Microsoft Windows on dual-boot systems and through floppies.

FAT filesystems cannot accommodate information about files such as ownership and permissions. Also, FAT16 partitions are limited to a maximum of 2GB. Although the theoretical maximum size for FAT32 partitions is 8TB, Windows 98's scandisk (disk checking utility) only supports 128GB, and Windows 2000 does not permit the creation of FAT32 disks larger than 32GB.
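The 2GB and 8TB figures follow from simple arithmetic: the number of clusters a FAT entry can address multiplied by the largest cluster size. The following Python sketch is a worked illustration under assumptions that are not stated in the article: a maximum cluster size of 32 KiB, 16 usable index bits for FAT16 and 28 usable index bits for FAT32. It also ignores the few cluster values reserved as markers, so the results are upper bounds.

```python
KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4

def max_volume_bytes(cluster_index_bits, cluster_bytes):
    """Upper bound on volume size: addressable clusters times cluster size."""
    return (2 ** cluster_index_bits) * cluster_bytes

# Assumed parameters (not from the article): 32 KiB clusters,
# 16-bit cluster numbers for FAT16, 28 usable bits for FAT32.
fat16_limit = max_volume_bytes(16, 32 * KIB)
fat32_limit = max_volume_bytes(28, 32 * KIB)

print("FAT16 upper bound:", fat16_limit // GIB, "GiB")
print("FAT32 upper bound:", fat32_limit // TIB, "TiB")
```

The much lower 128GB and 32GB limits quoted above are restrictions of particular Windows tools rather than of the on-disk format.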

NTFS is Microsoft's replacement for FAT. A descendant of HPFS (the native filesystem for IBM's OS/2 operating system), NTFS's purpose was to remove the limitations of the FAT filesystem (such as poor stability) while adding new features not found in HPFS. Of the Windows operating systems, it can only be accessed by NT, 2000 and XP. Under Linux, NTFS is currently supported only in read-only mode and only on some distributions.

HFS (Hierarchical File System) is the native filesystem used on most Macintosh computers, and it is sometimes said to be "the Macintosh equivalent of FAT." However, Linux's support for HFS is not as complete as that for many other filesystems. As most Macintoshes include FAT support, it thus might be preferable in some situations to use this filesystem instead of HFS when exchanging data with Macintosh computers.

ISO 9660, released in 1988 by an industry committee called High Sierra, is the standard filesystem for CDROMs. Almost all computers with CDROM drives can read files written in ISO 9660 regardless of their operating system.

How to Select the Most Appropriate Filesystem

When installing Linux, the optimal selection of filesystem(s) depends on several factors, particularly the intended application for the computer(s) and the types of partitions on which they are to be installed. In the case of most computers for individual users, ext2 or ext3 (the default on many such systems) is usually quite adequate.

For large-scale and high performance systems, however, making the optimal selection is not as easy and it is much more important. This is because the choice of filesystems can have very noticeable effects on performance, on recovery from errors, on compatibility with other operating systems and on limitations on partition and file sizes. One generalization that can be made with regard to such systems is that it is usually advantageous to use a journaling filesystem because of the greatly reduced startup times after system crashes.
For the boot and root partitions, it can be advantageous to use an ext2 or ext3 filesystem because this will allow booting in an emergency even with an older kernel. For other Linux partitions, ext3 or ReiserFS are usually the best choices, the former where ext2 compatibility is emphasized and the latter where performance is paramount. When partitions must be accessible to both Linux and Microsoft Windows, FAT should be selected.

The questions of which filesystems provide the best disk performance and minimize processor time are not easy to answer. Some studies suggest that XFS and JFS produce the best throughput with small files (e.g., 100MB), while ext2 and ext3 are the best with larger files (e.g., 1GB). However, this situation could change with the introduction of the new version of ReiserFS, with its claims of greatly enhanced speed and scalability.

The choice of journaling filesystem can affect disk space availability because of the amount of space needed for the journal. This is a major consideration on small disks, such as Zip disks. For example, on a 100MB Zip disk, ext3fs and XFS each devote 4MB to their journals whereas ReiserFS devotes several times this amount to its journal.

Optimizing Linux Filesystems

System performance can be optimized not only by selecting the most appropriate filesystem(s), but also by utilizing the various options that are available for most filesystems. There are differences in the availability of options according to the particular filesystem, and some options can be set only at filesystem creation time, while others can be changed later.

One generally available option is the ability to set the allocation block size (allocation blocks are the units into which a partition is subdivided). Smaller allocation blocks can facilitate more efficient use of disk space, whereas larger blocks can improve performance by reducing both file fragmentation and the time needed to retrieve an entire file. It is not easy to change this option after creation of the filesystem.

All of the journaling filesystems support various journal options. For example, a choice of three data journaling modes in ext3 provides tradeoffs between data integrity and system recovery speed. Also, system performance can often be improved by using the journal location option to place the journal on a different physical disk than the main filesystem.

Another type of option is the number of blocks automatically set aside for use by the root user. For example, for both ext2 and ext3 the default value is five percent. This might be excessive on large partitions or on relatively non-critical partitions, and reducing this value could make a small amount of additional disk space available to non-root users.
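Two of the options described above, allocation block size and the root reserve, lend themselves to quick arithmetic. The following Python sketch is an illustration only: the 1,500-byte file, the 100 GiB partition and the helper names slack_bytes and reserved_bytes are hypothetical, and the calculation ignores per-filesystem details such as tail packing. The last lines query the running system with os.statvfs, where the gap between free blocks and blocks available to ordinary users typically corresponds to the root reserve.

```python
import os

GIB = 1024 ** 3

def slack_bytes(file_size, block_size):
    """Bytes left unused in the last, partially filled allocation block."""
    blocks = -(-file_size // block_size)    # ceiling division
    return blocks * block_size - file_size

def reserved_bytes(partition_bytes, reserved_percent=5):
    """Space set aside for root at the given reserved-block percentage."""
    return partition_bytes * reserved_percent // 100

# Hypothetical 1,500-byte file under two block sizes.
slack_1k = slack_bytes(1500, 1024)    # needs two 1 KiB blocks
slack_4k = slack_bytes(1500, 4096)    # needs one 4 KiB block

# Hypothetical 100 GiB partition at the default 5% and at a reduced 1%.
reserve_default = reserved_bytes(100 * GIB)
reserve_reduced = reserved_bytes(100 * GIB, 1)

print("slack with 1 KiB blocks:", slack_1k, "bytes")
print("slack with 4 KiB blocks:", slack_4k, "bytes")
print("reserve at 5%:", reserve_default // GIB, "GiB")
print("reserve at 1%:", reserve_reduced // GIB, "GiB")

# Live figures for the root filesystem, where os.statvfs is available.
if hasattr(os, "statvfs"):
    st = os.statvfs("/")
    print("blocks free but not available to ordinary users:",
          st.f_bfree - st.f_bavail)
```

The numbers show the tradeoff in the text: small blocks waste less space per file, while on a large partition the default reserve can amount to several gigabytes.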
In contrast to the filesystems used for Microsoft Windows, fragmentation of files is usually not a major problem with Linux filesystems due to their fundamental differences in design. (Fragmentation refers to parts of files becoming scattered around random, non-contiguous locations on a disk, resulting in reduced speed and reliability.) Thus, whereas the Microsoft Windows operating systems include utilities for defragmentation and encourage their regular use, such utilities are difficult to find for Linux. When Linux users who have come from the Windows world encounter sluggish performance, they are often tempted to attribute it to fragmentation; but it is much more likely the result of running short of memory (and using relatively slow swap disk space instead of RAM) and/or running too many processes (i.e., programs running in the background).

One final word about the dozens of filesystems available for Linux: although having so much to choose from can seem confusing at first, this is just a part of what makes Linux so uniquely flexible and accommodating, i.e., an unparalleled freedom of choice. It is also part of what makes Linux so much fun, at least for those who make the effort to learn about it.

Article reproduced with owner's consent

Tired of fscking? Try a journaling filesystem!
One of the most-anticipated of recent Linux developments is the availability of journaling filesystems. Philipp Tomsich provides an overview of the alternatives and his thoughts on which you should consider using, depending on your needs.

Journaling filesystems
Waiting for a fsck to complete on a server system can tax your patience more than it should. Fortunately, a new breed of filesystem is coming to your Linux machine soon. Journaling filesystems maintain a special file called a log (or journal), the contents of which are not cached. Whenever the filesystem is updated, a record describing the transaction is added to the log. An idle thread processes these transactions, writes data to the filesystem, and flags each processed transaction as completed. If the machine crashes, the background process is run on reboot and simply finishes copying updates from the journal to the filesystem. Incomplete transactions in the journal file are discarded, so the filesystem's internal consistency is guaranteed. This cuts the complexity of a filesystem check by a couple of orders of magnitude. A full-blown consistency check is never necessary (in contrast to ext2fs and similar filesystems) and restoring a filesystem after a reboot is a matter of seconds at most.
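The recovery behaviour described above (finish committed transactions, discard incomplete ones) can be sketched in a few lines. The following Python model is a toy under stated assumptions, not the on-disk format or algorithm of any real filesystem: the journal is a list of tuples, the disk is a dictionary, and the record kinds "begin", "write" and "commit" as well as the block names are invented for illustration.

```python
def recover(journal, disk):
    """Replay committed transactions onto disk and drop incomplete ones."""
    pending = {}     # transaction id -> writes seen but not yet committed
    replayed = []    # ids of transactions applied to the disk
    for record in journal:
        kind, txid = record[0], record[1]
        if kind == "begin":
            pending[txid] = []
        elif kind == "write":
            _, _, block, data = record
            pending[txid].append((block, data))
        elif kind == "commit":
            # Only a commit record makes the transaction's writes take effect.
            for block, data in pending.pop(txid):
                disk[block] = data
            replayed.append(txid)
    discarded = sorted(pending)    # begun but never committed
    return replayed, discarded

# Hypothetical journal contents at the moment of a crash.
journal = [
    ("begin", 1),
    ("write", 1, "inode_7", "size=5"),
    ("write", 1, "block_42", "hello"),
    ("commit", 1),
    ("begin", 2),
    ("write", 2, "inode_8", "size=9"),
    # crash: transaction 2 never reached its commit record
]
disk = {"inode_7": "size=0"}

replayed, discarded = recover(journal, disk)
print("replayed:", replayed, "discarded:", discarded)
print("disk after recovery:", disk)
```

Recovery cost in this model depends on the length of the journal rather than the size of the disk, which is the reason a journaled filesystem can come back in seconds.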

The players
Today, at least four major players exist in the Linux journaling filesystem arena. They are in various stages of completion, with some of them becoming ready for use in production systems. They are: Hans Reiser's ReiserFS, SGI's XFS/Linux, IBM's JFS, and ext3fs.

Each offers distinct advantages. A detailed technical comparison is available from issue 55 of Linux Gazette. Most of the available options provide support for dynamically extending the filesystems using a logical volume manager (such as LVM), which makes them perfect for large server installations.

ReiserFS
ReiserFS is a radical departure from the traditional Unix filesystems, which are block-structured. It will be available in the upcoming Red Hat 7.1 distribution and is already available in SuSE Linux 7.0. Hans Reiser writes about the filesystem he designed: "In my approach, I store both files and filenames in a balanced tree, with small files, directory entries, inodes, and the tail ends of large files all being more efficiently packed as a result of relaxing the requirements of block alignment and eliminating the use of a fixed space allocation for inodes." The effect is that a wide array of common operations, such as filename resolution and file accesses, are optimized when compared to traditional filesystems such as ext2fs. Furthermore, optimizations for small files are well developed, reducing storage overheads due to fragmentation. ReiserFS is not yet a true journaling filesystem (although full journaling support is currently under development). Instead, buffering and preserve lists are used to track all tree modifications, which achieves a very similar effect. This reduces the risk of filesystem inconsistencies in the event of a crash and thus provides rapid recovery on restart. Besides offering rapid restart capability after a crash and efficient storage of large numbers of small files, it is the developers' intention to offer facilities to store objects much smaller than those that are normally saved as separate files. Future design plans include adding set-theoretic semantics, making it possible to retrieve files by specifying their attributes instead of an explicit pathname. ReiserFS was the first of this new breed that managed to be included in the standard Linux kernel distribution, giving it a head start in building a user community.

XFS/Linux
When SGI needed a high performance and scalable filesystem to replace EFS in 1990, it developed XFS to handle the demands of increased disk capacity and bandwidth, and parallelism with new applications such as film, video, and large databases. These demands included extremely fast crash recovery, support for large filesystems, directories with large numbers of files, and fair performance with small and large files. Now SGI is contributing this technology to the Open Source community and is in the process of finalizing its port to Linux. Technically, XFS is based on the use of B+ trees (similar to the use of balanced trees in ReiserFS) to replace the conventional linear file system structure. B+ trees provide an efficient way to index directory entries and manage file extents, free space, and filesystem metadata. This guarantees quick directory listing and file accesses. The allocation of disk blocks to inodes is done dynamically, which means that you no longer need to create a filesystem with smaller block sizes for your mail server; your filesystem will handle this automatically for you. XFS is also a 64-bit filesystem, which theoretically allows the creation of files that are a few million terabytes in size, which compares favorably to the limitations of 32-bit filesystems. The ability to attach freeform metadata tags to files on an XFS volume is yet another useful feature of this filesystem. XFS also contains good support for multiprocessor machines. This is visible in the implementation of the page buffer subsystem, which uses an AVL tree which is kept separate from the objects to avoid locking problems and cache thrashing on larger SMP systems. Multithreaded operation has been a declared design goal of this filesystem and has been well tested in large multiprocessor IRIX systems worldwide. The Linux port is still undergoing development and some features are still to be finalized. 
For example, loopmounting a file containing an XFS volume does not yet work without problems. The X/Open data management API provided on IRIX is still incomplete in the Linux port and guaranteed rate I/O is also an IRIX exclusive, so far. Even now, XFS is more than just a viable alternative on Linux. I've personally used it for a few months on my own systems and have been very happy with its performance, which is at least on a par with ext2fs. Now that an installable CD image (based on the first CD of the Red Hat 7.0 distribution) is available for download, it will be even easier to enjoy the benefits of this filesystem. The user-level tools for filesystem creation, maintenance, and resizing are more functional and easier to use than their ReiserFS counterparts, which mostly stems from the fact that they have been around for a far longer time.

So why should one switch to XFS/Linux if ReiserFS will be readily available in Red Hat 7.1 and SuSE 7.0 (even though it will be a while until it is equally well integrated into and supported by the major distributions)? The main factors are trust, robustness, and maturity. XFS has been deployed on IRIX systems since 1994 and has been used in a wide array of mission-critical applications. It's a proven technology, while ReiserFS and ext3fs are relatively new without offering too much new functionality.

JFS
IBM's JFS is a journaling filesystem used in its enterprise servers. It was designed for "high-throughput server environments, key to running intranet and other high-performance e-business file servers" according to IBM's Web site. Judging from the documentation available and the source drops, it will still be a while before the Linux port is completed and included in the standard kernel distribution. JFS offers a sound design foundation and a proven track record on IBM servers. It uses an interesting approach to organizing free blocks by structuring them in a tree and using a special technique to collect and group contiguous runs of free logical blocks. Although it uses extents for a file's block addressing, extents are not used to maintain the free space. Small directories are supported in an optimized fashion (i.e., stored directly within an inode), although with different limitations than those of XFS. However, small files cannot be stored directly within an inode. The port of JFS is an interesting project and will benefit the Linux community. However, it seems to be farther from being usable for production systems than its competitors.

ext3fs
ext3fs is an alternative for all those who do not want to switch their filesystem, but require journaling capabilities. It is distributed in the form of a kernel patch and provides full backward compatibility. It also allows the conversion of an ext2fs partition without reformatting and a reverse conversion to ext2fs, if desired. However, using such an add-on to ext2fs has the drawback that none of the advanced optimization techniques employed in the other journaling filesystems is available: no balanced trees, no extents for free space, etc. My personal opinion on ext3fs is that it is about to meet its fate with the availability of more powerful journaling filesystems. A handful of successful sites, such as RPMFind, use this filesystem, but it lacks the momentum that the others have.

Conclusion
With the increasing size of hard disks, journaling filesystems are becoming important to an ever-increasing number of users. If you ever waited for a filesystem check on a machine with an 80GB hard disk, you know what I'm talking about. Even if you do not plan to reboot your system often, they can save you a lot of time and trouble if you experience a power failure or a hardware glitch. With the large number of contenders striving to become the de-facto standard in the journaling filesystem space on Linux, we can look forward to interesting months as these filesystems' code bases mature, are integrated into the standard kernel, and are supported in upcoming releases of the major Linux distributions. However, keep in mind that migrating to another filesystem is not a trivial task. It usually requires backing up your data, reformatting, and restoring the data onto the newly created volume. You should thoroughly evaluate your options before making the switch.

Ext2/Ext3 File System



Designed for educational purposes, the original Linux file system was limited to 64 MB in size and supported file names up to 14 characters. In 1992, the ext file system was created, and increased the file system size to 2 GB and file name length to 255 characters. However, file access, modification, and creation times were missing from file system data structures and performance tended to be low. Modeled after the Berkeley Fast File System, the ext2 file system used a better on-disk layout, extended the file system size limit to 4 TB and file name sizes to 255 bytes, delivered improved performance, and emerged as the de facto standard file system for Linux environments.

An evolution of the ext2 file system, the ext3 file system adds logging capabilities to facilitate fast reboots following system crashes. More information on the logging capabilities of the ext3 file system can be found in EXT3, Journaling File System by Dr. Stephen Tweedie, located at http://olstrans.sourceforge.net/release/OLS2000ext3/OLS2000-ext3.html

Key features of the ext3 file system include:

Forward and backward compatibility with the ext2 file system. An ext3 file system can be remounted as an ext2 file system and vice versa. Such compatibility played a role in the adoption of the ext3 file system.

Checkpointing. The ext3 file system provides checkpointing capabilities, the logging of batches of individual transactions into compound transactions in memory prior to committing them to disk to improve performance. While checkpointing is in progress, a new compound transaction is started. While one compound transaction is being written to disk, another is accumulating. Furthermore, users can specify an alternative log file location, which can enhance performance via increased disk bandwidth.

Volume management. The ext3 file system relies on the LVM2 package to perform volume management tasks.
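The batching described under checkpointing (one compound transaction accumulating in memory while an earlier one is committed) can be pictured with a toy model. This is only a sketch under stated assumptions: the class `ToyJournal` and its methods are hypothetical names, the "log" is a Python list rather than a disk, and nothing here reflects the real ext3 code.

```python
from typing import List


class ToyJournal:
    """Toy model of batching individual updates into compound transactions."""

    def __init__(self) -> None:
        # Compound transaction currently accumulating in memory.
        self.running: List[str] = []
        # Compound transactions already handed to the (simulated) log.
        self.committed: List[List[str]] = []

    def log(self, update: str) -> None:
        """Add one individual update to the running compound transaction."""
        self.running.append(update)

    def commit(self) -> None:
        """Close the running transaction and immediately start a new one."""
        if not self.running:
            return
        # Swap in a fresh list first, so new updates can keep accumulating
        # while the closed batch is being written out.
        batch, self.running = self.running, []
        self.committed.append(batch)


if __name__ == "__main__":
    journal = ToyJournal()
    journal.log("create inode 12")
    journal.log("update directory block 7")
    journal.commit()
    journal.log("extend file 12")
    print(journal.committed, journal.running)
```

The point of the swap in `commit` is that writers are never blocked by the batch being written: they simply append to the next compound transaction.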

Logging in the ext3 File System

The ext3 file system supports different levels of journalling, which can be specified as mount options. These options can impact data integrity and performance. This section describes the mount options, and the testing results presented in Chapter 3 demonstrate the effects of their use.

data=journal

Originally, the ext3 file system was designed to perform full data and metadata journalling. In this mode, the file system journals all changes to the file system, whether the changes affect data or metadata. Consequently, data and metadata can be brought back to a consistent state. Full data journalling can be slow. However, performance penalties can be mitigated by setting up a relatively large journal.

data=ordered

The ext3 file system includes an operational mode that provides some of the benefits of full journalling without introducing severe performance penalties. In this mode, only metadata is journalled. As a result, the ext3 file system can provide overall filesystem consistency, even though only metadata changes are recorded in the journal. It is possible for file data being written at the time of a system failure to be corrupted in this mode. Note that ordered mode is the default in the ext3 file system.

data=writeback

While the writeback option provides lower data consistency guarantees than the journal or ordered modes, some applications show very significant speed improvement when it is used (it can be combined with the noatime option). For example, speed improvements can be seen when heavy synchronous writes are performed, or when applications create and delete large volumes of small files, such as delivering a large flow of short email messages. The results of the testing effort described in Chapter 3 illustrate this topic.

When the writeback option is used, data consistency is similar to that provided by the ext2 file system. However, file system integrity is maintained continuously during normal operation in the ext3 file system. In the event of a power failure or system crash, the file system may not be recoverable if a significant portion of data was held only in system memory and not on permanent storage. In this case, the filesystem must be recreated from backups. Often, changes made since the file system was last backed up are lost.
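As a small illustration of how the three modes appear in practice, the sketch below reads a comma-separated mount option string and reports which journalling mode it selects. The helper name `ext3_data_mode` is hypothetical; the only facts taken from the text are the three option values and that ordered is the default when no `data=` option is given.

```python
VALID_MODES = ("journal", "ordered", "writeback")


def ext3_data_mode(mount_options: str) -> str:
    """Return the journalling mode selected by a mount option string.

    Falls back to "ordered", which the text names as the ext3 default.
    """
    mode = "ordered"
    for option in mount_options.split(","):
        option = option.strip()
        if option.startswith("data="):
            value = option[len("data="):]
            if value not in VALID_MODES:
                raise ValueError(f"unknown ext3 data mode: {value!r}")
            mode = value
    return mode


if __name__ == "__main__":
    print(ext3_data_mode("rw,noatime,data=writeback"))
    print(ext3_data_mode("rw"))
```

An option string such as `rw,noatime,data=writeback` is the kind of value one would expect in the options column of a mount table entry for an ext3 volume.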

Old News ;-)


Solaris ZFS and Red Hat Enterprise Linux Ext3 File System Performance White Paper
data=writeback While the writeback option provides lower data consistency guarantees than the journal or ordered modes, some applications show very significant speed improvement when it is used. For example, speed improvements can be seen when heavy synchronous writes are performed, or when applications create and delete large volumes of small files, such as delivering a large flow of short email messages. The results of the testing effort described in Chapter 3 illustrate this topic.

When the writeback option is used, data consistency is similar to that provided by the ext2 file system. However, file system integrity is maintained continuously during normal operation in the ext3 file system. In the event of a power failure or system crash, the file system may not be recoverable if a significant portion of data was held only in system memory and not on permanent storage. In this case, the filesystem must be recreated from backups. Often, changes made since the file system was last backed up are inevitably lost.

[Aug 7, 2007] Linux Replacing atime


August 7, 2007 | KernelTrap | Last updated 02/28/2008 12:02:07
Submitted by Jeremy on August 7, 2007 - 9:26am.

In a recent lkml thread, Linus Torvalds was involved in a discussion about mounting filesystems with the noatime option for better performance, "'noatime,data=writeback' will quite likely be *quite* noticeable (with different effects for different loads), but almost nobody actually runs that way." He noted that he set O_NOATIME when writing git, "and it was an absolutely huge time-saver for the case of not having 'noatime' in the mount options. Certainly more than your estimated 10% under some loads."

The discussion then looked at using the relatime mount option to improve the situation, "relative atime only updates the atime if the previous atime is older than the mtime or ctime. Like noatime, but useful for applications like mutt that need to know when a file has been read since it was last modified."

Ingo Molnar stressed the significance of fixing this performance issue, "I cannot over-emphasize how much of a deal it is in practice. Atime updates are by far the biggest IO performance deficiency that Linux has today. Getting rid of atime updates would give us more everyday Linux performance than all the pagecache speedups of the past 10 years, _combined_." He submitted some patches to improve relatime, and noted about atime: "It's also perhaps the most stupid Unix design idea of all times. Unix is really nice and well done, but think about this a bit: 'For every file that is read from the disk, lets do a ... write to the disk! And, for every file that is already cached and which we read from the cache ... do a write to the disk!'"
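The relatime rule quoted above fits in one line of logic. The following sketch simply restates that quoted description in Python; the function name `relatime_needs_update` is hypothetical, timestamps are plain numbers (seconds), and the real kernel logic may differ in detail.

```python
def relatime_needs_update(atime: float, mtime: float, ctime: float) -> bool:
    """Apply the quoted relatime rule to one file's timestamps.

    The access time is rewritten only if the previous atime is older
    than the modification time or the inode change time.
    """
    return atime < mtime or atime < ctime


if __name__ == "__main__":
    # File modified after it was last read: the next read updates atime.
    print(relatime_needs_update(atime=100.0, mtime=200.0, ctime=200.0))
    # File already read since its last change: the read causes no write.
    print(relatime_needs_update(atime=300.0, mtime=200.0, ctime=200.0))
```

This is why a mail reader like mutt can still tell whether a mailbox has been read since new mail arrived, while repeated reads of an unchanged file no longer trigger a write each time.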

Linux Ext2 filesystem for Windows NT driver


Ext2 0.04 for NT4 read-write

Contacts and feedback: Andrey Shedel andreys@cr.cyco.com Primary site: http://www.chat.ru/~ashedel

CAUTION!!! This is an NT kernel-mode driver and you are using it at your own risk. It is highly recommended to use the sync utility to flush regular volumes first.

>> You should be aware of the fact that ext2.sys might <<
>> damage the data stored on your hard disks. <<

If you cannot agree to these conditions, you should NOT use ext2.sys!

Installation (you must be a member of the Administrators group):

1. Copy ext2.sys to your %systemroot%\system32\drivers directory.
2. Merge the ext2.reg file.
3. Reboot to update driver information.
4. Edit go.cmd to point to your Linux drive.
5. Run go.cmd.

Known features: Non-regular files are converted to regular at first write attempt.

Mounting partitions:

NT4:

Instead of loading the driver manually or automatically (by setting startup mode to 1), you can use fs_rec.sys (the recognizer driver). This driver is a superset of the recognizer that comes with NT4 and can be used instead of it. In addition to CDFS, NTFS and FAT (the standard set for NT4), it includes recognition modules for HPFS (for pinball.sys from NT3.51), FAT32-enabled fastfat and Ext2. It is not recommended to use this recognizer on NT5 because support for UDFS is not included. Unfortunately, even in this case you still have to set persistent links in the DosDevices namespace UNLESS YOU ARE USING NT5. For example:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\DOS Devices]
"E:"="\\Device\\Harddisk0\\Partition2"

NT5:

On NT5 you can use the Disk Management utility to assign a drive letter.

Files included:
readme.txt - this file
ext2.sys - driver
dosdev.exe - Define/RemoveDosDevice utility.
kloader.exe - utility to load kernel-mode driver.
ln.exe - hardlink creation utility.
SYNC.EXE - Flush write-behind cache utility.
fs_rec.sys - Recognizer driver.

Changes:
0.04: pagefile support; initial security implementation.

[Dec 12, 1999] Slashdot Ask Slashdot EXT3


A great way to follow kernel development is to read the excellent kernel mailing list synopses written by Zack Brown at: http://kt.linuxcare.com

Ext3fs is a journaled version of ext2fs written by Stephen Tweedie. It's in beta form right now but works pretty well. Stephen and Ted Ts'o talked about ext3fs at our Linux Storage Management Workshop in Darmstadt, Germany (you can get the slides for this workshop at ftp://linux.msede.com/lsmws_talks/).

The ext3 filesystem, of which early alphas are ready (version 0.0.2c, the excitement!!). Development is on the linux-fsdevel mailing list, archived here.

Hello, I've been running ext3 on my laptop computer for about two months now. It works great. Just sync the disks and turn it off. No shutdown. No data loss either.

If you look at e.g. Solaris disk-suite, you are able to control where you store your metadata. Say that you want to journal file data as well; this normally slows the system down. But if you can specify that all file metadata should be on a separate solid-state disk (naturally mirrored for safety), then journaling of file data will be quick and swift. This is in my view quite important. If I understand everything correctly you can do that with ext3.

One of the major problems with ext2fs (IMHO) is that it doesn't resize well. This is because there is a copy of every group descriptor in every group [a g.d. contains metadata for a group of blocks/inodes, typically 8M in size]. Therefore enlarging or shrinking the drive causes a major reshuffle of ALL the data; so far, the only utility I know that can do this is resize2fs, which comes with Partition Magic (there are no doubt others now). This redundancy is good in theory (backups), but keeping a copy of a constant number of group descriptors (perhaps the previous and next 32) in a given group would still give you a lot of redundancy plus make resizing simpler. Granted, resizing isn't something you do a lot, but having had my system lock up and die while resizing and having to recover using Turbo C++ and the ext2fs spec (code and info on my ext2fs page), it would be nice if ext3fs (or XFS) made this easier.

The Reiser Filesystems by Hans Reiser, a very ambitious project to not only improve performance and add journaling, but to redefine the filesystem as a storage repository for arbitrarily complex objects. Reiserfs is faster than ext2/3 because it uses balanced trees for its directory structures. The project is now released for 2.2.11 - 2.2.13. Mailing list archive here.

The Xfs site has some docs. The work to unencumber the code is accelerating, and February is the target date for source code release.

XFS is the one that I think has the most potential. It's a full logging filesystem from the ground up, not an extension (not that EXT3 or DTFS are bad or misguided efforts). I'm betting it will be the highest performance filesystem for linux when it goes gold. I think the tight integration of the log could be a huge plus. It's been a while since filesystem 101 but I would think that there are a ton of ways to optimize performance with log write back tricks and usage optimizations. You could include a hit counter in metadata and have an optimizer that moves higher hit files closer to the log in the center of the disk, making your more frequently used files closer to where the head is supposed to be. Those kinds of optimizations (if practical, maybe I'm full of it) wouldn't be nearly as easy with ext3 since the FS doesn't have any knowledge of the log. Plus xfs has ACLs and big file support already.

Hi, ext3fs is a journaled version of ext2fs written by Stephen Tweedie. It's in beta form right now but works pretty well. Stephen and Ted Ts'o talked about ext3fs at our Linux Storage Management Workshop in Darmstadt, Germany (you can get the slides for this workshop at ftp://linux.msede.com/lsmws_talks/). Stephen also gave a talk on ext3fs at the Linux Kongress in Augsburg, Germany. He is predicting Summer 2000 for production use of ext3fs. Nice features include the fact that ext3fs is backwards compatible with older versions of ext2. In addition, ext3fs uses asynchronous journaling, which means the performance will be as good or better than ext2fs.

I am involved with the SGI effort to port XFS to Linux. The work to unencumber the code is accelerating, and February is the target date for source code release. The read path is working at this time. More work remains however, so stay tuned to http://oss.sgi.com

From Slashdot

Q: I hate these "/dev/hda5 has reached maximal mount count; check forced". I hope they too go away with journaling...

A: Easy fix: raise the max-mount-counts and interval-between-checks for the filesystem with tune2fs. Example:

tune2fs -c 200 -i 700 /dev/sda1

The -l flag will show you, among other things, the current settings. Be aware that you are defeating a built-in safeguard to protect your data.
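To see what the two tune2fs settings in the answer above control, here is a simplified model of the decision behind the "check forced" message. It is a sketch, not e2fsck's code: the function name `check_forced` and its parameters are hypothetical, and it assumes that a value of 0 (or a negative value) switches the corresponding test off.

```python
def check_forced(mount_count: int, max_mount_count: int,
                 days_since_check: int, interval_days: int) -> bool:
    """Simplified model of when a periodic filesystem check is forced.

    max_mount_count corresponds to the -c setting and interval_days to
    the -i setting; non-positive values are treated as "disabled".
    """
    by_count = max_mount_count > 0 and mount_count >= max_mount_count
    by_time = interval_days > 0 and days_since_check >= interval_days
    return by_count or by_time


if __name__ == "__main__":
    # A low mount limit is reached quickly, so the check is forced.
    print(check_forced(mount_count=20, max_mount_count=20,
                       days_since_check=1, interval_days=180))
    # With the limits raised to 200 mounts / 700 days, the same
    # filesystem is no longer due for a check.
    print(check_forced(mount_count=20, max_mount_count=200,
                       days_since_check=1, interval_days=700))
```

Raising both limits makes the forced check rarer, which is exactly the safeguard the answer warns about weakening.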
