
UNIX Administration Course

Day 1:
Part 1: Introduction to the course. Introduction to UNIX. History
        of UNIX and key features. Comparison with other OSs.

Part 2: The basics: files, UNIX shells, editors, commands.
        Regular Expressions and Metacharacters in Shells.

Part 3: File ownership and access permissions.
        Online help (man pages, etc.)

Day 2:
Part 1: System identity (system name, IP address, etc.)
        Software: vendor, commercial, shareware, freeware (eg. GNU).
        Hardware features: auto-detection, etc.
        UNIX Characteristics: integration, stability,
        reliability, security, scalability, performance.

Part 2: Shell scripts.

Part 3: System monitoring tools and tasks.

Part 4: Further shell scripts. Application development tools:
        compilers, debuggers, GUI toolkits, high-level APIs.

Day 3:
Part 1: Installing an OS and software: inst, swmgr.
        OS updates, patches, management issues.

Part 2: Organising a network with a server. NFS. Quotas.
        Installing/removing internal/external hardware.
        SGI OS/software/hardware installation. Network setup.

Part 3: Daily system administration tasks, eg. data backup.
        System bootup and shutdown, events, daemons.

Part 4: Security/Access control: the law, firewalls, ftp.
        Internet access: relevant files and services.
        Course summary.

Part 5: Exploring administration issues, security, hacking,
        responsibility, end-user support, the law (discussion).
        Indy/Indy attack/defense using IRIX 5.3 vs. IRIX 6.5
        (two groups of 3 or 4 each).

Day 1:
Part 1: Introduction to the course. Introduction to UNIX. History
        of UNIX and key features. Comparison with other OSs.

Introduction to UNIX and the Course.

The UNIX operating system (OS) is widely used around the world, eg.

 The backbone of the Internet relies on UNIX-based systems and services, as do the
systems used by most Internet Service Providers (ISPs).
 Major aspects of everyday life are managed using UNIX-based systems, eg. banks,
booking systems, company databases, medical records, etc.
 Other 'behind the scenes' uses concern data-intensive tasks, eg. art, design, industrial
design, CAD and computer animation to real-time 3D graphics, virtual reality, visual
simulation & training, data visualisation, database management, transaction processing,
scientific research, military applications, computational challenges, medical modeling,
entertainment and games, film/video special effects, live on-air broadcast effects, space
exploration, etc.

As an OS, UNIX is not often talked about in the media, perhaps because there is no single large
company such as Microsoft to which one can point and say, "There's the company in charge of
UNIX." Most public talk is of Microsoft, Bill Gates, Intel, PCs and other more visible aspects of
the computing arena, partly because of the home-based presence of PCs and the rise of the
Internet in the public eye. This is ironic because OSs like MS-DOS, Win3.1, Win95 and WinNT
all draw many of their basic features from UNIX, though they lack UNIX's sophistication and
power, mainly because they lack so many key features and a lengthy development history.

In reality, a great deal of the everyday computing world relies on UNIX-based systems running
on computers from a wide variety of vendors such as Compaq (Digital Equipment Corporation,
or DEC), Hewlett Packard (HP), International Business Machines (IBM), Intel, SGI (was Silicon
Graphics Inc., now just 'SGI'), Siemens Nixdorf, Sun Microsystems (Sun), etc.

In recent years, many companies which previously relied on DOS or Windows have begun to
realise that UNIX is increasingly important to their business, mainly because of what UNIX has
to offer and why, eg. portability, security, reliability, etc. As demands for handling data grow,
and companies embrace new methods of manipulating data (eg. data mining and visualisation),
the need for systems that can handle these problems forces companies to look at solutions that
are beyond the Wintel platform in performance, scalability and power.

Oil companies such as Texaco [1] and Chevron [2] are typical organisations which already use
UNIX systems extensively because of their data-intensive tasks and a need for extreme reliability
and scalability. As costs have come down, along with changes in the types of available UNIX
system (newer low-end designs, eg. Ultra5, O2, etc.), small and medium-sized companies are
looking towards UNIX solutions to solve their problems. Even individuals now find that older
2nd-hand UNIX systems have significant advantages over modern Wintel solutions, and many
companies/organisations have adopted this approach too [3].

This course serves as an introduction to UNIX, its history, features, operation, use and services,
applications, typical administration tasks, and relevant related topics such as the Internet,
security and the Law. SGI's version of UNIX, called IRIX, is used as an example UNIX OS. The
network of SGI Indys and an SGI Challenge S server which I administer is used as an example UNIX
hardware platform.
The course lasts three days, each day consisting of a one hour lecture followed by a two hour
practical session in the morning, and then a three hour practical session in the afternoon; the only
exceptions to this are Day 1, which begins with a two hour lecture, and Day 3, which has a one
hour afternoon lecture.

Detailed notes are provided for all areas covered in the lectures and the practical sessions. With
new topics introduced step-by-step, the practical sessions enable first-hand familiarity with the
topics covered in the lectures.

As one might expect of an OS which has a vast range of features, capabilities and uses, it is not
possible to cover everything about UNIX in three days, especially the more advanced topics such
as kernel tuning which most administrators rarely have to deal with. Today, modern UNIX
hardware and software designs allow even very large systems with, for example, 64 processors to
be fully set up at the OS level in little more than an hour [4]. Hence, the course is based on the
author's experience of what a typical UNIX user and administrator (admin) has to deal with,
rather than attempting to present a highly compressed 'Grand Description of Everything' which
simply isn't necessary to enable an admin to perform real-world system administration on a daily
basis.

For example, the precise nature and function of the Sendmail email system on any flavour of
UNIX is not immediately easy to understand; looking at the various files and how Sendmail
works can be confusing. However, in the author's experience, due to the way UNIX is designed,
even a default OS installation without any further modification is sufficient to provide users with
a fully functional email service [5], a fact which shouldn't be of any great surprise since email is
a built-in aspect of any UNIX OS. Thus, the presence of email as a fundamental feature of UNIX
is explained, but configuring and customising Sendmail is not.

History of UNIX

Key:

BTL = Bell Telephone Laboratories
GE  = General Electric
WE  = Western Electric
MIT = Massachusetts Institute of Technology
BSD = Berkeley Software Distribution

Summary History:

1957: BTL creates the BESYS OS for internal use.
1964: BTL needs a new OS, develops Multics with GE and MIT.
1969: UNICS project started at BTL and MIT; OS written using the B
language.
1970: UNICS project well under way; renamed to UNIX (nobody knows by whom).
1971: UNIX book published. 60 commands listed.
1972: C language completed (a rewritten form of B). Pipe concept invented.
1973: UNIX used on 16 sites. Kernel rewritten in C. UNIX spreads rapidly.
1974: Work spreads to Berkeley. BSD UNIX is born.
1975: UNIX licensed to universities for free.
1978: Two UNIX styles, though similar and related: System V and BSD.
1980s: Many companies launch their versions of UNIX, including Microsoft.
       A push towards cross-platform standards: POSIX/X11/Motif.
       Independent organisations with cross-vendor membership
       control future development and standards. IEEE included.
1990s: 64bit versions of UNIX released. Massively scalable systems.
       Internet springs to life, based on UNIX technologies. Further
       standardisation efforts (OpenGL, UNIX95, UNIX98).

Detailed History.

UNIX is now nearly 40 years old. It began life in 1969 as a combined project run by BTL, GE and
MIT, initially created and managed by Ken Thompson and Dennis Ritchie [6]. The goal was to
develop an operating system for a large computer which could support hundreds of simultaneous
users. The very early phase actually started at BTL in 1957 when work began on what was to
become BESYS, an OS developed by BTL for their internal needs.

In 1964, BTL started on the third generation of their computing resources. They needed a new
operating system and so initiated the MULTICS (Multiplexed Information and Computing Service)
project in late 1964, a combined research programme between BTL, GE and MIT. Due to differing
design goals between the three groups, Bell pulled out of the project in 1969, leaving personnel
in Bell's Computing Science and Research Center with no usable computing environment.

As a response to this move, Ken Thompson and Dennis Ritchie offered to design a new OS for
BTL, using a PDP-7 computer which was available at the time. Early work was done in a
language designed for writing compilers and systems programming, called BCPL (Basic
Combined Programming Language). BCPL was quickly simplified and revised to produce a
better language called B.

By the end of 1969 an early version of the OS was completed; as a pun on the earlier Multics
work, it was named UNICS (Uniplexed Information and Computing Service) - an "emasculated
Multics". UNICS included a primitive kernel, an editor, assembler, a simple shell command
interpreter and basic command utilities such as rm, cat and cp. In 1970, extra funding arose from
BTL's internal use of UNICS for patent processing; as a result, the researchers obtained a DEC
PDP-11/20 for further work (24K RAM). At that time, the OS used 12K, with the remaining 12K
used for user programs and a RAM disk (file size limit was 64K, disk size limit was 512K).
BTL's Patent Department then took over the project, providing funding for a newer machine,
namely a PDP-11/45. By this time, UNICS had been abbreviated to UNIX - nobody knows
whose idea it was to change the name (probably just phonetic convenience).
In 1971, a book on UNIX by Thompson and Ritchie described over 60 commands, including:

 b (compile a B program)

 chdir (change working directory)

 chmod (change file access permissions)

 chown (change file ownership)

 cp (copy a file)

 ls (list directory contents)

 who (show who is on the system)

Even at this stage, fundamentally important aspects of UNIX were already firmly in place as core
features of the overall OS, eg. file ownership and file access permissions. Today, other operating
systems such as WindowsNT do not have these features as a rigorously integrated aspect of the
core OS design, resulting in a plethora of overhead issues concerning security, file management,
user access control and administration. These features, which are very important to modern
computing environments, are either added as convoluted bolt-ons to other OSs or are totally non-
existent (NT does have a concept of file ownership, but it isn't implemented very well;
regrettably, much of the advice given by people from VMS to Microsoft on how to implement
such features was ignored).

In 1972, Ritchie and Thompson rewrote B to create a new language called C. Around this time,
Thompson invented the 'pipe' - a standard mechanism for allowing the output of one program or
process to be used as the input for another. This became the foundation of the future UNIX OS
development philosophy: write programs which do one thing and do it well; write programs
which can work together and cooperate using pipes; write programs which support text streams
because text is a 'universal interface' [6].
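
For example, the '|' (pipe) character in any standard UNIX shell connects the output of one
command to the input of the next; a couple of typical combinations (purely illustrative - the
exact output depends on the system):

   who | wc -l             (count how many users are currently logged in)
   ls /etc | grep passwd   (list only those files in /etc whose names contain 'passwd')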

By 1973, UNIX had spread to sixteen sites, all within AT&T and WE. First made public at a
conference in October that year, within six months the number of sites using UNIX had tripled.
Following the publication of a paper describing UNIX in 'Communications of the ACM' in July 1974,
requests for the OS began to rapidly escalate. Crucially at this time, the fundamentals of C were
complete and much of UNIX's 11000 lines of code were rewritten in C - this was a major
breakthrough in operating systems design: it meant that the OS could be used on virtually any
computer platform since C was hardware independent.

In late 1974, Thompson went to the University of California at Berkeley to teach for a year. Working
with Bill Joy and Chuck Haley, the three developed the 'Berkeley' version of UNIX (named
BSD, for Berkeley Software Distribution), the source code of which was widely distributed to
students on campus and beyond, ie. students at Berkeley and elsewhere also worked on
improving the OS. BTL incorporated useful improvements as they arose, including some work
from a user in the UK. By this time, the use and distribution of UNIX was out of BTL's control,
largely because of the work at Berkeley on BSD.

Developments to BSD UNIX added the vi editor, C-based shell interpreter, the Sendmail email
system, virtual memory, and support for TCP/IP networking technologies (Transmission Control
Protocol/Internet Protocol). Again, a service as important as email was now a fundamental part
of the OS, eg. the OS uses email as a means of notifying the system administrator of system
status, problems, reports, etc. Any installation of UNIX for any platform automatically includes
email; by complete contrast, email is not a part of Windows3.1, Win95, Win98 or WinNT -
email for these OSs must be added separately (eg. Pegasus Mail), sometimes causing problems
which would not otherwise be present.

In 1975, a further revision of UNIX known as the Fifth Edition was released and licensed to
universities for free. After the release of the Seventh Edition in 1978, the divergence of UNIX
development along two separate but related paths became clear: System V (BTL) and BSD
(Berkeley). BTL and Sun combined to create System V Release 4 (SVR4) which brought
together System V with large parts of BSD. For a while, SVR4 was the more rigidly controlled,
commercial and properly supported (compared to BSD on its own), though important work
occurred in both versions and both continued to be alike in many ways. Fearing Sun's possible
domination, many other vendors formed the Open Software Foundation (OSF) to further work on
BSD and other variants. Note that in 1979, a typical UNIX kernel was still only 40K.

Because of a legal decree which prevented AT&T from selling the work of BTL, AT&T allowed
UNIX to be widely distributed via licensing schemes at minimal or zero cost. The first genuine
UNIX vendor, Interactive Systems Corporation, started selling UNIX systems for automating
office work. Meanwhile, the work at AT&T (various internal design groups) was combined, then
taken over by WE, which became UNIX System Laboratories (now owned by Novell). Later
releases included System III and various releases of System V. Today, most popular brands of
UNIX are based either on SVR4, BSD, or a combination of both (usually SVR4 with standard
enhancements from BSD, which for example describes SGI's IRIX version perfectly). As an
aside, there never was a System I since WE feared companies would assume a 'system 1' would
be bug-ridden and so would wait for a later release (or purchase BSD instead!).

It's worth noting the influence from the superb research effort at Xerox PARC, which was working
on networking technologies, electronic mail systems and graphical user interfaces, including the
proverbial 'mouse'. The Apple Mac arose directly from the efforts of Xerox PARC which,
incredibly and much against the wishes of many Xerox PARC employees, gave free
demonstrations to people such as Steve Jobs (founder of Apple) and sold their ideas for next to
nothing ($50000). This was perhaps the biggest financial give-away in history [7].

One reason why so many different names for UNIX emerged over the years was the practice of
AT&T to license the UNIX software, but not the UNIX name itself. The various flavours of
UNIX may have different names (SunOS, Solaris, Ultrix, AIX, Xenix, UnixWare, IRIX, Digital
UNIX, HP-UX, OpenBSD, FreeBSD, Linux, etc.) but in general the differences between them
are minimal. Someone who learns a particular vendor's version of UNIX (eg. Sun's Solaris) will
easily be able to adapt to a different version from another vendor (eg. DEC's Digital UNIX).
Most differences merely concern the names and/or locations of particular files, as opposed to any
core underlying aspect of the OS.

Further enhancements to UNIX included compilation management systems such as make and
Imake (allowing for a single source code release to be compiled on any UNIX platform) and
support for source code management (SCCS). Services such as telnet for remote communication
were also completed, along with ftp for file transfer, and other useful functions.

In the early 1980s, Microsoft developed and released its version of UNIX called Xenix (it's a
shame this wasn't pushed into the business market instead of DOS). The first 32bit version of
UNIX was released at this time. SCO developed UnixWare which is often used today by Intel for
publishing performance ratings for its x86-based processors [8]. SGI started IRIX in the early
1980s, combining SVR4 with an advanced GUI. Sun's SunOS sprang to life in 1984, which
became widely used in educational institutions. NeXT-Step arrived in 1989 and was hailed as a
superb development platform; this was the platform used to develop the game 'Doom', which was
then ported to DOS for final release. 'Doom' became one of the most successful and influential
PC games of all time and was largely responsible for the rapid demand for better hardware
graphics systems amongst home users in the early 1990s - not many people know that it was
originally designed on a UNIX system though. Similarly, much of the development work for
Quake was done using a 4-processor Digital Alpha system [9].

During the 1980s, developments in standardised graphical user interface elements were
introduced (X11 and Motif) along with other major additional features, especially Sun's
Networked File System (NFS) which allows multiple file systems, from multiple UNIX
machines from different vendors, to be transparently shared and treated as a single file structure.
Users see a single coherent file system even though the reality may involve many different
systems in different physical locations.

By this stage, UNIX's key features had firmly established its place in the computing world, eg.
multi-tasking and multi-user operation (many independent processes can run at once; many users can
use a single system at the same time; a single user can use many systems at the same time). However,
in general, the user interface to most UNIX variants was poor: mainly text based. Most vendors
began serious GUI development in the early 1980s, especially SGI which has traditionally
focused on visual-related markets [10].

From the point of view of a mature operating system, and certainly in the interests of companies
and users, there were significant moves in the 1980s and early 1990s to introduce standards
which would greatly simplify the cross-platform use of UNIX. These changes, which continue
today, include:

 The POSIX standard [6], begun in 1985 and released in 1990: a suite of application
programming interface standards which provide for the portability of application source
code relating to operating system services, managed by the X/Open group.
 X11 and Motif: GUI and windowing standards, managed by the X Consortium and OSF.
 UNIX95, UNIX98: a set of standards and guidelines to help make the various UNIX
flavours more coherent and cross-platform.
 OpenGL: a 3D graphics programming standard originally developed by SGI as GL
(Graphics Library), then IrisGL, eventually released as an open standard by SGI as
OpenGL and rapidly adopted by all other vendors.
 Journaled file systems such as SGI's XFS which allow the creation, management and use
of very large file systems, eg. multiple terabytes in size, with file sizes from a single byte
to millions of terabytes, plus support for real-time and predictable response. EDIT
(2008): Linux can now use XFS.
 Interoperability standards so that UNIX systems can seamlessly operate with non-UNIX
systems such as DOS PCs, WindowsNT, etc.

Standards Notes
POSIX:
X/Open eventually became UNIX International (UI), which competed for a while with
OSF. The US Federal Government initiated POSIX (essentially a version of UNIX),
requiring all government contracts to conform to the POSIX standard - this freed the US
government from being tied to vendor-specific systems, but also gave UNIX a major
boost in popularity as users benefited from the industry's rapid adoption of accepted
standards.

X11 and Motif:
Programming directly using low-level X11/Motif libraries can be non-trivial. As a result,
higher level programming interfaces were developed in later years, eg. the ViewKit
library suite for SGI systems. Just as 'Open Inventor' is a higher-level 3D graphics API to
OpenGL, ViewKit allows one to focus on developing the application and solving the
client's problem, rather than having to wade through numerous low-level details. Even
higher-level GUI-based toolkits exist for rapid application development, eg. SGI's
RapidApp.

UNIX95, UNIX98:
Most modern UNIX variants comply with these standards, though Linux is a typical
exception (it is POSIX-compliant, but does not adhere to other standards). There are
several UNIX variants available for PCs, excluding Alpha-based systems which can also
use NT (MIPS CPUs could once be used with NT as well, but Microsoft dropped NT
support for MIPS due to competition fears from Intel whose CPUs were not as fast at the
time [11]):

  Linux    - Open-architecture, free, global development, insecure.
  OpenBSD  - More rigidly controlled, much more secure.
  FreeBSD  - Somewhere in between the above two.
  UnixWare - More advanced. Scalable. Not free.

There are also commercial versions of Linux which have additional features and services,
eg. Red Hat Linux and Caldera Linux. Note that many vendors today are working to
enable the various UNIX variants to be used with Intel's CPUs - this is needed by Intel in
order to decrease its dependence on the various Microsoft OS products.

OpenGL:
Apple was the last company to adopt OpenGL. In the 1990s, Microsoft attempted to force
its own standards into the marketplace (Direct3D and DirectX) but this move was
doomed to failure due to the superior design of OpenGL and its ease of use, eg. games
designers such as John Carmack (Doom, Quake, etc.) decided OpenGL was the much
better choice for games development. Compared to Direct3D/DirectX, OpenGL is far
superior for seriously complex problems such as visual simulation, military/industrial
applications, image processing, GIS, numerical simulation and medical imaging.

In a move to unify the marketplace, SGI and Microsoft signed a deal in the late 1990s to
merge DirectX and Direct3D into OpenGL - the project, called Fahrenheit, will
eventually lead to a single unified graphics programming interface for all platforms from
all vendors, from the lowest PC to the fastest SGI/Cray supercomputer available with
thousands of processors. To a large degree, Direct3D will simply either be phased out in
favour of OpenGL's methods, or focused entirely on consumer-level applications, though
OpenGL will dominate in the final product for the entertainment market.

OpenGL is managed by the OpenGL Architecture Review Board, an independent
organisation with member representatives from all major UNIX vendors, relevant
companies and institutions.

Journaled file systems:
File systems like SGI's XFS running on powerful UNIX systems like the Cray Origin2000
can easily support sustained data transfer rates of hundreds of gigabytes per second. XFS
has a maximum file size limit of 9 million terabytes.

The end result of the last 30 years of UNIX development is what is known as an 'Open System',
ie. a system which permits reliable application portability, interoperability between different
systems and effective user portability between a wide variety of different vendor hardware and
software platforms. Combined with a modern set of compliance standards, UNIX is now a
mature, well-understood, highly developed, powerful and very sophisticated OS.

Many important features of UNIX do not exist in other OSs such as WindowsNT and will not do
so for years to come, if ever. These include guaranteeable reliability, security, stability, extreme
scalability (thousands of processors), proper support for advanced multi-processing with unified
shared memory and resources (ie. parallel compute systems with more than 1 CPU), support for
genuine real-time response, portability and an ever-increasing ease-of-use through highly
advanced GUIs. Modern UNIX GUIs combine the familiar use of icons with the immense power
and flexibility of the UNIX shell command line which, for example, supports full remote
administration (a significant criticism of WinNT is the lack of any real command line interface
for remote administration). By contrast, Windows2000 includes a colossal amount of new code
which will introduce a plethora of new bugs and problems.

A summary of key UNIX features would be:

 Multi-tasking: many different processes can operate independently at once.


 Multi-user: many users can use a single machine at the same time; a single user can use
multiple machines at the same time.
 Multi-processing: most commercial UNIX systems scale to at least 32 or 64 CPUs (Sun,
IBM, HP), while others scale to hundreds or thousands (IRIX, Unicos, AIX, etc.; Blue
Mountain [12], Blue Pacific, ASCI Red). Today, WindowsNT cannot reliably scale to
even 8 CPUs. Intel will not begin selling 8-way chip sets until Q3 1999.
 Multi-threading: automatic parallel execution of applications across multiple CPUs and
graphics systems when programs are written using the relevant extensions and libraries.
Some tasks are naturally non-threadable, eg. rendering animation frames for movies
(each processor computes a single frame using a round-robin approach), while others
lend themselves very well to parallel execution, eg. Computational Fluid Dynamics,
Finite Element Analysis, Image Processing, Quantum Chromodynamics, weather
modeling, database processing, medical imaging, visual simulation and other areas of 3D
graphics, etc.
 Platform independence and portability: applications written on UNIX systems will
compile and run on other UNIX systems if they're developed with a standards-based
approach, eg. the use of ANSI C or C++, Motif libraries, etc.; UNIX hides the hardware
architecture from the user, easing portability. The close relationship between UNIX and
C, plus the fact that the UNIX shell is based on C, provides for a powerful development
environment. Today, GUI-based development environments for UNIX systems also exist,
giving even greater power and flexibility, eg. SGI's WorkShop Pro CASE tools and
RapidApp.
 Full 64bit environment: proper support for very large memory spaces, up to hundreds of
GB of RAM, visible to the system as a single combined memory space. Comparison:
NT's current maximum limit is 4GB; IRIX's current commercial limit is 512GB, though
Blue Mountain's 6144-CPU SGI system has a current limit of 12000GB RAM (twice that
if the CPUs were upgraded to the latest model). Blue Mountain has 1500GB RAM
installed at the moment.
 Inter-system communication: services such as telnet, Sendmail, TCP/IP, remote login
(rlogin), DNS, NIS, NFS, etc. Sophisticated security and access control. Features such as
email and telnet are a fundamental part of UNIX, but they must be added as extras to
other OSs. UNIX allows one to transparently access devices on a remote system and even
install the OS using a CDROM, DAT or disk that resides on a remote machine. Note that
some of the development which went into these technologies was in conjunction with the
evolution of ArpaNet (the early Internet that was just for key US government, military,
research and educational sites).
 File identity and access: unique file ownership and a logical file access permission
structure provide very high-level management of file access for use by users and
administrators alike. OSs which lack these features as a core part of the OS make it far
too easy for a hacker or even an ordinary user to gain administrator-level access (NT is a
typical example).
 System identity: every UNIX system has a distinct unique entity, ie. a system name and
an IP (Internet Protocol) address. These offer numerous advantages for users and
administrators, eg. security, access control, system-specific environments, the ability to
login and use multiple systems at once, etc.
 Genuine 'plug & play': UNIX OSs already include drivers and support for all devices that
the source vendor is aware of. Adding most brands of disks, printers, CDROMs, DATs,
Floptical drives, ZIP or JAZ drives, etc. to a system requires no installation of any drivers
at all (the downside of this is that a typical modern UNIX OS installation can be large,
eg. 300MB). Detection and name-allocation to devices is largely automatic - there is no
need to assign specific interrupt or memory addresses for devices, or assign labels for
disk drives, ZIP drives, etc. Devices can be added and removed without affecting the
long-term operation of the system. This also often applies to internal components such as
CPUs, video boards, etc. (at least for SGIs).

UNIX Today.

In recent years, one aspect of UNIX that was holding it back from spreading more widely was
cost. Many vendors often charged too high a price for their particular flavour of UNIX. This
made its use by small businesses and home users prohibitive. The ever decreasing cost of PCs,
combined with the sheer marketing power of Microsoft, gave rise to the rapid growth of
Windows and now WindowsNT. However, in 1991, Linus Torvalds began developing a version of
UNIX called Linux (he pronounces it rather like 'leenoox', rhyming with 'see-books') which was
free and ran on PCs as well as other hardware platforms such as DEC machines. In what must be
one of the most astonishing developments of the computer age, Linux has rapidly grown to
become a highly popular OS for home and small business use and is now being supported by
many major companies too, including Oracle, IBM, SGI, HP, Dell and others.
Linux does not have the sophistication of the more traditional UNIX variants such as SGI's IRIX,
but Linux is free (older releases of IRIX such as IRIX 6.2 are also free, but not the very latest
release, namely IRIX 6.5). This has resulted in the rapid adoption of Linux by many people and
businesses, especially for servers, application development, home use, etc. With the recent
announcement of support for multi-processing in Linux for up to 8 CPUs, Linux is becoming an
important player in the UNIX world and a likely candidate to take on Microsoft in the battle for
OS dominance.

However, it will be a while before Linux is used for 'serious' applications, since it does not have the rigorous
development history and discipline of
other UNIX versions, eg. Blue Mountain
is an IRIX system consisting of 6144
CPUs, 1500GB RAM, 76000GB disk
space, and capable of 3000 billion
floating-point operations per second.
This level of system development is
what drives many aspects of today's
UNIX evolution and the hardware which
supports UNIX OSs. Linux lacks this
top-down approach and needs a lot of
work in areas such as security and
support for graphics, but Linux is
nevertheless becoming very useful in
fields such as render-farm construction
for movie studios, eg. racks of cheap PentiumIII machines networked together and running the
free Linux OS, reliable and stable. The film "Titanic" was the first major film which used a Linux-
based render-farm, though it employed many other UNIX systems too (eg. SGIs, Alphas), as
well as some NT systems.

EDIT (2008): Linux is now very much used for serious work, running most of the planet's
Internet servers, and widely used in movie studios for Flame/Smoke on professional x86
systems. It's come a long way since 1999, with new distributions such as Ubuntu and Gentoo
proving very popular. At the high-end, SGI offers products that range from its shared-memory
Linux-based Altix 4700 system with up to 1024 CPUs, to the Altix ICE, a highly expandable
XEON/Linux cluster system with some sites using machines with tens of thousands of cores.

UNIX has come a long way since 1969. Thompson and Ritchie could never have imagined that it
would spread so widely and eventually lead to its use in such things as the control of the Mars
Pathfinder probe which last year landed on Mars, including the operation of the Internet web
server which allowed millions of people around the world to see the images brought back as the
Martian event unfolded [13].

Today, from an administrator's perspective, UNIX is a stable and reliable OS which pretty much
runs itself once it's properly set up. UNIX requires far less daily administration than other OSs
such as NT - a factor not often taken into account when companies make purchasing decisions
(salaries are a major part of a company's expenditure). UNIX certainly has its baggage in terms
of file structure and the way some aspects of the OS actually work, but after so many years most
if not all of the key problems have been solved, giving rise to an OS which offers far superior
reliability, stability, security, etc. In that sense, UNIX has very well-known baggage which is
absolutely vital to safety-critical applications such as military, medical, government and
industrial use. Byte magazine once said that NT was only now tackling OS issues which other
OSs had solved years before [14].

Thanks to a standards-based and top-down approach, UNIX is evolving to remove its baggage in
a reliable way, eg. the introduction of the NSD (Name Service Daemon) to replace DNS
(Domain Name Service), NIS (Network Information Service) and aspects of NFS operation; the
new service is faster, more efficient, and easier on system resources such as memory and
network usage.

However, in the never-ending public relations battle for computer systems and OS dominance,
NT has firmly established itself as an OS which will be increasingly used by many companies
due to the widespread use of the traditional PC and the very low cost of Intel's mass-produced
CPUs. Rival vendors continue to offer much faster systems than PCs, whether or not UNIX is
used, so I expect to see interesting times ahead in the realm of OS development. Companies like
SGI bridge the gap by releasing advanced hardware systems which support NT (eg. the Visual
Workstation 320 [15]), systems whose design is born out of UNIX-based experience.

One thing is certain: some flavour of UNIX will always be at the forefront of future OS
development, whatever variant it may be.

References

1. Texaco processes GIS data in order to analyse suitable sites for oil exploration. Their
models can take several months to run even on large multi-processor machines. However,
as systems become faster, companies like Texaco simply try to solve more complex
problems, with more detail, etc.
2. Chevron's Nigerian office has what was, in mid-1998, the fastest supercomputer in
Africa, namely a 16-processor SGI POWER Challenge (probably replaced by now with a
modern 64-CPU Origin2000). A typical data set processed by the system is about 60GB
which takes around two weeks to process, during which time the system must not go
wrong or much processing time is lost. For individual work, Chevron uses Octane
workstations which are able to process 750MB of volumetric GIS data in less than three
seconds. Solving these types of problems with PCs is not yet possible.
3. The 'Tasmania Parks and Wildlife Services' (TPWS) organisation is responsible for the
management and environmental planning of Tasmania's National Parks. They use modern
systems like the SGI O2 and SGI Octane for modeling and simulation (virtual park
models to aid in decision making and planning), but have found that much older systems
such as POWER Series Predator and Crimson RealityEngine (SGI systems dating from
1992) are perfectly adequate for their tasks, and can still outperform modern PCs. For
example, the full-featured pixel-fill rate of their RealityEngine system (320M/sec), which
supports 48bit colour at very high resolutions (1280x2048 with 160MB VRAM), has still
not been bettered by any modern PC solution. Real-time graphics comparisons at
http://www.blender.nl/stuff/blench1.html show Crimson RE easily outperforming many
modern PCs which ought to be faster given RE is 7 years old. Information supplied by
Simon Pigot (TPWS SysAdmin).
4. "State University of New York at Buffalo Teams up with SGI for Next-Level
Supercomputing Site. New Facility Brings Exciting Science and Competitive Edge to
University":
http://www.sgi.com/origin/successes/buffalo.html

5. Even though the email-related aspects of the Computing Department's SGI network have
not been changed in any way from the default settings (created during the original OS
installation), users can still email other users on the system as well as send email to
external sites.
6. Unix history:
http://virtual.park.uga.edu/hc/unixhistory.html
A Brief History of UNIX:
http://pantheon.yale.edu/help/unixos/unix-intro.html
UNIX Lectures:
http://www.sis.port.ac.uk/~briggsjs/csar4/U2.htm
Basic UNIX:
http://osiris.staff.udg.mx/man/ingles/his.html
POSIX: Portable Operating System Interface:
http://www.pasc.org/abstracts/posix.htm

7. "The Triumph of the Nerds", Channel 4 documentary.


8. Standard Performance Evaluation Corporation:
http://www.specbench.org/
Example use of UnixWare by Intel for benchmark reporting:
http://www.specbench.org/osg/cpu95/results/res98q3/cpu95-980831-03026.html
http://www.specbench.org/osg/cpu95/results/res98q3/cpu95-980831-03023.html
9. "My Visit to the USA" (id Software, Paradigm Simulation Inc., NOA):
http://doomgate.gamers.org/dhs/dhs/usavisit/dallas.html
10. Personal IRIS 4D/25, PCW Magazine, September 1990, pp. 186:
http://www.futuretech.vuurwerk.nl/pcw9-90pi4d25.html
IndigoMagic User Environment, SGI, 1993 [IND-MAGIC-BRO(6/93)].

IRIS Indigo Brochure, SGI, 1991 [HLW-BRO-01 (6/91)].

"Smooth Operator", CGI Magazine, Vol4, Issue 1, Jan/Feb 1999, pp. 41-42.

Digital Media World '98 (Film Effects and Animation Festival, Wembley Conference
Center, London). Forty six pieces of work were submitted to the conference magazine by
company attendees. Out of the 46 items, 43 had used SGIs; of these, 34 had used only
SGIs.
11. "MIPS-based PCs fastest for WindowsNT", "MIPS Technologies announces 200MHz
R4400 RISC microprocessor", "MIPS demonstrates Pentium-class RISC PC designs", -
all from IRIS UK, Issue 1, 1994, pp. 5.
12. Blue Mountain, Los Alamos National Laboratory:
    http://www.lanl.gov/asci/
    http://www.lanl.gov/asci/bluemtn/ASCI_fly.pdf
    http://www.lanl.gov/asci/bluemtn/bluemtn.html
    http://www.lanl.gov/asci/bluemtn/t_sysnews.shtml
    http://www.lanl.gov/orgs/pa/News/111298.html#anchor263034

13. "Silicon Graphics Technology Plays Mission-Critical Role in Mars Landing":
    http://www.sgi.com/newsroom/press_releases/1997/june/jplmars_release.html
    "Silicon Graphics WebFORCE Internet Servers Power Mars Web Site, One of the
    World's Largest Web Events":
    http://www.sgi.com/newsroom/press_releases/1997/july/marswebforce_release.html
    "PC Users Worldwide Can Explore VRML Simulation of Mars Terrain Via the Internet":
    http://www.sgi.com/newsroom/press_releases/1997/june/vrmlmars_release.html

14. "Deja Vu All Over Again"; "Windows NT security is under fire. It's not just that there are
    holes, but that they are holes that other OSes patched years ago", Byte Magazine, Vol 22
    No. 11, November 1997 Issue, pp. 81-82, by Peter Mudge and Yobie Benjamin.

15. VisualWorkstation320 Home Page:
    http://visual.sgi.com/

Day 1:
Part 2: The basics: files, UNIX shells, editors, commands.
Regular Expressions and Metacharacters in Shells.

UNIX Fundamentals: Files and the File System.

At the lowest level, from a command-line point of view, just about everything in a UNIX
environment is treated as a file - even hardware entities, eg. printers, disks and DAT drives. Such
items might be described as 'devices' or with other terms, but at the lowest level they are visible
to the admin and user as files somewhere in the UNIX file system (under /dev in the case of
hardware devices). Though this structure may seem a little odd at first, it means that system
commands can use a common processing and communication interface no matter what type of
file they're dealing with, eg. text, pipes, data redirection, etc. (these concepts are explained in
more detail later).

The UNIX file system can be regarded as a top-down tree of files and directories, starting with
the top-most 'root' directory. A directory can be visualised as a filing cabinet, other directories as
folders within the cabinet and individual files as the pieces of paper within folders. It's a useful
analogy if one isn't familiar with file system concepts, but somewhat inaccurate since a directory
in a computer file system can contain loose files as well as other directories, ie. most
office filing cabinets don't have loose pieces of paper outside of folders.

UNIX file systems can also have 'hidden' files and directories. In DOS, a hidden file is just a file
with a special attribute set so that 'dir' and other commands do not show the file; by contrast, a
hidden file in UNIX is any file which begins with a dot '.' (period) character, ie. the hidden status
is a result of an aspect of the file's name, not an attribute that is bolted onto the file's general
existence. Further, whether or not a user can access a hidden file or look inside a hidden
directory has nothing to do with the fact that the file or directory is hidden from normal view (a
hidden file in DOS cannot be written to). Access permissions are a separate aspect of the
fundamental nature of a UNIX file and are dealt with later.

The 'ls' command lists files and directories in the current directory, or some other part of the file
system by specifying a 'path' name. For example:

ls /

will show the contents of the root directory, which may typically contain the following:

CDROM   dev       home   mapleson  proc       stand  usr
bin     dumpster  lib    nsmail    root.home  tmp    var
debug   etc       lib32  opt       sbin       unix

Figure 1. A typical root directory shown by 'ls'.


Almost every UNIX system has its own unique root directory and file system, stored on a disk within the
machine. The exception is a machine with no internal disk, running off a remote server in some way;
such systems are described as 'diskless nodes' and are very rare in modern UNIX environments, though
still used if a diskless node is an appropriate solution.

Some of the items in Fig 1 are files, while others are directories. If one uses the option '-F' with
the ls command, special characters are shown after the names for extra clarity:

/ - directory
* - executable file
@ - link to another file or directory elsewhere in the file system

Thus, using 'ls -F' gives this more useful output:

CDROM/  dev/       home/   mapleson/  proc/      stand/  usr/
bin/    dumpster/  lib/    nsmail/    root.home  tmp/    var/
debug/  etc/       lib32/  opt/       sbin/      unix*

Figure 2. The root directory shown by 'ls -F /'.


Fig 2 shows that most of the items are in fact other directories. Only two items are ordinary files: 'unix'
and 'root.home'. 'unix' is the main UNIX kernel file and is often several megabytes in size for today's
modern UNIX systems - this is partly because the kernel must often include support for 64bit as well as
older 32bit system components. 'root.home' is merely a file created when the root user accesses the
WWW using Netscape, ie. an application-specific file.

Important directories in the root directory:

/bin      - many as-standard system commands are here (links to /usr/bin)

/dev      - device files for keyboard, disks, printers, etc.

/etc      - system configuration files

/home     - user accounts are here (NFS mounted)

/lib      - library files used by executable programs

/sbin     - user applications and other commands

/tmp      - temporary directory (anyone can create files here).
            This directory is normally erased on bootup

/usr      - various product-specific directories, system resource
            directories, locations of online help (/usr/share),
            header files for application development (/usr/include),
            further system configuration files relating to low-level
            hardware which are rarely touched even by an
            administrator (eg. /usr/cpu and /usr/gfx)

/var      - X Windows files (/var/X11), system services files
            (eg. software licenses in /var/flexlm), various
            application-related files (/var/netscape, /var/dmedia),
            system administration files and data (/var/adm,
            /var/spool) and a second temporary directory (/var/tmp)
            which is not normally erased on bootup (an administrator
            can alter the behaviour of both /tmp and /var/tmp)

/mapleson - (non-standard) my home account is here, NFS-mounted
            from the admin Indy called Milamber.

Figure 3. Important directories in the root directory.


Comparisons with other UNIX variants such as HP-UX, SunOS and Solaris can be found in the many FAQ
(Frequently Asked Questions) files available via the Internet [1].

Browsing around the UNIX file system can be enlightening but also a little overwhelming at
first. However, an admin never has to be concerned with most parts of the file structure; low-
level system directories such as /var/cpu are managed automatically by various system tasks and
programs. Rarely, if ever, does an admin even have to look in such directories, never mind alter
their contents (the latter is probably an unwise thing to do).

From the point of view of a novice admin, the most important directory is /etc. It is this directory
which contains the key system configuration files and it is these files which are most often
changed when an admin wishes to alter system behaviour or properties. In fact, an admin can get
to grips with how a UNIX system works very quickly, simply by learning all about the following
files to begin with:

/etc/sys_id - the name of the system (may include full domain)

/etc/hosts  - summary of full host names (standard file,
              added to by the administrator)

/etc/fstab  - list of file systems to mount on bootup

/etc/passwd - password file, contains user account information

/etc/group  - group file, contains details of all user groups

Figure 4. Key files for the novice administrator.
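
As a quick illustration, these files are plain text and can be inspected with standard commands.
The output shown below is invented for illustration (the user 'mapleson' and host name 'akira'
are just examples), though the field layout of /etc/passwd is standard:

   % cat /etc/sys_id
   akira

   % grep mapleson /etc/passwd
   mapleson:x:1001:100:Ian Mapleson:/mapleson:/bin/csh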


Note that an admin also has a personal account, ie. an ordinary user account, which should be used for
any task not related to system administration. More precisely, an admin should only be logged in as root
when it is strictly necessary, mainly to avoid unintended actions, eg. accidental use of the 'rm'
command.
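
A typical sequence might therefore look like this (a sketch; the file being edited is just an
example, and '%' and '#' denote the ordinary-user and root prompts respectively):

   % su -                  (become root for one administrative task; password required)
   Password:
   # vi /etc/hosts         (do the work as root)
   # exit                  (drop back to the ordinary user account)
   %
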
A Note on the 'man' Command.

The manual pages and other online information for the files shown in Fig 4 all list references to
other related files, eg. the man page for 'fstab' lists 'mount' and 'xfs' in its 'SEE ALSO' section, as
well as an entry called 'filesystems' which is a general overview document about UNIX file
systems of all types, including those used by CDROMs and floppy disks. Modern UNIX releases
contain a large number of useful general reference pages such as 'filesystems'. Since one may not
know what is available, the 'k' and 'f' options can be used with the man command to offer
suggestions, eg. 'man -f file' gives this output (the -f option shows all man page titles for entries
that begin with the word 'file'):

ferror, feof, clearerr,
fileno (3S) stream status inquiries
file (1) determine file type
file (3Tcl) Manipulate file names and attributes
File::Compare (3) Compare files or filehandles
File::Copy (3) Copy files or filehandles
File::DosGlob (3) DOS like globbing and then some
File::Path (3) create or remove a series
of directories
File::stat (3) by-name interface to Perl's built-in
stat() functions
filebuf (3C++) buffer for file I/O.
FileCache (3) keep more files open than
the system permits
fileevent (3Tk) Execute a script when a file
becomes readable or writable
FileHandle (3) supply object methods for filehandles
filename_to_devname (2) determine the device name
for the device file
filename_to_drivername (2) determine the device name
for the device file
fileparse (3) split a pathname into pieces
files (7P) local files name service
parser library
FilesystemManager (1M) view and manage filesystems
filesystems: cdfs, dos,
fat, EFS, hfs, mac,
iso9660, cd-rom, kfs,
nfs, XFS, rockridge (4) IRIX filesystem types
filetype (5) K-AShare's filetype
specification file
filetype, fileopen,
filealtopen, wstype (1) determine filetype of specified
file or files
routeprint, fileconvert (1) convert file to printer or
to specified filetype

Figure 5. Output from 'man -f file'.


'man -k file' gives a much longer output since the '-k' option runs a search on every man page title
containing the word 'file'. So a point to note: judicious use of the man command along with other online
information is an effective way to learn how any UNIX system works and how to make changes to
system behaviour. All man pages for commands give examples of their use, a summary of possible
options, syntax, further references, a list of any known bugs with appropriate workarounds, etc.

The next most important directory is probably /var since this is where the configuration files for
many system services are often housed, such as the Domain Name Service (/var/named) and
Network Information Service (/var/yp). However, small networks usually do not need these
services which are aimed more at larger networks. They can be useful though, for example in
aiding Internet access.

Overall, a typical UNIX file system will contain several thousand files or more. It is possible for an
admin to manage a system without ever knowing what the majority of the system's files are for.
In fact, this is a preferable way of managing a system. When a problem arises, it is more
important to know where to find relevant information on how to solve the problem, rather than
try to learn the solution to every possible problem in the first instance (which is impossible).

I once asked an experienced SGI administrator (the first person to ever use the massive Cray
T3D supercomputer at the Edinburgh Parallel Computing Centre) what the most important thing
in his daily working life was. He said it was a small yellow note book in which he had written
where to find information about various topics. The book was an index on where to find facts,
not a collection of facts in itself.

Hidden files were described earlier. The '-a' option can be used with the ls command to show
hidden files:

ls -a /

gives:

./ .sgihelprc lib/
../ .ssh/ lib32/
.Acroread.License .varupdate mapleson/
.Sgiresources .weblink nsmail/
.cshrc .wshttymode opt/
.desktop-yoda/ .zmailrc proc/
.ebtpriv/ CDROM/ sbin/
.expertInsight bin/ stand/
.insightrc debug/ swap/
.jotrc* dev/ tmp/
.login dumpster/ unix*
.netscape/ etc/ usr/
.profile floppy/ var/
.rhosts home/

Figure 6. Hidden files shown with 'ls -a /'.


For most users, important hidden files would be those which configure their basic working environment
when they login:

.cshrc
.login
.profile
Other hidden files and directories refer to application-specific resources such as Netscape, or
GUI-related resources such as the .desktop-sysname directory (where 'sysname' is the name of
the host).

Although the behaviour of the ls command can be altered with the 'alias' command so that it
shows hidden files by default, the raw behaviour of ls can be accessed by using an absolute
directory path to the command:

/bin/ls

Using the absolute path to any file in this way allows one to ignore any aliases which may have
been defined, as well as the normal behaviour of the shell to search the user's defined path for the
first instance of a command. This is a useful technique when performing actions as root since it
ensures that the wrong command is not executed by mistake.
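
As a sketch (csh syntax, since the course systems use csh-style shells; the alias chosen here is
just an example):

   % alias ls 'ls -aF'     (make 'ls' show hidden files and type markers by default)
   % ls                    (runs the aliased version)
   % /bin/ls               (ignores the alias and the search path; runs the real ls)
   % unalias ls            (remove the alias again)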

Network File System (NFS)

An important feature of UNIX is the ability to access a particular directory on one machine from
another machine. This service is called the 'Network File System' (NFS) and the procedure itself
is called 'mounting'.

For example, on the machines in Ve24, the local /home directory is essentially empty - it contains
nothing but a README file (explained below). When one of the Indys is
turned on, it 'mounts' the /home directory from the server 'on top' of the /home directory of the
local machine. Anyone looking in the /home directory actually sees the contents of /home on the
server.

The 'mount' command is used to mount a directory on a file system belonging to a remote host
onto some directory on the local host's filesystem. The remote host must 'export' a directory in
order for other hosts to locally mount it. The /etc/exports file contains a list of directories to be
exported.
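
A minimal sketch of the server side (the directories listed are those used on this network; the
export options shown are illustrative - the exact syntax is described in 'man exports'):

   # cat /etc/exports
   /home      -rw
   /var/mail  -rw
   /var/www   -rw

   # exportfs -a           (re-export everything listed in /etc/exports)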

For example, the following shows how the /home directory on one of the Ve24 Indys (akira) is
mounted off the server, yet appears to an ordinary user to be just another part of akira's overall
file system (NB: the '#' indicates these actions are being performed as root; an ordinary user
would not be able to use the mount command in this way):

AKIRA 1# mount | grep YODA
YODA:/var/www on /var/www type nfs (vers=3,rw,soft,intr,bg,dev=c0001)
YODA:/var/mail on /var/mail type nfs (vers=3,rw,dev=c0002)
YODA:/home on /home type nfs (vers=3,rw,soft,intr,bg,dev=c0003)
AKIRA 1# ls /home
dist/ projects/ pub/ staff/ students/ tmp/ yoda/
AKIRA 2# umount /home
AKIRA 1# mount | grep YODA
YODA:/var/www on /var/www type nfs (vers=3,rw,soft,intr,bg,dev=c0001)
YODA:/var/mail on /var/mail type nfs (vers=3,rw,dev=c0002)
AKIRA 3# ls /home
README
AKIRA 4# mount /home
AKIRA 5# ls /home
dist/ projects/ pub/ staff/ students/ tmp/ yoda/
AKIRA 6# ls /
CDROM/ dev/ home/ mapleson/ proc/ stand/ usr/
bin/ dumpster/ lib/ nsmail/ root.home tmp/ var/
debug/ etc/ lib32/ opt/ sbin/ unix*

Figure 7. Manipulating an NFS-mounted file system with 'mount'.

Each Indy has a README file in its local /home, containing:

The /home filesystem from Yoda is not mounted for some reason.
Please contact me immediately!

Ian Mapleson, Senior Technician.

3297 (internal)
mapleson@gamers.org

After /home is remounted in Fig 7, the ls command no longer shows the README file as being
present in /home, ie. when /home is mounted from the server, the local contents of /home are
completely hidden and inaccessible.

When accessing files, a user never has to worry about the fact that the files in a directory which
has been mounted from a remote system actually reside on a physically separate disk, or even a
different UNIX system from a different vendor. Thus, NFS gives a seamless transparent way to
merge different file systems from different machines into one larger structure. At the
department where I studied years ago [2], their UNIX system included Hewlett Packard
machines running HP-UX, Sun machines running SunOS, SGIs running IRIX, DEC machines
running Digital UNIX, PCs running an X-Windows emulator called Windows Exceed, and some
Linux PCs. All the machines had access to a single large file structure so that any user could
theoretically use any system in any part of the building (except where deliberately prevented
from doing so via local system file alterations).

Another example is my home directory /mapleson - this directory is mounted from the admin
Indy (Technicians' office Ve48) which has my own extra external disk locally mounted. As far as
the server is concerned, my home account just happens to reside in /mapleson instead of
/home/staff/mapleson. There is a link to /mapleson from /home/staff/mapleson which allows
other staff and students to access my directory without having to ever be aware that my home
account files do not physically reside on the server.
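
Such a link is created with 'ln -s' (a sketch, run as root on the server; the paths are the ones
described above):

   # ln -s /mapleson /home/staff/mapleson
   # ls -F /home/staff | grep mapleson
   mapleson@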

Every user has a 'home directory'. This is where all the files owned by that user are stored. By
default, a new account would only include basic files such as .login, .cshrc and .profile. Admin
customisation might add a trash 'dumpster' directory, user's WWW site directory for public
access, email directory, perhaps an introductory README file, a default GUI layout, etc.

UNIX Fundamentals: Processes and process IDs.

As explained in the UNIX history, a UNIX OS can run many programs, or processes, at the same
time. From the moment a UNIX system is turned on, processes are being created (the very first is
init, which always has process ID 1). By the time a system is fully booted so that users can login
and use the system, many processes will be running at once. Each process has its own unique
identification number, or process ID. An administrator
can use these ID numbers to control which processes are running in a very direct manner.

For example, if a user has run a program in the background and forgotten to close it down before
logging off (perhaps the user's process is using up too much CPU time) then the admin can
shutdown the process using the kill command. Ordinary users can also use the kill command, but
only on processes they own.

Similarly, if a user's display appears frozen due to a problem with some application (eg.
Netscape) then the user can logon to a different system, login to the original system using rlogin,
and then use the kill command to shutdown the process at fault either by using the specific
process ID concerned, or by using a general command such as killall, eg.:

killall netscape

This will shutdown all currently running Netscape processes, so using specific ID numbers is
often attempted first.
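
For example, to terminate one specific Netscape process, one could look up its process ID with ps
and then use kill on that number (the process ID and ps output shown here are purely illustrative):

% ps -ef | grep netscape
mapleson  1234     1  0 10:02:33 ?     2:17 netscape
% kill 1234        (send the default termination signal to process 1234)
% kill -9 1234     (force-kill the process if it ignores the default signal)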

Most users only encounter the specifics of processes and how they work when they enter the
world of application development, especially the lower-level aspects of inter-process
communication (pipes and sockets). Users may often run programs containing bugs, perhaps
leaving processes which won't close on their own. Thus, kill can be used to terminate such
unwanted processes.

The way in which UNIX manages processes and the resources they use is extremely tight, ie. it is
very rare for a UNIX system to completely fall over just because one particular process has
caused an error. 3rd-party applications like Netscape are usually the most common causes of
process errors. Most UNIX vendors vigorously test their own system software to ensure it is,
as far as can be ascertained, error-free. One reason why a lot of work goes into ensuring programs
are bug free is that bugs in software are a common means by which hackers try to gain root
(admin) access to a system: by forcing a particular error condition, a hacker may be able to
exploit a bug in an application.

For an administrator, most daily work concerning processes is about ensuring that system
resources are not being overloaded for some reason, eg. a user running a program which is
forking itself repeatedly, slowing down a system to a crawl.

In the case of the SGI system I run, staff have access to the SGI server, so I must ensure that staff
do not carelessly run processes which hog CPU time. Various means are available by which an
administrator can restrict the degree to which any particular process can utilise system resources,
the most important being a process priority level (see the man pages for 'nice' and 'renice').
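
A brief illustration (the priority values and process ID are arbitrary; see the nice and renice man
pages for the exact options supported on a given system):

% nice +10 ./big_job &       (csh/tcsh built-in: start a job at a lower priority)
# renice 15 1234             (as root: lower the priority of the process with ID 1234)
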
The most common process-related command used by admins and users is 'ps', which displays the
current list of processes. Various options are available to determine which processes are
displayed and in what output format, but perhaps the most commonly used form is this:

ps -ef

which shows just about everything about every process, though other commands exist which can
give more detail, eg. the current CPU usage for each process (osview). Note that other UNIX
OSs (eg. SunOS) require slightly different options, eg. 'ps -aux' - this is an example of the kind of
difference which users might notice between System V and BSD derived UNIX variants.

The Pipe.

An important aspect of processes is inter-process communication. From an everyday point of
view, this involves the concept of pipes. A pipe, as the name suggests, acts as a communication
link between two processes, allowing the output of one process to be used as the input for
another. The pipe symbol is a vertical bar '|'.

One can use the pipe to chain multiple commands together, eg.:

cat *.txt | grep pattern | sort | lp

The above command sequence dumps the contents of all the files in the current directory ending
in .txt, but instead of the output being sent to the 'standard output' (ie. the screen), it is instead
used as the input for the grep operation which scans each incoming line for any occurrence of the
word 'pattern' (grep's output will only be those lines which do contain that word, if any). The
output from grep is then sorted line-by-line into alphanumeric order by the sort program.
Finally, the output from sort is sent to the printer using lp.

The use of pipes in this way provides an extremely effective way of combining many commands
together to form more powerful and flexible operations. By contrast, the equivalent facility in
DOS is far more limited: a DOS pipeline is simulated using temporary files, with each command
run one after the other rather than concurrently.
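
Two further typical examples of the same idea:

who | wc -l             (count how many users are currently logged in)
ls -l /usr/bin | more   (page through a long directory listing one screen at a time)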

Processes are explained further in a later lecture, but have been introduced now since certain
process-related concepts are relevant when discussing the UNIX 'shell'.

UNIX Fundamentals: The Shell Command Interface.

A shell is a command-line interface to a UNIX OS. One can enter simple commands (shell
commands, system commands, user-defined commands, etc.), but also more complex sequences of
commands, including expressions and even entire programs written in a scripting language known
as 'shell script'. The lowest-level shell is 'sh' (the Bourne shell); rarely used interactively by
ordinary users, it is often used by admins and system scripts, while the C shell and its
derivatives use a syntax very like the C language. Note that 'command' and 'program' are used
synonymously here.
Shells are not in any way like the PC DOS environment; shells are very powerful and offer users
and admins a direct communication link to the core OS, though ordinary users will find there is a
vast range of commands and programs which they cannot use since they are not the root user.

Modern GUI environments are popular and useful, but some tasks are difficult or impossible to
do with an iconic interface, or at the very least are simply slower to perform. Shell commands
can be chained together (the output of one command acts as the input for another), or placed into
an executable file like a program, except there is no need for a compiler and no object file - shell
'scripts' are widely used by admins for system administration and for performing common tasks
such as locating and removing unwanted files. Combined with the facility for full-scale remote
administration, shells are very flexible and efficient. For example, I have a single shell script
'command' which simultaneously reboots all the SGI Indys in Ve24. These shortcuts are useful
because they minimise keystrokes and mistakes. An admin who issues lengthy and complex
command lines repeatedly will find these shortcuts a handy and necessary time-saving feature.
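
As a minimal illustration (this is not the actual Ve24 reboot script; the directory and age
threshold are arbitrary), a script for locating and removing old temporary files might look like
this:

#!/bin/sh
# cleantmp: remove files under /var/tmp untouched for more than 7 days
find /var/tmp -type f -mtime +7 -print -exec /bin/rm {} \;

Once made executable with 'chmod +x cleantmp', the script can be run like any other command.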

Shells and shell scripts can also use variables, just as a C program can, though the syntax is
slightly different. The equivalent of if/then statements can also be used, as can case statements,
loop structures, etc. Novice administrators will probably not have to use if/then or other more
advanced scripting features at first, and perhaps not even after several years. It is certainly true
that any administrator who already knows the C programming language will find it very easy to
learn shell script programming, and also the other scripting languages which exist on UNIX
systems such as perl (Practical Extraction and Report Language), awk (pattern scanning and
processing language) and sed (text stream editor).

perl is a text-processing language, designed for processing text files, extracting useful data,
producing reports and results, etc. perl is a very powerful tool for system management, especially
combined with other scripting languages. However, perl is perhaps less easy to learn for a
novice; the perl man page says, "The language is intended to be practical (easy to use, efficient,
complete) rather than beautiful (tiny, elegant, minimal)." I have personally never had to write a
perl program as yet, or a program using awk or sed. This is perhaps a good example, if any were
required, of how largely automated modern UNIX systems are. Note that the perl man page
serves as the entire online guide to the perl language and is thus quite large.

An indication of the fact that perl and similar languages can be used to perform complex
processing operations can be seen by examining the humourous closing comment in the perl man
page:

"Perl actually stands for Pathologically Eclectic


Rubbish Lister, but don't tell anyone I said that."

Much of any modern UNIX OS actually operates using shell scripts, many of which use awk, sed
and perl as well as ordinary shell commands and system commands. These scripts can look quite
complicated, but in general they need not be of any concern to the admin; they are often quite old
(ie. written years ago), well understood and bug-free.

Although UNIX is essentially a text-based command-driven system, it is perfectly possible for
most users to do the majority or even all of their work on modern UNIX systems using just the
GUI interface. UNIX variants such as IRIX include advanced GUIs which combine the best of
both worlds. It's common for a new user to begin with the GUI and only discover the power of
the text interface later. This probably happens because most new users are already familiar with
other GUI-based systems (eg. Win95) and initially dismiss the shell interface because of prior
experience of an operating system such as DOS, ie. they perceive a UNIX shell to be just some
weird form of DOS. Shells are not DOS, ie.:

 DOS is an operating system. Win3.1 is built on top of DOS, as is Win95, etc.


 UNIX is an operating system. Shells are a powerful text command interface to UNIX and not the
OS itself. A UNIX OS uses shell techniques in many aspects of its operation.

Shells are thus nothing like DOS; they are closely related to UNIX in that the very first version of UNIX
included a shell interface, and both are written in C. When a UNIX system is turned on, a shell is used
very early in the boot sequence to control what happens and execute actions.

Because of the way UNIX works and how shells are used, much of UNIX's inner workings are
hidden, especially at the hardware level. This is good for the user who only sees what she or he
wants and needs to see of the file structure. An ordinary user focuses on their home directory and
certain useful parts of the file system such as /var/tmp and /usr/share, while an admin will also be
interested in other directories which contain system files, device files, etc. such as /etc, /var/adm
and /dev.

The most commonly used shells are:

bsh  - Bourne Shell; the standard/job-control
       command programming language

ksh  - Korn Shell; a modern alternative to bsh,
       but still restricted

csh  - Berkeley's C Shell; a better bsh
       with many additional features

tcsh - an enhanced version of csh

Figure 8. The various available shells.


These offer differing degrees of command access/history/recall/editing and support for shell script
programming, plus other features such as command aliasing (new names for user-defined sequences of
one or more commands). There is also rsh which is essentially a restricted version of the standard
command interpreter sh; it is used to set up login names and execution environments whose capabilities
are more controlled than those of the standard shell. Shells such as csh and tcsh execute the file
/etc/cshrc before reading the user's own .cshrc, .login and perhaps .tcshrc file if that exists.

Shells use the concept of a 'path' to determine how to find commands to execute. The 'shell path
variable', which is initially defined in the user's .cshrc or .tcshrc file, consists of a list of
directories, which may be added to by the user. When a command is entered, the shell
environment searches each directory listed in the path for the command. The first instance of a
file which matches the command is executed, or an error is given if no such executable command
is found. This feature allows multiple versions of the same command to exist in different
locations (eg. different releases of a commercial application). The user can change the path
variable so that particular commands will run a file from a desired directory.

Try:

echo $PATH

The list of directories is given.
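
A directory can be appended to the path if required, eg. (the directory chosen here is just an
example):

set path = ( $path /usr/local/bin )        (csh/tcsh)
PATH=$PATH:/usr/local/bin; export PATH     (sh/ksh)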

WARNING: the dot '.' character at the end of a path definition means 'current directory'; it is
dangerous to include this in the root user's path definition (this is because a root user could run
an ordinary user's program(s) by mistake). Even an ordinary user should think twice about
including a period at the end of their path definition. For example, suppose a file called 'la' was
present in /tmp and was set so that it could be run by any user. Entering 'la' instead of 'ls' by
mistake whilst in /tmp would fail to find 'la' in any normal system directory, but a period in the
path definition would result in the shell finding la in /tmp and executing it; thus, if the la file
contained malicious commands (eg. '/bin/rm -rf $HOME/mail'), then loss of data could occur.

Typical commands used in a shell include (most useful commands listed first):

cd - change directory
ls - list contents of directory
rm - delete a file (no undelete!)
mv - move a file
cat - dump contents of a file
more - display file contents in paged format
find - search file system for files/directories
grep - scan file(s) using pattern matching
man - read/search a man page (try 'man man')
mkdir - create a directory
rmdir - remove directory ('rm -r' has the same effect)
pwd - print current absolute working directory
cmp - compare two files byte by byte
lp - print a file
df - report file system disk space usage
du - show space used by directories/files
mail - send an email message to another user
passwd - change password (yppasswd for systems with NIS)

Figure 9. The commands used most often by any user.


Editors:

vi    - ancient editor. Rarely used (arcane),
        but occasionally useful, especially
        for remote administration.

xedit
jot
nedit - GUI editors (jot is old, nedit is
        newer, xedit is very simple).

Figure 10. Editor commands.

Most of these are not built-in shell commands. Enter 'man csh' or 'man tcsh' to see which commands are
part of the shell and hence which are other system programs, eg. 'which' is a shell command, but 'grep'
is not; 'cd' is a shell command, but 'ls' is not.

vi is an ancient editor developed in the very early days of UNIX when GUI-based displays did
not exist. It is not used much today, but many admins swear by it - this is only really because
they know it so well after years of experience. The vi editor can have its uses though, eg. for
remote administration: if you happen to be using a Wintel PC in an Internet cafe and decide to
access a remote UNIX system via telnet, the vi editor will probably be the only editor which you
can use to edit files on the remote system.

Jot has some useful features, especially for programmers (macros, "Electric C Mode"), but is old
and contains an annoying colour map bug; this doesn't affect the way jot works, but does
sometimes scramble on-screen colours within the jot window. SGI recommends nedit be used
instead.

xedit is a very simple text editor. It has an extremely primitive file selection interface, but has a
rather nice search/replace mechanism.

nedit is a newer GUI editor with more modern features.

jot is specific to SGI systems, while vi, xedit and nedit exist on any UNIX variant (if not by
default, then they can be downloaded in source code or executable format from relevant
anonymous ftp sites).

Creating a new shell:

sh, csh, tcsh, bsh, ksh - use man pages to see differences

I have configured the SGI machines in Ve24 to use tcsh by default due to the numerous extra
useful features in tcsh, including file name completion (TAB), command-line editing, alias
support, file listing in the middle of a typed command (CTRL-D), command recall/reuse, and
many others (the man page lists 36 main extras compared to csh).

Further commands:

which - show location of a command based
        on current path definition
chown - change owner ID of a file
chgrp - change group ID of a file
chmod - change file access permissions
who - show who is on the local system
rusers - show all users on local network
sleep - pause for a number of seconds
sort - sort data into a particular order
spell - run a spell-check on a file
split - split a file into a number of pieces
strings - show printable text strings in a file
cut - cut out selected fields of
each line of a file
tr - substitute/delete characters from
a text stream or file
wc - count lines, words and characters in a file
whoami - show user ID
write - send message to another user
wall - broadcast to all users on local system
talk - request 1:1 communication link
with another user
to_dos - convert text file to DOS format
(add CTRL-M and CTRL-Z)
to_unix - convert text file to UNIX format
(opposite of to_dos)
su - adopt the identity of another user
(password usually required)

Figure 11. The next most commonly used commands.

Of the commands shown in Fig 11, only 'which' is a built-in shell command.

Any GUI program can also be executed via a text command (the GUI program is just a high-
level interface to the main program), eg. 'fm' for the iconic file manager/viewer, 'apanel' for the
Audio Panel, 'printers' for the Printer Manager, 'iconbook' for the Icon Catalog, 'mouse' for
customising mouse settings, etc. However, not all text commands will have a GUI equivalent - this
is especially true of many system administration commands. Other categories are shown in Figs
12 to 17 below.

fx - repartition a disk, plus other functions
mkfs - make a file system on a disk
mount - mount a file system (NFS)
ln - create a link to a file or directory
tar - create/extract an archive file
gzip - compress a file (gunzip)
compress - compress a file (uncompress).
Different format from gzip.
pack - a further compression method (eg. used
with man pages and release notes)
head - show the first few lines in a file
tail - show the last few lines in a file

Figure 12. File system manipulation commands.

The tar command is another example where slight differences between UNIX variants exist with
respect to default settings. However, command options can always be used to resolve such
differences.
hinv - show hardware inventory (SGI specific)
uname - show OS version
gfxinfo - show graphics hardware information (SGI-specific)
sysinfo - print system ID (SGI-specific)
gmemusage - show current memory usage
ps - display a snapshot of running process information
top - constantly updated process list (GUI: gr_top)
kill - shutdown a process
killall - shutdown a group of processes
osview - system resource usage (GUI: gr_osview)
startconsole - system console, a kind of system monitoring
xterm which applications will echo messages into

Figure 13. System Information and Process Management Commands.

inst - install software (text-based)
swmgr - GUI interface to inst (the preferred
        method; easier to use)
versions - show installed software

Figure 14. Software Management Commands.

cc,
CC,
gcc - compile program (further commands
may exist for other languages)
make - run program compilation script
xmkmf - Use imake on an Imakefile to create
vendor-specific make file
lint - check a C program for errors/bugs
cvd - CASE tool, visual debugger for C
programs (SGI specific)

Figure 15. Application Development Commands.

relnotes - software release notes (GUI: grelnotes)
man - manual pages (GUI: xman)
insight - online books
infosearch - searchable interface to the above
three (IRIX 6.5 and later)

Figure 16. Online Information Commands (all available from the 'Toolchest')

telnet - open communication link
ftp - file transfer
ping - send test packets
traceroute - display traced route to remote host
nslookup - translate domain name into IP address
finger - probe remote host for user information

Figure 17. Remote Access Commands.

This is not a complete list! And do not be intimidated by the apparent plethora of commands. An
admin won't use most of them at first. Many commands are common to any UNIX variant, while
those that aren't (eg. hinv) probably have equivalent commands on other UNIX platforms.

Shells can be displayed in different types of window, eg. winterm, xterm. xterms comply with
the X11 standard and offer a wider range of features. xterms can be displayed on remote
displays, as can any X-based application (this includes just about every program one ever uses).
Security note: the system whose display is to be used must grant permission to the remote host,
ie. its X server must be configured to allow remote display (see the xhost command).
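
For example, to run a program on yoda but have its window appear on akira's screen, a sequence
something like the following could be used (assuming csh/tcsh on both machines):

akira% xhost +yoda            (on akira: allow clients from yoda to use the display)
akira% rlogin yoda
yoda%  setenv DISPLAY akira:0
yoda%  xman &                 (the xman window appears on akira's screen)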

If one is accessing a UNIX system via an older text-only terminal (eg. VT100) then the shell
operates in 'terminal' mode, where the particular characteristics of the terminal in use determine
how the shell communicates with the terminal (details of all known terminals are stored in the
/usr/lib/terminfo directory). Shells shown in visual windows (xterms, winterms, etc.) operate a
form of terminal emulation that can be made to exactly mimic a basic text-only terminal if
required.

Tip: if one ever decides to NFS-mount /usr/lib to save space (thus normally erasing the contents
of /usr/lib on the local disk), it is wise to at least leave behind the terminfo directory on the local
disk's /usr/lib; thus, should one ever need to logon to the system when /usr/lib is not mounted,
terminal communication will still operate normally.

The lack of a fundamental built-in shell environment in WindowsNT is one of the most common
criticisms made by IT managers who use NT. It's also why many high-level companies such as
movie studios do not use NT, eg. no genuine remote administration makes it hard to manage
clusters of several dozen systems all at once, partly because different systems may be widely
dispersed in physical location but mainly because remote administration makes many tasks
considerably easier and more convenient.

Regular Expressions and Metacharacters.

Shell commands can employ regular expressions and metacharacters which can act as a means
for referencing large numbers of files or directories, or other useful shortcuts. Regular
expressions are made up of a combination of alphanumeric characters and a series of punctuation
characters that have special meaning to the shell. These punctuation characters are called
metacharacters when they are used for their special meanings with shell commands.
The most common metacharacter is the wildcard '*', used to reference multiple files and/or
directories, eg.:

Dump the contents of all files in the current directory to the display:

cat *

Remove all object files in the current directory:

rm *.o

Search all files ending in .txt for the word 'Alex':

grep Alex *.txt

Print all files beginning with 'March' and ending in '.txt':

lp March*.txt

Print all files beginning with 'May':

lp May*

Note that it is not necessary to use 'May*.*' - this is because the dot is just another character that
can be a valid part of any UNIX file name at any position, ie. a UNIX file name may include
multiple dots. For example, the Blender shareware animation program archive file is called:

blender1.56_SGI_6.2_ogl.tar.gz

By contrast, DOS has a fixed file name format where the dot is a rigid aspect of any file name.
UNIX file names do not have to contain a dot character, and can even contain spaces (though
such names can confuse the shell unless one encloses the entire name in quotes "").

Other useful metacharacters relate to executing previously entered commands, perhaps with
modification, eg. the '!' is used to recall a previous command, as in:

!! - Repeat previous command


!grep - Repeat the last command which began with 'grep'

For example, an administrator might send 20 test packets to a remote site to see if the remote
system is active:

ping -c 20 www.sgi.com

Following a short break, the administrator may wish to run the same command again, which can
be done by entering '!!'. Minutes later, after entering other commands, the admin might want to
run the last ping test once more, which is easily possible by entering '!ping'. If no other command
had since been entered beginning with 'p', then even just '!p' would work.
The '^' character can be used to modify the previous command, eg. suppose I entered:

grwp 'some lengthy search string or whatever' *

grep has been spelled incorrectly here, so an error is given ('grwp: Command not found'). Instead
of typing the whole line again, I could enter:

^w^e

The shell searches the previous command for the first appearance of 'w', replaces that letter with
'e', displays the newly formed command as a means of confirmation and then executes the
command. Note: the '^' operator can only search for the first occurrence of the character or string
to be changed, ie. in the above example, the word 'whatever' is not changed to 'ehatever'. The
parameter to search for, and the pattern to replace any targets found, can be any standard regular
expression, ie. a valid sequence of ASCII characters. In the above example, entering
'^grwp^grep^' would have had the same effect, though is unnecessarily verbose.

Note that characters such as '!' and '^' operate entirely within the shell, ie. they are not
'memorised' as discrete commands. Thus, within a tcsh, using the Up-Arrow key to recall the
previous command after the '^w^e' command sequence does not show any trace of the '^w^e'
action. Only the corrected, executed command is shown.

Another commonly used character is the '&' symbol, normally employed to control whether or
not a process executed from within a shell is run in the foreground or background. As explained in
the UNIX history, UNIX can run many processes at once. Processes employ a parental
relationship whereby a process which creates a new process (eg. a shell running a program) is
said to be creating a child process. The act of creating a new process is called forking. When
running a program from within a shell, the prompt may not come back after the command is
entered - this means the new process is running in 'foreground', ie. the shell process is suspended
until such time as the forked process terminates. In order to run the process in background, which
will allow the shell process to carry on as before and still be used, the '&' symbol must be
included at the end of the command.

For example, the 'xman' command normally runs in the foreground: enter 'xman' in a shell and
the prompt does not return; close the xman program, or type CTRL-C in the shell window, and
the shell prompt returns. This effectively means the xman program is 'tied' to the process which
forked it, in this case the shell. If one closes the shell completely (eg. using the top-left GUI
button, or a kill command from a different shell) then the xman window vanishes too.

However, if one enters:

xman &

then the xman program is run in the 'background', ie. the shell prompt returns immediately (note
the space is optional, ie. 'xman&' is also valid). This means the xman session is now independent
of the process which forked it (the shell) and will still exist even if the shell is closed.
Many programs run in the background by default, eg. swmgr (install system software). The 'fg'
command can be used to bring a background job started from the same shell back into the
foreground (the shell's 'jobs' command lists such jobs). With no arguments, fg will attempt to
bring to the foreground the most recent process which was run in the background. Thus, after
entering 'xman&', the 'fg' command on its own will make the shell prompt vanish, as if the '&'
symbol had never been used.

A process currently running in the foreground can be deliberately 'suspended' using the CTRL-Z
sequence. Try running xman in the foreground within a shell and then typing CTRL-Z - the
phrase 'suspended' is displayed and the prompt returns, showing that the xman process has been
temporarily halted. It still exists, but is frozen. Try using the xman program at this point: notice
that the menus cannot be accessed and the window overlay/underlay actions are not dealt with
anymore.

Now go back to the shell and enter 'fg' - the xman program is brought back into the foreground
and begins running once more. As a final example, try CTRL-Z once more, but this time enter
'bg'. Now the xman process is pushed fully into the background. Thus, if one intends to run a
program in the background but forgets to include the '&' symbol, then one can use CTRL-Z
followed by 'bg' to place the process in the background.
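
The whole sequence looks something like this (the exact messages vary between shells):

% xman            (the '&' was forgotten, so the prompt does not return)
(press CTRL-Z)
Suspended
% bg              (xman resumes, now running in the background)
% jobs            (csh/tcsh: list current background and suspended jobs)
[1]  + Running     xman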

Note: it is worth mentioning at this point an example of how I once observed Linux to be
operating incorrectly. This example, seen in 1997, probably wouldn't happen today, but at the
time I was very surprised. Using a csh shell on a PC running Linux, I ran the xedit editor in the
background using:

xedit&

Moments later, I had cause to shutdown the relevant shell, but the xedit session terminated as
well, which should not have happened since the xedit process was supposed to be running in
background. Exactly why this happened I do not know - presumably there was a bug in the way
Linux handled process forking which I am sure has now been fixed. However, in terms of how
UNIX is supposed to work, it's a bug which should not have been present.

Actually, since many shells such as tcsh allow one to recall previous commands using the arrow
keys, and to edit such commands using Alt/CTRL key combinations and other keys, the need to
use metacharacters such as '!' and '^' is lessened. However, they're useful to know in case one
encounters a different type of shell, perhaps as a result of a telnet session to a remote site where
one may not have any choice over which type of shell is used.

Standard Input (stdin), Standard Output (stdout), Standard Error (stderr).

As stated earlier, everything in UNIX is basically treated as a file. This even applies to the
concept of where output from a program goes to, and where the input to a program comes from.
The relevant files, or text data streams, are called stdin and stdout (standard 'in', standard 'out').
Thus, whenever a command produces a visible output in a shell, what that command is actually
doing is sending its output to the file handle known as stdout. In the case of the user typing
commands in a shell, stdout is defined to be the display which the user sees.
Similarly, the input to a command comes from stdin which, by default, is the keyboard. This is
why, if you enter some commands on their own, they will appear to do nothing at first, when in
fact they are simply waiting for input from the stdin stream, ie. the keyboard. Enter 'cat' on its
own and see what happens; nothing at first, but then enter any text sequence - what you enter is
echoed back to the screen, just as it would be if cat was dumping the contents of a file to the
screen.

This stdin input stream can be temporarily redefined so that a command takes its input from
somewhere other than the keyboard. This is known as 'redirection'. Similarly, the stdout stream
can be redirected so that the output goes somewhere other than the display. The '<' and '>'
symbols are used for data redirection. For example:

ps -ef > file

This runs the ps command, but sends the output into a file. That file could then be examined with
cat, more, or loaded into an editor such as nedit or jot.

Try:

cat > file

You can then enter anything you like until such time as some kind of termination signal is sent,
either CTRL-D which acts to end the text stream, or CTRL-C which stops the cat process. Type
'hello', press Enter, then press CTRL-D. Enter 'cat file' to see the file's contents.

A slightly different form of output redirection is '>>' which appends a data stream to the end of
an existing file, rather than completely overwriting its current contents. Enter:

cat >> file

and type 'there!' followed by Enter and then CTRL-D. Now enter 'cat file' and you will see:

% cat file
hello
there!

By contrast, try the above again but with the second operation also using the single '>' operator.
This time, the file's contents will only be 'there!'. And note that the following has the same effect
as 'cat file' (why?):

cat < file

Anyone familiar with C++ programming will recognise this syntax as being similar to the way
C++ programs display output.

Input and output redirection is used extensively by system shell scripts. Users and administrators
can use these operators as a quick and convenient way for managing program input and output.
For example, the output from a find command could be redirected into a file for later
examination. I often use 'cat > whatever' as a quick and easy way to create a short file without
using an editor.

Error messages from programs and commands are also often sent to a different output stream
called stderr - by default, stderr is also the relevant display window, or the Console Window if
one exists on-screen.

The numeric file handles associated with these three text streams are:

0 - stdin
1 - stdout
2 - stderr

These numbers can be placed before the < and > operators to select a particular stream to deal
with. Examples of this are given in the notes on shell script programming (Day 2).
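
As a quick illustration ahead of the Day 2 notes (Bourne shell syntax; csh/tcsh differ slightly):

find / -name core -print > hits 2> errors    (stdout into one file, stderr into another)
find / -name core -print > all 2>&1          (send stderr to wherever stdout is going)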

The '&&' combination allows one to chain commands together so that each command is only
executed if the preceding command was successful, eg.:

run_my_prog_which_takes_hours > results && lp results

In this example, some arbitrary program is executed which is expected to take a long time. The
program's output is redirected into a file called results. If and only if the program terminates
successfully will the results file be sent to the default printer by the lp program. Note: only the
program's standard output is stored in the results file; any error messages sent to stderr will
still appear on the display unless stderr is redirected as well.

One common use of the && sequence is for on-the-spot backups:

cd /home && tar cv . && eject

This sequence changes directory to the /home area, archives the contents of /home to DAT and
ejects the DAT tape once the archive process has completed. Note that the eject command
without any arguments will search for a default removable media device, so this example
assumes there is only one such device, a DAT drive, attached to the system. Otherwise, one
could use 'eject /dev/tape' to be more specific.

The semicolon can also be used to chain commands together, but in a manner which does not
require each command to be successful in order for the next command to be executed, eg. one
could run two successive find commands, searching for different types of file, like this (try
executing this command in the directory /mapleson/public_html/sgi):

find . -name "*.gz" -print; find . -name "*.mpg" -print

The output given is:

./origin/techreport/compcon97_dv.pdf.gz
./origin/techreport/origin_chap7.pdf.gz
./origin/techreport/origin_chap6.pdf.gz
./origin/techreport/origin_chap5.pdf.gz
./origin/techreport/origin_chap4.pdf.gz
./origin/techreport/origin_chap3.pdf.gz
./origin/techreport/origin_chap2.pdf.gz
./origin/techreport/origin_chap1.5.pdf.gz
./origin/techreport/origin_chap1.0.pdf.gz
./origin/techreport/compcon_paper.pdf.gz
./origin/techreport/origin_techrep.pdf.tar.gz
./origin/techreport/origin_chap1-7TOC.pdf.gz
./pchall/pchal.ps.gz
./o2/phase/phase6.mpg
./o2/phase/phase7.mpg
./o2/phase/phase4.mpg
./o2/phase/phase5.mpg
./o2/phase/phase2.mpg
./o2/phase/phase3.mpg
./o2/phase/phase1.mpg
./o2/phase/phase8.mpg
./o2/phase/phase9.mpg

If one changes the first find command so that it will give an error, the second find command still
executes anyway:

% find /tmp/gurps -name "*.gz" -print ; find . -name "*.mpg" -print


cannot stat /tmp/gurps
No such file or directory
./o2/phase/phase6.mpg
./o2/phase/phase7.mpg
./o2/phase/phase4.mpg
./o2/phase/phase5.mpg
./o2/phase/phase2.mpg
./o2/phase/phase3.mpg
./o2/phase/phase1.mpg
./o2/phase/phase8.mpg
./o2/phase/phase9.mpg

However, if one changes the ; to && and runs the sequence again, this time the second find
command will not execute because the first find command produced an error:

% find /tmp/gurps -name "*.gz" -print && find . -name "*.mpg" -print
cannot stat /tmp/gurps
No such file or directory

As a final example, enter the following:

find /usr -name "*.htm*" -print & find /usr -name "*.rgb" -print &

This command runs two separate find processes, both in the background at the same time. Unlike
the previous examples, the output from each command is displayed first from one, then from the
other, and back again in a non-deterministic manner, as and when matching files are located by
each process. This is clear evidence that both processes are running at the same time. To shut
down the processes, either use 'killall find' or enter 'fg' followed by the use of CTRL-C twice (or
one could use kill with the appropriate process IDs, identifiable using 'ps -ef | grep find').

When writing shell script files, the ; symbol is most useful when one can identify commands
which do not depend on each other. This symbol, and the other symbols described here, are
heavily used in the numerous shell script files which manage many aspects of any modern UNIX
OS.

Note: if non-dependent commands are present in a script file or program, this immediately
allows one to imagine the idea of a multi-threaded OS, ie. an OS which can run many processes
in parallel across multiple processors. A typical example use of such a feature would be batch
processing scripts for image processing of medical data, or scripts that manage database systems,
financial accounts, etc.

References:

1. HP-UX/SUN Interoperability Cookbook, Version 1.0, Copyright 1994 Hewlett-Packard Co.:
   http://www.hp-partners.com/ptc_public/techsup/SunInterop/

   comp.sys.hp.hpux FAQ, Copyright 1995 by Colin Wynd:
   http://hpux.csc.liv.ac.uk/hppd/FAQ/

2. Department of Computer Science and Electrical Engineering, Heriot Watt University, Riccarton
   Campus, Edinburgh, Scotland:
   http://www.cee.hw.ac.uk/
Day 1:
Part 3: File ownership and access permissions.
Online help (man pages, etc.)

UNIX Fundamentals: File Ownership

UNIX has the concept of file 'ownership': every file has a unique owner, specified by a user ID
number contained in /etc/passwd. When examining the ownership of a file with the ls command,
one always sees the symbolic name for the owner, unless the corresponding ID number does not
exist in the local /etc/passwd file and is not available by any system service such as NIS.

Every user belongs to a particular group; in the case of the SGI system I run, every user belongs
to either the 'staff' or 'students' group (note that a user can belong to more than one group, eg. my
network has an extra group called 'projects'). Group names correspond to unique group IDs and
are listed in the /etc/group file. When listing details of a file, usually the symbolic group name is
shown, as long as the group ID exists in the /etc/group file, or is available via NIS, etc.
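
Each line in /etc/group uses a simple colon-separated format: group name, an (unused) password
field, the numeric group ID, and a comma-separated list of members. The entries below are
invented purely for illustration:

staff::10:mapleson,alex,sam
students::20:cmp1a,cmp1b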

For example, the command:

ls -l /

shows the full details of all files in the root directory. Most of the files and directories are owned
by the root user, and belong to the group called 'sys' (for system). An exception is my home
account directory /mapleson which is owned by me.

Another example command:

ls -l /home/staff

shows that every staff member owns their particular home directory. The same applies to
students, and to any user which has their own account. The root user owns the root account (ie.
the root directory) by default.

The existence of user groups offers greater flexibility in how files are managed and the way in
which users can share their files with other users. Groups also offer the administrator a logical
way of managing distinct types of user, eg. a large company might have several groups:

accounts
clerical
investors
management
security

The admin decides on the exact names. In reality though, a company might have several internal
systems, perhaps in different buildings, each with their own admins and thus possibly different
group names.
UNIX Fundamentals: Access Permissions

Every file also has a set of file 'permissions'; the file's owner can set these permissions to alter
who can read, write or execute the file concerned. The permissions for any file can be examined
using the ls command with the -l option, eg.:

% ls -l /etc/passwd
-rw-r--r--    1 root     sys      1306 Jan 31 17:07 /etc/passwd
 uuugggooo      owner    group    size date modified  name

Each file has three sets of file access permissions (uuu, ggg, ooo), relating to:

 the files owner, ie. the 'user' field


 the group which the file's owner belongs to
 the 'rest of the world' (useful for systems with more than one group)

This discussion refers to the above three fields as 'user', 'group' and 'others'. In the above example, the
three sets of permissions are represented by the fields shown as uuu, ggg and ooo, ie. the main system
password file can be read by any user that has access to the relevant host, but can only be modified by
the root user. The very first character is separate from the permission fields: it is shown as 'd' if the file
is a directory, or 'l' if the file is a link to some other file or directory (many examples of this can be
found in the root directory and in /etc).

Such a combination of options offers great flexibility, eg. one can have private email (user-only),
or one can share documents only amongst one's group (eg. staff could share exam documents, or
students could share files concerning a Student Union petition), or one can have files that are
accessible by anyone (eg. web pages). The same applies to directories, eg. since a user's home
directory is owned by that user, an easy way for a user to prevent anyone else from accessing
their home directory is to remove all read and execute permissions for groups and others.
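
For example, a single command of this form would make a home directory private (any home
directory path could be substituted):

chmod go-rx /home/staff/mapleson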

File ownership and file access permissions are a fundamental feature of every UNIX file,
whether that file is an ordinary file, a directory, or some kind of special device file. As a result,
UNIX as an OS has inherent built-in security for every file. This can lead to problems if the
wrong permissions are set for a file by mistake, but assuming the correct permissions are in
place, a file is effectively secure.

Note that no non-UNIX operating system for PCs yet offers this fundamental concept of file-
ownership at the very heart of the OS, a feature that is definitely required for proper security.
This is largely why industrial-level companies, military, and government institutions do not use
NT systems where security is important. In fact, only Cray's Unicos (UNIX) operating system
passes all of the US DoD's security requirements.
Relevant Commands:

chown - change file ownership

chgrp - change group status of a file

chmod - change access permissions for one or more files

For a user to alter the ownership and/or access permissions of a file, the user must own that file.
Without the correct ownership, an error is given, eg. assuming I'm logged on using my ordinary
'mapleson' account:

% chown mapleson var
var - Operation not permitted

% chmod go+w /var
chmod() failed on /var: Operation not permitted

% chgrp staff /var
/var - Operation not permitted

All of these operations are attempting to access files owned by root, so they all fail.

Note: the root user can access any file, no matter what ownership or access permissions have
been set (the usual permission restrictions simply do not apply to root). As a result, most
hacking attempts on UNIX systems revolve around trying to gain root privileges.

Most ordinary users will rarely use the chown or chgrp commands, but administrators may often
use them when creating accounts, installing custom software, writing scripts, etc.

For example, an admin might download some software for all users to use, installing it
somewhere in /usr/local. The final steps might be to change the ownership of every newly
installed file to ensure that it is owned by root, with the group set to sys, and then to use chmod
to ensure any newly installed executable programs can be run by all users, and perhaps to restrict
access to original source code.
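
A rough sketch of those final steps (the package directory and file layout shown here are
invented):

cd /usr/local/somepackage
chown -R root.sys .       (combinational chown: set user and group in one go)
chmod a+rx bin/*          (all users may run the programs)
chmod go-rwx src          (restrict access to the original source code)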

Although chown is normally used to change the user ID of a file, and chgrp the group ID, chown
can actually do both at once. For example, while acting as root:

yoda 1# echo hello > file
yoda 2# ls -l file
-rw-r--r-- 1 root sys 6 May 2 21:50 file
yoda 3# chgrp staff file
yoda 4# chown mapleson file
yoda 5# ls -l file
-rw-r--r-- 1 mapleson staff 6 May 2 21:50 file
yoda 6# /bin/rm file
yoda 7# echo hello > file
yoda 8# ls -l file
-rw-r--r-- 1 root sys 6 May 2 21:51 file
yoda 9# chown mapleson.staff file
yoda 10# ls -l file
-rw-r--r-- 1 mapleson staff 6 May 2 21:51 file

Figure 18. Using chown to change both user ID and group ID.

Changing File Permissions: Examples.

The general syntax of the chmod command is:

chmod [-R] <mode> <filename(s)>

Where <mode> defines the new set of access permissions. The -R option is optional (denoted by
square brackets []) and can be used to recursively change the permissions for the contents of a
directory.

<mode> can be defined in two ways: using Octal (base-8) numbers or by using a sequence of
meaningful symbolic letters. This discussion covers the symbolic method since the numeric
method (described in the man page for chmod) is less intuitive to use. I wouldn't recommend an
admin use Octal notation until greater familiarity with how chmod works is attained.

<mode> can be summarised as containing three parts:

U operator P

where U is one or more characters corresponding to user, group, or other; operator is +, -, or =,
signifying assignment of permissions; and P is one or more characters corresponding to the
permission mode.

Some typical examples would be:

chmod go-r file - remove read permission for groups and others
chmod ugo+rx file - add read/execute permission for all
chmod ugo=r file - set permission to read-only for all users

A useful abbreviation in place of 'ugo' is 'a' (for all), eg.:

chmod a+rx file - give read and execute permission for all
chmod a=r file - set to read-only for all

For convenience, if the U part is missing, the command automatically acts for all, eg.:

chmod -x file - remove executable access from everyone
chmod =r file - set to read-only for everyone

though if a change in write permission is included, said change only affects user, presumably for
better security:
chmod +w file - add write access only for user
chmod +rwx file - add read/execute for all, add write only for
user
chmod -rw file - remove read from all, remove write from user

Note the difference between the +/- operators and the = operator: + and - add or take away from
existing permissions, while = sets all the permissions to a particular state, eg. consider a file
which has the following permissions as shown by ls -l:

-rw-------

The command 'chmod +rx' would change the permissions to:

-rwxr-xr-x

while the command 'chmod =rx' would change the permissions to:

-r-xr-xr-x

ie. the latter command has removed the write permission from the user field because the rx
permissions were set for everyone rather than just added to an existing state. Further examples of
possible permissions states can be found in the man page for ls.

A clever use of file ownership and groups can be employed by anyone to 'hand over' ownership
of a file to another user, or even to root. For example, suppose user alex arranges with user sam
to leave a new version of a project file (eg. a C program called project.c) in the /var/tmp
directory of a particular system at a certain time. User alex not only wants sam to be able to read
the file, but also to remove it afterwards, eg. move the file to sam's home directory with mv.
Thus, alex could perform the following sequence of commands:

cp project.c /var/tmp    - copy the file
cd /var/tmp              - change directory
chmod go-rwx project.c   - remove all access for everyone else
chown sam project.c      - change ownership to sam

Figure 19. Handing over file ownership using chown.

Fig 19 assumes alex and sam are members of the same group, though an extra chgrp command
could be used before the chown if this wasn't the case, or a combinational chown command used
to perform both changes at once.

After the above commands, alex will not be able to read the project.c file, or remove it. Only sam
has any kind of access to the file.

I once used this technique to show students how they could 'hand-in' project documents to a
lecturer in a way which would not allow students to read each others' submitted work.
Note: it can be easy for a user to 'forget' about the existence of hidden files and their associated
permissions. For example, someone doing some confidential movie editing might forget or not
even know that temporary hidden files are often created for intermediate processing. Thus,
confidential tasks should always be performed by users inside a sub-directory in their home
directory, rather than just in their home directory on its own.

Experienced users make good use of file access permissions to control exactly who can access
their files, and even who can change them.

Experienced administrators develop a keen eye and can spot when a file has unusual or perhaps
unintended permissions, eg.:

-rwxrwxrwx

if a user's home directory has permissions like this, it means anybody can read, write and execute
files in that directory: this is insecure and was probably not intended by the user concerned.

A typical example of setting appropriate access permissions is shown by my home directory:

ls -l /mapleson

Only those directories and files that I wish to be readable by anyone have the group and others
permissions set to read and execute.

Note: to aid security, in order for a user to access a particular directory, the execute permission
must be set on for that directory as well as read permission at the appropriate level (user, group,
others). Also, only the owner of a file can change the permissions or ownership state for that file
(this is why a chown/chgrp sequence must have the chgrp done first, or both at once via a
combinational chown).

The Set-UID Flag.

This special flag appears as an 's' instead of 'x' in either the user or group fields of a file's
permissions, eg.:

% ls -l /sbin/su
-rwsr-xr-x 1 root sys 40180 Apr 10 22:12 /sbin/su*

The online book, "IRIX Admin: Backup, Security, and Accounting", states:

"When a user runs an executable file that has either of these


permissions, the system gives the user the permissions of the
owner of the executable file."

An admin might use su to temporarily become root or another user without logging off. Ordinary
users may decide to use it to enable colleagues to access their account, but this should be
discouraged since using the normal read/write/execute permissions should be sufficient.
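
The set-UID and set-GID bits are themselves set with chmod, eg. (shown purely as an
illustration; making programs set-UID should be done with great care):

chmod u+s program      (set-UID: the program runs with its owner's permissions)
chmod g+s program      (set-GID: the program runs with its group's permissions)
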
Mandatory File Locking.

If the 'l' flag is set in a file's group permissions field, then the file will be locked while another
user from the same group is accessing the file. For example, file locking allows a user to gather
data from multiple users in their own group via a group-writable file (eg. petition, questionnaire,
etc.), but blocks simultaneous file-write access by multiple users - this prevents data loss which
might otherwise occur via two users writing to a file at the same time with different versions of
the file.

UNIX Fundamentals: Online Help

From the very early days of UNIX, online help information was available in the form of manual
pages, or 'man' pages. These contain an extensive amount of information on system commands,
program subroutines, system calls and various general references pages on topics such as file
systems, CPU hardware issues, etc.

The 'man' command allows one to search the man page database using keywords, but this text-
based interface is still somewhat restrictive in that it does not allow one to 'browse' through
pages at will and does not offer any kind of direct hyperlinked reference system, although each
man page always includes a 'SEE ALSO' section so that one will know what other man pages
are worth consulting.

Thus, most modern UNIX systems include the 'xman' command: a GUI interface using X
Window displays that allows one to browse through man pages at will and search them via
keywords. System man pages are actually divided into sections, a fact which is not at all obvious
to a novice user of the man command. By contrast, xman reveals immediately the existence of
these different sections, making it much easier to browse through commands.

Since xman uses the various X Windows fonts to display information, the displayed text can
incorporate special font styling such as italics and bold text to aid clarity. A man page shown in a
shell can use bright characters and inverted text, but data shown using xman is much easier to
read, except where font spacing is important, eg. enter 'man ascii' in a shell and compare it to the
output given by xman (use xman's search option to bring up the man page for ascii).

xman doesn't include a genuine hypertext system, but the easy-to-access search option makes it
much more convenient to move from one page to another based on the contents of a particular
'SEE ALSO' section.

Most UNIX systems also have some form of online book archive. SGIs use the 'Insight' library
system which includes a great number of books in electronic form, all written using hypertext
techniques. An ordinary user would be expected to begin their learning process by using the
online books rather than the man pages since the key introductory books guide the user through
the basics of using the system via the GUI interface rather than the shell interface.
SGIs also have online release notes for each installed software product. These can be accessed
via the command 'grelnotes' which gives a GUI interface to the release notes archive, or one can
use relnotes in a shell or terminal window. Other UNIX variants probably also have a similar
information resource. Many newer software products also install local web pages as a means of
providing online information, as do 3rd-party software distributions. Such web pages are usually
installed somewhere in /usr/local, eg. /usr/local/doc. The URL format 'file:/file-path' is used to
access such pages, though an admin can install file links with the ln command so that online
pages outside of the normal file system web area (/var/www/htdocs on SGIs) are still accessible
using a normal http format URL.

In recent years, there have been moves to incorporate web technologies into UNIX GUI systems.
SGI began their changes in 1996 (a year before anyone else) with the release of the O2
workstation. IRIX 6.3 (used only with O2) included various GUI features to allow easy
integration between the existing GUI and various web features, eg. direct iconic links to web
sites, and using Netscape browser window interface technologies for system administration,
online information access, etc. Most UNIX variants will likely have similar features; on SGIs
with the latest OS version (IRIX 6.5), the relevant system service is called InfoSearch - for the
first time, users have a single entry point to the entire online information structure, covering man
pages, online books and release notes.

Also, extra GUI information tools are available for consulting "Quick Answers" and "Hints and
Shortcuts". These changes are all part of a general drive on UNIX systems to make them easier
to use.

Unlike the xman resource, viewing man pages using InfoSearch does indeed hyperlink
references to other commands and resources throughout each man page. This again enhances the
ability of an administrator, user or application developer to locate relevant information.

Summary: UNIX systems have a great deal of online information. As the numerous UNIX
variants have developed, vendors have attempted to improve the way in which users can access
that information, ultimately resulting in highly evolved GUI-based tools that employ standard
windowing technologies such as those offered by Netscape (so that references may include direct
links to web sites, ftp sites, etc.), along with hypertext techniques and search mechanisms.
Knowing how to make the best use of available documentation tools can often be the key to
effective administration, ie. locating answers quickly as and when required.
Detailed Notes for Day 2 (Part 1)
UNIX Fundamentals: System Identity, IP Address, Domain Name, Subdomain.

Every UNIX system has its own unique name, which is the means by which that machine is
referenced on local networks and beyond, eg. the Internet. The normal term for this name is the
local 'host' name. Systems connected to the Internet employ naming structures that conform to
existing structures already used on the Internet. A completely isolated network can use any
naming scheme.

Under IRIX, the host name for a system is stored in the /etc/sys_id file. The name may be up to
64 alphanumeric characters in length and can include hyphens and periods. Period characters '.'
are not part of the real name but instead are used to separate the sequence into a domain-name
style structure (eg. www.futuretech.vuurwerk.nl). The SGI server's host name is yoda, the fully-
qualified version of which is written as yoda.comp.uclan.ac.uk. The choice of host names is
largely arbitrary, eg. the SGI network host names are drawn from my video library (I have
chosen names designed to be short without being too uninteresting).

On bootup, a system's /etc/rc2.d/S20sysetup script reads its /etc/sys_id file to determine the local
host name. From then onwards, various system commands and internal function calls will return
that system name, eg. the 'hostname' and 'uname' commands (see the respective man pages for
details).
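
For example, each of the following standard commands reports the local host name (the output shown
is merely illustrative):

cat /etc/sys_id      # the stored name, eg. yoda.comp.uclan.ac.uk
hostname             # the current host name
uname -n             # the 'nodename' field, normally the same value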

Along with a unique identity in the form of a host name, a UNIX system has its own 32bit
Internet Protocol (IP) address, split for convenience into four 8bit integers separated by periods,
eg. yoda's IP address is 193.61.250.34, an address which is visible to any system anywhere on
the Internet.

IP is the network-level communications protocol used by Internet systems and services. Various
extra options can be used with IP layer communications to create higher-level services such as
TCP (Transmission Control Protocol). The entire Internet uses the TCP/IP protocols for
communication.

A system which has more than one network interface (eg. multiple Ethernet ports) must have a
unique IP address for each port. Special software may permit a system to have extra addresses,
eg. 'IP Aliasing', a technique often used by an ISP to provide a more flexible service to its
customers. Note: unlike predefined Ethernet addresses (every Ethernet card has its own unique
address), a system's IP address is determined by the network design, admin personnel, and
external authorities.

Conceptually speaking, an IP address consists of two numbers: one represents the network while
the other represents the system. In order to more efficiently make use of the numerous possible
address 'spaces', four classes of addresses exist, named A, B, C and D. The first few bits of an
address determine its class:
        Initial Binary   No. of Bits for       No. of Bits for
Class   Bit Field        the Network Number    the Host Number

A       0                7                     24
B       10               14                    16
C       110              21                    8
D       1110             [special 'multicast' addresses for internal network use]

Figure 20. IP Address Classes: bit field and width allocations.
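
As a worked example of reading Fig 20: the first octet of yoda's address (193) is 11000001 in
binary; the leading bits are 110, so 193.61.250.34 is a Class C address, with the first three
octets (193.61.250) forming the network number and the final octet (34) the host number.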

This system allows the Internet to support a range of different network sizes with differing
maximum limits on the number of systems for each type of network:

                      Class A      Class B    Class C     Class D

No. of networks:      128          16384      2097152     [multicast]
No. of systems each:  16777214     65534      254         [multicast]

Figure 21. IP Address Classes: supported network types and sizes.

The numbers 0 and 255 are never used for any host. These are reserved for special uses.

Note that a network which will never be connected to the Internet can theoretically use any IP
address and domain/subdomain configuration.
Which class of network an organisation uses depends on how many systems it expects to have
within its network. Organisations are allocated IP address spaces by Internet Network
Information Centers (InterNICs), or by their local ISP if that is how they are connected to the
Internet. An organisation's domain name (eg. uclan.ac.uk) is also obtained from the local
InterNIC or ISP. Once a domain name has been allocated, the organisation is free to setup its
own network subdomains such as comp.uclan.ac.uk (comp = Computing Department), within
which an individual host would be yoda.comp.uclan.ac.uk. A similar example is Heriot Watt
University in Edinburgh (where I studied for my BSc) which has the domain hw.ac.uk, with its
Department of Computer Science and Electrical Engineering using a subdomain called
cee.hw.ac.uk, such that a particular host is www.cee.hw.ac.uk (see Appendix A for an example
of what happens when this methodology is not followed correctly).

UCLAN uses Class C addresses, with example address spaces being 193.61.255 and 193.61.250.
A small number of machines in the Computing Department use the 250 address space, namely
the SGI server's external Ethernet port at 193.61.250.34, and the NT server at 193.61.250.35
which serves the NT network in Ve27.

Yoda has two Ethernet ports; the remaining port is used to connect to the SGI Indys via a hub -
this port has been defined to use a different address space, namely 193.61.252. The machines' IP
addresses range from 193.61.252.1 for yoda, to 193.61.252.23 for the admin Indy; .20 to .22 are
kept available for two HP systems which are occasionally connected to the network, and for a
future plan to include Apple Macs on the network.

The IP addresses of the Indys using the 252 address space cannot be directly accessed outside the
SGI network or, as the jargon goes, 'on the other side' of the server's Ethernet port which is being
used for the internal network. This automatically imposes a degree of security at the physical
level.

IP addresses and host names for systems on the local network are brought together in the file
/etc/hosts. Each line in this file gives an IP address, an official hostname and then any name
aliases which represent the same system, eg. yoda.comp.uclan.ac.uk is also known as
www.comp.uclan.ac.uk, or just yoda, or www, etc. When a system is first booted, the ifconfig
command uses the /etc/hosts file to assign addresses to the various available Ethernet network
interfaces. Enter 'more /etc/hosts' or 'nedit /etc/hosts' to examine the host names file for the
particular system you're using.
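
To check that an interface has actually picked up the intended address, the interface can be
queried directly; a sketch, assuming the common SGI interface name ec0 (the exact output format
varies between systems):

/usr/etc/ifconfig ec0       # shows the inet address, netmask and broadcast address
netstat -in                 # summarises all configured interfaces and their addresses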

NB: due to the Internet's incredible expansion in recent years, the world is actually beginning to
run out of available IP addresses and domain names; at best, existing top-level domains are being
heavily overused (eg. .com, .org, etc.) and the number of allocatable network address spaces is
rapidly diminishing, especially if one considers the possible expansion of the Internet into
Russia, China, the Far East, Middle East, Africa, Asia and Latin America. Thus, there are moves
afoot to change the Internet so that it uses 128bit instead of 32bit IP addresses (the IPv6
protocol). Exactly when this transition will happen is unknown, but such a change would solve the
problem.
Special IP Addresses

Certain reserved IP addresses have special meanings, eg. the address 127.0.0.1 is known as the
'loopback' address (equivalent host name 'localhost') and always refers to the local system which
one happens to be using at the time. If one never intends to connect a system to the Internet,
there's no reason why this default IP address can't be left as it is, with whatever default name is
assigned to it in the /etc/hosts file (SGIs always use the default name, "IRIS"), though most
people do change their system's IP address and host name in case, for example, they have to
connect their system to the network used at their place of work, or to provide a common naming
scheme, group ID setup, etc.

If a system's IP address is changed from the default 127.0.0.1, the exact procedure is to add a
new line to the /etc/hosts file such that the system name corresponds to the information in
/etc/sys_id. One must never remove the 127.0.0.1 entry from the /etc/hosts file or the system will
not work properly. The important lines of the /etc/hosts file used on the SGI network are shown
in Fig 22 below (the appearance of '[etc]' in Fig 22 means some text has been clipped away to aid
clarity).

# This entry must be present or the system will not work.
127.0.0.1 localhost

# SGI Server. Challenge S.
193.61.252.1 yoda.comp.uclan.ac.uk yoda www.comp.uclan.ac.uk www

[etc]

# Computing Services router box link.
193.61.250.34 gate-yoda.comp.uclan.ac.uk gate-yoda

# SGI Indys in Ve24, except milamber which is in Ve47.
193.61.252.2 akira.comp.uclan.ac.uk akira
193.61.252.3 ash.comp.uclan.ac.uk ash
193.61.252.4 cameron.comp.uclan.ac.uk cameron
193.61.252.5 chan.comp.uclan.ac.uk chan
193.61.252.6 conan.comp.uclan.ac.uk conan
193.61.252.7 gibson.comp.uclan.ac.uk gibson
193.61.252.8 indiana.comp.uclan.ac.uk indiana
193.61.252.9 leon.comp.uclan.ac.uk leon
193.61.252.10 merlin.comp.uclan.ac.uk merlin
193.61.252.11 nikita.comp.uclan.ac.uk nikita
193.61.252.12 ridley.comp.uclan.ac.uk ridley
193.61.252.13 sevrin.comp.uclan.ac.uk sevrin
193.61.252.14 solo.comp.uclan.ac.uk solo
193.61.252.15 spock.comp.uclan.ac.uk spock
193.61.252.16 stanley.comp.uclan.ac.uk stanley
193.61.252.17 warlock.comp.uclan.ac.uk warlock
193.61.252.18 wolfen.comp.uclan.ac.uk wolfen
193.61.252.19 woo.comp.uclan.ac.uk woo
193.61.252.23 milamber.comp.uclan.ac.uk milamber

[etc]

Figure 22. The contents of the /etc/hosts file used on the SGI network.
One example use of the localhost address is when a user accesses a system's local web page
structure at:

http://localhost/

On SGIs, such an address brings up a page about the machine the user is using. For the SGI
network, the above URL always brings up a page for yoda since /var/www is NFS-mounted from
yoda. The concept of a local web page structure for each machine is more relevant in company
Intranet environments where each employee probably has her or his own machine, or where
different machines have different locally stored web page information structures due to, for
example, differences in available applications, etc.

The BIND Name Server (DNS).

If a site is to be connected to the Internet, then it should use a name server such as BIND
(Berkeley Internet Name Domain) to provide an Internet Domain Name Service (DNS). DNS is
an Internet-standard name service for translating hostnames into IP addresses and vice-versa. A
client machine wishing to access a remote host executes a query which is answered by the DNS
daemon, called 'named'. Yoda runs a DNS server and also a Proxy server, allowing the machines
in Ve24 to access the Internet via Netscape (telnet, ftp, http, gopher and other services can be
used).

Most of the relevant database configuration files for a DNS setup reside in /var/named. A set of
example configuration files are provided in /var/named/Examples - these should be used as
templates and modified to reflect the desired configuration. Setting up a DNS database can be a
little confusing at first, thus the provision of the Examples directory. The files which must be
configured to provide a functional DNS are:

/etc/named.boot
/var/named/root.cache
/var/named/named.hosts
/var/named/named.rev
/var/named/localhost.rev

If an admin wishes to use a configuration file other than /etc/named.boot, then its location should
be specified by creating a file called /etc/config/named.options with the following contents (or
added to named.options if it already exists):

-b some-other-boot-file
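
For example, the following would tell named to read an alternative boot file (the file name used
here is purely illustrative):

echo "-b /var/named/named.boot.test" > /etc/config/named.options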

After the files in /var/named have been correctly configured, the chkconfig command is used to
set the appropriate variable file in /etc/config:

chkconfig named on
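
To confirm that the flag has been set, and later on to make a running named re-read its database
files after edits, something like the following can be used (a sketch; on IRIX, killall matches
processes by name - check the man page on other systems):

chkconfig | grep named      # the flag should now read 'on'
killall -HUP named          # signal a running named to reload its configuration
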
The next reboot will activate the DNS service. Once started, named reads initial configuration
information from the file /etc/named.boot, such as what kind of server it should be, where the
DNS database files are located, etc. Yoda's named.boot file looks like this:

;
; Named boot file for yoda.comp.uclan.ac.uk.
;
directory /var/named

cache . root.cache
primary comp.uclan.ac.uk named.hosts
primary 0.0.127.IN-ADDR.ARPA localhost.rev
primary 252.61.193.IN-ADDR.ARPA named.rev
primary 250.61.193.IN-ADDR.ARPA 250.rev
forwarders 193.61.255.3 193.61.255.4

Figure 23. Yoda's /etc/named.boot file.

Looking at the contents of the example named.boot file in /var/named/Examples, the differences
are not that great:

;
; boot file for authoritative master name server for Berkeley.EDU
; Note that there should be one primary entry for each SOA record.
;
;
sortlist 10.0.0.0

directory /var/named

; type          domain                     source host/file     backup file

cache . root.cache
primary Berkeley.EDU named.hosts
primary 32.128.IN-ADDR.ARPA named.rev
primary 0.0.127.IN-ADDR.ARPA localhost.rev

Figure 24. The example named.boot file in /var/named/Examples.

Yoda's file has an extra line for the /var/named/250.rev file; this was an experimental attempt to
make Yoda's subdomain accessible outside UCLAN, which failed because of the particular
configuration of a router box elsewhere in the communications chain (the intention was to enable
students and staff to access the SGI network using telnet from a remote host).

For full details on how to configure a typical DNS, see Chapter 6 of the online book, "IRIX
Admin: Networking and Mail". A copy of this Chapter has been provided for reference. As an
example of how consistent DNS configuration is across different UNIX systems, see the issue of
Network Week [10], which has an article on configuring a typical DNS. Also, a copy of each of Yoda's DNS files
which I had to configure is included for reference. Together, these references should serve as an
adequate guide to configuring a DNS; as with many aspects of managing a UNIX system,
learning how someone else solved a problem and then modifying copies of what they did can be
very effective.
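
Once named is running, a simple sanity check is to query it directly with nslookup; for example
(host names as per the SGI network, output format varies):

nslookup akira.comp.uclan.ac.uk       # forward lookup: host name to IP address
nslookup 193.61.252.2                 # reverse lookup, exercising named.rev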

Note: it is not always wise to use a GUI tool for configuring a service such as BIND [11]. It's too
easy for ill-tested grandiose software management tools to make poor assumptions about how an
admin wishes to configure a service/network/system. Services such as BIND come with their
own example configuration files anyway; following these files as a guide may be considerably
easier than using a GUI tool, which can introduce problems of its own - problems caused by whoever
wrote the GUI tool rather than by the service itself (in this case BIND).

Proxy Servers

A Proxy server acts as a go-between to the outside world, answering client requests for data from
the Internet, calling the DNS system to obtain IP addresses based on domain names, opening
connections to the Internet perhaps via yet another Proxy server elsewhere (the Ve24 system uses
Pipex as the next link in the communications chain), and retrieving data from remote hosts for
transmission back to clients.

Proxy servers are a useful way of providing Internet access to client systems at the same time as
imposing a level of security against the outside world, ie. the internal structure of a network is
hidden from the outside world due to the operational methods employed by a Proxy server, rather
like the way in which a representative at an auction can act for an anonymous client via a mobile
phone during the bidding. Although there are more than a dozen systems in Ve24, no matter
which machine a user decides to access the Internet from, the access will always appear to a
remote host to be coming from the IP address of the closest proxy server, eg. the University web
server would see Yoda as the accessing client. Similarly, I have noticed that when I access my
own web site in Holland, the site concerned sees my access as if it had come from the proxy
server at Pipex, ie. the Dutch system cannot see 'past' the Pipex Proxy server.

There are various proxy server software solutions available. A typical package which is easy to
install and configure is the Netscape Proxy Server. Yoda uses this particular system.

Network Information Service (NIS)

It is reasonably easy to ensure that all systems on a small network have consistent /etc/hosts files
using commands such as rcp. However, medium-sized networks consisting of dozens to
hundreds of machines may present problems for administrators, especially if the overall setup
consists of several distinct networks, perhaps in different buildings and run by different people.
For such environments, a Network Information Service (NIS) can be useful. NIS uses a single
system on the network to act as the sole trusted source of name service information - this system
is known as the NIS master. Slave servers may be used to which copies of the database on the
NIS master are periodically sent, providing backup services should the NIS master system fail.
Client systems locate a name server when required, requesting data based on a domain name and
other relevant information.
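
On a client bound to an NIS domain, the standard ypwhich and ypcat tools show which server the
client is currently using and what that server is serving, eg.:

ypwhich              # name of the NIS server currently bound to
ypcat hosts          # dump the NIS hosts map to the screen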

Unified Name Service Daemon (UNS, or more commonly NSD).

Extremely recently, the DNS and NIS systems have been superseded by a new system called the
Unified Name Service Daemon, or NSD for short. NSD handles requests for domain information
in a considerably more efficient manner, involving fewer system calls, replacing multiple files
for older services with a single file (eg. many of the DNS files in /var/named are replaced by a
single database file under NSD), and allowing for much larger numbers of entries in data files,
etc.

However, NSD is so new that even I have not yet had an opportunity to examine properly how it
works, or the way in which it correlates to the older DNS and NIS services. As a result, this
course does not describe DNS, NIS or NSD in any great detail. This is because, given the rapid
advance of modern UNIX OSs, explaining the workings of DNS or NIS would likely be a
pointless task since any admin beginning her or his career now is more likely to encounter the
newer NSD system which I am not yet comfortable with. Nevertheless, administrators should be
aware of the older style services as they may have to deal with them, especially on legacy
systems. Thus, though not discussed in these lectures, some notes on a typical DNS setup are
provided for further reading [10]. Feel free to login to the SGI server yourself with:

rlogin yoda

and examine the DNS and NIS configuration files at your leisure; these may be found in the
/var/named and /var/yp directories. Consult the online administration books for further details.

UNIX Fundamentals: UNIX Software Features

Software found on UNIX systems can be classified into several types:

 System software: items provided by the vendor as standard.
 Commercial software: items purchased either from the same vendor which supplied the OS, or
from some other commercial 3rd-party.
 Shareware software: items either supplied with the OS, or downloaded from the Internet, or
obtained from some other source such as a cover magazine CD.
 Freeware software: items supplied in the same manner as Shareware, but using a more open
'conditions of use'.
 User software: items created by users of a system, whether that user is an admin or an ordinary
user.

System Software

Any OS for any system today is normally supplied on a set of CDs. As the amount of data for an
OS installation increases, perhaps the day is not far away when vendors will begin using DVDs
instead.

Whether or not an original copy of OS CDs can be installed on a system depends very much on
the particular vendor, OS and system concerned. Any version of IRIX can be installed on an SGI
system which supports that particular version of IRIX - this ability to install the OS whether or
not one has a legal right to use the software is simply a practice SGI has adopted over the years.
SGI could have chosen to make OS installation more difficult by requiring license codes and
other details at installation time, but instead SGI chose a different route. What is described here
applies only to SGI's IRIX OS.

SGI decided some time ago to adopt a strategy of official software and hardware management
which makes it extremely difficult to make use of 'pirated' software. The means by which this is
achieved is explained in the System Hardware section below, but the end result is a policy where
any version IRIX older than the 'current' version is free by default. Thus, since the current release
of IRIX is 6.5, one could install IRIX 6.4, 6.3, 6.2 (or any older version) on any appropriate SGI
system (eg. installing IRIX 6.2 on a 2nd-hand Indy) without having to worry about legal issues.
There's nothing to stop one physically installing 6.5 if one had the appropriate CDs (ie. the
software installation tools and CDs do not include any form of installation protection or copy
protection), but other factors might make for trouble later on if the user concerned did not apply
for a license at a later date, eg. attempting to purchase commercial software and licenses for the
latest OS release.

It is highly likely that in future years, UNIX vendors will also make their current OSs completely
free, probably as a means of combating WindowsNT and other rivals.

As an educational site operating under an educational license agreement, UCLAN's Computing
Department is entitled to install IRIX 6.5 on any of the SGI systems owned by the Computing
Department, though at present most systems use the older IRIX 6.2 release for reasons connected
with system resources on each machine (RAM, disk space, CPU power).

Thus, the idea of a license can have two meanings for SGIs:

 A theoretical 'legal' license requirement which applies, for example, to the current release of
IRIX, namely IRIX 6.5 - this is a legal matter and doesn't physically affect the use of IRIX 6.5 OS
CDs.
 A real license requirement for particular items of software using license codes, obtainable either
from SGI or from whatever 3rd-party the software in question was purchased.

Another example of the first type is the GNU licensing system, explained in the 'Freeware Software'
section below (what the GNU license is and how it works is fascinatingly unique).

Due to a very early top-down approach to managing system software, IRIX employs a high-level
software installation structure which ensures that:

 It is extremely easy to add, remove, or update software, especially using the GUI software tool
called Software Manager (swmgr is the text command name which can be entered in a shell).
 Changes to system software are handled correctly with very few, if any, errors most of the time
('most' here meaning problems are rare, but not unheard of). As a real-world example, I have
installed SGI software elements thousands of times and have rarely encountered problems, though I
have had to deal with some issues on occasion.
 Software 'patches' (modificational updates to existing software already installed) are handled in
such a way as to allow the later removal of said patches if desired, leaving the system in exactly
its original state as if the patch had never been installed.

As an example of software installation reliability, my own 2nd-hand Indigo2 at home has been in use
since March 1998, was originally installed with IRIX 6.2, updated with patches several times, added to
with extra software over the first few months of ownership (mid-1998), then upgraded to IRIX 6.5,
added to with large amounts of freeware software, then upgraded to IRIX 6.5.1, then 6.5.2, then 6.5.3,
and all without a single software installation error of any kind. In fact, my Indigo2 hasn't crashed or
given a single error since I first purchased it. As is typical of any UNIX system which is/was widely used in
various industries, most if not all of the problems ever encountered on the Indigo2 system have been
resolved by now, producing an incredibly stable platform. In general, the newer the system and/or the
newer the software, then the greater number of problems there will be to deal with, at least initially.

Thankfully, OS revisions largely build upon existing code and knowledge. Plus, since so many
UNIX vendors have military, government and other important customers, there is incredible
pressure to be very careful when planning changes to system or application software. Intensive
testing is done before any new version is released into the marketplace (this contrasts completely
with Microsoft which deliberately allows the public to test Beta versions of its OS revisions as a
means of locating bugs before final release - a very lazy way to handle system testing by any
measure).

Because patches often deal with release versions of software subsystems, and many software
subsystems may have dependencies on other subsystems, the issue of patch installation is the
most common area which can cause problems, usually due to unforeseen conflicts between
individual versions of specific files. However, rigorous testing and a top-down approach to
tracking release versions minimises such problems, especially since all UNIX systems come
supplied with source code version/revision tracking tools as-standard, eg. SCCS. The latest
'patch CD' can usually be installed automatically without causing any problems, though it is wise
for an administrator to check what changes are going to be made before commencing any such
installation, just in case.
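
Under IRIX, the installed software inventory, including patches, can be inspected from a shell with
the versions command before and after such an installation, eg.:

versions -b                     # brief listing of all installed products
versions -b | grep -i patch     # list just the installed patches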

The key to such a high-level software management system is the concept of a software
'subsystem'. SGI has developed a standard means by which a software suite and related files
(manual pages, release notes, data, help documents, etc.) are packaged together in a form suitable
for installation by the usual software installation tools such as inst and swmgr. Once this
mechanism was carefully defined many years ago, insisting that all subsequent official software
releases comply with the same standard ensures that the opportunity for error is greatly
minimised, if not eliminated. Sometimes, certain 3rd-party applications such as Netscape can
display apparent errors upon installation or update, but these errors are usually explained in the
accompanying documentation and can safely be ignored.

Each software subsystem is usually split into several sub-units so that only relevant components
need be installed as desired. The sub-units can then be examined to see the individual files which
would be installed, and where. When making updates to software subsystems, selecting a newer
version of a subsystem automatically selects only the relevant sub-units based on which sub-
units have already been installed, ie. new items will not automatically be selected. For ease of
use, an admin can always choose to execute an automatic installation or removal (as desired),
though I often select a custom installation just so that I can see what's going on and learn more
about the system as a result. In practice, I rarely need to alter the default behaviour anyway.
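
The same information is available from a shell; a sketch using illustrative subsystem names (see
the versions man page for the exact keywords supported by a given IRIX release):

versions long netscape.sw.client | more    # list every file installed by a subsystem
versions remove someproduct.sw.base        # remove a subsystem without using the GUI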

The software installation tools automatically take care not to overwrite existing configuration
files when, for example, installing new versions (ie. upgrades) of software subsystems which
have already been installed (eg. Netscape). In such cases, both the old and new configuration
files are kept and the user (or admin) informed that there may be a need to decide which of the
two files to keep, or perhaps to copy key data from the old file to the new file, deleting the old
file afterwards.

Commercial Software

A 3rd-party commercial software package may or may not come supplied in a form which
complies with any standards normally used by the hardware system vendor. UNIX has a long
history of providing a generic means of packaging software and files in an archive which can be
downloaded, uncompressed, dearchived, compiled and installed automatically, namely the
'tar.gz' archive format (see the man pages for tar and gzip). Many commercial software suppliers
may decide to sell software in this format. This is ok, but it does mean one may not be able to
use the usual software management tools (inst/swmgr in the case of SGIs) to later remove the
software if desired. One would have to rely on the supplier being kind enough to either provide a
script which can be used to remove the software, or at the very least a list of which files get
installed where.

Thankfully, it is likely that most 3rd-parties will at least try to use the appropriate distribution
format for a particular vendor's OS. However, unlike the source vendor, one cannot be sure that
the 3rd-party has taken the same degree of care and attention to ensure they have used the
distribution format correctly, eg. checking for conflicts with other software subsystems,
providing product release notes, etc.

Commercial software for SGIs may or may not use the particular hardware feature of SGIs
which SGI uses to prevent piracy, perhaps because exactly how it works is probably itself a
licensed product from SGI. Details of this mechanism are given in the System Hardware section
below.
Shareware Software

The concept of shareware is simple: release a product containing many useful features, but which
has more advanced features and perhaps essential features limited, restricted, or locked out
entirely, eg. being able to save files, or working on files over a particular size.

A user can download the shareware version of the software for free. They can test out the
software and, if they like it, 'register' the software in order to obtain either the 'full' (ie. complete)
version, or some kind of encrypted key or license code that will unlock the remaining features
not accessible or present in the shareware version. Registration usually involves sending a small
fee, eg. $30, to the author or company which created the software. Commonly, registration
results in the author(s) sending the user proper printed and bound documentation, plus regular
updates to the registered version, news releases on new features, access to dedicated mailing
lists, etc.

The concept of shareware has changed over the years, partly due to the influence of the computer
game 'Doom' which, although released as shareware in name, actually effectively gave away an
entire third of the complete game for free. This was a ground-breaking move which proved to be
an enormous success, earning the company which made the game (id Software, Dallas, Texas,
USA) over eight million $US and a great deal of respect and loyalty from gaming fans. Never
before had a company released shareware software in a form which did not involve deliberately
'restricting' key aspects of the shareware version. As stated above, shareware software is often
altered so that, for example, one could load files, work on them, make changes, test out a range
of features, but (crucially) not save the results. Such shareware software is effectively not of any
practical use on its own, ie. it serves only as a kind of hands-on advertisement for the full
version. Doom was not like this at all. One could play an entire third of the game, including over
a network against other players.

Today, other creative software designers have adopted a similar approach, perhaps the most
famous recent example of which is 'Blender' [1], a free 3D rendering and animation program for
UNIX and (as of very soon) WindowsNT systems.

In its as-supplied form, Blender can be used to do a great deal of work, creating 3D scenes,
renderings and animations easily on a par with 3D Studio Max, even though some features in
Blender are indeed locked out in the shareware version. However, unlike traditional shareware,
Blender does allow one to save files and so can be used for useful work. It has spread
very rapidly in the last few months amongst students in educational sites worldwide, proving to
be of particular interest to artists and animators who almost certainly could not normally afford a
commercial package which might cost hundreds or perhaps thousands of pounds. Even small
companies have begun using Blender.

However, the supplied documentation for Blender is limited. As a 'professional level' system, it is
unrealistic to expect to be able to get the best out of it without much more information on how it
works and how to use it. Thus, the creators of Blender, a company called NaN based in Holland,
make most of their revenue by offering a very detailed 350-page printed and bound manual for
about $50 US, plus a sequence of software keys which unlock the advanced features in
Blender.

Software distribution concepts such as the above methods used by NaN didn't exist just a few
years ago, eg. before 1990. The rise of the Internet, certain games such as Doom, the birth of
Linux, and changes in the way various UNIX vendors manage their business have caused a
quantum leap in what people think of as shareware.

Note that the same caveat stated earlier with respect to software quality also applies to
shareware, and to freeware too, ie. such software may or may not use the normal distribution
method associated with a particular UNIX platform - in the case of SGIs, the 'inst' format.

Another famous example of shareware is the XV [2] image-viewer program, which offers a
variety of functions for image editing and image processing (even though its author insists it's
really just an image viewer). XV does not have restricted features, but it is an official shareware
product which one is supposed to register if one intends to use the program for commercial
purposes. However, as is typical with many modern shareware programs, the author stipulates
that there is no charge for personal (non-commercial) or educational use.

Freeware Software

Unlike shareware software, freeware software is exactly that: completely free. There is no
concept of registration, restricted features, etc. at all.

Until recently, even I was not aware of the vast amount of free software available for SGIs and
UNIX systems in general. There always has been free software for UNIX systems, but as in
keeping with other changes by UNIX vendors over the past few years, SGI altered its application
development support policy in 1997 to make it much easier for users to make use of freeware on
SGI systems. Prior to that time, SGI did not make the system 'header' files (normally kept in
/usr/include) publicly available. Without these header files, one could not compile any new
programs even if one had a free compiler.

So, SGI adopted a new stance whereby the header files, libraries, example source code and other
resources are provided free, but its own advanced compiler technologies (the MIPS Pro
Compilers) remain commercial products. Immediately, anyone could then write their own
applications for SGI systems using the supplied CDs (copies of which are available from SGI's
ftp site) in conjunction with free compilation tools such as the GNU compilers. As a result, the
2nd-hand market for SGI systems in the USA has skyrocketed, with extremely good systems
available at very low cost (systems which cost 37500 pounds new can now be bought for as little
as 500 pounds, even though they can still be better than modern PCs in many respects).
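
For instance, with the header/library CDs and the GNU compiler installed, building a trivial
program requires nothing more than (file name illustrative):

gcc -o hello hello.c     # compiles against the freely supplied headers in /usr/include
./hello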

It is highly likely that other vendors have adopted similar strategies in recent years (most of my
knowledge concerns SGIs). Sun Microsystems made its SunOS free for students some years ago
(perhaps Solaris too); my guess is that a similar compiler/development situation applies to
systems using SunOS and Solaris as well - one can write applications using free software and
tools. This concept probably also applies to HP systems, Digital UNIX systems, and other
flavours of UNIX.

Linux is a perfect example of how the ideas of freeware development can determine an OS'
future direction. Linux was meant to be a free OS from its very inception - Linus Torvalds, its
creator, loathes the idea of an OS supplier charging for the very platform upon which essential
software is executed. Although Linux is receiving considerable industry support these days,
Linus is wary of the possibility of Linux becoming more commercial, especially as vendors such
as Red Hat and Caldera offer versions of Linux with added features which must be paid for.
Whether or not the Linux development community can counter these commercial pressures in
order to retain some degree of freeware status and control remains to be seen.

Note: I'm not sure of the degree to which completely free development environments on a
quality-par with GNU are available for MS Windows-based systems (whether that involves
Win95, Win98, WinNT or even older versions such as Win3.1).

The GNU Licensing System

The GNU system is, without doubt, thoroughly unique in the modern era of copyright,
trademarks, law suits and court battles. It can be easily summarised as a vast collection of free
software tools, but the detail reveals a much deeper philosophy of software development, best
explained by the following extract from the main GNU license file that accompanies any GNU-
based program [3]:

"The licenses for most software are designed to take away your freedom to share and
change it. By contrast, the GNU General Public License is intended to guarantee your
freedom to share and change free software--to make sure the software is free for all its
users. This General Public License applies to most of the Free Software Foundation's
software and to any other program whose authors commit to using it. (Some other Free
Software Foundation software is covered by the GNU Library General Public License
instead.) You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price. Our General Public Licenses
are designed to make sure that you have the freedom to distribute copies of free software (and charge
for this service if you wish), that you receive source code or can get it if you want it, that you can
change the software or use pieces of it in new free programs; and that you know you can do these
things.

To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to
ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

For example, if you distribute copies of such a program, whether gratis or for a fee, you must give
the recipients all the rights that you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their rights.
We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which
gives you legal permission to copy, distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain that everyone understands that
there is no warranty for this free software. If the software is modified by someone else and passed on,
we want its recipients to know that what they have is not the original, so that any problems
introduced by others will not reflect on the original authors' reputations.

Finally, any free program is threatened constantly by software patents. We wish to avoid the danger
that redistributors of a free program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any patent must be licensed for
everyone's free use or not licensed at all."

Reading the above extract, it is clear that those responsible for the GNU licensing system had to
spend a considerable amount of time actually working out how to make something free! Free in a
legal sense that is. So many standard legal matters are designed to restrict activities, the work put
into the GNU Free Software Foundation makes the license document read like some kind of
software engineer's nirvana. It's a serious issue though, and the existence of GNU is very
important in terms of the unimaginable amount of creative work going on around the world
which would not otherwise exist (without GNU, Linux would probably not exist).

SGI, and other UNIX vendors I expect, ships its latest OS (IRIX 6.5) with a CD entitled
'Freeware', which not only contains a vast number of freeware programs in general (everything
from spreadsheets and data plotting to games, audio/midi programming and molecular
modeling), but also a complete, pre-compiled inst-format distribution of the entire GNU archive:
compilers, debugging tools, GNU versions of shells and associated utilities, calculators,
enhanced versions of UNIX commands and tools, even higher-level tools such as a GUI-based
file manager and shell tool, and an absolutely superb Photoshop-style image editing tool called
GIMP [4] (GNU Image Manipulation Program) which is extendable by the user. The individual
software subsystems from the Freeware CD can also be downloaded in precompiled form from
SGI's web site [5].

The February 1999 edition of SGI's Freeware CD contains 173 different software subsystems, 29
of which are based on the GNU licensing system (many others are likely available from
elsewhere on the Internet, along with further freeware items). A printed copy of the contents of
the Feb99 Freeware CD is included with the course notes for further reading.

Other important freeware programs which are supplied separately from such freeware CD
distributions (an author may wish to distribute just from a web site), include the Blue Moon
Rendering Tools (BMRT) [6], a suite of advanced 3D ray-tracing and radiosity tools written by
one of the chief architects at Pixar animation studios - the company which created "Toy Story",
"Small Soldiers" and "A Bug's Life". Blender can output files in Inventor format, which can then
be converted to RIB format for use by BMRT.

So why is shareware and freeware important? Well, these types of software matter because,
today, it is perfectly possible for a business to operate using only shareware and/or freeware
software. An increasingly common situation one comes across is an entrepreneurial multimedia
firm using Blender, XV, GIMP, BMRT and various GNU tools to manage its entire business,
often running on 2nd-hand equipment using free versions of UNIX such as Linux, SunOS or
IRIX 6.2! I know of one such company in the USA which uses decade-old 8-CPU SGI servers
and old SGI workstations such as Crimson RealityEngine and IRIS Indigo. The hardware was
acquired 2nd-hand in less than a year.

Whether or not a company decides to use shareware or freeware software depends on many
factors, especially the degree to which a company feels it must have proper, official support.
Some sectors such as government, medical and military have no choice: they must have proper,
fully guaranteeable hardware and software support because of the nature of the work they do, so
using shareware or freeware software is almost certainly out of the question. However, for
medium-sized or smaller companies, and especially home users or students, the existence of
shareware and freeware software, combined with the modern approaches to these forms of
software by today's UNIX vendors, offers whole new avenues of application development and
business ideas which have never existed before as commercially viable possibilities.

System Hardware

The hardware platforms supplied by the various UNIX vendors are, like UNIX itself today, also
designed and managed with a top-down approach.

The world of PCs has always been a bottom-up process of putting together a mish-mash of
different components from a wide variety of sources. Motherboards, video cards, graphics cards
and other components are available in a plethora of types of varying degrees of quality. This
bottom-up approach to systems design means it's perfectly possible to have a PC with a good
CPU, good graphics card, good video card, but an awful motherboard. If the hardware is suspect,
problems faced by the user may appear to be OS-related when in fact they could be down to poor
quality hardware. It's often difficult or impossible to ascertain the real cause of a problem -
sometimes system components just don't work even though they should, or a system suddenly
stops recognising the presence of a device; these problems are most common with peripherals
such as CDROM, DVD, ZIP, sound cards, etc.

Dealing only with hardware systems designed specifically to run a particular vendor's UNIX
variant, the situation is very different. The vendor maintains a high degree of control over the
design of the hardware platform. Hence, there is opportunity to focus on the unique requirements
of target markets, quality, reliability, etc. rather than always focusing on absolute minimum cost
which inevitably means cutting corners and making tradeoffs.

This is one reason why even very old UNIX systems, eg. multi-processor systems from 1991
with (say) eight 33MHz CPUs, are still often found in perfect working order. The initial focus on
quality results in a much lower risk of component failure. Combined with generous hardware and
software support policies, hardware platforms for traditional UNIX systems are far more reliable
than PCs.

My personal experience is with hardware systems designed by SGI, about which I know a great
deal. Their philosophy of design is typical of most UNIX hardware vendors (others would be
Sun, HP, IBM, DEC, etc.) and can be contrasted very easily with the way PCs are designed and
constructed:

UNIX   low-end:    "What can we give the customer for 5000 pounds?"
       mid-range:  "What can we give the customer for 15000 pounds?"
       high-end:   "What can we give the customer for 65000+ pounds?"

PC:    "How cheap can we make a machine which offers a particular
        feature set and level of ability?"

Since the real driving force behind PC development is the home market, especially games, the
philosophy has always been to decide what features a typical 'home' or 'office' PC ought to have
and then try and design the cheapest possible system to offer those features. This approach has
eventually led to incredibly cut-throat competition, creating new concepts such as the 'sub-$1000'
PC, and even today's distinctly dubious 'free PC', but in reality the price paid by consumers is the
use of poor quality components which do not integrate well, especially components from
different suppliers. Hardware problems in PCs are common, and now unavoidable. In Edinburgh,
I know of a high-street PC store which always has a long queue of customers waiting to have
their particular problem dealt with.

By contrast, most traditional UNIX vendors design their own systems with a top-down approach
which focuses on quality. Since the vendor usually has complete control, they can ensure a much
greater coherence of design and degree of integration. System components work well with each
other because all parts of the system were designed with all the other parts in mind.

Another important factor is that a top-down approach allows vendors to innovate and develop
new architectural designs, creating fundamentally new hardware techniques such as SMP and
S2MP processing, highly scalable systems, advanced graphics architectures, and perhaps most
importantly of all from a customer's point of view: much more advanced CPU designs (Alpha,
MIPS, SPARC, PA-RISC, POWER series, etc.) Such innovations and changes in design concept
are impossible in the mainstream PC market: there is too much to lose by shifting from the
status-quo. Everything follows the lowest common denominator.

The most obvious indication of these two different approaches is that UNIX hardware platforms
have always been more expensive than PCs, but that is something which should be expected
given that most UNIX platforms are deliberately designed to offer a much greater feature set,
better quality components, better integration, etc.

A good example is the SGI Indy. With respect to absolute cost, the Indy was very expensive
when it was first released in 1993, but because of what it offered in terms of hardware and
software features it was actually a very cheap system compared to trying to put together a PC
with a similar feature set. In fact, Indy offered features such as hardware-accelerated 3D graphics
at high resolution (1280x1024) and 24bit colour at a time when such features did not exist at all
for PCs.

PCW magazine said in its original review [7] that to give a PC the same standard features and
abilities, such as ISDN, 4-channel 16bit stereo sound with multiple stereo I/O sockets, S-
Video/Composite/Digital video inputs, NTSC-resolution CCD digital camera, integrated SCSI,
etc. would have cost twice as much as an Indy. SGI set out to design a system which would
include all these features as-standard, so the end result was bound to cost several thousand
pounds, but that was still half the cost of trying to cobble together a collection of mis-matched
components from a dozen different companies to produce something which still would not have
been anywhere near as good. As PCW put it, the Indy - for its time - was a great machine
offering superb value if one was the kind of customer which needed its features and would be
able to make good use of them.

Sun Microsystems adopted a similar approach to its recent Ultra5, Ultra10 and other systems:
provide the user with an integrated design with a specific feature set that Sun knew its customers
wanted. SGI did it again with their O2 system, released in October 1996. O2 has such a vast
range of features (highly advanced for its time) that few ordinary customers would find
themselves using most or all of them. However, for the intended target markets (ranging from
CAD, design, animation, film/video special effects, video editing to medical imaging, etc.) the
O2 was an excellent system. Like most UNIX hardware systems, O2 today is not competitive in
certain areas such as basic 3D graphics performance (there are exceptions to this), but certain
advanced and unique architectural features mean it's still purchased by customers who require
those features.

This, then, is the key: UNIX hardware platforms which offer a great many features and high-
quality components are only a good choice if one:

 is the kind of customer which definitely needs those features
 values the ramifications of using a better quality system that has been designed top-down:
reliability, quality, long-term value, ease of maintenance, etc.

One often observes people used to PCs asking why systems like O2, HP's Visualize series, SGI's Octane,
Sun's Ultra60, etc. cost so much compared to PCs. The reason for the confusion is that the
world of PCs focuses heavily on the abilities of the main CPU, whereas all UNIX vendors have, for many
years, made systems which include as much dedicated acceleration hardware as possible, easing the
burden on the main CPU. For the home market, systems like the Amiga pioneered this approach;
unfortunately, the company responsible for the Amiga doomed itself to failure as a result of various
marketing blunders.

From an admin's point of view, the practical side effect of having to administer and run a UNIX
hardware platform is that there is far, far less effort needed in terms of configuring systems at the
hardware level, or having to worry about different system hardware components operating
correctly with one other. Combined with the way most UNIX variants deal with hardware
devices (ie. automatically and transparently most of the time), a UNIX admin can swap hardware
components between different systems from the same vendor without any need to alter system
software, ie. any changes in system hardware configuration are dealt with automatically.

Further, many UNIX vendors use certain system components that are identical (usually memory,
disks and backup devices), so admins can often swap generic items such as disks between
different vendor platforms without having to reconfigure those components (in the case of disks)
or worry about damaging either system. SCSI disks are a good example: they are supplied
preformatted, so an admin should never have to reformat a SCSI disk. Swapping a SCSI disk
between different vendor platforms may require repartitioning of the disk, but never a reformat.
In the 6 years I've been using SGIs, I've never had to format a SCSI disk.

Examining a typical UNIX hardware system such as Indy, one notices several very obvious
differences compared to PCs:

 There are far fewer cables in view.
 Components are positioned in such a way as to greatly ease access to all parts of the system.
 The overall design is highly integrated so that system maintenance and repairs/replacements
are much easier to carry out.

Thus, problems that are solvable by the admin can be dealt with quickly, while problems requiring
vendor hardware support assistance can be fixed in a short space of time by a visiting technician, which
obviously reduces costs for the vendor responsible by enabling their engineers to deal with a larger
number of queries in the same amount of time.

Just as with the approaches taken to hardware and software design, the way in which support
contracts for UNIX systems operate also follow a top-down approach. Support costs can be high,
but the ethos is similar: you get what you pay for - fast no-nonsense support when it's needed.

I can only speak from experience of dealing with SGIs, but I'm sure the same is true of other
UNIX vendors. Essentially, if I encounter a hardware problem of some kind, the support service
always errs on the side of caution in dealing with the problem, ie. I don't have to jump through
hoops in order to convince them that there is a problem - they accept what I say and organise a
visiting technician to help straight away (one can usually choose between a range of response
times from 1 hour to 5 days). Typically, unless the technician can fix the problem on-site in a
matter of minutes, then some, most, or even all of the system components will be replaced if
necessary to get the system in working order once more.

For example, when I was once encountering SCSI bus errors, the visiting engineer was almost at
the point of replacing the motherboard, video card and even the main CPU (several thousand
pounds worth of hardware in terms of new-component replacement value at the time) before
further tests revealed that it was in fact my own personal disk which was causing the
problem (I had an important jumper clip missing from the jumper block). In other words, UNIX
vendor hardware support contracts tend to place much less emphasis on the customer having to
prove they have a genuine problem.

I should imagine this approach exists because many UNIX vendors have to deal with extremely
important clients such as government, military, medical, industrial and other sectors (eg. safety
critical systems). These are customers with big budgets who don't want to waste time messing
around with details while their faulty system is losing them money - they expect the vendor to
help them get their system working again as soon as possible.
Note: assuming a component is replaced (eg. motherboard), even if the vendor's later tests show
the component to be working correctly, it is not returned to the customer, ie. the customer keeps
the new component. Instead, most vendors have their own dedicated testing laboratories which
pull apart every faulty component returned to them, looking for causes of problems so that the
vendor can take corrective action if necessary at the production stage, and learn any lessons to
aid in future designs.

To summarise the above:

 A top-down approach to hardware design means a better feature set, better quality, reliability,
ease of use and maintenance, etc.
 As a result, UNIX hardware systems can be costly. One should only purchase such a system if
one can make good use of the supplied features, and if one values the implications of better
quality, etc., despite the extra cost.

However, a blurred middle-ground between the top-down approach to UNIX hardware platforms and
the bottom-up approach to the supply of PCs is the so-called 'vendor-badged' NT workstation market. In
general, this is where UNIX vendors create PC-style hardware systems that are still based on off-the-
shelf components, but occasionally include certain modifications to improve performance, etc. beyond
what one normally sees of a typical PC. The most common example is where vendors such as Compaq
supply systems which have two 64bit PCI busses to increase available system bandwidth.

All these systems are targeted at the 'NT workstation' market. Cynics say that such systems are
just a clever means of placing a 'quality' brand name on ordinary PC hardware. However, such
systems do tend to offer a better level of quality and integration than ordinary PCs (even
expensive ordinary PCs), but an inevitable ironic side effect is that these vendor-badged systems
do cost more. Just as with traditional UNIX hardware systems, whether or not that cost is worth
it depends on customers' priorities. Companies such as movie studios regard stability and
reliability as absolutely critical, which is why most studios do not use NT [8]. Those that do,
especially smaller studios (perhaps because of limited budgets) will always go for vendor-badged
NT workstations rather than purchasing systems from PC magazines and attempting to cobble
together a reliable platform. The extra cost is worth it.

There is an important caveat to the UNIX hardware design approach: purchasing what can be a
very good UNIX hardware system is a step that can easily be ruined by not equipping that
system in the first instance with sufficient essential system resources such as memory capacity,
disk space, CPU power and (if relevant) graphics/image/video processing power. Sometimes,
situations like this occur because of budget constraints, but the end result may be a system which
cannot handle the tasks for which it was purchased. If such mis-matched purchases are made, it's
usually a good sign that the company concerned is using a bottom-up approach to making
decisions about whether or not to buy a hardware platform that has been built using a top-down
approach. The irony is plain to see. Since admins often have to advise on hardware purchases or
upgrades, a familiarity with these issues is essential.

Conclusion: decide what is needed to solve the problem. Evaluate which systems offer
appropriate solutions. If no suitable system is affordable, do not compromise on essentials such as
memory or disk as a means of lowering cost - choose a different platform instead, such as a good
quality NT system, or a system with lower costs such as an Intel machine running Linux, etc.

Similarly, it makes no sense to have a good quality UNIX system, only to then adopt a strategy
of buying future peripherals (eg. extra disks, memory, printers, etc.) that are of poor quality. In
fact, some UNIX vendors may not offer or permit hardware support contracts unless the
customer sticks to using approved 3rd-party hardware sources.

Summary: UNIX hardware platforms are designed top-down, offer better quality components,
etc., but tend to be more expensive as a result.

Today, in an era when even SGI has started to sell systems that support WindowsNT, the
philosophy is still the same: design top-down to give quality hardware, etc. Thus, SGI's
WindowsNT systems start at around 2500 pounds - a lot by the standards of any home user, but
cheap when considering the market in general. The same caveat applies though: such a system
with a slow CPU is wasting the capabilities of the machine.

UNIX Characteristics.

Integration:

A top-down approach results in an integrated design. Systems tend to be supplied 'complete', ie.
everything one requires is usually supplied as-standard. Components work well together since
the designers are familiar with all aspects of the system.

Stability and Reliability:

The use of quality components, driven by the demands of the markets which most UNIX vendors
aim for, results in systems that experience far fewer component failures compared to PCs. As a
result of a top-down and integrated approach, the chances of a system experiencing hardware-
level conflicts are much lower compared to PCs.

Security:

It is easy for system designers to incorporate hardware security features such as metal hoops that
are part of the main moulded chassis, for attaching to security cables.

On the software side, and as an aid to preventing crime (as well as making it easier to solve
crime in terms of tracing components, etc.) systems such as SGIs often incorporate unique
hardware features. The following applies to SGIs but is also probably true of hardware from
other UNIX vendors in some equivalent form.
Every SGI has a PROM chip on the motherboard, without which the system will not boot. This
PROM chip is responsible for initiating the system bootup sequence at the very lowest hardware
level. However, the chip also contains an ID number which is unique to that particular machine.
One can display this ID number with the following command:

sysinfo -s

Alternatively, the number can be displayed in hexadecimal format by using the sysinfo command on
its own (one notes the first 4 groups of two hex digits). A typical output might look like this:

% sysinfo -s
1762299020
% sysinfo
System ID:
69 0a 8c 8c 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

The important part of the output from the second command is the beginning sequence consisting
of '690A8C8C'.
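(As a quick check, converting the hex string back to decimal gives the same ID reported by
'sysinfo -s': 6*16^7 + 9*16^6 + 0*16^5 + 10*16^4 + 8*16^3 + 12*16^2 + 8*16 + 12 = 1762299020.)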

The ID number is not only used by SGI when dealing with system hardware and software
support contracts, it is also the means by which license codes are supplied for SGI's commercial
software packages.

If one wishes to use a particular commercial package, eg. the VRML editor called
CosmoWorlds, SGI uses the ID number of the machine to create a license code which will be
recognised by the program concerned as being valid only for that particular machine. The 20-
digit hexadecimal license code is created using a special form of encryption, presumably
combining the ID number with some kind of internal database of codes for SGI's various
applications which only SGI has access to. In the case of the O2 I use at home, the license code
for CosmoWorlds is 4CD4FB82A67B0CEB26B7 (ie. different software packages on the same
system need different license codes). This code will not work for any other software package on
any other SGI anywhere in the world.

There are two different license management systems in use by SGIs: the NetLS environment on
older platforms, and the FlexLM environment on newer platforms. FlexLM is being widely
adopted by many UNIX vendors. NetLS licenses are stored in the /var/netls directory, while
FlexLM licenses are kept in /var/flexlm. To the best of my knowledge, SGI's latest version of
IRIX (6.5) doesn't use NetLS licenses anymore, though it's possible that 3rd-party software
suppliers still do.
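By way of illustration only, FlexLM licenses are plain text entries kept in files under /var/flexlm.
The following is a purely hypothetical sketch of the general layout - the daemon name 'sgifd', the
host name and the exact fields are invented here, and real licence syntax varies between vendors
and FlexLM versions - reusing the example ID and CosmoWorlds code mentioned above:

SERVER myhost 690A8C8C
DAEMON sgifd /var/flexlm/sgifd
FEATURE cosmoworlds sgifd 1.000 01-jan-2000 1 4CD4FB82A67B0CEB26B7

The host ID field on the SERVER line is where the machine's unique PROM-based ID comes in, while
the long hexadecimal string at the end of the FEATURE line corresponds to the vendor-supplied
license code described above.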

As stated in the software section, the use of the ID number system at the hardware level makes it
effectively impossible to pirate commercial software. More accurately, anyone can copy any SGI software
CD, and indeed install the software, but that software will not run without the license code which
is unique to each system, so there's no point in copying commercial software CDs or installing
copied commercial software in the first place.
Of course, one could always try to reverse-engineer the object code of a commercial package to
try and get round the section which makes the application require the correct license code, but
this would be very difficult. The important point is that, to the best of my knowledge, SGI's
license code schema has never been broken at the hardware level.

Note: from the point of view of an admin maintaining an SGI system, if a machine completely
fails, eg. damage by fire and water, the admin should always retain the PROM chip if possible -
ie. a completely new system could be obtained but only the installation of the original PROM
chip will make the new system effectively the same as the old one. For PCs, the most important
system component in terms of system identity is the system disk (more accurately, its contents);
but for machines such as SGIs, the PROM chip is just as if not more important than the contents
of the system disk when it comes to a system having a unique identity.

Scalability.

Because a top-down hardware design approach has been used by all UNIX hardware vendors
over the years, most UNIX vendors offer hardware solutions that scale to a large number of
processors. Sun, IBM, SGI, HP and other vendors all offer systems that scale to 64 CPUs.
Currently, one cannot obtain a reliable PC/NT platform that scales to even 8 CPUs (Intel won't
begin shipping 8-way chip sets until Q3 1999).

Along with the basic support for a larger number of processors, UNIX vendors have spent a great
deal of time researching advanced ways of properly supporting many CPUs. There are complex
issues concerning how such systems handle shared memory, the movement of data,
communications links, efficient use of other hardware such as graphics and video subsystems,
maximised use of storage systems (eg. RAID), and so on.

The result is that most UNIX vendors offer large system solutions which can tackle extremely
complex problems. Since these systems are obviously designed to the very highest quality
standards with a top-down approach to integration, etc., they are widely used by companies and
institutions which need such systems for solving the toughest of tasks, from processing massive
databases to dealing with huge seismic data sets, large satellite images, complex medical data
and intensive numerical processing (eg. weather modeling).

One very beneficial side-effect of this kind of development is that the technology which comes
out of such high-quality designs slowly filters down to the desktop systems, enabling customers
to eventually utilise extremely advanced and powerful computing systems. A particularly good
example of this is SGI's Octane system [9] - it uses the same components and basic technology
as SGI's high-end Origin server system. As a result, the user benefits from many advanced
features, eg.

 Octane has no inherent maximum memory limit. Memory is situated on a 'node board' along
with the 1 or 2 main CPUs, rather than housed on a backplane. As CPU designs improve, so
memory capacity on the node board can be increased by using a different node board design, ie.
without changing the base system at all. For example, Octane systems using the R10000 CPU
can have up to 2GB RAM, while Octane systems using the R12000 CPU can have up to 4GB RAM.
Future CPUs (R14K, R16K, etc.) will change this limit again to 8GB, 16GB, etc.
 The speed at which all internal links operate is directly synchronised to the clock speed of the
main CPU. As a result, internal data pathways can always supply data to both main CPUs faster
than they can theoretically cope with, ie. one can get the absolute maximum performance out
of a CPU (this is fundamentally not possible with any PC design). As CPU clock speeds increase,
so does the rate at which the system can move data around internally. An Octane using 195MHz
R10000s offers three separate internal data pathways each operating at 1560MB/sec (10X faster
than a typical PCI bus). An Octane using 300MHz R12000s runs the same pathways at the faster
rate of 2400MB/sec per link. ie. system bandwidth and memory bandwidth increase to match
CPU speed.

The above is not a complete list of advanced features.

SGI's high-end servers are currently the most scalable in the world, offering up to 256 CPUs for
a commercially available system, though some sites with advance copies of future OS changes
have systems with 512 and 720 CPUs. As stated elsewhere, one system has 6144 CPUs.

The quality of design required to create technologies like this, along with software and OS
concepts that run them properly, are quite incredible. These features are passed on down to
desktop systems and eventually into consumer markets. But it means that, at any one time, mid-
range systems based on such advanced technologies can be quite expensive (Octanes generally
start at around 7000 pounds). Since much of the push behind these developments comes from
military and government clients, again there is great emphasis on quality, reliability, security,
etc. Cray Research, which is owned by SGI, holds the world record for the most stable and
reliable system: a supercomputer with 2048 CPUs which ran for 2.5 years without any of the
processors exhibiting a single system-critical error.

Sun, HP, IBM, DEC, etc. all operate similar design approaches, though SGI/Cray happens to
have the most advanced and scalable server and graphics system designs at the present time,
mainly because they have traditionally targeted high-end markets, especially US government
contracts.

The history of UNIX vendor CPU design follows a similar legacy: typical customers have
always been willing to pay 3X as much as an Intel CPU in order to gain access to 2X the
performance. Ironically, as a result, Intel have always produced the world's slowest CPUs, even
though they are the cheapest. CPUs at much lower clock speeds from other vendors (HP, IBM,
Sun, SGI, etc.) can easily be 2X to 5X faster than Intel's current best. As stated above though,
these CPUs are much more expensive - even so, it's an extra cost which the relevant clients say
they will always bear in order to obtain the fastest available performance. The exception today is
the NT workstation market where systems from UNIX vendors utilise Intel CPUs and
WindowsNT (and/or Linux), offering a means of gaining access to better quality graphics and
video hardware while sacrificing the use of more powerful CPUs and the more sophisticated
UNIX OSs, resulting in lower cost. Even so, typical high-end NT systems still cost around 3000
to 15000 pounds.
So far, no UNIX vendor makes any product that is targeted at the home market, though some
vendors create technologies that are used in the mass consumer market (eg. the R3000 CPU
which runs the Sony PlayStation is designed by SGI and was used in their older workstations in
the late 1980s and early 1990s; all of the Nintendo64's custom processors were designed by
SGI). In terms of computer systems, it is unlikely this situation will ever change because to do so
would mean a vendor would have to adopt a bottom-up design approach in order to minimise
cost above all else - such a change wouldn't be acceptable to customers and would contradict the
way in which the high-end systems are developed. Vendors which do have a presence in the
consumer market normally use subsidiaries as a means of avoiding internal conflicts in design
ethos, eg. SGI's MIPS subsidiary (soon to be sold off).

References:

1. Blender Animation and Rendering Program: http://www.blender.nl/
2. XV Image Viewer: http://www.trilon.com/xv/xv.html
3. Extract taken from GNU GENERAL PUBLIC LICENSE, Version 2, June 1991, Copyright (C) 1989,
   1991 Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
4. GIMP (GNU Image Manipulation Program): http://www.gimp.org/
5. SGI Freeware Sites (identical):
   http://freeware.sgi.com/
   http://toolbox.sgi.com/TasteOfDT/public/freeware/
6. Pixar's Blue Moon Rendering Tools (BMRT): http://www.bmrt.org/
7. Silicon Graphics Indy, PCW, September 1993: http://www.futuretech.vuurwerk.nl/pcw9-93indy.html
8. "LA conferential", CGI Magazine, Vol4, Issue 1, Jan/Feb 1999, pp. 21, by Richard Spohrer.

   Interview from the 'Digital Content and Creation' conference and exhibition:

   '"No major production facilities rely on commercial software, everyone has to customise
   applications in order to get the most out of them," said Hughes. "We run Unix on SGI as we need
   a stable environment which allows fast networking. NT is not a professional solution and was
   never designed to handle high-end network environments," he added. "Windows NT is the
   antithesis of what the entertainment industry needs. If we were to move from Irix, we would
   use Linux over NT."'

   - John Hughes, president/CEO of Rhythm & Hues and Scott Squires, visual effects
   supervisor at ILM and ceo of Puffin Design.

9. Octane Information Index: http://www.futuretech.vuurwerk.nl/octane/
10. "How to set up the BIND domain name server", Network Week, Vol4 No. 29, 14th April 1999,
    pp. 17, by David Cartwright.
11. A letter from a reader in response to [10]:

    "Out of a BIND", Network Week, Vol4 No. 31, 28th April 1999, pp. 6:

    "A couple of weeks ago, I had a problem. I was attempting to configure NT4's DNS
    Server for use on a completely private network, but it just wasn't working properly. The
    WindowsNT 'help' - and I use that term loosely - assumed my network was connected to
    the Internet, so the examples it gave were largely useless. Then I noticed David
    Cartwright's article about setting up DNS servers. (Network Week, 14th April). The light
    began to dawn. Even better, the article used BIND's configuration files as examples. This
    meant that I could dump NT's obtuse GUI DNS Manager application and hand-hack the
    configuration files myself. A few minor problems later (most of which were caused by
    Microsoft's example DNS config files being a bit... um... optimistic) and the DNS server
    finally lurched into life. Thank you Network Week. The more Q&A and how-to type
    information you print, the better."

    - Matthew Bell, Fluke UK.

General References:

Anonymous SGI FTP Site List: http://reality.sgi.com/billh/anonftp/


Origin2000 Information Index: http://www.futuretech.vuurwerk.nl/origin/
Onyx2 Information Index: http://www.futuretech.vuurwerk.nl/onyx2/
SGI: http://www.sgi.com/
Hewlett Packard: http://www.hp.com/
Sun Microsystems: http://www.sun.com/
IBM: http://www.ibm.com/
Compaq/Digital: http://www.digital.com/
SCO: http://www.sco.com/
Linux: http://www.linux.org/

Appendix A: Case Study.

For unknown and unchangeable reasons, UCLAN's central admin system has a DNS
setup which, incorrectly, does not recognise comp.uclan.ac.uk as a subdomain. Instead,
the central DNS lists comp as a host name, ie. comp.uclan.ac.uk is listed as a direct
reference to Yoda's external IP address, 193.61.250.34; in terms of the intended use of
the word 'comp', this is rather like referring to a house on a street by using just the street
name. As a result, the SGI network's fully qualified host names, such as
yoda.comp.uclan.ac.uk, are not recognised outside UCLAN, and neither is
comp.uclan.ac.uk since all the machines on the SGI network treat comp as a subdomain.
Thus, external users can access Yoda's IP address directly by referring to 193.61.250.34
(so ftp is possible), but they cannot access Yoda as a web server, or access individual
systems in Ve24 such as sevrin.comp.uclan.ac.uk, or send email to the SGI network.
Also, services such as USENET cannot be set up, so internal users must use web sites to
access newsgroups.

This example serves as a warning: organisations should thoroughly clarify what their
individual departments' network structures are going to be, through a proper consultation
and discussion process, before allowing departments to set up internal networks.
Otherwise, confusion and disagreement can occur. In the case of the SGI network, its
internal structure is completely correct (as confirmed by SGI themselves), but the way it
is connected to the Internet is incorrect. Only the use of a Proxy server allows clients to
access the Internet, but some strange side-effects remain; for example, email can be sent
from the SGI network to anywhere on the Internet (from Yoda to Yahoo in less than 10
seconds!), but not vice-versa because incoming data is blocked by the incorrectly
configured central DNS.

Email from the SGI network can reach the outside world because of the way the email
system works: the default settings installed along with the standard Berkeley Sendmail
software (/usr/lib/sendmail) are sufficient to forward email from the SGI network to the
Internet via routers further along the communications chain, which then send the data to
JANET at Manchester, and from there to the final destination (which could include a
UCLAN student or staff member). The situation is rather like posting a letter without a
sender's address, or including an address which gives everything as far as the street name
but not the house number - the letter will be correctly delivered, but the recipient will not
be able to reply to the sender.
Detailed Notes for Day 2 (Part 2)
UNIX Fundamentals: Shell scripts.

It is an inevitable consequence of using a command interface such as shells that one would wish
to be able to run a whole sequence of commands to perform more complex tasks, or perhaps the
same task many times on multiple systems.

Shells allow one to do this by creating files containing sequences of commands. The file,
referred to as a shell script, can be executed just like any other program, though one must ensure
the execute permissions on the file are set appropriately in order for the script to be executable.

Large parts of all modern UNIX variants use shell scripts to organise system management and
behaviour. Programming in shell script can include more complicated structures such as if/then
statements, case statements, for loops, while loops, functions, etc. Combined with other features
such as metacharacters and the various text-processing utilities (perl, awk, sed, grep, etc.) one
can create extremely sophisticated shell scripts to perform practically any system administration
task, ie. one is able to write programs which can use any available application or existing
command as part of the code in the script. Since shell syntax (especially that of the C shell) is
modelled on C, shell programming effectively combines the flexibility of C-style programming
with the ability to utilise other programs and resources within the shell script code.

Looking at typical system shell script files, eg. the bootup scripts contained in /etc/init.d, one can
see that most system scripts make extensive use of if/then expressions and case statements.
However, a typical admin will find it mostly unnecessary to use even these features. In fact,
many administration tasks one might choose to do can be performed by a single command or
sequence of commands on a single line (made possible via the various metacharacters). An
admin might put such mini-scripts into a file and execute that file when required; even though
the file's contents may not appear to be particularly complex, one can perform a wide range of
tasks using just a few commands.

A hash symbol '#' at the beginning of a line in a script file denotes a comment.

One of the most commonly used commands in UNIX is 'find' which allows one to search for
files, directories, files belonging to a particular user or group, files of a special type (eg. a link to
another file), files modified before or after a certain time, and so on (there are many options).
Most admins tend to use the find command to select certain files upon which to perform some
other operation, to locate files for information gathering purposes, etc.

The find command uses a Boolean expression which defines the type of file the command is to
search for. The name of any file matching the Boolean expression is returned.
For example (see the 'find' man page for full details):

find /home/students -name "capture.mv" -print

Figure 25. A typical find command.

This command searches all students directories, looking for any file called 'capture.mv'. On Indy
systems, users often capture movie files when first using the digital camera, but usually never
delete them, wasting disk space. Thus, an admin might have a site policy that, at regular
intervals, all files called capture.mv are erased - users would be notified that if they captured a
video sequence which they wished to keep, they should either set the name to use as something
else, or rename the file afterwards.

One could place the above command into an executable file called 'loc', running that file when one
so desired. This can be done easily by the following sequence of actions (only one line is entered
in this example, but one could easily enter many more):

% cat > loc


find /home/students -name "capture.mv" -print
[press CTRL-D]
% chmod u+x loc
% ls -lF loc
-rwxr--r-- 1 mapleson staff 46 May 3 13:20 loc*

Figure 26. Using cat to quickly create a simple shell script.

Using ls -lF to examine the file, one would see the file has the execute permission set for user,
and a '*' has been appended after the file name, both indicating the file is now executable. Thus,
one could run that file just as if it were a program. One might imagine this is similar to .BAT
files in DOS, but the features and functionality of shell scripts are very different (much more
flexible and powerful, eg. the use of pipes).

There's no reason why one couldn't use an editor to create the file, but experienced admins know
that it's faster to use shortcuts such as employing cat in the above way, especially compared to
using GUI actions which require one to take hold of the mouse, move it, double-click on an icon,
etc. Novice users of UNIX systems don't realise until later that very simple actions can take
longer to accomplish with GUI methods.

Creating a file by redirecting the input from cat to a file is a technique I often use for typing out
files with little content. cat receives its input from stdin (the keyboard by default), so using 'cat >
filename' means anything one types is redirected to the named file instead of stdout; one must
press CTRL-D to end the input stream and close the file.

An even lazier way of creating the file, if just one line was required, is to use echo:

% echo 'find /home/students -name "capture.mv" -print' > loc


% chmod u+x loc
% ls -lF loc
-rwxr--r-- 1 mapleson staff 46 May 3 13:36 loc
% cat loc
find /home/students -name "capture.mv" -print

Figure 27. Using echo to create a simple one-line shell script.

This time, there is no need to press CTRL-D, ie. the prompt returns immediately and the file has
been created. This happens because, unlike cat which requires an 'end of file' action to terminate
the input, echo's input terminates when it receives an end-of-line character instead (this
behaviour can be overridden with the '-n' option).

The man page for echo says, "echo is useful for producing diagnostics in command files and for
sending known data into a pipe."

For the example shown in Fig 27, single quote marks surrounding the find command were
required. This is because, without the quotes, the double quotes enclosing capture.mv are not
included in the output stream which is redirected into the file. When contained in a shell script
file, find doesn't need double quotes around the file name to search for, but it's wise to include
them because other characters such as * have special meaning to a shell. For example, without
the single quote marks, the script file created with echo works just fine (this example searches
for any file beginning with the word 'capture' in my own account):

% echo find /mapleson -name "capture.*" -print > loc


% chmod u+x loc
% ls -lF loc
-rwxr--r-- 1 mapleson staff 38 May 3 14:05 loc*
% cat loc
find /mapleson -name capture.* -print
% loc
/mapleson/work/capture.rgb

Figure 28. An echo sequence without quote marks.

Notice the loc file has no double quotes. But if the contents of loc is entered directly at the
prompt:

% find /mapleson -name capture.* -print


find: No match.

Figure 29. The command fails due to * being treated as a metacommand by the shell.

Even though the command looks the same as the contents of the loc file, entering it directly at
the prompt produces an error. This happens because the * character is interpreted by the shell
before the find command, ie. the shell tries to evaluate the capture.* expression for the current
directory, instead of leaving the * to be part of the find command. Thus, when entering
commands at the shell prompt, it's wise to either use double quotes where appropriate, or use the
backslash \ character to tell the shell not to treat the character as if it was a shell metacommand,
eg.:

% find /mapleson -name capture.\* -print


/mapleson/work/capture.rgb

Figure 30. Using a backslash to avoid confusing the shell.

The -exec option can be used with the find command to enable further actions to be taken on each
result found, eg. the example in Fig 25 could be enhanced by making the find
operation execute a further command to remove each capture.mv file as it is found:

find /home/students -name "capture.mv" -print -exec /bin/rm {} \;

Figure 31. Using find with the -exec option to execute rm.

Any name returned by the search is passed on to the rm command. The shell substitutes the {}
symbols with each file name result as it is returned by find. The \; grouping at the end serves to
terminate the find expression as a whole (the ; character is normally used to terminate a
command, but a backslash is needed to prevent it being interpreted by the shell as a
metacommand).

Alternatively, one could use this type of command sequence to perform other tasks, eg. suppose I
just wanted to know how large each movie file was:

find /home/students -name "capture.mv" -print -exec /bin/ls -l {} \;

Figure 32. Using find with the -exec option to execute ls.

This works, but two entries will be printed for each command: one is from the -print option, the
other is the output from the ls command. To see just the ls output, one can omit the -print option.

Consider this version:

find /home/students -name "*.mov" -exec /bin/ls -l {} \; > results

Figure 33. Redirecting the output from find to a file.

This searches for any .mov movie file (usually QuickTime movies), with the output redirected
into a file. One can then perform further operations on the results file, eg. one could search the
data for any movie that contains the word 'star' in its name:

grep star results

A final change might be to send the results of the grep operation to the printer for later reading:

grep star results | lp

Thus, the completed script looks like this:

find /home/students -name "*.mv" -exec /bin/ls -l {} \; > results


grep star results | lp
Figure 34. A simple script with two lines.

Only two lines, but this is now a handy script for locating any movies on the file system that are
likely to be related to the Star Wars or Star Trek sagas and thus probably wasting valuable disk
space! For the network I run, I could then use the results to send each user a message saying the
Star Wars trailer is already available in /home/pub/movies/misc, so they've no need to download
extra copies to their home directory.

It's a trivial example, but in terms of the content of the commands and the way extra commands
are added, it's typical of the level of complexity of most scripts which admins have to create.

Further examples of the use of 'find' are in the relevant man page; an example file which contains
several different variations is:

/var/spool/cron/crontabs/root

This file lists the various administration tasks which are executed by the system automatically on
a regular basis. The cron system itself is discussed in a later lecture.

WARNING. The Dangers of the Find Command and Wildcards.

Although UNIX is an advanced OS with powerful features, sometimes one encounters an aspect
of its operation which catches one completely off-guard, though this is much less the case after
just a little experience.

A long time ago (January 1996), I realised that many students who used the Capture program to
record movies from the Digital Camera were not aware that using this program or other movie-
related programs could leave unwanted hidden directories containing temporary movie files in
their home directory, created during capture, editing or conversion operations (I think it happens
when an application is killed off suddenly, eg. with CTRL-C, which doesn't give it an opportunity
to erase temporary files).

These directories, which are always located in a user's home directory, are named
'.capture.mv.tmpXXXXX' where XXXXX is some 5-digit string such as '000Hb', and can easily
take up many megabytes of space each.

So, I decided to write a script to automatically remove such directories on a regular basis. Note
that I was logged on as root at this point, on my office Indy.

In order to test that a find command would work on hidden files (I'd never used the find
command to look for hidden files before), I created some test directories in the /tmp directory,
whose contents would be given by 'ls -AR' as something like this:

% ls -AR
.b/ .c/ a/ d/
./.b:

./.c:
.b a

./a:

./d:
a

ie. a simple range of hidden and non-hidden directories with or without any content:

 Ordinary directories with or without hidden/non-hidden files inside,


 Hidden directories with or without hidden/non-hidden files inside,
 Directories with ordinary files,
 etc.

The actual files such as .c/a and .c/.b didn't contain anything. Only the names were important for the
test.

So, to test that find would work ok, I executed the following command from within the /tmp
directory:

find . -name ".*" -exec /bin/rm -r {} \;

(NB: the -r option for rm means do a recursive removal, and note that there was no -i option used
with the rm here)

What do you think this find command would do? Would it remove the hidden directories .b and
.c and their contents? If not, why not? Might it do anything else as well?

Nothing happened at first, but the command did seem to be taking far too long to return the shell
prompt. So, after a few seconds, I decided something must have gone wrong; I typed CTRL-C to
stop the find process (NB: it was fortunate I was not distracted by a phone call or something at
this point).

Using the ls command showed the test files I'd created still existed, which seemed odd. Trying
some further commands, eg. changing directories, using the 'ps' command to see if there was
something causing system slowdown, etc., produced strange errors which I didn't understand at
the time (this was after only 1 or 2 months' admin experience), so I decided to reboot the system.

The result was disaster: the system refused to boot properly, complaining about swap file errors
and things relating to device files. Why did this happen?

Consider the following command sequence by way of demonstration:


cd /tmp
mkdir xyz
cd xyz
/bin/ls -al

The output given will look something like this:

drwxr-xr-x 2 root sys 9 Apr 21 13:28 ./


drwxrwxrwt 6 sys sys 512 Apr 21 13:28 ../

Surely the directory xyz should be empty? What are these two entries? Well, not quite empty. In
UNIX, as stated in a previous lecture, virtually everything is treated as a file. Thus, for example,
the command so commonly performed even on the DOS operating system:

cd ..

is actually doing something rather special on UNIX systems. 'cd ..' is not an entire command in
itself. Instead, every directory on a UNIX file system contains two hidden directories which are
in reality special types of file:

./ - this refers to the current directory.


../ - this is effectively a link to the
directory above in the file system.

So typing 'cd ..' actually means 'change directory to ..' (logical since cd does mean 'change
directory to') and since '..' is treated as a link to the directory above, then the shell changes the
current working directory to the next level up.

[by contrast, 'cd ..' in DOS is treated as a distinct command in its own right - DOS recognises the
presence of '..' and if possible changes directory accordingly; this is why DOS users can type
'cd..' instead if desired]

But this can have an unfortunate side effect if one isn't careful, as is probably becoming clear by
now. The ".*" search pattern in the find command will also find these special './' and '../' entries
in the /tmp directory, ie.:

 The first thing the find command locates is './', which matches the ".*" pattern.


 The next match is '../', ie. the link to the directory above; find follows it, so the search
moves up out of /tmp and into / (the root directory). Uh oh...
 In /, find again locates './' and '../'. Since the root's '../' refers back to the root itself, the
search cannot go any higher and simply continues within the root directory.
 The -exec option with 'rm' causes find to begin erasing hidden files and directories such as
.Sgiresources, eventually moving onto non-hidden files: first the /bin link to /usr/bin, then the
/debug link, then all of /dev, /dumpster, /etc and so on.

By the time I realised something was wrong, the find command had gone as far as deleting most of /etc.
Although important files in /etc were erased which I could have replaced with a backup tape or reinstall,
the real damage was the erasure of the /dev directory. Without important entries such as /dev/dsk,
/dev/rdsk, /dev/swap and /dev/tty*, the system cannot mount disks, configure the swap partition on
bootup, connect to keyboard input devices (tty terminals), and accomplish other important tasks.

In other words, disaster. And I'd made it worse by rebooting the system. Almost a complete
repair could have been done simply by copying the /dev and /etc directories from another
machine as a temporary fix, but the reboot made everything go haywire. I was partly fooled by
the fact that the files in /tmp were still present after I'd stopped the command with CTRL-C. This
led me to at first think that nothing had gone awry.

Consulting an SGI software support engineer for help, it was decided the only sensible solution
was to reinstall the OS, a procedure which was a lot simpler than trying to repair the damage I'd
done.

So, the lessons learned:

 Always read up about a command before using it. If I'd searched the online books with the
expression 'find command', I would have discovered the following paragraph in Chapter 2
("Making the Most of IRIX") of the 'IRIX Admin: System Configuration and Operation' manual:

"Note that using recursive options to commands can be very dangerous in that the command
automatically makes changes to your files and file system without prompting you in each case.
The chgrp command can also recursively operate up the file system tree as well as down. Unless
you are sure that each and every case where the recursive command will perform an action is
desired, it is better to perform the actions individually. Similarly, it is good practice to avoid the
use of metacharacters (described in "Using Regular Expressions and Metacharacters") in
combination with recursive commands."

I had certainly broken the rule suggested by the last sentence in the above paragraph. I
also did not know what the command would do before I ran it.

 Never run programs or scripts with as-yet unknown effects as root.

ie. when testing something like removing hidden directories, I should have logged on as
some ordinary user, eg. a 'testuser' account, so that if the command went wrong it would
not have been able to change or remove any files owned by root, or files owned by
anyone else for that matter, including my own in /mapleson. If I had done this, the
command I used would have given an immediate error and halted when the find string
tried to remove the very first file found in the root directory (probably some minor hidden
file such as .Sgiresources).

Worrying thought: if I hadn't CTRL-C'd the find command when I did, after enough time, the command
would have erased the entire file system (including /home), or at least tried to. I seem to recall that, in
reality (tested once on a standalone system deliberately), one can get about as far as most of /lib before
the system actually goes wrong and stops the current command anyway, ie. the find command
sequence eventually ends up failing to locate key libraries needed for the execution of 'rm' (or perhaps
the 'find' itself) at some point.
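
For reference, a safer way to tackle the original goal (removing the leftover
'.capture.mv.tmpXXXXX' directories) might look something like the following - a sketch rather
than a guaranteed-safe recipe, and best tested first as an ordinary user:

find /home/students -type d -name ".capture.mv.tmp*" -prune -print -exec /bin/rm -r {} \;

Here -type d restricts matches to directories, the more specific name pattern cannot possibly
match the './' or '../' entries, -prune stops find from trying to descend into a directory that is
about to be removed, and -print at least echoes each match before it is erased.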

The only positive aspects of the experience were that, a) I'd learned a lot about the subtleties of
the find command and the nature of files very quickly; b) I discovered after searching the Net
that I was not alone in making this kind of mistake - there was an entire web site dedicated to the
comical mess-ups possible on various operating systems that can so easily be caused by even
experienced admins, though more usually as a result of inexperience or simple errors, eg. I've
had at least one user so far who has erased their home directory by mistake with 'rm -r *' (he'd
thought his current working directory was /tmp when in fact it wasn't). A backup tape restored
his files.

Most UNIX courses explain how to use the various available commands, but it's also important
to show how not to use certain commands, mainly because of what can go wrong when the root
user makes a mistake. Hence, I've described my own experience of making an error in some
detail, especially since 'find' is such a commonly used command.

As stated in an earlier lecture, to a large part UNIX systems run themselves automatically. Thus,
if an admin finds that she/he has some spare time, I recommend using that time to simply read up
on random parts of the various administration manuals - look for hints & tips sections, short-cuts,
sections covering daily advice, guidance notes for beginners, etc. Also read man pages: follow
them from page to page using xman, rather like the way one can become engrossed in an
encyclopedia, looking up reference after reference to learn more.

A Simple Example Shell Script.

I have a script file called 'rebootlab' which contains the following:

rsh akira init 6&


rsh ash init 6&
rsh cameron init 6&
rsh chan init 6&
rsh conan init 6&
rsh gibson init 6&
rsh indiana init 6&
rsh leon init 6&
rsh merlin init 6&
rsh nikita init 6&
rsh ridley init 6&
rsh sevrin init 6&
rsh solo init 6&
#rsh spock init 6&
rsh stanley init 6&
rsh warlock init 6&
rsh wolfen init 6&
rsh woo init 6&
Figure 35. The simple rebootlab script.

The rsh command means 'remote shell'. rsh allows one to execute commands on a remote system
by establishing a connection, creating a shell on that system using one's own user ID
information, and then executing the supplied command sequence.

The init program is used for process control initialisation (see the man page for details). A
typical use for init is to shutdown the system or reboot the system into a particular state, defined
by a number from 0 to 6 (0 = full shutdown, 6 = full reboot) or certain other special possibilities.

As explained in a previous lecture, the '&' runs a process in the background.

Thus, each line in the file executes a remote shell on a system, instructing that system to reboot.
The init command in each case is run in the background so that the rsh command can
immediately return control to the rebootlab script in order to execute the next rsh command.

The end result? With a single command, I can reboot the entire SGI lab without ever leaving the
office.

Note: the line for the machine 'spock' is commented out. This is because the Indy called spock is
currently in the technician's office, ie. not in service. This is a good example of where I could
make the script more efficient by using a for loop, something along the lines of: for each name in
this list of names, do <command>.
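
As a sketch of that idea (not the script actually in use), a Bourne shell version using a for loop
might look like this; the list of host names is taken from the script above, with spock simply
omitted from the list while it is out of service:

#!/bin/sh
# Reboot every lab machine by running 'init 6' on each one in turn.
for host in akira ash cameron chan conan gibson indiana leon merlin \
            nikita ridley sevrin solo stanley warlock wolfen woo
do
    rsh $host init 6 &
done

Adding or removing a machine then only means editing the list of names on one line.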

As should be obvious, the rebootlab script makes no attempt to check if anybody is logged into
the system. So in practice I use the rusers command to make sure nobody is logged on before
executing the script. This is where the script could definitely be improved: the command sent by
rsh to each system could be modified with some extra commands so that each system is only
rebooted if nobody is logged in at the time (the 'who' command could probably be used for this,
eg. 'who | grep -v root' would give no output if nobody was logged on).
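
Continuing the sketch above, the 'who' check could be performed from the admin's machine before
each reboot command is issued; again this is only an illustration of the idea, not a tested
production script:

#!/bin/sh
# Reboot each lab machine, but only if nobody apart from root is logged on to it.
# The 'who' output is fetched over rsh and checked locally before rebooting.
for host in akira ash cameron chan conan gibson indiana leon merlin \
            nikita ridley sevrin solo stanley warlock wolfen woo
do
    if test -z "`rsh $host who | grep -v root`"; then
        rsh $host init 6 &
    fi
done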

The following script, called 'remountmapleson', is one I use when I go home in the evening, or
perhaps at lunchtime to do some work on the SGI I use at home.

rsh yoda umount /mapleson && mount /mapleson &


rsh akira umount /mapleson && mount /mapleson &
rsh ash umount /mapleson && mount /mapleson &
rsh cameron umount /mapleson && mount /mapleson &
rsh chan umount /mapleson && mount /mapleson &
rsh conan umount /mapleson && mount /mapleson &
rsh gibson umount /mapleson && mount /mapleson &
rsh indiana umount /mapleson && mount /mapleson &
rsh leon umount /mapleson && mount /mapleson &
rsh merlin umount /mapleson && mount /mapleson &
rsh nikita umount /mapleson && mount /mapleson &
rsh ridley umount /mapleson && mount /mapleson &
rsh sevrin umount /mapleson && mount /mapleson &
rsh solo umount /mapleson && mount /mapleson &
#rsh spock umount /mapleson && mount /mapleson &
rsh stanley umount /mapleson && mount /mapleson &
rsh warlock umount /mapleson && mount /mapleson &
rsh wolfen umount /mapleson && mount /mapleson &
rsh woo umount /mapleson && mount /mapleson &

Figure 36. The simple remountmapleson script.

When I leave for home each day, my own external disk (where my own personal user files
reside) goes with me, but this means the mount status of the /mapleson directory for every SGI in
Ve24 is now out-of-date, ie. each system still has the directory mounted even though the file
system which was physically mounted from the remote system (called milamber) is no longer
present. As a result, any attempt to access the /mapleson directory would give an error: "Stale
NFS file handle." Even listing the contents of the root directory would show the usual files but
also the error as well.

To solve this problem, the script makes every system unmount the /mapleson directory and, if
that was successfully done, remount the directory once more. Without my disk present on
milamber, its /mapleson directory simply contains a file called 'README' whose contents state:

Sorry, /mapleson data not available - my external disk has been temporarily removed. I've
probably gone home to work for a while. If you need to contact me, please call <phone
number>.

As soon as my disk is connected again and the script run once more, milamber's local /mapleson
contents are hidden by my own files, so users can access my home directory once again.

Thus, I'm able to add or remove my own personal disk and alter what users can see and access at
a global level without users ever noticing the change.

Note: the server still regards my home directory as /mapleson on milamber, so in order to ensure
that I can always logon to milamber as mapleson even if my disk is not present, milamber's
/mapleson directory also contains basic .cshrc, .login and .profile files.

Yet again, a simple script is created to solve a particular problem.

Command Arguments.

When a command or program is executed, the name of the command and any parameters are
passed to the program as arguments. In shell scripts, these arguments can be referenced via the '$'
symbol. Argument 0 is always the name of the command, then argument 1 is the first parameter,
argument 2 is the second parameter, etc. Thus, the following script called (say) 'go':

echo $0
echo $1
echo $2
would give this output upon execution:

% go somewhere nice
go
somewhere
nice

Including extra echo commands such as 'echo $3' merely produces blank lines after the supplied
parameters are displayed.

If one examines any typical system shell script, this technique of passing parameters and
referencing arguments is used frequently. As an example, I once used the technique to aid in the
processing of a large number of image files for a movie editing task. The script I wrote is also
typical of the general complexity of code which most admins have to deal with; called 'go', it
contained:

subimg $1 a.rgb 6 633 6 209


gammawarp a.rgb m.rgb 0.01
mult a.rgb a.rgb n.rgb
mult n.rgb m.rgb f.rgb
addborder f.rgb b.rgb x.rgb
subimg x.rgb ../tmp2/$1 0 767 300 875

(the commands used in this script are various image processing commands that are supplied as
part of the Graphics Library Image Tools software subsystem. Consult the relevant man pages
for details)

The important feature is the use of the $1 symbol in the first line. The script expects a single
parameter, ie. the name of the file to be processed. By eventually using this same argument at the
end of an alternative directory reference, a processed image file with the same name is saved
elsewhere after all the intermediate processing steps have finished. Each step uses temporary
files created by previous steps.

When I used the script, I had a directory containing 449 image files, each with a different name:

i000.rgb
i001.rgb
i002.rgb
.
.
.
i448.rgb

To process all the frames in one go, I simply entered this command:

find . -name "i*.rgb" -print -exec go {} \;

As each file is located by the find command, its name is passed as a parameter to the go script.
The use of the -print option displays the name of each file before the go script begins processing
the file's contents. It's a simple way to execute multiple operations on a large number of files.
Secure/Restricted Shell Scripts.

It is common practice to include the following line at the start of a shell script:

#!/bin/sh

This tells any shell what to use to interpret the script if the script is simply executed, as opposed
to sourcing the script within the shell.

The 'sh' shell is a lower level shell than csh or tcsh, ie. it's more restricted in what it can do and
does not have all the added features of csh and tcsh. However, this means a better level of
security, so many scripts (especially as-standard system scripts) include the above line in order to
make sure that security is maximised.
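
As a minimal illustration (reusing the earlier capture.mv example rather than any real system
script), a complete script using this convention might read:

#!/bin/sh
# Remove leftover movie capture files from student home directories.
find /home/students -name "capture.mv" -print -exec /bin/rm {} \;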

Also, by starting a new shell to run the script in, one ensures that the commands are always
performed in the same way, ie. a script without the above line may work slightly differently
when executed from within different shells (csh, tcsh, etc.), perhaps because of any aliases
present in the current shell environment, or a customised path definition, etc.
Detailed Notes for Day 2 (Part 3)
UNIX Fundamentals: System Monitoring Tools.

Running a UNIX system always involves monitoring how a system is behaving on a daily basis.
Admins must keep an eye on such things as:

 disk space usage


 system performance and statistics, eg. CPU usage, disk I/O, memory, etc.
 network performance and statistics
 system status, user status
 service availability, eg. Internet access
 system hardware failures and related maintenance
 suspicious/illegal activity

Figure 37. The daily tasks of an admin.

This section explains the various system monitoring tools, commands and techniques which an
admin can use to monitor the areas listed above. Typical example administration tasks are
discussed in a later lecture. The focus here is on available tools and what they offer, not on how
to use them as part of an admin strategy.

Disk Space Usage.

The df command reports current disk space usage. Run on its own, the output is expressed in
terms of numbers of blocks used/free, eg.:

yoda # df
Filesystem Type blocks use avail %use Mounted on
/dev/root xfs 8615368 6116384 2498984 71 /
/dev/dsk/dks4d5s7 xfs 8874746 4435093 4439653 50 /home
milamber:/mapleson nfs 4225568 3906624 318944 93 /mapleson

Figure 38. Using df without options.

A block is 512 bytes. But most people tend to think in terms of kilobytes, megabytes and
gigabytes, not multiples of 512 bytes. Thus, the -k option can be used to show the output in K:

yoda # df -k
Filesystem Type kbytes use avail %use Mounted on
/dev/root xfs 4307684 3058192 1249492 71 /
/dev/dsk/dks4d5s7 xfs 4437373 2217547 2219826 50 /home
milamber:/mapleson nfs 2112784 1953312 159472 93 /mapleson

Figure 39. The -k option with df to show data in K.
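
Comparing the two outputs confirms the relationship: the -k figures are exactly half the raw block
counts, since each block is 512 bytes (half a kilobyte), eg. for /dev/root, 8615368 blocks * 512
bytes = 4307684K.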


The df command can be forced to report data only for the file system housing the current
directory by adding a period:

yoda # cd /home && df -k .


Filesystem Type kbytes use avail %use Mounted on
/dev/dsk/dks4d5s7 xfs 4437373 2217547 2219826 50 /home

Figure 40. Using df to report usage for the file system holding the current directory.

The du command can be used to show the amount of space used by a particular directory or file,
or series of directories and files. The -k option can be used to show usage in K instead of 512-byte
blocks just as with df. du's default behaviour is to report a usage amount recursively for every
sub-directory, giving a total at the end, eg.:

yoda # du -k /usr/share/data/models
436 /usr/share/data/models/sgi
160 /usr/share/data/models/food
340 /usr/share/data/models/toys
336 /usr/share/data/models/buildings
412 /usr/share/data/models/household
864 /usr/share/data/models/scenes
132 /usr/share/data/models/chess
1044 /usr/share/data/models/geography
352 /usr/share/data/models/CyberHeads
256 /usr/share/data/models/machines
1532 /usr/share/data/models/vehicles
88 /usr/share/data/models/simple
428 /usr/share/data/models/furniture
688 /usr/share/data/models/robots
7760 /usr/share/data/models

Figure 41. Using du to report usage for several directories/files.

The -s option can be used to restrict the output to just an overall total for the specified directory:

yoda # du -k -s /usr/share/data/models
7760 /usr/share/data/models

Figure 42. Restricting du to a single directory.

By default, du does not follow symbolic links, though the -L option can be used to force links to
be followed if desired.

However, du does examine NFS-mounted file systems by default. The -l and -m options can be
used to restrict this behaviour, eg.:

ASH # cd /
ASH # du -k -s -l
0 CDROM
0 bin
0 debug
68 dev
0 disk2
2 diskcopy
0 dumpster
299 etc
0 home
2421 lib
2579 lib32
0 opt
0 proc
1 root.home
4391 sbin
565 stand
65 tmp
3927 unix
397570 usr
6346 var

Figure 43. Forcing du to ignore symbolic links.

The output in Fig 43 shows that the /home directory has been ignored.

Another example: a user can find out how much disk space their account currently uses by
entering: du -k -s ~/

Swap space (ie. virtual memory on disk) can be monitored using the swap command with the -l
option.

For full details on these commands, see the relevant man pages.

Commands relating to file system quotas are dealt with in a later lecture.

System Performance.

This includes processor loading, disk loading, etc.

The most common command used by admins/users to observe CPU usage is ps, which displays a
list of currently running processes along with associated information, including the percentage of
CPU time currently being consumed by each process, eg.:

ASH 6# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 0 0 0 08:00:41 ? 0:01 sched
root 1 0 0 08:00:41 ? 0:01 /etc/init
root 2 0 0 08:00:41 ? 0:00 vhand
root 3 0 0 08:00:41 ? 0:03 bdflush
root 4 0 0 08:00:41 ? 0:00 munldd
root 5 0 0 08:00:41 ? 0:02 vfs_sync
root 900 895 0 08:03:27 ? 1:25 /usr/bin/X11/Xsgi -bs
[etc]
root 7 0 0 08:00:41 ? 0:00 shaked
root 8 0 0 08:00:41 ? 0:00 xfsd
root 9 0 0 08:00:41 ? 0:00 xfsd
root 10 0 0 08:00:41 ? 0:00 xfsd
root 11 0 0 08:00:41 ? 0:00 pdflush
root 909 892 0 08:03:31 ? 0:02 /usr/etc/videod
root 1512 1509 0 15:37:17 ? 0:00 sh -c /var/X11/xdm/Xlogin
root 158 1 0 08:01:01 ? 0:01 /usr/etc/ypbind -ypsetme
root 70 1 0 08:00:50 ? 0:00 /usr/etc/syslogd
root 1536 211 0 16:06:04 pts/0 0:00 rlogind
root 148 1 0 08:01:00 ? 0:01 /usr/etc/routed -h -[etc]
root 146 1 0 08:01:00 ? 0:00 /usr/etc/portmap
root 173 172 0 08:01:03 ? 0:01 /usr/etc/nfsd 4
root 172 1 0 08:01:03 ? 0:01 /usr/etc/nfsd 4
root 174 172 0 08:01:03 ? 0:01 /usr/etc/nfsd 4
root 175 172 0 08:01:03 ? 0:01 /usr/etc/nfsd 4
root 178 1 0 08:01:03 ? 0:00 /usr/etc/biod 4
root 179 1 0 08:01:03 ? 0:00 /usr/etc/biod 4
root 180 1 0 08:01:03 ? 0:00 /usr/etc/biod 4
root 181 1 0 08:01:03 ? 0:00 /usr/etc/biod 4
root 189 1 0 08:01:04 ? 0:00 bio3d
root 190 1 0 08:01:04 ? 0:00 bio3d
root 191 1 0 08:01:04 ? 0:00 bio3d
root 202 1 0 08:01:05 ? 0:00 /usr/etc/rpc.statd
root 192 1 0 08:01:04 ? 0:00 bio3d
root 188 1 0 08:01:03 ? 0:00 bio3d
root 311 1 0 08:01:08 ? 0:00 /usr/etc/timed -M -F yoda
root 211 1 0 08:01:05 ? 0:02 /usr/etc/inetd
root 823 1 0 08:01:33 ? 0:13 /usr/lib/sendmail -bd -q15m
root 1557 1537 9 16:10:58 pts/0 0:00 ps -ef
root 892 1 0 08:03:25 ? 0:00 /usr/etc/videod
root 1513 1512 0 15:37:17 ? 0:07 /usr/Cadmin/bin/clogin -f
root 1546 872 0 16:07:55 ? 0:00 /usr/Cadmin/bin/directoryserver
root 1537 1536 1 16:06:04 pts/0 0:01 -tcsh
root 903 1 0 08:03:27 tablet 0:00 /sbin/getty ttyd1 co_9600
lp 460 1 0 08:01:17 ? 0:00 /usr/lib/lpsched
root 1509 895 0 15:37:13 ? 0:00 /usr/bin/X11/xdm
root 488 1 0 08:01:19 ? 0:01 /sbin/cron
root 1556 1537 28 16:10:56 pts/0 0:01 find /usr -name *.txt -print
root 895 1 0 08:03:27 ? 0:00 /usr/bin/X11/xdm
root 872 1 0 08:02:32 ? 0:06 /usr/Cadmin/bin/directoryserver

Figure 44. Typical output from the ps command.

Before obtaining the output shown in Fig 44, I ran a find command in the background. The
output shows that the find command was utilising 28% of available CPU resources; tasks such as
find are often limited by the speed and bandwidth capacity of the disk, not the speed of the main
CPU.

The ps command has a variety of options to show or not show various information. Most of the
time though, 'ps -ef' is adequate to display the kind of information required. Note that other
UNIX variants use slightly different options, eg. the equivalent command on SunOS would be
'ps -aux'.

One can use grep to only report data for a particular process, eg.:

ASH 5# ps -ef | grep lp

lp 460 1 0 08:01:17 ? 0:00 /usr/lib/lpsched

Figure 45. Filtering ps output with grep.

This only reports data for the lp printer scheduler.

However, ps only gives a snapshot of the current system state. Often of more interest is a
system's dynamic behaviour. A more suitable command for monitoring system performance over
time is 'top', a typical output of which looks like this:

IRIX ASH 6.2 03131015 IP22 Load[0.22,0.12,0.01] 16:17:47 166 procs


user pid pgrp %cpu proc pri size rss time command
root 1576 1576 24.44 * 20 386 84 0:02 find
root 1577 1577 0.98 0 65 432 100 0:00 top
root 1513 1509 0.18 * 60 4322 1756 0:07 clogin
root 900 900 0.12 * 60 2858 884 1:25 Xsgi
root 146 146 0.05 * 60 351 77 0:00 portmap
root 158 0 0.05 * 60 350 81 0:00 ypbind
root 1567 1567 0.02 * 60 349 49 0:00 rlogind
root 3 0 0.01 * +39 0 0 0:03 bdflush
root 172 0 0.00 * 61 0 0 0:00 nfsd
root 173 0 0.00 * 61 0 0 0:00 nfsd
root 174 0 0.00 * 61 0 0 0:00 nfsd
root 175 0 0.00 * 61 0 0 0:00 nfsd

Figure 46. top shows a continuously updated output.

From the man page for top:

"Two header lines are displayed. The first gives the machine name, the release and build date
information, the processor type, the 1, 5, and 15 minute load average, the current time and the
number of active processes. The next line is a header containing the name of each field
highlighted."

The display is constantly updated at regular intervals, the duration of which can be altered with
the -i option (default duration is 5 seconds). top shows the following data for each process:
"user name, process ID, process group ID, CPU usage, processor currently executing the process
(if process not currently running), process priority, process size (in pages), resident set size (in
pages), amount of CPU time used by the process, and the process name."

Just as with the ps command, top shows the ID number for each process. These IDs can be used
with the kill command (and others) to control running processes, eg. shut them down, suspend
them, etc.
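
For example, using the process ID of the find command shown in Fig 46 (PID 1576), the job
could be suspended, resumed and finally terminated like this:

kill -STOP 1576     # suspend the process
kill -CONT 1576     # allow it to continue
kill 1576           # request termination (SIGTERM)
kill -9 1576        # force termination (SIGKILL) - a last resort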

There is a GUI version of top called gr_top.

Note that IRIX 6.5 contains a newer version of top which gives even more information, eg.:

IRIX WOLFEN 6.5 IP22 load averages: 0.06 0.01 0.00                 17:29:44
58 processes: 56 sleeping, 1 zombie, 1 running
CPU: 93.5% idle, 0.5% usr, 5.6% ker, 0.0% wait, 0.0% xbrk, 0.5% intr
Memory: 128M max, 116M avail, 88M free, 128M swap, 128M free swap

   PID PGRP USERNAME PRI  SIZE   RES STATE   TIME WCPU%  CPU% COMMAND
  1372 1372 root      20 2204K 1008K run/0   0:00   0.2  3.22 top
   153  153 root      20 2516K 1516K sleep   0:05   0.1  1.42 nsd
  1364 1364 root      20 1740K  580K sleep   0:00   0.0  0.24 rlogind

Figure 47. The IRIX 6.5 version of top, giving extra information.

A program which offers much greater detail than top is osview. Like top, osview constantly
updates a whole range of system performance statistics. Unlike top though, so much information
is available from osview that it offers several different 'pages' of data. The number keys are used
to switch between pages. Here is a typical output for each of the five pages:

Page 1 (system information):

Osview 2.1 : One Second Average WOLFEN 17:32:13 04/21/99 #5


int=5s
Load Average fs ctl 2.0M
1 Min 0.000 fs data 7.7M
5 Min 0.000 delwri 0
15 Min 0.000 free 87.5M
CPU Usage data 26.0M
%user 0.20 empty 61.4M
%sys 0.00 userdata 20.7M
%intr 0.00 reserved 0
%gfxc 0.00 pgallocs 2
%gfxf 0.00 Scheduler
%sxbrk 0.00 runq 0
%idle 99.80 swapq 0
System Activity switch 4
syscall 19 kswitch 95
read 1 preempt 1
write 0 Wait Ratio
fork 0 %IO 1.2
exec 0 %Swap 0.0
readch 19 %Physio 0.0
writech 38
iget 0
System Memory
Phys 128.0M
kernel 10.1M
heap 3.9M
mbufs 96.0K
stream 40.0K
ptbl 1.2M

Figure 48. System information from osview.

Page 2 (CPU information):

Osview 2.1 : One Second Average WOLFEN 17:36:27 04/21/99 #1


int=5s
CPU Usage
%user 0.00
%sys 100.00
%intr 0.00
%gfxc 0.00
%gfxf 0.00
%sxbrk 0.00
%idle 0.00

Figure 49. CPU information from osview.

Page 3 (memory information):

Osview 2.1 : One Second Average WOLFEN 17:36:56 04/21/99 #1


int=5s
System Memory iclean 0
Phys 128.0M *Swap
kernel 10.5M *System VM
heap 4.2M *Heap
mbufs 100.0K *TLB Actions
stream 48.0K *Large page stats
ptbl 1.3M
fs ctl 1.5M
fs data 8.2M
delwri 0
free 77.1M
data 28.8M
empty 48.3M
userdata 30.7M
reserved 0
pgallocs 450
Memory Faults
vfault 1.7K
protection 225
demand 375
cw 25
steal 375
onswap 0
oncache 1.4K
onfile 0
freed 0
unmodswap 0
unmodfile 0

Figure 50. Memory information from osview.

Page 4 (network information):

Osview 2.1 : One Second Average WOLFEN 17:38:15 04/21/99 #1


int=5s
TCP
acc. conns 0
sndtotal 33
rcvtotal 0
sndbyte 366
rexmtbyte 0
rcvbyte 0
UDP
ipackets 0
opackets 0
dropped 0
errors 0
IP
ipackets 0
opackets 33
forward 0
dropped 0
errors 0
NetIF[ec0]
Ipackets 0
Opackets 33
Ierrors 0
Oerrors 0
collisions 0
NetIF[lo0]

Figure 51. Network information from osview.

Page 5 (miscellaneous):

Osview 2.1 : One Second Average WOLFEN 17:38:43 04/21/99 #1


int=5s
Block Devices
lread 37.5K
bread 0
%rcache 100.0
lwrite 0
bwrite 0
wcancel 0
%wcache 0.0
phread 0
phwrite 0
Graphics
griioctl 0
gintr 75
swapbuf 0
switch 0
fifowait 0
fifonwait 0
Video
vidioctl 0
vidintr 0
drop_add 0
*Interrupts
*PathName Cache
*EfsAct
*XfsAct
*Getblk
*Vnodes

Figure 51. Miscellaneous information from osview.

osview clearly offers a vast amount of information for monitoring system and network activity.

There is a GUI version of osview called gr_osview. Various options exist to determine which
parameters are displayed with gr_osview, the most commonly used being -a to display as much
data as possible.

Programs such as osview (and its GUI relative gr_osview) are SGI-specific, but top, or something
very like it, is available on most UNIX variants, and every flavour of UNIX provides equivalent
monitoring tools of some kind (eg. vmstat, sar).

Example use: although I do virtually all the administration of the server remotely using the office
Indy (either by command line or GUI tools), there is also a VT-style terminal in my office
connected to the server's serial port via a lengthy cable (the Challenge S server itself is in a small
ante room). The VT display offers a simple text-only interface to the server; thus, most of the
time, I leave osview running on the VT display so that I can observe system activity whenever I
need to. The VT also offers an extra communications link for remote administration should the
network go down, ie. if the network links fail (eg. broken hub) the admin Indy cannot be used to
communicate with the server, but the VT still can.

Another tool for monitoring memory usage is gmemusage, a GUI program which displays a
graphical split-bar chart view of current memory consumption. gmemusage can also display a
breakdown of the regions within a program's memory space, eg. text, data, shared memory, etc.
Much lower-level tools exist too, such as sar (system activity reporter). In fact, osview works by
using sar. Experienced admins may use tools like sar, but most admins will prefer to use higher-
level tools such as top, osview and gmemusage. However, since sar gives a text output, one can
use it in script files for automated system information gathering, eg. a system activity report
produced by a script, executed every hour by the cron job-scheduling system (sar-based
information gathering scripts are included in the cron job schedule as standard). sar can be given
options to report only on selected items, eg. the number of processes in memory waiting for CPU
resource time. sar can be told to monitor some system feature for a certain period, saving the data
gathered during that period to a file. sar is a very flexible program.
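
As an illustrative sketch (these option letters are the standard System V ones, and the output
file name is just an example; the sar man page gives the exact set supported on a particular
system):

sar -u 60 10                      # report CPU usage every 60 seconds, 10 times
sar -q 60 10                      # report run-queue statistics over the same period
sar -u -o /var/tmp/sar.data 300 12 > /dev/null    # save an hour of samples to a file
sar -u -f /var/tmp/sar.data       # examine the saved data later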

Network Performance and Statistics.

osview can be used to monitor certain network statistics, but another useful program is ttcp. The
online book, "IRIX Admin: Networking and Mail", says:

"The ttcp tool measures network throughput. It provides a realistic measurement of network
performance between two stations because it allows measurements to be taken at both the
local and remote ends of the transmission."

To run a test with ttcp, enter the following on one system, eg. sevrin:

ttcp -r -s

Then enter the following on another system, eg. akira:

ttcp -t -s sevrin

After a delay of roughly 20 seconds for a 10Mbit network, results are reported by both systems,
which will look something like this:

SEVRIN # ttcp -r -s
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp
ttcp-r: socket
ttcp-r: accept from 193.61.252.2
ttcp-r: 16777216 bytes in 18.84 real seconds = 869.70 KB/sec +++
ttcp-r: 3191 I/O calls, msec/call = 6.05, calls/sec = 169.39
ttcp-r: 0.1user 3.0sys 0:18real 16% 118maxrss 0+0pf 1170+1csw

AKIRA # ttcp -t -s sevrin


ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp -> sevrin
ttcp-t: socket
ttcp-t: connect
ttcp-t: 16777216 bytes in 18.74 real seconds = 874.19 KB/sec +++
ttcp-t: 2048 I/O calls, msec/call = 9.37, calls/sec = 109.27
ttcp-t: 0.0user 2.3sys 0:18real 12% 408maxrss 0+0pf 426+4csw

Figure 52. Results from ttcp between two hosts on a 10Mbit network.
Full details of the output are in the ttcp man page, but one can immediately see that the observed
network throughput (around 870KB/sec) is at a healthy level.

Another program for gathering network performance information is netstat. The online book,
"IRIX Admin: Networking and Mail", says:

"The netstat tool displays various network-related data structures that are useful for monitoring
and troubleshooting a network. Detailed statistics about network collisions can be captured with
the netstat tool."

netstat is commonly used with the -i option to list basic local network information, eg.:

yoda # netstat -i
Name  Mtu   Network     Address          Ipkts    Ierrs  Opkts    Oerrs  Coll
ec0   1500  193.61.252  yoda.comp.uclan  3906956  3      2945002  0      553847
ec3   1500  193.61.250  gate-yoda.comp.  560206   2      329366   0      16460
lo0   8304  loopback    localhost        476884   0      476884   0      0

Figure 53. The output from netstat.

Here, the packet collision rate on ec0 has averaged 18.8% (553847 collisions for 2945002 output packets). This is within acceptable limits [1].

Another useful command is 'ping'. This program sends packets of data to a remote system,
requesting an acknowledgement response for each packet sent. Options can be used to send a
specific number of packets, to send packets as fast as they are returned (flood mode), to send a
packet at regular user-definable intervals, and so on.

For example:

MILAMBER # ping yoda


PING yoda.comp.uclan.ac.uk (193.61.252.1): 56 data bytes
64 bytes from 193.61.252.1: icmp_seq=0 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=1 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=2 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=3 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=4 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=5 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=6 ttl=255 time=1 ms

----yoda.comp.uclan.ac.uk PING Statistics----


7 packets transmitted, 7 packets received, 0% packet loss
round-trip min/avg/max = 1/1/1 ms
Figure 54. Example use of the ping command.

I pressed CTRL-C after the 7th packet was sent. ping is a quick and easy way to see if a host is
active and if so how responsive the connection is.

If a ping test produces significant packet loss on a local network, then it is highly likely there
exists a problem of some kind. Normally, one would rarely see a non-zero packet loss on a local
network from a direct machine-to-machine ping test.

A fascinating use of ping I once observed was at The Moving Picture Company (MPC) [2]. The
admin had written a script which made every host on the network send a ping test to every other
host. The results were displayed as a table with host names shown down the left hand side as
well as along the top. By looking for horizontal or diagonal lines of unusually large ping times,
the admin could immediately see if there was a problem with a single host, or with a larger part
of the network. Because of the need for a high system availability rate, the script allows the
admin to spot problems almost as soon as they occur, eg. by running the script once every ten
seconds.
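
A greatly simplified sketch of the idea is shown below: it pings a short list of hosts from a
single machine rather than building the full host-by-host table, and assumes the local ping
accepts a '-c' packet-count option (the host names are just examples):

#!/bin/sh
# Report the round-trip time for a single ping to each host listed.
for host in yoda akira sevrin wolfen
do
    echo "$host: `ping -c 1 $host | grep round-trip`"
done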

When the admin showed me the script in use, one column had rather high ping times (around
20ms). Logging into the host concerned with rlogin and checking with ps showed everything was
ok: a complex process was merely consuming a lot of CPU time, giving a slower network response.

System Status and User Status.

The rup command offers an immediate overview of current system states, eg.:

yoda # rup
yoda.comp.uclan.ac.u    up  6 days,  8:25,  load average: 0.33, 0.36, 0.35
gate-yoda.comp.uclan    up  6 days,  8:25,  load average: 0.33, 0.36, 0.35
wolfen.comp.uclan.ac    up          11:28,  load average: 0.00, 0.00, 0.00
conan.comp.uclan.ac.    up          11:28,  load average: 0.06, 0.01, 0.00
akira.comp.uclan.ac.    up          11:28,  load average: 0.01, 0.00, 0.00
nikita.comp.uclan.ac    up          11:28,  load average: 0.03, 0.00, 0.00
gibson.comp.uclan.ac    up          11:28,  load average: 0.00, 0.00, 0.00
woo.comp.uclan.ac.uk    up          11:28,  load average: 0.01, 0.00, 0.00
solo.comp.uclan.ac.u    up          11:28,  load average: 0.00, 0.00, 0.00
cameron.comp.uclan.a    up          11:28,  load average: 0.02, 0.00, 0.00
sevrin.comp.uclan.ac    up          11:28,  load average: 0.69, 0.46, 0.50
ash.comp.uclan.ac.uk    up          11:28,  load average: 0.00, 0.00, 0.00
ridley.comp.uclan.ac    up          11:28,  load average: 0.00, 0.00, 0.00
leon.comp.uclan.ac.u    up          11:28,  load average: 0.00, 0.00, 0.00
warlock.comp.uclan.a    up           1:57,  load average: 0.08, 0.13, 0.11
milamber.comp.uclan.    up           9:52,  load average: 0.11, 0.07, 0.00
merlin.comp.uclan.ac    up          11:28,  load average: 0.01, 0.00, 0.00
indiana.comp.uclan.a    up          11:28,  load average: 0.00, 0.00, 0.02
stanley.comp.uclan.a    up           1:56,  load average: 0.00, 0.00, 0.00

Figure 55. The output from rup.

The load averages for a single machine can be ascertained by running 'uptime' on that machine,
eg.:

MILAMBER 84# uptime


8:05pm up 10:28, 6 users, load average: 0.07, 0.06, 0.25
MILAMBER 85# rsh yoda uptime
8:05pm up 6 days, 9:02, 2 users, load average: 0.47, 0.49, 0.42

Figure 56. The output from uptime.

The w command displays current system activity, including what each user is doing. The man
page says, "The heading line shows the current time of day, how long the system has been up,
the number of users logged into the system, and the load averages." For example:

yoda # w
8:10pm up 6 days, 9:07, 2 users, load average: 0.51, 0.50, 0.41
User tty from login@ idle JCPU PCPU what
root q0 milamber.comp. 7:02pm 8 w
cmprj ftp UNKNOWN@ns5ip. 7:29pm -

Figure 57. The output from w showing current user activity.

With the -W option, w shows the 'from' information on a separate line, allowing one to see the
full domain address of ftp connections, etc.:

yoda # w -W
8:11pm up 6 days, 9:08, 2 users, load average: 0.43, 0.48, 0.40
User tty login@ idle JCPU PCPU what
root ttyq0 7:02pm 8 w -W
milamber.comp.uclan.ac.uk
cmprj ftp22918 7:29pm -
UNKNOWN@ns5ip.uclan.ac.uk
Figure 58. Obtaining full domain addresses from w with the -W option.

The rusers command broadcasts to all machines on the local network, gathering data about who
is logged on and where, eg.:

yoda # rusers
yoda.comp.uclan.ac.uk root
wolfen.comp.uclan.ac.uk guest guest
gate-yoda.comp.uclan.ac.uk root
milamber.comp.uclan.ac.uk root root root root mapleson mapleson
warlock.comp.uclan.ac.uk sensjv sensjv

Figure 59. The output from rusers, showing who is logged on where.

The multiple entries for certain users indicate that more than one shell is active for that user. As
usual, my login data shows I'm doing several things at once.

rusers can be modified with options to:

 report for all machines, whether users are logged in or not (-a),
 probe a specific machine (supply host name(s) as arguments),
 display the information sorted alphabetically by:
o host name (-h),
o idle time (-i),
o number of users (-u),
 give a more detailed output in the same style as the who command (-l).

Service Availability.

The most obvious way to check whether a service is available for use is simply to try and use it,
eg. ftp or telnet to a test location, start a Netscape session and enter a familiar URL, send an
email to a local or remote account, etc. The ps command can also be used to make sure the
relevant background processes for a service are running, eg. 'nfsd' for the NFS system (an
example is given below). However, if a service is experiencing problems, simply attempting to
use the service will not reveal what may be wrong.
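
For example, a quick check that the NFS-related daemons are present on the local machine might
be:

ps -ef | egrep 'nfsd|biod' | grep -v egrep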

For example, if one cannot ftp, it could be because of anything from a loose cable connection to
some remote server that's gone down.

The ping command is useful for an immediate check of network-related services such as ftp,
telnet, WWW, etc. One pings each host in the communication chain to see if the hosts respond. If
a host somewhere in the chain does not respond, then that host may be preventing any data from
getting through (eg. a remote proxy server is down).

A useful command one can use to aid in such detective work is traceroute. This command sends
test packets in a similar way to ping, but it also reports how the test packets reached the target
site at each stage of the communication chain, showing response times in milliseconds for each
step, eg.:

yoda # traceroute www.cee.hw.ac.uk


traceroute to osiris.cee.hw.ac.uk (137.195.52.12), 30 hops max, 40 byte packets
 1 193.61.250.33 (193.61.250.33) 6 ms (ttl=30!) 3 ms (ttl=30!) 4 ms (ttl=30!)
 2 193.61.250.65 (193.61.250.65) 5 ms (ttl=29!) 5 ms (ttl=29!) 5 ms (ttl=29!)
 3 gw-mcc.netnw.net.uk (194.66.24.1) 9 ms (ttl=28!) 8 ms (ttl=28!) 10 ms (ttl=28!)
 4 manchester-core.ja.net (146.97.253.133) 12 ms 11 ms 9 ms
 5 scot-pop.ja.net (146.97.253.42) 15 ms 13 ms 14 ms
 6 146.97.253.34 (146.97.253.34) 20 ms 15 ms 17 ms
 7 gw1.hw.eastman.net.uk (194.81.56.110) 20 ms (ttl=248!) 18 ms 14 ms
 8 cee-gw.hw.ac.uk (137.195.166.101) 17 ms (ttl=23!) 31 ms (ttl=23!) 18 ms (ttl=23!)
 9 osiris.cee.hw.ac.uk (137.195.52.12) 14 ms (ttl=56!) 26 ms (ttl=56!) 30 ms (ttl=56!)

If a particular step shows a sudden jump in response time, then there may be a communications
problem at that step, eg. the host in question may be overloaded with requests, or short of
communications bandwidth or CPU processing power, etc.

At a lower level, system services often depend on background system processes, or daemons. If
these daemons are not running, or have shut down for some reason, then the service will not be
available.

On the SGI Indys, one example is the GUI service which handles the use of on-screen icons. The
daemon responsible is called objectserver. Older versions of this particular daemon can
occasionally shut down if an illegal iconic operation is performed, or if the file manager daemon
experiences an error. With no objectserver running, the on-screen icons disappear.

Thus, a typical task might be to periodically check to make sure the objectserver daemon is
running on all relevant machines. If it isn't, then the command sequence:

/etc/init.d/cadmin stop
/etc/init.d/cadmin start

restarts the objectserver. Once running, the on-screen icons return.

A common cause of objectserver shutting down is when a user's desktop layout configuration
files (contained in .desktop- directories) become corrupted in some way, eg. edited by hand in an
incorrect manner, or mangled by some other operation (eg. a faulty Java script from a home
made web page). One solution is to erase the user's desktop layout configuration directory, then
login as the user and create a fresh .desktop- directory.

objectserver is another example of UNIX GUI evolution. In 1996 SGI decided to replace the
objectserver system entirely in IRIX 6.3 (and later) with a new service that was much more
reliable, less likely to be affected by errors made in other applications, and fully capable of
supporting new 'webified' iconic services such as on-screen icons that are direct links to ftp,
telnet or WWW sites.

In general, checking the availability of a service means checking that the relevant daemons are
running; that the appropriate configuration files are in place, are accessible and contain the
correct settings; that the relevant daemon is aware of any changes which may have been made
(perhaps the service needs to be stopped and restarted?); and, when incidents do occur,
investigating via the online information what may have caused the service to fail. For every
service one can use, the online information explains how to set up, administer and troubleshoot
the service. The key is to know where to find that information when it is needed.

A useful source of constantly updated status information is the /var/adm/SYSLOG file. This file
is where any important system events are logged. One can configure all the various services and
daemons to log different degrees of detailed information in the SYSLOG file. Note: logging too
much detail can cause the log file to grow very quickly, in which case one would also have to
ensure that it did not consume valuable disk space. The SYSLOG file records user logins,
connections via ftp, telnet, etc., messages logged at system bootup/shutdown time, and many
other things.
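
Since SYSLOG is an ordinary text file, the usual tools can be used to pull out items of interest
(the exact message formats vary between services and OS releases), eg.:

grep -i login /var/adm/SYSLOG      # login-related messages
grep -i ftpd /var/adm/SYSLOG       # ftp connection records
tail -50 /var/adm/SYSLOG           # the most recent entries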

Vendor Information Updates.

Most UNIX vendors send out periodic information booklets containing indepth articles on
various system administration issues. SGI's bulletin is called Pipeline. Such information guides
are usually supplied as part of a software support contract, though the vendor will often choose
to include copies on the company web site. An admin should read any relevant articles from
these guides - they can often be unexpectedly enlightening.

System Hardware Failures.

When problems occur on a system, what might at first appear to be a software problem may in
fact be a hardware fault. Has a disk failed? The fx program can be used to check disk status
(block read tests, disk label checks, etc.)

Has a network cable failed? Are all the cable connections firmly in place in the hub? Has a plug
come loose?

In late 1998, the Ve24 network stopped operating quite unexpectedly one morning. The errors
made it appear that there was a problem with the NFS service or perhaps the main user files disk
connected to the server; in fact, the fault lay with the Ve24 hub.

The online guides have a great deal of advice on how to spot possible hardware failures. My
advice is to check basic things first and move onto the more complex possibilities later. In the
above example, I wasted a great deal of time investigating whether the NFS service was
responsible, or the external user files disk, when in fact I should have checked the hub
connections first. As it happens, the loose connection was such that the hub indicator light was
on even though the connection was not fully working (thus, a visual check would not have
revealed the problem) - perhaps the fault was caused by a single loose wire out of the 8 running
through the cable, or even an internal fault in the hub (more likely). Either way, the hub was
eventually replaced.

Other things that can go wrong include memory faults. Most memory errors are not hardware
errors though, eg. applications with bugs can cause errors by trying to access some protected area
of memory.

Hardware memory errors will show up in the system log file /var/adm/SYSLOG as messages
saying something like 'Hardware ECC Memory Error in SIMM slot 4'. By swapping the memory
SIMMs around between the slots, one can identify which SIMM is definitely at fault (assuming
there is only one causing the problem).

The most common hardware component to go wrong on a system, even a non-PC system, is the
disk drive. When configuring systems, or carrying out upgrades/expansions, it is wise to stick
with models recommended by the source vendor concerned, eg. SGI always uses high-quality
Seagate, IBM or Quantum disk drives for their systems; thus, using (for example) a Seagate
drive is a good way to ensure a high degree of reliability and compatibility with the system
concerned.

Sometimes an admin can be the cause of the problem. For example, when swapping disks around
or performing disk tasks such as disk cloning, it is possible to incorrectly set the SCSI ID of the
disk. SGI systems expect the system disk to be on SCSI ID 1 (though this is a configurable
setting); if the internal disk is on the wrong SCSI ID, then under certain circumstances it can
appear to the system as if there are multiple disks present, one on each possible ID. If hardware
errors are observed on bootup (the system diagnostics checks), then the first thing to do is to
reboot and enter the low-level 'Command Monitor' (an equivalent access method will exist for all
UNIX systems): the Command Monitor has a small set of commands available, some of which
can be used to perform system status checks, eg. the hinv command. For the problem described
above, hinv would show multiple instances of the same disk on all SCSI IDs from 1 to 7 - the
solution is to power down and check the SCSI jumpers carefully.

Other problems can occasionally be internal, eg. a buildup of dust blocking air vents (leading to
overheating), or a fan failure, followed by overheating and eventually an automatic system
shutdown (most UNIX systems' power supplies include circuitry to monitor system temperature,
automatically shutting down if the system gets too hot). This leads on to questions of system
maintenance which will be dealt with on Day 3.

After disk failures, the next most common failure is the power supply. This can sometimes be
difficult to spot because a failure overnight, or when nobody is around, can mean the system
shuts down, cools off and is thus rebootable again the next morning; all the admin sees is a
system that is off for no readily apparent reason. Possible solutions include moving the system
somewhere close at hand so that it can be monitored, or writing a script which tests whether the
system is active every few seconds, logging the time of each successful test (a sketch of such a
script is given below). If the system goes down, the admin is notified in some way (eg. an audio
file is played) and can then quickly check the machine: if the power supply area feels overly hot,
then that is the likely suspect, especially if toggling the mains switch doesn't turn the system
back on (power supplies often have circuitry which will not allow power-on while the unit is
still too hot). If the admin wasn't available at the time, the logged results will at least show
when the system failed.
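
A minimal sketch of such a test script is shown below. It assumes the machine being watched is
the server yoda, that ping accepts a '-c' packet-count option, and that some suitable alert
command is available; the log file name is just an example.

#!/bin/sh
# Simple availability watchdog: log successful tests, flag failures.
while true
do
    if ping -c 1 yoda > /dev/null 2>&1; then
        echo "`date`: yoda responding" >> /var/tmp/yoda.watch
    else
        echo "`date`: yoda NOT RESPONDING" >> /var/tmp/yoda.watch
        # insert an alert action here, eg. play an audio file or send an email
    fi
    sleep 10
done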

All SGIs (and UNIX systems in general) include a suite of hardware and software diagnostics
tests as part of the OS. IRIX contains a set of tests for checking the mouse, keyboard, monitor,
audio ports, digital camera and other basic hardware features.

Thankfully, for just about any hardware failure, hardware support contracts cover repairs and/or
replacements very effectively for UNIX systems. It's worth noting that although the Computing
Department has a 5-day support contract with SGI, all problems I've encountered so far have
been dealt with either on the same day or early the next morning by a visiting support engineer
(ie. much earlier than the support contract required). Since November 1995, when I took charge
of the Ve24 network, the hardware problems I've encountered have been:

 2 failed disks
 1 replaced power supply
 2 failed memory SIMMs (1 failed SIMM from two different machines)
 1 replaced keyboard (user damage)
 1 failed monitor
 1 suspect motherboard (replaced just in case)
 1 suspect video card (replaced just in case)
 1 problematic 3rd-party disk (incorrect firmware, returned to supplier and corrected with up-to-
date firmware; now operating ok)
 1 suspect hub (unknown problem; replaced just in case)

Given that the atmosphere in Ve24 is unfiltered, often humid air, and the fact that many of the
components in the Indys in Ve24 have been repeatedly swapped around to create different
configurations at different times, such a small number of failures is an excellent record after nearly 4
years of use.

It is likely that dirty air (dust, humidity, corrosive gases) was largely responsible for the disk,
power supply and memory failures - perhaps some of the others too. A build up of dust can
combine with airborne moisture to produce corrosive chemicals which can short-circuit delicate
components.

To put the above list another way: 14 out of the 18 Indys have been running non-stop for 3.5
years without a single hardware failure of any kind, despite being housed in an area without
filtered air or temperature control. This is very impressive and is quite typical of UNIX hardware
platforms.
Installing systems with proper air filtering and temperature control can be costly, but the benefit
may be a much reduced chance of hardware failure - this could be important for sites with many
more systems and a greater level of overall system usage (eg. 9 to 5 for most machines).

Some companies go to great lengths to minimise the possibility of hardware failure. For
example, MPC [2] has an Origin200 render farm for rendering movie animation frames. The
render farm consists of 50 Origin200 servers, each with 2 R10000 CPUs, ie. 100 CPUs in total.
The system is housed in a dedicated room with properly filtered air and temperature control.
Almost certainly as a result of this high-quality setup, MPC has never had a single hardware
failure of any kind in nearly 3 years of operation. Further, MPC has not experienced a single OS
failure over the same period either, even though the system operates 24 hours a day.

This kind of setup is common amongst companies which have time-critical tasks to perform, eg.
oil companies with computational models that can take six months to complete - such
organisations cannot afford to have failures (the problem would likely have to be restarted from
scratch, or at least delayed), so it's worth spending money on air filters, etc.

If one does not have filtered air, then the very least one should do is keep the systems clean
inside and out, performing system cleaning on a regular basis.

At present, my current policy is to thoroughly clean the Indys twice a year: every machine is
stripped right down to the bare chassis; every component is individually cleaned with appropriate
cleaning solutions, cloths, air-dusters, etc. (this includes removing every single key from all the
keyboards and mass-cleaning them with a bucket of hot water and detergent! And cleaning the
keyboard bases inside and out too). Aside from these major bi-annual cleanings, simple regular
cleaning is performed on a weekly or monthly basis: removing dirt from the mice (inside
especially), screen, chassis/monitor surface, cables and so on; cleaning the desks; opening each
system and blowing away internal dust using a can of compressed filtered air, etc.

Without a doubt, this process greatly lengthens the life-span of the systems' hardware
components, and users benefit too from a cleaner working environment - many new students
each autumn often think the machines must be new because they look so clean.

Hardware failures do and will occur on any system whether it's a UNIX platform or not. An
admin can use information from online sources, combined with a knowledge of relevant system
test tools such as fx and ping, to determine the nature of hardware failures and take corrective
action (contacting vendor support if necessary); such a strategy may include setting up
automated hardware tests using regularly-executed scripts.

Another obvious source of extensive information about any UNIX platform is the Internet.
Hundreds of existing users, including company employees, write web pages [3] or USENET
posts describing their admin experiences and how to deal with typical problems.
Suspicious/Illegal Activity.

Users inevitably get up to mischief on occasion, or external agencies may attempt to hack the
system. Types of activity could include:

 users downloading illegal or prohibited material, either with respect to national/local laws or
internal company policy,
 accessing of prohibited sites, eg. warez software piracy sites,
 mail spamming and other abuses of Internet services,
 attacks by hackers,
 misuse/abuse of system services internally.

There are other possibilities, but these are the main areas. This lecture is an introduction to security and
monitoring issues. A more in-depth discussion is given in the last lecture.

As an admin who is given the task of supposedly preventing and/or detecting illegal activities,
the first thing which comes to mind is the use of various file-searching methods to locate suspect
material, eg. searching every user's netscape bookmarks file for particular keywords. However,
this approach can pose legal problems.

Some countries have data protection and/or privacy laws [4] which may prohibit one from
arbitrarily searching users' files. Searches of this type are the equivalent of a police force tapping
all the phones in an entire street and recording every single conversation just on the off-chance
that they might record something interesting; such methods are sometimes referred to as 'fishing'
and could be against the law. So, for example, the following command might be illegal:

find /home/students -name "*" -print > list


grep sex list > suspected
grep warez list >> suspected
grep xxx list >> suspected
grep pics list >> suspected
grep mpg list >> suspected
grep jpg list >> suspected
grep gif list >> suspected
grep sites list >> suspected

As a means of finding possible policy violations, the above script would be very effective, but
it's definitely a form of fishing (even the very first line).

Now consider the following:

find /home/students -name "bookmarks.html" -print -exec grep playboy {} \;

This command will effectively locate any Netscape bookmarks file which contains a possible
link to the PlayBoy web site. Such a command is clearly looking for fairly specific content in a
very specific file in each user's .netscape directory; further, it is probably accessing a user's
account space without her or his permission (this opens the debate on whether 'root' even needs a
user's permission since root actually owns all files anyway - more on this below).

The whole topic of computer file access is a grey area. For example, might the following
command also be illegal?

find . -name "*.jpg" -print > results && grep sex results

A user's lawyer could argue that it's clearly looking for any JPEG image file that is likely to be of
an explicit nature. On the other hand, an admin's lawyer could claim the search was actually
looking for any images relating to tourism in Sussex county, or musical sextets, or adverts for
local unisex hair salons, and just accidentally happened to be in a directory above /home/students
when the command was executed (the find would eventually reach /home/students). Obviously a
setting for a messy court-room battle.

But even ignoring actions taken by an admin using commands like find, what about data
backups? An extremely common practice on any kind of computer system is to backup user data
to a media such as DAT on a regular basis - but isn't this accessing user files without permission?
But hang on: on UNIX systems, the root user is effectively the absolute owner of any file. For
example, suppose a file called 'database' in /tmp, owned by an ordinary user, contained some
confidential data; if the admin (logged in as root) then did this:

cat /tmp/database

the contents of the database file would indeed be displayed.

Thus, since root basically owns all files anyway by default, surely a backup procedure is just the
root user archiving files it already owns? If so, does one instead have to create some abstract
concept of ownership in order to offer users a concrete concept of what data privacy actually is?
Who decides? Nations which run their legal systems using case-law will find these issues very
difficult to clarify, eg. the UK's Data Protection Act is known to be 'weak'.

Until such arguments are settled and better laws created, it is best for an admin to err on the side
of caution. For example, if an admin wishes to have some kind of regular search conducted, the
existence of the search should be included as part of stated company policy, and enshrined into
any legal documents which users must sign before they begin using the system, ie. if a user signs
the policy document, then the user has agreed to the actions described in that document. Even
then, such clauses may not be legally binding. An admin could also set up some form of login
script which would require users to agree to a system usage policy before they were fully logged
in.

However, these problems won't go away, partly because of the specifics of how some modern
Internet services such as the web are implemented. For example, a user could access a site which
automatically forces the pop-up of a Netscape window which is directed to access a prohibited
site; inline images from the new site will then be present in the user's Netscape cache directory in
their home account area even though they haven't specifically tried to download anything. Are
they legally liable? Do such files even count as personal data? And if the site has its own proxy
server, then the images will also be in the server's proxy cache - are those responsible for the
server also liable? Nobody knows. Legal arguments on the nature of cache directories and other
file system details have not yet been resolved. Clearly, there is a limit to how far one can go in
terms of prevention simply because of the way computing technologies work.

Thus, the best thing to do is to focus efforts on information that does not reside inside user
accounts. The most obvious place is the system log file, /var/adm/SYSLOG. This file will show
all the ftp and telnet sites which users have been accessing; if one of these sites is a prohibited
place, then that is sufficient evidence to take action.

The next most useful data resource to keep an eye on is the web server log(s). The web logs
record every single access by all users to the WWW. Users have their own record of their
accesses in the form of a history file, hidden inside the .netscape (or other browser) directory
in their home account; but the web logs are outside their accounts and so can probably be freely
examined, searched, processed, etc. by an admin without having to worry about legal
issues. Even here though, there may be legal issues, eg. log data often includes user IDs which
can be used to identify specific individuals and their actions - does a user have a legal right to
have such data kept private? Only a professional lawyer in the field would know the correct
answer.

Note: the amount of detail placed into a log file can be changed to suit the type of logging
required. If a service offers different levels of logging, then the appropriate online documentation
will explain how to alter the settings.

Blocking Sites.

If an admin does not want users to be able to access a particular site, then that site can be added
to a list of 'blocked' sites by using the appropriate option in the web server software concerned,
eg. Netscape Enterprise Server, CERN web server, Apache web server, etc. Even this may pose
legal problems if a country has any form of freedom-of-speech legislation though (non-existent
in the UK at present, so blocking sites should be legally OK in the UK).

However, blocking sites can become somewhat cumbersome because there are thousands of web
sites which an admin could theoretically have to deal with - once the list becomes quite large,
web server performance decreases as every access has to have its target URL checked against the
list of banned sites. So, if an admin does choose to use such a policy, it is best to only add sites
when necessary, and to construct some kind of checking system so that if no attempt is made to
access a blocked site after a duration of, say, two weeks (whatever), then that site is removed
from the list of blocked sites. In the long term, such a policy should help to keep the list to a
reasonably manageable size. Even so, just the act of checking the web logs and adding sites to
the list could become a costly time-consuming process (time = money = wages).

One can also use packet filtering systems such as hardware routers or software daemons like
ipfilterd which can accept, reject, or reject-and-log incoming packets based on source/destination
IP address, host name, network interface, port number, or any combination of these. Note that
daemons such as ipfilterd may require the presence of a fast CPU if the overhead from a busy
site is to be properly supported. The ipfilterd system is discussed in detail on Day 3.

System Temporary Directories.

An admin should keep a regular eye on the contents of temporary directories on all systems, ie.
/tmp and /var/tmp. Users may download material and leave the material lying around for anyone
to see. Thus, a suspicious file can theoretically be traced to its owner via the user ID and group
ID of the file. I say theoretically because, as explained elsewhere, it is possible for a user X to
download a file (eg. by ftp so as to avoid the web logs, or by telnet using a shell on a remote
system) and then 'hand over' ownership of the file to someone else (say user Y) using the chgrp
and chown commands, making it look as though a different user is responsible for the file. In that
sense, files found outside a user's home directory could not normally be used as evidence, though
they would at least alert the admin to the fact that suspect activities may be occurring, permitting
a refocusing of monitoring efforts, etc.
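
For example, one might periodically list recently-modified files in the temporary directories
together with their owners, or search for files belonging to a particular user (the user name
here is purely illustrative):

ls -lt /tmp /var/tmp                            # newest files first, with owner and group shown
find /tmp /var/tmp -user fred -mtime -1 -print  # files owned by 'fred' modified in the last day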

However, one way in which it could be possible to reinforce such evidence is by being able to
show that user Y was not logged onto the system at the time when the file in question was
created (this information can be gleaned from the system's own local /var/adm/SYSLOG file,
and the file's creation time and date).

Unfortunately, both users could have been logged onto the same system at the time of the file's
creation. Thus, though a possibility, the extra information may not help. Except in one case:
video evidence. If one can show by security camera recordings that user X did indeed login 'on
console' (ie. at the actual physical keyboard) then that can be tied in with SYSLOG data plus file
creation times, irrespective of what user Y was doing at the time.

Certainly, if someone wished to frame a user, it would not be difficult to cause a considerable
amount of trouble for that user with just a little thought on how to access files, where to put
them, changing ownership, etc.

In reality, many admins probably just do what they like in terms of searching for files, examining
users' areas, etc. This is because there is usually no way to prove someone has attempted to
search a particular part of the file system - by default, UNIX does not keep any permanent record
of executed commands (unless process accounting has been enabled, or one counts the shell history
files kept in users' own accounts).

Ironically, the IRIX GUI environment does keep a record of any file-related actions taken with
the GUI system (icons, file manager windows, directory views, etc.) but the log file with this
information is kept inside the user's .desktop- directory and thus may be legally out of bounds.

File Access Permissions.

Recall the concept of file access permissions for files. If a user has a directory or file with its
permissions set so that another ordinary user can read it (ie. not just root, who can access
anything by default anyway), does the fact that the file is globally readable mean the user has by
default given permission for anyone else to read the file? If one says no, then that would mean it
is illegal to read any user's own public_html web area! If one says yes, and a legal body
confirmed this for the admin, then that would at least enable the admin to examine any directory
or file that had the groups and others permissions set to a minimum of read-only (read and
executable for directories).

The find command has an option called -perm which allows one to search for files with
permissions matching a given mode. If nothing else, such an ability would catch out careless
users since most users are not aware that their account has hidden directories such as .netscape.
An admin ought to make users aware of security issues beforehand though.
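
For example, the following sketch lists ordinary files under /home/students which are readable
by 'others', subject of course to the legal caveats discussed above (-perm is used here in its
octal form, where a leading '-' means 'all of these permission bits are set'):

find /home/students -type f -perm -004 -print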

Backup Media.

Can an admin search the data residing on backup media? (DAT, CDR, ZIP, DLT, etc.) After all,
the data is no longer inside the normal home account area. In my opinion yes, since root owns all
files anyway (though I've never done such a search), but others might disagree. For that matter,
consider the tar commands commonly used to perform backups: a full backup accesses every file
on the file system by default (ie. including all users' files, whatever the permissions may be), so
are backups a problem area?

Yet again, one can easily see how legal grey areas emerge concerning the use of computing
technologies.

Conclusion.

Until the law is made clearer and brought up-to-date (unlikely) the best an admin can do is
consult any available internal legal team, deciding policy based on any advice given.

References:

1. "Ethernet Collisions on Silicon Graphics Systems", SGI Pipeline magazine (support info bulletin),
July/August 1998 (NB: URL not accessible to those without a software support contract):
2. http://support.sgi.com/surfzone/content/pipeline/html/19980404EthernetC
ollisions.html

3. The Moving Picture Company, Soho Square, London. Responsible for some or all of the special
effects in Daylight, The English Patient, Goldeneye, The Borrowers, and many other feature
films, music videos, adverts, etc. Hardware used: several dozen Octane workstations, many
Onyx2 graphics supercomputers, a 6.4TB Ampex disk rack with real-time Philips cinescan film-to-
digital-video converter (cinema resolution 70mm uncompressed video converter; 250K's worth),
Challenge L / Discrete Logic video server, a number of O2s, various older SGI models such as
Onyx RealityEngine2, Indigo2, Personal IRIS, etc., some high-end Apple Macs and a great deal of
dedicated video editing systems and VCRs, supported by a multi-gigabit network. I saw one NT
system which the admin said nobody used.

4. The SGI Tech/Advice Centre: Holland #1:


http://www.futuretech.vuurwerk.nl/
5. Worldwide Mirror Sites: Holland #2: http://sgi.webguide.nl/
6. Holland #3: http://sgi.webcity.nl/
7. USA: http://sgi.cartsys.net/
8. Canada: http://sgi-tech.unixology.com/
9.
10. The Data Protection Act 1984, 1998. Her Majesty's Stationary Office (HMSO):
http://www.hmso.gov.uk/acts/acts1984/1984035.htm
Detailed Notes for Day 2 (Part 4)
UNIX Fundamentals: Further Shell scripts.

for/do Loops.

The rebootlab script shown earlier could be rewritten using a for/do loop, a control structure
which allows one to execute a series of commands many times.

Rewriting the rebootlab script using a for/do loop doesn't make much difference to the
complexity of this particular script, but using more sophisticated shell code is worthwhile when
one is dealing with a large number of systems. Other benefits arise too; a suitable summary is
given at the end of this discussion.

The new version could be rewritten like this:

#!/bin/sh
for machine in akira ash cameron chan conan gibson indiana leon merlin \
               nikita ridley sevrin solo stanley warlock wolfen woo
do
    echo $machine
    rsh $machine init 6&
done

The '\' symbol is used to continue a line onto the next line. The 'echo' line displays a comment as
each machine is dealt with.

This version is certainly shorter, but whether or not it's easier to use in terms of having to modify
the list of host names is open to argument, as opposed to merely commenting out the relevant
lines in the original version. Even so, if one happened to be writing a script that was fairly
lengthy, eg. 20 commands to run on every system, then the above format is obviously much
more efficient.

Similarly, the remountmapleson script could be rewritten as follows:

#!/bin/sh
for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
               nikita ridley sevrin solo stanley warlock wolfen woo
do
    echo $machine
    rsh $machine "umount /mapleson && mount /mapleson"
done

Note that in this particular case, the command to be executed must be enclosed within quotes:
without them, the local shell would interpret the '&&' itself and run the mount command locally,
instead of passing the whole compound command to the remote system via rsh. Quotes like this are
normally not needed; it's only because rsh is being used in this example that they are required.

Also note that the '&' symbol is not used this time. This is because the rebootlab procedure is
asynchronous, whereas I want the remountdir script to output its messages just one action at a
time.

In other words, for the rebootlab script, I don't care in what order the machines reboot, so each
rsh call is executed as a background process on the remote system, thus the rebootlab script
doesn't wait for each rsh call to return before progressing.

By contrast, the lack of a '&' symbol in remountdir's rsh command means the rsh call must finish
before the script can continue. As a result, if an unexpected problem occurs, any error message
will be easily noticed just by watching the output as it appears.

Sometimes a little forward thinking can be beneficial; suppose one might have reason to want to
do exactly the same action on some other NFS-mounted area, eg. /home, or /var/mail, then the
script could be modified to include the target directory as a single argument supplied on the
command line. The new script looks like this:

#!/bin/sh
for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
               nikita ridley sevrin solo stanley warlock wolfen woo
do
    echo $machine
    rsh $machine "umount $1 && mount $1"
done

The script would probably be renamed to remountdir (whatever) and run with:

remountdir /mapleson

or perhaps:

remountdir /home

if/then/else constructs.

But wait a minute, couldn't one use the whole concept of arguments to solve the problem of
communicating to the script exactly which hosts to deal with? Well, a rather useful feature of any
program is that it will always return a result of some kind. Whatever the output actually is, a
command always returns a result which is defined to be true or false in some way.

Consider the following command:

grep target database


If grep doesn't find 'target' in the file 'database', then no output is given. However, as a
program that has been called, grep also passes back a non-zero exit status, which the shell
treats as 'FALSE' - the fact that grep does this is simply invisible during normal usage of the
command.

One can exploit this behaviour to create a much more elegant script for the remountdir
command. Firstly, imagine that I as an admin keep a list of currently active hosts in a file called
'live' (in my case, I'd probably keep this file in /mapleson/Admin/Machines). So, at the present
time, the file would contain the following:

yoda
akira
ash
cameron
chan
conan
gibson
indiana
leon
merlin
nikita
ridley
sevrin
solo
stanley
warlock
wolfen
woo

ie. the host called spock is not listed.

The remountdir script can now be rewritten using an if/then construct:

#!/bin/sh
for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
               spock nikita ridley sevrin solo stanley warlock wolfen woo
do
    echo Checking $machine...

    if grep $machine /mapleson/Admin/Machines/live; then
        echo Remounting $1 on $machine...
        rsh $machine "umount $1 && mount $1"
    fi
done

This time, the complete list of hosts is always used in the script, ie. once the script is rewritten, it
doesn't need to be altered again. For each machine, the grep command searches the 'live' file for
the target name; if it finds the name, then the result is some output to the screen from grep, but
also a 'TRUE' condition, so the echo and rsh commands are executed. If grep doesn't find the
target host name in the live file then that host is ignored.
The result is a much more elegant and powerful script. For example, suppose some generous
agency decided to give the department a large amount of money for an extra 20 systems: the only
changes required are to add the names of the new hosts to remountdir's initial list, and to add the
names of any extra active hosts to the live file. Along similar lines, when spock finally is
returned to the lab, its name would be added to the live file, causing remountdir to deal with it in
the future.

Even better, each system could be set up so that, as long as it is active, it tells the server
every so often that all is well (a simple script could achieve this - see the sketch below). The
server brings the results together on a regular basis, constantly keeping the live file
up-to-date; of course, the server includes its own name in the live file. A typical interval
would be to update the live file once every minute. If an extra program was written which used
the contents of the live file to create some kind of visual display, then an admin would know
within a minute or so of a system going down.
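
A minimal sketch of such a mechanism is shown below; the directory and file names are
illustrative, and it assumes the server NFS-exports the Machines directory to every host, with
cron running each script once a minute. The 'timestamp' reference file would need to be created
by hand before the first run.

#!/bin/sh
# On each client (run from cron every minute): leave a freshly-touched marker file.
touch /mapleson/Admin/Machines/alive/`uname -n`

#!/bin/sh
# On the server (run from cron every minute): rebuild 'live' from any markers
# touched since the previous run, then update the reference timestamp.
cd /mapleson/Admin/Machines
find alive -type f -newer timestamp -print | sed 's:^alive/::' > live
touch timestamp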

Naturally, commercial companies write professional packages which offer these kinds of
services and more, with full GUI-based monitoring, but at least it is possible for an admin to
create home-made scripts which would do the job just as well.

/dev/null.

There is still an annoying feature of the script though: if grep finds a target name in the live file,
the output from grep is visible on the screen which we don't really want to see. Plus, the umount
command will return a message if /mapleson wasn't mounted anyway. These messages clutter up
the main 'trace' messages.

To hide the messages, one of UNIX's special device files can be used. Amongst the various
device files in the /dev directory, one particularly interesting file is called /dev/null. This device
is known as a 'special' file; any data sent to the device is discarded, and the device always returns
zero bytes. Conceptually, /dev/null can be regarded as an infinite sponge - anything sent to it is
just ignored. Thus, for dealing with the unwanted grep output, one can simply redirect grep's
output to /dev/null.

The vast majority of system script files use this technique, often many times even in a single
script.
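
Incidentally, error messages (such as the one umount gives when a file system is not actually
mounted) are sent to the separate 'standard error' stream, so to hide those as well one would
redirect both streams, eg. in the Bourne shell:

rsh $machine "umount $1 && mount $1" > /dev/null 2>&1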

Note: descriptions of all the special device files /dev are given in Appendix C of the online book,
"IRIX Admin: System Configuration and Operation".

Since grep returns nothing if a host name is not in the live file, a further enhancement is to
include an 'else' clause as part of the if construct so that a separate message is given for hosts that
are currently not active. Now the final version of the script looks like this:

#!/bin/sh
for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
               spock nikita ridley sevrin solo stanley warlock wolfen woo
do
    echo Checking $machine...

    if grep $machine /mapleson/Admin/Machines/live > /dev/null; then
        echo Remounting $1 on $machine...
        rsh $machine "umount $1 && mount $1"
    else
        echo $machine is not active.
    fi
done

Running the above script with 'remountdir /mapleson' gives the following output:

Checking yoda...
Remounting /mapleson on yoda...
Checking akira...
Remounting /mapleson on akira...
Checking ash...
Remounting /mapleson on ash...
Checking cameron...
Remounting /mapleson on cameron...
Checking chan...
Remounting /mapleson on chan...
Checking conan...
Remounting /mapleson on conan...
Checking gibson...
Remounting /mapleson on gibson...
Checking indiana...
Remounting /mapleson on indiana...
Checking leon...
Remounting /mapleson on leon...
Checking merlin...
Remounting /mapleson on merlin...
Checking spock...
spock is not active.
Checking nikita...
Remounting /mapleson on nikita...
Checking ridley...
Remounting /mapleson on ridley...
Checking sevrin...
Remounting /mapleson on sevrin...
Checking solo...
Remounting /mapleson on solo...
Checking stanley...
Remounting /mapleson on stanley...
Checking warlock...
Remounting /mapleson on warlock...
Checking wolfen...
Remounting /mapleson on wolfen...
Checking woo...
Remounting /mapleson on woo...
Notice the output from grep is not shown, and the different response given when the script deals
with the host called spock.

Scripts such as this typically take around a minute or so to execute, depending on how quickly
each host responds.

The rebootlab script can also be rewritten along similar lines to take advantage of the new 'live'
file mechanism, but with an extra if/then structure to exclude yoda (the rebootlab script is only
meant to reboot the lab machines, not the server). The extra if/then construct uses the 'test'
command to compare the current target host name with the word 'yoda' - the rsh command is
only executed if the names do not match; otherwise, a message is given stating that yoda has
been excluded. Here is the new rebootlab script:

#!/bin/sh
for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
               spock nikita ridley sevrin solo stanley warlock wolfen woo
do
    echo Checking $machine...

    if grep $machine /mapleson/Admin/Machines/live > /dev/null; then
        if test $machine != yoda; then
            echo Rebooting $machine...
            rsh $machine init 6&
        else
            echo Yoda excluded.
        fi
    else
        echo $machine is not active.
    fi
done

Of course, an alternative way would be to simply exclude 'yoda' from the opening 'for' line.
However, one might prefer to always use the same host name list in order to minimise the
amount of customisation between scripts, ie. to create a new script just copy an existing one and
modify the content after the for/do structure.

Notes:

 All standard shell commands and other system commands, programs, etc. can be used in
shell scripts, eg. one could use 'cd' to change the current working directory between
commands.
 An easy way to ensure that a particular command is used with the default or specifically
desired behaviour is to reference the command using an absolute path description, eg.
/bin/rm instead of just rm. This method is frequently found in system shell scripts. It also
ensures that the scripts are not confused by any aliases which may be present in the
executing shell.
 Instead of including a raw list of hosts at the beginning of the script, one could use other
commands such as grep, awk, sed, perl and cut to obtain the relevant host names from the
/etc/hosts file, one at a time (a sketch of this idea is given below). There are many possibilities.
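
For example, the following fragment shows one possible way of driving such a loop from
/etc/hosts instead of a fixed list; it is an untested sketch, and the grep patterns used to skip
comment lines and the loopback entry would need tailoring to the real file:

#!/bin/sh
# Take host names from /etc/hosts rather than a hard-coded list.
# Assumes one entry per line, with the primary host name in the second field.
for machine in `grep -v '^#' /etc/hosts | grep -v localhost | awk '{print $2}'`
do
  echo Checking $machine...
  # ...remainder as in the scripts above...
done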

Typically, as an admin learns of the existence of new commands, better ways of performing tasks
come to mind. This is perhaps one reason why UNIX is such a well-understood OS: the process
of improving on what has been done before has been going on for 30 years, largely because
much of the way UNIX works can be examined by the user (system script files, configuration
files, etc.) One can imagine the hive of activity at BTL and Berkeley in the early days, with
suggestions for improvements, additions, etc. pouring in from enthusiastic testers and volunteers.
Today, after so much evolution, most basic system scripts and other files are probably as good as
they're going to be, so efforts now focus on other aspects such as system service improvements,
new technology (eg. Internet developments, NSD), security enhancements, etc. Linux evolved in
a very similar way.

I learned shell programming techniques mostly by looking at existing system scripts and reading
the relevant manual pages. An admin's shell programming experience usually begins with simple
sequential scripts that do not include if/then structures, for loops, etc. Later on, a desire to be
more efficient gives one cause to learn new techniques, rewriting earlier work as better ideas are
formed.

Simple scripts can be used to perform a wide variety of tasks, and one doesn't have to make them
sophisticated or clever to get the job done - but with some insightful design, and a little
knowledge of how the more useful aspects of UNIX work, one can create extremely flexible
scripts that can include error checking, control constructs, progress messages, etc. written in a
way which does not require them to be modified, ie. external inputs, such as system data files, can
be used to control script behaviour, and other programs and scripts can be used to extract
information from other parts of the system, eg. standard configuration files.

A knowledge of the C programming language is clearly helpful in writing shell scripts since the
syntax for shell programming is so similar. An excellent book for this is "C Programming in a
UNIX Environment", by Judy Kay & Bob Kummerfeld (Addison Wesley Publishing, 1989.
ISBN: 0 201 12912 4).

Other Useful Commands.

A command found in many of the numerous scripts used by any UNIX OS is 'test'; typically used
to evaluate logical expressions within 'if' clauses, test can determine the existence of files, status
of access permissions, type of file (eg. ordinary file, directory, symbolic link, pipe, etc.), whether
or not a file is empty (zero size), compare strings and integers, and other possibilities. See the
test man page for full details.
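
A few illustrative forms (the paths shown are arbitrary and this is far from an exhaustive list):

if test -d /var/tmp; then echo "/var/tmp is a directory"; fi

if test -s /var/adm/SYSLOG; then echo "SYSLOG exists and is not empty"; fi

if test $# -lt 1; then
  echo "Usage: $0 <directory>"
  exit 1
fi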

For example, the test command could be used to include an error check in the rebootlab script, to
ascertain whether the live file is accessible:

#!/bin/sh

if test -r /mapleson/Admin/Machines/live; then
  for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
                 spock nikita ridley sevrin solo stanley warlock wolfen woo
  do
    echo Checking $machine...

    if grep $machine /mapleson/Admin/Machines/live > /dev/null; then
      if test $machine != yoda; then
        echo Rebooting $machine...
        rsh $machine init 6 &
      else
        echo Yoda excluded.
      fi
    else
      echo $machine is not active.
    fi
  done
else
  echo Error: could not access live file, or file is not readable.
fi

NOTE: Given that 'test' is a system command...

% which test
/sbin/test

...any user who creates a program called test, or an admin who writes a script called test, will be
unable to execute the file unless one of the following is done:

 Use a complete pathname for the file, eg. /home/students/cmpdw/test
 Insert './' before the file name
 Alter the path definition ($PATH) so that the current directory is searched before /sbin
(dangerous! The root user should definitely not do this).

In my early days of learning C, I once worked on a C program whose source file I'd called
simply test.c - it took me an hour to realise why nothing happened when I ran the program
(obviously, I was actually running the system command 'test', which does nothing when given no
arguments except return an invisible 'false' exit status).

Problem Question 1.

Write a script which will locate all .capture.mv.* directories under /home and remove them
safely. You will not be expected to test this for real, but feel free to create 'mini' test directories if
required by using mkdir.

Modify the script so that it searches a directory supplied as a single argument ($1).

Relevant commands: find, rm

Tips:

 Research the other possible options for rm which might be useful.
 Don't use your home directory to test out ideas. Use /tmp or /var/tmp.

Problem Question 2.

This is quite a complicated question. Don't feel you ought to be able to come up with an answer
after just one hour.

I want to be able to keep an eye on the amount of free disk space on all the lab machines. How
could this be done?

If a machine is running out of space, I want to be able to remove particular files which I know
can be erased without fear of adverse side effects, including:

 Unwanted user files left in /tmp and /var/tmp, ie. files such as movie files, image files,
sound files, but in general any file that isn't owned by root.
 System crash report files left in /var/adm/crash, in the form of unix.K and
vmcore.K.comp, where K is some digit.
 Unwanted old system log information in the file /var/adm/SYSLOG. Normally, the file is
moved to oSYSLOG minus the last 10 or 20 lines, and a new empty SYSLOG created
containing the aforementioned most recent 10 or 20 lines.

a. Write a script which will probe each system for information, showing disk space usage.

b. Modify the script (if necessary) so that it only reports data for the local system disk.

c. Add a means for saving the output to some sort of results file or files.

d. Add extra features to perform space-saving operations such as those described above.

Advanced:

e. Modify the script so that files not owned by root are only removed if the relevant user is not
logged onto the target system.

Relevant commands: grep, df, find, rm, tail, cd, etc.


UNIX Fundamentals: Application Development Tools.

A wide variety of commands, programs, tools and applications exist for application development
work on UNIX systems, just as for any system. Some come supplied with a UNIX OS as
standard, some are free or shareware, while others are commercial packages.

An admin who has to manage a system which offers these services needs to be aware of their
existence because there are implications for system administration, especially with respect to
installed software.

This section does not explain how to use these tools (even though an admin would probably find
many of them useful for writing scripts, etc.) The focus here is on explaining what tools are
available and may exist on a system, where they are usually located (or should be installed if an
admin has to install non-standard tools), and how they might affect administration tasks and/or
system policy.

There tend to be several types of software tools:

1. Software executed usually via command line and written using simple editors, eg. basic
compilers such as cc, development systems such as the Sun JDK for Java.

Libraries for application development, eg. OpenGL, X11, Motif, Digital Media Libraries
- such library resources will include example source code and programs, eg. X11 Demo
Programs.

In both cases, online help documents are always included: man pages, online books, hints
& tips, local web pages either in /usr/share or somewhere else such as /usr/local/html.

2. Higher-level toolkits providing an easier way of programming with various libraries, eg.
Open Inventor. These are often just extra library files somewhere in /usr/lib and so don't
involve executables, though example programs may be supplied (eg. SceneViewer,
gview, ivview). Any example programs may be in custom directories, eg. SceneViewer is
in /usr/demos/Inventor, ie. users would have to add this directory to their path in order to
be able to run the program. These kinds of details are in the release notes and online
books. Other example programs may be in /usr/sbin (eg. ivview).
3. GUI-based application development systems for all manner of fields, eg. WorkShop Pro
CASE tools for C, C++, Ada, etc., CosmoWorlds for VRML, CosmoCreate for HTML,
CosmoCode for Java, RapidApp for rapid prototyping, etc. Executables are usually still
accessible by default (eg. cvd appears to be in /usr/sbin) but the actual programs are
normally stored in application-specific directories, eg. /usr/WorkShop, /usr/CosmoCode,
etc. (/usr/sbin/cvd is a link to /usr/WorkShop/usr/sbin/cvd). Supplied online help
documents are in the usual locations (/usr/share, etc.)
4. Shareware/Freeware programs, eg. GNU, Blender, XV, GIMP, XMorph, BMRT.
Sometimes such software comes supplied in a form that means one can install it
anywhere (eg. Blender) - it's up to the admin to decide where (/usr/local is the usual
place). Other types of software install automatically to a particular location, usually
/usr/freeware or /usr/local (eg. GIMP). If the admin has to decide where to install the
software, it's best to follow accepted conventions, ie. place such software in /usr/local (ie.
executables in /usr/local/bin, libraries in /usr/local/lib, header files in /usr/local/include,
help documents in /usr/local/docs or /usr/local/html, source code in /usr/local/src). In all
cases, it's the admin's responsibility to inform users of any new software, how to use it,
etc.

The key to managing these different types of tools is consistency; don't put one shareware
program in /usr/local and then another in /usr/SomeCustomName. Users looking for
online source code, help docs, etc. will become confused. It also complicates matters
when one considers issues such as library and header file locations for compiling
programs.

Plus, consistency eases other aspects of administration, eg. if one always uses /usr/local
for 3rd-party software, then installing this software onto a system which doesn't yet have
it is a simple matter of copying the entire contents of /usr/local to the target machine.

It's a good idea to talk to users (perhaps by email), ask for feedback on topics such as
how easy it is to use 3rd-party software, are there further programs they'd like to have
installed to make their work easier, etc. For example, a recent new audio standard is
MPEG3 (MP3 for short); unknown to me until recently, there exists a freeware MP3
audio file player for SGIs. Unusually, the program is available off the Net in executable
form as just a single program file. Once I realised that users were trying to play MP3
files, I discovered the existence of the MP3 player and installed it in /usr/local/bin as
'mpg123'.

My personal ethos is that users come first where issues of carrying out their tasks are
concerned. Other areas such as security, etc. are the admin's responsibility though - such
important matters should either be left to the admin or discussed to produce some
statement of company policy, probably via consultation with users, managers, etc. For
everyday topics concerning users getting the most out of the system, it's wise for an
admin to do what she/he can to make users' lives easier.

General Tools (editors).

Developers always use editing programs for their work, eg. xedit, jot, nedit, vi, emacs,
etc. If one is aware that a particular editor is in use, then one should make sure that all
appropriate components of the relevant software are properly installed (including any
necessary patches and bug fixes), and interested users notified of any changes, newly
installed items, etc.

For example, the jot editor is popular with many SGI programmers because it has some
extra features for those programming in C, eg. an 'Electric C Mode'. However, a bug
exists in jot which can cause file corruption if jot is used to access files from an NFS-
mounted directory. Thus, if jot is being used, then one should install the appropriate patch
file to correct the bug, namely Patch 2051 (patch CDs are supplied as part of any
software support contract, but most patches can also be downloaded from SGI's ftp site).

Consider searching the vendor's web site for information about the program in question,
as well as the relevant USENET newsgroups (eg. comp.sys.sgi.apps, comp.sys.sgi.bugs).
It is always best to prevent problems by researching issues beforehand.

Whether or not an admin chooses to 'support' a particular editor is another matter; SGI
has officially switched to recommending the nedit editor for users now, but many still
prefer to use jot simply because of familiarity, eg. all these course notes have been typed
using jot. However, an application may 'depend' on minor programs like jot for particular
functions. Thus, one may have to install programs such as jot anyway in order to support
some other application (dependency).

An example in the case of the Ve24 network is the emacs editing system: I have chosen
not to support emacs because there isn't enough spare disk space available to install
emacs on the Indys which only have 549MB disks. Plus, the emacs editor is not a vendor-
supplied product, so my position is that it poses too many software management issues to
be worth using, ie. unknown bug status, file installation location issues, etc.

Locations: editors are always available by default; executables tend to be in /usr/sbin, so
users need not worry about changing their path definition in order to use them.

All other supplied-as-standard system commands and programs come under the heading
of general tools.

Compilers.

There are many different compilers which might have to be installed on a system, eg.:

Programming Language     Compiler Executable

C                        cc, gcc
C++                      CC
Ada                      ?
Fortran77                f77
Fortran90                f90
Pascal                   ?

Some UNIX vendors supply C and C++ compilers as standard, though licenses may be
required. If there isn't a supplied compiler, but users need one, then an admin can install
the GNU compilers which are free.

An admin must be aware that the release versions of software such as compilers are very
important to the developers who use them (this actually applies to all types of software).
Installing an update to a compiler might mean the libraries have fewer bugs, better
features, new features, etc., but it could also mean that a user's programs no longer
compile with the updated software. Thus, an admin should maintain a suitable
relationship with any users who use compilers and other similar resources, ie. keep each
other informed of relevant issues, changes being made or requested, etc.

Another possibility is to manage the system in such a way as to offer multiple versions of
different software packages, whether that is a compiler suite such as C development kit,
or a GUI-based application such as CosmoWorlds. Multiple versions of low-level tools
(eg. cc and associated libraries, etc.) can be supported by using directories with different
names, or NFS-mounting directories/disks containing software of different versions, and
so on. There are many possibilities - which one to use depends on the size of the network,
ease of management, etc.

Multiple versions of higher-level tools, usually GUI-based development environments
(though possibly ordinary programs like Netscape), can be managed by using 'wrapper'
scripts: the admin sets an environment variable to determine which version of some
software package is to be the default; when a system is booted, the script is executed and
uses the environment variable to mount appropriate directories, execute any necessary
initialisation scripts, background daemons, etc. Thus, when a user logs in, they can use
exactly the same commands but find themselves using a different version of the software.
Even better, an admin can customise the setup so that users themselves can decide what
version they want to use; logging out and then logging back in again would then reset all
necessary settings, path definitions, command aliases, etc.
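
Wrapper scripts can take several forms; as a very rough illustration of the simplest kind (the
package name, version number and install paths here are invented purely for the example, and
this per-command approach is simpler than the boot-time scheme described above):

#!/bin/sh
# Hypothetical wrapper installed as /usr/local/bin/netscape.
# The real binaries are assumed to live in version-specific directories.
NS_VERSION=${NS_VERSION:-4.61}     # default version; users may override NS_VERSION themselves
exec /usr/local/netscape-$NS_VERSION/netscape "$@"

Changing the site-wide default then only requires altering the fallback value in the script, while
individual users can select a different version by setting NS_VERSION in their own shell startup
file.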

MPC operates its network in this way. They use high-end professional film/video
effects/animation tools such as Power Animator, Maya, Flame, etc. for their work, but the
network actually has multiple versions of each software package available so that
animators and artists can use the version they want, eg. for compatibility reasons, or
personal preferences for older vs. newer features. MPC uses wrapper scripts of a type
which require a system reboot to change software version availability, though the systems
have been setup so that a user can initiate the reboot (I suspect the reboot method offers
better reliability).

Locations:

Executables are normally in /usr/sbin, libraries in /usr/lib, header files in /usr/include and
online documents, etc. in /usr/share. Note also that the release notes for such products
contain valuable information for administrators (setup advice) and users alike.

Debuggers.

Debugging programs are usually part of a compilation system, so everything stated above
for compilers applies to debuggers as well. However, it's perfectly possible for a user to
use a debugger that's part of a high-level GUI-based application development toolkit to
debug programs that are created using low-level tools such as jot and xedit. A typical
example on the Ve24 machines is students using the cvd program (from the WorkShop
Pro CASE Tools package) to debug their C programs, even though they don't use
anything else from the comprehensive suite of CASE tools (source code management,
version control, documentation management, rapid prototyping, etc.)

Thus, an admin must again be aware that users may be using features of high-level tools
for specific tasks even though most work is done with low-level tools. Hence, issues
concerning software updates arise, eg. changing software versions without user
consultation could cause problems for existing code.

High-level GUI-based Development Toolkits.

Usually vendor-supplied or commercial in nature, these toolkits include products such as
CosmoCode (Java development with GUI tools), RapidApp, etc. As stated above, there
are issues with respect to not carrying out updates without proper consideration to how
the changes may affect users who use the products, but the ramifications are usually
much less serious than low-level programs or shareware/freeware. This is because the
software supplier will deliberately develop new versions in such a way as to maximise
compatibility with older versions.

High-level toolkits sometimes rely on low-level toolkits (eg. CosmoCode depends on the
Sun JDK software), so an admin should also be aware that installing updates to low-level
toolkits may have implications for their higher-level counterparts.

High-level APIs (Application Programming Interfaces).

This refers to advanced library toolkits such as Open Inventor, ViewKit, etc. The actual
application developments tools used with these types of products are the same, whether
low-level or high-level (eg. cc and commands vs. WorkShop Pro CASE Tools). Thus,
high-level APIs are not executable programs in their own right; they are a suite of easier-
to-use libraries, header files, etc. which users can use to create applications designed at a
higher level of abstraction. Some example high-level APIs and their low-level
counterparts include:

Lower-level        Higher-level

OpenGL             Open Inventor
X11/Motif          ViewKit/Tk
ImageVision        Image Format Library,
                   Electronic Light Table

This is not a complete list. And there may be more than one level of abstraction, eg. the
original VRML file format was derived from a subset of Open Inventor.

Locations: high-level APIs tend to have their files stored in correspondingly named
directories in /usr/lib, /usr/include, etc. For example, Open Inventor files can be found in
/usr/lib/Inventor and /usr/include/Inventor. An exception is support files such as example
models, images, textures, etc. which will always be in /usr/share, but not necessarily in
specifically named locations, eg. the example 3D Inventor models are in
/usr/share/data/models.

Shareware and Freeware Software.

This category of software, eg. the GNU compiler system, is usually installed either in
/usr/local somewhere, or in /usr/freeware. Many shareware/freeware programs don't have
to be installed in one of these two places (Blender is one such example) but it is best to
do so in order to maintain a consistent software management policy.

Since /usr/local and /usr/freeware are not normally referenced by the standard path
definition, an admin must ensure that relevant users are informed of any changes they
may have to make in order to access newly installed software. A typical notification
might be a recommendation of a how a user can modify her/his own .cshrc file so that
shells and other programs know where any new executable files, libraries, online
documents, etc. are stored.
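
For instance, a notification might suggest additions along the following lines (csh syntax, since
the file concerned is .cshrc; the exact directories are only an illustration and depend on where
the software was actually installed):

# Possible additions to a user's ~/.cshrc for locally installed software:
set path = ( $path /usr/local/bin /usr/freeware/bin )

if ( $?MANPATH ) then
    setenv MANPATH ${MANPATH}:/usr/local/man
else
    setenv MANPATH /usr/share/catman:/usr/local/man
endif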

Note that, assuming the presence of Internet access, users can easily download
freeware/shareware on their own and install it in their own directory so that it runs from
their home account area, or they could even install software in globally writeable places
such as /var/tmp. If this happens, it's common for an admin to become annoyed, but the
user has every right to install software in their own account area (unless it's against
company policy, etc.) A better response is to appreciate the user's need for the software
and offer to install it properly so that everyone can use it, unless some other factor is
more important.

Unlike vendor-supplied or commercial applications, newer versions of shareware and
freeware programs can often be radically different from older versions. GIMP is a good
example of this - one version introduced so many changes that it was barely comparable
to an older version. Users who utilise these types of packages might be annoyed if an
update is made without consulting them because:

o it's highly likely their entire working environment may be different in the new
version,
o features of the old version may no longer be available,
o aspects of the new version may be incompatible with the old version,
o etc.

Thus, shareware/freeware programs are a good example of where it might be better for
admins to offer more than one version of a software package, eg. all the files for Blender
V1.57 are stored in /usr/local/blender1.57_SGI_6.2_iris on akira and sevrin. When the
next version comes out (eg. V1.6), the files will be in /usr/local/blender1.6_SGI_6.2_iris
- ie. users can still use the old version if they wish.

Because shareware/free programs tend to be supplied as distinct modules, it's often easier
to support multiple versions of such software compared to vendor-supplied or
commercial packages.

Comments on Software Updates, Version Issues, etc.

Modern UNIX systems usually employ software installation techniques which operate in
such a way as to show any incompatibilities before installation (SGIs certainly operate
this way); the inst program (and thus swmgr too since swmgr is just a GUI interface to
inst) will not allow one to install software if there are conflicts present concerning
software dependency and compatibility. This feature of inst (and swmgr) to monitor
software installation issues applies only to software subsystems that can be installed and
removed using inst/swmgr, ie. those said to be in 'inst' format. Thankfully, large numbers
of freeware programs (eg. GIMP) are supplied in this format and so they can be managed
correctly. Shareware/Freeware programs do not normally offer any means by which one
can detect possible problems before installation or removal, unless the authors have been
kind enough to supply some type of analysis script or program.

Of course, there is nothing to stop an admin using low-level commands such as cp, tar,
mv, etc. to manually install problematic files by copying them from a CD, or another
system, but to do so is highly unwise as it would invalidate the inst database structure
which normally acts as a highly accurate and reliable record of currently installed
software. If an admin must make custom changes, an up-to-date record of these changes
should be maintained.

To observe inst/swmgr in action, either enter 'inst' or 'swmgr' at the command prompt (or
select 'Software Manager' from the Toolchest which runs swmgr). swmgr is the easier to
understand because of its intuitive interface.

Assuming the use of swmgr, once the application window has appeared, click on 'Manage
Installed Software'. swmgr loads the inst database information, reading the installation
history, checking subsystem sizes, calculating dependencies, etc. The inst system is a
very effective and reliable way of managing software.

Most if not all modern UNIX systems will employ a software installation and
management system such as inst, or a GUI-based equivalent.

Summary.

As an administrator, one should not need to know how to use the software products
which users have access to (though it helps in terms of being able to answer simple
questions), but one should:

o be aware of where the relevant files are located,
o understand issues concerning revision control,
o notify users of any steps they must take in order to access new software or
features,
o aid users in being able to use the products efficiently (eg. using /tmp or /var/tmp
for working temporarily with large files or complex tasks),
o have a consistent strategy for managing software products.

These issues become increasingly important as systems become more complex, eg.
multiple vendor platforms, hundreds of systems connected across multiple departments,
etc. One solution for companies with multiple systems and more than one admin is to
create a system administration committee whose responsibilities could include
coordinating site policies, dealing with security problems, sharing information, etc.

Detailed Notes for Day 3 (Part 1)

UNIX Fundamentals: Installing an Operating System and/or Software.

Installation Rationale.

Installing an OS is a common task for an admin to perform, usually because of the acquisition of
a new system or the installation of a new disk.

Although any UNIX variant should be perfectly satisfactory once it has been installed,
sometimes the admin or a user has a particular problem which requires, for example, a different
system configuration (and thus perhaps a reinstall to take account of any major hardware
changes), or a different OS version for compatibility testing, access to more up-to-date features,
etc. Alternatively, a serious problem or accidental mistake might require a reinstallation, eg.
corrupted file system, damaged disk, or an unfortunate use of the rm command (recall the
example given in the notes for Day 2, concerning the dangers of the 'find' command); although
system restoration via backups is an option, often a simple reinstall is more convenient and
faster.

Whatever the reason, an admin must be familiar with the procedure for installing an OS on the
platform for which she/he is responsible.

Installation Interface and Tools.

Most UNIX systems have two interfaces for software installation: a high-level mode where an
admin can use some kind of GUI-based tool, and a low-level mode which employs the command
line shell. The GUI tool normally uses the command line version for the actual installation
operations. In the case of SGI's IRIX, the low-level program is called 'inst', while the GUI
interface to inst is called 'swmgr' - the latter can be activated from the 'Toolchest' on the desktop
or entered as a command. Users can also run swmgr, but only in 'read-only' mode, ie. non-root
users cannot use inst or swmgr to install or remove software.

For general software installation tasks (new/extra applications, updates, patches, etc.) the GUI
tool can normally be used, but for installing an OS, virtually every UNIX platform will require
the admin to not only use the low-level tool for the installation, but also carry out the installation
in a 'lower' (restricted) access mode, ie. a mode where only the very basic system services are
operating: no user-related processes are running, the end-user GUI interface is not active, no
network services are running, etc. For SGI's IRIX, this mode is called 'miniroot'.

Major updates to the OS are also usually carried out in miniroot mode - this is because a fully
operational system will have services running which could be altered by a major OS change, ie.
it would be risky to perform any such change in anything but the equivalent of miniroot.

It is common for this restricted miniroot mode to be selected during bootup, perhaps by pressing
the ESC key when prompted. In the case of SGI systems, the motherboard PROM chip includes
a hard-coded GUI interface mechanism called ARCS which displays a mouse-driven menu on
bootup. This provides the admin with a user-friendly way of performing low-level system
administration tasks, eg. installing the OS from miniroot, running hardware diagnostics,
accessing a simple PROM shell called a Command Monitor for performing low-level actions
such as changing PROM settings (eg. which SCSI ID to treat as the system disk), etc.

Systems without graphics boards, such as servers, provide the same menu but in text-only form,
usually through a VT or other compatible text display terminal driven from the serial port. Note
that SGI's VisualWorkstation machine (an NT system) also uses the ARCS GUI interface - a first
for any NT system (ie. no DOS at all for low-level OS operations).

Not many UNIX vendors offer a GUI menu system like ARCS for low-level tasks - SGI is one of
the few who do, probably because of a historical legacy of making machines for the visual arts
and sciences. Though the ARCS system is perhaps unique, after one has selected 'Software
Installation' the procedure progresses to a stage where the interface does become the more
familiar text-based use of inst (ie. the text information just happens to be presented within a
GUI-style window).

Very early UNIX platforms were not so friendly when it came to offering an easy method for
installing the OS, especially in the days of older storage media such as 5.25" disks, magnetic
tapes, etc. However, some vendors did a good job, eg. the text-only interface for installing HP-
UX on Hewlett Packard machines (eg. HP9000/730) is very user-friendly, allowing the admin to
use the cursor arrow keys to select options, activate tasks, etc. During installation, constantly
updated information shows how the installation is progressing: current file being installed,
number of files installed so far, number of files remaining, amount of disk space used up so far,
disk space remaining, percentage equivalents for all these, and even an estimate of how much
longer the installation will take before completion (surprisingly, inst doesn't provide this last
piece of information as it is running, though one can make good estimates or find out how long
it's going to take from a 3rd-party information source).

The inst program gives progress output equivalent to most of the above by showing the current
software subsystem being installed, which sub-unit of which subsystem, and what percentage of
the overall operation has been done so far.

Perhaps because of the text-only interface which is at the heart of installing any UNIX variant,
installing an OS can be a little daunting at first, but the actual procedure itself is very easy. Once
an admin has installed an OS once, doing it again quickly becomes second nature. The main
reason the task can seem initially confusing is that the printed installation guides are often too
detailed, ie. the supplied documents have to assume that the person carrying out the installation
may know nothing at all about what they're doing. Thankfully, UNIX vendors have recognised
this fact and so nowadays any such printed material also contains a summary installation guide
for experts and those who already know the general methods involved - this is especially useful
when performing an OS update as opposed to an original OS installation.
OS Source Media.

Years ago, an OS would be stored on magnetic tape or 5.25" disks. Today, one can probably
state with confidence that CDROMs are used by every vendor. For example, SGI's IRIX 6.2
comes on 2 CDROMs; IRIX 6.5 uses 4 CDROMs, but this is because 6.5 can be used with any
machine from SGI's entire current product line, as well as many older systems - thus, the basic
CD set must contain the data for all relevant systems even though an actual installation will only
use a small subset of the data from the CDs (typically less than one CD's worth).

In the future, it is likely that vendors will switch to DVDs due to higher capacities and faster
transfer rates.

Though a normal OS installation uses some form of original OS media, UNIX actually allows
one to install an OS (or any software) via some quite unique ways. For example, one could copy
the data from the source media (I shall assume CDROM) to a fast UltraSCSI disk drive. Since
disks offer faster transfer rates and access times, using a disk as a source media enables a faster
installation, as well as removing the need for swapping CDROMs around during the installation
process. This is essentially a time-saving feature but is also very convenient, eg. no need to carry
around many CDROMs (remember that after an OS installation, an admin may have to install
extra software, applications, etc. from other CDs).

A completely different option is to install the OS using a storage device which is attached to a
remote machine across a network. This may sound strange, ie. the idea that a machine without an
OS can access a device on a remote system and use that as an OS installation source. It's
something which is difficult but not impossible with PCs (I'm not sure whether a Linux PC
would support this method). A low-level communications protocol called bootp (Internet
Bootstrap Protocol), supported by all traditional UNIX variants, is used to facilitate
communication across the network. As long as the remote system has been configured to allow
another system to access its local device as a source for remote OS installation, then the remote
system will effectively act as an attached storage medium.

However, most admins will rarely if ever have to install an OS this way for small networks,
though it may be more convenient for larger networks. Note that IRIX systems are supplied by
default with the bootp service disabled in the /etc/inetd.conf file (the contents of this file controls
various network services). Full details on how to use the bootp service for remote OS installation
are normally provided by the vendor in the form of an online book or reference page. In the case
of IRIX, see the section entitled, "Booting across the Network" in Chapter 10 of the online book,
"IRIX Admin: System Configuration and Operation".

Note: this discussion does not explain every single step of installing an OS on an SGI system,
though the method will be demonstrated during the practical session if time permits. Instead, the
focus here is on management issues which surround an OS installation, especially those
techniques which can ease the installation task. Because of the SGI-related technical site I run, I
have already created extremely detailed installation guides for IRIX 6.2 [1] and IRIX 6.5 [2]
which also include tables of example installation times (these two documents are included for
future reference). The installation times obtained were used to conduct a CPU and CDROM
performance analysis [3]. Certain lessons were learned from this analysis which are also relevant
to installing an OS - these are explained later.

Installing an OS on multiple systems.

Using a set of CDs to install an OS can take quite some time (15 to 30 minutes is a useful
approximation). If an admin has many machines to install, there are several techniques for
cutting the amount of time required to install the OS on all the machines.

The most obvious method is for all machines to install via a remote network device, but this
could actually be very slow, limited partly by network speed but also by the way in which
multiple systems would all be trying to access the same device (eg. CDROM) at the same time. It
would only really be effective for a situation where the network was very fast and the device - or
devices, there could be more than one - was also fast.

An example would be the company MPC; as explained in previous lectures, their site
configuration is extremely advanced. The network they employ is so fast that it can saturate the
typical 100Mbit Ethernet port of a modern workstation like Octane. MPC's storage systems
include many high-end RAID devices capable of delivering data at hundreds of MB/sec rates
(this kind of bandwidth is needed for editing broadcast-quality video and assuring that animators
can load complete scene databases without significant delay).

Thus, the admin at MPC can use some spare RAID storage to install an OS on a system across
the network. When this is done, the limiting factor which determines how long the installation
takes is the computer's main CPU(s) and/or its Ethernet port (100MBit), the end result of which
is that an installation can take mere minutes. In reality, the MPC admin uses an even faster
technique for installing an OS, which is discussed in a moment.

At the time of my visit, MPC was using a high-speed crossbar switching 288Mbit/sec network
(ie. multiple communications links through the routers - each machine could be supplied with up
to 36MB/sec). Today they use multiple gigabit links (HiPPI) and other supporting devices. But
not everyone has the luxury of having such equipment.

Disk Cloning [1].

If an admin only has a single machine to deal with, the method used may not matter too much,
but often the admin has to deal with many machines. A simple technique which saves a huge
amount of time is called 'disk cloning'. This involves installing an OS onto a single system ('A')
and then copying (ie. cloning) the contents of that system's disk onto other disks. The first
installation might be carried out by any of the usual means (CDROM, DAT, network, etc.), after
which any extra software is also installed; in the case of SGIs, this would mean the admin
starting up the system into a normal state of operation, logging in as root and using swmgr to
install extra items. At this point, the admin may wish to make certain custom changes as well, eg.
installing shareware/freeware software, etc. This procedure could take more than an hour or two
if there is a great deal of software to install.

Once the initial installation has finished, then begins the cloning process. On SGIs, this is
typically done as follows (other UNIX systems will be very similar if not identical):

1. Place the system disk from another system B into system A, installed at, for example,
SCSI ID 2 (B's system disk would be on SCSI ID 1 in the case of SGIs; SCSI ID 0 is
used for the SCSI controller). Bootup the system.
2. Login as root. Use fx to initialise the B disk to be a new 'root' (ie. system) disk; create a
file system on it; mount the disk on some partition on A's disk such as /disk2.
3. Copy the contents of disk A to disk B using a command such as tar. Details of how to do
this with example tar commands are given in the reference guides [1] [2] (a rough sketch
is also given after this list).
4. Every system disk contains special volume header information which is required in order
to allow it to behave as a bootable device. tar cannot copy this information since it does
not reside on the main data partition of the disk in the form of an ordinary file, so the next
step is to copy the volume header data from A to B using a special command for that
purpose. In the case of SGIs, the relevant program is called dvhtool (device volume
header tool).
5. Shut down system A; remove the B disk; place the B disk back into system B,
remembering to change its SCSI ID back to 1. If further cloning is required, insert
another disk into system A on SCSI ID 2, and (if needed) a further disk into system B,
also set to SCSI ID 2. Reboot both systems.
6. System A will reboot as normal. Although system B already has a kernel file available
(/unix), all its files will be recognised as new (ie. changed), so at bootup time system B
will also create a new kernel file (/unix.install) and then boot up normally, ready for
login. Reboot system B once more so that the new kernel file is made the current kernel
file.
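
As a rough sketch of steps 2 to 4 (not a definitive procedure: the device names assume the B
disk appears on controller 0 at SCSI ID 2, the partition numbering follows the usual IRIX
convention of partition 0 for root, the directory list is only illustrative, and the exact fx, tar and
dvhtool usage should be taken from the reference guides [1] [2] and the relevant man pages):

# Step 2: after fx has set the B disk up as a root disk, make a file system and mount it
mkfs /dev/rdsk/dks0d2s0
mkdir -p /disk2
mount /dev/dsk/dks0d2s0 /disk2

# Step 3: copy A's root file system to B's disk; the mount point /disk2 itself
# must not be included in the copy, hence the explicit directory list
cd /
tar cBpf - bin dev etc lib lib32 sbin stand usr var unix | (cd /disk2 && tar xBpf -)

# Step 4: copy the volume header (sash, etc.) from A's disk to B's disk with dvhtool
# (see the dvhtool man page for the exact invocation)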

At this stage, what one has effectively created is a situation comprising two systems as described
in Step 1, instead of only one such system which existed before the cloning process. Thus, one
could now repeat the process again, creating four systems ready to use or clone again as desired.
Then eight, sixteen, thirty two and so on. This is exactly the same way biological cells divide, ie.
binary fission. Most people are familiar with the idea that repeatedly doubling the number of a
thing can create a great many things in a short space of time, but the use of such a technique for
installing an operating system on many machines means an admin can, for example, completely
configure over one hundred machines in less than five hours! The only limiting factor, as the
number of machines to deal with increases, is the amount of help available by others to aid in the
swapping of disks, typing of commands, etc. In the case of the 18 Indys in Ve24, the last
complete reinstall I did on my own took less than three hours.

Note: the above procedure assumes that each cloning step copies one disk onto just a single other
disk - this is because I'm using the Indy as an example, ie. Indy only has internal space for one
extra disk. But if a system has the available room, then many more disks could be installed on
other SCSI IDs (3, 4, 5, etc.) resulting in each cloning step creating three, four, etc. disks from
just one. This is only possible because one can run multiple tar copy commands at the same time.
Of course, one could use external storage devices to connect extra disks. There's no reason why a
system with two SCSI controllers (Indigo2, O2, Octane, etc.) couldn't use external units to clone
the system disk to 13 other disks at the same time; for a small network, such an ability could
allow the reinstallation of the entire system in a single step!

Using a Backup Image.

If a system has been backed up onto a medium such as DAT tape, one could in fact use that tape
for installing a fresh OS onto a different disk, as opposed to the more usual use of the tape for
data restoration purposes.

The procedure would be similar to some of the steps in disk cloning, ie. install a disk on SCSI ID
2, initialise, and use tar to extract the DAT straight to the disk. However, the volume header
information would have to come from the original system since it would not be present on the
tape, and only one disk could be written to at a time from the tape. Backup media are usually
slower than disks too.
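
A minimal sketch of the idea (assuming the fresh disk has already been initialised with fx, given
a file system and mounted on /disk2, and that the backup was written as a tar archive to a DAT
accessible as /dev/tape - all of these details are assumptions for illustration only):

# Restore a tar-format DAT backup onto a freshly prepared disk
cd /disk2
tar xBpf /dev/tape

# The volume header must still be copied from a working system disk using dvhtool,
# since it is not part of the tape archive.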

Installing a New Version of an OS (Major Updates).

An admin will often have to install updates to various OS components as part of the normal
routine of installing software patches, bug fixes, new features, security fixes, etc. as they arrive
in CD form from the vendor concerned. These can almost always be installed using the GUI
method (eg. swmgr) unless specifically stated otherwise for some reason. However, if an admin
wishes to change a machine which already has an OS installed to a completely new version
(whether a newer version or an older one), then other issues must be considered.

Although it is perfectly possible to upgrade a system to a newer OS, an existing system will often
have so much software installed, with a whole range of configuration files, that a straight upgrade
to a new OS revision may not go smoothly. The upgrade would normally succeed, but what usually
happens is that the admin has to resolve installation conflicts before the procedure can begin,
which is annoyingly time-wasting. Further, some changes may even alter some fundamental aspect of the
system, in which case an upgrade on top of the existing OS would involve extra changes which
an admin would have to read up on first (eg. IRIX 6.2 uses a completely different file system to
IRIX 5.3: XFS vs. EFS).

Even if an update over an existing OS is successful, one can never really be sure that older files
which aren't needed anymore were correctly removed. To an admin, the system would 'feel' as if
the older OS was somehow still there, rather like an old layer of paint hidden beneath a new
gloss. This aspect of OS management is perhaps only psychological, but it can be important. For
example, if problems occurred later, an admin might waste time checking for issues concerning
the older OS which aren't relevant anymore, even though the admin theoretically knows such
checks aren't needed.

Thus, a much better approach is to perform a 'clean' installation when installing a new OS. A
typical procedure would be as follows:

1. Read all the relevant notes supplied with the new OS release so that any issues relevant to
how the system may be different with the new OS version are known beforehand, eg. if
any system services operate in a different way, or other factors (eg. new type of file
system, etc.)
2. Make a full system backup of the machine concerned.
3. Identify all the key files which make the system what it is, eg. /etc/sys_id, /etc/hosts, and
other configuration files/directories such as /var/named, /var/flexlm/license.dat, etc.
These could be placed onto a DAT, floptical, ZIP, or even another disk (see the sketch
after this list). Items such as shareware/freeware software are probably best installed
anew (read any documents relevant to software such as this too).
4. Use the appropriate low-level method to reinitialise the system disk. For SGI IRIX
systems, this means using the ARCS bootup menu to select the Command Monitor, boot
off of the OS CDROM and use the fx program to reinitialise the disk as a root disk, use
mkfs to create a new file system (the old OS image is now gone), then reboot to access
the 'Install System Software' option from the ARCS menu.
5. Install the OS in the normal manner.
6. Use the files backed up in step 3 to change the system so that it adopts its usual identity
and configuration, bearing in mind any important features/caveats of the new OS release.
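
As an illustration of step 3 (the file list and the use of /dev/tape for a DAT drive are
assumptions - the exact set of files worth saving varies from system to system):

# Archive key identity and configuration files to tape before wiping the disk
cd /
tar cvf /dev/tape etc/sys_id etc/hosts etc/fstab var/named var/flexlm/license.dat

# After the clean installation (step 6), restore them with:
#
#     cd / ; tar xvf /dev/tape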

This is a safe and reliable way of ensuring a clean installation. Of course, the installation data
could come from a different media or over a network from a remote system as described earlier.

Time-saving Tips.

When installing an OS or software from a CDROM, it's tempting to want to use the fastest
possible CDROM available. However, much of the process of installing software, whether the
task is an OS installation or not, involves operations which do not actually use the CDROM. For
example, system checks need to be made before the installation can begin (eg. available disk
space), hundreds of file structures need to be created on the disk, installation images need to be
uncompressed in memory once they have been retrieved from the CDROM, installed files need
to be checked as the installation progresses (checksums), and any post-installation tasks
performed such as compiling any system software indices.

As a result, perhaps 50% of the total installation time may involve operations which do not
access the CDROM. Thus, using a faster CDROM may not speedup the overall installation to
any great degree. This effect is worsened if the CPU in the system is particularly old or slow, ie.
a slow CPU may not be able to take full advantage of an old CDROM, never mind a new one.

In order for a faster CDROM to make any significant difference, the system's CPU must be able
to take advantage of it, and a reasonably large proportion of an installation procedure must
actually consist of accessing the CDROM.
For example, consider the case of installing IRIX 6.5 on two different Indys - one with a slow
CPU, the other with a better CPU - comparing any benefit gained from using a 32X CDROM
instead of a 2X CDROM [3]. Here is a table of installation times, in hours, minutes and seconds,
along with percentage speedups:

                         2X CDROM    32X CDROM    %Speedup

  100MHz R4600PC Indy:    1:18:36      1:12:11        8.2%

  200MHz R4400SC Indy:    0:52:35      0:45:24       13.7%
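
The percentage speedups follow directly from these times; as a quick check of the arithmetic for
the first row: 1:18:36 is 4716 seconds and 1:12:11 is 4331 seconds, so the saving is
(4716 - 4331) / 4716 = 385/4716, ie. roughly 8.2%.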

(data for a 250MHz R4400SC Indigo2 shows the speedup would rise to 15.2% - a valid
comparison since Indy and Indigo2 are almost identical in system design)

In other words, the better the main CPU, the better the speedup obtained by using a faster
CDROM.

This leads on to the next very useful tip for installing software (OS or otherwise)...

Temporary Hardware Swaps.

The example above divided the columns in order to obtain the speedup for using a faster
CDROM, but it should be obvious looking at the table that a far greater speedup can be obtained
by using a better CPU:

                           Using 200MHz R4400SC CPU
                           instead of 100MHz R4600PC
                           (Percentage Speedup)

  2X CDROM with Indy:               33.1%

  32X CDROM with Indy:              37.1%

In other words, no matter what CDROM is used, an admin can save approximately a third of the
normal installation time just by temporarily swapping the best possible CPU into the target
system! And of course, the saving is maximised by using the fastest CDROM available too, or
other installation source such as a RAID containing the CDROM images.

For example, if an admin has to carry out a task which would normally be expected to take, say,
three hours on the target system, then a simple component swap could save over an hour of
installation time. From an admin's point of view, that means getting the job done quicker (more
time for other tasks), and from a management point of view that means lower costs and better
efficiency, ie. less wages money spent on the admin doing that particular task.

Some admins might have to install OS images as part of their job, eg. performance analysis or
configuring systems to order. Thus, saving as much time as possible could result in significant
daily productivity improvements.
The Effects of Memory Capacity.

During the installation of software or an OS, the system may consume large amounts of memory
in order to, for example, uncompress installation images from the CDROM, process existing
system files during a patch update, recompile system file indices, etc. If the target system does
not have enough physical memory, then swap space (otherwise known as virtual memory) will
have to be used. Since software installation is a disk and memory intensive task, this can
massively slow down the installation or removal procedure (the latter can happen too because
complex file processing may be required in order to restore system files to an earlier state prior
to the installation of the software items being removed).

Thus, just as it can be helpful to temporarily swap a better CPU into the target system and use a
faster CDROM if available, it is also a good idea to ensure the system has sufficient physical
memory for the task.

For example, I once had cause to install a large patch upgrade to the various compiler
subsystems on an Indy running IRIX 6.2 with 64MB RAM [1]. The update procedure seemed to
be taking far too long (15 minutes and still not finished). Noticing the unusually large amount of
disk activity compared to what I would normally expect, ie. noise coming from the disk, I
became suspicious and wondered whether the installation process was running out of memory. A
quick use of gmemusage showed the available memory to be very low (3MB) implying that
memory swapping was probably occurring. I halted the update procedure (easy to do with IRIX)
and cancelled the installation. After upgrading the system temporarily to 96MB RAM (using
32MB from another Indy) I ran the patch again. This time, the update was finished in less than
one minute! Using gmemusage showed the patch procedure required at least 40MB RAM free in
order to proceed without resorting to the use of swap space.

Summary.

1. Before making any major change to a system, make a complete backup just in case
something goes wrong. Read any relevant documents supplied with the software to be
installed, eg. release notes, caveats to installation, etc.
2. When installing an OS or other software, use the most efficient storage media available if
possible, eg. the OS CDs copied onto a disk. NB: using a disk OS image for installation
might mean repartitioning the disk so that the system regards the disk as a bootable
device, just like a CDROM. By default, SCSI disks do not have the same partition layout
as a typical CDROM. On SGIs using IRIX, the fx program is used to repartition disks.
3. If more than one system is involved, use methods such as disk cloning to improve the
efficiency of the procedure.
4. If possible, temporarily swap better system components into the target system in order to
reduce installation time and ensure adequate resources for the procedure (better CPU, lots
of RAM, fastest possible CDROM).
Caution: item 4 above might not be possible if the particular set of files which gets installed is
determined by the presence of internal components. In the case of Indy, installing an R5000
series CPU would result in the installation of different low-level bootup CPU-initialisation
libraries compared to R4600 or R4400 (these latter two CPUs can use the same libraries, but any
R5000 CPU uses newer libraries). Files relevant to these kinds of issues are located in directories
such as /var/sysgen.

Patch Files.

Installing software updates to parts of the OS or application software is a common task for
admins. In general, patch files should not be installed unless they are needed, but sometimes an
admin may not have any choice, eg. for security reasons, or Y2K compliance.

Typically, patch updates are supplied on CDs in two separate categories (these names apply to
SGIs; other UNIX vendors probably use a similar methodology):

1. Required/Recommended patches.
2. Fix-on-Fail Patches.

Item 1 refers to patches which the vendor suggests the admin should definitely install. Typically,
a CD containing such patches is accessed with inst/swmgr and an automatic installation carried
out, ie. the admin lets the system work out which of the available required/recommended patches
should be installed. This concept is known as installing a 'patch set'. When discussing system
problems or issues with others (eg. technical support, or colleagues on the Net), the admin can
then easily describe the OS state as being a particular revision modified by a particular dated
patch set, eg. IRIX 6.5 with the April 1999 Patch Set.

Item 2 refers to patches which only concern specific problems or issues, typically a single patch
file for each problem. An admin should not install such patches unless they are required, ie. they
are selectively installed as and when is necessary. For example, an unmodified installation of
IRIX 6.2 contains a bug in the 'jot' editor program which affects the way in which jot accesses
files across an NFS-mounted directory (the bug can cause jot to erase the file). To fix the bug,
one installs patch number 2051 which is shown in the inst/swmgr patch description list as 'Jot fix
for mmapping', but there's no need to install the patch if a machine running 6.2 is not using NFS.

Patch Inheritance.

As time goes by, it is common for various bug fixes and updates from a number of patches to be
brought together into a 'rollup' patch. Also, a patch file may contain the same fixes as an earlier
patch plus some other additional fixes. Two issues arise from this:

1. If one is told to install a patch file of a particular number (eg. advice gained from
someone on a newsgroup), it is usually the case that any later patch which has been
declared to be a replacement for the earlier patch can be used instead. This isn't always
the case, perhaps due to specific hardware issues of a particular system, but in general a
fix for a problem will be described as 'install patch <whatever> or later'. The release
notes for any patch file will describe what hardware platforms and OS revisions that
patch is intended for, what patches it replaces, what bugs are fixed by the patch (official
bug code numbers included), what other known bugs still exist, and what workarounds
can be used to temporarily solve the remaining problems.
2. When a patch is installed, a copy of the affected files prior to installation, called a 'patch
history', is created and safely stored away so that if ever the patch has to be removed at a
later date, the system can restore the relevant files to the state they were in before the
patch was first installed. Thus, installing patch files consumes disk space - how much
depends on the patch concerned. The 'versions' command with the 'removehist' option can
be used to remove the patch history for a particular patch, recovering disk space, eg.:

versions removehist patchSG0001537

would remove the patch history file for patch number 1537. To remove all patch
histories, the command to use is:

versions removehist "*"
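
As a quick illustration (a sketch only; the patch number shown is the jot patch mentioned
earlier and is just an example), an admin can list the patches currently installed and then
reclaim the space used by one of them:

versions | grep patchSG                 # list installed patch subsystems
versions removehist patchSG0002051      # discard the history for patch 2051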

Conflicts.

When installing patches, especially of the Fix-on-Fail variety, an admin can come across a
situation where a patch to be installed (A) is incompatible with one already present on the system
(B). This usually happens when an earlier problem was dealt with using a more up-to-date patch
than was actually necessary. The solution is to either remove B, then install an earlier but
perfectly acceptable patch C and finally install A, or find a more up-to-date patch D which
supersedes A and is thus compatible with B.

Note: if the history file for a patch has been removed in order to save disk space, then it will not
be possible to remove that patch from the system. Thus, if an admin encounters the situation
described above, the only possible solution will be to find the more up-to-date patch D.

Exploiting Patch File Release Notes.

The release notes for patches can be used to identify which patches are compatible, as well as
ascertain other useful information, especially to check whether a particular patch is the right one
an admin is looking for (patch titles can sometimes be somewhat obscure). Since the release
notes exist on the system in text form (stored in /usr/relnotes), one can use the grep command to
search the release notes for information by hand, using appropriate commands. The commands
'relnotes' and 'grelnotes' can be used to view release notes.

relnotes outputs only text. Without arguments, it shows a summary of all installed products for
which release notes are available. One can then supply a product name - relnotes will respond
with a list of chapter titles for that product. Finally, specifying a product name and a chapter
number will output the actual text notes for the chosen chapter, or one can use '*' to display all
chapters for a product. grelnotes gives the same information in a browsable format displayed in a
window, ie. grelnotes is a GUI interface to relnotes. See the man pages for these commands for
full details.

relnotes actually uses the man command to display information, ie. the release notes files are
stored in the same compressed text format ('pack') used by online manual pages (man uses the
'unpack' command to decompress the text data). Thus, in order to grep-search through a release
notes file, the file must first be uncompressed using the unpack command. This is a classic
example of where the UNIX shell becomes very powerful, ie. one could write a shell script using
a combination of find, ls, grep, unpack and perhaps other commands to allow one to search for
specific items in release notes.
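
A minimal sketch of such a script is given below. It assumes the notes under /usr/relnotes are
stored as pack-compressed text files ending in '.z' (the exact layout may differ between IRIX
releases), and simply reports which files mention a keyword supplied on the command line:

#!/bin/sh
# relgrep - report which release notes files mention a keyword.
# Usage: relgrep keyword
keyword="$1"
for f in `find /usr/relnotes -type f -name '*.z' -print`
do
    # pcat uncompresses a packed file to stdout without altering it
    if pcat "$f" 2>/dev/null | grep -i "$keyword" > /dev/null
    then
        echo "Match in: $f"
    fi
done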

Although the InfoSearch tool supplied with IRIX 6.5 allows one to search release notes, IRIX 6.2
does not have InfoSearch, so an admin might decide that writing such a shell script would prove
very useful. Incidentally, this is exactly the kind of useful script which ends up being made
available on the Net for free so that anyone can use it. For all I know, such a script already exists.
Over time, entire collections of useful scripts are gathered together and eventually released as
freeware (eg. GNU shell script tools). An admin should examine any such tools to see if they
could be useful - a problem which an admin has to deal with may already have been solved by
someone else two decades earlier.

Patch Subsystem Components.

Like any other software product, a patch file is a software subsystem usually containing several
sub-units, or components. When manually selecting a patch for installation, inst/swmgr may tag
all sub-units for installation even if certain sub-units are not applicable (this can happen for an
automatic selection too, perhaps because inst selects all of a patch's components by default). If
this happens, any conflicts present will be displayed, preventing the admin from accidentally
installing unwanted or irrelevant items. Remember that an installation cannot begin until all
conflicts are resolved, though an admin can override this behaviour if desired.

Thus, when manually installing a patch file (or files), I always check the individual sub-units to
see what they are. In this way, I can prevent conflicts from arising in the first place by not
selecting subsystems which I know are not relevant, eg. 64bit libraries which aren't needed for a
system with a 32bit memory address kernel like Indy (INFO: all SGIs released after the Indigo
R3000 in 1991 do 64bit processing, but the main kernel file does not need to be compiled using
64bit addressing extensions unless the system is one which might have a very large amount of
memory, eg. an Origin2000 with 16GB RAM). Even when no conflicts are present, I always
check the selected components to ensure no 'older version' items have been selected.

References:

1. Disk and File System Administration:
   http://www.futuretech.vuurwerk.nl/disksfiles.html

2. How to Install IRIX 6.5:
   http://www.futuretech.vuurwerk.nl/6.5inst.html

3. SGI General Performance Comparisons:
   http://www.futuretech.vuurwerk.nl/perfcomp.html

Detailed Notes for Day 3 (Part 2)
UNIX Fundamentals: Organising a network with a server.

This discussion explains basic concepts rather than detailed ideas such as specific 'topologies' to
use with large networks, or how to organise complex distributed file systems, or subdomains and
address spaces - these are more advanced issues which most admins won't initially have to deal
with, and if they do then the tasks are more likely to be done as part of a team.

The SGI network in Ve24 is typical of a modern UNIX platform in how it is organised. The key
aspects of this organisation can be summarised as follows:

 A number of client machines and a server are connected together using a hub (24-port in
this case) and a network comprised of 10Mbit Ethernet cable (100Mbit is more common
in modern systems, with Gigabit soon to enter the marketplace more widely).
 Each client machine has its own unique identity, a local disk with an installed OS and a
range of locally installed application software for use by users.
 The network has been configured to have its own subdomain name of a form that
complies with the larger organisation of which it is just one part (UCLAN).
 The server has an external connection to the Internet.
 User accounts are stored on the server, on a separate external disk. Users who login to the
client machines automatically find their own files available via the use of the NFS
service.
 Users can work with files in their home directory (which accesses the server's external
disk across the network) or use the temporary directories on a client machine's local disk
for better performance.
 Other directories are NFS mounted from the server in order to save disk space and to
centralise certain services (eg. /usr/share, /var/mail, /var/www).
 Certain aspects of the above are customised in places. Most networks are customised in
certain ways depending on the requirements of users and the decisions taken by the
admin and management. In this case, specifics include:
o Some machines have better hardware internals, allowing for software installation
setups that offer improved user application performance and services, eg. bigger
disk permits /usr/share to be local instead of NFS-mounted, and extra vendor
software, shareware and freeware can be installed.
o The admin's account resides on an admin machine which is effectively also a
client, but with minor modifications, eg. tighter security with respect to the rest of
the network, and the admin's personal account resides on a disk attached to the
admin machine. NFS is used to export the admin's home account area to the
server and all other clients; custom changes to the admin's account definition
allows the admin account to be treated just like any other user account (eg.
accessible from within /home/staff).
o The server uses a Proxy server in order to allow the client machines to access the
external connection to the Internet.
o Ordinary users cannot login to the server, ensuring that the server's resources are
reserved for system services instead of running user programs. Normally, this
would be a more important factor if the server was a more powerful system than
the clients (typical of modern organisations). In the case of the Ve24 network
though, the server happens to have the same 133MHz R4600PC CPU as the client
machines. Staff can login to the server however - an ability based on assumed
privilege.
o One client machine is using a more up-to-date OS version (IRIX 6.5) in order to
permit the use of a ZIP drive, a device not fully supported by the OS version used
on the other clients (IRIX 6.2). ZIP drives can be used with 6.2 at the command-
line level, but the GUI environment supplied with 6.2 does not fully support ZIP
devices. In order to support 6.5 properly, the client with the ZIP drive has more
memory and a larger disk (most of the clients have 549MB system disks -
insufficient to install 6.5 which requires approximately 720MB of disk space for a
default installation).
o etc.

This isn't a complete list, but the above are the important examples.

Exactly how an admin configures a network depends on what services are to be provided, how
issues such as security and access control are dealt with, Internet issues, available disk space and
other resources, peripherals provided such as ZIP, JAZ, etc., and of course any policy directives
decided by management.

My own personal ethos is, in general, to put users first. An example of this ethos in action is that
/usr/share is made local on any machine which can support it - accesses to such a local directory
occur much faster than across a network to an NFS-mounted /usr/share on a server. Thus,
searching for man pages, accessing online books, using the MIDI software, etc. is much more
efficient/faster, especially when the network or server is busy.

NFS Issues.

Many admins will make application software NFS-mounted, but this results in slower
performance (unless the network is fast and the server capable of supplying as much data as can
be handled by the client, eg. 100Mbit Ethernet, etc.) However, NFS-mounted application
directories do make it easier to manage software versions, updates, etc. Traditional client/server
models assume applications are stored on a server, but this is an old ethos that was designed
without any expectation that the computing world would eventually use very large media files,
huge applications, etc. Throwing application data across a network is a ridiculous waste of
bandwidth and, in my opinion, should be avoided where possible (this is much more important
for slower networks though, eg. 10Mbit).

In the case of the Ve24 network, other considerations also come into play because of hardware-
related factors, eg. every NFS mount point employed by a client system uses up some memory
which is needed to handle the operational overhead of dealing with accesses to that mount point.
Adding more mount points means using more memory on the client; for an Indy with 32MB
RAM, using as many as a dozen mount points can result in the system running out of memory (I
tried this in order to offer more application software on the systems with small disks, but 32MB
RAM isn't enough to support lots of NFS-mounted directories, and virtual memory is not an
acceptable solution). This is a good example of how system issues should be considered when
deciding on the hardware specification of a system. As with any computer, it is unwise to equip a
UNIX system with insufficient resources, especially with respect to memory and disk space.
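
For reference, NFS mounts of the kind mentioned above are normally defined in each client's
/etc/fstab. A sketch of what two such entries might look like is shown below (the server name
matches the Ve24 server, but the mount options shown are illustrative only):

yoda:/usr/share   /usr/share   nfs   ro,bg,intr   0 0
yoda:/var/mail    /var/mail    nfs   rw,bg,intr   0 0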

Network Speed.

Similarly, the required speed of the network will depend on how the network will be used. What
applications will users be running? Will there be a need to support high-bandwidth data such as
video conferencing? Will applications be NFS-mounted or locally stored? What kind of system
services will be running? (eg. web servers, databases, image/document servers, etc.) What about
future expansion? All these factors and more will determine whether typical networking
technologies such as 10Mbit, 100Mbit or Gigabit Ethernet are appropriate, or whether a different
networking system such as ATM should be used instead. For example, MPC uses a fast-
switching high-bandwidth network due to the extensive use of data-intensive applications which
include video editing, special effects, rendering and animation.

After installation, commands such as netstat, osview, ping and ttcp can be used to monitor
network performance. Note that external companies, and vendor suppliers, can offer advice on
suggested system topologies. For certain systems (eg. high-end servers), specific on-site
consultation and analysis may be part of the service.
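
As a brief illustration (the host name is the Ve24 server; flag support varies slightly between
UNIX versions, so check the local man pages), a few quick checks after a network is installed
might be:

ping -c 5 yoda          # round-trip times and packet loss to the server
netstat -i              # per-interface packet and error counts
ttcp -r -s              # run on the receiving host...
ttcp -t -s yoda         # ...then on the sender, to measure raw TCP throughput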

Storage.

Deciding on appropriate storage systems and capacities can be a daunting task for a non-trivial
network. Small networks such as the SGI network I run can easily be dealt with simply by
ensuring that the server and clients all have large disks, that there is sufficient disk space for user
accounts, and a good backup system is used, eg. DDS3 DAT. However, more complex networks
(eg. banks, commercial businesses, etc.) usually need huge amounts of storage space, use very
different types of data with different requirements (text, audio, video, documents, web pages,
images, etc.), and must consider a whole range of issues which will determine what kind of
storage solution is appropriate, eg.:

 preventing data loss,
 sufficient data capacity with room for future expansion,
 interrupt-free fast access to data,
 failure-proof (eg. backup hub units/servers/UPS),
 etc.

A good source of advice may be the vendor supplying the systems hardware, though note that
3rd-party storage solutions can often be cheaper, unless there are other reasons for using a
vendor-sourced storage solution (eg. architectural integration).
See the article listed in reference [1] for a detailed discussion on these issues.

Setting up a network can thus be summarised as follows:

 Decide on the desired final configuration (consultation process, etc.)
 Install the server with default installations of the OS. Install the clients with a default or
expanded/customised configuration as desired.
 Construct the hardware connections.
 Modify the relevant setup files of a single client and the server so that one can rlogin to
the server from the client and use GUI-based tools to perform further system
configuration and administration tasks.
 Create, modify or install the files necessary for the server and clients to act as a coherent
network, eg. /etc/hosts, .rhosts, etc.
 Setup other services such as DNS, NIS, etc.
 Setup any client-specific changes such as NFS mount points, etc.
 Check all aspects of security and access control, eg. make sure guest accounts are
blocked if required, all client systems have a password for the root account, etc. Use any
available FAQ (Frequently Asked Questions) files or vendor-supplied information as a
source of advice on how to deal with these issues. Very usefully, IRIX 6.5 includes a
high-level tool for controlling overall system and network security - the tool can be (and
normally is) accessed via a GUI interface.
 Begin creating group entries in /etc/group ready for user accounts, and finally the user
accounts themselves.
 Setup any further services required, eg. Proxy server for Internet access.
 etc.

The above have not been numbered in a rigid order since the tasks carried out after the very first
step can usually be performed in a different order without affecting the final configuration. The
above is only a guide.
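
For example, the /etc/hosts file mentioned above might contain entries of the following form on
every host (the addresses and the third host name are purely illustrative):

127.0.0.1       localhost
192.168.1.1     yoda        # server (use the real assigned addresses)
192.168.1.2     wolfen      # client with the ZIP drive
192.168.1.3     client3     # ...one line per remaining client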

Quotas.

Employing disk quotas is a practice adopted by most administrators as a means of controlling
disk space usage by users. It is easy to assume that a really large disk capacity would mean an
admin need not bother with quotas, but unfortunately an old saying definitely holds true: "Data
will expand to fill the space available."

Users are lazy where disk space is concerned, perhaps because it is not their job to manage the
system as a whole. If quotas are not present on a system, most users simply don't bother deleting
unwanted files. Alternatively, the quota management software can be used as an efficient disk
accounting system by setting up quotas for a file system without using limit enforcement.

IRIX employs a quota management system that is common amongst many UNIX variants.
Examining the relevant commands (consult the 'SEE ALSO' section from the 'quotas' man page),
IRIX's quota system appears to be almost identical to that employed by, for example, HP-UX
(Hewlett Packard's UNIX OS). There probably are differences between the two implementations,
eg. issues concerning supported operations on particular types of file system, but in this case the
quota system is typical of the kind of OS service which is very similar or identical across all
UNIX variants. An important fact is that the quota software is part of the overall UNIX OS,
rather than some hacked 3rd-party software addon.

Quota software allows users to determine their current disk usage, and enables an admin to
monitor available resources, how long a user is over their quota, etc. Quotas can be used not only
to limit the amount of available disk space a user has, but also the number of files (inodes) which
a user is permitted to create.

Quotas consist of soft limits and hard limits. If a user's disk usage exceeds the soft limit, a
warning is given on login, but the user can still create files. If disk usage continues to increase,
the hard limit is the point beyond which the user will not be able to use any more disk space, at
least until the usage is reduced so that it is sufficiently below the hard limit once more.

Like most system services, how to setup quotas is explained fully in the relevant online book,
"IRIX Admin: Disks and Filesystems". What follows is a brief summary of how quotas are setup
under IRIX. Of more interest to an admin are the issues which surround quota management -
these are discussed shortly.

To activate quotas on a file system, an extra option is added to the relevant entry in the /etc/fstab
file so that the desired file system is set to have quotas imposed on all users whose accounts
reside on that file system. For example, without quotas imposed, the relevant entry in yoda's
/etc/fstab file looks like this:

/dev/dsk/dks4d5s7 /home xfs rw 0 0

With quotas imposed, this entry is altered to be:

/dev/dsk/dks4d5s7 /home xfs rw,quota 0 0

Next, the quotaon command is used to activate quotas on the root file system. A reboot causes
the quota software to automatically detect that quotas should be imposed on /home and so the
quota system is turned on for that file system.
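
In command form, the activation step and a quick check might look like this (a sketch only; a
user would run the second command themselves):

quotaon /       # turn quota checking on for the root filesystem
quota -v        # any user can then check their own usage and limits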

The repquota command is used to display quota statistics for each user. The edquota command is
used to change quota values for a single user, or multiple users at once. With the -i option,
edquota can also read in quota information from a file, allowing an admin to set quota limits for
many users with a single command. With the -e option, repquota can output the current quota
statistics to a file in a format that is suitable for use with edquota's -i option.
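
A sketch of how these options fit together is given below; the file name is arbitrary, and the
exact option syntax (eg. whether -e writes to standard output or to a named file) should be
checked against the local man pages:

repquota /home                           # show usage and limits for every user
repquota -e /home > /var/tmp/quotas      # dump limits in a form edquota can read
vi /var/tmp/quotas                       # adjust the limits for many users at once
edquota -i /var/tmp/quotas               # apply the edited limits in one step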

Note: the editor used by edquota is vi by default, but an admin can change this by setting an
environment variable called 'EDITOR', eg.:

setenv EDITOR "jot -f"


The -f option forces jot to run in the foreground. This is necessary because the editor used by
edquota must run in the foreground, otherwise edquota will simply see an empty file instead of
quota data.

Ordinary users cannot change quota limits.

Quota Management Issues.

Most users do not like disk quotas. They are perceived as the information equivalent of a
straitjacket. However, quotas are usually necessary in order to keep disk usage to a sensible level
and to maintain a fair usage amongst all users.

As a result, the most important decision an admin must make regarding quotas is what limit to
actually set for users, either as a whole or individually.

The key to amicable relations between an admin and users is flexibility, eg. start with a small to
moderate limit for all (eg. 20MB). If individuals then need more space, and they have good
reason to ask, then an admin should increase the user's quota (assuming space is available).

Exactly what quota to set in the first instance can be decided by any sensible/reasonable schema.
This is the methodology I originally adopted:

 The user disk is 4GB. I don't expect to ever have more than 100 users, so I set the initial
quota to 40MB each.

In practice, as expected, some users need more, but most do not. Thus, erring on the side of
caution while also being flexible is probably the best approach.

Today, because the SGI network has a system with a ZIP drive attached, and the SGIs offer
reliable Internet access to the WWW, many students use the Ve24 machines solely for
downloading data they need, copying or moving the data onto ZIP for final transfer to their PC
accounts, or to a machine at home. Since the ZIP drive is a 100MB device, I altered the quotas to
50MB each, but am happy to change that to 100MB if anyone needs it (this allows for a
complete ZIP image to be downloaded if required), ie. I am tailoring quota limits based on a
specific hardware-related user service issue.

If a user exceeds their quota, warnings are given. If they ask for more disk space, an admin
would normally enquire as to whether the user genuinely needs more space, eg.:

 Does the user have unnecessary files lying around in their home directory somewhere?
For example, movie files from the Internet, unwanted mail files, games files, object files
or core dump files left over from application development, media files created by
'playing' with system tools (eg. the digital camera). What about their Netscape cache?
Has it been set to too high a value? Do they have hidden files they're not aware of, eg.
.capture.tmp.* directories, capture.mv files, etc.? Can the user employ compression
methods to save space? (gzip, pack, compress)

If a user has removed all unnecessary files, but is still short of space, then unless there is some
special reason for not increasing their quota, an admin should provide more space. Exceptions
could include, for example, a system which has a genuine overall shortage of storage space. In
such a situation, it is common for an admin to ask users to compress their files if possible, using
the 'gzip', 'compress' or 'pack' commands. Users can use tar to create archives of many files prior
to compression. There is a danger with asking users to compress files though: eventually, extra
storage has to be purchased; once it has been, many users start uncompressing many of the files
they earlier compressed. To counter this effect, any increase in storage space being considered
should be large, say an order of magnitude, or at the very least a factor of 3 or higher (I'm a firm
believer in future-proofing).

Note that the find command can be used to locate files which are above a certain size, eg. those
that are particularly large or in unexpected places. Users can use the du command to examine
how much space their own directories and files are consuming.
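
For example (find's -size option counts 512-byte blocks, so 20480 blocks is 10MB):

find /home -type f -size +20480 -print     # files larger than 10MB
du -k $HOME | sort -n | tail -20           # a user's 20 largest directories, in KB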

Note: if a user exceeds their hard quota limit whilst in the middle of a write operation such as
using an editor, the user will find it impossible to save their work. Unfortunately, quitting the
editor at that point will lose the contents of the file because the editor will have opened a file for
writing already, ie. the opened file will have zero contents. The man page for quotas describes
the problem along with possible solutions that a user can employ:

"In most cases, the only way for a user to recover from over-quota conditions is to abort
whatever activity is in progress on the filesystem that has reached its limit, remove
sufficient files to bring the limit back below quota, and retry the failed program.

However, if a user is in the editor and a write fails because of an over quota situation, that
is not a suitable course of action. It is most likely that initially attempting to write the file
has truncated its previous contents, so if the editor is aborted without correctly writing the
file, not only are the recent changes lost, but possibly much, or even all, of the contents
that previously existed.

There are several possible safe exits for a user caught in this situation. He can use the
editor ! shell escape command (for vi only) to examine his file space and remove surplus
files. Alternatively, using csh, he can suspend the editor, remove some files, then resume
it. A third possibility is to write the file to some other filesystem (perhaps to a file on
/tmp) where the user's quota has not been exceeded. Then after rectifying the quota
situation, the file can be moved back to the filesystem it belongs on."

It is important that users be made aware of these issues if quotas are installed. This is also
another reason why I constantly remind users that they can use /tmp and /var/tmp for temporary
tasks. One machine in Ve24 (Wolfen) has an extra 549MB disk available which any user can
write to, just in case a particularly complex task requiring a lot of disk space must be carried out,
eg. movie file processing.

Naturally, an admin can write scripts of various kinds to monitor disk usage in detailed ways, eg.
regularly identify the heaviest consumers of disk resources; one could place the results into a
regularly updated file for everyone to see, ie. a publicly readable "name and shame" policy (not a
method I'd use unless absolutely necessary, eg. when individual users are abusing the available
space for downloading game files).
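
A minimal sketch of such a script is shown below (the paths and the number of entries reported
are arbitrary choices):

#!/bin/sh
# diskhogs - publish a daily list of the ten heaviest consumers of /home.
du -ks /home/* 2>/dev/null | sort -rn | head -10 > /var/tmp/diskhogs
chmod 644 /var/tmp/diskhogs    # make the report world-readable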

UNIX Fundamentals: Installing/removing internal/external hardware.

As explained in this course's introduction to UNIX, the traditional hardware platforms which run
UNIX OSs have a legacy of top-down integrated design because of the needs of the market areas
the systems are sold into.

Because of this legacy, much of the toil normally associated with hardware modifications is
removed. To a great extent, an admin can change the hardware internals of a machine without
ever having to be concerned with system setup files. Most importantly, low-level issues akin to
IRQ settings in PCs are totally irrelevant with traditional UNIX hardware platforms. By
traditional I mean the long line of RISC-based systems from the various UNIX vendors such as
Sun, IBM, SGI, HP, DEC and even Intel. This ease of use does not of course apply to ordinary
PCs running those versions of UNIX which can be used with PCs, eg. Linux, OpenBSD,
FreeBSD, etc.; for this category of system, the OS issues will be simpler (presumably), but the
presence of a bottom-up-designed PC hardware platform presents the usual problems of
compatible components, device settings, and other irritating low-level issues.

This discussion uses the SGI Indy as an example system. If circumstances allow, a more up-to-
date example using the O2 system will also be briefly demonstrated in the practical session.
Hardware from other UNIX vendors will likely be similar in terms of ease-of-access and
modification, though it has to be said that SGI has been an innovator in this area of design.

Many system components can be added to, or removed from a machine, or swapped between
machines, without an admin having to change system setup files in order to make the system run
smoothly after any alterations. Relevant components include:

 Memory units,
 Disk drives (both internal and external),
 Video or graphics boards that do not alter how the system would handle relevant
processing operations.
 CPU subsystems which use the same instruction set and hardware-level initialisation
libraries as are already installed.
 Removable storage devices, eg. ZIP, JAZ, Floptical, SyQuest, CDROM, DVD (where an
OS is said to support it), DAT, DLT, QIC, etc.
 Any option board which does not impact on any aspect of existing system operation not
related to the option board itself, eg. video capture, network expansion (Ethernet, HiPPI,
TokenRing, etc.), SCSI expansion, PCI expansion, etc.
Further, the physical layout means the admin does not have to fiddle with numerous cables and
wires. The only cables present in Indy are the two short power supply cables, and the internal
SCSI device ribbon cable with its associated power cord. No cables are present for graphics
boards, video options, or other possible expansion cards. Some years after the release of the
Indy, SGI's O2 design allows one to perform all these sorts of component changes without
having to fiddle with any cables or screws at all (the only exception being any PCI expansion,
which most O2 users will probably never use anyway).

This integrated approach is certainly true of Indy. The degree to which such an ethos applies to
other specific UNIX hardware platforms will vary from system to system. I should imagine
systems such as Sun's Ultra 5, Ultra 10 and other Ultra-series workstations are constructed in a
similar way.

One might expect that any system could have a single important component replaced without
affecting system operation to any great degree, even though this is usually not the case with PCs,
but it may come as a far greater surprise that an entire set of major internal items can be changed
or swapped from one system to another without having to alter configuration files at all.

Even when setup files do have to be changed, the actual task normally only involves either a
simple reinstall of certain key OS software sub-units (the relevant items will be listed in
accompanying documentation and release notes), or the installation of some additional software
to support any new hardware-level system features. In some cases, a hardware alteration might
require a software modification to be made from miniroot if the software concerned was of a
type involved in normal system operation, eg. display-related graphics libraries which controlled
how the display was handled given the presence of a particular graphics board revision.

The main effect of this flexible approach is that an admin has much greater freedom to:

 modify systems as required, perhaps on a daily basis (eg. the way my external disk is
attached and removed from the admin machine every single working day),
 experiment with hardware configurations, eg. performance analysis (a field I have
extensively studied with SGIs [2]),
 configure temporary setups for various reasons (eg. demonstration systems for visiting
clients),
 effect maintenance and repairs, eg. cleaning, replacing a power supply, etc.

All this without the need for time-consuming software changes, or the irritating necessity to
consult PC-targeted advice guides about devices (eg. ZIP) before changes are made.

Knowing the scope of this flexibility with respect to a system will allow an admin to plan tasks
in a more efficient manner, resulting in better management of available time.

An example of the above with respect to the SGI Indy would be as follows (this is an imaginary
demonstration of how the above concepts could be applied in real-life):
 An extensive component swap between two Indys, plus new hardware installed.

Background information:

CPUs.

All SGIs use a design method which involves supplying a CPU and any necessary secondary
cache plus interface ASICs on a 'daughterboard', or 'daughtercard'. Thus, replacing a CPU merely
involves changing the daughtercard, ie. no fiddling with complex CPU insertion sockets, etc.
Daughtercards in desktop systems can be replaced in seconds, certainly no more than a minute or
two.

The various CPUs available for Indy can be divided into two categories: those which support
everything up to and including the MIPS III instruction set, and those which support all these
plus the MIPS IV instruction set.

The R4000, R4600 and R4400 CPUs all use MIPS III and are initialised on bootup with the same
low-level data files, ie. the files stored in /var/sysgen. This covers the following CPUs:

100MHz R4000PC (no L2)
100MHz R4000SC (1MB L2)
100MHz R4600PC (no L2)
133MHz R4600PC (no L2)
133MHz R4600SC (512K L2)
100MHz R4400SC (1MB L2)
150MHz R4400SC (1MB L2)
175MHz R4400SC (1MB L2)
200MHz R4400SC (1MB L2)

Thus, two Indys with any of the above CPUs can have their CPUs swapped without having to
alter system software.

Similarly, the MIPS IV CPUs:

150MHz R5000PC (no L2)
150MHz R5000SC (512K L2)
180MHz R5000SC (512K L2)

can be treated as interchangeable between systems in the same way.

The difference between an Indy which uses a newer vs. older CPU is that the newer CPUs
require a more up-to-date version of the system PROM chip to be installed on the motherboard (a
customer who orders an upgrade is supplied with the newer PROM if required).

Video/Graphics Boards.

Indy can have three different boards which control display output:
8bit XL
24bit XL
24bit XZ

8bit and 24bit XL are designed for 2D applications. They are identical except for the addition of
more VRAM to the 24bit version. XZ is designed for 3D graphics and so requires a slightly
different installation of software graphics libraries to be installed in order to permit proper use.
Thus, with respect to the XL version, an 8bit XL card can be swapped with a 24bit XL card with
no need to alter system software.

Indy can have two other video options:

 IndyVideo (provides video output ports as well as extra input ports),
 CosmoCompress (hardware-accelerated MJPEG video capture board).

IndyVideo does not require the installation of any extra software in order to be used.
CosmoCompress does require some additional software to be installed (CosmoCompress
compression API and libraries). Thus, IndyVideo could be installed without any post-installation
software changes. swmgr can be used to install the CosmoCompress software after the option
card has been installed.

Removable Media Devices.

As stated earlier, no software modifications are required, unless specifically stated by the vendor.
Once a device has its SCSI ID set appropriately and installed, it is recognised automatically and
a relevant icon placed on the desktop for users to exploit. Some devices may require a group of
DIP switches to be configured on the outside of the device, but that is all (settings to use for a
particular system will be found in the supplied device manual). The first time I used a DDS3
DAT drive (Sony SDT9000) with an Indy, the only setup required was to set four DIP switches
on the underside of the DAT unit to positions appropriate for use with an SGI (as detailed on the
first page of the DAT manual). Connecting the DAT unit to the Indy, booting up and logging in,
the DAT was immediately usable (icon available, etc.) No setup files, no software to install, etc.
The first time I used a 32X CDROM (Toshiba CD-XM-6201B) not even DIP switches had to be
set.

System Disks, Extra Disks.

Again, installed disks are detected automatically and the relevant device files in /dev initialised
to be treated as the communication points with the devices concerned. After bootup, the fx, mkfs
and mount commands can be used to configure and mount new disks, while disks which already
have a valid file system installed can be mounted immediately. GUI tools are available for
performing these actions too.
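
As a rough sketch (the device names are illustrative, here assuming a new disk on SCSI
controller 0 at ID 2; fx is interactive, so only its invocation is shown):

fx -x                            # expert mode: repartition the new disk as an option drive
mkfs /dev/rdsk/dks0d2s7          # make a filesystem (XFS by default) on the option partition
mkdir /disk2                     # create a mount point
mount /dev/dsk/dks0d2s7 /disk2   # mount the new filesystem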

Thus, consider two Indys:


System A            System B

200MHz R4400SC      100MHz R4600PC
24bit XL            8bit XL
128MB RAM           64MB RAM
2GB disk            1GB disk
IRIX 6.2            IRIX 6.2

Suppose an important company visitor is expected the next morning at 11am and the admin is
asked to quickly prepare a decent demonstration machine, using a budget provided by the
visiting company to cover any changes required (as a gift, any changes can be permanent).

The admin orders the following extra items for next-day delivery:

 A new 4GB SCSI disk (Seagate Barracuda 7200rpm)
 IndyVideo board
 Floptical drive
 ZIP drive
 32X Toshiba CDROM (external)
 DDS3 Sony DAT drive (external)

The admin decides to make the following changes (Steps 1 and 2 are carried out immediately; in
order to properly support the ZIP drive, the admin needs to use IRIX 6.5 on B. The support
contract means the CDs are already available.):

1. Swap the main CPU, graphics board and memory components between systems A and B.
2. Remove the 1GB disk from System B and install it as an option disk in System A. The
admin uses fx and mkfs to redefine the 1GB disk as an option drive, deciding to use the
disk for a local /usr/share partition (freeing up perhaps 400MB of space from System A's
2GB disk).
3. The order arrives the next morning at 9am (UNIX vendors usually use couriers such as
Fedex and DHL, so deliveries are normally very reliable). The 4GB disk is installed into
System B (empty at this point) and the CDROM connected to the external SCSI port
(SCSI ID 3). The admin then installs IRIX 6.5 onto the 4GB disk, a process which takes
approximately 45 minutes. The system is powered down ready for the final hardware
changes.
4. The IndyVideo board is installed in System B (sits on top of the 24bit XL board, 2 or 3
screws involved, no cables), along with the internal Floptical drive above the 4GB disk
(SCSI ID set to 2). The DAT drive (SCSI ID set to 4) is daisy chained to the external
CDROM. The ZIP drive is daisy chained to the DAT (SCSI ID 5 by default selector,
terminator enabled). This can all be done in less than five minutes.
5. The system is rebooted, the admin logs in as root. All devices are recognised
automatically and icons for each device (ZIP, CDROM, DAT, Floptical) are immediately
present on the desktop and available for use. Final additional software installations can
begin, ready for the visitor's arrival. An hour should be plenty of time to install specific
application(s) or libraries that might be required for the visit.
I am confident that steps 1 and 2 could be completed in less than 15 minutes. Steps 3, 4 and 5
could be completed in little more than an hour. Throughout the entire process, no OS or
software changes have to be made to either System A, or to the 6.5 OS installed on System B's
new 4GB disk after initial installation (ie. the ZIP, DAT and Floptical were not attached to System B
when the OS was installed, but they are correctly recognised by the default 6.5 OS when the
devices are added afterwards).

If time permits and interest is sufficient, almost all of this example can be demonstrated live (the
exception is the IndyVideo board; such a board is not available for use with the Ve24 system at
the moment).

How does the above matter from an admin's point of view? The answer is confidence and lack of
stress. I could tackle a situation such as described here in full confidence that I would not have to
deal with any matters concerning device drivers, interrupt addresses, system file modifications,
etc. Plus, I can be sure the components will work perfectly with one another, constructed as they
are as part of an integrated system design. In short, this integrated approach to system design
makes the admin's life substantially easier.

The Visit is Over.

Afterwards, the visitor donates funds for a CosmoCompress board and an XZ board set. Ordered
that day, the boards arrive the next morning. The admin installs the CosmoCompress board into
System B (2 or 3 more screws and that's it). Upon bootup, the admin installs the
CosmoCompress software from the supplied CD with swmgr. With no further system changes,
all the existing supplied software tools (eg. MediaRecorder) can immediately utilise the new
hardware compression board.

The 8bit XL board is removed from System A and replaced with the XZ board set. Using inst
accessed via miniroot, the admin reinstalls the OS graphics libraries so that the appropriate
libraries are available to exploit the new board. After rebooting the system, all existing software
written in OpenGL automatically runs ten times faster than before, without modification.

Summary.

Read available online books and manual pages on general hardware concepts thoroughly.

Get to know the system - every machine will either have its own printed hardware guide, or an
equivalent online book.

Practice hardware changes before they are required for real.

Consult any Internet-based information sources, especially newsgroup posts, 3rd-party web sites
and hardware-related FAQ files.

When performing installations, follow all recommended procedures, eg. use an anti-static strap
to eliminate the risk of static discharge damaging system components (especially important for
handling memory items, but also just as relevant to any other device).

Construct a hardware maintenance strategy for cleansing and system checking, eg. examine all
mice on a regular basis to ensure they are dirt-free, use an air duster once a month to clear away
accumulated dust and grime, clean the keyboards every two months, etc.

Be flexible. System management policies are rarely static, eg. a sudden change in the frequency
of use of a system might mean cleansing tasks need to be performed more often, eg. cleaning
monitor screens.

If you're not sure what the consequences of an action might be, call the vendor's hardware
support service and ask for advice. Questions can be extremely detailed if need be - this kind of
support is what such support services are paid to offer, so make good use of them.

Before making any change to a system, whether hardware or software, inform users if possible.
This is probably more relevant to software changes (eg. if a machine needs to be rebooted, use
'wall' to notify any users logged onto the machine at the time, ie. give them time to log off; if
they don't, go and see why they haven't), but giving advance notice is still advisable for hardware
changes too, eg. if a system is being taken away for cleaning and reinstallation, a user may want
to retrieve files from /var/tmp prior to the system's removal, so place a notice up a day or so
beforehand if possible.

References:

1. "Storage for the network", Network Week, Vol4 No.31, 28th April 1999, pp. 25 to 29, by
Marshall Breeding.
2. SGI General Performance Comparisons:
3. http://www.futuretech.vuurwerk.nl/perfcomp.html
Detailed Notes for Day 3 (Part 3)
UNIX Fundamentals: Typical system administration tasks.

Even though the core features of a UNIX OS are handled automatically, there are still some jobs
for an admin to do. Some examples are given here, but not all will be relevant for a particular
network or system configuration.

Data Backup.

A fundamental aspect of managing any computer system, UNIX or otherwise, is the backup of
user and system data for possible retrieval purposes in the case of system failure, data corruption,
etc. Users depend on the admin to recover files that have been accidentally erased, or lost due to
hardware problems.

Backup Media.

Backup devices may be locally connected to a system, or remotely accessible across a network.
Typical backup media types include:

 1/4" cartridge tape, 8mm cartridge tape (used infrequently today)


 DAT (very common)
 DLT (where lots of data must be archived)
 Floptical, ZIP, JAZ, SyQuest (common for user-level backups)

Backup tapes, disks and other media should be well looked after in a secure location [3].

Backup Tools.

Software tools for archiving data include low-level format-independent tools such as dd, file and
directory oriented tools such as tar and cpio, filesystem-oriented tools such as bru, standard
UNIX utilities such as dump and restore (cannot be used with XFS filesystems - use xfsdump
and xfsrestore instead), etc., and high-level tools (normally commercial packages) such as IRIS
NetWorker. Some tools include a GUI frontend interface.

The most commonly used program is tar, which is also widely used for the distribution of
shareware and freeware software. Tar allows one to gather together a number of files and
directories into a single 'tar archive' file which by convention should always have a '.tar' suffix.
By specifying a device such as a DAT instead of an archive file, tar can thus be used to archive
data directly to a backup medium.
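
For example (the directory chosen is arbitrary):

tar cvf project.tar /home/pub/project     # create an archive
tar tvf project.tar                       # list its contents
tar xvf project.tar                       # extract the files again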

Tar files can also be compressed, usually with the .gz format (gzip and gunzip) though there are
other compression utilities (compress, pack, etc.) Backup and restoration speed can be improved
by compressing files before any archiving process commences. Some backup devices have built-
in hardware compression abilities. Note that files such as MPEG movies and JPEG images are
already in a compressed format, so compressing these prior to backup is pointless.
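
For instance, continuing the example above:

gzip project.tar          # produces project.tar.gz
gunzip project.tar.gz     # restores project.tar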

Straightforward networks and systems will almost always use a DAT drive as the backup device
and tar as the software tool. Typically, the 'cron' job scheduling system is used to execute a
backup at regular intervals, usually overnight. Cron is discussed in more detail below.

Backup Strategy.

Every UNIX guide will recommend the adoption of a 'backup strategy', ie. a combination of
hardware and software related management methods determined to be the most suitable for the
site in question.

A backup strategy should be rigidly adhered to once in place. Strict adherence allows an admin
to reliably assess whether lost or damaged data is recoverable when a problem arises.

Exactly how an admin performs backups depends upon the specifics of the site in question.
Regardless of the chosen strategy, at least two full sets of reasonably current backups should
always be maintained. Users should also be encouraged to make their own backups, especially
with respect to files which are changed and updated often.

What/When to Backup.

How often a backup is made depends on the system's frequency of use. For a system like the
Ve24 SGI network, a complete backup of user data every night, plus a backup of the server's
system disk once a week, is fairly typical. However, if a staff member decided to begin important
research with commercial implications on the system, I might decide that an additional backup at
noon each day should also be performed, or even hourly backups of just that person's account.

Usually, a backup archives all user or system data, but this may not be appropriate for some sites.
For example, an artist or animator may only care about their actual project files in their ~/Maya
project directory (Maya is a professional Animation/Rendering package) rather than the files
which define their user environment, etc. Thus, an admin might decide to only backup every
users' Maya projects directory. This would, for example, have the useful side effect of excluding
data such as the many files present in a user's .netscape/cache directory. In general though, all of
a user's account is archived.

If a change is to be made to a system, especially a server change, then separate backups should
be performed before and after the change, just in case anything goes wrong.

Since root file systems do not change very much, they can be backed up less frequently, eg. once
per week. An exception might be if the admin wishes to keep a reliable record of system access
logs which are part of the root file system, eg. those located in the files (for example):
/var/adm/SYSLOG
/var/netscape/suitespot/proxy-sysname-proxy/logs

The latter of the two would be relevant if a system had a Proxy server installed, ie. 'sysname'
would be the host name of the system. Backing up /usr and /var instead of the entire / root
directory is another option - the contents of /usr and /var change more often than many other
areas of the overall file system, eg. users' mail is stored in /var/mail and most executable
programs are under /usr.

In some cases, it isn't necessary to backup an entire root filesystem anyway. For example, the
Indys in Ve24 all have more or less identical installations: all Indys with a 549MB disk have the
same disk contents as each other, likewise for those with 2GB disks. The only exception is
Wolfen which uses IRIX 6.5 in order to provide proper support for an attached ZIP drive. Thus, a
backup of one of the client Indys need only concern specific key files such as /etc/hosts,
/etc/sys_id, /var/flexlm/license.dat, etc. However, this policy may not work too well for servers
(or even clients) because:

 an apparently small change, eg. adding a new user, installing a software patch, can affect
many files,
 the use of GUI-based backup tools does not aid an admin in remembering which files
have been archived.

For this reason, most admins will use tar, or a higher-level tool like xfsdump.

Note that because restoring data from a DAT device is slower than copying data directly from
disk to disk (especially modern UltraSCSI disks), an easier way to restore a client's system disk -
where all clients have identical disk contents - is to clone the disk from another client and then
alter the relevant files; this is what I do if a problem occurs.

Other backup devices can be much faster though [1], eg. DLT9000 tape streamer, or
military/industrial grade devices such as the DCRsi 240 Digital Cartridge Recording System
(30MB/sec) as was used to backup data during the development of the 777 aircraft, or the
Ampex DIS 820i Automated Cartridge Library (scalable from 25GB to 6.4TB max capacity,
80MB/sec sustained record rate, 800MB/sec search/read rate, 30 seconds maximum search time
for any file), or just a simple RAID backup which some sites may choose to use.

It's unusual to use another disk as a backup medium, but not unheard of. Theoretically, it's the
fastest possible backup medium, so if there's a spare disk available, why not? Some sites may
even have a 'mirror' system whereby a backup server B copies exactly the changes made to an
identical file system on the main server A; in the event of serious failure, server B can take over
immediately. SGI's commercial product for this is called IRIS FailSafe, with a switchover time
between A and B of less than a millisecond. Fail-safe server configurations like this are the
ultimate form of backup, ie. all files are being backed up in real-time, and the support hardware
has a backup too. Any safety-critical installation will probably use such methods.

Special power supplies might be important too, eg. a UPS (Uninterruptable Power Supply) which
gives some additional power for a few minutes to an hour or more after a power failure and
notifies the system to facilitate a safe shutdown, or a dedicated backup power generator could be
used, eg. hospitals, police/fire/ambulance, airtraffic control, etc.

Note: systems managed by more than one admin should be backed up more often; admin policies
should be consistent.

Incremental Backup.

This method involves only backing up files which have changed since the previous backup,
based on a particular schedule. An incremental schema offers the same degree of 'protection' as
an entire system backup and is faster since fewer files are archived each time, which means
faster restoration time too (fewer files to search through on a tape).

An example schedule is given in the online book, "IRIX Admin: Backup, Security, and
Accounting':

"An incremental scheme for a particular filesystem


looks something like this:

1. On the first day, back up the entire filesystem.


This is a monthly backup.

2. On the second through seventh days, back up only


the files that changed from the previous day.
These are daily backups.

3. On the eighth day, back up all the files that


changed the previous week. This is a weekly backup.

4. Repeat steps 2 and 3 for four weeks (about one month).

5. After four weeks (about a month), start over,


repeating steps 1 through 4.

You can recycle daily tapes every month, or whenever you feel safe
about doing so. You can keep the weekly tapes for a few months.
You should keep the monthly tapes for about one year before
recycling them."

Backup Using a Network Device.

It is possible to archive data to a remote backup medium by specifying the remote host name
along with the device name. For example, an ordinary backup to a locally attached DAT might
look like this:

tar cvf /dev/tape /home/pub

Or, if no other relevant device is present:

tar cv /home/pub

For a remote device, simply add the remote host name before the file/directory path:

tar cvf yoda:/dev/tape /home/pub

Note that if the tar command is trying to access a backup device which is not made by the source
vendor, then '/dev/tape' may not work. In such cases, an admin would have to use a suitable
lower-level device file, ie. one of the files in /dev/rmt - exactly which one can be determined by
deciding on the required functionality of the device, as explained in the relevant device manual,
along with the SCSI controller ID and SCSI device ID.

Sometimes a particular user account name may have to be supplied when accessing a remote
device, eg.:

tar cvf guest@yoda:/dev/tape /home/pub

This example wouldn't actually work on the Ve24 network since all guest accounts are locked
out for security reasons, except on Wolfen. However, an equivalent use of the above syntax can
be demonstrated using Wolfen's ZIP drive and the rcp (remote copy) command:

rcp -r /home/pub guest.guest1@wolfen:/zip

Though note that the above use of rcp would not retain file time/date creation/modification
information when copying the files to the ZIP disk (tar retains all information).

Automatic Backup With Cron.

The job scheduling system called cron can be used to automatically perform backups, eg.
overnight. However, such a method should not be relied upon - nothing is better than someone
manually executing/observing a backup, ensuring that the procedure worked properly, and
correctly labelling the tape afterwards.

If cron is used, a typical entry in the root cron jobs schedule file (/var/spool/cron/crontabs/root)
might look like this:

0 3 * * * /sbin/tar cf /dev/tape /home

This would execute a backup to a locally attached backup device at 3am every morning. Of
course, the admin would have to ensure a suitable media was loaded before leaving at the end of
each day.
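
For reference, root's schedule can be inspected and edited with the crontab command (cron
picks up any changes automatically):

crontab -l     # list the current schedule for the invoking user
crontab -e     # edit it using the editor named by $EDITOR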

This is a case where the '&&' operator can be useful: in order to ensure no subsequent operation
could alter the backed-up data, the 'eject' command could be employed thus:

0 3 * * * /sbin/tar cf /dev/tape /home && eject /dev/tape


Only after the tar command has finished will the backup media be ejected. Notice there is no 'v'
option in these tar commands (verbose mode). Why bother? Nobody will be around to see the
output. However, an admin could modify the command to record the output for later reading:

0 3 * * * /sbin/tar cvf /dev/tape /home > /var/tmp/tarlog && eject /dev/tape

Caring for Backup Media.

This is important, especially when an admin is responsible for backing up commercially
valuable, sensitive or confidential data.

Any admin will be familiar with the usual common-sense aspects of caring for any storage
medium, eg. keeping media away from strong magnetic fields, extremes of temperature and
humidity, etc., but there are many other factors too. The "IRIX Admin: Backup, Security, and
Accounting' guide contains a good summary of all relevant issues:

"Storage of Backups

Store your backup tapes carefully. Even if you create backups on more durable media,
such as optical disks, take care not to abuse them. Set the write protect switch on tapes
you plan to store as soon as a tape is written, but remember to unset it when you are ready
to overwrite a previously-used tape.

Do not subject backups to extremes of temperature and humidity, and keep tapes away
from strong electromagnetic fields. If there are a large number of workstations at your
site, you may wish to devote a special room to storing backups.

Store magnetic tapes, including 1/4 in. and 8 mm cartridges, upright. Do not store tapes
on their sides, as this can deform the tape material and cause the tapes to read incorrectly.

Make sure the media is clearly labeled and, if applicable, write-protected. Choose a label-
color scheme to identify such aspects of the backup as what system it is from, what level
of backup (complete versus partial), what filesystem, and so forth.

To minimize the impact of a disaster at your site, such as a fire, you may want to store
main copies of backups in a different building from the actual workstations. You have to
balance this practice, though, with the need to have backups handy for recovering files.

If backups contain sensitive data, take the appropriate security precautions, such as
placing them in a locked, secure room. Anyone can read a backup tape on a system that
has the appropriate utilities.

How Long to Keep Backups

You can keep backups as long as you think you need to. In practice, few sites keep
system backup tapes longer than about a year before recycling the tape for new backups.
Usually, data for specific purposes and projects is backed up at specific project
milestones (for example, when a project is started or finished).

As site administrator, you should consult with your users to determine how long to keep
filesystem backups.

With magnetic tapes, however, there are certain physical limitations. Tape gradually loses
its flux (magnetism) over time. After about two years, tape can start to lose data.

For long-term storage, re-copy magnetic tapes every year to year-and-a-half to prevent
data loss through deterioration. When possible, use checksum programs, such as the
sum(1) utility, to make sure data hasn't deteriorated or altered in the copying process. If
you want to reliably store data for several years, consider using optical disk.

Guidelines for Tape Reuse

You can reuse tapes, but with wear, the quality of a tape degrades. The more important
the data, the more precautions you should take, including using new tapes.

If a tape goes bad, mark it as "bad" and discard it. Write "bad" on the tape case before
you throw it out so that someone doesn't accidentally try to use it. Never try to reuse an
obviously bad tape. The cost of a new tape is minimal compared to the value of the data
you are storing on it."

Backup Performance.

Sometimes data archive/extraction speed may be important, eg. a system critical to a commercial
operation fails and needs restoring, or a backup/archive must be made before a deadline.

In these situations, it is highly advisable to use a fast backup medium, eg. DDS3 DAT instead of
DDS1 DAT.

For example, an earlier lecture described a situation where a fault in the Ve24 hub caused
unnecessary fault-hunting. As part of that process, I restored the server's system disk from a
backup tape. At the time, the backup device was a DDS1 DAT. Thus, to restore some 1.6GB of
data from a standard 2GB capacity DAT tape, I had to wait approximately six hours for the
restoration to complete (since the system was needed the next morning, I stayed behind well into
the night to complete the operation).
The next day, it was clear that using a DDS1 was highly inefficient and time-wasting, so a DDS3
DAT was purchased immediately. Thus, if the server ever has to be restored from DAT again,
and despite the fact it now has a larger disk (4GB with 2.5GB of data typically present), even a
full restoration would only take three hours instead of six (with 2.5GB used, the restoration
would finish in less than two hours). Tip: as explained in the lecture on hardware modifications
and installations, consider swapping a faster CPU into a system in order to speed up a backup or
restoration operation - it can make a significant difference [2].

Hints and Tips.

 Keep tape drives clean. Newer tapes deposit more dirt than old ones.
 Use du and df to check that the backup medium will have enough space to store the data. Consider
using data compression options if space on the media is at a premium (some devices may
have extra device files which include a 'c' in the device name to indicate it supports
hardware compression/decompression, eg. a DLT drive whose raw device file is
/dev/rmt/tps0d5vc). There is no point using compression options if the data being
archived is already compressed with pack, compress, gzip, etc. or is naturally compressed
anyway, eg. an MPEG movie, JPEG image, etc.
 Use good quality media. Do not use ordinary audio DAT tapes with DAT drives for
computer data backup; audio DAT tapes are of a lower quality than DAT tapes intended
for computer data storage.
 Consider using any available commands to check beforehand that a file system to be
backed up is not damaged or corrupted (eg. fsck). This will be more relevant to older file
system types and UNIX versions, eg. fsck is not relevant to XFS filesystems (IRIX 6.x
and later), but may be used with EFS file systems (IRIX 5.3 and earlier). Less important
when dealing with a small number of items.
 Label all backups, giving full details, eg. date, time, host name, backup command used
(so you or another admin will know how to extract the files later), general contents
description, and your name if the site has more than one admin with responsibility for
backup procedures.
 Verify a backup after it is made; some commands require specific options, while others
provide a means of listing the contents of the media, eg. the -t option used with tar.
 Write-protect the media after a backup has finished.
 Keep a tally on the media of how many times it has been used.
 Consider including an index file at the very start of the backup on the media, eg.:
 ls -AlhFR /home > /home/0000index && tar cv /home

Note: such index files can be large.

 Exploit colour code schemes to denote special attributes, eg. daily vs. weekly vs. monthly
tapes.
 Be aware of any special issues which may be relevant to the type of data being backed
up. For example, movie files can be very large; on SGIs, tar requires the K option in
order to archive files larger than 2GB. Use of this option may mean the archived media is
not compatible with another vendor's version of tar.
 Consult the online guides. Such guides often have a great deal of advice, examples, etc.

tar is a powerful command with a wide range of available options and is used on UNIX systems
worldwide. It is typical of the kind of UNIX command for which an admin is well advised to
read through the entire man page. Other commands in this category include find, rm, etc.

Note: if compatibility between different versions of UNIX is an issue, one can use the lower-
level dd command which allows one to specify more details about how the data is to be dealt
with as it is sent to or received from a backup device, eg. changing the block size of the data. A
related command is 'mt' which can be used to issue specific commands to a magnetic tape device,
eg. print device details and default block size.
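For example (block size, device names and options shown here are illustrative only - check the dd
and mt man pages for the system concerned):

# Send a tar archive through dd with an explicit 64KB block size:
tar cf - /home | dd of=/dev/tape bs=64k

# Read it back the same way:
dd if=/dev/tape bs=64k | tar xvf -

# Query the default tape device (the flag used to name a specific device
# varies between UNIX versions, eg. -f or -t):
mt status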

If problems occur during backup/restore operations, remember to check /var/adm/SYSLOG for
any relevant error messages (useful if one cannot be present to monitor the operation in person).

Restoring Data from Backup Media.

Restoring non-root-filesystem data is trivial: just use the relevant extraction tool, eg.:

tar xvf /dev/tape

However, restoring the root '/' partition usually requires access to an appropriate set of OS CD(s)
and a full system backup tape of the / partition. Further, many OSs may insist that backup and
restore operations at the system level must be performed with a particular tool, eg. Backup and
Restore. If particular tools were required but not used to create the backup, or if the system
cannot boot to a state where normal extraction tools can be used (eg. damage to the /usr section
of the filesystem) then a complete reinstallation of the OS must be done, followed by the
extraction of the backup media on top of the newly created filesystem using the original tool.

Alternatively, a fresh OS install can be done, then a second empty disk inserted on SCSI ID 2,
setup to be a root disk, the backup media extracted onto the second disk, then the volume header
copied over using dvhtool or another command relevant to the OS being used (this procedure is
similar to disk cloning). Finally, the disks are swapped so that the second disk is on SCSI ID
1 and the system is back to normal. I personally prefer this method since it's "cleaner", ie. one
can never be sure that extracting files on top of an existing file system will result in a final
filesystem that is genuinely identical to the original. By using a second disk in this way, the
psychological uncertainty is removed.

Just like backing up data to a remote device, data can be restored from a remote device as well.
An OS 'system recovery' menu will normally include an option to select such a restoration
method - a full host:/path specification is required.

Note that if a filesystem was archived with a leading / symbol, eg.:

tar cvf /dev/tape /home/pub/movies/misc


then an extraction may fail if an attempt is made to extract the files without changing the
equivalent extraction path, eg. if a student called cmpdw entered the following command with
such a tape while in their home directory:

tar xvf /dev/tape

then the command would fail since students cannot write to the top level of the /home directory.

Thus, the R option can be used (or equivalent option for other commands) to remove leading /
symbols so that files are extracted into the current directory, ie. if cmpdw entered:

tar xvfR /dev/tape

then tar would place the /home data from the tape into cmpdw's home directory, ie. cmpdw
would see a new directory with the name:

/home/students/cmpdw/home

Other Typical Daily Tasks.

From my own experience, these are the types of task which most admins will likely carry out
every day:

 Check disk usage across the system.


 Check system logs for important messages, eg. system errors and warnings, possible
suspected access attempts from remote systems (hackers), suspicious user activity, etc.
This applies to web server logs too (use script processing to ease analysis).
 Check root's email for relevant messages (eg. printers often send error messages to root in
the form of an email).
 Monitor system status, eg. all systems active and accessible (ping).
 Monitor system performance, eg. server load, CPU-hogging processes running in
background that have been left behind by a careless user, packet collision checks,
network bandwidth checks, etc.
 Ensure all necessary system services are operating correctly.
 Tour the facilities for general reasons, eg. food consumed in rooms where such activity is
prohibited, users who have left themselves logged in by mistake, a printer with a paper
jam that nobody bothered to report, etc. Users are notoriously bad at reporting physical
hardware problems - the usual response to a problem is to find an alternative
system/device and let someone else deal with it.
 Dealing with user problems, eg. "Somebody's changed my password!" (ie. the user has
forgotten their password). Admins should be accessible by users, eg. a public email
address, web feedback form, post box by the office, etc. Of course, a user can always
send an email to the root account, or to the admin's personal account, or simply visit the
admin in person. Some systems, like Indy, may have additional abilities, eg. video
conferencing: a user can use the InPerson software to request a live video/audio link to
the admin's system, allowing 2-way communication (see the inperson man page). Other
facilities such as the talk command can also be employed to contact the admin, eg. at a
remote site. It's up to the admin to decide how accessible she/he should be - discourage
trivial interruptions.
 Work on improving any relevant aspect of system, eg. security, services available to users
(software, hardware), system performance tuning, etc.
 Cleaning systems if they're dirty; a user will complain about a dirty monitor screen or
sticking mouse behaviour, but they'll never clean them for you. Best to prevent
complaints via regular maintenance. Consider other problem areas that may be hidden,
eg. blowing loose toner out of a printer with an air duster can.
 Learning more about UNIX in general.
 Taking necessary breaks! A tired admin will make mistakes.

This isn't a complete list, and some admins will doubtless have additional responsibilities, but the
above describes the usual daily events which define the way I manage the Ve24 network.
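By way of illustration, a minimal sketch of a script which automates a few of these routine checks
(host names, paths and the choice of checks are examples only and would need adapting to the local
site):

#!/bin/sh
# Simple daily-checks sketch: disk usage, host reachability, recent log messages.

echo "--- Disk usage ---"
df -k

echo "--- Host reachability ---"
for host in akira ash cameron chan sevrin spock
do
    ping -c 1 $host > /dev/null 2>&1 || echo "No response from $host"
done

echo "--- Recent SYSLOG errors/warnings ---"
egrep -i 'error|warning' /var/adm/SYSLOG | tail -20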

Useful file: /etc/motd

The contents of this file will be echoed to stdout whenever a user activates a login shell. Thus,
the message will be shown when:

 a user first logs in (contents in all visible shell windows),


 a user accesses another system using commands such as rlogin and telnet,
 a user creates a new console shell window; from the man page for console, "The console
provides the operator interface to the system. The operating system and system utility
programs display error messages on the system console."

The contents of /etc/motd are not displayed when the user creates a new shell using 'xterm', but is
displayed when winterm is used. The means by which xterm/winterm are executed are irrelevant
(icon, command, Toolchest, etc.)

The motd file can be used as a simple way to notify users of any developments. Be careful of
allowing its contents to become out of date though. Also note that the file is local to each system,
so maintaining a consistent motd between systems might be necessary, eg. a script to copy the
server's motd to all clients.
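For example, a minimal sketch of such a script (the host list is abbreviated, and it assumes root
has rcp access to each client):

#!/bin/sh
# Copy the server's /etc/motd to each client.
for host in akira ash cameron chan conan gibson
do
    rcp /etc/motd ${host}:/etc/motd || echo "motd copy to $host failed"
done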

Another possible way to inform users of noteworthy news is the xconfirm command, which could be
included within startup scripts, user setup files, etc. From the xconfirm man page:

"xconfirm displays a line of text for each -t argument specified (or a file when the -file
argument is used), and a button for each -b argument specified. When one of the buttons
is pressed, the label of that button is written to xconfirm's standard output. The enter key
activates the specified default button. This provides a means of communication/feedback
from within shell scripts and a means to display useful information to a user from an
application. Command line options are available to specify geometry, font style, frame
style, modality and one of five different icons to be presented for tailored visual feedback
to the user."

For example, xconfirm could be used to interactively warn the user if their disk quota has been
exceeded.
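As a sketch (the message text and button labels are arbitrary examples):

choice=`xconfirm -t "Your disk quota has been exceeded." \
        -t "Please remove or archive unwanted files." \
        -b "OK" -b "Remind me later"`
echo "The user chose: $choice"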

UNIX Fundamentals: System bootup and shutdown, events, daemons.

SGI's IRIX is based on System V with BSD enhancements. As such, the way an IRIX system
boots up is typical of many UNIX systems. Some interesting features of UNIX can be discovered
by investigating how the system starts up and shuts down.

After power on and initial hardware-level checks, the first major process to execute is the UNIX
kernel file /unix, though this doesn't show up in any process list as displayed by commands such
as ps.

The kernel then starts the init program to begin the bootup sequence, ie. init is the first visible
process to run on any UNIX system. One will always observe init with a process ID of 1:

% ps -ef | grep init | grep -v grep
root 1 0 0 21:01:57 ? 0:00 /etc/init

init is used to activate, or 'spawn', other processes. The /etc/inittab file is used to determine what
processes to spawn.

The lecture on shell scripts introduced the init command, in a situation where a system was made
to reboot using:

init 6

The number is called a 'run level', ie. a software configuration of the system under which only a
selected group of processes exist. Which processes correspond to which run level is defined in
the /etc/inittab file.

A system can be in any one of eight possible run levels: 0 to 6, s and S (the latter two are
identical). The states which most admins will be familiar with are 0 (total shutdown and power
off), 1 (enter system administration mode), 6 (reboot to default state) and S (or s) for 'single-user'
mode, a state commonly used for system administration. The /etc/inittab file contains an
'initdefault' state, ie. the run level to enter by default, which is normally 2, 3 or 4. 2 is the most
common, ie. the full multi-user state with all processes, daemons and services activated.
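For example, the default run level is set by a line of the following form in /etc/inittab (the
exact id field used may vary between systems):

is:2:initdefault: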

The /etc/inittab file is constructed so that any special initialisation operations, such as mounting
filesystems, are executed before users are allowed to access the system.

The init man page has a very detailed description of these first few steps of system bootup. Here
is a summary:
An initial console shell is created with which to begin spawning processes. The fact that a shell is
used this early in the boot cycle is a good indication of how closely related shells are to UNIX in
general.

The scripts which init uses to manage processes are stored in the /etc/init.d directory. During
bootup, the files in /etc/rc2.d are used to bring up system processes in the correct order (the
/etc/rc0.d directory is used for shutdown - more on that later). These files are actually links to the
equivalent script files in /etc/init.d.

The files in /etc/rc2.d (the 2 presumably corresponding to run level 2 by way of a naming
convention) all begin with S followed by two digits (S for 'Spawn' perhaps), causing them to be
executed in a specific order as determined by the first 3 characters of each file name (alphanumeric).
Thus, the first file run in the console shell is /etc/rc2.d/S00announce (a link to /etc/init.d/announce
- use 'more' or load this file into an editor to see what it does). init will run the script with
appropriate arguments depending on whether the procedure being followed is a startup or
shutdown, eg. 'start', 'stop', etc.

The /etc/config directory is used by each script in /etc/init.d to decide what it should do.
/etc/config contains files which correspond to files found in /etc/rc2.d with the same name. These
/etc/config files contain simply 'on' or 'off'. The chkconfig command is used to test the
appropriate file by each script, returning true or false depending on its contents and thus
determining whether the script does anything. An admin uses chkconfig to set the various files'
contents to on or off as desired, eg. to switch a system into stand-alone mode, turn off all
network-related services on the next reboot:

chkconfig network off
chkconfig nfs off
chkconfig yp off
chkconfig named off
init 6

Enter chkconfig on its own to see the current configuration states.

Lower-level functions are performed first, beginning with a SCSI driver check to ensure that the
system disk is going to be accessed correctly. Next, key file systems are mounted. Then the
following steps occur, IF the relevant /etc/config file contains 'on' for any step which depends on
that fact:

 A check to see if any system crash files are present (core dumps) and if so to send a
message to stdout.
 Display company trademark information if present; set the system name.
 Begin system activity reporting daemons.
 Create a new OS kernel if any system changes have been made which require it (this is
done by testing whether or not any of the files in /var/sysgen are newer than the /unix
kernel file).
 Configure and activate network ports.
 etc.
Further services/systems/tasks to be activated if need be include ip-aliasing, system auditing,
web servers, license server daemons, core dump manager, swap file configuration, mail daemon,
removal of /tmp files, printer daemon, higher-level web servers such as Netscape Administration
Server, cron, PPP, device file checks, and various end-user and application daemons such as the
midi sound daemon which controls midi library access requests.

This isn't a complete list, and servers will likely have more items to deal with than clients, eg.
starting up DNS, NIS, security & auditing daemons, quotas, internet routing daemons, and more
than likely a time daemon to serve as a common source of current time for all clients.

It should be clear that the least important services are executed last - these usually concern user-
related or application-related daemons, eg. AppleTalk, Performance Co-Pilot, X Windows
Display Manager, NetWare, etc.

Even though a server or client may initiate many background daemon processes on bootup,
during normal system operation almost all of them are doing nothing at all. A process which isn't
doing anything is said to be 'idle'. Enter:

ps -ef

The 'C' column shows the activity level of each process. No matter when one checks, almost all
the C entries will be zero. UNIX background daemons only use CPU time when they have to, ie.
they remain idle until called for. This allows a process which truly needs CPU cycles to make
maximum use of available CPU time.

The scripts in /etc/init.d may startup other services if necessary as well. Extra
configuration/script files are often found in /etc/config in the form of a file called
servicename.options, where 'servicename' is the name of the normal script run by init.

Note: the 'verbose' file in /etc/config is used by scripts to dynamically redefine whether the echo
command is used to output progress messages. Each script checks whether verbose mode is on
using the chkconfig command; if on, then a variable called $ECHO is set to 'echo'; if off,
$ECHO is set to something which is interpreted by a shell to mean "ignore everything that
follows this symbol", so setting verbose mode to off means every echo command in every script
(which uses the $ECHO test and set procedure) will produce no output at all - a simple, elegant
and clean way of controlling system behaviour.
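As an illustration, a minimal sketch of this idiom (the real IRIX scripts differ in detail, and the
path to chkconfig is assumed here):

if /sbin/chkconfig verbose; then
        ECHO=echo
else
        ECHO=:          # ':' is the shell no-op, so all messages are discarded
fi

$ECHO "Starting example daemon."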

When shutting a system down, the behaviour described above is basically just reversed. Scripts
contained in the /etc/rc0.d directory perform the necessary actions, with the name prefixes
determining execution order. Once again, the first three characters of each file name decide the
alphanumeric order in which to execute the scripts; 'K' probably stands for 'Kill'. The files in
/etc/rc0.d shutdown user/application-related daemons first, eg. the MIDI daemon. Comparing the
contents of /etc/rc2.d and /etc/rc0.d, it can be seen that their contents are mirror images of each
other.

The alphanumeric prefixes used for the /etc/rc*.d directories are defined in such a way as to
allow extra scripts to be included in those directories, or rather links to relevant scripts in
/etc/init.d. Thus, a custom 'static route' (to force a client to always route externally via a fixed
route) can be defined by creating new links from /etc/rc2.d/S31network and
/etc/rc0.d/K39network to a custom file called network.local in /etc/init.d.
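A sketch of how such links might be created, following the naming used above:

ln -s /etc/init.d/network.local /etc/rc2.d/S31network
ln -s /etc/init.d/network.local /etc/rc0.d/K39network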

There are many numerical gaps amongst the files, allowing for great expansion in the number of
scripts which can be added in the future.

References:

1. Extreme Technologies:
2. http://www.futuretech.vuurwerk.nl/extreme.html
3. DDS1 vs. DDS3 DAT Performance Tests:
4. http://www.futuretech.vuurwerk.nl/perfcomp.html#DAT1
5. http://www.futuretech.vuurwerk.nl/perfcomp.html#DAT2
6. http://www.futuretech.vuurwerk.nl/perfcomp.html#DAT3
7. http://www.futuretech.vuurwerk.nl/perfcomp.html#DAT4
8. "Success With DDS Media", Hewlett Packard, Edition 1, February 1991.
Detailed Notes for Day 3 (Part 4)
UNIX Fundamentals: Security and Access Control.

General Security.

Any computer system must be secure, whether it's connected to the Internet or not. Some issues
may be irrelevant for Intranets (isolated networks which may or may not use Internet-style
technologies), but security is still important for any internal network, if only to protect against
employee grievances or accidental damage. Crucially, a system should not be expanded to
include external network connections until internal security has been dealt with, and individual
systems should not be added to a network until they have been properly configured (unless the
changes are of a type which cannot be made until the system is physically connected).

However, security is not an issue which can ever be finalised; one must constantly maintain an
up-to-date understanding of relevant issues and monitor the system using the various available
tools such as 'last' (display recent logins; there are many other available tools and commands).

In older UNIX variants, security mostly involved configuring the contents of various
system/service setup files. Today, many UNIX OSs offer the admin a GUI-frontend security
manager to deal with security issues in a more structured way. In the case of SGI's IRIX, version
6.5 has such a GUI tool, but 6.2 does not. The GUI tool is really just a convenient way of
gathering together all the relevant issues concerning security in a form that is easier to deal with
(ie. less need to look through man pages, online books, etc.) The security issues themselves are
still the same.

UNIX systems have a number of built-in security features which offer a reasonably acceptable
level of security without the need to install any additional software. UNIX gives users a great
deal of flexibility in how they manage and share their files and data; such convenience may be
incompatible with an ideal site security policy, so decisions often have to be taken about how
secure a system is going to be - the more secure a system is, the less flexible for users it
becomes.

Older versions of any UNIX variant will always be less secure than newer ones. If possible, an
admin should always try and use the latest version in order to obtain the best possible default
security. For example, versions of IRIX as old as 5.3 (circa 1994) had some areas of subtle
system functionality rather open by default (eg. some feature or service turned on), whereas
versions later than 6.0 turned off the features to improve the security of a default installation -
UNIX vendors began making these changes in order to comply with the more rigorous standards
demanded by the Internet age.

Standard UNIX security features include:

1. File ownership,
2. File permissions,
3. System activity monitoring tools, eg. who, ps, log files,
4. Encryption-based, password-protected user accounts,
5. An encryption program (crypt) which any user can exploit.

Figure 60. Standard UNIX security features.

All except the last item above have already been discussed in previous lectures.

The 'crypt' command can be used by the admin and users to encrypt data, using an encryption
key supplied as an argument. Crypt employs an encryption schema based on similar ideas used in
the German 'Enigma' machine in WWII, although crypt's implementation of the mathematical
equivalent is much more complex, like having a much bigger and more sophisticated Enigma
machine. Crypt is a satisfactorily secure program; the man page says, "Methods of attack on such
machines are known, but not widely; moreover the amount of work required is likely to be
large."

However, since crypt requires the key to be supplied as an argument, commands such as ps could
be used by others to observe the command in operation, and hence the key. This is crypt's only
weakness. See the crypt man page for full details on how crypt is used.
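For example (the key shown is purely illustrative, and as noted above it would be visible to ps
while crypt runs):

crypt Xk37pq < notes.txt > notes.enc       # encrypt
crypt Xk37pq < notes.enc > notes.clear     # decrypt with the same key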

Responsibility.

Though an admin has to implement security policies and monitor the system, ordinary users are
no less responsible for ensuring system security in those areas where they have influence and can
make a difference. Besides managing their passwords carefully, users should control the
availability of their data using appropriate read, write and execute file permissions, and be aware
of the security issues surrounding areas such as accessing the Internet.

Security is not just software and system files though. Physical aspects of the system are also
important and should be noted by users as well as the admin.

Thus:

 Any item not secured with a lock, cable, etc. can be removed by anyone who has physical
access.
 Backups should be securely stored.
 Consider the use of video surveillance equipment and some form of metal-key/key-
card/numeric-code entry system for important areas.
 Account passwords enable actions performed on the system to be traced. All accounts should
have passwords. Badly chosen passwords, and old passwords, can compromise security. An
admin should consider using password-cracking software to ensure that poorly chosen
passwords are not in use.
 Group permissions for files should be set appropriately (user, group, others).
 Guest accounts can be used anonymously; if a guest account is necessary, the tasks which can
be carried out when logged in as guest should be restricted. Having open guest accounts on
multiple systems which do not have common ordinary accounts is unwise - it allows users to
anonymously exchange data between such systems when their normal accounts would not
allow them to do so. Accounts such as guest can be useful, but they should be used with care,
especially if they are left with no password.
 Unused accounts should be locked out, or backed up and removed.
 If a staff member leaves the organisation, passwords should be changed to ensure such former
users do not retain access.
 Sensitive data should not be kept on systems with more open access such as anonymous ftp and
modem dialup accounts.
 Use of the su command amongst users should be discouraged. Its use may be legitimate, but it
encourages lax security (ordinary users have to exchange passwords in order to use su). Monitor
the /var/adm/sulog file for any suspicious use of su.
 Ensure that key files owned by a user are writeable only by that user, thus preventing 'trojan
horse' attacks. This also applies to root-owned files/dirs, eg. /, /bin, /usr/bin, /etc, /var, and so
on. Use find and other tools to locate directories that are globally writeable - if such a directory
is a user's home directory, consider contacting the user for further details as to why their home
directory has been left so open. For added security, use an account-creation schema which sets
users' home directories to not be readable by groups or others by default.
 Instruct users not to leave logged-in terminals unattended. The xlock command is available to
secure an unattended workstation but its use for long periods may be regarded as inconsiderate
by other users who are not able to use the terminal, leading to the temptation of rebooting the
machine, perhaps causing the logged-in user to lose data.
 Only vendor-supplied software should be fully trusted. Commercial 3rd-party software should
be ok as long as one has confidence in the supplier, but shareware or freeware software must
be treated with care, especially if such software is in the form of precompiled ready-to-run
binaries (precompiled non-vendor software might contain malicious code). Software distributed
in source code form is safer, but caution is still required, especially if executables have to be
owned by root and installed using the set-UID feature in order to run. Set-UID and set-GID
programs have legitimate uses, but because they are potentially harmful, their presence on a
system should be minimised. The find command can be used to locate such files (a sketch follows
Figure 61), while older file system types (eg. EFS) can be searched with commands such as ncheck.
 Network hardware can be physically tapped to eavesdrop on network traffic. If security must be
particularly tight, keep important network hardware secure (eg. locked cupboard) and regularly
check other network items (cables, etc.) for any sign of attack. Consider using specially secure
areas for certain hardware items, and make it easy to examine cabling if possible (keep an up-to-
date printed map to aid checks). Fibre-optic cables are harder to interfere with, eg. FDDI.
Consider using video surveillance technologies in such situations.
 Espionage and sabotage are issues which some admins may have to be aware of, especially
where commercially sensitive or government/police-related work data is being manipulated.
Simple example: could someone see a monitor screen through a window using a telescope?
What about RF radiation? Remote scanners can pickup stray monitor emissions, so consider
appropriate RF shielding (Faraday Cage). What about insecure phone lines? Could someone,
even an ordinary user, attach a modem to a system and dial out, or allow someone else to dial
in?
 Keep up-to-date with security issues; monitor security-related sites such as www.rootshell.com,
UKERNA, JANET, CERT, etc. [7]. Follow any extra advice given in vendor-specific security FAQ
files (usually posted to relevant 'announce' or 'misc' newsgroups, eg. comp.sys.sgi.misc). Most
UNIX vendors also have an anonymous ftp site from which customers can obtain security
patches and other related information. Consider joining any specialised mailing lists that may be
available.
 If necessary tasks are beyond one's experience and capabilities, consider employing a vendor-
recommended external security consultancy team.
 Exploit any special features of the UNIX system being used, eg. at night, an Indy's digital camera
could be used to send single frames twice a second across the network to a remote system for
subsequent compression, time-stamping and recording. NB: this is a real example which SGI
once helped a customer to do in order to catch some memory thieves.

Figure 61. Aspects of a system relevant to security.
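For example, a sketch of the kind of find searches referred to above (run as root; the output can
be lengthy on large filesystems):

# Locate set-UID and set-GID executables anywhere on the system:
find / -type f \( -perm -4000 -o -perm -2000 \) -print

# Locate directories writeable by 'others':
find / -type d -perm -002 -print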

Since basic security on UNIX systems relies primarily on login accounts, passwords, file
ownership and file permissions, proper administration and adequate education of users is
normally sufficient to provide adequate security for most sites. Lapses in security are usually
caused by human error, or improper use of system security features. Extra security actions such
as commercial security-related software are not worth considering if even basic features are not
used or are compromised via incompetence.

An admin can alter the way in which failed login attempts are dealt with by configuring the
/etc/default/login file. There are many possibilities and options - see the 'login' reference page for
details (man login). For example, an effective way to enhance security is to make repeated
guessing of account passwords an increasingly slow process by penalising further login attempts
with ever increasing delays between login failures. Note that GUI-based login systems may not
support features such as this, though one can always deactivate them via an appropriate
chkconfig command.

Most UNIX vendors offer the use of hardware-level PROM passwords to provide an extra level
of security, ie. a password is required from any user who attempts to gain access to the low-level
hardware PROM-based 'Command Monitor', giving greater control over who can carry out
admin-level actions. While PROM passwords cannot prevent physical theft (eg. someone
stealing a disk and accessing its data by installing it as an option drive on another system), they
do limit the ability of malicious users to boot a system using their own program or device (a
common flaw with Mac systems), or otherwise harm the system at its lowest level. If the PROM
password has been forgotten, the root user can reset it. If both are lost, then one will usually have
to resort to setting a special jumper on the system motherboard, or temporarily removing the
PROM chip altogether (the loss of power to the chip resets the password).

Shadow Passwords

If the /etc/passwd file can be read by users, then there is scope for users to take a copy away to
be brute-force tested with password-cracking software. The solution is to use a shadow password
file called /etc/shadow - this is a copy of the ordinary password file (/etc/passwd) which cannot
be accessed by non-root users. When in use, the password fields in /etc/passwd are replaced with
an 'x'. All the usual password-related programs work in the same way as before, though shadow
passwords are dealt with in a different way for systems using NIS (this is because NIS keeps all
password data for ordinary users in a different file called /etc/passwd.nis). Users won't notice any
difference when shadow passwords are in use, except that they won't be able to see the encrypted
form of their password anymore.

The use of shadow passwords is activated simply by running the 'pwconv' program (see the man
page for details). Shadow passwords are in effect as soon as this command has been executed.

Password Ageing.

An admin can force passwords to age automatically, ensuring that users must set a new password
at desired intervals, or no earlier than a certain interval, or even immediately. The passwd
command is used to control the various available options. Note that NIS does not support
password ageing.
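For example (the user name is hypothetical, and the options follow the usual System V convention -
check the local passwd man page for the exact flags):

passwd -x 60 -n 7 jbloggs     # jbloggs must change password at least every 60 days,
                              # but no more often than every 7 days
passwd -f jbloggs             # force a password change at the next login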

Choosing Passwords.

Words from the dictionary should not be used, nor should obvious items such as film characters
and titles, names of relatives, car number plates, etc. Passwords should include obscure
characters, digits and punctuation marks. Consider using and mixing words from other
languages, eg. Finnish, Russian, etc.

An admin should not use the same root password for more than one system, unless there is good
reason.

When a new account is created, a password should be set there and then. If the user is not
immediately present, a default password such as 'password' might be used in the expectation that
the user will log in immediately and change it to something more suitable. An admin should
lock out the account if the password isn't changed after some duration: replace the password entry
for the user concerned in the /etc/passwd file with anything that contains at least one character
that is not used by the encryption schema, eg. '*'.

Modern UNIX systems often include a minimum password length and may insist on certain rules
about what a password can be, eg. at least one digit.

Network Security.

As with other areas of security, GUI tools may be available for controlling network-related
security issues, especially those concerning the Internet. Since GUI tools may vary between
different UNIX OSs, this discussion deals mainly with the command line tools and related files.

Reminder: there is little point in tightening network security if local security has not yet been
dealt with, or is lax.

Apart from the /etc/passwd file, the other important files which control network behaviour are:
/etc/hosts.equiv    A list of trusted hosts.

.rhosts             A list of hosts that are allowed access to a specific user account.

Figure 62. Files relevant to network behaviour.

These three files determine whether a host will accept an access request from programs such as
rlogin, rcp, rsh, or rdist. Both hosts.equiv and .rhosts have reference pages (use 'man hosts.equiv'
and 'man rhosts').

Suppose a user on host A attempts to access a remote host B. As long as the hosts.equiv file on B
contains the host name of A, and B's /etc/passwd lists A's user ID as a valid account, then no
further checks occur and the access is granted (all successful logins are recorded in
/var/adm/SYSLOG). The hosts.equiv file used by the Ve24 Indys contains the following:

localhost
yoda.comp.uclan.ac.uk
akira.comp.uclan.ac.uk
ash.comp.uclan.ac.uk
cameron.comp.uclan.ac.uk
chan.comp.uclan.ac.uk
conan.comp.uclan.ac.uk
gibson.comp.uclan.ac.uk
indiana.comp.uclan.ac.uk
leon.comp.uclan.ac.uk
merlin.comp.uclan.ac.uk
nikita.comp.uclan.ac.uk
ridley.comp.uclan.ac.uk
sevrin.comp.uclan.ac.uk
solo.comp.uclan.ac.uk
spock.comp.uclan.ac.uk
stanley.comp.uclan.ac.uk
warlock.comp.uclan.ac.uk
wolfen.comp.uclan.ac.uk
woo.comp.uclan.ac.uk
milamber.comp.uclan.ac.uk

Figure 63. hosts.equiv files used by Ve24 Indys.

Thus, once logged into one of the Indys, a user can rlogin directly to any of the other Indys
without having to enter their password again, and can execute rsh commands, etc. A staff
member logged into Yoda can login into any of the Ve24 Indys too (students cannot do this).

The hosts.equiv files on Yoda and Milamber are completely different, containing only references
to each other as needed. Yoda's hosts.equiv file contains:

localhost
milamber.comp.uclan.ac.uk

Figure 64. hosts.equiv file for yoda.

Thus, Yoda trusts Milamber. However, Milamber's hosts.equiv only contains:

localhost

Figure 65. hosts.equiv file for milamber.

ie. Milamber doesn't trust Yoda, the rationale being that even if Yoda's root security is
compromised, logging in to Milamber as root is blocked. Hence, even if a hack attack damaged
the server and Ve24 clients, I would still have at least one fully functional secure machine with
which to tackle the problem upon its discovery.

Users can extend the functionality of hosts.equiv by using a .rhosts file in their home directory,
enabling or disabling access based on host names, group names and specific user account names.

The root login only uses the /.rhosts file if one is present - /etc/hosts.equiv is ignored.

NOTE: an entry for root in /.rhosts on a local system allows root users on a remote system to
gain local root access. Thus, including the root name in /.rhosts is unwise. Instead, file transfers
can be more securely dealt with using ftp via a guest account, or through an NFS-mounted
directory. An admin should be very selective as to the entries included in root's .rhosts file.

A user's .rhosts file must be owned by either the user or root. If it is owned by anyone else, or if
the file permissions are such that it is writeable by someone else, then the system ignores the
contents of the user's .rhosts file by default.

An admin may decide it's better to bar the use of .rhosts files completely, perhaps because an
external network of unknown security status is connected. The .rhosts files can be barred by
adding a -l option to the rshd line in /etc/inetd.conf (use 'man rshd' for further details).
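For example, the modified entry in /etc/inetd.conf might look something like this (the path to
rshd is an assumption based on the other IRIX daemons kept in /usr/etc; check the existing entry
in the local file):

shell stream tcp nowait root /usr/etc/rshd rshd -l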

Thus, the relationship between the 20 different machines which form the SGI network I run is as
follows:

 All the Indys in Ve24 trust each other, as well as Yoda and Milamber.
 Yoda only trusts Milamber.
 Milamber doesn't trust any system.

With respect to choosing root passwords, I decided to use the following configuration:

 All Ve24 systems have the same root password and the same PROM password.
 Yoda and Milamber have their own separate passwords, distinct from all others.

This design has two deliberate consequences:

 Ordinary users have flexible access between the Indys in Ve24,


 If the root account of any of the Ve24 Indys is compromised, the unauthorised user will not be
able to gain access to Yoda or Milamber as root. However, the use of NFS compromises such a
schema since, for example, a root user on a Ve24 Indy could easily alter any files in /home,
/var/mail, /usr/share and /mapleson.

With respect to the use of identical root and PROM passwords on the Ve24 machines: because Internet
access (via a proxy server) has recently been setup for users, I will probably change the schema in order
to hinder brute force attacks.

The /etc/passwd File and NIS.

The NIS service enables users to login to a client by including the following entry as the last line
in the client's /etc/passwd file:

+::0:0:::

Figure 66. Additional line in /etc/passwd enabling NIS.

For simplicity, a + on its own can be used. I prefer to use the longer version so that if I want to
make changes, the fields to change are immediately visible.

If a user logs on with an account ID which is not listed in the /etc/passwd file as a local account,
then such an entry at the end of the file instructs the system to try and get the account
information from the NIS server, ie. Yoda. Since Yoda and Milamber do not include this extra
line in /etc/passwd, students cannot login to them with their own ID anyway, no matter the
contents of .rhosts and hosts.equiv.

inetd and inetd.conf

inetd is the 'Internet Super-server'. inetd listens for requests for network services, executing the
appropriate program for each request.

inetd is started on bootup by the /etc/init.d/network script (called by the /etc/rc2.d/S30network
link via the init process). It reads its configuration information from /etc/inetd.conf.

By using a super-daemon in this way, a single daemon is able to invoke other daemons when
necessary, reducing system load and using resources such as memory more efficiently.

The /etc/inetd.conf file controls how various network services are configured, eg. logging
options, debugging modes, service restrictions, the use of the bootp protocol for remote OS
installation, etc. An admin can control services and logging behaviour by customising this file. A
reference page is available with complete information ('man inetd').
Services communicate using 'port' numbers, rather like separate channels on a CB radio.
Blocking the use of certain port numbers is a simple way of preventing a particular service from
being used. Network/Internet services and their associated port numbers are contained in the
/etc/services database. An admin can use the 'fuser' command to identify which processes are
currently using a particular port, eg. to see the current use of TCP port 25:

fuser 25/tcp

On Yoda, an output similar to the following would be given:

yoda # fuser 25/tcp
25/tcp: 855o
yoda # ps -ef | grep 855 | grep -v grep
root 855 1 0 Apr 27 ? 5:01 /usr/lib/sendmail -bd -q15m

Figure 67. Typical output from fuser.

Insert (a quick example of typical information hunting): an admin wants to do the same on the
ftp port, but can't remember the port number. Solution: use grep to find the port number from
/etc/services:

yoda 25# grep ftp /etc/services
ftp-data 20/tcp
ftp 21/tcp
tftp 69/udp
sftp 115/tcp
yoda 26# fuser 21/tcp
21/tcp: 255o
yoda 28# ps -ef | grep 255 | grep -v grep
root 255 1 0 Apr 27 ? 0:04 /usr/etc/inetd
senslm 857 255 0 Apr 27 ? 11:44 fam
root 11582 255 1 09:49:57 pts/1 0:01 rlogind

An important aspect of the inetd.conf file is the user name field which determines which user ID
each process runs under. Changing this field to a less privileged ID (eg. nobody) enables system
service processes to be given lower access permissions than root, which may be useful for further
enhancing security. Notice that services such as http (the WWW) are normally already set to run
as nobody. Proxy servers should also run as nobody, otherwise http requests may be able to
retrieve files such as /etc/passwd (however, some systems may have the nobody user defined so
that it cannot run programs, so another user may have to be used - an admin can make one up).

Another common modification made to inetd.conf in order to improve security is to restrict the
use of the finger command, eg. with -S to prevent login status, home directory and shell
information from being given out. Or more commonly the -f option is used which forces any
finger request to just return the contents of a file, eg. yoda's entry for the finger service looks like
this:

finger stream tcp nowait guest /usr/etc/fingerd fingerd -f /etc/fingerd.message
Figure 68. Blocking the use of finger in the /etc/inetd.conf file.

Thus, any remote user who executes a finger request to yoda is given a brief message [3].

If changes are made to the inetd.conf file, then inetd must be notified of the changes, either by
rebooting the system or via the following command (which doesn't require a reboot afterwards):

killall -HUP inetd

Figure 69. Instructing inetd to restart itself (using killall).

In general, a local trusted network is less likely to require a highly restricted set of services, ie.
modifying inetd.conf becomes more important when connecting to external networks, especially
the Internet. Thus, an admin should be aware that creating a very secure inetd.conf file on an
isolated network or Intranet may be unduly harsh on ordinary users.

X11 Windows Network Access

The X Windows system is a window system available for a wide variety of different computer
platforms which use bitmap displays [8]. Its development is managed by the X Consortium, Inc.
On SGI IRIX systems, the X Windows server daemon is called 'Xsgi' and conforms to Release 6
of the X11 standard (X11R6).

The X server, Xsgi, manages the flow of user/application input and output requests to/from client
programs using a number of interprocess communication links. The xdm daemon acts as the
display manager. Usually, user programs are running on the same host as the X server, but X
Windows also supports the display of client programs which are actually running on remote
hosts, even systems using completely different OSs and hardware platforms, ie. X is network-
transparent.

The X man page says:

"X supports overlapping hierarchical subwindows and text and


graphics operations, on both monochrome and color displays."

One unique side effect of this is that access to application mouse menus is independent of
application focus, requiring only a single mouse click for such actions. For example, suppose
two application windows are visible on screen:

 a jot editor session containing an unsaved file (eg. /etc/passwd.nis),


 a shell window which is partially obscuring the jot window.

With the shell window selected, the admin is about to run /var/yp/ypmake to reparse the password
database file, but realises the file isn't saved. Moving the mouse over the partially hidden jot window,
the admin holds down the right mouse button: this brings up jot's right-button menu (which may or may
not be partly ontop of the shell window even though the jot window is at the back) from which the
admin clicks on 'Save'; the menu disappears, the file is saved, but the shell window is still on top of the
jot window, ie. their relative front/back positions haven't changed during the operation.

The ability of X to process screen events independently of which application window is currently
in focus is a surprisingly useful time-saving feature. Every time a user does an action like this, at
least one extraneous mouse click is prevented; this can be shown by comparing to MS Windows
interfaces:

 Under Win95 and Win98, trying to access an application's right-button menu when the
application's window is currently not in focus requires at least two extraneous mouse clicks: the
first click brings the application in focus (ie. to the front), the second brings up the menu, and a
third (perhaps more if the original application window is now completely hidden) brings the
original application window back to the front and in focus. Thus, X is at least 66% more efficient
for carrying out this action compared to Win95/Win98.
 Under WindowsNT, attempting the same action requires at least one extraneous mouse click:
the first click brings the application in focus and reveals the menu, and a second (perhaps more,
etc.) brings the original application window back to the front and in focus. Thus, X is at least 50%
more efficient for carrying out this action compared to NT.

The same effect can be seen when accessing middle-mouse menus or actions under X, eg. text can be
highlighted and pasted to an application with the middle-mouse button even when that application is
not in focus and not at the front. This is a classic example of how much more advanced X is over
Microsoft's GUI interface technologies, even though X is now quite old. X also works in a way which links
to graphics libraries such as OpenGL.

Note that most UNIX-based hardware platforms use video frame buffer configurations which
allow a large number of windows to be present without causing colour map swapping or other
side effects, ie. the ability to have multiple overlapping windows is a feature supported in
hardware, eg. Indigo2 [6].

X is a widely used system, with emulators available for systems which don't normally use X, eg.
Windows Exceed for PCs.

Under the X Window System, users can run programs transparently on remote hosts that are part
of the local network, and can even run applications on remote hosts across the Internet with the
windows displayed locally if all the various necessary access permissions have been correctly set
at both ends. An 'X Display Variable' is used to denote which host the application should attempt
to display its windows on. Thus, assuming a connection with a remote host to which one had
authorised telnet access (eg. haarlem.vuurwerk.nl), from a local host whose domain name is
properly visible on the Internet (eg. thunder.uclan.ac.uk), then the local display of applications
running on the remote host is enabled with a command such as:

haarlem% setenv DISPLAY thunder.uclan.ac.uk:0.0


I've successfully used this method while at Heriot Watt to run an xedit editor on a remote system
in England but with the xedit window itself displayed on the monitor attached to the system I
was physically using in Scotland.

The kind of inter-system access made possible by X has nothing to do with login accounts,
passwords, etc. and is instead controlled via the X protocols. The 'X' man page has full details,
but note: the man page for X is quite large.

A user can utilise the xhost command to control access to their X display. eg. 'xhost -' bars access
from all users, while 'xhost +harry' gives X access to the user harry.

Note that system-level commands and files which relate to xhost and X in general are stored in
/var/X11/xdm.

Firewalls [4].

A firewall is a means by which a local network of trusted hosts can be connected to an external
untrusted network, such as the Internet, in a more secure manner than would otherwise be the
case. 'Firewall' is a conceptual idea which refers to a combination of hardware and software steps
taken to setup a desired level of security; although an admin can setup a firewall via basic steps
with as-supplied tools, all modern systems have commercial packages available to aid in the task
of setting up a firewall environment, eg. Gauntlet for IRIX systems.

As with other security measures, there is a tradeoff between ease of monitoring/administration,
the degree of security required, and the wishes/needs of users. A drawback of firewalls is when a
user has a legitimate need to access packets which are filtered out - an alternative is to have each
host on the local network configured according to a strict security regime.

The simplest form of a firewall is a host with more than one network interface, called a dual-
homed host [9]. Such hosts effectively exist on two networks at once. By configuring such a host
in an appropriate manner, it acts as a controllable obstruction between the local and external
network, eg. the Internet.
A firewall does not affect the communications between hosts on an internal network; only the
way in which the internal network interacts with the external connection is affected. Also, the
presence of a firewall should not be used as an excuse for having less restrictive security
measures on the internal network.

One might at first think that Yoda could be described as a firewall, but it is not, for a variety of
reasons. Ideally, a firewall host should be treated thus:

 no ordinary user accounts (root admin only, with a different password),


 as few services as possible (the more services are permitted, the greater is the chance of a
security hole; newer, less-tested software is more likely to be at risk) and definitely no NIS or
NFS,
 constantly monitored for access attempts and unusual changes in files, directories and software
(commands: w, ps, 'versions changed', etc.),
 log files regularly checked (and not stored on the firewall host!),
 no unnecessary applications,
 no anonymous ftp!

Yoda breaks several of these guidelines, so it cannot be regarded as a firewall, even though a range of
significant security measures are in place. Ideally, an extra host should be used, eg. an Indy (additional
Ethernet card required to provide the second Ethernet port), or a further server such as Challenge S. A
simple system like Indy is sufficient though, or other UNIX system such as an HP, Sun, Dec, etc. - a Linux
PC should not be used though since Linux has too many security holes in its present form. [1]

Services can be restricted by making changes to files such as /etc/inetd.conf, /etc/services, and
others. Monitoring can be aided via the use of free security-related packages such as COPS - this
package can also check for bad file permission settings, poorly chosen passwords, system setup
file integrity, root security settings, and many other things. COPS can be downloaded from:
ftp://ftp.cert.org/pub/tools/cops

Monitoring a firewall host is also a prime candidate for using scripts to automate the monitoring
process.

Other free tools include Tripwire, a file and directory integrity checker:

ftp://ftp.cert.org/pub/tools/tripwire

With Tripwire, files are monitored and compared to information stored in a database. If files
change when they're supposed to remain static according to the database, the differences are
logged and flagged for attention. If used regularly, eg. via cron, action can be taken immediately
if something happens such as a hacking attempt.
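For example, a hypothetical root crontab entry (the install path and report file are assumptions):

0 4 * * * /usr/local/bin/tripwire > /var/tmp/tripwire.report 2>&1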

Firewall environments often include a router - a high speed packet filtering machine installed
either privately or by the ISP providing the external connection. Usually, a router is installed
inbetween a dual-homed host and the outside world [9]. This is how yoda is connected, via a
router whose address is 193.61.250.33, then through a second router at 193.61.250.65 before
finally reaching the JANET gateway at Manchester.

Routers are not very flexible (eg. no support for application-level access restriction systems such
as proxy servers), but their packet-filtering abilities do provide a degree of security, eg. the router
at 193.61.250.33 only accepts packets on the 193.61.250.* address space.

However, because routers can block packet types, ports, etc., it is possible to be overly restrictive
with their use, eg. Yoda cannot receive USENET packets because they're blocked by the router.
In such a scenario, users must resort to WWW-based news services (eg. DejaNews), which
are obviously less secure than running and managing a locally controlled USENET server, as
well as being more wasteful of network resources.

Accessing sites on the web poses similar security problems to downloading and using Internet-
sourced software, ie. the source is untrusted unless vendor-verified with checksums, etc. When a
user accesses a site and attempts to retrieve data, what happens next cannot be predicted, eg. a
malicious executable program could be downloaded (this is unlikely to damage root-owned files,
but users could lose data if they're not careful). Users should be educated on these issues, eg.
turning off JavaScript features and disallowing cookies if necessary.
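Where software is downloaded, and a vendor or archive site publishes checksums for its files, the
values can be recomputed locally and compared before anything is installed or run; the standard sum
command is one way of doing this (the filename below is just an example):

sum downloaded_package.tar.gz

If the checksum and block count reported do not match the published figures, the file should be
treated as suspect and discarded.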

If web access is of particular concern with regard to security, one solution is to restrict web
access to just a limited number of internal hosts.

Anonymous ftp.

An anonymous FTP account allows a site to make information available to anyone, while still
maintaining control over access issues. Users can login to an anonymous FTP account as
'anonymous' or 'ftp'. The 'chroot' command is used to put the user in the home directory for
anonymous ftp access (~ftp), preventing access to other parts of the filesystem. A firewall host
should definitely not have an anonymous FTP account. A site should not provide such a service
unless absolutely necessary, but if it does then an understanding of how the anonymous FTP
access system works is essential to ensuring site security, eg. preventing outside agents from
using the site as a transfer point for pirated software. How an anon FTP account is used should
be regularly monitored.

Details of how to setup an anon FTP account can usually be found in a vendor's online
information; for IRIX, the relevant source is the section entitled, "Setting Up an Anonymous
FTP Account" in chapter three of the, "IRIX Admin: Networking and Mail" guide.

UNIX Fundamentals: Internet access: files and services. Email.

For most users, the Internet means the World Wide Web ('http' service), but this is just one
service out of many, and was in fact a very late addition to the Internet as a whole. Before the
advent of the web, Internet users were familiar with and used a wide range of services, including:

 telnet (interactive login sessions on remote hosts),



 ftp (file/data transfer using continuous connections),

 tftp (file/data transfer using temporary connections)

 NNTP (Internet newsgroups, ie. USENET)

 SMTP (email)

 gopher (remote host data searching and retrieval system)

 archie (another data-retrieval system)

 finger (probe remote site for user/account information)

 DNS (Domain Name Service)

Exactly which services users can use is a decision best made by consultation, though some users
may have a genuine need for particular services, eg. many public database systems on sites such
as NASA are accessed by telnet only.

Disallowing a service automatically improves security, but the main drawback will always be a
less flexible system from a user's point of view, ie. a balance must be struck between the need for
security and the needs of users. However, such discussions may be irrelevant if existing site
policies already state what is permitted, eg. UCLAN's campus network has no USENET service,
so users exploit suitable external services such as DejaNews [2].

For the majority of admins, the most important Internet service which should be appropriately
configured with respect to security is the web, especially considering today's prevalence of Java,
JavaScript, and browser cookie files. It is all too easy for a modern web user to give out a
surprising amount of information about the system they're using without ever knowing it.
Features such as cookies and Java allow a browser to send a substantial amount of information to
a remote host about the user's environment (machine type, OS, browser type and version, etc.).
There are sites on the web which an admin can use to test how secure a user's browser
environment is: such a site displays as much information as it can extract using all available
methods, so if it can report very little or nothing in return, that is a sign of good security
with respect to user-side web issues.

There are many good web server software systems available, eg. Apache. Some even come free,
or are designed for local Intranet use on each host. However, for enhanced security, a site should
use a professional suite of web server software such as Netscape Enterprise Server; these
packages come with more advanced control mechanisms and security management features, the
configuration of which is controlled by GUI-based front-end servers, eg. Netscape
Administration Server. Similarly, lightweight proxy servers are available, but a site should use a
professional solution, eg. Netscape Proxy Server. The GUI administration of web server software
makes it much easier for an admin to configure security issues such as access and service
restrictions, permitted data types, blocked sites, logging settings, etc.

Example: after the proxy server on the SGI network was installed, I noticed that users of the
campus-wide PC network were using Yoda as a proxy server, which would give them a faster
service than the University's proxy server. A proxy server which is accessible in this way is said
to be 'open'. Since all accesses from the campus PCs appear in the web logs as if they originate
from the Novix security system (ie. there is no indication of individual workstation or user), any
illegal activity would be untraceable. Thus, I decided to prevent campus PCs from using Yoda as
a proxy. The mechanism employed to achieve this was the ipfilterd program, which I had heard
of before but not used.

ipfilterd is a network packet-filtering daemon which screens all incoming IP packets based on
source/destination IP address, physical network interface, IP protocol number, source/destination
TCP/UDP port number, required service type (eg. ftp, telnet, etc.) or a combination of these. Up
to 1000 filters can be used. To improve efficiency, a configurable memory caching mechanism is
used to retain recently decided filter verdicts for a specified duration.

ipfilterd operates by using a searchable database of packet-filtering clauses stored in the
/etc/ipfilterd.conf file. Each incoming packet is compared with the filters in the file one at a time
until a match is found; if no match occurs, the packet is rejected by default. Since filtering is a
line-by-line database search process, the order in which filters are listed is important, eg. a reject
clause to exclude a particular source IP address from Ethernet port ec0 would have no effect if an
accept clause was earlier in the file that accepted all IP data from ec0, ie. in this case, the reject
should be listed before the accept. IP addresses may be specified in hex, dot format (eg.
193.61.255.4 - see the man page for 'inet'), host name or fully-qualified host name.

With IRIX 6.2, ipfilterd is not installed by default. After consulting with SGI to identify the
appropriate source CD, the software was installed, /etc/ipfilterd.conf defined, and the system
activated with:

chkconfig -f ipfilterd on
reboot

Since there was no ipfilterd on/off flag file in /etc/config by default, the -f forces the creation of
such a file with the given state.
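The current state of the flag can be confirmed at any time by listing the chkconfig flags, eg.:

chkconfig | grep ipfilterd

which should report the flag as 'on' once the above has been done.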

Filters in the /etc/ipfilterd.conf file consist of a keyword and an expression denoting the type of
filter to be used; available keywords are:

 accept Accept all packets matching this filter



 reject Discard all packets matching this filter (silently)

 grab Grab all packets matching this filter

 define Define a new macro

ipfilterd supports macros, with no limit to the number of macros used.

Yoda's /etc/ipfilterd.conf file looks like this:


#
# ipfilterd.conf
# $Revision: 1.3 $
#
# Configuration file for ipfilterd(1M) IP layer packet filtering.
# Lines that begin with # are comments and are ignored.
# Lines begin with a keyword, followed either by a macro definition or
# by an optional interface filter, which may be followed by a protocol
# filter.
# Both macros and filters use SGI's netsnoop(1M) filter syntax.
#
# The currently supported keywords are:
# accept : accept all packets matching this filter
# reject : silently discard packets matching this filter
# define : define a new macro to add to the standard netsnoop macros
#
# See the ipfilterd(1M) man page for examples of filters and macros.
#
# The network administrator may find the following macros useful:
#
define ip.netAsrc (src&0xff000000)=$1
define ip.netAdst (dst&0xff000000)=$1
define ip.netBsrc (src&0xffff0000)=$1
define ip.netBdst (dst&0xffff0000)=$1
define ip.netCsrc (src&0xffffff00)=$1
define ip.netCdst (dst&0xffffff00)=$1
define ip.notnetAsrc not((src&0xff000000)=$1)
define ip.notnetAdst not((dst&0xff000000)=$1)
define ip.notnetBsrc not((src&0xffff0000)=$1)
define ip.notnetBdst not((dst&0xffff0000)=$1)
define ip.notnetCsrc not((src&0xffffff00)=$1)
define ip.notnetCdst not((dst&0xffffff00)=$1)
#
# Additional macros:
#
# Filters follow:
#
accept -i ec0
reject -i ec3 ip.src 193.61.255.21 ip.dst 193.61.250.34
reject -i ec3 ip.src 193.61.255.22 ip.dst 193.61.250.34
accept -i ec3

Any packet coming from an SGI network machine is immediately accepted (traffic on the ec0
network interface). The web logs contained two different source IP addresses for accesses
coming from the campus PC network. These are rejected first if detected; a final accept clause is
then included so that all other types of packet are accepted.

The current contents of Yoda's ipfilterd.conf file do mean that campus PC users will not be
able to access Yoda as a web server either, ie. requests to www.comp.uclan.ac.uk by legitimate
users will be blocked too. Thus, the above contents of the file are experimental. Further
refinement is required so that accesses to Yoda's web pages are accepted, while requests which
try to use Yoda as a proxy to access non-UCLAN sites are rejected. This can be done by using
the ipfilterd-expression equivalent of the following if/then C-style statement:
if ((source IP is campus PC) and (destination IP is not Yoda)) then
reject packet;
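One possible (untested) way of expressing this, following the macro style used in the file above,
would be to define a 'destination is not X' macro and use it in the reject clauses before the final
accept, eg.:

define ip.notdst not(dst=$1)

reject -i ec3 ip.src 193.61.255.21 ip.notdst 193.61.250.34
reject -i ec3 ip.src 193.61.255.22 ip.notdst 193.61.250.34
accept -i ec3

This is only a sketch of the intended logic (the define would sit with the other macros, and the
reject lines would replace the existing ones); the exact expression syntax would need to be checked
against the netsnoop and ipfilterd man pages and tested before being relied upon.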

Using ipfilterd has system resource implications. Filter verdicts stored in the ipfilterd cache by
the kernel take up memory; if the cache size is increased, more memory is used. A longer cache
and/or a larger number of filters means a greater processing overhead before each packet is dealt
with. Thus, for busy networks, a faster processor may be required to handle the extra load, and
perhaps more RAM if an admin increases the ipfilterd kernel cache size. In order to monitor such
issues and make decisions about resource implications as a result of using ipfilterd, the daemon
can be executed with the -d option which causes extra logging information about each filter to be
added to /var/adm/SYSLOG, ie. an /etc/config/ipfilterd.options file should be created, containing
'-d'.
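Creating the options file is a one-line job, eg.:

echo "-d" > /etc/config/ipfilterd.options

after which the daemon must be restarted (or the system rebooted) for the new option to take effect.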

As well as using programs like 'top' and 'ps' to monitor CPU loading and memory usage, log files
should be monitored to ensure they do not become too large, wasting disk space (the same
applies to any kind of log file). System logs are 'rotated' automatically to prevent this from
happening, but other logs created by 3rd-party software usually are not; such log files are not
normally stored in /var/adm either. For example, the proxy server logs are in this directory:

/var/netscape/suitespot/proxy-sysname-proxy/logs

If an admin wishes to retain the contents of older system logs such as /var/adm/oSYSLOG, then
the log file could be copied to a safe location at regular intervals, eg. once per night (the old log
file could then be emptied to save space).
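A minimal sketch of such a nightly job might be (the archive directory is an assumption):

#!/bin/sh
# Hypothetical log archiving sketch: keep a dated copy of the old
# system log, then empty the original to save space.
mkdir -p /var/adm/oldlogs
cp /var/adm/oSYSLOG /var/adm/oldlogs/oSYSLOG.`date +%d%m%y`
cat /dev/null > /var/adm/oSYSLOG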

A wise policy would be to create scripts which process the logs, summarising the data in a more
intuitive form. General shell script methods and programs such as grep can be used for this.
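For example, a single pipeline of this kind gives a quick summary of which client addresses are using
the proxy most heavily (this assumes the access log's first field is the client address and that the
log file is called 'access' - both depend on the server configuration):

awk '{print $1}' /var/netscape/suitespot/proxy-sysname-proxy/logs/access | sort | uniq -c | sort -rn | head -20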

The above is just one example of the kind of problem, and its knock-on consequences, that admins
come up against when managing a system:

 The first problem was how to give SGI network users Internet access, the solution to which was
a proxy server. Unfortunately, this allowed campus-PC users to exploit Yoda as an open proxy,
so ipfilterd was then employed to prevent such unauthorised use.

Thus, as stated in the introduction, managing system security is an ongoing, dynamic process.

Another example problem: in 1998, I noticed that some students were not using the SGIs (or not
asking if they could) because they thought the machines were turned off, ie. the monitor power-
saving feature would blank out the screen after some duration. I decided to alter the way the
Ve24 Indys behaved so that monitor power-saving would be deactivated during the day, but
would still happen overnight.

The solution I found was to modify the /var/X11/xdm/Xlogin file. This file contains a section
controlling monitor power-saving using the xset command, which normally looks like this:
#if [ -x /usr/bin/X11/xset ] ; then
# /usr/bin/X11/xset s 600 3600
#fi

If these lines are uncommented (the hash symbols removed), a system whose monitor supports
power-saving will tell the monitor to power down after ten minutes of inactivity following the last
user logging out. With the lines still commented out, modern SGI monitors use power-saving by default
anyway.

I created two new files in /var/X11/xdm:

-rwxr-xr-x 1 root sys 1358 Oct 28 1998 Xlogin.powersaveoff*


-rwxr-xr-x 1 root sys 1361 Oct 28 1998 Xlogin.powersaveon*

They are identical except for the section concerning power-saving. Xlogin.powersaveoff
contains:

if [ -x /usr/bin/X11/xset ] ; then
/usr/bin/X11/xset s 0 0
fi

while Xlogin.powersaveon contains:

#if [ -x /usr/bin/X11/xset ] ; then


# /usr/bin/X11/xset s 0 0
#fi

The two '0' parameters supplied to xset in the Xlogin.powersaveoff file have a special effect (see
the xset man page for full details): the monitor is instructed to disable all power-saving features.
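The effect can be tested interactively before editing the Xlogin files, eg. from a shell in an X
session:

/usr/bin/X11/xset s 0 0
/usr/bin/X11/xset q

The second command reports the current screen saver settings so the change can be confirmed.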

The cron system is used to switch between the two files when no one is present: every night at
9pm and every morning at 8am, followed by a reboot after the copy operation is complete. The
entries from the file /var/spool/cron/crontabs/cron on any of the Ve24 Indys are thus:

# Alternate monitor power-saving. Turn it on at 9pm. Turn it off at 8am.
0 21 * * * /bin/cp /var/X11/xdm/Xlogin.powersaveon /var/X11/xdm/Xlogin && init 6&
#
0 8 * * * /bin/cp /var/X11/xdm/Xlogin.powersaveoff /var/X11/xdm/Xlogin && init 6&

Hence, during the day, the SGI monitors are always on with the login logo/prompt visible -
students can see the Indys are active and available for use; during the night, the monitors turn
themselves off due to the new xset settings. The times at which the Xlogin changes are made
were chosen so as to occur when other cron jobs would not be running. Students use the Indys
each day without ever noticing the change, unless they happen to be around at the right time to
see the peculiar sight of 18 Indys all rebooting at once.

Static Routes.

A simple way to enable packets from clients to be forwarded through an external connection is
via the use of a 'static route'. A file called /etc/init.d/network.local is created with a simple script
that adds a routing definition to the current routing database, thus enabling packets to be
forwarded to their destination. To ensure the script is executed on bootup or shutdown, extra
links are added to the /etc/rc0.d and /etc/rc2.d directories (the following commands need only be
executed once as root):

ln -s /etc/init.d/network.local /etc/rc0.d/K39network
ln -s /etc/init.d/network.local /etc/rc2.d/S31network

Yoda once had a modem link to 'Demon Internet' for Internet access. A static route was used to
allow SGI network clients to access the Internet via the link. The contents of
/etc/init.d/network.local (supplied by SGI) was:

#!/sbin/sh
#Tag 0x00000f00
IS_ON=/sbin/chkconfig

case "$1" in
'start')
        if $IS_ON network; then
                /usr/etc/route add default 193.61.252.1 1
        fi
        ;;

'stop')
        /usr/etc/route delete default 193.61.252.1
        ;;

*)
        echo "usage: $0 {start|stop}"
        ;;
esac

Note the use of chkconfig to ensure that a static route is only installed on bootup if the network is
defined as active.
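Once the script has run, the new route can be confirmed with netstat, eg.:

netstat -rn

which lists the routing table in numeric form; the 'default' entry should show the gateway address
used above (193.61.252.1).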

The other main files for controlling Internet access are /etc/services and /etc/inetd.conf. These
were discussed earlier.

Internet Access Policy.

Those sites which choose to allow Internet access will probably want to minimise the degree to
which someone outside the site can access internal services. For example, users may be able to
telnet to remote hosts from a company workstation, but should the user be able to successfully
telnet to that workstation from home in order to continue working? Such an ability would
obviously be very useful to users, and indeed administrators, but there are security implications
which may be prohibitive.
For example, students who have accounts on the SGI network cannot login to Yoda because the
/etc/passwd file contains /dev/null as their default shell, ie. they can't login because their account
'presence' on Yoda itself does not have a valid shell - another cunning use of /dev/null. The
/etc/passwd.nis file has the main user account database, so users can logon to the machines in
Ve24 as desired. Thus, with the use of /dev/null in the password file's shell field, students cannot
login to Yoda via telnet from outside UCLAN. Staff accounts on the SGI network do not have
/dev/null in the shell field, so staff can indeed login to Yoda via telnet from a remote host.
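For illustration, the relevant difference is simply the final (shell) field of the password file
entries; the two lines below are invented examples, not real accounts:

jbloggs:x:2001:50:J Bloggs (student):/home/stu/jbloggs:/dev/null
asmith:x:1001:50:A Smith (staff):/home/staff/asmith:/bin/csh

A student connecting via telnet is authenticated but then has no valid shell to run, so the session
terminates immediately; the staff entry behaves normally.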

Ideally, I'd like students to be able to telnet to a Ve24 machine from a remote host, but this is not
yet possible for reasons explained in Appendix A (detailed notes for Day 2 Part 1).

There are a number of Internet sites which are useful sources of information on Internet issues,
some relating to specific areas such as newsgroups. In fact, USENET is an excellent source of
information and advice on dealing with system management, partly because of pre-prepared FAQ
files, but also because of the many experts who read and post to the newsgroups. Even if site
policy means users can't access USENET, an admin should exploit the service to obtain relevant
admin information.

A list of some useful reference sites is given in Appendix C.

Example Questions:

1. The positions of the 'accept ec0' and 'reject' lines in /etc/ipfilterd.conf could be swapped around
without affecting the filtering logic. So why is the ec0 line listed first? The 'netstat -i' command
(executed on Yoda) may be useful here.
2. What would an appropriate ipfilterd.conf filter (or filters) look like which blocked unauthorised
use of Yoda as a proxy to connect to an external site but still allowed access to Yoda's own web
pages via www.comp.uclan.ac.uk? Hint: the netsnoop command may be useful.

Course summary.

This course has focused on what an admin needs to know in order to run a UNIX system. SGI
systems running IRIX 6.2 have been used as an example UNIX platform, with occasional
mention of IRIX 6.5 as an example of how OSs evolve.

Admins are, of course, ordinary users too, though they often do not use the same set of
applications that other users do. Though an admin needs to know things an ordinary user does
not, occasionally users should be made aware of certain issues, eg. web browser cookie files,
choosing appropriate passwords etc.

Like any modern OS, UNIX has a vast range of features and services. This course has not by any
means covered them all (that would be impossible to do in just three days, or even thirty).
Instead, the basic things a typical admin needs to know have been introduced, especially the
techniques used to find information when needed, and how to exploit the useful features of
UNIX for daily administration.

Whatever flavour of UNIX an admin has to manage, a great many issues are always the same,
eg. security, Internet concepts, etc. Thus, an admin should consider purchasing relevant reference
books to aid in the learning process. When writing shell scripts, knowledge of the C
programming language is useful; since UNIX is the OS being used, a C programming book
(mentioned earlier) which any admin will find particularly useful is:

"C Programming in a UNIX Environment"

Judy Kay & Bob Kummerfeld, Addison Wesley Publishing, 1989.


ISBN: 0 201 12912 4

For further information on UNIX or related issues, read/post to relevant newsgroups using
DejaNews; example newsgroups are given in Appendix D.

Background Notes:

1. UNIX OSs like IRIX can be purchased in a form that passes the US Department of Defence's
Trusted-B1 security regulations (eg. 'Trusted IRIX'), whereas Linux doesn't come anywhere near
such rigorous security standards as yet. The only UNIX OS (and in fact the only OS of any kind)
which passes all of the US DoD's toughest security regulations is Unicos, made by Cray Research
(a subsidiary of SGI). Unicos and IRIX will be merged sometime in the future, creating the first
widely available commercial UNIX OS that is extremely secure - essential for fields such as
banking, local and national government, military, police (and other emergency/crime services),
health, research, telecoms, etc.

References:

2. DejaNews USENET Newsgroups, Reading/Posting service:

http://www.dejanews.com/

4. "Firewalls: Where there's smoke...", Network Week, Vol4, No. 12, 2nd December
1998, pp. 33 to 37.

5. Gauntlet 3.2 for IRIX Internet Firewall Software:

http://www.sgi.com/solutions/internet/products/gauntlet/

6. Framebuffer and Clipping Planes, Indigo2 Technical Report, SGI, 1994:


http://www.futuretech.vuurwerk.nl/i2sec4.html#4.3
http://www.futuretech.vuurwerk.nl/i2sec5.html#5.6.3

7. Useful security-related web sites:

UKERNA: http://www.ukerna.ac.uk/
JANET: http://www.ja.net/
CERT: http://www.cert.org/
RootShell: http://www.rootshell.com/
2600: http://www.2600.com/mindex.html

8. "About the X Window System", part of X11.org:

http://www.X11.org/wm/index.shtml

9. Images are from the online book, "IRIX Admin: Backup, Security, and Accounting.",
Chapter 5.

Appendix B:

3. Contents of /etc/fingerd.message:

Sorry, the finger service is not available from this host.

However, thankyou for your interest in the Department of
Computing at the University of Central Lancashire.

For more information, please see:

http://www.uclan.ac.uk/
http://www.uclan.ac.uk/facs/destech/compute/comphom.htm

Or contact Ian Mapleson at mapleson@gamers.org

Regards,

Ian.

Senior Technician,
Department of Computing,
University of Central Lancashire,
Preston,
England,
PR1 2HE.

mapleson@gamers.org
Tel: (+44 -0) 1772 893297
Fax: (+44 -0) 1772 892913

Doom Help Service (DHS): http://doomgate.gamers.org/dhs/


SGI/Future Technology/N64: http://sgi.webguide.nl/
BSc Dissertation (Doom): http://doomgate.gamers.org/dhs/diss/

Appendix C:

Example web sites useful to administrators:

AltaVista: http://altavista.digital.com/cgi-bin/query?pg=aq
Webcrawler: http://webcrawler.com/
Lycos: http://www.lycos.com/
Yahoo: http://www.yahoo.com/
DejaNews: http://www.dejanews.com/
SGI Support: http://www.sgi.com/support/
SGI Tech/Advice Center: http://www.futuretech.vuurwerk.nl/sgi.html
X Windows: http://www.x11.org/
Linux Home Page: http://www.linux.org/
UNIXHelp for Users: http://unixhelp.ed.ac.uk/
Hacker Security Update: http://www.securityupdate.com/
UnixVsNT: http://www.unix-vs-nt.org/
RootShell: http://www.rootshell.com/
UNIX System Admin (SunOS): http://sunos-wks.acs.ohio-state.edu/sysadm_course/html/sysadm-1.html

Appendix D:

Example newsgroups useful to administrators:

comp.security.unix
comp.unix.admin
comp.sys.sgi.admin
comp.sys.sun.admin
comp.sys.next.sysadmin
comp.unix.aix
comp.unix.cray
comp.unix.misc
comp.unix.questions
comp.unix.shell
comp.unix.solaris
comp.unix.ultrix
comp.unix.wizards
comp.unix.xenix.misc
comp.sources.unix
comp.unix.bsd.misc
comp.unix.sco.misc
comp.unix.unixware.misc
comp.sys.hp.hpux
comp.unix.sys5.misc
comp.infosystems.www.misc
Detailed Notes for Day 3 (Part 5)
Project: Indy/Indy attack/defense (IRIX 5.3 vs. IRIX 6.5)

The aim of this practical session, which lasts two hours, is to give some experience of how an
admin typically uses a UNIX system to investigate a problem, locate information, construct and
finally implement a solution. The example problem used will likely require:

 the use of online information (man pages, online books, release notes, etc.),
 writing scripts and exploiting shell script methods as desired,
 the use of a wide variety of UNIX commands,
 identifying and exploiting important files/directories,

and so on. A time limit on the task is included to provide some pressure, as often happens in
real-world situations.

The problem situation is a simulated hacker attack/defense. Two SGI Indys are directly
connected together with an Ethernet cable; one Indy, referred to here as Indy X, is using an older
version of IRIX called IRIX 5.3 (1995), while the other (Indy Y) is using a much newer version,
namely IRIX 6.5 (1998).

Students will be split into two groups (A and B) of 3 or 4 persons each. For the first hour, group
A is placed with Indy X, while group B is with Indy Y. For the second hour, the situation is
reversed. Essentially, each group must try to hack the other group's system, locate and steal some
key information (described below), and finally cripple the enemy machine. However, since both
groups are doing this, each group must also defend against attack. Whether a group focuses on
attack or defense, or a mixture of both, is for the group's members to decide during the
preparatory stage.

The first hour is dealt with as follows:

 For the first 35 minutes, each group uses the online information and any available notes
to form a plan of action. During this time, the Ethernet cable between Indys X and Y
is not connected, and separate 'Research' Indys are used for this investigative stage in
order to prevent any kind of preparatory measures. Printers will be available if printouts
are desired.
 After a short break of 5 minutes to prepare/test the connection between the two Indys and
move the groups to Indys X and Y, the action begins. Each group must try to hack into
the other group's Indy, exploiting any suspected weaknesses, whilst also defending
against the other group's attack. In addition, the hidden data must be found, retrieved, and
the enemy copy erased. The end goal is to shutdown the enemy system after retrieving
the hidden data. How the shutdown is effected is entirely up to the group members.
At the end of the hour, the groups are reversed so that group B will now use an Indy running
IRIX 5.3, while group A will use an Indy running IRIX 6.5. The purpose of this second attempt
is to demonstrate how an OS evolves and changes over time with respect to security and OS
features, especially in terms of default settings, online help, etc.

Indy Specifications.

Both systems will have default installations of the respective OS version, with only minor
changes to files so that they are aware of each other's existence (/etc/hosts, and so on).

All systems will have identical hardware (133MHz R4600PC CPU, 64MB RAM, etc.) except for
disk space: Indys with IRIX 6.5 will use 2GB disks, while Indys with IRIX 5.3 will use 549MB
disks. Neither system will have any patches installed from any vendor CD updates.

The hidden data which must be located and stolen from the enemy machine by each group is the
Blender V1.57 animation and rendering archive file for IRIX 6.2:

blender1.57_SGI_6.2_iris.tar.gz

Size: 1228770 bytes.

For a particular Indy, the file will be placed in an appropriate directory in the file system, the
precise location of which will only be made known to the group using that Indy - how an
attacking group locates the file is up to the attackers to decide.

It is expected that groups will complete the task ahead of schedule; any spare time will be used
for a discussion of relevant issues:

 Reliability of relying on default settings for security, etc.


 How to detect hacking in progress, especially if an unauthorised person is carrying out
actions as root.
 Whose responsibility is it to ensure security? The admin or the user?
 If a hacker is 'caught', what kind of evidence would be required to secure a conviction?
How reliable is the evidence?

END OF COURSE.
Figure Index for Detailed Notes.

Day 1:

Figure 1. A typical root directory shown by 'ls'.


Figure 2. The root directory shown by 'ls -F /'.
Figure 3. Important directories visible in the root directory.
Figure 4. Key files for the novice administrator.
Figure 5. Output from 'man -f file'.
Figure 6. Hidden files shown with 'ls -a /'.
Figure 7. Manipulating an NFS-mounted file system with 'mount'.
Figure 8. The various available shells.
Figure 9. The commands used most often by any user.
Figure 10. Editor commands.
Figure 11. The next most commonly used commands.
Figure 12. File system manipulation commands.
Figure 13. System Information and Process Management Commands.
Figure 14. Software Management Commands.
Figure 15. Application Development Commands.
Figure 16. Online Information Commands (all available from the
'Toolchest')
Figure 17. Remote Access Commands.
Figure 18. Using chown to change both user ID and group ID.
Figure 19. Handing over file ownership using chown.

Day 2:

Figure 20. IP Address Classes: bit field and width allocations.


Figure 21. IP Address Classes: supported network types and sizes.
Figure 22. The contents of the /etc/hosts file used on the SGI network.
Figure 23. Yoda's /etc/named.boot file.
Figure 24. The example named.boot file in /var/named/Examples.
Figure 25. A typical find command.
Figure 26. Using cat to quickly create a simple shell script.
Figure 27. Using echo to create a simple one-line shell script.
Figure 28. An echo sequence without quote marks.
Figure 29. The command fails due to * being treated as a
Figure 30. Using a backslash to avoid confusing the shell.
Figure 31. Using find with the -exec option to execute rm.
Figure 32. Using find with the -exec option to execute ls.
Figure 33. Redirecting the output from find to a file.
Figure 34. A simple script with two lines.
Figure 35. The simple rebootlab script.
Figure 36. The simple remountmapleson script.
Figure 37. The daily tasks of an admin.
Figure 38. Using df without options.
Figure 39. The -k option with df to show data in K.
Figure 40. Using df to report usage for the file
Figure 41. Using du to report usage for several directories/files.
Figure 42. Restricting du to a single directory.
Figure 43. Forcing du to ignore symbolic links.
Figure 44. Typical output from the ps command.
Figure 45. Filtering ps output with grep.
Figure 46. top shows a continuously updated output.
Figure 47. The IRIX 6.5 version of top, giving extra information.
Figure 48. System information from osview.
Figure 49. CPU information from osview.
Figure 50. Memory information from osview.
Figure 51. Network information from osview.
Figure 51. Miscellaneous information from osview.
Figure 52. Results from ttcp between two hosts on a 10Mbit network.
Figure 53. The output from netstat.
Figure 54. Example use of the ping command.
Figure 55. The output from rup.
Figure 56. The output from uptime.
Figure 57. The output from w showing current user activity.
Figure 58. Obtaining full domain addresses from w with the -W option.
Figure 59. The output from rusers, showing who is logged on where.

Day 3:

Figure 60. Standard UNIX security features.


Figure 61. Aspects of a system relevant to security.
Figure 62. Files relevant to network behaviour.
Figure 63. hosts.equiv files used by Ve24 Indys.
Figure 64. hosts.equiv file for yoda.
Figure 65. hosts.equiv file for milamber.
Figure 66. Additional line in /etc/passwd enabling NIS.
Figure 67. Typical output from fuser.
Figure 68. Blocking the use of finger in the /etc/inetd.conf file.
Figure 69. Instructing inetd to restart itself (using killall).
UNIX Administration Course
Day 1:
Part 1: Introduction to the course. Introduction to UNIX. History
of UNIX and key features. Comparison with other OSs.

Part 2: The basics: files, UNIX shells, editors, commands.


Regular Expressions and Metacharacters in Shells.

Part 3: File ownership and access permissions.


Online help (man pages, etc.)

Day 2:
Part 1: System identity (system name, IP address, etc.)
Software: vendor, commercial, shareware, freeware (eg. GNU).
Hardware features: auto-detection, etc.
UNIX Characteristics: integration, stability,
reliability, security, scalability, performance.

Part 2: Shell scripts.

Part 3: System monitoring tools and tasks.

Part 4: Further shell scripts. Application development tools:


compilers, debuggers, GUI toolkits, high-level APIs.

Day 3:
Part 1: Installing an OS and software: inst, swmgr.
OS updates, patches, management issues.

Part 2: Organising a network with a server. NFS. Quotas.


Installing/removing internal/external hardware.
SGI OS/software/hardware installation. Network setup.

Part 3: Daily system administration tasks, eg. data backup.


System bootup and shutdown, events, daemons.

Part 4: Security/Access control: the law, firewalls, ftp.


Internet access: relevant files and services.
Course summary.

Part 5: Exploring administration issues, security, hacking,


responsibility, end-user support, the law (discussion).
Indy/Indy attack/defense using IRIX 5.3 vs. IRIX 6.5
(two groups of 3 or 4 each).

Figures
Day 1:
Part 1: Introduction to the course. Introduction to UNIX. History
Of UNIX and key features. Comparison with other OSs.

Introduction to UNIX and the Course.

The UNIX operating system (OS) is widely used around the world, eg.

 The backbone of the Internet relies on UNIX-based systems and services, as do the
systems used by most Internet Service Providers (ISPs).
 Major aspects of everyday life are managed using UNIX-based systems, eg. banks,
booking systems, company databases, medical records, etc.
 Other 'behind the scenes' uses concern data-intensive tasks, eg. art, design, industrial
design, CAD and computer animation to real-time 3D graphics, virtual reality, visual
simulation & training, data visualisation, database management, transaction processing,
scientific research, military applications, computational challenges, medical modeling,
entertainment and games, film/video special effects, live on-air broadcast effects, space
exploration, etc.

As an OS, UNIX is not often talked about in the media, perhaps because there is no single large
company such as Microsoft to which one can point at and say, "There's the company in charge of
UNIX." Most public talk is of Microsoft, Bill gates, Intel, PCs and other more visible aspects of
the computing arena, partly because of the home-based presence of PCs and the rise of the
Internet in the public eye. This is ironic because OSs like MS-DOS, Win3.1, Win95 and WinNT
all draw many of their basic features from UNIX, though they lack UNIX's sophistication and
power, mainly because they lack so many key features and a lengthy development history.

In reality, a great deal of the everyday computing world relies on UNIX-based systems running
on computers from a wide variety of vendors such as Compaq (Digital Equipment Corporation,
or DEC), Hewlett Packard (HP), International Business Machines (IBM), Intel, SGI (was Silicon
Graphics Inc., now just 'SGI'), Siemens Nixdorf, Sun Microsystems (Sun), etc.

In recent years, many companies which previously relied on DOS or Windows have begun to
realise that UNIX is increasingly important to their business, mainly because of what UNIX has
to offer and why, eg. portability, security, reliability, etc. As demands for handling data grow,
and companies embrace new methods of manipulating data (eg. data mining and visualisation),
the need for systems that can handle these problems forces companies to look at solutions that
are beyond the Wintel platform in performance, scalability and power.

Oil companies such as Texaco [1] and Chevron [2] are typical organisations which already use
UNIX systems extensively because of their data-intensive tasks and a need for extreme reliability
and scalability. As costs have come down, along with changes in the types of available UNIX
system (newer low-end designs, eg. Ultra5, O2, etc.), small and medium-sized companies are
looking towards UNIX solutions to solve their problems. Even individuals now find that older
2nd-hand UNIX systems have significant advantages over modern Wintel solutions, and many
companies/organisations have adopted this approach too [3].

This course serves as an introduction to UNIX, its history, features, operation, use and services,
applications, typical administration tasks, and relevant related topics such as the Internet,
security and the Law. SGI's version of UNIX, called IRIX, is used as an example UNIX OS. The
network of SGI Indys and an SGI Challenge S server I admin is used as an example UNIX
hardware platform.
The course lasts three days, each day consisting of a one hour lecture followed by a two hour
practical session in the morning, and then a three hour practical session in the afternoon; the only
exceptions to this are Day 1 which begins with a two hour lecture, and Day 3 which has a 1 hour
afternoon lecture.

Detailed notes are provided for all areas covered in the lectures and the practical sessions. With
new topics introduced step-by-step, the practical sessions enable first-hand familiarity with the
topics covered in the lectures.

As one might expect of an OS which has a vast range of features, capabilities and uses, it is not
possible to cover everything about UNIX in three days, especially the more advanced topics such
as kernel tuning which most administrators rarely have to deal with. Today, modern UNIX
hardware and software designs allow even very large systems with, for example, 64 processors to
be fully setup at the OS level in little more than an hour [4]. Hence, the course is based on the
author's experience of what a typical UNIX user and administrator (admin) has to deal with,
rather than attempting to present a highly compressed 'Grand Description of Everything' which
simply isn't necessary to enable an admin to perform real-world system administration on a daily
basis.

For example, the precise nature and function of the Sendmail email system on any flavour of
UNIX is not immediately easy to understand; looking at the various files and how Sendmail
works can be confusing. However, in the author's experience, due to the way UNIX is designed,
even a default OS installation without any further modification is sufficient to provide users with
a fully functional email service [5], a fact which shouldn't be of any great surprise since email is
a built-in aspect of any UNIX OS. Thus, the presence of email as a fundamental feature of UNIX
is explained, but configuring and customising Sendmail is not.

History of UNIX

Key:

BTL = Bell Telephone Laboratories


GE = General Electric
WE = Western Electric
MIT = Massachusetts Institute of Technology
BSD = Berkeley Standard Domain
Summary History:
1957: BTL creates the BESYS OS for internal use.
1964: BTL needs a new OS, develops Multics with GE and MIT.
1969: UNICS project started at BTL and MIT; OS written using the B
language.
1970: UNICS project well under way; anonymously renamed to UNIX.
1971: UNIX book published. 60 commands listed.
1972: C language completed (a rewritten form of B). Pipe concept invented.
1973: UNIX used on 16 sites. Kernel rewritten in C. UNIX spreads rapidly.
1974: Work spreads to Berkeley. BSD UNIX is born.
1975: UNIX licensed to universities for free.
1978: Two UNIX styles, though similar and related: System V and BSD.
1980s: Many companies launch their versions of UNIX, including Microsoft.
A push towards cross-platform standards: POSIX/X11/Motif
Independent organisations with cross-vendor membership
Control future development and standards. IEEE included.
1990s: 64bit versions of UNIX released. Massively scalable systems.
Internet springs to life, based on UNIX technologies. Further
Standardisation efforts (OpenGL, UNIX95, UNIX98).

Detailed History.

UNIX is now nearly 40 years old. It began


life in 1969 as a combined project run by
BTL, GE and MIT, initially created and
managed by Ken Thompson and Dennis
Ritchie [6]. The goal was to develop an
operating system for a large computer which
could support hundreds of simultaneous
users. The very early phase actually started at
BTL in 1957 when work began on what was
to become BESYS, an OS developed by BTL
for their internal needs.

In 1964, BTL started on the third generation


of their computing resources. They needed a
new operating system and so initiated the MULTICS (MULTIplexed operating and Computing
System) project in late 1964, a combined research programme between BTL, GE and MIT. Due
to differing design goals between the three groups, Bell pulled out of the project in 1969, leaving
personnel in Bell's Computing Science and Research Center with no usable computing
environment.

As a response to this move, Ken Thompson and Dennis Ritchie offered to design a new OS for
BTL, using a PDP-7 computer which was available at the time. Early work was done in a
language designed for writing compilers and systems programming, called BCPL (Basic
Combined Programming Language). BCPL was quickly simplified and revised to produce a
better language called B.

By the end of 1969 an early version of the OS was completed; a pun at previous work on
Multics, it was named UNICS (UNIplexed operating and Computing System) - an "emasculated
Multics". UNICS included a primitive kernel, an editor, assembler, a simple shell command
interpreter and basic command utilities such as rm, cat and cp. In 1970, extra funding arose from
BTL's internal use of UNICS for patent processing; as a result, the researchers obtained a DEC
PDP-11/20 for further work (24K RAM). At that time, the OS used 12K, with the remaining 12K
used for user programs and a RAM disk (file size limit was 64K, disk size limit was 512K).
BTL's Patent Department then took over the project, providing funding for a newer machine,
namely a PDP-11/45. By this time, UNICS had been abbreviated to UNIX - nobody knows
whose idea it was to change the name (probably just phonetic convenience).
In 1971, a book on UNIX by Thompson and Ritchie described over 60 commands, including:

 b (compile a B program)

 chdir (change working directory)

 chmod (change file access permissions)

 chown (change file ownership)

 cp (copy a file)

 ls (list directory contents)

 who (show who is on the system)

Even at this stage, fundamentally important aspects of UNIX were already firmly in place as core
features of the overall OS, eg. file ownership and file access permissions. Today, other operating
systems such as WindowsNT do not have these features as a rigorously integrated aspect of the
core OS design, resulting in a plethora of overhead issues concerning security, file management,
user access control and administration. These features, which are very important to modern
computing environments, are either added as convoluted bolt-ons to other OSs or are totally non-
existent (NT does have a concept of file ownership, but it isn't implemented very well;
regrettably, much of the advice given by people from VMS to Microsoft on how to implement
such features was ignored).

In 1972, Ritchie and Thompson rewrote B to create a new language called C. Around this time,
Thompson invented the 'pipe' - a standard mechanism for allowing the output of one program or
process to be used as the input for another. This became the foundation of the future UNIX OS
development philosophy: write programs which do one thing and do it well; write programs
which can work together and cooperate using pipes; write programs which support text streams
because text is a 'universal interface' [6].

By 1973, UNIX had spread to sixteen sites, all within AT&T and WE. First made public at a
conference in October that year, within six months the number of sites using UNIX had tripled.
Following a publication of a version of UNIX in 'Communications of the ACM' in July 1974,
requests for the OS began to rapidly escalate. Crucially at this time, the fundamentals of C were
complete and much of UNIX's 11000 lines of code were rewritten in C - this was a major
breakthrough in operating systems design: it meant that the OS could be used on virtually any
computer platform since C was hardware independent.

In late 1974, Thompson went to University of California at Berkeley to teach for a year. Working
with Bill Joy and Chuck Haley, the three developed the 'Berkeley' version of UNIX (named
BSD, for Berkeley Software Distribution), the source code of which was widely distributed to
students on campus and beyond, ie. students at Berkeley and elsewhere also worked on
improving the OS. BTL incorporated useful improvements as they arose, including some work
from a user in the UK. By this time, the use and distribution of UNIX was out of BTL's control,
largely because of the work at Berkeley on BSD.

Developments to BSD UNIX added the vi editor, C-based shell interpreter, the Sendmail email
system, virtual memory, and support for TCP/IP networking technologies (Transmission Control
Protocol/Internet Protocol). Again, a service as important as email was now a fundamental part
of the OS, eg. the OS uses email as a means of notifying the system administrator of system
status, problems, reports, etc. Any installation of UNIX for any platform automatically includes
email; by complete contrast, email is not a part of Windows3.1, Win95, Win98 or WinNT -
email for these OSs must be added separately (eg. Pegasus Mail), sometimes causing problems
which would not otherwise be present.

In 1975, a further revision of UNIX known as the Fifth Edition was released and licensed to
universities for free. After the release of the Seventh Edition in 1978, the divergence of UNIX
development along two separate but related paths became clear: System V (BTL) and BSD
(Berkeley). BTL and Sun combined to create System V Release 4 (SVR4) which brought
together System V with large parts of BSD. For a while, SVR4 was the more rigidly controlled,
commercial and properly supported (compared to BSD on its own), though important work
occurred in both versions and both continued to be alike in many ways. Fearing Sun's possible
domination, many other vendors formed the Open Software Foundation (OSF) to further work on
BSD and other variants. Note that in 1979, a typical UNIX kernel was still only 40K.

Because of a legal decree which prevented AT&T from selling the work of BTL, AT&T allowed
UNIX to be widely distributed via licensing schemas at minimal or zero cost. The first genuine
UNIX vendor, Interactive Systems Corporation, started selling UNIX systems for automating
office work. Meanwhile, the work at AT&T (various internal design groups) was combined, then
taken over by WE, which became UNIX System Laboratories (now owned by Novell). Later
releases included Sytem III and various releases of System V. Today, most popular brands of
UNIX are based either on SVR4, BSD, or a combination of both (usually SVR4 with standard
enhancements from BSD, which for example describes SGI's IRIX version perfectly). As an
aside, there never was a System I since WE feared companies would assume a 'system 1' would
be bug-ridden and so would wait for a later release (or purchase BSD instead!).

It's worth noting the influence from the superb research effort at Xerox Parc, which was working
on networking technologies, electronic mail systems and graphical user interfaces, including the
proverbial 'mouse'. The Apple Mac arose directly from the efforts of Xerox Parc which,
incredibly and much against the wishes of many Xerox Parc employees, gave free
demonstrations to people such as Steve Jobs (founder of Apple) and sold their ideas for next to
nothing ($50000). This was perhaps the biggest financial give-away in history [7].

One reason why so many different names for UNIX emerged over the years was the practice of
AT&T to license the UNIX software, but not the UNIX name itself. The various flavours of
UNIX may have different names (SunOS, Solaris, Ultrix, AIX, Xenix, UnixWare, IRIX, Digital
UNIX, HP-UX, OpenBSD, FreeBSD, Linux, etc.) but in general the differences between them
are minimal. Someone who learns a particular vendor's version of UNIX (eg. Sun's Solaris) will
easily be able to adapt to a different version from another vendor (eg. DEC's Digital UNIX).
Most differences merely concern the names and/or locations of particular files, as opposed to any
core underlying aspect of the OS.

Further enhancements to UNIX included compilation management systems such as make and
Imake (allowing for a single source code release to be compiled on any UNIX platform) and
support for source code management (SCCS). Services such as telnet for remote communication
were also completed, along with ftp for file transfer, and other useful functions.

In the early 1980s, Microsoft developed and released its version of UNIX called Xenix (it's a
shame this wasn't pushed into the business market instead of DOS). The first 32bit version of
UNIX was released at this time. SCO developed UnixWare which is often used today by Intel for
publishing performance ratings for its x86-based processors [8]. SGI started IRIX in the early
1980s, combining SVR4 with an advanced GUI. Sun's SunOS sprang to life in 1984, which
became widely used in educational institutions. NeXT-Step arrived in 1989 and was hailed as a
superb development platform; this was the platform used to develop the game 'Doom', which was
then ported to DOS for final release. 'Doom' became one of the most successful and influential
PC games of all time and was largely responsible for the rapid demand for better hardware
graphics systems amongst home users in the early 1990s - not many people know that it was
originally designed on a UNIX system though. Similarly, much of the development work for
Quake was done using a 4-processor Digital Alpha system [9].

During the 1980s, developments in standardised graphical user interface elements were
introduced (X11 and Motif) along with other major additional features, especially Sun's
Networked File System (NFS) which allows multiple file systems, from multiple UNIX
machines from different vendors, to be transparently shared and treated as a single file structure.
Users see a single coherant file system even though the reality may involve many different
systems in different physical locations.

By this stage, UNIX's key features had firmly established its place in the computing world, eg.
Multi-tasking and multi-user (many independent processes can run at once; many users can use a
single system at the same time; a single user can use many systems at the same time). However,
in general, the user interface to most UNIX variants was poor: mainly text based. Most vendors
began serious GUI development in the early 1980s, especially SGI which has traditionally
focused on visual-related markets [10].

From the point of view of a mature operating system, and certainly in the interests of companies
and users, there were significant moves in the 1980s and early 1990s to introduce standards
which would greatly simplify the cross-platform use of UNIX. These changes, which continue
today, include:

 The POSIX standard [6], begun in 1985 and released in 1990: a suite of application
programming interface standards which provide for the portability of application source
code relating to operating system services, managed by the X/Open group.
 X11 and Motif: GUI and windowing standards, managed by the X Consortium and OSF.
 UNIX95, UNIX98: a set of standards and guidelines to help make the various UNIX
flavours more coherant and cross-platform.
 OpenGL: a 3D graphics programming standard originally developed by SGI as GL
(Graphics Library), then IrisGL, eventually released as an open standard by SGI as
OpenGL and rapidly adopted by all other vendors.
 Journaled file systems such as SGI's XFS which allow the creation, management and use
of very large file systems, eg. multiple terabytes in size, with file sizes from a single byte
to millions of terabytes, plus support for real-time and predictable response. EDIT
(2008): Linux can now use XFS.
 Interoperability standards so that UNIX systems can seamlessly operate with non-UNIX
systems such as DOS PCs, WindowsNT, etc.

Standards Notes
POSIX:
X/Open eventually became UNIX International (UI), which competed for a while with
OSF. The US Federal Government initiated POSIX (essentially a version of UNIX),
requiring all government contracts to conform to the POSIX standard - this freed the US
government from being tied to vendor-specific systems, but also gave UNIX a major
boost in popularity as users benefited from the industry's rapid adoption of accepted
standards.

X11 and Motif:


Programming directly using low-level X11/Motif libraries can be non-trivial. As a result,
higher level programming interfaces were developed in later years, eg. the ViewKit
library suite for SGI systems. Just as 'Open Inventor' is a higher-level 3D graphics API to
OpenGL, ViewKit allows one to focus on developing the application and solving the
client's problem, rather than having to wade through numerous low-level details. Even
higher-level GUI-based toolkits exist for rapid application development, eg. SGI's
RapidApp.

UNIX95, UNIX98:
Most modern UNIX variants comply with these standards, though Linux is a typical
exception (it is POSIX-compliant, but does not adhere to other standards). There are
several UNIX variants available for PCs, excluding Alpha-based systems which can also
use NT (MIPS CPUs could once be used with NT as well, but Microsoft dropped NT
support for MIPS due to competition fears from Intel whose CPUs were not as fast at the
time [11]):

 Linux Open-architecture, free, global development,


insecure.

 OpenBSD More rigidly controlled, much more secure.

 FreeBSD Somewhere inbetween the above two.

 UnixWare More advanced. Scalable. Not free.

There are also commercial versions of Linux which have additional features and services,
eg. Red Hat Linux and Calderra Linux. Note that many vendors today are working to
enable the various UNIX variants to be used with Intel's CPUs - this is needed by Intel in
order to decrease its dependence on the various Microsoft OS products.

OpenGL:
Apple was the last company to adopt OpenGL. In the 1990s, Microsoft attempted to force
its own standards into the marketplace (Direct3D and DirectX) but this move was
doomed to failure due to the superior design of OpenGL and its ease of use, eg. games
designers such as John Carmack (Doom, Quake, etc.) decided OpenGL was the much
better choice for games development. Compared to Direct3D/DirectX, OpenGL is far
superior for seriously complex problems such as visual simulation, military/industrial
applications, image processing, GIS, numerical simulation and medical imaging.

In a move to unify the marketplace, SGI and Microsoft signed a deal in the late 1990s to
merge DirectX and Direct3D into OpenGL - the project, called Fahrenheit, will
eventually lead to a single unified graphics programming interface for all platforms from
all vendors, from the lowest PC to the fastest SGI/Cray supercomputer available with
thousands of processors. To a large degree, Direct3D will simply either be phased out in
favour of OpenGL's methods, or focused entirely on consumer-level applications, though
OpenGL will dominate in the final product for the entertainment market.

OpenGL is managed by the OpenGL Architecture Review Board, an independent


organisation with member representatives from all major UNIX vendors, relevant
companies and institutions.

Journaled file systems:


File systems like SGI's XFS running on powerful UNIX systems like CrayOrigin2000
can easily support sustained data transfer rates of hundreds of gigabytes per second. XFS
has a maximum file size limit of 9 million terabytes.
The end result of the last 30 years of UNIX development is what is known as an 'Open System',
ie. a system which permits reliable application portability, interoperability between different
systems and effective user portability between a wide variety of different vendor hardware and
software platforms. Combined with a modern set of compliance standards, UNIX is now a
mature, well-understood, highly developed, powerful and very sophisticated OS.

Many important features of UNIX do not exist in other OSs such as WindowsNT and will not do
so for years to come, if ever. These include guaranteeable reliability, security, stability, extreme
scalability (thousands of processors), proper support for advanced multi-processing with unified
shared memory and resources (ie. parallel compute systems with more than 1 CPU), support for
genuine real-time response, portability and an ever-increasing ease-of-use through highly
advanced GUIs. Modern UNIX GUIs combine the familiar use of icons with the immense power
and flexibility of the UNIX shell command line which, for example, supports full remote
administration (a significant criticism of WinNT is the lack of any real command line interface
for remote administration). By contrast, Windows2000 includes a colossal amount of new code
which will introduce a plethora of new bugs and problems.

A summary of key UNIX features would be:

 Multi-tasking: many different processes can operate independently at once.


 Multi-user: many users can use a single machine at the same time; a single user can use
multiple machines at the same time.
 Multi-processing: most commercial UNIX systems scale to at least 32 or 64 CPUs (Sun,
IBM, HP), while others scale to hundreds or thousands (IRIX, Unicos, AIX, etc.; Blue
Mountain [12], Blue Pacific, ASCI Red). Today, WindowsNT cannot reliably scale to
even 8 CPUs. Intel will not begin selling 8-way chip sets until Q3 1999.
 Multi-threading: automatic parallel execution of applications across multiple CPUs and
graphics systems when programs are written using the relevant extensions and libraries.
Some tasks are naturally non-threadable, eg. rendering animation frames for movies
(each processor computes a single frame using a round-robin approach), while others
lend themselves very well to parallel execution, eg. Computational Fluid Dynamics,
Finite Element Analysis, Image Processing, Quantum Chromodynamics, weather
modeling, database processing, medical imaging, visual simulation and other areas of 3D
graphics, etc.
 Platform independence and portability: applications written on UNIX systems will
compile and run on other UNIX systems if they're developed with a standards-based
approach, eg. the use of ANSI C or C++, Motif libraries, etc.; UNIX hides the hardware
architecture from the user, easing portability. The close relationship between UNIX and
C, plus the fact that the UNIX shell is based on C, provides for a powerful development
environment. Today, GUI-based development environments for UNIX systems also exist,
giving even greater power and flexibility, eg. SGI's WorkShop Pro CASE tools and
RapidApp.
 Full 64bit environment: proper support for very large memory spaces, up to hundreds of
GB of RAM, visible to the system as a single combined memory space. Comparison:
NT's current maximum limit is 4GB; IRIX's current commercial limit is 512GB, though
Blue Mountain's 6144-CPU SGI system has a current limit of 12000GB RAM (twice that
if the CPUs were upgraded to the latest model). Blue Mountain has 1500GB RAM
installed at the moment.
 Inter-system communication: services such as telnet, Sendmail, TCP/IP, remote login
(rlogin), DNS, NIS, NFS, etc. Sophisticated security and access control. Features such as
email and telnet are a fundamental part of UNIX, but they must be added as extras to
other OSs. UNIX allows one to transparently access devices on a remote system and even
install the OS using a CDROM, DAT or disk that resides on a remote machine. Note that
some of the development which went into these technologies was in conjunction with the
evolution of ArpaNet (the early Internet that was just for key US government, military,
research and educational sites).
 File identity and access: unique file ownership and a logical file access permission
structure provide very high-level management of file access for use by users and
administrators alike. OSs which lack these features as a core part of the OS make it far
too easy for a hacker or even an ordinary user to gain administrator-level access (NT is a
typical example).
 System identity: every UNIX system has a distinct unique entity, ie. a system name and
an IP (Internet Protocol) address. These offer numerous advantages for users and
administrators, eg. security, access control, system-specific environments, the ability to
login and use multiple systems at once, etc.
 Genuine 'plug & play': UNIX OSs already include drivers and support for all devices that
the source vendor is aware of. Adding most brands of disks, printers, CDROMs, DATs,
Floptical drives, ZIP or JAZ drives, etc. to a system requires no installation of any drivers
at all (the downside of this is that a typical modern UNIX OS installation can be large,
eg. 300MB). Detection and name-allocation to devices is largely automatic - there is no
need to assign specific interrupt or memory addresses for devices, or assign labels for
disk drives, ZIP drives, etc. Devices can be added and removed without affecting the
long-term operation of the system. This also often applies to internal components such as
CPUs, video boards, etc. (at least for SGIs).

UNIX Today.

In recent years, one aspect of UNIX that was holding it back from spreading more widely was
cost. Many vendors often charged too high a price for their particular flavour of UNIX. This
made its use by small businesses and home users prohibitive. The ever-decreasing cost of PCs,
combined with the sheer marketing power of Microsoft, gave rise to the rapid growth of
Windows and now WindowsNT. However, in 1991, Linus Torvalds developed a version of
UNIX called Linux (he pronounces it rather like 'leenoox', rhyming with 'see-books') which was
free and ran on PCs as well as other hardware platforms such as DEC machines. In what must be
one of the most astonishing developments of the computer age, Linux has rapidly grown to
become a highly popular OS for home and small business use and is now being supported by
many major companies too, including Oracle, IBM, SGI, HP, Dell and others.

Linux does not have the sophistication of the more traditional UNIX variants such as SGI's IRIX,
but Linux is free (older releases of IRIX such as IRIX 6.2 are also free, but not the very latest
release, namely IRIX 6.5). This has resulted in the rapid adoption of Linux by many people and
businesses, especially for servers, application development, home use, etc. With the recent
announcement of support for multi-processing in Linux for up to 8 CPUs, Linux is becoming an
important player in the UNIX world and a likely candidate to take on Microsoft in the battle for
OS dominance.

However, it'll be a while before Linux is used for 'serious' applications since it does not have
the rigorous development history and discipline of other UNIX versions, eg. Blue Mountain is an
IRIX system consisting of 6144 CPUs, 1500GB RAM, 76000GB disk space, and capable of 3000
billion floating-point operations per second. This level of system development is what drives
many aspects of today's UNIX evolution and the hardware which supports UNIX OSs. Linux
lacks this top-down approach and needs a lot of work in areas such as security and support for
graphics, but Linux is nevertheless becoming very useful in fields such as render-farm
construction for movie studios, eg. a network of cheap PentiumIII machines running the free
Linux OS, reliable and stable. The film "Titanic" was the first major film which used a Linux-
based render-farm, though it employed many other UNIX systems too (eg. SGIs, Alphas), as
well as some NT systems.

EDIT (2008): Linux is now very much used for serious work, running most of the planet's
Internet servers, and widely used in movie studios for Flame/Smoke on professional x86
systems. It's come a long way since 1999, with new distributions such as Ubuntu and Gentoo
proving very popular. At the high-end, SGI offers products that range from its shared-memory
Linux-based Altix 4700 system with up to 1024 CPUs, to the Altix ICE, a highly expandable
XEON/Linux cluster system with some sites using machines with tens of thousands of cores.

UNIX has come a long way since 1969. Thompson and Ritchie could never have imagined that it
would spread so widely and eventually lead to its use in such things as the control of the Mars
Pathfinder probe which landed on Mars last year, and the operation of the Internet web server
which allowed millions of people around the world to see the returned images as the Martian
event unfolded [13].

Today, from an administrator's perspective, UNIX is a stable and reliable OS which pretty much
runs itself once it's properly set up. UNIX requires far less daily administration than other OSs
such as NT - a factor not often taken into account when companies make purchasing decisions
(salaries are a major part of a company's expenditure). UNIX certainly has its baggage in terms
of file structure and the way some aspects of the OS actually work, but after so many years most
if not all of the key problems have been solved, giving rise to an OS which offers far superior
reliability, stability, security, etc. In that sense, UNIX has very well-known baggage which is
absolutely vital to safety-critical applications such as military, medical, government and
industrial use. Byte magazine once said that NT was only now tackling OS issues which other
OSs had solved years before [14].

Thanks to a standards-based and top-down approach, UNIX is evolving to remove its baggage in
a reliable way, eg. the introduction of the NSD (Name Service Daemon) to replace DNS
(Domain Name Service), NIS (Network Information Service) and aspects of NFS operation; the
new service is faster, more efficient, and easier on system resources such as memory and
network usage.

However, in the never-ending public relations battle for computer systems and OS dominance,
NT has firmly established itself as an OS which will be increasingly used by many companies
due to the widespread use of the traditional PC and the very low cost of Intel's mass-produced
CPUs. Rival vendors continue to offer much faster systems than PCs, whether or not UNIX is
used, so I expect to see interesting times ahead in the realm of OS development. Companies like
SGI bridge the gap by releasing advanced hardware systems which support NT (eg. the Visual
Workstation 320 [15]), systems whose design is born out of UNIX-based experience.

One thing is certain: some flavour of UNIX will always be at the forefront of future OS
development, whatever variant it may be.

References

1. Texaco processes GIS data in order to analyse suitable sites for oil exploration. Their
models can take several months to run even on large multi-processor machines. However,
as systems become faster, companies like Texaco simply try to solve more complex
problems, with more detail, etc.
2. Chevron's Nigerian office has what was, as of mid-1998, the fastest supercomputer in
Africa, namely a 16-processor SGI POWER Challenge (probably replaced by now with a
modern 64-CPU Origin2000). A typical data set processed by the system is about 60GB
which takes around two weeks to process, during which time the system must not go
wrong or much processing time is lost. For individual work, Chevron uses Octane
workstations which are able to process 750MB of volumetric GIS data in less than three
seconds. Solving these types of problems with PCs is not yet possible.
3. The 'Tasmania Parks and Wildlife Services' (TPWS) organisation is responsible for the
management and environmental planning of Tasmania's National Parks. They use modern
systems like the SGI O2 and SGI Octane for modeling and simulation (virtual park
models to aid in decision making and planning), but have found that much older systems
such as POWER Series Predator and Crimson RealityEngine (SGI systems dating from
1992) are perfectly adequate for their tasks, and can still outperform modern PCs. For
example, the full-featured pixel-fill rate of their RealityEngine system (320M/sec), which
supports 48bit colour at very high resolutions (1280x2048 with 160MB VRAM), has still
not been bettered by any modern PC solution. Real-time graphics comparisons at
http://www.blender.nl/stuff/blench1.html show Crimson RE easily outperforming many
modern PCs which ought to be faster given RE is 7 years old. Information supplied by
Simon Pigot (TPWS SysAdmin).
4. "State University of New York at Buffalo Teams up with SGI for Next-Level
Supercomputing Site. New Facility Brings Exciting Science and Competitive Edge to
University":
http://www.sgi.com/origin/successes/buffalo.html

5. Even though the email-related aspects of the Computing Department's SGI network have
not been changed in any way from the default settings (created during the original OS
installation), users can still email other users on the system as well as send email to
external sites.
6. Unix history:
http://virtual.park.uga.edu/hc/unixhistory.html
A Brief History of UNIX:
http://pantheon.yale.edu/help/unixos/unix-intro.html
UNIX Lectures:
http://www.sis.port.ac.uk/~briggsjs/csar4/U2.htm
Basic UNIX:
http://osiris.staff.udg.mx/man/ingles/his.html
POSIX: Portable Operating System Interface:
http://www.pasc.org/abstracts/posix.htm

7. "The Triumph of the Nerds", Channel 4 documentary.


8. Standard Performance Evaluation Corporation:
http://www.specbench.org/
Example use of UnixWare by Intel for benchmark reporting:
http://www.specbench.org/osg/cpu95/results/res98q3/cpu95-980831-03026.html
http://www.specbench.org/osg/cpu95/results/res98q3/cpu95-980831-03023.html
9. "My Visit to the USA" (id Software, Paradigm Simulation Inc., NOA):
http://doomgate.gamers.org/dhs/dhs/usavisit/dallas.html
10. Personal IRIS 4D/25, PCW Magazine, September 1990, pp. 186:
http://www.futuretech.vuurwerk.nl/pcw9-90pi4d25.html
IndigoMagic User Environment, SGI, 1993 [IND-MAGIC-BRO(6/93)].

IRIS Indigo Brochure, SGI, 1991 [HLW-BRO-01 (6/91)].

"Smooth Operator", CGI Magazine, Vol4, Issue 1, Jan/Feb 1999, pp. 41-42.

Digital Media World '98 (Film Effects and Animation Festival, Wembley Conference
Center, London). Forty six pieces of work were submitted to the conference magazine by
company attendees. Out of the 46 items, 43 had used SGIs; of these, 34 had used only
SGIs.
11. "MIPS-based PCs fastest for WindowsNT", "MIPS Technologies announces 200MHz
R4400 RISC microprocessor", "MIPS demonstrates Pentium-class RISC PC designs", -
all from IRIS UK, Issue 1, 1994, pp. 5.
12. Blue Mountain, Los Alamos National Laboratory:
http://www.lanl.gov/asci/
http://www.lanl.gov/asci/bluemtn/ASCI_fly.pdf
http://www.lanl.gov/asci/bluemtn/bluemtn.html
http://www.lanl.gov/asci/bluemtn/t_sysnews.shtml
http://www.lanl.gov/orgs/pa/News/111298.html#anchor263034

13. "Silicon Graphics Technology Plays Mission-Critical Role in Mars Landing":
http://www.sgi.com/newsroom/press_releases/1997/june/jplmars_release.html
"Silicon Graphics WebFORCE Internet Servers Power Mars Web Site, One of the
World's Largest Web Events":
http://www.sgi.com/newsroom/press_releases/1997/july/marswebforce_release.html
"PC Users Worldwide Can Explore VRML Simulation of Mars Terrain Via the Internet":
http://www.sgi.com/newsroom/press_releases/1997/june/vrmlmars_release.html

14. "Deja Vu All Over Again"; "Windows NT security is under fire. It's not just that there are
holes, but that they are holes that other OSes patched years ago", Byte Magazine, Vol 22
No. 11, November 1997 Issue, pp. 81 to 82, by Peter Mudge and Yobie Benjamin.

15. VisualWorkstation320 Home Page:
http://visual.sgi.com/
Day 1:
Part 2: The basics: files, UNIX shells, editors, commands.
Regular Expressions and Metacharacters in Shells.

UNIX Fundamentals: Files and the File System.

At the lowest level, from a command-line point of view, just about everything in a UNIX
environment is treated as a file - even hardware entities, eg. printers, disks and DAT drives. Such
items might be described as 'devices' or with other terms, but at the lowest level they are visible
to the admin and user as files somewhere in the UNIX file system (under /dev in the case of
hardware devices). Though this structure may seem a little odd at first, it means that system
commands can use a common processing and communication interface no matter what type of
file they're dealing with, eg. text, pipes, data redirection, etc. (these concepts are explained in
more detail later).
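
For example, /dev/null and /dev/tty exist on virtually every UNIX variant and can be listed and
written to with exactly the same syntax as ordinary files:

ls -l /dev/null /dev/tty      # device files have owners and permissions like any file
echo "hello" > /dev/null      # writing to a device uses normal file redirection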

The UNIX file system can be regarded as a top-down tree of files and directories, starting with
the top-most 'root' directory. A directory can be visualised as a filing cabinet, other directories as
folders within the cabinet and individual files as the pieces of paper within folders. It's a useful
analogy if one isn't familiar with file system concepts, but somewhat inaccurate since a directory
in a computer file system can contain loose files of its own as well as other directories, ie. most
office filing cabinets don't have loose pieces of paper outside of folders.

UNIX file systems can also have 'hidden' files and directories. In DOS, a hidden file is just a file
with a special attribute set so that 'dir' and other commands do not show the file; by contrast, a
hidden file in UNIX is any file which begins with a dot '.' (period) character, ie. the hidden status
is a result of an aspect of the file's name, not an attribute that is bolted onto the file's general
existence. Further, whether or not a user can access a hidden file or look inside a hidden
directory has nothing to do with the fact that the file or directory is hidden from normal view (a
hidden file in DOS cannot be written to). Access permissions are a separate aspect of the
fundamental nature of a UNIX file and are dealt with later.

The 'ls' command lists files and directories in the current directory, or some other part of the file
system by specifying a 'path' name. For example:

ls /

will show the contents of the root directory, which may typically contain the following:

CDROM dev home mapleson proc stand usr


bin dumpster lib nsmail root.home tmp var
debug etc lib32 opt sbin unix

Figure 1. A typical root directory shown by 'ls'.


Almost every UNIX system has its own unique root directory and file system, stored on a disk within the
machine. The exception is a machine with no internal disk, running off a remote server in some way;
such systems are described as 'diskless nodes' and are very rare in modern UNIX environments, though
still used if a diskless node is an appropriate solution.

Some of the items in Fig 1 are files, while others are directories. If one uses the '-F' option with
the ls command, special characters are shown after the names for extra clarity:

/ - directory

* - executable file

@ - link to another file or directory elsewhere in the file system

Thus, using 'ls -F' gives this more useful output:

CDROM/ dev/ home/ mapleson/ proc/ stand/ usr/


bin/ dumpster/ lib/ nsmail/ root.home tmp/ var/
debug/ etc/ lib32/ opt/ sbin/ unix*

Figure 2. The root directory shown by 'ls -F /'.


Fig 2 shows that most of the items are in fact other directories. Only two items are ordinary files: 'unix'
and 'root.home'. 'unix' is the main UNIX kernel file and is often several megabytes in size for today's
modern UNIX systems - this is partly because the kernel must often include support for 64bit as well as
older 32bit system components. 'root.home' is merely a file created when the root user accesses the
WWW using Netscape, ie. an application-specific file.

Important directories in the root directory:

/bin      - many as-standard system commands are here (links to /usr/bin)

/dev      - device files for keyboard, disks, printers, etc.

/etc      - system configuration files

/home     - user accounts are here (NFS mounted)

/lib      - library files used by executable programs

/sbin     - user applications and other commands

/tmp      - temporary directory (anyone can create files here). This
            directory is normally erased on bootup

/usr      - various product-specific directories, system resource
            directories, locations of online help (/usr/share), header
            files for application development (/usr/include), further
            system configuration files relating to low-level hardware
            which are rarely touched even by an administrator
            (eg. /usr/cpu and /usr/gfx).

/var      - X Windows files (/var/X11), system services files (eg.
            software licenses in /var/flexlm), various application
            related files (/var/netscape, /var/dmedia), system
            administration files and data (/var/adm, /var/spool) and a
            second temporary directory (/var/tmp) which is not normally
            erased on bootup (an administrator can alter the behaviour
            of both /tmp and /var/tmp).

/mapleson - (non-standard) my home account is here, NFS-mounted from
            the admin Indy called Milamber.

Figure 3. Important directories in the root directory.


Comparisons with other UNIX variants such as HP-UX, SunOS and Solaris can be found in the many FAQ
(Frequently Asked Questions) files available via the Internet [1].

Browsing around the UNIX file system can be enlightening but also a little overwhelming at
first. However, an admin never has to be concerned with most parts of the file structure; low-
level system directories such as /usr/cpu are managed automatically by various system tasks and
programs. Rarely, if ever, does an admin even have to look in such directories, never mind alter
their contents (the latter is probably an unwise thing to do).

From the point of view of a novice admin, the most important directory is /etc. It is this directory
which contains the key system configuration files and it is these files which are most often
changed when an admin wishes to alter system behaviour or properties. In fact, an admin can get
to grips with how a UNIX system works very quickly, simply by learning all about the following
files to begin with:

/etc/sys_id - the name of the system (may include full domain)

/etc/hosts  - summary of full host names (standard file, added to by the administrator)

/etc/fstab  - list of file systems to mount on bootup

/etc/passwd - password file, contains user account information

/etc/group  - group file, contains details of all user groups

Figure 4. Key files for the novice administrator.


Note that an admin also has a personal account, ie. an ordinary user account, which should be used for
any task not related to system administration. More precisely, an admin should only be logged in as root
when it is strictly necessary, mainly to avoid unintended actions, eg. accidental use of the 'rm'
command.
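
For example, a typical pattern is to work from the ordinary account and use su to become root
only for the task in hand (the file edited here is merely an illustration):

su -              # become root; the root password is requested
vi /etc/hosts     # carry out the administrative task as root
exit              # drop root privileges and return to the ordinary account
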
A Note on the 'man' Command.

The manual pages and other online information for the files shown in Fig 4 all list references to
other related files, eg. the man page for 'fstab' lists 'mount' and 'xfs' in its 'SEE ALSO' section, as
well as an entry called 'filesystems' which is a general overview document about UNIX file
systems of all types, including those used by CDROMs and floppy disks. Modern UNIX releases
contain a large number of useful general reference pages such as 'filesystems'. Since one may not
know what is available, the 'k' and 'f' options can be used with the man command to offer
suggestions, eg. 'man -f file' gives this output (the -f option shows all man page titles for entries
that begin with the word 'file'):

ferror, feof, clearerr,
fileno (3S)                  stream status inquiries
file (1) determine file type
file (3Tcl) Manipulate file names and attributes
File::Compare (3) Compare files or filehandles
File::Copy (3) Copy files or filehandles
File::DosGlob (3) DOS like globbing and then some
File::Path (3) create or remove a series
of directories
File::stat (3) by-name interface to Perl's built-in
stat() functions
filebuf (3C++) buffer for file I/O.
FileCache (3) keep more files open than
the system permits
fileevent (3Tk) Execute a script when a file
becomes readable or writable
FileHandle (3) supply object methods for filehandles
filename_to_devname (2) determine the device name
for the device file
filename_to_drivername (2) determine the device name
for the device file
fileparse (3) split a pathname into pieces
files (7P) local files name service
parser library
FilesystemManager (1M) view and manage filesystems
filesystems: cdfs, dos,
fat, EFS, hfs, mac,
iso9660, cd-rom, kfs,
nfs, XFS, rockridge (4) IRIX filesystem types
filetype (5) K-AShare's filetype
specification file
filetype, fileopen,
filealtopen, wstype (1) determine filetype of specified
file or files
routeprint, fileconvert (1) convert file to printer or
to specified filetype

Figure 5. Output from 'man -f file'.


'man -k file' gives a much longer output since the '-k' option runs a search on every man page title
containing the word 'file'. So a point to note: judicious use of the man command along with other online
information is an effective way to learn how any UNIX system works and how to make changes to
system behaviour. All man pages for commands give examples of their use, a summary of possible
options, syntax, further references, a list of any known bugs with appropriate workarounds, etc.

The next most important directory is probably /var since this is where the configuration files for
many system services are often housed, such as the Domain Name Service (/var/named) and
Network Information Service (/var/yp). However, small networks usually do not need these
services which are aimed more at larger networks. They can be useful though, for example in
aiding Internet access.

Overall, a typical UNIX file system will contain many thousands of files. It is possible for an
admin to manage a system without ever knowing what the majority of the system's files are for.
In fact, this is a preferable way of managing a system. When a problem arises, it is more
important to know where to find relevant information on how to solve the problem, rather than
try to learn the solution to every possible problem in the first instance (which is impossible).

I once asked an experienced SGI administrator (the first person to ever use the massive Cray
T3D supercomputer at the Edinburgh Parallel Computing Centre) what the most important thing
in his daily working life was. He said it was a small yellow note book in which he had written
where to find information about various topics. The book was an index on where to find facts,
not a collection of facts in itself.

Hidden files were described earlier. The '-a' option can be used with the ls command to show
hidden files:

ls -a /

gives:

./ .sgihelprc lib/
../ .ssh/ lib32/
.Acroread.License .varupdate mapleson/
.Sgiresources .weblink nsmail/
.cshrc .wshttymode opt/
.desktop-yoda/ .zmailrc proc/
.ebtpriv/ CDROM/ sbin/
.expertInsight bin/ stand/
.insightrc debug/ swap/
.jotrc* dev/ tmp/
.login dumpster/ unix*
.netscape/ etc/ usr/
.profile floppy/ var/
.rhosts home/

Figure 6. Hidden files shown with 'ls -a /'.


For most users, important hidden files would be those which configure their basic working environment
when they login:

.cshrc
.login
.profile

Other hidden files and directories refer to application-specific resources such as Netscape, or
GUI-related resources such as the .desktop-sysname directory (where 'sysname' is the name of
the host).

Although the behaviour of the ls command can be altered with the 'alias' command so that it
shows hidden files by default, the raw behaviour of ls can be accessed by using an absolute
directory path to the command:

/bin/ls

Using the absolute path to any file in this way allows one to ignore any aliases which may have
been defined, as well as the normal behaviour of the shell to search the user's defined path for the
first instance of a command. This is a useful technique when performing actions as root since it
ensures that the wrong command is not executed by mistake.
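
A small illustration (the alias shown is just an example one might place in ~/.cshrc):

alias ls 'ls -F'    # make ls always append type characters to names
ls                  # runs the aliased command, ie. 'ls -F'
/bin/ls             # bypasses the alias and runs the real ls directly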

Network File System (NFS)

An important feature of UNIX is the ability to access a particular directory on one machine from
another machine. This service is called the 'Network File System' (NFS) and the procedure itself
is called 'mounting'.

For example, on the machines in Ve24, the directory /home is completely empty - no files are in
it whatsoever (except for a README file which is explained below). When one of the Indys is
turned on, it 'mounts' the /home directory from the server 'on top' of the /home directory of the
local machine. Anyone looking in the /home directory actually sees the contents of /home on the
server.

The 'mount' command is used to mount a directory on a file system belonging to a remote host
onto some directory on the local host's filesystem. The remote host must 'export' a directory in
order for other hosts to locally mount it. The /etc/exports file contains a list of directories to be
exported.
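
As a rough illustration (the host name and options here are placeholders, and the exact syntax
varies between UNIX variants - see 'man exports'), a server's /etc/exports might contain:

/home      -access=akira
/var/www   -ro

After editing the file, the new export list is typically activated with 'exportfs -a'.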

For example, the following shows how the /home directory on one of the Ve24 Indys (akira) is
mounted off the server, yet appears to an ordinary user to be just another part of akira's overall
file system (NB: the '#' indicates these actions are being performed as root; an ordinary user
would not be able to use the mount command in this way):

AKIRA 1# mount | grep YODA


YODA:/var/www on /var/www type nfs (vers=3,rw,soft,intr,bg,dev=c0001)
YODA:/var/mail on /var/mail type nfs (vers=3,rw,dev=c0002)
YODA:/home on /home type nfs (vers=3,rw,soft,intr,bg,dev=c0003)
AKIRA 1# ls /home
dist/ projects/ pub/ staff/ students/ tmp/ yoda/
AKIRA 2# umount /home
AKIRA 1# mount | grep YODA
YODA:/var/www on /var/www type nfs (vers=3,rw,soft,intr,bg,dev=c0001)
YODA:/var/mail on /var/mail type nfs (vers=3,rw,dev=c0002)
AKIRA 3# ls /home
README
AKIRA 4# mount /home
AKIRA 5# ls /home
dist/ projects/ pub/ staff/ students/ tmp/ yoda/
AKIRA 6# ls /
CDROM/ dev/ home/ mapleson/ proc/ stand/ usr/
bin/ dumpster/ lib/ nsmail/ root.home tmp/ var/
debug/ etc/ lib32/ opt/ sbin/ unix*

Figure 7. Manipulating an NFS-mounted file system with 'mount'.

Each Indy has a README file in its local /home, containing:

The /home filesystem from Yoda is not mounted for some reason.
Please contact me immediately!

Ian Mapleson, Senior Technician.

3297 (internal)
mapleson@gamers.org

After /home is remounted in Fig 7, the ls command no longer shows the README file as being
present in /home, ie. when /home is mounted from the server, the local contents of /home are
completely hidden and inaccessible.

When accessing files, a user never has to worry about the fact that the files in a directory which
has been mounted from a remote system actually reside on a physically separate disk, or even a
different UNIX system from a different vendor. Thus, NFS gives a seamless transparent way to
merge different file systems from different machines into one larger structure. At the
department where I studied years ago [2], the UNIX system included Hewlett Packard
machines running HP-UX, Sun machines running SunOS, SGIs running IRIX, DEC machines
running Digital UNIX, PCs running an X-Windows emulator called Windows Exceed, and some
Linux PCs. All the machines had access to a single large file structure so that any user could
theoretically use any system in any part of the building (except where deliberately prevented
from doing so via local system file alterations).

Another example is my home directory /mapleson - this directory is mounted from the admin
Indy (Technicians' office Ve48) which has my own extra external disk locally mounted. As far as
the server is concerned, my home account just happens to reside in /mapleson instead of
/home/staff/mapleson. There is a link to /mapleson from /home/staff/mapleson which allows
other staff and students to access my directory without having to ever be aware that my home
account files do not physically reside on the server.

Every user has a 'home directory'. This is where all the files owned by that user are stored. By
default, a new account would only include basic files such as .login, .cshrc and .profile. Admin
customisation might add a trash 'dumpster' directory, user's WWW site directory for public
access, email directory, perhaps an introductory README file, a default GUI layout, etc.
UNIX Fundamentals: Processes and process IDs.

As explained in the UNIX history, a UNIX OS can run many programs, or processes, at the same
time. From the moment a UNIX system is turned on, processes are being created and run. By the time a
system is fully booted so that users can login and use the system, many processes will be running
at once. Each process has its own unique identification number, or process ID. An administrator
can use these ID numbers to control which processes are running in a very direct manner.

For example, if a user has run a program in the background and forgotten to close it down before
logging off (perhaps the user's process is using up too much CPU time) then the admin can shut
down the process using the kill command. Ordinary users can also use the kill command, but
only on processes they own.

Similarly, if a user's display appears frozen due to a problem with some application (eg.
Netscape) then the user can logon to a different system, login to the original system using rlogin,
and then use the kill command to shut down the process at fault, either by using the specific
process ID concerned or by using a general command such as killall, eg.:

killall netscape

This will shut down all currently running Netscape processes, so using specific ID numbers is
often attempted first.
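
A typical sequence might be as follows (the process ID shown is invented purely for illustration):

ps -ef | grep netscape    # note the process ID, eg. 4312
kill 4312                 # polite termination request (the default signal)
kill -9 4312              # forceful kill, only needed if the process ignores the first signal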

Most users only encounter the specifics of processes and how they work when they enter the
world of application development, especially the lower-level aspects of inter-process
communication (pipes and sockets). Users may often run programs containing bugs, perhaps
leaving processes which won't close on their own. Thus, kill can be used to terminate such
unwanted processes.

The way in which UNIX manages processes and the resources they use is extremely tight, ie. it is
very rare for a UNIX system to completely fall over just because one particular process has
caused an error. 3rd-party applications like Netscape are usually the most common causes of
process errors. Most UNIX vendors vigorously test their own system software to ensure they are,
as far as can be ascertained, error-free. One reason why a lot of work goes into ensuring programs
are bug-free is that bugs in software are a common means by which hackers try to gain root
(admin) access to a system: by forcing a particular error condition, a hacker may be able to
exploit a bug in an application.

For an administrator, most daily work concerning processes is about ensuring that system
resources are not being overloaded for some reason, eg. a user running a program which is
forking itself repeatedly, slowing down a system to a crawl.

In the case of the SGI system I run, staff have access to the SGI server, so I must ensure that staff
do not carelessly run processes which hog CPU time. Various means are available by which an
administrator can restrict the degree to which any particular process can utilise system resources,
the most important being a process priority level (see the man pages for 'nice' and 'renice').
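
For example (the program name and process ID are invented for illustration, and exact priority
values vary between UNIX variants):

nice ./bigjob &      # start a long-running job in the background at reduced priority
renice 19 -p 4312    # demote an already running process (PID 4312) to the lowest priority
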
The most common process-related command used by admins and users is 'ps', which displays the
current list of processes. Various options are available to determine which processes are
displayed and in what output format, but perhaps the most commonly used form is this:

ps -ef

which shows just about everything about every process, though other commands exist which can
give more detail, eg. the current CPU usage for each process (osview). Note that other UNIX
OSs (eg. SunOS) require slightly different options, eg. 'ps -aux' - this is an example of the kind of
difference which users might notice between System V and BSD derived UNIX variants.

The Pipe.

An important aspect of processes is inter-process communication. From an everyday point of
view, this involves the concept of pipes. A pipe, as the name suggests, acts as a communication
link between two processes, allowing the output of one process to be used as the input for
another. The pipe symbol is a vertical bar '|'.

One can use the pipe to chain multiple commands together, eg.:

cat *.txt | grep pattern | sort | lp

The above command sequence dumps the contents of all the files in the current directory ending
in .txt, but instead of the output being sent to the 'standard output' (ie. the screen), it is instead
used as the input for the grep operation which scans each incoming line for any occurrence of the
word 'pattern' (grep's output will only be those lines which do contain that word, if any). The
output from grep is then sorted by the sort program on a line-by-line basis for each file found by
cat (in alphanumeric order). Finally, the output from sort is sent to the printer using lp.
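
Even a two-command pipeline can be handy in daily work, eg. counting how many login sessions
are currently active:

who | wc -l    # who prints one line per login session; wc -l counts the lines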

The use of pipes in this way provides an extremely effective way of combining many commands
together to form more powerful and flexible operations. By contrast, such an ability does not
exist in DOS, or in NT.

Processes are explained further in a later lecture, but have been introduced now since certain
process-related concepts are relevant when discussing the UNIX 'shell'.

UNIX Fundamentals: The Shell Command Interface.

A shell is a command-line interface to a UNIX OS, written in C, using a syntax that is very like
the C language. One can enter simple commands (shell commands, system commands, user-
defined commands, etc.), but also more complex sequences of commands, including expressions
and even entire programs written in a scripting language called 'shell script' which is based on C
and known as 'sh' (sh is the lowest level shell; rarely used by ordinary users, it is often used by
admins and system scripts). Note that 'command' and 'program' are used synonymously here.
Shells are not in any way like the PC DOS environment; shells are very powerful and offer users
and admins a direct communication link to the core OS, though ordinary users will find there is a
vast range of commands and programs which they cannot use since they are not the root user.

Modern GUI environments are popular and useful, but some tasks are difficult or impossible to
do with an iconic interface, or at the very least are simply slower to perform. Shell commands
can be chained together (the output of one command acts as the input for another), or placed into
an executable file like a program, except there is no need for a compiler and no object file - shell
'scripts' are widely used by admins for system administration and for performing common tasks
such as locating and removing unwanted files. Combined with the facility for full-scale remote
administration, shells are very flexible and efficient. For example, I have a single shell script
'command' which simultaneously reboots all the SGI Indys in Ve24. These shortcuts are useful
because they minimise keystrokes and mistakes. An admin who issues lengthy and complex
command lines repeatedly will find these shortcuts a handy and necessary time-saving feature.
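
A sketch of such a script is shown below (the host names are illustrative placeholders, and it
assumes the root user is permitted to run rsh commands on the target machines; the real script
may well differ):

#!/bin/sh
# rebootlab - reboot every lab machine from the admin system.
for host in akira indy02 indy03 indy04
do
    echo "Rebooting $host ..."
    rsh $host reboot &    # background each rsh so the reboots overlap
done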

Shells and shell scripts can also use variables, just as a C program can, though the syntax is
slightly different. The equivalent of if/then statements can also be used, as can case statements,
loop structures, etc. Novice administrators will probably not have to use if/then or other more
advanced scripting features at first, and perhaps not even after several years. It is certainly true
that any administrator who already knows the C programming language will find it very easy to
learn shell script programming, and also the other scripting languages which exist on UNIX
systems such as perl (Practical Extraction and Report Language), awk (pattern scanning and
processing language) and sed (text stream editor).

perl is a text-processing language, designed for processing text files, extracting useful data,
producing reports and results, etc. perl is a very powerful tool for system management, especially
combined with other scripting languages. However, perl is perhaps less easy to learn for a
novice; the perl man page says, "The language is intended to be practical (easy to use, efficient,
complete) rather than beautiful (tiny, elegant, minimal)." I have personally never had to write a
perl program as yet, or a program using awk or sed. This is perhaps as good an example as any
of how largely automated modern UNIX systems are. Note that the perl man page
serves as the entire online guide to the perl language and is thus quite large.

An indication of the fact that perl and similar languages can be used to perform complex
processing operations can be seen by examining the humorous closing comment in the perl man
page:

"Perl actually stands for Pathologically Eclectic


Rubbish Lister, but don't tell anyone I said that."

Much of any modern UNIX OS actually operates using shell scripts, many of which use awk, sed
and perl as well as ordinary shell commands and system commands. These scripts can look quite
complicated, but in general they need not be of any concern to the admin; they are often quite old
(ie. written years ago), well understood and bug-free.

Although UNIX is essentially a text-based command-driven system, it is perfectly possible for
most users to do the majority or even all of their work on modern UNIX systems using just the
GUI interface. UNIX variants such as IRIX include advanced GUIs which combine the best of
both worlds. It's common for a new user to begin with the GUI and only discover the power of
the text interface later. This probably happens because most new users are already familiar with
other GUI-based systems (eg. Win95) and initially dismiss the shell interface because of prior
experience of an operating system such as DOS, ie. they perceive a UNIX shell to be just some
weird form of DOS. Shells are not DOS, ie.:

 DOS is an operating system. Win3.1 is built on top of DOS, as is Win95, etc.

 UNIX is an operating system. Shells are a powerful text command interface to UNIX and not the
OS itself. A UNIX OS uses shell techniques in many aspects of its operation.

Shells are thus nothing like DOS; they are closely related to UNIX in that the very first version of UNIX
included a shell interface, and both are written in C. When a UNIX system is turned on, a shell is used
very early in the boot sequence to control what happens and execute actions.

Because of the way UNIX works and how shells are used, much of UNIX's inner workings are
hidden, especially at the hardware level. This is good for the user who only sees what she or he
wants and needs to see of the file structure. An ordinary user focuses on their home directory and
certain useful parts of the file system such as /var/tmp and /usr/share, while an admin will also be
interested in other directories which contain system files, device files, etc. such as /etc, /var/adm
and /dev.

The most commonly used shells are:

bsh - Bourne Shell; standard/job control command programming language

ksh - modern alternative to bsh, but still restricted

csh - Berkeley's C Shell; a better bsh with many additional features

tcsh - an enhanced version of csh

Figure 8. The various available shells.


These offer differing degrees of command access/history/recall/editing and support for shell script
programming, plus other features such as command aliasing (new names for user-defined sequences of
one or more commands). There is also rsh which is essentially a restricted version of the standard
command interpreter sh; it is used to set up login names and execution environments whose capabilities
are more controlled than those of the standard shell. Shells such as csh and tcsh execute the file
/etc/cshrc before reading the user's own .cshrc, .login and perhaps .tcshrc file if that exists.

Shells use the concept of a 'path' to determine how to find commands to execute. The 'shell path
variable', which is initially defined in the user's .cshrc or .tcshrc file, consists of a list of
directories, which may be added to by the user. When a command is entered, the shell
environment searches each directory listed in the path for the command. The first instance of a
file which matches the command is executed, or an error is given if no such executable command
is found. This feature allows multiple versions of the same command to exist in different
locations (eg. different releases of a commercial application). The user can change the path
variable so that particular commands will run a file from a desired directory.

Try:

echo $PATH

The list of directories is given.
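
To add a directory to the search path under csh/tcsh, the path variable can be extended in the
user's .cshrc (the directory shown is just an example):

set path = ($path /usr/local/bin)    # append /usr/local/bin to the search path
rehash                               # rebuild the shell's table of known commands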

WARNING: the dot '.' character at the end of a path definition means 'current directory'; it is
dangerous to include this in the root user's path definition (this is because a root user could run
an ordinary user's program(s) by mistake). Even an ordinary user should think twice about
including a period at the end of their path definition. For example, suppose a file called 'la' was
present in /tmp and was set so that it could be run by any user. Entering 'la' instead of 'ls' by
mistake whilst in /tmp would fail to find 'la' in any normal system directory, but a period in the
path definition would result in the shell finding la in /tmp and executing it; thus, if the la file
contained malicious commands (eg. '/bin/rm -rf $HOME/mail'), then loss of data could occur.

Typical commands used in a shell include (most useful commands listed first):

cd - change directory
ls - list contents of directory
rm - delete a file (no undelete!)
mv - move a file
cat - dump contents of a file
more - display file contents in paged format
find - search file system for files/directories
grep - scan file(s) using pattern matching
man - read/search a man page (try 'man man')
mkdir - create a directory
rmdir - remove directory ('rm -r' has the same effect)
pwd - print current absolute working directory
cmp - show differences between two files
lp - print a file
df - show disk usage
du - show space used by directories/files
mail - send an email message to another user
passwd - change password (yppasswd for systems with NIS)

Figure 9. The commands used most often by any user.


Editors:

vi - ancient editor. Rarely used (arcane), but occasionally useful,
especially for remote administration.

xedit
jot
nedit - GUI editors (jot is old, nedit is newer, xedit is very simple).

Figure 10. Editor commands.

Most of these are not built-in shell commands. Enter 'man csh' or 'man tcsh' to see which commands are
part of the shell and hence which are other system programs, eg. 'which' is a shell command, but 'grep'
is not; 'cd' is a shell command, but 'ls' is not.

vi is an ancient editor developed in the very early days of UNIX when GUI-based displays did
not exist. It is not used much today, but many admins swear by it - this is only really because
they know it so well after years of experience. The vi editor can have its uses though, eg. for
remote administration: if you happen to be using a Wintel PC in an Internet cafe and decide to
access a remote UNIX system via telnet, the vi editor will probably be the only editor which you
can use to edit files on the remote system.

Jot has some useful features, especially for programmers (macros, "Electric C Mode"), but is old
and contains an annoying colour map bug; this doesn't affect the way jot works, but does
sometimes scramble on-screen colours within the jot window. SGI recommends nedit be used
instead.

xedit is a very simple text editor. It has an extremely primitive file selection interface, but has a
rather nice search/replace mechanism.

nedit is a newer GUI editor with more modern features.

jot is specific to SGI systems, while vi, xedit and nedit exist on any UNIX variant (if not by
default, then they can be downloaded in source code or executable format from relevant
anonymous ftp sites).

Creating a new shell:

sh, csh, tcsh, bsh, ksh - use man pages to see differences

I have configured the SGI machines in Ve24 to use tcsh by default due to the numerous extra
useful features in tcsh, including file name completion (TAB), command-line editing, alias
support, file listing in the middle of a typed command (CTRL-D), command recall/reuse, and
many others (the man page lists 36 main extras compared to csh).

Further commands:

which - show location of a command based on current path definition
chown - change owner ID of a file
chgrp - change group ID of a file
chmod - change file access permissions
who - show who is on the local system
rusers - show all users on local network
sleep - pause for a number of seconds
sort - sort data into a particular order
spell - run a spell-check on a file
split - split a file into a number of pieces
strings - show printable text strings in a file
cut - cut out selected fields of
each line of a file
tr - substitute/delete characters from
a text stream or file
wc - count the number of words in a file
whoami - show user ID
write - send message to another user
wall - broadcast to all users on local system
talk - request 1:1 communication link
with another user
to_dos - convert text file to DOS format
(add CTRL-M and CTRL-Z)
to_unix - convert text file to UNIX format
(opposite of to_dos)
su - adopt the identity of another user
(password usually required)

Figure 11. The next most commonly used commands.

Of the commands shown in Fig 11, only 'which' is a built-in shell command.

Any GUI program can also be executed via a text command (the GUI program is just a high-
level interface to the main program), eg. 'fm' for the iconic file manager/viewer, 'apanel' for the
Audio Panel, 'printers' for the Printer Manager, 'iconbook' for the Icon Catalog, 'mouse' for
customising mouse settings, etc. However, not all text commands will have a GUI equivalent - this
is especially true of many system administration commands. Other categories are shown in Figs
12 to 17 below.

fx - repartition a disk, plus other functions
mkfs - make a file system on a disk
mount - mount a file system (NFS)
ln - create a link to a file or directory
tar - create/extract an archive file
gzip - compress a file (gunzip)
compress - compress a file (uncompress).
Different format from gzip.
pack - a further compression method (eg. used
with man pages and release notes)
head - show the first few lines in a file
tail - show the last few lines in a file

Figure 12. File system manipulation commands.

The tar command is another example where slight differences between UNIX variants exist with
respect to default settings. However, command options can always be used to resolve such
differences.
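
For example, to archive and compress a directory and later unpack it (the names are illustrative;
some tar versions require a leading '-' before the option letters):

tar cvf project.tar project/    # create an archive of the project directory
gzip project.tar                # compress it, producing project.tar.gz
gunzip project.tar.gz           # later: decompress...
tar xvf project.tar             # ...and extract the contents
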
hinv - show hardware inventory (SGI specific)
uname - show OS version
gfxinfo - show graphics hardware information (SGI-specific)
sysinfo - print system ID (SGI-specific)
gmemusage - show current memory usage
ps - display a snapshot of running process information
top - constantly updated process list (GUI: gr_top)
kill - shutdown a process
killall - shutdown a group of processes
osview - system resource usage (GUI: gr_osview)
startconsole - system console, a kind of system monitoring
xterm which applications will echo messages into

Figure 13. System Information and Process Management Commands.

inst - install software (text-based)
swmgr - GUI interface to inst (the preferred method; easier to use)
versions - show installed software

Figure 14. Software Management Commands.

cc, CC, gcc - compile program (further commands may exist for other languages)
make - run program compilation script
xmkmf - use imake on an Imakefile to create a vendor-specific make file
lint - check a C program for errors/bugs
cvd - CASE tool, visual debugger for C programs (SGI specific)

Figure 15. Application Development Commands.

relnotes - software release notes (GUI: grelnotes)
man - manual pages (GUI: xman)
insight - online books
infosearch - searchable interface to the above three (IRIX 6.5 and later)

Figure 16. Online Information Commands (all available from the 'Toolchest')

telnet - open communication link
ftp - file transfer
ping - send test packets
traceroute - display traced route to remote host
nslookup - translate domain name into IP address
finger - probe remote host for user information

Figure 17. Remote Access Commands.

This is not a complete list! And do not be intimidated by the apparent plethora of commands. An
admin won't use most of them at first. Many commands are common to any UNIX variant, while
those that aren't (eg. hinv) probably have equivalent commands on other UNIX platforms.

Shells can be displayed in different types of window, eg. winterm, xterm. xterms comply with
the X11 standard and offer a wider range of features. xterms can be displayed on remote
displays, as can any X-based application (this includes just about every program one ever uses).
Security note: the remote system must give permission or be configured to allow remote display
(xhost command).
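
For example (using host names from this document purely as an illustration), to display a
program running on yoda on akira's screen:

xhost +yoda                # on akira: allow X clients from yoda to open windows here
setenv DISPLAY akira:0     # on yoda (csh/tcsh): send X output to akira's display
xman &                     # on yoda: the xman window appears on akira's screen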

If one is accessing a UNIX system via an older text-only terminal (eg. VT100) then the shell
operates in 'terminal' mode, where the particular characteristics of the terminal in use determine
how the shell communicates with the terminal (details of all known terminals are stored in the
/usr/lib/terminfo directory). Shells shown in visual windows (xterms, winterms, etc.) operate a
form of terminal emulation that can be made to exactly mimic a basic text-only terminal if
required.

Tip: if one ever decides to NFS-mount /usr/lib to save space (thus normally erasing the contents
of /usr/lib on the local disk), it is wise to at least leave behind the terminfo directory on the local
disk's /usr/lib; thus, should one ever need to logon to the system when /usr/lib is not mounted,
terminal communication will still operate normally.

The lack of a fundamental built-in shell environment in WindowsNT is one of the most common
criticisms made by IT managers who use NT. It's also why many high-level companies such as
movie studios do not use NT, eg. no genuine remote administration makes it hard to manage
clusters of several dozen systems all at once, partly because different systems may be widely
dispersed in physical location but mainly because remote administration makes many tasks
considerably easier and more convenient.

Regular Expressions and Metacharacters.

Shell commands can employ regular expressions and metacharacters which can act as a means
for referencing large numbers of files or directories, or other useful shortcuts. Regular
expressions are made up of a combination of alphanumeric characters and a series of punctuation
characters that have special meaning to the shell. These punctuation characters are called
metacharacters when they are used for their special meanings with shell commands.
The most common metacharacter is the wildcard '*', used to reference multiple files and/or
directories, eg.:

Dump the contents of all files in the current directory to the display:

cat *

Remove all object files in the current directory:

rm *.o

Search all files ending in .txt for the word 'Alex':

grep Alex *.txt

Print all files beginning with 'March' and ending in '.txt':

lp March*.txt

Print all files beginning with 'May':

lp May*

Note that it is not necessary to use 'May*.*' - this is because the dot is just another character that
can be a valid part of any UNIX file name at any position, ie. a UNIX file name may include
multiple dots. For example, the Blender shareware animation program archive file is called:

blender1.56_SGI_6.2_ogl.tar.gz

By contrast, DOS has a fixed file name format where the dot is a rigid aspect of any file name.
UNIX file names do not have to contain a dot character, and can even contain spaces (though
such names can confuse the shell unless one encloses the entire name in quotes "").
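
For example (the file name is invented), quotes stop the shell from treating the space as an
argument separator:

mv "monthly report.txt" report_march.txt    # without the quotes, mv would see two separate names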

Other useful metacharacters relate to executing previously entered commands, perhaps with
modification, eg. the '!' is used to recall a previous command, as in:

!! - Repeat previous command


!grep - Repeat the last command which began with 'grep'

For example, an administrator might send 20 test packets to a remote site to see if the remote
system is active:

ping -c 20 www.sgi.com

Following a short break, the administrator may wish to run the same command again, which can
be done by entering '!!'. Minutes later, after entering other commands, the admin might want to
run the last ping test once more, which is easily possible by entering '!ping'. If no other command
had since been entered beginning with 'p', then even just '!p' would work.
The '^' character can be used to modify the previous command, eg. suppose I entered:

grwp 'some lengthy search string or whatever' *

grep has been spelled incorrectly here, so an error is given ('grwp: Command not found'). Instead
of typing the whole line again, I could enter:

^w^e

The shell searches the previous command for the first appearance of 'w', replaces that letter with
'e', displays the newly formed command as a means of confirmation and then executes the
command. Note: the '^' operator can only search for the first occurrence of the character or string
to be changed, ie. in the above example, the word 'whatever' is not changed to 'ehatever'. The
parameter to search for, and the pattern to replace any targets found, can be any standard regular
expression, ie. a valid sequence of ASCII characters. In the above example, entering
'^grwp^grep^' would have had the same effect, though is unnecessarily verbose.

Note that characters such as '!' and '^' operate entirely within the shell, ie. they are not
'memorised' as discrete commands. Thus, within a tcsh, using the Up-Arrow key to recall the
previous command after the '^w^e' command sequence does not show any trace of the '^w^e'
action. Only the corrected, executed command is shown.

Another commonly used character is the '&' symbol, normally employed to control whether or
not a process executed from within a shell is run in the foreground or background. As explained in
the UNIX history, UNIX can run many processes at once. Processes employ a parental
relationship whereby a process which creates a new process (eg. a shell running a program) is
said to be creating a child process. The act of creating a new process is called forking. When
running a program from within a shell, the prompt may not come back after the command is
entered - this means the new process is running in 'foreground', ie. the shell process is suspended
until such time as the forked process terminates. In order to run the process in background, which
will allow the shell process to carry on as before and still be used, the '&' symbol must be
included at the end of the command.

For example, the 'xman' command normally runs in the foreground: enter 'xman' in a shell and
the prompt does not return; close the xman program, or type CTRL-C in the shell window, and
the shell prompt returns. This effectively means the xman program is 'tied' to the process which
forked it, in this case the shell. If one closes the shell completely (eg. using the top-left GUI
button, or a kill command from a different shell) then the xman window vanishes too.

However, if one enters:

xman &

then the xman program is run in the 'background', ie. the shell prompt returns immediately (note
the space is optional, ie. 'xman&' is also valid). This means the xman session is now independent
of the process which forked it (the shell) and will still exist even if the shell is closed.
Many programs run in the background by default, eg. swmgr (install system software). The 'fg'
command can be used to bring any process into the foreground using the unique process ID
number which every process has. With no arguments, fg will attempt to bring to the foreground
the most recent process which was run in the background. Thus, after entering 'xman&', the 'fg'
command on its own will make the shell prompt vanish, as if the '&' symbol had never been used.

A process currently running in the foreground can be deliberately 'suspended' using the CTRL-Z
sequence. Try running xman in the foreground within a shell and then typing CTRL-Z - the
phrase 'suspended' is displayed and the prompt returns, showing that the xman process has been
temporarily halted. It still exists, but is frozen. Try using the xman program at this point: notice
that the menus cannot be accessed and the window overlay/underlay actions are not dealt with
anymore.

Now go back to the shell and enter 'fg' - the xman program is brought back into the foreground
and begins running once more. As a final example, try CTRL-Z once more, but this time enter
'bg'. Now the xman process is pushed fully into the background. Thus, if one intends to run a
program in the background but forgets to include the '&' symbol, then one can use CTRL-Z
followed by 'bg' to place the process in the background.
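
To illustrate, a typical such session might look something like the following sketch (the exact
wording of the messages and the job numbers reported vary slightly from one shell to another;
this is roughly what tcsh shows):

% xman
(CTRL-Z pressed here)
Suspended
% bg
[1]    xman &
% jobs
[1]  + Running        xman
% fg
xman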

Note: it is worth mentioning at this point an example of how I once observed Linux to be
operating incorrectly. This example, seen in 1997, probably wouldn't happen today, but at the
time I was very surprised. Using a csh shell on a PC running Linux, I ran the xedit editor in the
background using:

xedit&

Moments later, I had cause to shut down the relevant shell, but the xedit session terminated as
well, which should not have happened since the xedit process was supposed to be running in
background. Exactly why this happened I do not know - presumably there was a bug in the way
Linux handled process forking which I am sure has now been fixed. However, in terms of how
UNIX is supposed to work, it's a bug which should not have been present.

Actually, since many shells such as tcsh allow one to recall previous commands using the arrow
keys, and to edit such commands using Alt/CTRL key combinations and other keys, the need to
use metacharacters such as '!' and '^' is lessened. However, they're useful to know in case one
encounters a different type of shell, perhaps as a result of a telnet session to a remote site where
one may not have any choice over which type of shell is used.

Standard Input (stdin), Standard Output (stdout), Standard Error (stderr).

As stated earlier, everything in UNIX is basically treated as a file. This even applies to the
concept of where output from a program goes to, and where the input to a program comes from.
The relevant files, or text data streams, are called stdin and stdout (standard 'in', standard 'out').
Thus, whenever a command produces a visible output in a shell, what that command is actually
doing is sending its output to the file handle known as stdout. In the case of the user typing
commands in a shell, stdout is defined to be the display which the user sees.
Similarly, the input to a command comes from stdin which, by default, is the keyboard. This is
why, if you enter some commands on their own, they will appear to do nothing at first, when in
fact they are simply waiting for input from the stdin stream, ie. the keyboard. Enter 'cat' on its
own and see what happens; nothing at first, but then enter any text sequence - what you enter is
echoed back to the screen, just as it would be if cat was dumping the contents of a file to the
screen.

This stdin input stream can be temporarily redefined so that a command takes its input from
somewhere other than the keyboard. This is known as 'redirection'. Similarly, the stdout stream
can be redirected so that the output goes somewhere other than the display. The '<' and '>'
symbols are used for data redirection. For example:

ps -ef > file

This runs the ps command, but sends the output into a file. That file could then be examined with
cat, more, or loaded into an editor such as nedit or jot.

Try:

cat > file

You can then enter anything you like until such time as some kind of termination signal is sent,
either CTRL-D which acts to end the text stream, or CTRL-C which stops the cat process. Type
'hello', press Enter, then press CTRL-D. Enter 'cat file' to see the file's contents.

A slightly different form of output redirection is '>>' which appends a data stream to the end of
an existing file, rather than completely overwriting its current contents. Enter:

cat >> file

and type 'there!' followed by Enter and then CTRL-D. Now enter 'cat file' and you will see:

% cat file
hello
there!

By contrast, try the above again but with the second operation also using the single '>' operator.
This time, the file's contents will only be 'there!'. And note that the following has the same effect
as 'cat file' (why?):

cat < file

Anyone familiar with C++ programming will recognise this syntax as being similar to the '<<' and
'>>' stream operators which C++ programs use to handle output and input.

Input and output redirection is used extensively by system shell scripts. Users and administrators
can use these operators as a quick and convenient way for managing program input and output.
For example, the output from a find command could be redirected into a file for later
examination. I often use 'cat > whatever' as a quick and easy way to create a short file without
using an editor.

Error messages from programs and commands are also often sent to a different output stream
called stderr - by default, stderr is also the relevant display window, or the Console Window if
one exists on-screen.

The numeric file handles associated with these three text streams are:

0 - stdin
1 - stdout
2 - stderr

These numbers can be placed before the < and > operators to select a particular stream to deal
with. Examples of this are given in the notes on shell script programming (Day 2).
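
As a brief preview (fuller examples appear in the Day 2 notes on shell scripts), the following
lines use Bourne-shell syntax to redirect stderr; note that csh-family shells use a different
mechanism, eg. '>&' redirects stdout and stderr together:

find /usr -name "*.rgb" -print 2> errors - send any error messages to a file called 'errors'
find /usr -name "*.rgb" -print 2> /dev/null - discard error messages entirely
find /usr -name "*.rgb" -print > results 2>&1 - send both output and errors to 'results'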

The '&&' combination allows one to chain commands together so that each command is only
executed if the preceding command was successful, eg.:

run_my_prog_which_takes_hours > results && lp results

In this example, some arbitrary program is executed which is expected to take a long time. The
program's output is redirected into a file called results. If and only if the program terminates
successfully will the results file be sent to the default printer by the lp program. Note: any error
messages which the program writes to stdout will also end up in the results file; messages sent to
stderr will still appear on the display unless stderr is redirected as well (see the stream handles
described above).

One common use of the && sequence is for on-the-spot backups:

cd /home && tar cv . && eject

This sequence changes directory to the /home area, archives the contents of /home to DAT and
ejects the DAT tape once the archive process has completed. Note that the eject command
without any arguments will search for a default removable media device, so this example
assumes there is only one such device, a DAT drive, attached to the system. Otherwise, one
could use 'eject /dev/tape' to be more specific.

The semicolon can also be used to chain commands together, but in a manner which does not
require each command to be successful in order for the next command to be executed, eg. one
could run two successive find commands, searching for different types of file, like this (try
executing this command in the directory /mapleson/public_html/sgi):

find . -name "*.gz" -print; find . -name "*.mpg" -print

The output given is:

./origin/techreport/compcon97_dv.pdf.gz
./origin/techreport/origin_chap7.pdf.gz
./origin/techreport/origin_chap6.pdf.gz
./origin/techreport/origin_chap5.pdf.gz
./origin/techreport/origin_chap4.pdf.gz
./origin/techreport/origin_chap3.pdf.gz
./origin/techreport/origin_chap2.pdf.gz
./origin/techreport/origin_chap1.5.pdf.gz
./origin/techreport/origin_chap1.0.pdf.gz
./origin/techreport/compcon_paper.pdf.gz
./origin/techreport/origin_techrep.pdf.tar.gz
./origin/techreport/origin_chap1-7TOC.pdf.gz
./pchall/pchal.ps.gz
./o2/phase/phase6.mpg
./o2/phase/phase7.mpg
./o2/phase/phase4.mpg
./o2/phase/phase5.mpg
./o2/phase/phase2.mpg
./o2/phase/phase3.mpg
./o2/phase/phase1.mpg
./o2/phase/phase8.mpg
./o2/phase/phase9.mpg

If one changes the first find command so that it will give an error, the second find command still
executes anyway:

% find /tmp/gurps -name "*.gz" -print ; find . -name "*.mpg" -print


cannot stat /tmp/gurps
No such file or directory
./o2/phase/phase6.mpg
./o2/phase/phase7.mpg
./o2/phase/phase4.mpg
./o2/phase/phase5.mpg
./o2/phase/phase2.mpg
./o2/phase/phase3.mpg
./o2/phase/phase1.mpg
./o2/phase/phase8.mpg
./o2/phase/phase9.mpg

However, if one changes the ; to && and runs the sequence again, this time the second find
command will not execute because the first find command produced an error:

% find /tmp/gurps -name "*.gz" -print && find . -name "*.mpg" -print
cannot stat /tmp/gurps
No such file or directory

As a final example, enter the following:

find /usr -name "*.htm*" -print & find /usr -name "*.rgb" -print &

This command runs two separate find processes, both in the background at the same time. Unlike
the previous examples, the output from each command is displayed first from one, then from the
other, and back again in a non-deterministic manner, as and when matching files are located by
each process. This is clear evidence that both processes are running at the same time. To shut
down the processes, either use 'killall find' or enter 'fg' followed by the use of CTRL-C twice (or
one could use kill with the appropriate process IDs, identifiable using 'ps -ef | grep find').

When writing shell script files, the ; symbol is most useful when one can identify commands
which do not depend on each other. This symbol, and the other symbols described here, are
heavily used in the numerous shell script files which manage many aspects of any modern UNIX
OS.

Note: if non-dependent commands are present in a script file or program, this immediately
suggests the idea of running them in parallel, ie. exploiting an OS which can run many processes
at once, potentially spread across multiple processors. A typical example use of such a feature
would be batch processing scripts for image processing of medical data, or scripts that manage
database systems, financial accounts, etc.
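
For example, a minimal sketch of such a script might look like this (the process_scan command
is a hypothetical placeholder; the '&' symbols run the two independent jobs in parallel and the
wait command pauses the script until both have finished):

#!/bin/sh
# Process two independent scan datasets at the same time.
process_scan patient1.img > patient1.log 2>&1 &
process_scan patient2.img > patient2.log 2>&1 &

# Wait for both background jobs to complete before continuing.
wait
echo "Batch run finished"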

References:

1. HP-UX/SUN Interoperability Cookbook, Version 1.0, Copyright 1994 Hewlett-Packard Co.:
   http://www.hp-partners.com/ptc_public/techsup/SunInterop/

2. comp.sys.hp.hpux FAQ, Copyright 1995 by Colin Wynd:
   http://hpux.csc.liv.ac.uk/hppd/FAQ/

3. Department of Computer Science and Electrical Engineering, Heriot Watt University, Riccarton
   Campus, Edinburgh, Scotland:
   http://www.cee.hw.ac.uk/
Day 1:
Part 3: File ownership and access permissions.
Online help (man pages, etc.)

UNIX Fundamentals: File Ownership

UNIX has the concept of file 'ownership': every file has a unique owner, specified by a user ID
number contained in /etc/passwd. When examining the ownership of a file with the ls command,
one always sees the symbolic name for the owner, unless the corresponding ID number does not
exist in the local /etc/passwd file and is not available by any system service such as NIS.

Every user belongs to a particular group; in the case of the SGI system I run, every user belongs
to either the 'staff' or 'students' group (note that a user can belong to more than one group, eg. my
network has an extra group called 'projects'). Group names correspond to unique group IDs and
are listed in the /etc/group file. When listing details of a file, usually the symbolic group name is
shown, as long as the group ID exists in the /etc/group file, or is available via NIS, etc.

For example, the command:

ls -l /

shows the full details of all files in the root directory. Most of the files and directories are owned
by the root user, and belong to the group called 'sys' (for system). An exception is my home
account directory /mapleson which is owned by me.

Another example command:

ls -l /home/staff

shows that every staff member owns their particular home directory. The same applies to
students, and to any user which has their own account. The root user owns the root account (ie.
the root directory) by default.

The existence of user groups offers greater flexibility in how files are managed and the way in
which users can share their files with other users. Groups also offer the administrator a logical
way of managing distinct types of user, eg. a large company might have several groups:

accounts
clerical
investors
management
security

The admin decides on the exact names. In reality though, a company might have several internal
systems, perhaps in different buildings, each with their own admins and thus possibly different
group names.
UNIX Fundamentals: Access Permissions

Every file also has a set of file 'permissions'; the file's owner can set these permissions to alter
who can read, write or execute the file concerned. The permissions for any file can be examined
using the ls command with the -l option, eg.:

% ls -l /etc/passwd
-rw-r--r-- 1 root sys 1306 Jan 31 17:07 /etc/passwd

 uuugggooo links owner group size date modified name

Each file has three sets of file access permissions (uuu, ggg, ooo), relating to:

 the file's owner, ie. the 'user' field
 the group which the file's owner belongs to
 the 'rest of the world' (useful for systems with more than one group)

This discussion refers to the above three fields as 'user', 'group' and 'others'. In the above example, the
three sets of permissions are the fields shown as uuugggooo, ie. the main system password file can be
read by any user that has access to the relevant host, but can only be modified by the root user. The
very first character is not a permission at all but indicates the file type: it is '-' for an ordinary file, 'd' if
the file is a directory, or 'l' if the file is a link to some other file or directory (many examples of this can
be found in the root directory and in /etc).

Such a combination of options offers great flexibility, eg. one can have private email (user-only),
or one can share documents only amongst one's group (eg. staff could share exam documents, or
students could share files concerning a Student Union petition), or one can have files that are
accessible by anyone (eg. web pages). The same applies to directories, eg. since a user's home
directory is owned by that user, an easy way for a user to prevent anyone else from accessing
their home directory is to remove all read and execute permissions for groups and others.

File ownership and file access permissions are a fundamental feature of every UNIX file,
whether that file is an ordinary file, a directory, or some kind of special device file. As a result,
UNIX as an OS has inherent built-in security for every file. This can lead to problems if the
wrong permissions are set for a file by mistake, but assuming the correct permissions are in
place, a file's security is effectively secured.

Note that no non-UNIX operating system for PCs yet offers this fundamental concept of file-
ownership at the very heart of the OS, a feature that is definitely required for proper security.
This is largely why industrial-level companies, military, and government institutions do not use
NT systems where security is important. In fact, only Cray's Unicos (UNIX) operating system
passes all of the US DoD's security requirements.
Relevant Commands:

chown - change file ownership

chgrp - change group status of a file

chmod - change access permissions for one or more files

For a user to alter the ownership and/or access permissions of a file, the user must own that file.
Without the correct ownership, an error is given, eg. assuming I'm logged on using my ordinary
'mapleson' account:

% chown mapleson var
var - Operation not permitted

% chmod go+w /var
chmod() failed on /var: Operation not permitted

% chgrp staff /var
/var - Operation not permitted

All of these operations are attempting to access files owned by root, so they all fail.

Note: the root user can access any file, no matter what ownership or access permissions have
been set. As a result, most hacking attempts on UNIX systems revolve around trying to gain root
privileges.

Most ordinary users will rarely use the chown or chgrp commands, but administrators may often
use them when creating accounts, installing custom software, writing scripts, etc.

For example, an admin might download some software for all users to use, installing it
somewhere in /usr/local. The final steps might be to change the ownership of every newly
installed file to ensure that it is owned by root, with the group set to sys, and then to use chmod
to ensure any newly installed executable programs can be run by all users, and perhaps to restrict
access to original source code.

Although chown is normally used to change the user ID of a file, and chgrp the group ID, chown
can actually do both at once. For example, while acting as root:

yoda 1# echo hello > file
yoda 2# ls -l file
-rw-r--r-- 1 root sys 6 May 2 21:50 file
yoda 3# chgrp staff file
yoda 4# chown mapleson file
yoda 5# ls -l file
-rw-r--r-- 1 mapleson staff 6 May 2 21:50 file
yoda 6# /bin/rm file
yoda 7# echo hello > file
yoda 8# ls -l file
-rw-r--r-- 1 root sys 6 May 2 21:51 file
yoda 9# chown mapleson.staff file
yoda 10# ls -l file
-rw-r--r-- 1 mapleson staff 6 May 2 21:51 file

Figure 18. Using chown to change both user ID and group ID.

Changing File Permissions: Examples.

The general syntax of the chmod command is:

chmod [-R] <mode> <filename(s)>

Where <mode> defines the new set of access permissions. The -R option is optional (denoted by
square brackets []) and can be used to recursively change the permissions for the contents of a
directory.

<mode> can be defined in two ways: using Octal (base-8) numbers or by using a sequence of
meaningful symbolic letters. This discussion covers the symbolic method since the numeric
method (described in the man page for chmod) is less intuitive to use. I wouldn't recommend an
admin use Octal notation until greater familiarity with how chmod works is attained.

<mode> can be summarised as containing three parts:

U operator P

where U is one or more characters corresponding to user, group, or other; operator is +, -, or =,
signifying assignment of permissions; and P is one or more characters corresponding to the
permission mode.

Some typical examples would be:

chmod go-r file - remove read permission for groups and others
chmod ugo+rx file - add read/execute permission for all
chmod ugo=r file - set permission to read-only for all users

A useful abbreviation in place of 'ugo' is 'a' (for all), eg.:

chmod a+rx file - give read and execute permission for all
chmod a=r file - set to read-only for all

For convenience, if the U part is missing, the command automatically acts for all, eg.:

chmod -x file - remove executable access from everyone
chmod =r file - set to read-only for everyone

though if a change in write permission is included, the change normally affects only the user
field; strictly speaking, permission bits present in the current umask value (typically group and
other write) are left untouched:

chmod +w file - add write access only for user
chmod +rwx file - add read/execute for all, add write only for user
chmod -rw file - remove read from all, remove write from user

Note the difference between the +/- operators and the = operator: + and - add or take away from
existing permissions, while = sets all the permissions to a particular state, eg. consider a file
which has the following permissions as shown by ls -l:

-rw-------

The command 'chmod +rx' would change the permissions to:

-rwxr-xr-x

while the command 'chmod =rx' would change the permissions to:

-r-xr-xr-x

ie. the latter command has removed the write permission from the user field because the rx
permissions were set for everyone rather than just added to an existing state. Further examples of
possible permissions states can be found in the man page for ls.

A clever use of file ownership and groups can be employed by anyone to 'hand over' ownership
of a file to another user, or even to root. For example, suppose user alex arranges with user sam
to leave a new version of a project file (eg. a C program called project.c) in the /var/tmp
directory of a particular system at a certain time. User alex not only wants sam to be able to read
the file, but also to remove it afterwards, eg. move the file to sam's home directory with mv.
Thus, alex could perform the following sequence of commands:

cp project.c /var/tmp - copy the file
cd /var/tmp - change directory
chmod go-rwx project.c - remove all access for everyone else
chown sam project.c - change ownership to sam

Figure 19. Handing over file ownership using chown.

Fig 19 assumes alex and sam are members of the same group, though an extra chgrp command
could be used before the chown if this wasn't the case, or a combinational chown command used
to perform both changes at once.

After the above commands, alex will not be able to read the project.c file, or remove it. Only sam
has any kind of access to the file.

I once used this technique to show students how they could 'hand-in' project documents to a
lecturer in a way which would not allow students to read each others' submitted work.
Note: it can be easy for a user to 'forget' about the existence of hidden files and their associated
permissions. For example, someone doing some confidential movie editing might forget or not
even know that temporary hidden files are often created for intermediate processing. Thus,
confidential tasks should always be performed by users inside a sub-directory in their home
directory, rather than just in their home directory on its own.

Experienced users make good use of file access permissions to control exactly who can access
their files, and even who can change them.

Experienced administrators develop a keen eye and can spot when a file has unusual or perhaps
unintended permissions, eg.:

-rwxrwxrwx

if a user's home directory has permissions like this, it means anybody can read, write and execute
files in that directory: this is insecure and was probably not intended by the user concerned.

A typical example of setting appropriate access permissions is shown by my home directory:

ls -l /mapleson

Only those directories and files that I wish to be readable by anyone have the group and others
permissions set to read and execute.

Note: to aid security, in order for a user to access a particular directory, the execute permission
must be set on for that directory as well as read permission at the appropriate level (user, group,
others). Also, only the owner of a file can change the permissions or ownership state for that file
(this is why a chown/chgrp sequence must have the chgrp done first, or both at once via a
combinational chown).

The Set-UID Flag.

This special flag appears as an 's' instead of 'x' in either the user or group fields of a file's
permissions, eg.:

% ls -l /sbin/su
-rwsr-xr-x 1 root sys 40180 Apr 10 22:12 /sbin/su*

The online book, "IRIX Admin: Backup, Security, and Accounting", states:

"When a user runs an executable file that has either of these


permissions, the system gives the user the permissions of the
owner of the executable file."

An admin might use su to temporarily become root or another user without logging off. Ordinary
users may decide to use it to enable colleagues to access their account, but this should be
discouraged since using the normal read/write/execute permissions should be sufficient.
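
As an illustration of how the flag is set and how it appears, consider the following sketch (the
file name is hypothetical; the 's' replaces the 'x' in the user field of the ls output):

% ls -l myprog
-rwxr-xr-x 1 mapleson staff 24344 May 2 21:50 myprog
% chmod u+s myprog
% ls -l myprog
-rwsr-xr-x 1 mapleson staff 24344 May 2 21:50 myprog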
Mandatory File Locking.

If the 'l' flag is set in a file's group permissions field, then the file will be locked while another
user from the same group is accessing the file. For example, file locking allows a user to gather
data from multiple users in their own group via a group-writable file (eg. petition, questionnaire,
etc.), but blocks simultaneous file-write access by multiple users - this prevents data loss which
might otherwise occur via two users writing to a file at the same time with different versions of
the file.

UNIX Fundamentals: Online Help

From the very early days of UNIX, online help information was available in the form of manual
pages, or 'man' pages. These contain an extensive amount of information on system commands,
program subroutines, system calls and various general references pages on topics such as file
systems, CPU hardware issues, etc.

The 'man' command allows one to search the man page database using keywords, but this text-
based interface is still somewhat restrictive in that it does not allow one to 'browse' through
pages at will and does not offer any kind of direct hyperlinked reference system, although each
man pages always includes a 'SEE ALSO' section so that one will know what other man pages
are worth consulting.

Thus, most modern UNIX systems include the 'xman' command: a GUI interface using X
Window displays that allows one to browse through man pages at will and search them via
keywords. System man pages are actually divided into sections, a fact which is not at all obvious
to a novice user of the man command. By contrast, xman reveals immediately the existence of
these different sections, making it much easier to browse through commands.
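
Some typical ways of driving the man command directly are shown below (the section numbering
given here follows the System V convention used by IRIX; BSD-derived systems number some
sections differently):

man ls - display the man page for the ls command
man -k password - keyword search of the man page database (equivalent to 'apropos password')
man 4 passwd - display the passwd file-format page from section 4, rather than the
passwd command page from section 1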

Since xman uses the various X Windows fonts to display information, the displayed text can
incorporate special font styling such as italics and bold text to aid clarity. A man page shown in a
shell can use bright characters and inverted text, but data shown using xman is much easier to
read, except where font spacing is important, eg. enter 'man ascii' in a shell and compare it to the
output given by xman (use xman's search option to bring up the man page for ascii).

xman doesn't include a genuine hypertext system, but the easy-to-access search option makes it
much more convenient to move from one page to another based on the contents of a particular
'SEE ALSO' section.

Most UNIX systems also have some form of online book archive. SGIs use the 'Insight' library
system which includes a great number of books in electronic form, all written using hypertext
techniques. An ordinary user would be expected to begin their learning process by using the
online books rather than the man pages since the key introductory books guide the user through
the basics of using the system via the GUI interface rather than the shell interface.
SGIs also have online release notes for each installed software product. These can be accessed
via the command 'grelnotes' which gives a GUI interface to the release notes archive, or one can
use relnotes in a shell or terminal window. Other UNIX variants probably also have a similar
information resource. Many newer software products also install local web pages as a means of
providing online information, as do 3rd-party software distributions. Such web pages are usually
installed somewhere in /usr/local, eg. /usr/local/doc. The URL format 'file:/file-path' is used to
access such pages, though an admin can install file links with the ln command so that online
pages outside of the normal file system web area (/var/www/htdocs on SGIs) are still accessible
using a normal http format URL.

In recent years, there have been moves to incorporate web technologies into UNIX GUI systems.
SGI began their changes in 1996 (a year before anyone else) with the release of the O2
workstation. IRIX 6.3 (used only with O2) included various GUI features to allow easy
integration between the existing GUI and various web features, eg. direct iconic links to web
sites, and using Netscape browser window interface technologies for system administration,
online information access, etc. Most UNIX variants will likely have similar features; on SGIs
with the latest OS version (IRIX 6.5), the relevant system service is called InfoSearch - for the
first time, users have a single entry point to the entire online information structure, covering man
pages, online books and release notes.

Also, extra GUI information tools are available for consulting "Quick Answers" and "Hints and
Shortcuts". These changes are all part of a general drive on UNIX systems to make them easier
to use.

Unlike the xman resource, viewing man pages using InfoSearch does indeed hyperlink
references to other commands and resources throughout each man page. This again enhances the
ability of an administrator, user or application developer to locate relevant information.

Summary: UNIX systems have a great deal of online information. As the numerous UNIX
variants have developed, vendors have attempted to improve the way in which users can access
that information, ultimately resulting in highly evolved GUI-based tools that employ standard
windowing technologies such as those offered by Netscape (so that references may include direct
links to web sites, ftp sites, etc.), along with hypertext techniques and search mechanisms.
Knowing how to make the best use of available documentation tools can often be the key to
effective administration, ie. locating answers quickly as and when required.
Detailed Notes for Day 2 (Part 1)
UNIX Fundamentals: System Identity, IP Address, Domain Name, Subdomain.

Every UNIX system has its own unique name, which is the means by which that machine is
referenced on local networks and beyond, eg. the Internet. The normal term for this name is the
local 'host' name. Systems connected to the Internet employ naming structures that conform to
existing structures already used on the Internet. A completely isolated network can use any
naming scheme.

Under IRIX, the host name for a system is stored in the /etc/sys_id file. The name may be up to
64 alphanumeric characters in length and can include hyphens and periods. Period characters '.'
are not part of the real name but instead are used to separate the sequence into a domain-name
style structure (eg. www.futuretech.vuurwerk.nl). The SGI server's host name is yoda, the fully-
qualified version of which is written as yoda.comp.uclan.ac.uk. The choice of host names is
largely arbitrary, eg. the SGI network host names are drawn from my video library (I have
chosen names designed to be short without being too uninteresting).

On bootup, a system's /etc/rc2.d/S20sysetup script reads its /etc/sys_id file to determine the local
host name. From then onwards, various system commands and internal function calls will return
that system name, eg. the 'hostname' and 'uname' commands (see the respective man pages for
details).
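
For example, all of the following can be used to check the identity of the system one is currently
logged into, and would normally report the same name:

cat /etc/sys_id - show the name stored in the host name file
hostname - report the current host name
uname -n - report the network node name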

Along with a unique identity in the form of a host name, a UNIX system has its own 32bit
Internet Protocol (IP) address, split for convenience into four 8bit integers separated by periods,
eg. yoda's IP address is 193.61.250.34, an address which is visible to any system anywhere on
the Internet.

IP is the network-level communications protocol used by Internet systems and services. Various
extra options can be used with IP layer communications to create higher-level services such as
TCP (Transmission Control Protocol). The entire Internet uses the TCP/IP protocols for
communication.

A system which has more than one network interface (eg. multiple Ethernet ports) must have a
unique IP address for each port. Special software may permit a system to have extra addresses,
eg. 'IP Aliasing', a technique often used by an ISP to provide a more flexible service to its
customers. Note: unlike predefined Ethernet addresses (every Ethernet card has its own unique
address), a system's IP address is determined by the network design, admin personnel, and
external authorities.

Conceptually speaking, an IP address consists of two numbers: one represents the network while
the other represents the system. In order to more efficiently make use of the numerous possible
address 'spaces', four classes of addresses exist, named A, B, C and D. The first few bits of an
address determines its class:
Class   Initial Binary   No. of Bits for       No. of Bits for
        Bit Field        the Network Number    the Host Number

A       0                7                     24
B       10               14                    16
C       110              21                    8
D       1110             [special 'multicast' addresses for internal network use]

Figure 20. IP Address Classes: bit field and width allocations.

This system allows the Internet to support a range of different network sizes with differing
maximum limits on the number of systems for each type of network:

                       Class A      Class B    Class C    Class D

No. of networks:       128          16384      2097152    [multicast]
No. of systems each:   16777214     65534      254        [multicast]

Figure 21. IP Address Classes: supported network types and sizes.

The numbers 0 and 255 are never used for any host. These are reserved for special uses.

Note that a network which will never be connected to the Internet can theoretically use any IP
address and domain/subdomain configuration.
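
As a worked example, consider yoda's external address, 193.61.250.34: the first octet, 193, is
11000001 in binary, so the address begins with the bit pattern 110 and is therefore Class C. The
remaining 21 bits of the first three octets (193.61.250) identify the network, while the final 8 bits
(34) identify the host on that network.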
Which class of network an organisation uses depends on how many systems it expects to have
within its network. Organisations are allocated IP address spaces by Internet Network
Information Centers (InterNICs), or by their local ISP if that is how they are connected to the
Internet. An organisation's domain name (eg. uclan.ac.uk) is also obtained from the local
InterNIC or ISP. Once a domain name has been allocated, the organisation is free to setup its
own network subdomains such as comp.uclan.ac.uk (comp = Computing Department), within
which an individual host would be yoda.comp.uclan.ac.uk. A similar example is Heriot Watt
University in Edinburgh (where I studied for my BSc) which has the domain hw.ac.uk, with its
Department of Computer Science and Electrical Engineering using a subdomain called
cee.hw.ac.uk, such that a particular host is www.cee.hw.ac.uk (see Appendix A for an example
of what happens when this methodology is not followed correctly).

UCLAN uses Class C addresses, with example address spaces being 193.61.255 and 193.61.250.
A small number of machines in the Computing Department use the 250 address space, namely
the SGI server's external Ethernet port at 193.61.250.34, and the NT server at 193.61.250.35
which serves the NT network in Ve27.

Yoda has two Ethernet ports; the remaining port is used to connect to the SGI Indys via a hub -
this port has been defined to use a different address space, namely 193.61.252. The machines' IP
addresses range from 193.61.252.1 for yoda, to 193.61.252.23 for the admin Indy; .20 to .22 are
kept available for two HP systems which are occasionally connected to the network, and for a
future plan to include Apple Macs on the network.

The IP addresses of the Indys using the 252 address space cannot be directly accessed outside the
SGI network or, as the jargon goes, 'on the other side' of the server's Ethernet port which is being
used for the internal network. This automatically imposes a degree of security at the physical
level.

IP addresses and host names for systems on the local network are brought together in the file
/etc/hosts. Each line in this file gives an IP address, an official hostname and then any name
aliases which represent the same system, eg. yoda.comp.uclan.ac.uk is also known as
www.comp.uclan.ac.uk, or just yoda, or www, etc. When a system is first booted, the ifconfig
command uses the /etc/hosts file to assign addresses to the various available Ethernet network
interfaces. Enter 'more /etc/hosts' or 'nedit /etc/hosts' to examine the host names file for the
particular system you're using.
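
Two commands which can be used to check the result of this assignment are shown below (the
interface name ec0 is typical for the built-in Ethernet port on an Indy; other systems may use
different interface names):

netstat -i - list the network interfaces the system knows about
/usr/etc/ifconfig ec0 - show the IP address and state assigned to interface ec0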

NB: due to the Internet's incredible expansion in recent years, the world is actually beginning to
run out of available IP addresses and domain names; at best, existing top-level domains are being
heavily overused (eg. .com, .org, etc.) and the number of allocatable network address spaces is
rapidly diminishing, especially if one considers the possible expansion of the Internet into
Russia, China, the Far East, Middle East, Africa, Asia and Latin America. Thus, there are moves
afoot to change the Internet so that it uses 128bit instead of 32bit IP addresses. When this will
happen is unknown, but such a change would solve the problem.
Special IP Addresses

Certain reserved IP addresses have special meanings, eg. the address 127.0.0.1 is known as the
'loopback' address (equivalent host name 'localhost') and always refers to the local system which
one happens to be using at the time. If one never intends to connect a system to the Internet,
there's no reason why this default IP address can't be left as it is with whatever default name
assigned to it in the /etc/hosts file (SGIs always use the default name, "IRIS"), though most
people do change their system's IP address and host name in case, for example, they have to
connect their system to the network used at their place of work, or to provide a common naming
scheme, group ID setup, etc.

If a system's IP address is changed from the default 127.0.0.1, the exact procedure is to add a
new line to the /etc/hosts file such that the system name corresponds to the information in
/etc/sys_id. One must never remove the 127.0.0.1 entry from the /etc/hosts file or the system will
not work properly. The important lines of the /etc/hosts file used on the SGI network are shown
in Fig 22 below (the appearance of '[etc]' in Fig 22 means some text has been clipped away to aid
clarity).

# This entry must be present or the system will not work.
127.0.0.1 localhost

# SGI Server. Challenge S.
193.61.252.1 yoda.comp.uclan.ac.uk yoda www.comp.uclan.ac.uk www
[etc]

# Computing Services router box link.
193.61.250.34 gate-yoda.comp.uclan.ac.uk gate-yoda

# SGI Indys in Ve24, except milamber which is in Ve47.
193.61.252.2 akira.comp.uclan.ac.uk akira
193.61.252.3 ash.comp.uclan.ac.uk ash
193.61.252.4 cameron.comp.uclan.ac.uk cameron
193.61.252.5 chan.comp.uclan.ac.uk chan
193.61.252.6 conan.comp.uclan.ac.uk conan
193.61.252.7 gibson.comp.uclan.ac.uk gibson
193.61.252.8 indiana.comp.uclan.ac.uk indiana
193.61.252.9 leon.comp.uclan.ac.uk leon
193.61.252.10 merlin.comp.uclan.ac.uk merlin
193.61.252.11 nikita.comp.uclan.ac.uk nikita
193.61.252.12 ridley.comp.uclan.ac.uk ridley
193.61.252.13 sevrin.comp.uclan.ac.uk sevrin
193.61.252.14 solo.comp.uclan.ac.uk solo
193.61.252.15 spock.comp.uclan.ac.uk spock
193.61.252.16 stanley.comp.uclan.ac.uk stanley
193.61.252.17 warlock.comp.uclan.ac.uk warlock
193.61.252.18 wolfen.comp.uclan.ac.uk wolfen
193.61.252.19 woo.comp.uclan.ac.uk woo
193.61.252.23 milamber.comp.uclan.ac.uk milamber

[etc]

Figure 22. The contents of the /etc/hosts file used on the SGI network.
One example use of the localhost address is when a user accesses a system's local web page
structure at:

http://localhost/

On SGIs, such an address brings up a page about the machine the user is using. For the SGI
network, the above URL always brings up a page for yoda since /var/www is NFS-mounted from
yoda. The concept of a local web page structure for each machine is more relevant in company
Intranet environments where each employee probably has her or his own machine, or where
different machines have different locally stored web page information structures due to, for
example, differences in available applications, etc.

The BIND Name Server (DNS).

If a site is to be connected to the Internet, then it should use a name server such as BIND
(Berkeley Internet Name Domain) to provide an Internet Domain Names Service (DNS). DNS is
an Internet-standard name service for translating hostnames into IP addresses and vice-versa. A
client machine wishing to access a remote host executes a query which is answered by the DNS
daemon, called 'named'. Yoda runs a DNS server and also a Proxy server, allowing the machines
in Ve24 to access the Internet via Netscape (telnet, ftp, http, gopher and other services can be
used).

Most of the relevant database configuration files for a DNS setup reside in /var/named. A set of
example configuration files are provided in /var/named/Examples - these should be used as
templates and modified to reflect the desired configuration. Setting up a DNS database can be a
little confusing at first, thus the provision of the Examples directory. The files which must be
configured to provide a functional DNS are:

/etc/named.boot
/var/named/root.cache
/var/named/named.hosts
/var/named/named.rev
/var/named/localhost.rev

If an admin wishes to use a configuration file other than /etc/named.boot, then its location should
be specified by creating a file called /etc/config/named.options with the following contents (or
added to named.options if it already exists):

-b some-other-boot-file

After the files in /var/named have been correctly configured, the chkconfig command is used to
set the appropriate variable file in /etc/config:

chkconfig named on
The next reboot will activate the DNS service. Once started, named reads initial configuration
information from the file /etc/named.boot, such as what kind of server it should be, where the
DNS database files are located, etc. Yoda's named.boot file looks like this:

;
; Named boot file for yoda.comp.uclan.ac.uk.
;
directory /var/named

cache . root.cache
primary comp.uclan.ac.uk named.hosts
primary 0.0.127.IN-ADDR.ARPA localhost.rev
primary 252.61.193.IN-ADDR.ARPA named.rev
primary 250.61.193.IN-ADDR.ARPA 250.rev
forwarders 193.61.255.3 193.61.255.4

Figure 23. Yoda's /etc/named.boot file.

Looking at the contents of the example named.boot file in /var/named/Examples, the differences
are not that great:

;
; boot file for authoritative master name server for Berkeley.EDU
; Note that there should be one primary entry for each SOA record.
;
;
sortlist 10.0.0.0

directory /var/named

; type        domain                 source host/file     backup file

cache         .                      root.cache
primary       Berkeley.EDU           named.hosts
primary       32.128.IN-ADDR.ARPA    named.rev
primary       0.0.127.IN-ADDR.ARPA   localhost.rev

Figure 24. The example named.boot file in /var/named/Examples.

Yoda's file has an extra line for the /var/named/250.rev file; this was an experimental attempt to
make Yoda's subdomain accessible outside UCLAN, which failed because of the particular
configuration of a router box elsewhere in the communications chain (the intention was to enable
students and staff to access the SGI network using telnet from a remote host).

For full details on how to configure a typical DNS, see Chapter 6 of the online book, "IRIX
Admin: Networking and Mail". A copy of this Chapter has been provided for reference. As an
example of how identical DNS is across UNIX systems, see the issue of Network Week [10]
which has an article on configuring a typical DNS. Also, a copy of each of Yoda’s DNS files
which I had to configure is included for reference. Together, these references should serve as an
adequate guide to configuring a DNS; as with many aspects of managing a UNIX system,
learning how someone else solved a problem and then modifying copies of what they did can be
very effective.
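
Once the database files are in place and named has been started, a few simple checks will confirm
that the service is enabled and answering queries (nslookup is the standard query tool supplied
with BIND):

chkconfig | grep named - confirm that the named flag is set to on
ps -ef | grep named - confirm that the daemon is actually running
nslookup akira.comp.uclan.ac.uk - ask the local name server to resolve a host name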

Note: it is not always wise to use a GUI tool for configuring a service such as BIND [11]. It's too
easy for ill-tested grandiose software management tools to make poor assumptions about how an
admin wishes to configure a service/network/system. Services such as BIND come with their
own example configuration files anyway; following these files as a guide may be considerably
easier than using a GUI tool which itself can cause problems created by whoever wrote the GUI
tool, rather than the service itself (in this case BIND).

Proxy Servers

A Proxy server acts as a go-between to the outside world, answering client requests for data from
the Internet, calling the DNS system to obtain IP addresses based on domain names, opening
connections to the Internet perhaps via yet another Proxy server elsewhere (the Ve24 system uses
Pipex as the next link in the communications chain), and retrieving data from remote hosts for
transmission back to clients.

Proxy servers are a useful way of providing Internet access to client systems at the same time as
imposing a level of security against the outside world, ie. the internal structure of a network is
hidden from the outside world due to the operational methods employed by a Proxy server, rather
like the way in which a representative at an auction can act for an anonymous client via a mobile
phone during the bidding. Although there are more than a dozen systems in Ve24, no matter
which machine a user decides to access the Internet from, the access will always appear to a
remote host to be coming from the IP address of the closest proxy server, eg. the University web
server would see Yoda as the accessing client. Similarly, I have noticed that when I access my
own web site in Holland, the site concerned sees my access as if it had come from the proxy
server at Pipex, ie. the Dutch system cannot see 'past' the Pipex Proxy server.

There are various proxy server software solutions available. A typical package which is easy to
install and configure is the Netscape Proxy Server. Yoda uses this particular system.

Network Information Service (NIS)

It is reasonably easy to ensure that all systems on a small network have consistent /etc/hosts files
using commands such as rcp. However, medium-sized networks consisting of dozens to
hundreds of machines may present problems for administrators, especially if the overall setup
consists of several distinct networks, perhaps in different buildings and run by different people.
For such environments, a Network Information Service (NIS) can be useful. NIS uses a single
system on the network to act as the sole trusted source of name service information - this system
is known as the NIS master. Slave servers may be used to which copies of the database on the
NIS master are periodically sent, providing backup services should the NIS master system fail.
Client systems locate a name server when required, requesting data based on a domain name and
other relevant information.
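
On a client configured to use NIS, commands such as the following can be used to see which
domain and server are in use and to inspect the maps being served (the maps available depend on
the particular configuration):

domainname - show the NIS domain name the client belongs to
ypwhich - show which NIS server the client is currently bound to
ypcat hosts - list the contents of the hosts map held by the server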

Unified Name Service Daemon (UNS, or more commonly NSD).

Extremely recently, the DNS and NIS systems have been superseded by a new system called the
Unified Name Service Daemon, or NSD for short. NSD handles requests for domain information
in a considerably more efficient manner, involving fewer system calls, replacing multiple files
for older services with a single file (eg. many of the DNS files in /var/named are replaced by a
single database file under NSD), and allowing for much larger numbers of entries in data files,
etc.

However, NSD is so new that even I have not yet had an opportunity to examine properly how it
works, or the way in which it correlates to the older DNS and NIS services. As a result, this
course does not describe DNS, NIS or NSD in any great detail. This is because, given the rapid
advance of modern UNIX OSs, explaining the workings of DNS or NIS would likely be a
pointless task since any admin beginning her or his career now is more likely to encounter the
newer NSD system which I am not yet comfortable with. Nevertheless, administrators should be
aware of the older style services as they may have to deal with them, especially on legacy
systems. Thus, though not discussed in these lectures, some notes on a typical DNS setup are
provided for further reading [10]. Feel free to login to the SGI server yourself with:

rlogin yoda

and examine the DNS and NIS configuration files at your leisure; these may be found in the
/var/named and /var/yp directories. Consult the online administration books for further details.

UNIX Fundamentals: UNIX Software Features

Software found on UNIX systems can be classified into several types:

 System software: items provided by the vendor as standard.
 Commercial software: items purchased either from the same vendor which supplied the OS, or
from some other commercial 3rd-party.
 Shareware software: items either supplied with the OS, or downloaded from the Internet, or
obtained from some other source such as a cover magazine CD.
 Freeware software: items supplied in the same manner as Shareware, but using a more open
'conditions of use'.
 User software: items created by users of a system, whether that user is an admin or an ordinary
user.
System Software

Any OS for any system today is normally supplied on a set of CDs. As the amount of data for an
OS installation increases, perhaps the day is not far away when vendors will begin using DVDs
instead.

Whether or not an original copy of OS CDs can be installed on a system depends very much on
the particular vendor, OS and system concerned. Any version of IRIX can be installed on an SGI
system which supports that particular version of IRIX - this ability to install the OS whether or
not one has a legal right to use the software is simply a practice SGI has adopted over the years.
SGI could have chosen to make OS installation more difficult by requiring license codes and
other details at installation time, but instead SGI chose a different route. What is described here
applies only to SGI's IRIX OS.

SGI decided some time ago to adopt a strategy of official software and hardware management
which makes it extremely difficult to make use of 'pirated' software. The means by which this is
achieved is explained in the System Hardware section below, but the end result is a policy where
any version IRIX older than the 'current' version is free by default. Thus, since the current release
of IRIX is 6.5, one could install IRIX 6.4, 6.3, 6.2 (or any older version) on any appropriate SGI
system (eg. installing IRIX 6.2 on a 2nd-hand Indy) without having to worry about legal issues.
There's nothing to stop one physically installing 6.5 if one had the appropriate CDs (ie. the
software installation tools and CDs do not include any form of installation protection or copy
protection), but other factors might make for trouble later on if the user concerned did not apply
for a license at a later date, eg. attempting to purchase commercial software and licenses for the
latest OS release.

It is highly likely that in future years, UNIX vendors will also make their current OSs completely
free, probably as a means of combating WindowsNT and other rivals.

As an educational site operating under an educational license agreement, UCLAN's Computing
Department is entitled to install IRIX 6.5 on any of the SGI systems owned by the Computing
Department, though at present most systems use the older IRIX 6.2 release for reasons connected
with system resources on each machine (RAM, disk space, CPU power).

Thus, the idea of a license can have two meanings for SGIs:

 A theoretical 'legal' license requirement which applies, for example, to the current release of
IRIX, namely IRIX 6.5 - this is a legal matter and doesn't physically affect the use of IRIX 6.5 OS
CDs.
 A real license requirement for particular items of software using license codes, obtainable either
from SGI or from whatever 3rd-party the software in question was purchased.

Another example of the first type is the GNU licensing system, explained in the 'Freeware Software'
section below (what the GNU license is and how it works is fascinatingly unique).
Due to a very early top-down approach to managing system software, IRIX employs a high-level
software installation structure which ensures that:

 It is extremely easy to add, remove, or update software, especially using the GUI software tool
called Software Manager (swmgr is the text command name which can be entered in a shell).
 Changes to system software are handled correctly with very few, if any, errors most of the time;
'most' could be defined as 'rarely, if ever, but not never'. A real world example might be to state
that I have installed SGI software elements thousands of times and rarely if ever encountered
problems, though I have had to deal with some issues on occasion.
 Software 'patches' (modificational updates to existing software already installed) are handled in
such a way as to allow the later removal of said patches if desired, leaving the system in exactly
its original state as if the patch had never been installed.

As an example of software installation reliability, my own 2nd-hand Indigo2 at home has been in use
since March 1998, was originally installed with IRIX 6.2, updated with patches several times, added to
with extra software over the first few months of ownership (mid-1998), then upgraded to IRIX 6.5,
added to with large amounts of freeware software, then upgraded to IRIX 6.5.1, then 6.5.2, then 6.5.3,
and all without a single software installation error of any kind. In fact, my Indigo2 hasn't crashed or
given a single error since I first purchased it. As is typical of any UNIX system which is/was widely used in
various industries, most if not all of the problems ever encountered on the Indigo2 system have been
resolved by now, producing an incredibly stable platform. In general, the newer the system and/or the
newer the software, then the greater number of problems there will be to deal with, at least initially.

Thankfully, OS revisions largely build upon existing code and knowledge. Plus, since so many
UNIX vendors have military, government and other important customers, there is incredible
pressure to be very careful when planning changes to system or application software. Intensive
testing is done before any new version is released into the marketplace (this contrasts completely
with Microsoft which deliberately allows the public to test Beta versions of its OS revisions as a
means of locating bugs before final release - a very lazy way to handle system testing by any
measure).

Because patches often deal with release versions of software subsystems, and many software
subsystems may have dependencies on other subsystems, the issue of patch installation is the
most common area which can cause problems, usually due to unforseen conflicts between
individual versions of specific files. However, rigorous testing and a top-down approach to
tracking release versions minimises such problems, especially since all UNIX systems come
supplied with source code version/revision tracking tools as-standard, eg. SCCS. The latest
'patch CD' can usually be installed automatically without causing any problems, though it is wise
for an administrator to check what changes are going to be made before commencing any such
installation, just in case.
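
For example, a quick way to preview what a patch CD would change might be an interactive inst
session along the following lines (a sketch from memory - the distribution path shown is an
assumption, and the exact subcommands should be checked against the inst man page):

% inst -f /CDROM/dist
Inst> list
Inst> quit

Here, 'list' shows which subsystems would be installed, upgraded or removed; only once the admin
is satisfied with the selections would the installation itself be started (with 'go').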

The key to such a high-level software management system is the concept of a software
'subsystem'. SGI has developed a standard means by which a software suite and related files
(manual pages, release notes, data, help documents, etc.) are packaged together in a form suitable
for installation by the usual software installation tools such as inst and swmgr. Once this
mechanism was carefully defined many years ago, insisting that all subsequent official software
releases comply with the same standard ensures that the opportunity for error is greatly
minimised, if not eliminated. Sometimes, certain 3rd-party applications such as Netscape can
display apparent errors upon installation or update, but these errors are usually explained in
accompanying documentation and can safely be ignored.

Each software subsystem is usually split into several sub-units so that only relevant components
need be installed as desired. The sub-units can then be examined to see the individual files which
would be installed, and where. When making updates to software subsystems, selecting a newer
version of a subsystem automatically selects only the relevant sub-units based on which sub-
units have already been installed, ie. new items will not automatically be selected. For ease of
use, an admin can always choose to execute an automatic installation or removal (as desired),
though I often select a custom installation just so that I can see what's going on and learn more
about the system as a result. In practice, I rarely need to alter the default behaviour anyway.
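
To get a feel for the naming scheme on an installed system, the list of installed software can be
examined at any time (this sketch assumes the standard IRIX 'versions' listing tool; the grep
filter shown is just an example):

% versions | more
% versions | grep -i netscape

The first command pages through every installed product and subsystem; the second picks out only
those entries relating to Netscape.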

The software installation tools automatically take care not to overwrite existing configuration
files when, for example, installing new versions (ie. upgrades) of software subsystems which
have already been installed (eg. Netscape). In such cases, both the old and new configuration
files are kept and the user (or admin) informed that there may be a need to decide which of the
two files to keep, or perhaps to copy key data from the old file to the new file, deleting the old
file afterwards.

Commercial Software

A 3rd-party commercial software package may or may not come supplied in a form which
complies with any standards normally used by the hardware system vendor. UNIX has a long
history of providing a generic means of packaging software and files in an archive which can be
downloaded, uncompressed, dearchived, compiled and installed automatically, namely the
'tar.gz' archive format (see the man pages for tar and gzip). Many commercial software suppliers
may decide to sell software in this format. This is acceptable, but it does mean one may not be able to
use the usual software management tools (inst/swmgr in the case of SGIs) to later remove the
software if desired. One would have to rely on the supplier being kind enough to either provide a
script which can be used to remove the software, or at the very least a list of which files get
installed where.
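
For reference, the typical sequence for dealing with such an archive usually looks something like
the following (the package name here is hypothetical, and not every package supplies a 'configure'
script - some just provide a Makefile or their own install script):

% gzip -d package-1.0.tar.gz
% tar xvf package-1.0.tar
% cd package-1.0
% ./configure
% make
% make install

It is the final 'make install' step which copies files into place outside the control of inst/swmgr,
which is precisely why such software cannot later be removed with the usual management tools.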

Thankfully, it is likely that most 3rd-parties will at least try to use the appropriate distribution
format for a particular vendor's OS. However, unlike the source vendor, one cannot be sure that
the 3rd-party has taken the same degree of care and attention to ensure they have used the
distribution format correctly, eg. checking for conflicts with other software subsystems,
providing product release notes, etc.

Commercial software for SGIs may or may not use the particular hardware feature of SGIs
which SGI uses to prevent piracy, perhaps because the mechanism is itself probably a licensed
product from SGI. Details of this mechanism are given in the System Hardware section below.

Shareware Software

The concept of shareware is simple: release a product containing many useful features, but with
the more advanced (and perhaps essential) features limited, restricted, or locked out entirely,
eg. the ability to save files, or to work on files over a particular size.

A user can download the shareware version of the software for free. They can test out the
software and, if they like it, 'register' the software in order to obtain either the 'full' (ie. complete)
version, or some kind of encrypted key or license code that will unlock the remaining features
not accessible or present in the shareware version. Registration usually involves sending a small
fee, eg. $30, to the author or company which created the software. Commonly, registration
results in the author(s) sending the user proper printed and bound documentation, plus regular
updates to the registered version, news releases on new features, access to dedicated mailing
lists, etc.

The concept of shareware has changed over the years, partly due to the influence of the computer
game 'Doom' which, although released as shareware in name, actually effectively gave away an
entire third of the complete game for free. This was a ground-breaking move which proved to be
an enormous success, earning the company which made the game (id Software, Dallas, Texas,
USA) over eight million $US and a great deal of respect and loyalty from gaming fans. Never
before had a company released shareware software in a form which did not involve deliberately
'restricting' key aspects of the shareware version. As stated above, shareware software is often
altered so that, for example, one could load files, work on them, make changes, test out a range
of features, but (crucially) not save the results. Such shareware software is effectively not of any
practical use on its own, ie. it serves only as a kind of hands-on advertisement for the full
version. Doom was not like this at all. One could play an entire third of the game, including over
a network against other players.

Today, other creative software designers have adopted a similar approach, perhaps the most
famous recent example of which is 'Blender' [1], a free 3D rendering and animation program for
UNIX and (as of very soon) WindowsNT systems.

In its as-supplied form, Blender can be used to do a great deal of work, creating 3D scenes,
renderings and animations easily on a par with 3D Studio Max, even though some features in
Blender are indeed locked out in the shareware version. However, unlike traditional shareware,
Blender does allow one to save files and so can be used for useful work. It has spread
very rapidly in the last few months amongst students in educational sites worldwide, proving to
be of particular interest to artists and animators who almost certainly could not normally afford a
commercial package which might cost hundreds or perhaps thousands of pounds. Even small
companies have begun using Blender.

However, supplied documentation for Blender is limited. As a 'professional level' system, it is
unrealistic to expect to be able to get the best out of it without much more information on how it
works and how to use it. Thus, the creators of Blender, a company called NaN based in Holland,
make most of their revenue by offering a very detailed 350-page printed and bound manual for
about $50 US, plus a sequence of software keys which make available the advanced features in
Blender.

Software distribution concepts such as the above methods used by NaN didn't exist just a few
years ago, eg. before 1990. The rise of the Internet, certain games such as Doom, the birth of
Linux, and changes in the way various UNIX vendors manage their business have caused a
quantum leap in what people think of as shareware.

Note that the same caveat stated earlier with respect to software quality also applies to
shareware, and to freeware too, ie. such software may or may not use the normal distribution
method associated with a particular UNIX platform - in the case of SGIs, the 'inst' format.

Another famous example of shareware is the XV [2] image-viewer program, which offers a
variety of functions for image editing and image processing (even though its author insists it's
really just an image viewer). XV does not have restricted features, but it is an official shareware
product which one is supposed to register if one intends to use the program for commercial
purposes. However, as is typical with many modern shareware programs, the author stipulates
that there is no charge for personal (non-commercial) or educational use.

Freeware Software

Unlike shareware software, freeware software is exactly that: completely free. There is no
concept of registration, restricted features, etc. at all.

Until recently, even I was not aware of the vast amount of free software available for SGIs and
UNIX systems in general. There has always been free software for UNIX systems but, in keeping
with other changes by UNIX vendors over the past few years, SGI altered its application
development support policy in 1997 to make it much easier for users to make use of freeware on
SGI systems. Prior to that time, SGI did not make the system 'header' files (normally kept in
/usr/include) publicly available. Without these header files, one could not compile any new
programs even if one had a free compiler.

So, SGI adopted a new stance whereby the header files, libraries, example source code and other
resources are provided free, but its own advanced compiler technologies (the MIPS Pro
Compilers) remain commercial products. Immediately, anyone could then write their own
applications for SGI systems using the supplied CDs (copies of which are available from SGI's
ftp site) in conjunction with free compilation tools such as the GNU compilers. As a result, the
2nd-hand market for SGI systems in the USA has skyrocketed, with extremely good systems
available at very low cost (systems which cost 37500 pounds new can now be bought for as little
as 500 pounds, even though they can still be better than modern PCs in many respects).
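
As a trivial sketch of what this change made possible (assuming the GNU C compiler, gcc, has been
installed from the freeware distribution), one can now create and build a program using nothing
but free tools:

% cat > hello.c
#include <stdio.h>

int main(void)
{
    printf("Built entirely with free tools\n");
    return 0;
}
[press CTRL-D]
% gcc -o hello hello.c
% ./hello
Built entirely with free tools

The compilation works because gcc can find the system headers (here, stdio.h) under /usr/include -
exactly the files which were not freely available before the change in policy.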

It is highly likely that other vendors have adopted similar strategies in recent years (most of my
knowledge concerns SGIs). Sun Microsystems made its SunOS free for students some years ago
(perhaps Solaris too); my guess is that a similar compiler/development situation applies to
systems using SunOS and Solaris as well - one can write applications using free software and
tools. This concept probably also applies to HP systems, Digital UNIX systems, and other
flavours of UNIX.

Linux is a perfect example of how the ideas of freeware development can determine an OS'
future direction. Linux was meant to be a free OS from its very inception - Linus Torvalds, its
creator, loathes the idea of an OS supplier charging for the very platform upon which essential
software is executed. Although Linux is receiving considerable industry support these days,
Linus is wary of the possibility of Linux becoming more commercial, especially as vendors such
as Red Hat and Caldera offer versions of Linux with added features which must be paid for.
Whether or not the Linux development community can counter these commercial pressures in
order to retain some degree of freeware status and control remains to be seen.

Note: I'm not sure of the degree to which completely free development environments on a
quality-par with GNU are available for MS Windows-based systems (whether that involves
Win95, Win98, WinNT or even older versions such as Win3.1).

The GNU Licensing System

The GNU system is, without doubt, thoroughly unique in the modern era of copyright,
trademarks, law suits and court battles. It can be easily summarised as a vast collection of free
software tools, but the detail reveals a much deeper philosophy of software development, best
explained by the following extract from the main GNU license file that accompanies any GNU-
based program [3]:

"The licenses for most software are designed to take away your freedom to share and
change it. By contrast, the GNU General Public License is intended to guarantee your
freedom to share and change free software--to make sure the software is free for all its
users. This General Public License applies to most of the Free Software Foundation's
software and to any other program whose authors commit to using it. (Some other Free
Software Foundation software is covered by the GNU Library General Public License
instead.) You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price. Our General Public Licenses
are designed to make sure that you have the freedom to distribute copies of free software (and charge
for this service if you wish), that you receive source code or can get it if you want it, that you can
change the software or use pieces of it in new free programs; and that you know you can do these
things.

To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to
ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

For example, if you distribute copies of such a program, whether gratis or for a fee, you must give
the recipients all the rights that you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their rights.
We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which
gives you legal permission to copy, distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain that everyone understands that
there is no warranty for this free software. If the software is modified by someone else and passed on,
we want its recipients to know that what they have is not the original, so that any problems
introduced by others will not reflect on the original authors' reputations.

Finally, any free program is threatened constantly by software patents. We wish to avoid the danger
that redistributors of a free program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any patent must be licensed for
everyone's free use or not licensed at all."

Reading the above extract, it is clear that those responsible for the GNU licensing system had to
spend a considerable amount of time actually working out how to make something free! Free in a
legal sense, that is. Since so many standard legal matters are designed to restrict activities, the
work put in by the GNU Free Software Foundation makes the license document read like some kind
of software engineer's nirvana. It's a serious issue though, and the existence of GNU is very
important in terms of the unimaginable amount of creative work going on around the world
which would not otherwise exist (without GNU, Linux would probably not exist).

SGI, and other UNIX vendors I expect, ships its latest OS (IRIX 6.5) with a CD entitled
'Freeware', which not only contains a vast number of freeware programs in general (everything
from spreadsheets and data plotting to games, audio/midi programming and molecular
modeling), but also a complete, pre-compiled inst-format distribution of the entire GNU archive:
compilers, debugging tools, GNU versions of shells and associated utilities, calculators,
enhanced versions of UNIX commands and tools, even higher-level tools such as a GUI-based
file manager and shell tool, and an absolutely superb Photoshop-style image editing tool called
GIMP [4] (GNU Image Manipulation Program) which is extendable by the user. The individual
software subsystems from the Freeware CD can also be downloaded in precompiled form from
SGI's web site [5].

The February 1999 edition of SGI's Freeware CD contains 173 different software subsystems, 29
of which are based on the GNU licensing system (many others are likely available from
elsewhere on the Internet, along with further freeware items). A printed copy of the contents of
the Feb99 Freeware CD is included with the course notes for further reading.

Other important freeware programs which are supplied separately from such freeware CD
distributions (an author may wish to distribute just from a web site), include the Blue Moon
Rendering Tools (BMRT) [6], a suite of advanced 3D ray-tracing and radiosity tools written by
one of the chief architects at Pixar animation studios - the company which created "Toy Story",
"Small Soldiers" and "A Bug's Life". Blender can output files in Inventor format, which can then
be converted to RIB format for use by BMRT.

So why are shareware and freeware important? Well, these types of software matter because,
today, it is perfectly possible for a business to operate using only shareware and/or freeware
software. An increasingly common situation one comes across is an entrepreneurial multimedia
firm using Blender, XV, GIMP, BMRT and various GNU tools to manage its entire business,
often running on 2nd-hand equipment using free versions of UNIX such as Linux, SunOS or
IRIX 6.2! I know of one such company in the USA which uses decade-old 8-CPU SGI servers
and old SGI workstations such as Crimson RealityEngine and IRIS Indigo. The hardware was
acquired 2nd-hand in less than a year.

Whether or not a company decides to use shareware or freeware software depends on many
factors, especially the degree to which a company feels it must have proper, official support.
Some sectors such as government, medical and military have no choice: they must have proper,
fully guaranteeable hardware and software support because of the nature of the work they do, so
using shareware or freeware software is almost certainly out of the question. However, for
medium-sized or smaller companies, and especially home users or students, the existence of
shareware and freeware software, combined with the modern approaches to these forms of
software by today's UNIX vendors, offers whole new avenues of application development and
business ideas which have never existed before as commercially viable possibilities.

System Hardware

The hardware platforms supplied by the various UNIX vendors are, like UNIX itself today, also
designed and managed with a top-down approach.

The world of PCs has always been a bottom-up process of putting together a mish-mash of
different components from a wide variety of sources. Motherboards, video cards, graphics cards
and other components are available in a plethora of types of varying degrees of quality. This
bottom-up approach to systems design means it's perfectly possible to have a PC with a good
CPU, good graphics card, good video card, but an awful motherboard. If the hardware is suspect,
problems faced by the user may appear to be OS-related when in fact they could be down to poor
quality hardware. It's often difficult or impossible to ascertain the real cause of a problem -
sometimes system components just don't work even though they should, or a system suddenly
stops recognising the presence of a device; these problems are most common with peripherals
such as CDROM, DVD, ZIP, sound cards, etc.

Dealing only with hardware systems designed specifically to run a particular vendor's UNIX
variant, the situation is very different. The vendor maintains a high degree of control over the
design of the hardware platform. Hence, there is opportunity to focus on the unique requirements
of target markets, quality, reliability, etc. rather than always focusing on absolute minimum cost
which inevitably means cutting corners and making tradeoffs.

This is one reason why even very old UNIX systems, eg. multi-processor systems from 1991
with (say) eight 33MHz CPUs, are still often found in perfect working order. The initial focus on
quality results in a much lower risk of component failure. Combined with generous hardware and
software support policies, hardware platforms for traditional UNIX systems are far more reliable
than PCs.

My personal experience is with hardware systems designed by SGI, about which I know a great
deal. Their philosophy of design is typical of most UNIX hardware vendors (others would be
Sun, HP, IBM, DEC, etc.) and can be contrasted very easily with the way PCs are designed and
constructed:

UNIX low-end:   "What can we give the customer for 5000?"
     mid-range: "What can we give the customer for 15000?"
     high-end:  "What can we give the customer for 65000+?"

PC:             "How cheap can we make a machine which offers a
                 particular feature set and level of ability?"

Since the real driving force behind PC development is the home market, especially games, the
philosophy has always been to decide what features a typical 'home' or 'office' PC ought to have
and then try and design the cheapest possible system to offer those features. This approach has
eventually led to incredibly cut-throat competition, creating new concepts such as the 'sub-$1000'
PC, and even today's distinctly dubious 'free PC', but in reality the price paid by consumers is the
use of poor quality components which do not integrate well, especially components from
different suppliers. Hardware problems in PCs are common, and now unavoidable. In Edinburgh,
I know of a high-street PC store which always has a long queue of customers waiting to have
their particular problem dealt with.

By contrast, most traditional UNIX vendors design their own systems with a top-down approach
which focuses on quality. Since the vendor usually has complete control, they can ensure a much
greater coherence of design and degree of integration. System components work well with each
other because all parts of the system were designed with all the other parts in mind.

Another important factor is that a top-down approach allows vendors to innovate and develop
new architectural designs, creating fundamentally new hardware techniques such as SMP and
S2MP processing, highly scalable systems, advanced graphics architectures, and perhaps most
importantly of all from a customer's point of view: much more advanced CPU designs (Alpha,
MIPS, SPARC, PA-RISC, POWER series, etc.) Such innovations and changes in design concept
are impossible in the mainstream PC market: there is too much to lose by shifting from the
status-quo. Everything follows the lowest common denominator.

The most obvious indication of these two different approaches is that UNIX hardware platforms
have always been more expensive than PCs, but that is something which should be expected
given that most UNIX platforms are deliberately designed to offer a much greater feature set,
better quality components, better integration, etc.

A good example is the SGI Indy. With respect to absolute cost, the Indy was very expensive
when it was first released in 1993, but because of what it offered in terms of hardware and
software features it was actually a very cheap system compared to trying to put together a PC
with a similar feature set. In fact, Indy offered features such as hardware-accelerated 3D graphics
at high resolution (1280x1024) and 24bit colour at a time when such features did not exist at all
for PCs.

PCW magazine said in its original review [7] that to give a PC the same standard features and
abilities, such as ISDN, 4-channel 16bit stereo sound with multiple stereo I/O sockets, S-
Video/Composite/Digital video inputs, NTSC-resolution CCD digital camera, integrated SCSI,
etc. would have cost twice as much as an Indy. SGI set out to design a system which would
include all these features as-standard, so the end result was bound to cost several thousand
pounds, but that was still half the cost of trying to cobble together a collection of mis-matched
components from a dozen different companies to produce something which still would not have
been anywhere near as good. As PCW put it, the Indy - for its time - was a great machine
offering superb value if one was the kind of customer which needed its features and would be
able to make good use of them.

Sun Microsystems adopted a similar approach to its recent Ultra5, Ultra10 and other systems:
provide the user with an integrated design with a specific feature set that Sun knew its customers
wanted. SGI did it again with their O2 system, released in October 1996. O2 has such a vast
range of features (highly advanced for its time) that few ordinary customers would find
themselves using most or all of them. However, for the intended target markets (ranging from
CAD, design, animation, film/video special effects, video editing to medical imaging, etc.) the
O2 was an excellent system. Like most UNIX hardware systems, O2 today is not competitive in
certain areas such as basic 3D graphics performance (there are exceptions to this), but certain
advanced and unique architectural features mean it's still purchased by customers who require
those features.

This, then, is the key: UNIX hardware platforms which offer a great many features and high-
quality components are only a good choice if one:

 is the kind of customer which definitely needs those features
 values the ramifications of using a better quality system that has been designed top-down:
reliability, quality, long-term value, ease of maintenance, etc.

One often observes people used to PCs asking why systems like O2, HP's Visualize series, SGI's Octane,
Sun's Ultra60, etc. cost so much compared to PCs. The reason for the confusion is that the
world of PCs focuses heavily on the abilities of the main CPU, whereas all UNIX vendors have, for many
years, made systems which include as much dedicated acceleration hardware as possible, easing the
burden on the main CPU. For the home market, systems like the Amiga pioneered this approach;
unfortunately, the company responsible for the Amiga doomed itself to failure as a result of various
marketing blunders.

From an admin's point of view, the practical side effect of having to administer and run a UNIX
hardware platform is that there is far, far less effort needed in terms of configuring systems at the
hardware level, or having to worry about different system hardware components operating
correctly with one another. Combined with the way most UNIX variants deal with hardware
devices (ie. automatically and transparently most of the time), a UNIX admin can swap hardware
components between different systems from the same vendor without any need to alter system
software, ie. any changes in system hardware configuration are dealt with automatically.

Further, many UNIX vendors use certain system components that are identical (usually memory,
disks and backup devices), so admins can often swap generic items such as disks between
different vendor platforms without having to reconfigure those components (in the case of disks)
or worry about damaging either system. SCSI disks are a good example: they are supplied
preformatted, so an admin should never have to reformat a SCSI disk. Swapping a SCSI disk
between different vendor platforms may require repartitioning of the disk, but never a reformat.
In the 6 years I've been using SGIs, I've never had to format a SCSI disk.

Examining a typical UNIX hardware system such as Indy, one notices several very obvious
differences compared to PCs:

 There are far fewer cables in view.
 Components are positioned in such a way as to greatly ease access to all parts of the system.
 The overall design is highly integrated so that system maintenance and repairs/replacements
are much easier to carry out.

Thus, problems that are solvable by the admin can be dealt with quickly, while problems requiring
vendor hardware support assistance can be fixed in a short space of time by a visiting technician, which
obviously reduces costs for the vendor responsible by enabling their engineers to deal with a larger
number of queries in the same amount of time.

Just as with the approaches taken to hardware and software design, the way in which support
contracts for UNIX systems operate also follow a top-down approach. Support costs can be high,
but the ethos is similar: you get what you pay for - fast no-nonsense support when it's needed.

I can only speak from experience of dealing with SGIs, but I'm sure the same is true of other
UNIX vendors. Essentially, if I encounter a hardware problem of some kind, the support service
always errs on the side of caution in dealing with the problem, ie. I don't have to jump through
hoops in order to convince them that there is a problem - they accept what I say and organise a
visiting technician to help straight away (one can usually choose between a range of response
times from 1 hour to 5 days). Typically, unless the technician can fix the problem on-site in a
matter of minutes, then some, most, or even all of the system components will be replaced if
necessary to get the system in working order once more.

For example, when I was once encountering SCSI bus errors, the visiting engineer was almost at
the point of replacing the motherboard, video card and even the main CPU (several thousand
pounds worth of hardware in terms of new-component replacement value at the time) before
some extra further tests revealed that it was in fact my own personal disk which was causing the
problem (I had an important jumper clip missing from the jumper block). In other words, UNIX
vendor hardware support contracts tend to place much less emphasis on the customer having to
prove they have a genuine problem.

I should imagine this approach exists because many UNIX vendors have to deal with extremely
important clients such as government, military, medical, industrial and other sectors (eg. safety
critical systems). These are customers with big budgets who don't want to waste time messing
around with details while their faulty system is losing them money - they expect the vendor to
help them get their system working again as soon as possible.
Note: assuming a component is replaced (eg. motherboard), even if the vendor's later tests show
the component to be working correctly, it is not returned to the customer, ie. the customer keeps
the new component. Instead, most vendors have their own dedicated testing laboratories which
pull apart every faulty component returned to them, looking for causes of problems so that the
vendor can take corrective action if necessary at the production stage, and learn any lessons to
aid in future designs.

To summarise the above:

 A top-down approach to hardware design means a better feature set, better quality, reliability,
ease of use and maintenance, etc.
 As a result, UNIX hardware systems can be costly. One should only purchase such a system if
one can make good use of the supplied features, and if one values the implications of better
quality, etc., despite the extra cost.

However, a blurred middle-ground between the top-down approach to UNIX hardware platforms and
the bottom-up approach to the supply of PCs is the so-called 'vendor-badged' NT workstation market. In
general, this is where UNIX vendors create PC-style hardware systems that are still based on off-the-
shelf components, but occasionally include certain modifications to improve performance, etc. beyond
what one normally sees of a typical PC. The most common example is where vendors such as Compaq
supply systems which have two 64bit PCI busses to increase available system bandwidth.

All these systems are targeted at the 'NT workstation' market. Cynics say that such systems are
just a clever means of placing a 'quality' brand name on ordinary PC hardware. However, such
systems do tend to offer a better level of quality and integration than ordinary PCs (even
expensive ordinary PCs), but an inevitable ironic side effect is that these vendor-badged systems
do cost more. Just as with traditional UNIX hardware systems, whether or not that cost is worth
it depends on customers' priorities. Companies such as movie studios regard stability and
reliability as absolutely critical, which is why most studios do not use NT [8]. Those that do,
especially smaller studios (perhaps because of limited budgets) will always go for vendor-badged
NT workstations rather than purchasing systems from PC magazines and attempting to cobble
together a reliable platform. The extra cost is worth it.

There is an important caveat to the UNIX hardware design approach: purchasing what can be a
very good UNIX hardware system is a step that can easily be ruined by not equipping that
system in the first instance with sufficient essential system resources such as memory capacity,
disk space, CPU power and (if relevant) graphics/image/video processing power. Sometimes,
situations like this occur because of budget constraints, but the end result may be a system which
cannot handle the tasks for which it was purchased. If such mis-matched purchases are made, it's
usually a good sign that the company concerned is using a bottom-up approach to making
decisions about whether or not to buy a hardware platform that has been built using a top-down
approach. The irony is plain to see. Since admins often have to advise on hardware purchases or
upgrades, a familiarity with these issues is essential.

Conclusion: decide what is needed to solve the problem. Evaluate which systems offer
appropriate solutions. If no suitable system is affordable, do not compromise on essentials such as
memory or disk as a means of lowering cost - choose a different platform instead, such as a good
quality NT system, or a system with lower costs such as an Intel machine running Linux, etc.

Similarly, it makes no sense to have a good quality UNIX system, only to then adopt a strategy
of buying future peripherals (eg. extra disks, memory, printers, etc.) that are of poor quality. In
fact, some UNIX vendors may not offer or permit hardware support contracts unless the
customer sticks to using approved 3rd-party hardware sources.

Summary: UNIX hardware platforms are designed top-down, offer better quality components,
etc., but tend to be more expensive as a result.

Today, in an era when even SGI has started to sell systems that support WindowsNT, the
philosophy is still the same: design top-down to give quality hardware, etc. Thus, SGI's
WindowsNT systems start at around 2500 pounds - a lot by the standards of any home user, but
cheap when considering the market in general. The same caveat applies though: such a system
with a slow CPU is wasting the capabilities of the machine.

UNIX Characteristics.

Integration:

A top-down approach results in an integrated design. Systems tend to be supplied 'complete', ie.
everything one requires is usually supplied as-standard. Components work well together since
the designers are familiar with all aspects of the system.

Stability and Reliability:

The use of quality components, driven by the demands of the markets which most UNIX vendors
aim for, results in systems that experience far fewer component failures compared to PCs. As a
result of a top-down and integrated approach, the chances of a system experiencing hardware-
level conflicts are much lower compared to PCs.

Security:

It is easy for system designers to incorporate hardware security features such as metal hoops that
are part of the main moulded chassis, for attaching to security cables.

On the software side, and as an aid to preventing crime (as well as making it easier to solve
crime in terms of tracing components, etc.) systems such as SGIs often incorporate unique
hardware features. The following applies to SGIs but is also probably true of hardware from
other UNIX vendors in some equivalent form.
Every SGI has a PROM chip on the motherboard, without which the system will not boot. This
PROM chip is responsible for initiating the system bootup sequence at the very lowest hardware
level. However, the chip also contains an ID number which is unique to that particular machine.
One can display this ID number with the following command:

sysinfo -s

Alternatively, the number can be displayed in hexadecimal format by using the sysinfo command on
its own (one notes the first 4 groups of two hex digits). A typical output might look like this:

% sysinfo -s
1762299020
% sysinfo
System ID:
69 0a 8c 8c 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

The important part of the output from the second command is the beginning sequence consisting
of '690A8C8C'.

The ID number is not only used by SGI when dealing with system hardware and software
support contracts, it is also the means by which license codes are supplied for SGI's commercial
software packages.

If one wishes to use a particular commercial package, eg. the VRML editor called
CosmoWorlds, SGI uses the ID number of the machine to create a license code which will be
recognised by the program concerned as being valid only for that particular machine. The 20-
digit hexadecimal license code is created using a special form of encryption, presumably
combining the ID number with some kind of internal database of codes for SGI's various
applications which only SGI has access to. In the case of the O2 I use at home, the license code
for CosmoWorlds is 4CD4FB82A67B0CEB26B7 (ie. different software packages on the same
system need different license codes). This code will not work for any other software package on
any other SGI anywhere in the world.

There are two different license management systems in use by SGIs: the NetLS environment on
older platforms, and the FlexLM environment on newer platforms. FlexLM is being widely
adopted by many UNIX vendors. NetLS licenses are stored in the /var/netls directory, while
FlexLM licenses are kept in /var/flexlm. To the best of my knowledge, SGI's latest version of
IRIX (6.5) doesn't use NetLS licenses anymore, though it's possible that 3rd-party software
suppliers still do.
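
A quick way to see which licensing scheme(s) a particular system is actually using is simply to
look in the two directories (the file names found there vary from system to system, so none are
shown here):

% ls -l /var/flexlm
% ls -l /var/netls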

As stated in the software section, the use of the ID number system at the hardware level means it
is impossible to pirate commercial software. More accurately, anyone can copy any SGI software
CD, and indeed install the software, but that software will not run without the license code which
is unique to each system, so there's no point in copying commercial software CDs or installing
copied commercial software in the first place.
Of course, one could always try to reverse-engineer the object code of a commercial package to
try and get round the section which makes the application require the correct license code, but
this would be very difficult. The important point is that, to the best of my knowledge, SGI's
license code schema has never been broken at the hardware level.

Note: from the point of view of an admin maintaining an SGI system, if a machine completely
fails, eg. damage by fire and water, the admin should always retain the PROM chip if possible -
ie. a completely new system could be obtained but only the installation of the original PROM
chip will make the new system effectively the same as the old one. For PCs, the most important
system component in terms of system identity is the system disk (more accurately, its contents);
but for machines such as SGIs, the PROM chip is just as if not more important than the contents
of the system disk when it comes to a system having a unique identity.

Scalability:

Because a top-down hardware design approach has been used by all UNIX hardware vendors
over the years, most UNIX vendors offer hardware solutions that scale to a large number of
processors. Sun, IBM, SGI, HP and other vendors all offer systems that scale to 64 CPUs.
Currently, one cannot obtain a reliable PC/NT platform that scales to even 8 CPUs (Intel won't
begin shipping 8-way chip sets until Q3 1999).

Along with the basic support for a larger number of processors, UNIX vendors have spent a great
deal of time researching advanced ways of properly supporting many CPUs. There are complex
issues concerning how such systems handle shared memory, the movement of data,
communications links, efficient use of other hardware such as graphics and video subsystems,
maximised use of storage systems (eg. RAID), and so on.

The result is that most UNIX vendors offer large system solutions which can tackle extremely
complex problems. Since these systems are obviously designed to the very highest quality
standards with a top-down approach to integration, etc., they are widely used by companies and
institutions which need such systems for solving the toughest of tasks, from processing massive
databases to dealing with huge seismic data sets, large satellite images, complex medical data
and intensive numerical processing (eg. weather modeling).

One very beneficial side-effect of this kind of development is that the technology which comes
out of such high-quality designs slowly filters down to the desktop systems, enabling customers
to eventually utilise extremely advanced and powerful computing systems. A particularly good
example of this is SGI's Octane system [9] - it uses the same components and basic technology
as SGI's high-end Origin server system. As a result, the user benefits from many advanced
features, eg.

 Octane has no inherent maximum memory limit. Memory is situated on a 'node board' along
with the 1 or 2 main CPUs, rather than housed on a backplane. As CPU designs improve, so
memory capacity on the node board can be increased by using a different node board design, ie.
without changing the base system at all. For example, Octane systems using the R10000 CPU
can have up to 2GB RAM, while Octane systems using the R12000 CPU can have up to 4GB RAM.
Future CPUs (R14K, R16K, etc.) will change this limit again to 8GB, 16GB, etc.
 The speed at which all internal links operate is directly synchronised to the clock speed of the
main CPU. As a result, internal data pathways can always supply data to both main CPUs faster
than they can theoretically cope with, ie. one can get the absolute maximum performance out
of a CPU (this is fundamentally not possible with any PC design). As CPU clock speeds increase,
so does the rate at which the system can move data around internally. An Octane using 195MHz
R10000s offers three separate internal data pathways each operating at 1560MB/sec (10X faster
than a typical PCI bus). An Octane using 300MHz R12000s runs the same pathways at the faster
rate of 2400MB/sec per link. ie. system bandwidth and memory bandwidth increase to match
CPU speed.

The above is not a complete list of advanced features.

SGI's high-end servers are currently the most scalable in the world, offering up to 256 CPUs for
a commercially available system, though some sites with advance copies of future OS changes
have systems with 512 and 720 CPUs. As stated elsewhere, one system has 6144 CPUs.

The quality of design required to create technologies like this, along with software and OS
concepts that run them properly, are quite incredible. These features are passed on down to
desktop systems and eventually into consumer markets. But it means that, at any one time, mid-
range systems based on such advanced technologies can be quite expensive (Octanes generally
start at around 7000 pounds). Since much of the push behind these developments comes from
military and government clients, again there is great emphasis on quality, reliability, security,
etc. Cray Research, which is owned by SGI, holds the world record for the most stable and
reliable system: a supercomputer with 2048 CPUs which ran for 2.5 years without any of the
processors exhibiting a single system-critical error.

Sun, HP, IBM, DEC, etc. all operate similar design approaches, though SGI/Cray happens to
have the most advanced and scalable server and graphics system designs at the present time,
mainly because they have traditionally targeted high-end markets, especially US government
contracts.

The history of UNIX vendor CPU design follows a similar legacy: typical customers have
always been willing to pay 3X as much as an Intel CPU in order to gain access to 2X the
performance. Ironically, as a result, Intel have always produced the world's slowest CPUs, even
though they are the cheapest. CPUs at much lower clock speeds from other vendors (HP, IBM,
Sun, SGI, etc.) can easily be 2X to 5X faster than Intel's current best. As stated above though,
these CPUs are much more expensive - even so, it's an extra cost which the relevant clients say
they will always bear in order to obtain the fastest available performance. The exception today is
the NT workstation market where systems from UNIX vendors utilise Intel CPUs and
WindowsNT (and/or Linux), offering a means of gaining access to better quality graphics and
video hardware while sacrificing the use of more powerful CPUs and the more sophisticated
UNIX OSs, resulting in lower cost. Even so, typical high-end NT systems still cost around 3000
to 15000 pounds.
So far, no UNIX vendor makes any product that is targeted at the home market, though some
vendors create technologies that are used in the mass consumer market (eg. the R3000 CPU
which runs the Sony PlayStation is designed by SGI and was used in their older workstations in
the late 1980s and early 1990s; all of the Nintendo64's custom processors were designed by
SGI). In terms of computer systems, it is unlikely this situation will ever change because to do so
would mean a vendor would have to adopt a bottom-up design approach in order to minimise
cost above all else - such a change wouldn't be acceptable to customers and would contradict the
way in which the high-end systems are developed. Vendors which do have a presence in the
consumer market normally use subsidiaries as a means of avoiding internal conflicts in design
ethos, eg. SGI's MIPS subsidiary (soon to be sold off).

References:

1. Blender Animation and Rendering Program:
   http://www.blender.nl/

2. XV Image Viewer:
   http://www.trilon.com/xv/xv.html

3. Extract taken from GNU GENERAL PUBLIC LICENSE, Version 2, June 1991, Copyright (C) 1989,
   1991 Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

4. GIMP (GNU Image Manipulation Program):
   http://www.gimp.org/

5. SGI Freeware Sites (identical):
   http://freeware.sgi.com/
   http://toolbox.sgi.com/TasteOfDT/public/freeware/

6. Pixar's Blue Moon Rendering Tools (BMRT):
   http://www.bmrt.org/

7. Silicon Graphics Indy, PCW, September 1993:
   http://www.futuretech.vuurwerk.nl/pcw9-93indy.html

8. "LA conferential", CGI Magazine, Vol4, Issue 1, Jan/Feb 1999, pp. 21, by Richard Spohrer.

   Interview from the 'Digital Content and Creation' conference and exhibition:

   '"No major production facilities rely on commercial software, everyone has to customise
   applications in order to get the most out of them," said Hughes. "We run Unix on SGI as we need
   a stable environment which allows fast networking. NT is not a professional solution and was
   never designed to handle high-end network environments," he added. "Windows NT is the
   antithesis of what the entertainment industry needs. If we were to move from Irix, we would
   use Linux over NT."'

   - John Hughes, president/CEO of Rhythm & Hues, and Scott Squires, visual effects
   supervisor at ILM and ceo of Puffin Design.

9. Octane Information Index:
   http://www.futuretech.vuurwerk.nl/octane/

10. "How to set up the BIND domain name server", Network Week, Vol4 No. 29, 14th April 1999,
    pp. 17, by David Cartwright.

11. A letter from a reader in response to [10]:

    "Out of a BIND", Network Week, Vol4 No. 31, 28th April 1999, pp. 6:

    "A couple of weeks ago, I had a problem. I was attempting to configure NT4's DNS
    Server for use on a completely private network, but it just wasn't working properly. The
    WindowsNT 'help' - and I use that term loosely - assumed my network was connected to
    the Internet, so the examples it gave were largely useless. Then I noticed David
    Cartwright's article about setting up DNS servers. (Network Week, 14th April). The light
    began to dawn. Even better, the article used BIND's configuration files as examples. This
    meant that I could dump NT's obtuse GUI DNS Manager application and hand-hack the
    configuration files myself. A few minor problems later (most of which were caused by
    Microsoft's example DNS config files being a bit... um... optimistic) and the DNS server
    finally lurched into life. Thank you Network Week. The more Q&A and how-to type
    information you print, the better."

    - Matthew Bell, Fluke UK.

General References:

Anonymous SGI FTP Site List: http://reality.sgi.com/billh/anonftp/
Origin2000 Information Index: http://www.futuretech.vuurwerk.nl/origin/
Onyx2 Information Index: http://www.futuretech.vuurwerk.nl/onyx2/
SGI: http://www.sgi.com/
Hewlett Packard: http://www.hp.com/
Sun Microsystems: http://www.sun.com/
IBM: http://www.ibm.com/
Compaq/Digital: http://www.digital.com/
SCO: http://www.sco.com/
Linux: http://www.linux.org/

Appendix A: Case Study.

For unknown and unchangeable reasons, UCLAN's central admin system has a DNS
setup which, incorrectly, does not recognise comp.uclan.ac.uk as a subdomain. Instead,
the central DNS lists comp as a host name, ie. comp.uclan.ac.uk is listed as a direct
reference to Yoda's external IP address, 193.61.250.34; in terms of the intended use of
the word 'comp', this is rather like referring to a house on a street by using just the street
name. As a result, the SGI network's fully qualified host names, such as
yoda.comp.uclan.ac.uk, are not recognised outside UCLAN, and neither is
comp.uclan.ac.uk since all the machines on the SGI network treat comp as a subdomain.
Thus, external users can access Yoda's IP address directly by referring to 193.61.250.34
(so ftp is possible), but they cannot access Yoda as a web server, or access individual
systems in Ve24 such as sevrin.comp.uclan.ac.uk, or send email to the SGI network.
Also, services such as USENET cannot be setup, so internal users must use web sites to
access newsgroups.
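
To illustrate the distinction in BIND zone-file terms (a sketch only - the names and IP address
are taken from the description above, since the central DNS's actual configuration files are not
available for inspection):

; what the central DNS effectively contains: 'comp' treated as a host
comp.uclan.ac.uk.       IN  A   193.61.250.34

; what a correct setup would contain: 'comp' delegated as a subdomain, with a
; glue record for the name server (Yoda) which is authoritative for it
comp.uclan.ac.uk.       IN  NS  yoda.comp.uclan.ac.uk.
yoda.comp.uclan.ac.uk.  IN  A   193.61.250.34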

This example serves as a warning: organisations should thoroughly clarify what their
individual departments' network structures are going to be, through a proper consultation
and discussion process, before allowing departments to set up internal networks.
Otherwise, confusion and disagreement can occur. In the case of the SGI network, its
internal structure is completely correct (as confirmed by SGI themselves), but the way it
is connected to the Internet is incorrect. Only the use of a Proxy server allows clients to
access the Internet, but some strange side-effects remain; for example, email can be sent
from the SGI network to anywhere on the Internet (from Yoda to Yahoo in less than 10
seconds!), but not vice-versa because incoming data is blocked by the incorrectly
configured central DNS.

Email from the SGI network can reach the outside world because of the way the email
system works: the default settings installed along with the standard Berkeley Sendmail
software (/usr/lib/sendmail) are sufficient to forward email from the SGI network to the
Internet via routers further along the communications chain, which then send the data to
JANET at Manchester, and from there to the final destination (which could include a
UCLAN student or staff member). The situation is rather like posting a letter without a
sender's address, or including an address which gives everything as far as the street name
but not the house number - the letter will be correctly delivered, but the recipient will not
be able to reply to the sender.

Detailed Notes for Day 2 (Part 2)
UNIX Fundamentals: Shell scripts.

It is an inevitable consequence of using a command interface such as shells that one would wish
to be able to run a whole sequence of commands to perform more complex tasks, or perhaps the
same task many times on multiple systems.

Shells allow one to do this by creating files containing sequences of commands. The file,
referred to as a shell script, can be executed just like any other program, though one must ensure
the execute permissions on the file are set appropriately in order for the script to be executable.

Large parts of all modern UNIX variants use shell scripts to organise system management and
behaviour. Programming in shell script can include more complicated structures such as if/then
statements, case statements, for loops, while loops, functions, etc. Combined with other features
such as metacharacters and the various text-processing utilities (perl, awk, sed, grep, etc.) one
can create extremely sophisticated shell scripts to perform practically any system administration
task, ie. one is able to write programs which can use any available application or existing
command as part of the code in the script. Since shells are based on C and the commands use a
similar syntax, shell programming effectively combines the flexibility of C-style programming
with the ability to utilise other programs and resources within the shell script code.
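
For instance, a short script combining a for loop, an if/then test and ordinary commands might
look something like the following sketch (the directory path and size limit are made up purely
for illustration):

#!/bin/sh
# Warn about student home directories using more than a given amount of disk space.

LIMIT=50000                               # limit in Kbytes

for dir in /home/students/*
do
    size=`du -ks "$dir" | awk '{print $1}'`
    if [ "$size" -gt "$LIMIT" ]; then
        echo "Warning: $dir is using ${size}K (limit is ${LIMIT}K)"
    fi
done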

Looking at typical system shell script files, eg. the bootup scripts contained in /etc/init.d, one can
see that most system scripts make extensive use of if/then expressions and case statements.
However, a typical admin will find it mostly unnecessary to use even these features. In fact,
many administration tasks one might choose to do can be performed by a single command or
sequence of commands on a single line (made possible via the various metacharacters). An
admin might put such mini-scripts into a file and execute that file when required; even though
the file's contents may not appear to be particularly complex, one can perform a wide range of
tasks using just a few commands.
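
Returning to the bootup scripts mentioned above, their general shape is usually a variation on the
following generic sketch (this is not a copy of any real IRIX script - the daemon name, paths and
pid-file convention are all hypothetical):

#!/bin/sh
# Generic /etc/init.d script: run with 'start' at bootup and 'stop' at shutdown.

case "$1" in
start)
    if [ -x /usr/local/bin/mydaemon ]; then
        /usr/local/bin/mydaemon &
    fi
    ;;
stop)
    # assumes the daemon writes its process ID to a pid file when it starts
    if [ -f /var/run/mydaemon.pid ]; then
        kill `cat /var/run/mydaemon.pid`
    fi
    ;;
*)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac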

A hash symbol '#' at the beginning of a line in a script file denotes a comment.

One of the most commonly used commands in UNIX is 'find' which allows one to search for
files, directories, files belonging to a particular user or group, files of a special type (eg. a link to
another file), files modified before or after a certain time, and so on (there are many options).
Most admins tend to use the find command to select certain files upon which to perform some
other operation, to locate files for information gathering purposes, etc.

The find command uses a Boolean expression which defines the type of file the command is to
search for. The name of any file matching the Boolean expression is returned.
For example (see the 'find' man page for full details):

find /home/students -name "capture.mv" -print

Figure 25. A typical find command.

This command searches all students directories, looking for any file called 'capture.mv'. On Indy
systems, users often capture movie files when first using the digital camera, but usually never
delete them, wasting disk space. Thus, an admin might have a site policy that, at regular
intervals, all files called capture.mv are erased - users would be notified that if they captured a
video sequence which they wished to keep, they should either set the name to use as something
else, or rename the file afterwards.

One could place the above command into an executable file called 'loc', running that file when one
so desired. This can be done easily by the following sequence of actions (only one line is entered
in this example, but one could easily enter many more):

% cat > loc
find /home/students -name "capture.mv" -print
[press CTRL-D]
% chmod u+x loc
% ls -lF loc
-rwxr--r-- 1 mapleson staff 46 May 3 13:20 loc*

Figure 26. Using cat to quickly create a simple shell script.

Using ls -lF to examine the file, one would see the file has the execute permission set for user,
and a '*' has been appended after the file name, both indicating the file is now executable. Thus,
one could run that file just as if it were a program. One might imagine this is similar to .BAT
files in DOS, but the features and functionality of shell scripts are very different (much more
flexible and powerful, eg. the use of pipes).

There's no reason why one couldn't use an editor to create the file, but experienced admins know
that it's faster to use shortcuts such as employing cat in the above way, especially compared to
using GUI actions, which require one to take hold of the mouse, move it, double-click on an icon,
etc. Novice users of UNIX systems don't realise until later that very simple actions can take
longer to accomplish with GUI methods.

Creating a file by redirecting the input from cat to a file is a technique I often use for typing out
files with little content. cat receives its input from stdin (the keyboard by default), so using 'cat >
filename' means anything one types is redirected to the named file instead of stdout; one must
press CTRL-D to end the input stream and close the file.

An even lazier way of creating the file, if just one line was required, is to use echo:

% echo 'find /home/students -name "capture.mv" -print' > loc
% chmod u+x loc
% ls -lF loc
-rwxr--r-- 1 mapleson staff 46 May 3 13:36 loc
% cat loc
find /home/students -name "capture.mv" -print

Figure 27. Using echo to create a simple one-line shell script.

This time, there is no need to press CTRL-D, ie. the prompt returns immediately and the file has
been created. This happens because, unlike cat, echo does not read from stdin at all: it simply
writes its arguments to stdout followed by a newline (the trailing newline can be suppressed with
the '-n' option), so no end-of-file action is required to terminate any input.

The man page for echo says, "echo is useful for producing diagnostics in command files and for
sending known data into a pipe."

For the example shown in Fig 27, single quote marks surrounding the find command were
required. This is because, without the quotes, the double quotes enclosing capture.mv are not
included in the output stream which is redirected into the file. When contained in a shell script
file, find doesn't need double quotes around the file name to search for, but it's wise to include
them because other characters such as * have special meaning to a shell. For example, without
the single quote marks, the script file created with echo works just fine (this example searches
for any file beginning with the word 'capture' in my own account):

% echo find /mapleson -name "capture.*" -print > loc


% chmod u+x loc
% ls -lF loc
-rwxr--r-- 1 mapleson staff 38 May 3 14:05 loc*
% cat loc
find /mapleson -name capture.* -print
% loc
/mapleson/work/capture.rgb

Figure 28. An echo sequence without quote marks.

Notice the loc file has no double quotes. But if the contents of loc is entered directly at the
prompt:

% find /mapleson -name capture.* -print


find: No match.

Figure 29. The command fails due to * being treated as a metacharacter by the shell.

Even though the command looks the same as the contents of the loc file, entering it directly at
the prompt produces an error. This happens because the * character is interpreted by the shell
before the find command, ie. the shell tries to evaluate the capture.* expression for the current
directory, instead of leaving the * to be part of the find command. Thus, when entering
commands at the shell prompt, it's wise to either use double quotes where appropriate, or use the
backslash \ character to tell the shell not to treat the character as a shell metacharacter,
eg.:

% find /mapleson -name capture.\* -print


/mapleson/work/capture.rgb

Figure 30. Using a backslash to avoid confusing the shell.

A -exec option can be used with the find command to enable further actions to be taken on each
result found, eg. the example in Fig 25 could be enhanced by making the find
operation execute a further command to remove each capture.mv file as it is found:

find /home/students -name "capture.mv" -print -exec /bin/rm {} \;

Figure 31. Using find with the -exec option to execute rm.

Any name returned by the search is passed on to the rm command: find substitutes the {}
symbols with each file name result as it is found. The \; grouping at the end terminates the
command given to -exec (the ; character is normally used to terminate a command, but a
backslash is needed to prevent it being interpreted by the shell as a metacharacter).

Alternatively, one could use this type of command sequence to perform other tasks, eg. suppose I
just wanted to know how large each movie file was:

find /home/students -name "capture.mv" -print -exec /bin/ls -l {} \;

Figure 32. Using find with the -exec option to execute ls.

This works, but two entries will be printed for each file found: one from the -print option, the
other from the output of the ls command. To see just the ls output, one can omit the -print option.

Consider this version:

find /home/students -name "*.mov" -exec /bin/ls -l {} \; > results

Figure 33. Redirecting the output from find to a file.

This searches for any .mov movie file (usually QuickTime movies), with the output redirected
into a file. One can then perform further operations on the results file, eg. one could search the
data for any movie that contains the word 'star' in its name:

grep star results

A final change might be to send the results of the grep operation to the printer for later reading:

grep star results | lp

Thus, the completed script looks like this:

find /home/students -name "*.mov" -exec /bin/ls -l {} \; > results


grep star results | lp
Figure 34. A simple script with two lines.

Only two lines, but this is now a handy script for locating any movies on the file system that are
likely to be related to the Star Wars or Star Trek sagas and thus probably wasting valuable disk
space! For the network I run, I could then use the results to send each user a message saying the
Star Wars trailer is already available in /home/pub/movies/misc, so they've no need to download
extra copies to their home directory.

It's a trivial example, but in terms of the content of the commands and the way extra commands
are added, it's typical of the level of complexity of most scripts which admins have to create.

Further examples of the use of 'find' are in the relevant man page; an example file which contains
several different variations is:

/var/spool/cron/crontabs/root

This file lists the various administration tasks which are executed by the system automatically on
a regular basis. The cron system itself is discussed in a later lecture.

WARNING. The Dangers of the Find Command and Wildcards.

Although UNIX is an advanced OS with powerful features, sometimes one encounters an aspect
of its operation which catches one completely off-guard, though this is much less the case after
just a little experience.

A long time ago (January 1996), I realised that many students who used the Capture program to
record movies from the Digital Camera were not aware that using this program or other movie-
related programs could leave unwanted hidden directories containing temporary movie files in
their home directory, created during capture, editing or conversion operations (I think it happens
when an application is killed off suddenly, eg. with CTRL-C, which doesn't give it an opportunity
to erase temporary files).

These directories, which are always located in a user's home directory, are named
'.capture.mv.tmpXXXXX' where XXXXX is some 5-character string such as '000Hb', and can easily
take up many megabytes of space each.

So, I decided to write a script to automatically remove such directories on a regular basis. Note
that I was logged on as root at this point, on my office Indy.

In order to test that a find command would work on hidden files (I'd never used the find
command to look for hidden files before), I created some test directories in the /tmp directory,
whose contents would be given by 'ls -AR' as something like this:

% ls -AR
.b/ .c/ a/ d/
./.b:

./.c:
.b a

./a:

./d:
a

ie. a simple range of hidden and non-hidden directories with or without any content:

 Ordinary directories with or without hidden/non-hidden files inside,


 Hidden directories with or without hidden/non-hidden files inside,
 Directories with ordinary files,
 etc.

The actual files such as .c/a and .c/.b didn't contain anything. Only the names were important for the
test.

So, to test that find would work ok, I executed the following command from within the /tmp
directory:

find . -name ".*" -exec /bin/rm -r {} \;

(NB: the -r option for rm means do a recursive removal, and note that there was no -i option used
with the rm here)

What do you think this find command would do? Would it remove the hidden directories .b and
.c and their contents? If not, why not? Might it do anything else as well?

Nothing happened at first, but the command did seem to be taking far too long to return the shell
prompt. So, after a few seconds, I decided something must have gone wrong; I typed CTRL-C to
stop the find process (NB: it was fortunate I was not distracted by a phone call or something at
this point).

Using the ls command showed the test files I'd created still existed, which seemed odd. Trying
some further commands, eg. changing directories, using the 'ps' command to see if there was
something causing system slowdown, etc., produced strange errors which I didn't understand at
the time (this was after only 1 or 2 months' admin experience), so I decided to reboot the system.

The result was disaster: the system refused to boot properly, complaining about swap file errors
and things relating to device files. Why did this happen?

Consider the following command sequence by way of demonstration:


cd /tmp
mkdir xyz
cd xyz
/bin/ls -al

The output given will look something like this:

drwxr-xr-x 2 root sys 9 Apr 21 13:28 ./


drwxrwxrwt 6 sys sys 512 Apr 21 13:28 ../

Surely the directory xyz should be empty? What are these two entries? Well, not quite empty. In
UNIX, as stated in a previous lecture, virtually everything is treated as a file. Thus, for example,
the command so commonly performed even on the DOS operating system:

cd ..

is actually doing something rather special on UNIX systems. 'cd ..' is not an entire command in
itself. Instead, every directory on a UNIX file system contains two hidden directories which are
in reality special types of file:

./ - this refers to the current directory.


../ - this is effectively a link to the
directory above in the file system.

So typing 'cd ..' actually means 'change directory to ..' (logical since cd does mean 'change
directory to') and since '..' is treated as a link to the directory above, then the shell changes the
current working directory to the next level up.

[by contrast, 'cd ..' in DOS is treated as a distinct command in its own right - DOS recognises the
presence of '..' and if possible changes directory accordingly; this is why DOS users can type
'cd..' instead if desired]

But this can have an unfortunate side effect if one isn't careful, as is probably becoming clear by
now. The ".*" search pattern in the find command will also find these special './' and '../' entries
in the /tmp directory, ie.:

 The first things the find command locates are the special './' and '../' entries, both of
which match the ".*" search pattern.
 The '../' entry refers to the directory above /tmp, ie. / (the root directory), so the search
moves up to /. Uh oh...
 find locates the './' and '../' entries in / as well; since the current directory cannot be any
higher, the search simply continues through the contents of the root directory itself.
 The -exec option with 'rm' causes find to begin erasing hidden files and directories such as
.Sgiresources, eventually moving onto non-hidden files: first the /bin link to /usr/bin, then the
/debug link, then all of /dev, /dumpster, /etc and so on.

By the time I realised something was wrong, the find command had gone as far as deleting most of /etc.
Although the erased files in /etc could have been replaced from a backup tape or by a reinstall,
the real damage was the erasure of the /dev directory. Without important entries such as /dev/dsk,
/dev/rdsk, /dev/swap and /dev/tty*, the system cannot mount disks, configure the swap partition on
bootup, connect to keyboard input devices (tty terminals), and accomplish other important tasks.

In other words, disaster. And I'd made it worse by rebooting the system. Almost a complete
repair could have been done simply by copying the /dev and /etc directories from another
machine as a temporary fix, but the reboot made everything go haywire. I was partly fooled by
the fact that the files in /tmp were still present after I'd stopped the command with CTRL-C. This
led me to at first think that nothing had gone awry.

Consulting an SGI software support engineer for help, it was decided the only sensible solution
was to reinstall the OS, a procedure which was a lot simpler than trying to repair the damage I'd
done.

So, the lessons learned:

 Always read up about a command before using it. If I'd searched the online books with the
expression 'find command', I would have discovered the following paragraph in Chapter 2
("Making the Most of IRIX") of the 'IRIX Admin: System Configuration and Operation' manual:

"Note that using recursive options to commands can be very dangerous in that the command
automatically makes changes to your files and file system without prompting you in each case.
The chgrp command can also recursively operate up the file system tree as well as down. Unless
you are sure that each and every case where the recursive command will perform an action is
desired, it is better to perform the actions individually. Similarly, it is good practice to avoid the
use of metacharacters (described in "Using Regular Expressions and Metacharacters") in
combination with recursive commands."

I had certainly broken the rule suggested by the last sentence in the above paragraph. I
also did not know what the command would do before I ran it.

 Never run programs or scripts with as-yet unknown effects as root.

ie. when testing something like removing hidden directories, I should have logged on as
some ordinary user, eg. a 'testuser' account, so that if the command went wrong it would
not have been able to change or remove any files owned by root, or files owned by
anyone else for that matter, including my own in /mapleson. If I had done this, the
command I used would have given an immediate error and halted when the find string
tried to remove the very first file found in the root directory (probably some minor hidden
file such as .Sgiresources).

Worrying thought: if I hadn't CTRL-C'd the find command when I did, after enough time, the command
would have erased the entire file system (including /home), or at least tried to. I seem to recall that, in
reality (tested once on a standalone system deliberately), one can get about as far as most of /lib before
the system actually goes wrong and stops the current command anyway, ie. the find command
sequence eventually ends up failing to locate key libraries needed for the execution of 'rm' (or perhaps
the 'find' itself) at some point.

The only positive aspects of the experience were that, a) I'd learned a lot about the subtleties of
the find command and the nature of files very quickly; b) I discovered after searching the Net
that I was not alone in making this kind of mistake - there was an entire web site dedicated to the
comical mess-ups possible on various operating systems that can so easily be caused by even
experienced admins, though more usually as a result of inexperience or simple errors, eg. I've
had at least one user so far who has erased their home directory by mistake with 'rm -r *' (he'd
thought his current working directory was /tmp when in fact it wasn't). A backup tape restored
his files.

Most UNIX courses explain how to use the various available commands, but it's also important
to show how not to use certain commands, mainly because of what can go wrong when the root
user makes a mistake. Hence, I've described my own experience of making an error in some
detail, especially since 'find' is such a commonly used command.

As stated in an earlier lecture, to a large part UNIX systems run themselves automatically. Thus,
if an admin finds that she/he has some spare time, I recommend using that time to simply read up
on random parts of the various administration manuals - look for hints & tips sections, short-cuts,
sections covering daily advice, guidance notes for beginners, etc. Also read man pages: follow
them from page to page using xman, rather like the way one can become engrossed in an
encyclopedia, looking up reference after reference to learn more.

A Simple Example Shell Script.

I have a script file called 'rebootlab' which contains the following:

rsh akira init 6&


rsh ash init 6&
rsh cameron init 6&
rsh chan init 6&
rsh conan init 6&
rsh gibson init 6&
rsh indiana init 6&
rsh leon init 6&
rsh merlin init 6&
rsh nikita init 6&
rsh ridley init 6&
rsh sevrin init 6&
rsh solo init 6&
#rsh spock init 6&
rsh stanley init 6&
rsh warlock init 6&
rsh wolfen init 6&
rsh woo init 6&
Figure 35. The simple rebootlab script.

The rsh command means 'remote shell'. rsh allows one to execute commands on a remote system
by establishing a connection, creating a shell on that system using one's own user ID
information, and then executing the supplied command sequence.

The init program is used for process control initialisation (see the man page for details). A
typical use for init is to shutdown the system or reboot the system into a particular state, defined
by a number from 0 to 6 (0 = full shutdown, 6 = full reboot) or certain other special possibilities.

As explained in a previous lecture, the '&' runs a process in the background.

Thus, each line in the file executes a remote shell on a system, instructing that system to reboot.
The init command in each case is run in the background so that the rsh command can
immediately return control to the rebootlab script in order to execute the next rsh command.

The end result? With a single command, I can reboot the entire SGI lab without ever leaving the
office.

Note: the line for the machine 'spock' is commented out. This is because the Indy called spock is
currently in the technician's office, ie. not in service. This is a good example of where I could
make the script more efficient by using a for loop, something along the lines of: for each name in
this list of names, do <command>.
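
As a rough Bourne shell (sh) sketch, with the host names simply listed in the loop, the same job
could be done like this:

# Sketch of rebootlab as a for loop; spock is omitted from the list
# because that machine is currently out of service.
for host in akira ash cameron chan conan gibson indiana leon merlin \
            nikita ridley sevrin solo stanley warlock wolfen woo
do
    rsh $host init 6 &
done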

As should be obvious, the rebootlab script makes no attempt to check if anybody is logged into
the system. So in practice I use the rusers command to make sure nobody is logged on before
executing the script. This is where the script could definitely be improved: the command sent by
rsh to each system could be modified with some extra commands so that each system is only
rebooted if nobody is logged in at the time (the 'who' command could probably be used for this,
eg. 'who | grep -v root' would give no output if nobody was logged on).
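
For example (a sketch only, assuming each login session appears as one line in who's output), the
rsh line in the loop shown above could become:

    rsh $host 'who | grep -v root > /dev/null || init 6' &

Here grep succeeds (and so suppresses the reboot) if any non-root user is logged on; the quoted
command is interpreted on the remote machine, not locally.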

The following script, called 'remountmapleson', is one I use when I go home in the evening, or
perhaps at lunchtime to do some work on the SGI I use at home.

rsh yoda umount /mapleson && mount /mapleson &


rsh akira umount /mapleson && mount /mapleson &
rsh ash umount /mapleson && mount /mapleson &
rsh cameron umount /mapleson && mount /mapleson &
rsh chan umount /mapleson && mount /mapleson &
rsh conan umount /mapleson && mount /mapleson &
rsh gibson umount /mapleson && mount /mapleson &
rsh indiana umount /mapleson && mount /mapleson &
rsh leon umount /mapleson && mount /mapleson &
rsh merlin umount /mapleson && mount /mapleson &
rsh nikita umount /mapleson && mount /mapleson &
rsh ridley umount /mapleson && mount /mapleson &
rsh sevrin umount /mapleson && mount /mapleson &
rsh solo umount /mapleson && mount /mapleson &
#rsh spock umount /mapleson && mount /mapleson &
rsh stanley umount /mapleson && mount /mapleson &
rsh warlock umount /mapleson && mount /mapleson &
rsh wolfen umount /mapleson && mount /mapleson &
rsh woo umount /mapleson && mount /mapleson &

Figure 36. The simple remountmapleson script.

When I leave for home each day, my own external disk (where my own personal user files
reside) goes with me, but this means the mount status of the /mapleson directory for every SGI in
Ve24 is now out-of-date, ie. each system still has the directory mounted even though the file
system which was physically mounted from the remote system (called milamber) is no longer
present. As a result, any attempt to access the /mapleson directory would give an error: "Stale
NFS file handle." Even listing the contents of the root directory would show the usual files but
also the error as well.

To solve this problem, the script makes every system unmount the /mapleson directory and, if
that was successfully done, remount the directory once more. Without my disk present on
milamber, its /mapleson directory simply contains a file called 'README' whose contents state:

Sorry, /mapleson data not available - my external disk has been temporarily removed. I've
probably gone home to work for a while. If you need to contact me, please call <phone
number>.

As soon as my disk is connected again and the script run once more, milamber's local /mapleson
contents are hidden by my own files, so users can access my home directory once again.

Thus, I'm able to add or remove my own personal disk and alter what users can see and access at
a global level without users ever noticing the change.

Note: the server still regards my home directory as /mapleson on milamber, so in order to ensure
that I can always logon to milamber as mapleson even if my disk is not present, milamber's
/mapleson directory also contains basic .cshrc, .login and .profile files.

Yet again, a simple script is created to solve a particular problem.

Command Arguments.

When a command or program is executed, the name of the command and any parameters are
passed to the program as arguments. In shell scripts, these arguments can be referenced via the '$'
symbol. Argument 0 is always the name of the command, then argument 1 is the first parameter,
argument 2 is the second parameter, etc. Thus, the following script called (say) 'go':

echo $0
echo $1
echo $2

would give this output upon execution:

% go somewhere nice
go
somewhere
nice

Including extra echo commands such as 'echo $3' merely produces blank lines after the supplied
parameters are displayed.

If one examines any typical system shell script, this technique of passing parameters and
referencing arguments is used frequently. As an example, I once used the technique to aid in the
processing of a large number of image files for a movie editing task. The script I wrote is also
typical of the general complexity of code which most admins have to deal with; called 'go', it
contained:

subimg $1 a.rgb 6 633 6 209


gammawarp a.rgb m.rgb 0.01
mult a.rgb a.rgb n.rgb
mult n.rgb m.rgb f.rgb
addborder f.rgb b.rgb x.rgb
subimg x.rgb ../tmp2/$1 0 767 300 875

(the commands used in this script are various image processing commands that are supplied as
part of the Graphics Library Image Tools software subsystem. Consult the relevant man pages
for details)

The important feature is the use of the $1 symbol in the first line. The script expects a single
parameter, ie. the name of the file to be processed. By eventually using this same argument at the
end of an alternative directory reference, a processed image file with the same name is saved
elsewhere after all the intermediate processing steps have finished. Each step uses temporary
files created by previous steps.

When I used the script, I had a directory containing 449 image files, each with a different name:

i000.rgb
i001.rgb
i002.rgb
.
.
.
i448.rgb

To process all the frames in one go, I simply entered this command:

find . -name "i*.rgb" -print -exec go {} \;

As each file is located by the find command, its name is passed as a parameter to the go script.
The use of the -print option displays the name of each file before the go script begins processing
the file's contents. It's a simple way to execute multiple operations on a large number of files.

Secure/Restricted Shell Scripts.

It is common practice to include the following line at the start of a shell script:

#!/bin/sh

This tells any shell what to use to interpret the script if the script is simply executed, as opposed
to sourcing the script within the shell.

The 'sh' shell is a lower level shell than csh or tcsh, ie. it's more restricted in what it can do and
does not have all the added features of csh and tcsh. However, this means a better level of
security, so many scripts (especially as-standard system scripts) include the above line in order to
make sure that security is maximised.

Also, by starting a new shell to run the script in, one ensures that the commands are always
performed in the same way, ie. a script without the above line may work slightly differently
when executed from within different shells (csh, tcsh, etc.), perhaps because of any aliases
present in the current shell environment, or a customised path definition, etc.
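
As a minimal illustration, the 'loc' script from earlier written with this convention would be:

#!/bin/sh
# Always run under /bin/sh, regardless of the user's login shell.
find /home/students -name "capture.mv" -print
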
Detailed Notes for Day 2 (Part 3)
UNIX Fundamentals: System Monitoring Tools.

Running a UNIX system always involves monitoring how a system is behaving on a daily basis.
Admins must keep an eye on such things as:

 disk space usage


 system performance and statistics, eg. CPU usage, disk I/O, memory, etc.
 network performance and statistics
 system status, user status
 service availability, eg. Internet access
 system hardware failures and related maintenance
 suspicious/illegal activity

Figure 37. The daily tasks of an admin.

This section explains the various system monitoring tools, commands and techniques which an
admin can use to monitor the areas listed above. Typical example administration tasks are
discussed in a later lecture. The focus here is on available tools and what they offer, not on how
to use them as part of an admin strategy.

Disk Space Usage.

The df command reports current disk space usage. Run on its own, the output is expressed in
terms of numbers of blocks used/free, eg.:

yoda # df
Filesystem Type blocks use avail %use Mounted on
/dev/root xfs 8615368 6116384 2498984 71 /
/dev/dsk/dks4d5s7 xfs 8874746 4435093 4439653 50 /home
milamber:/mapleson nfs 4225568 3906624 318944 93 /mapleson

Figure 38. Using df without options.

A block is 512 bytes. But most people tend to think in terms of kilobytes, megabytes and
gigabytes, not multiples of 512 bytes. Thus, the -k option can be used to show the output in K:

yoda # df -k
Filesystem Type kbytes use avail %use Mounted on
/dev/root xfs 4307684 3058192 1249492 71 /
/dev/dsk/dks4d5s7 xfs 4437373 2217547 2219826 50 /home
milamber:/mapleson nfs 2112784 1953312 159472 93 /mapleson

Figure 39. The -k option with df to show data in K.


The df command can be forced to report data only for the file system housing the current
directory by adding a period:

yoda # cd /home && df -k .


Filesystem Type kbytes use avail %use Mounted on
/dev/dsk/dks4d5s7 xfs 4437373 2217547 2219826 50 /home

Figure 40. Using df to report usage for the file system holding the current directory.

The du command can be used to show the amount of space used by a particular directory or file,
or series of directories and files. The -k option can be used to show usage in K instead of 512-byte
blocks just as with df. du's default behaviour is to report a usage amount recursively for every
sub-directory, giving a total at the end, eg.:

yoda # du -k /usr/share/data/models
436 /usr/share/data/models/sgi
160 /usr/share/data/models/food
340 /usr/share/data/models/toys
336 /usr/share/data/models/buildings
412 /usr/share/data/models/household
864 /usr/share/data/models/scenes
132 /usr/share/data/models/chess
1044 /usr/share/data/models/geography
352 /usr/share/data/models/CyberHeads
256 /usr/share/data/models/machines
1532 /usr/share/data/models/vehicles
88 /usr/share/data/models/simple
428 /usr/share/data/models/furniture
688 /usr/share/data/models/robots
7760 /usr/share/data/models

Figure 41. Using du to report usage for several directories/files.

The -s option can be used to restrict the output to just an overall total for the specified directory:

yoda # du -k -s /usr/share/data/models
7760 /usr/share/data/models

Figure 42. Restricting du to a single directory.

By default, du does not follow symbolic links, though the -L option can be used to force links to
be followed if desired.

However, du does examine NFS-mounted file systems by default. The -l and -m options can be
used to restrict this behaviour, eg.:

ASH # cd /
ASH # du -k -s -l
0 CDROM
0 bin
0 debug
68 dev
0 disk2
2 diskcopy
0 dumpster
299 etc
0 home
2421 lib
2579 lib32
0 opt
0 proc
1 root.home
4391 sbin
565 stand
65 tmp
3927 unix
397570 usr
6346 var

Figure 43. Forcing du to ignore symbolic links.

The output in Fig 43 shows that the /home directory has been ignored.

Another example: a user can find out how much disk space their account currently uses by
entering: du -k -s ~/

Swap space (ie. virtual memory on disk) can be monitored using the swap command with the -l
option.

For full details on these commands, see the relevant man pages.

Commands relating to file system quotas are dealt with in a later lecture.

System Performance.

This includes processor loading, disk loading, etc.

The most common command used by admins/users to observe CPU usage is ps, which displays a
list of currently running processes along with associated information, including the percentage of
CPU time currently being consumed by each process, eg.:

ASH 6# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 0 0 0 08:00:41 ? 0:01 sched
root 1 0 0 08:00:41 ? 0:01 /etc/init
root 2 0 0 08:00:41 ? 0:00 vhand
root 3 0 0 08:00:41 ? 0:03 bdflush
root 4 0 0 08:00:41 ? 0:00 munldd
root 5 0 0 08:00:41 ? 0:02 vfs_sync
root 900 895 0 08:03:27 ? 1:25 /usr/bin/X11/Xsgi -bs
[etc]
root 7 0 0 08:00:41 ? 0:00 shaked
root 8 0 0 08:00:41 ? 0:00 xfsd
root 9 0 0 08:00:41 ? 0:00 xfsd
root 10 0 0 08:00:41 ? 0:00 xfsd
root 11 0 0 08:00:41 ? 0:00 pdflush
root 909 892 0 08:03:31 ? 0:02 /usr/etc/videod
root 1512 1509 0 15:37:17 ? 0:00 sh -c /var/X11/xdm/Xlogin
root 158 1 0 08:01:01 ? 0:01 /usr/etc/ypbind -ypsetme
root 70 1 0 08:00:50 ? 0:00 /usr/etc/syslogd
root 1536 211 0 16:06:04 pts/0 0:00 rlogind
root 148 1 0 08:01:00 ? 0:01 /usr/etc/routed -h -[etc]
root 146 1 0 08:01:00 ? 0:00 /usr/etc/portmap
root 173 172 0 08:01:03 ? 0:01 /usr/etc/nfsd 4
root 172 1 0 08:01:03 ? 0:01 /usr/etc/nfsd 4
root 174 172 0 08:01:03 ? 0:01 /usr/etc/nfsd 4
root 175 172 0 08:01:03 ? 0:01 /usr/etc/nfsd 4
root 178 1 0 08:01:03 ? 0:00 /usr/etc/biod 4
root 179 1 0 08:01:03 ? 0:00 /usr/etc/biod 4
root 180 1 0 08:01:03 ? 0:00 /usr/etc/biod 4
root 181 1 0 08:01:03 ? 0:00 /usr/etc/biod 4
root 189 1 0 08:01:04 ? 0:00 bio3d
root 190 1 0 08:01:04 ? 0:00 bio3d
root 191 1 0 08:01:04 ? 0:00 bio3d
root 202 1 0 08:01:05 ? 0:00 /usr/etc/rpc.statd
root 192 1 0 08:01:04 ? 0:00 bio3d
root 188 1 0 08:01:03 ? 0:00 bio3d
root 311 1 0 08:01:08 ? 0:00 /usr/etc/timed -M -F yoda
root 211 1 0 08:01:05 ? 0:02 /usr/etc/inetd
root 823 1 0 08:01:33 ? 0:13 /usr/lib/sendmail -bd -
q15m
root 1557 1537 9 16:10:58 pts/0 0:00 ps -ef
root 892 1 0 08:03:25 ? 0:00 /usr/etc/videod
root 1513 1512 0 15:37:17 ? 0:07 /usr/Cadmin/bin/clogin -f
root 1546 872 0 16:07:55 ? 0:00
/usr/Cadmin/bin/directoryserver
root 1537 1536 1 16:06:04 pts/0 0:01 -tcsh
root 903 1 0 08:03:27 tablet 0:00 /sbin/getty ttyd1 co_9600
lp 460 1 0 08:01:17 ? 0:00 /usr/lib/lpsched
root 1509 895 0 15:37:13 ? 0:00 /usr/bin/X11/xdm
root 488 1 0 08:01:19 ? 0:01 /sbin/cron
root 1556 1537 28 16:10:56 pts/0 0:01 find /usr -name *.txt -
print
root 895 1 0 08:03:27 ? 0:00 /usr/bin/X11/xdm
root 872 1 0 08:02:32 ? 0:06
/usr/Cadmin/bin/directoryserver

Figure 44. Typical output from the ps command.

Before obtaining the output shown in Fig 44, I ran a find command in the background. The
output shows that the find command was utilising 28% of available CPU resources; tasks such as
find are often limited by the speed and bandwidth capacity of the disk, not the speed of the main
CPU.

The ps command has a variety of options to show or not show various information. Most of the
time though, 'ps -ef' is adequate to display the kind of information required. Note that other
UNIX variants use slightly different options, eg. the equivalent command on SunOS would be
'ps -aux'.

One can use grep to only report data for a particular process, eg.:

ASH 5# ps -ef | grep lp

lp 460 1 0 08:01:17 ? 0:00 /usr/lib/lpsched

Figure 45. Filtering ps output with grep.

This only reports data for the lp printer scheduler.

However, ps only gives a snapshot of the current system state. Often of more interest is a
system's dynamic behaviour. A more suitable command for monitoring system performance over
time is 'top', a typical output of which looks like this:

IRIX ASH 6.2 03131015 IP22 Load[0.22,0.12,0.01] 16:17:47 166 procs


user pid pgrp %cpu proc pri size rss time command
root 1576 1576 24.44 * 20 386 84 0:02 find
root 1577 1577 0.98 0 65 432 100 0:00 top
root 1513 1509 0.18 * 60 4322 1756 0:07 clogin
root 900 900 0.12 * 60 2858 884 1:25 Xsgi
root 146 146 0.05 * 60 351 77 0:00 portmap
root 158 0 0.05 * 60 350 81 0:00 ypbind
root 1567 1567 0.02 * 60 349 49 0:00 rlogind
root 3 0 0.01 * +39 0 0 0:03 bdflush
root 172 0 0.00 * 61 0 0 0:00 nfsd
root 173 0 0.00 * 61 0 0 0:00 nfsd
root 174 0 0.00 * 61 0 0 0:00 nfsd
root 175 0 0.00 * 61 0 0 0:00 nfsd

Figure 46. top shows a continuously updated output.

From the man page for top:

"Two header lines are displayed. The first gives the machine name, the release and build date
information, the processor type, the 1, 5, and 15 minute load average, the current time and the
number of active processes. The next line is a header containing the name of each field
highlighted."

The display is constantly updated at regular intervals, the duration of which can be altered with
the -i option (default duration is 5 seconds). top shows the following data for each process:
"user name, process ID, process group ID, CPU usage, processor currently executing the process
(if process not currently running), process priority, process size (in pages), resident set size (in
pages), amount of CPU time used by the process, and the process name."

Just as with the ps command, top shows the ID number for each process. These IDs can be used
with the kill command (and others) to control running processes, eg. shut them down, suspend
them, etc.
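
For example, using the process ID of the find command shown in Fig 46 (1576), one might enter
(standard signal names assumed; see the kill man page for details):

% kill 1576 (send the default TERM signal, asking the process to exit)
% kill -STOP 1576 (suspend the process)
% kill -CONT 1576 (resume the suspended process)
% kill -9 1576 (force termination with the KILL signal if TERM is ignored)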

There is a GUI version of top called gr_top.

Note that IRIX 6.5 contains a newer version of top which gives even more information, eg.:

IRIX WOLFEN 6.5 IP22 load averages: 0.06 0.01 0.00


17:29:44
58 processes: 56 sleeping, 1 zombie, 1 running
CPU: 93.5% idle, 0.5% usr, 5.6% ker, 0.0% wait, 0.0% xbrk, 0.5%
intr
Memory: 128M max, 116M avail, 88M free, 128M swap, 128M free swap

PID PGRP USERNAME PRI SIZE RES STATE TIME WCPU% CPU%
COMMAND
1372 1372 root 20 2204K 1008K run/0 0:00 0.2 3.22 top
153 153 root 20 2516K 1516K sleep 0:05 0.1 1.42 nsd
1364 1364 root 20 1740K 580K sleep 0:00 0.0 0.24
rlogind

Figure 47. The IRIX 6.5 version of top, giving extra information.

A program which offers much greater detail than top is osview. Like top, osview constantly
updates a whole range of system performance statistics. Unlike top though, so much information
is available from osview that it offers several different 'pages' of data. The number keys are used
to switch between pages. Here is a typical output for each of the five pages:

Page 1 (system information):

Osview 2.1 : One Second Average WOLFEN 17:32:13 04/21/99 #5


int=5s
Load Average fs ctl 2.0M
1 Min 0.000 fs data 7.7M
5 Min 0.000 delwri 0
15 Min 0.000 free 87.5M
CPU Usage data 26.0M
%user 0.20 empty 61.4M
%sys 0.00 userdata 20.7M
%intr 0.00 reserved 0
%gfxc 0.00 pgallocs 2
%gfxf 0.00 Scheduler
%sxbrk 0.00 runq 0
%idle 99.80 swapq 0
System Activity switch 4
syscall 19 kswitch 95
read 1 preempt 1
write 0 Wait Ratio
fork 0 %IO 1.2
exec 0 %Swap 0.0
readch 19 %Physio 0.0
writech 38
iget 0
System Memory
Phys 128.0M
kernel 10.1M
heap 3.9M
mbufs 96.0K
stream 40.0K
ptbl 1.2M

Figure 48. System information from osview.

Page 2 (CPU information):

Osview 2.1 : One Second Average WOLFEN 17:36:27 04/21/99 #1


int=5s
CPU Usage
%user 0.00
%sys 100.00
%intr 0.00
%gfxc 0.00
%gfxf 0.00
%sxbrk 0.00
%idle 0.00

Figure 49. CPU information from osview.

Page 3 (memory information):

Osview 2.1 : One Second Average WOLFEN 17:36:56 04/21/99 #1


int=5s
System Memory iclean 0
Phys 128.0M *Swap
kernel 10.5M *System VM
heap 4.2M *Heap
mbufs 100.0K *TLB Actions
stream 48.0K *Large page stats
ptbl 1.3M
fs ctl 1.5M
fs data 8.2M
delwri 0
free 77.1M
data 28.8M
empty 48.3M
userdata 30.7M
reserved 0
pgallocs 450
Memory Faults
vfault 1.7K
protection 225
demand 375
cw 25
steal 375
onswap 0
oncache 1.4K
onfile 0
freed 0
unmodswap 0
unmodfile 0

Figure 50. Memory information from osview.

Page 4 (network information):

Osview 2.1 : One Second Average WOLFEN 17:38:15 04/21/99 #1


int=5s
TCP
acc. conns 0
sndtotal 33
rcvtotal 0
sndbyte 366
rexmtbyte 0
rcvbyte 0
UDP
ipackets 0
opackets 0
dropped 0
errors 0
IP
ipackets 0
opackets 33
forward 0
dropped 0
errors 0
NetIF[ec0]
Ipackets 0
Opackets 33
Ierrors 0
Oerrors 0
collisions 0
NetIF[lo0]

Figure 51. Network information from osview.

Page 5 (miscellaneous):

Osview 2.1 : One Second Average WOLFEN 17:38:43 04/21/99 #1


int=5s
Block Devices
lread 37.5K
bread 0
%rcache 100.0
lwrite 0
bwrite 0
wcancel 0
%wcache 0.0
phread 0
phwrite 0
Graphics
griioctl 0
gintr 75
swapbuf 0
switch 0
fifowait 0
fifonwait 0
Video
vidioctl 0
vidintr 0
drop_add 0
*Interrupts
*PathName Cache
*EfsAct
*XfsAct
*Getblk
*Vnodes

Figure 51. Miscellaneous information from osview.

osview clearly offers a vast amount of information for monitoring system and network activity.

There is a GUI version of osview called gr_osview. Various options exist to determine which
parameters are displayed with gr_osview, the most commonly used being -a to display as much
data as possible.

Programs such as top and osview may be SGI-specific (I'm not sure). If they are, other versions
of UNIX are bound to have equivalent programs to these.

Example use: although I do virtually all the administration of the server remotely using the office
Indy (either by command line or GUI tools), there is also a VT-style terminal in my office
connected to the server's serial port via a lengthy cable (the Challenge S server itself is in a small
ante room). The VT display offers a simple text-only interface to the server; thus, most of the
time, I leave osview running on the VT display so that I can observe system activity whenever I
need to. The VT also offers an extra communications link for remote administration should the
network go down, ie. if the network links fail (eg. broken hub) the admin Indy cannot be used to
communicate with the server, but the VT still can.

Another tool for monitoring memory usage is gmemusage, a GUI program which displays a
graphical split-bar chart view of current memory consumption. gmemusage can also display a
breakdown of the regions within a program's memory space, eg. text, data, shared memory, etc.

Much lower-level tools exist too, such as sar (system activity reporter). In fact, osview works by
using sar. Experienced admins may use tools like sar, but most admins will prefer to use higher-
level tools such as top, osview and gmemusage. However, since sar gives a text output, one can
use it in script files for automated system information gathering, eg. a system activity report
produced by a script, executed every hour by the cron job-scheduling system (sar-based
information gathering scripts are included in the cron job schedule as standard). sar can be given
options to report only on selected items, eg. the number of processes in memory waiting for CPU
resource time. sar can be told to monitor some system feature for a certain period, saving the data
gathered during that period to a file. sar is a very flexible program.
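
As a small sketch of this kind of use (the log file name here is an arbitrary choice; consult the sar
man page for the exact options available), a cron-driven script might record CPU utilisation like
this:

#!/bin/sh
# Take 12 CPU-utilisation samples at 5-minute (300 second) intervals and
# append the text report to a dated log file.
sar -u 300 12 >> /var/adm/sarlog.`date +%d%m%y`
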

Network Performance and Statistics.

osview can be used to monitor certain network statistics, but another useful program is ttcp. The
online book, "IRIX Admin: Networking and Mail", says:

"The ttcp tool measures network throughput. It provides a realistic measurement of network
performance between two stations because it allows measurements to be taken at both the
local and remote ends of the transmission."

To run a test with ttcp, enter the following on one system, eg. sevrin:

ttcp -r -s

Then enter the following on another system, eg. akira:

ttcp -t -s sevrin

After a delay of roughly 20 seconds for a 10Mbit network, results are reported by both systems,
which will look something like this:

SEVRIN # ttcp -r -s
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp
ttcp-r: socket
ttcp-r: accept from 193.61.252.2
ttcp-r: 16777216 bytes in 18.84 real seconds = 869.70 KB/sec +++
ttcp-r: 3191 I/O calls, msec/call = 6.05, calls/sec = 169.39
ttcp-r: 0.1user 3.0sys 0:18real 16% 118maxrss 0+0pf 1170+1csw

AKIRA # ttcp -t -s sevrin


ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp ->
sevrin
ttcp-t: socket
ttcp-t: connect
ttcp-t: 16777216 bytes in 18.74 real seconds = 874.19 KB/sec +++
ttcp-t: 2048 I/O calls, msec/call = 9.37, calls/sec = 109.27
ttcp-t: 0.0user 2.3sys 0:18real 12% 408maxrss 0+0pf 426+4csw

Figure 52. Results from ttcp between two hosts on a 10Mbit network.
Full details of the output are in the ttcp man page, but one can immediately see that the observed
network throughput (around 870KB/sec) is at a healthy level.

Another program for gathering network performance information is netstat. The online book,
"IRIX Admin: Networking and Mail", says:

"The netstat tool displays various network-related data structures that are useful for monitoring
and troubleshooting a network. Detailed statistics about network collisions can be captured with
the netstat tool."

netstat is commonly used with the -i option to list basic local network information, eg.:

yoda # netstat -i
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs
Coll
ec0 1500 193.61.252 yoda.comp.uclan 3906956 3 2945002 0
553847
ec3 1500 193.61.250 gate-yoda.comp. 560206 2 329366 0
16460
lo0 8304 loopback localhost 476884 0 476884 0
0

Figure 53. The output from netstat.

Here, the packet collision rate on ec0 has averaged 18.8% (553847 collisions out of 2945002 output packets). This is within acceptable limits [1].

Another useful command is 'ping'. This program sends packets of data to a remote system
requesting an acknowledgement response for each packet sent. Options can be used to send a
specific number of packets, or send as many packets as fast as they are returned, send a packet
every so often (user-definable duration), etc.

For example:

MILAMBER # ping yoda


PING yoda.comp.uclan.ac.uk (193.61.252.1): 56 data bytes
64 bytes from 193.61.252.1: icmp_seq=0 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=1 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=2 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=3 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=4 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=5 ttl=255 time=1 ms
64 bytes from 193.61.252.1: icmp_seq=6 ttl=255 time=1 ms

----yoda.comp.uclan.ac.uk PING Statistics----


7 packets transmitted, 7 packets received, 0% packet loss
round-trip min/avg/max = 1/1/1 ms
Figure 54. Example use of the ping command.

I pressed CTRL-C after the 7th packet was sent. ping is a quick and easy way to see if a host is
active and if so how responsive the connection is.

If a ping test produces significant packet loss on a local network, then it is highly likely there
exists a problem of some kind. Normally, one would rarely see a non-zero packet loss on a local
network from a direct machine-to-machine ping test.

A fascinating use of ping I once observed was at The Moving Picture Company (MPC) [2]. The
admin had written a script which made every host on the network send a ping test to every other
host. The results were displayed as a table with host names shown down the left hand side as
well as along the top. By looking for horizontal or diagonal lines of unusually large ping times,
the admin could immediately see if there was a problem with a single host, or with a larger part
of the network. Because of the need for a high system availability rate, the script allows the
admin to spot problems almost as soon as they occur, eg. by running the script once every ten
seconds.
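
A greatly simplified sketch of the idea (host names assumed, a single row of output only, and
assuming the local ping accepts -c to send a fixed number of packets):

#!/bin/sh
# Ping each host once and report its round-trip time; a blank result
# means the host did not respond.
for host in yoda akira ash cameron sevrin wolfen
do
    result=`ping -c 1 $host | grep 'time=' | sed 's/.*time=//'`
    echo "$host: $result"
done
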

When the admin showed me the script in use, one column had rather high ping times (around
20ms). Logging into the host with rlogin, ps showed everything was ok: a complex process was
merely consuming a lot of CPU time, giving a slower network response.

System Status and User Status.

The rup command offers an immediate overview of current system states, eg.:

yoda # rup
yoda.comp.uclan.ac.u up 6 days, 8:25, load average: 0.33, 0.36,
0.35
gate-yoda.comp.uclan up 6 days, 8:25, load average: 0.33, 0.36,
0.35
wolfen.comp.uclan.ac up 11:28, load average: 0.00, 0.00,
0.00
conan.comp.uclan.ac. up 11:28, load average: 0.06, 0.01,
0.00
akira.comp.uclan.ac. up 11:28, load average: 0.01, 0.00,
0.00
nikita.comp.uclan.ac up 11:28, load average: 0.03, 0.00,
0.00
gibson.comp.uclan.ac up 11:28, load average: 0.00, 0.00,
0.00
woo.comp.uclan.ac.uk up 11:28, load average: 0.01, 0.00,
0.00
solo.comp.uclan.ac.u up 11:28, load average: 0.00, 0.00,
0.00
cameron.comp.uclan.a up 11:28, load average: 0.02, 0.00,
0.00
sevrin.comp.uclan.ac up 11:28, load average: 0.69, 0.46,
0.50
ash.comp.uclan.ac.uk up 11:28, load average: 0.00, 0.00,
0.00
ridley.comp.uclan.ac up 11:28, load average: 0.00, 0.00,
0.00
leon.comp.uclan.ac.u up 11:28, load average: 0.00, 0.00,
0.00
warlock.comp.uclan.a up 1:57, load average: 0.08, 0.13,
0.11
milamber.comp.uclan. up 9:52, load average: 0.11, 0.07,
0.00
merlin.comp.uclan.ac up 11:28, load average: 0.01, 0.00,
0.00
indiana.comp.uclan.a up 11:28, load average: 0.00, 0.00,
0.02
stanley.comp.uclan.a up 1:56, load average: 0.00, 0.00,
0.00

Figure 55. The output from rup.

The load averages for a single machine can be ascertained by running 'uptime' on that machine,
eg.:

MILAMBER 84# uptime


8:05pm up 10:28, 6 users, load average: 0.07, 0.06, 0.25
MILAMBER 85# rsh yoda uptime
8:05pm up 6 days, 9:02, 2 users, load average: 0.47, 0.49, 0.42

Figure 56. The output from uptime.

The w command displays current system activity, including what each user is doing. The man
page says, "The heading line shows the current time of day, how long the system has been up,
the number of users logged into the system, and the load averages." For example:

yoda # w
8:10pm up 6 days, 9:07, 2 users, load average: 0.51, 0.50, 0.41
User tty from login@ idle JCPU PCPU what
root q0 milamber.comp. 7:02pm 8 w
cmprj ftp UNKNOWN@ns5ip. 7:29pm -

Figure 57. The output from w showing current user activity.

With the -W option, w shows the 'from' information on a separate line, allowing one to see the
full domain address of ftp connections, etc.:

yoda # w -W
8:11pm up 6 days, 9:08, 2 users, load average: 0.43, 0.48, 0.40
User tty login@ idle JCPU PCPU what
root ttyq0 7:02pm 8 w -W
milamber.comp.uclan.ac.uk
cmprj ftp22918 7:29pm -
UNKNOWN@ns5ip.uclan.ac.uk
Figure 58. Obtaining full domain addresses from w with the -W option.

The rusers command broadcasts to all machines on the local network, gathering data about who
is logged on and where, eg.:

yoda # rusers
yoda.comp.uclan.ac.uk root
wolfen.comp.uclan.ac.uk guest guest
gate-yoda.comp.uclan.ac.uk root
milamber.comp.uclan.ac.uk root root root root mapleson mapleson
warlock.comp.uclan.ac.uk sensjv sensjv

Figure 59. The output from rusers, showing who is logged on where.

The multiple entries for certain users indicate that more than one shell is active for that user. As
usual, my login data shows I'm doing several things at once.

rusers can be modified with options to:

 report for all machines, whether users are logged in or not (-a),
 probe a specific machine (supply host name(s) as arguments),
 display the information sorted alphabetically by:
o host name (-h),
o idle time (-i),
o number of users (-u),
 give a more detailed output in the same style as the who command (-l).

Service Availability.

The most obvious way to check if a service is available for use by users is to try and use the
service, eg. ftp or telnet to a test location, run up a Netscape session and enter a familiar URL,
send an email to a local or remote account, etc. The ps command can be used to make sure the
relevant background process is running for a service too, eg. 'nfsd' for the NFS system. However,
if a service is experiencing problems, simply attempting to use the service will not reveal what
may be wrong.

For example, if one cannot ftp, it could be because of anything from a loose cable connection to
some remote server that's gone down.

The ping command is useful for an immediate check of network-related services such as ftp,
telnet, WWW, etc. One pings each host in the communication chain to see if the hosts respond. If
a host somewhere in the chain does not respond, then that host may be preventing any data from
getting through (eg. a remote proxy server is down).

A useful command one can use to aid in such detective work is traceroute. This command sends
test packets in a similar way to ping, but it also reports how the test packets reached the target
site at each stage of the communication chain, showing response times in milliseconds for each
step, eg.:

yoda # traceroute www.cee.hw.ac.uk


traceroute to osiris.cee.hw.ac.uk (137.195.52.12), 30 hops max, 40 byte
packets
1 193.61.250.33 (193.61.250.33) 6 ms (ttl=30!) 3 ms (ttl=30!) 4 ms
(ttl=30!)
2 193.61.250.65 (193.61.250.65) 5 ms (ttl=29!) 5 ms (ttl=29!) 5 ms
(ttl=29!)
3 gw-mcc.netnw.net.uk (194.66.24.1) 9 ms (ttl=28!) 8 ms (ttl=28!) 10 ms
(ttl=28!)
4 manchester-core.ja.net (146.97.253.133) 12 ms 11 ms 9 ms
5 scot-pop.ja.net (146.97.253.42) 15 ms 13 ms 14 ms
6 146.97.253.34 (146.97.253.34) 20 ms 15 ms 17 ms
7 gw1.hw.eastman.net.uk (194.81.56.110) 20 ms (ttl=248!) 18 ms 14 ms
8 cee-gw.hw.ac.uk (137.195.166.101) 17 ms (ttl=23!) 31 ms (ttl=23!) 18
ms (ttl=23!)
9 osiris.cee.hw.ac.uk (137.195.52.12) 14 ms (ttl=56!) 26 ms (ttl=56!)
30 ms (ttl=56!)
If a particular step shows a sudden jump in response time, then there may be a communications
problem at that step, eg. the host in question may be overloaded with requests, suffering from lack of
communications bandwidth, CPU processing power, etc.

At a lower level, system services often depend on background system processes, or daemons. If
these daemons are not running, or have shut down for some reason, then the service will not be
available.

On the SGI Indys, one example is the GUI service which handles the use of on-screen icons. The
daemon responsible is called objectserver. Older versions of this particular daemon can
occasionally shut down if an illegal iconic operation is performed, or if the file manager daemon
experiences an error. With no objectserver running, the on-screen icons disappear.

Thus, a typical task might be to periodically check to make sure the objectserver daemon is
running on all relevant machines. If it isn't, then the command sequence:

/etc/init.d/cadmin stop
/etc/init.d/cadmin start

restarts the objectserver. Once running, the on-screen icons return.
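
A sketch of such a periodic check (assuming the daemon shows up under the name 'objectserver'
in the ps output):

#!/bin/sh
# Restart the desktop object server if it is not currently running.
if ps -ef | grep objectserver | grep -v grep > /dev/null
then
    : # objectserver is running - nothing to do
else
    /etc/init.d/cadmin stop
    /etc/init.d/cadmin start
fi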

A common cause of objectserver shutting down is when a user's desktop layout configuration
files (contained in .desktop- directories) become corrupted in some way, eg. edited by hand in an
incorrect manner, or mangled by some other operation (eg. a faulty Java script from a home
made web page). One solution is to erase the user's desktop layout configuration directory, then
login as the user and create a fresh .desktop- directory.

objectserver is another example of UNIX GUI evolution. In 1996 SGI decided to replace the
objectserver system entirely in IRIX 6.3 (and later) with a new service that was much more
reliable, less likely to be affected by errors made in other applications, and fully capable of
supporting new 'webified' iconic services such as on-screen icons that are direct links to ftp,
telnet or WWW sites.

In general, checking the availability of a service requires one to check that the relevant daemons
are running, that the appropriate configuration files are in place, accessible and have the correct
settings, that the relevant daemon is aware of any changes which may have been made (perhaps
the service needs to be stopped and restarted?) and to investigate via online information what
may have caused services to fail as and when incidents occur. For every service one can use, the
online information explains how to setup, admin and troubleshoot the service. The key is to
know where to find that information when it is needed.

A useful source of constantly updated status information is the /var/adm/SYSLOG file. This file
is where any important system events are logged. One can configure all the various services and
daemons to log different degrees of detailed information in the SYSLOG file. Note: logging too
much detail can cause the log file to grow very quickly, in which case one would also have to
ensure that it did not consume valuable disk space. The SYSLOG file records user logins,
connections via ftp, telnet, etc., messages logged at system bootup/shutdown time, and many
other things.
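
For example, one can watch new entries arrive, or search the log for particular kinds of event
(the exact message text varies between services and systems):

% tail -f /var/adm/SYSLOG (follow new messages as they are logged)
% grep -i login /var/adm/SYSLOG (show recorded login-related messages)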

Vendor Information Updates.

Most UNIX vendors send out periodic information booklets containing indepth articles on
various system administration issues. SGI's bulletin is called Pipeline. Such information guides
are usually supplied as part of a software support contract, though the vendor will often choose
to include copies on the company web site. An admin should read any relevant articles from
these guides - they can often be unexpectedly enlightening.

System Hardware Failures.

When problems occur on a system, what might at first appear to be a software problem may in
fact be a hardware fault. Has a disk failed? The fx program can be used to check disk status
(block read tests, disk label checks, etc.)

Has a network cable failed? Are all the cable connections firmly in place in the hub? Has a plug
come loose?

In late 1998, the Ve24 network stopped operating quite unexpectedly one morning. The errors
made it appear that there was a problem with the NFS service or perhaps the main user files disk
connected to the server; in fact, the fault lay with the Ve24 hub.

The online guides have a great deal of advice on how to spot possible hardware failures. My
advice is to check basic things first and move onto the more complex possibilities later. In the
above example, I wasted a great deal of time investigating whether the NFS service was
responsible, or the external user files disk, when in fact I should have checked the hub
connections first. As it happens, the loose connection was such that the hub indicator light was
on even though the connection was not fully working (thus, a visual check would not have
revealed the problem) - perhaps the fault was caused by a single loose wire out of the 8 running
through the cable, or even an internal fault in the hub (more likely). Either way, the hub was
eventually replaced.

Other things that can go wrong include memory faults. Most memory errors are not hardware
errors though, eg. applications with bugs can cause errors by trying to access some protected area
of memory.

Hardware memory errors will show up in the system log file /var/adm/SYSLOG as messages
saying something like 'Hardware ECC Memory Error in SIMM slot 4'. By swapping the memory
SIMMs around between the slots, one can identify which SIMM is definitely at fault (assuming
there is only one causing the problem).

The most common hardware component to go wrong on a system, even a non-PC system, is the
disk drive. When configuring systems, or carrying out upgrades/expansions, it is wise to stick
with models recommended by the source vendor concerned, eg. SGI always uses high-quality
Seagate, IBM or Quantum disk drives for their systems; thus, using (for example) a Seagate
drive is a good way to ensure a high degree of reliability and compatibility with the system
concerned.

Sometimes an admin can be the cause of the problem. For example, when swapping disks around
or performing disk tasks such as disk cloning, it is possible to incorrectly set the SCSI ID of the
disk. SGI systems expect the system disk to be on SCSI ID 1 (though this is a configurable
setting); if the internal disk is on the wrong SCSI ID, then under certain circumstances it can
appear to the system as if there are multiple disks present, one on each possible ID. If hardware
errors are observed on bootup (the system diagnostics checks), then the first thing to do is to
reboot and enter the low-level 'Command Monitor' (an equivalent access method will exist for all
UNIX systems): the Command Monitor has a small set of commands available, some of which
can be used to perform system status checks, eg. the hinv command. For the problem described
above, hinv would show multiple instances of the same disk on all SCSI IDs from 1 to 7 - the
solution is to power down and check the SCSI jumpers carefully.

Other problems can occasionally be internal, eg. a buildup of dust blocking air vents (leading to
overheating), or a fan failure, followed by overheating and eventually an automatic system
shutdown (most UNIX systems' power supplies include circuitry to monitor system temperature,
automatically shutting down if the system gets too hot). This leads on to questions of system
maintenance which will be dealt with on Day 3.

After disk failures, the other most common failure is the power supply. It can sometimes be
difficult to spot because a failure overnight or when one isn't around can mean the system shuts
down, cools off and is thus rebootable again the next morning. All the admin sees is a system
that's off for no readily apparent reason the next morning. The solution is to, for example, move
the system somewhere close at hand so that it can be monitored, or write a script which tests
whether the system is active every few seconds, logging the time of each successful test - if the
system goes down, the admin is notified in some way (eg. audio sound file played) and the
admin can then quickly check the machine - if the power supply area feels overly hot, then that is
the likely suspect, especially if an off/on mains switch toggle doesn't turn the system back on
(power supplies often have circuitry which will not allow power-on if the unit is still too hot). If
the admin wasn't available at the time, then the logged results can show when the system failed.
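
A minimal sketch of such a monitoring script is given below (the host name, test interval and log
file location are arbitrary examples, and it assumes a ping which accepts a packet-count option;
the notification step is left as a placeholder comment since the choice of alert mechanism, eg.
which audio player to use, is system-specific):

#!/bin/sh
# Check a host every 30 seconds; log successes, flag failures.
HOST=sevrin
LOG=/var/tmp/uptest.log
while true
do
    if ping -c 1 $HOST > /dev/null 2>&1; then
        echo "$HOST alive at `date`" >> $LOG
    else
        echo "$HOST NOT RESPONDING at `date`" >> $LOG
        # notify the admin here, eg. play an audio alert file
    fi
    sleep 30
done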

All SGIs (and UNIX systems in general) include a suite of hardware and software diagnostics
tests as part of the OS. IRIX contains a set of tests for checking the mouse, keyboard, monitor,
audio ports, digital camera and other basic hardware features.

Thankfully, for just about any hardware failure, hardware support contracts cover repairs and/or
replacements very effectively for UNIX systems. It's worth noting that although the Computing
Department has a 5-day support contract with SGI, all problems I've encountered so far have
been dealt with either on the same day or early next morning by a visiting support engineer (ie. they
arrived much earlier than they legally had to). Since November 1995 when I took charge of the
Ve24 network, the hardware problems I've encountered have been:

 2 failed disks
 1 replaced power supply
 2 failed memory SIMMs (1 failed SIMM from two different machines)
 1 replaced keyboard (user damage)
 1 failed monitor
 1 suspect motherboard (replaced just in case)
 1 suspect video card (replaced just in case)
 1 problematic 3rd-party disk (incorrect firmware, returned to supplier and corrected with up-to-
date firmware; now operating ok)
 1 suspect hub (unknown problem; replaced just in case)

Given that the atmosphere in Ve24 is unfiltered, often humid air, and the fact that many of the
components in the Indys in Ve24 have been repeatedly swapped around to create different
configurations at different times, such a small number of failures is an excellent record after nearly 4
years of use.

It is likely that dirty air (dust, humidity, corrosive gases) was largely responsible for the disk,
power supply and memory failures - perhaps some of the others too. A build up of dust can
combine with airborne moisture to produce corrosive chemicals which can short-circuit delicate
components.

To put the above list another way: 14 out of the 18 Indys have been running non-stop for 3.5
years without a single hardware failure of any kind, despite being housed in an area without
filtered air or temperature control. This is very impressive and is quite typical of UNIX hardware
platforms.

Installing systems with proper air filtering and temperature control can be costly, but the benefit
may be a much reduced chance of hardware failure - this could be important for sites with many
more systems and a greater level of overall system usage (eg. 9 to 5 for most machines).

Some companies go to great lengths to minimise the possibility of hardware failure. For
example, MPC [2] has an Origin200 render farm for rendering movie animation frames. The
render farm consists of 50 Origin200 servers, each with 2 R10000 CPUs, ie. 100 CPUs in total.
The system is housed in a dedicated room with properly filtered air and temperature control.
Almost certainly as a result of this high-quality setup, MPC has never had a single hardware
failure of any kind in nearly 3 years of operation. Further, MPC has not experienced a single OS
failure over the same period either, even though the system operates 24 hours/day.

This kind of setup is common amongst companies which have time-critical tasks to perform, eg.
oil companies with computational models that can take six months to complete - such
organisations cannot afford to have failures (the problem would likely have to be restarted from
scratch, or at least delayed), so it's worth spending money on air filters, etc.

If one does not have filtered air, then the very least one should do is keep the systems clean
inside and out, performing system cleaning on a regular basis.

At present, my current policy is to thoroughly clean the Indys twice a year: every machine is
stripped right down to the bare chassis; every component is individually cleaned with appropriate
cleaning solutions, cloths, air-dusters, etc. (this includes removing every single key from all the
keyboards and mass-cleaning them with a bucket of hot water and detergent! And cleaning the
keyboard bases inside and out too). Aside from these major bi-annual cleanings, simple regular
cleaning is performed on a weekly or monthly basis: removing dirt from the mice (inside
especially), screen, chassis/monitor surface, cables and so on; cleaning the desks; opening each
system and blowing away internal dust using a can of compressed filtered air, etc.

Without a doubt, this process greatly lengthens the life-span of the systems' hardware
components, and users benefit too from a cleaner working environment - many new students
each autumn often think the machines must be new because they look so clean.

Hardware failures do and will occur on any system whether it's a UNIX platform or not. An
admin can use information from online sources, combined with a knowledge of relevant system
test tools such as fx and ping, to determine the nature of hardware failures and take corrective
action (contacting vendor support if necessary); such a strategy may include setting up
automated hardware tests using regularly-executed scripts.

Another obvious source of extensive information about any UNIX platform is the Internet.
Hundreds of existing users, including company employees, write web pages [3] or USENET
posts describing their admin experiences and how to deal with typical problems.

Suspicious/Illegal Activity.

Users inevitably get up to mischief on occasion, or external agencies may attempt to hack the
system. Types of activity could include:

 users downloading illegal or prohibited material, either with respect to national/local laws or
internal company policy,
 accessing of prohibited sites, eg. warez software piracy sites,
 mail spamming and other abuses of Internet services,
 attacks by hackers,
 misuse/abuse of system services internally.

There are other possibilities, but these are the main areas. This lecture is an introduction to security and
monitoring issues. A more in-depth discussion is given in the last lecture.

As an admin who is given the task of supposedly preventing and/or detecting illegal activities,
the first thing which comes to mind is the use of various file-searching methods to locate suspect
material, eg. searching every user's netscape bookmarks file for particular keywords. However,
this approach can pose legal problems.

Some countries have data protection and/or privacy laws [4] which may prohibit one from
arbitrarily searching users' files. Searches of this type are the equivalent of a police force tapping
all the phones in an entire street and recording every single conversation just on the off-chance
that they might record something interesting; such methods are sometimes referred to as 'fishing'
and could be against the law. So, for example, the following command might be illegal:

find /home/students -name "*" -print > list

grep sex list > suspected
grep warez list >> suspected
grep xxx list >> suspected
grep pics list >> suspected
grep mpg list >> suspected
grep jpg list >> suspected
grep gif list >> suspected
grep sites list >> suspected

As a means of finding possible policy violations, the above script would be very effective, but
it's definitely a form of fishing (even the very first line).

Now consider the following:

find /home/students -name "bookmarks.html" -print -exec grep playboy {} \;

This command will effectively locate any Netscape bookmarks file which contains a possible
link to the PlayBoy web site. Such a command is clearly looking for fairly specific content in a
very specific file in each user's .netscape directory; further, it is probably accessing a user's
account space without her or his permission (this opens the debate on whether 'root' even needs a
user's permission since root actually owns all files anyway - more on this below).

The whole topic of computer file access is a grey area. For example, might the following
command also be illegal?

find . -name "*.jpg" -print > results && grep sex results

A user's lawyer could argue that it's clearly looking for any JPEG image file that is likely to be of
an explicit nature. On the other hand, an admin's lawyer could claim the search was actually
looking for any images relating to tourism in Sussex county, or musical sextets, or adverts for
local unisex hair salons, and just accidentally happened to be in a directory above /home/students
when the command was executed (the find would eventually reach /home/students). Obviously a
setting for a messy court-room battle.

But even ignoring actions taken by an admin using commands like find, what about data
backups? An extremely common practice on any kind of computer system is to backup user data
to a media such as DAT on a regular basis - but isn't this accessing user files without permission?
But hang on, on UNIX systems, the root user is effectively the absolute owner of any file, eg.
suppose a file called 'database' in /tmp, owned by an ordinary user, contained some confidential
data; if the admin (logged in as root) then did this:

cat /tmp/database

the contents of the database file would indeed be displayed.

Thus, since root basically owns all files anyway by default, surely a backup procedure is just the
root user archiving files it already owns? If so, does one instead have to create some abstract
concept of ownership in order to offer users a concrete concept of what data privacy actually is?
Who decides? Nations which run their legal systems using case-law will find these issues very
difficult to clarify, eg. the UK's Data Protection Act is known to be 'weak'.

Until such arguments are settled and better laws created, it is best for an admin to err on the side
of caution. For example, if an admin wishes to have some kind of regular search conducted, the
existence of the search should be included as part of stated company policy, and enshrined into
any legal documents which users must sign before they begin using the system, ie. if a user signs
the policy document, then the user has agreed to the actions described in that document. Even
then, such clauses may not be legally binding. An admin could also set up some form of login
script which would require users to agree to a system usage policy before they were fully
logged in.

However, these problems won't go away, partly because of the specifics of how some modern
Internet services such as the web are implemented. For example, a user could access a site which
automatically forces the pop-up of a Netscape window which is directed to access a prohibited
site; inline images from the new site will then be present in the user's Netscape cache directory in
their home account area even though they haven't specifically tried to download anything. Are
they legally liable? Do such files even count as personal data? And if the site has its own proxy
server, then the images will also be in the server's proxy cache - are those responsible for the
server also liable? Nobody knows. Legal arguments on the nature of cache directories and other
file system details have not yet been resolved. Clearly, there is a limit to how far one can go in
terms of prevention simply because of the way computing technologies work.

Thus, the best thing to do is to focus efforts on information that does not reside inside user
accounts. The most obvious place is the system log file, /var/adm/SYSLOG. This file will show
all the ftp and telnet sites which users have been accessing; if one of these sites is a prohibited
place, then that is sufficient evidence to take action.
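
For example, searches along the following lines would pull out the relevant entries (the daemon
names as they appear in the log may differ slightly between UNIX variants):

grep -i ftpd /var/adm/SYSLOG
grep -i telnetd /var/adm/SYSLOG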

The next most useful data resource to keep an eye on is the web server log(s). The web logs
record every single access by all users to the WWW. Users have their own record of their
accesses in the form of a history file, hidden in their home directory inside the .netscape
directory (or other browser); but the web logs are outside their accounts and so can probably
be freely examined, searched, processed, etc. by an admin without having to worry about legal
issues. Even here though, there may be legal issues, eg. log data often includes user IDs which
can be used to identify specific individuals and their actions - does a user have a legal right to
have such data kept private? Only a professional lawyer in the field would know the correct
answer.

Note: the amount of detail placed into a log file can be changed to suit the type of logging
required. If a service offers different levels of logging, then the appropriate online documentation
will explain how to alter the settings.

Blocking Sites.

If an admin does not want users to be able to access a particular site, then that site can be added
to a list of 'blocked' sites by using the appropriate option in the web server software concerned,
eg. Netscape Enterprise Server, CERN web server, Apache web server, etc. Even this may pose
legal problems if a country has any form of freedom-of-speech legislation though (non-existent
in the UK at present, so blocking sites should be legally OK in the UK).

However, blocking sites can become somewhat cumbersome because there are thousands of web
sites which an admin could theoretically have to deal with - once the list becomes quite large,
web server performance decreases as every access has to have its target URL checked against the
list of banned sites. So, if an admin does choose to use such a policy, it is best to only add sites
when necessary, and to construct some kind of checking system so that if no attempt is made to
access a blocked site after a duration of, say, two weeks (whatever), then that site is removed
from the list of blocked sites. In the long term, such a policy should help to keep the list to a
reasonably manageable size. Even so, just the act of checking the web logs and adding sites to
the list could become a costly time-consuming process (time = money = wages).

One can also use packet filtering systems such as hardware routers or software daemons like
ipfilterd which can accept, reject, or reject-and-log incoming packets based on source/destination
IP address, host name, network interface, port number, or any combination of these. Note that
daemons such as ipfilterd may require the presence of a fast CPU if the overhead from a busy
site is to be properly supported. The ipfilterd system is discussed in detail on Day 3.

System Temporary Directories.

An admin should keep a regular eye on the contents of temporary directories on all systems, ie.
/tmp and /var/tmp. Users may download material and leave the material lying around for anyone
to see. Thus, a suspicious file can theoretically be traced to its owner via the user ID and group
ID of the file. I say theoretically because, as explained elsewhere, it is possible for a user X to
download a file (eg. by ftp so as to avoid the web logs, or by telnet using a shell on a remote
system) and then 'hand over' ownership of the file to someone else (say user Y) using the chgrp
and chown commands, making it look as though a different user is responsible for the file. In that
sense, files found outside a user's home directory could not normally be used as evidence, though
they would at least alert the admin to the fact that suspect activities may be occurring, permitting
a refocusing of monitoring efforts, etc.
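
In practical terms, a simple way to spot such material in the first place is with find, eg. to list
(with ownership details) every file in the temporary directories which is not owned by root:

find /tmp /var/tmp -type f ! -user root -exec ls -l {} \;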

However, one way in which it could be possible to reinforce such evidence is by being able to
show that user Y was not logged onto the system at the time when the file in question was
created (this information can be gleaned from the system's own local /var/adm/SYSLOG file,
and the file's creation time and date).

Unfortunately, both users could have been logged onto the same system at the time of the file's
creation. Thus, though a possibility, the extra information may not help. Except in one case:
video evidence. If one can show by security camera recordings that user X did indeed login 'on
console' (ie. at the actual physical keyboard) then that can be tied in with SYSLOG data plus file
creation times, irrespective of what user Y was doing at the time.

Certainly, if someone wished to frame a user, it would not be difficult to cause a considerable
amount of trouble for that user with just a little thought on how to access files, where to put
them, changing ownership, etc.

In reality, many admins probably just do what they like in terms of searching for files, examining
users' areas, etc. This is because there is no way to prove someone has attempted to search a
particular part of the file system - UNIX doesn't keep any permanent record of executed
commands.

Ironically, the IRIX GUI environment does keep a record of any file-related actions taken with
the GUI system (icons, file manager windows, directory views, etc.) but the log file with this
information is kept inside the user's .desktop- directory and thus may be legally out of bounds.

File Access Permissions.

Recall the concept of file access permissions for files. If a user has a directory or file with its
permissions set so that another ordinary user can read it (ie. not just root, who can access
anything by default anyway), does the fact that the file is globally readable mean the user has by
default given permission for anyone else to read the file? If one says no, then that would mean it
is illegal to read any user's own public_html web area! If one says yes, and a legal body
confirmed this for the admin, then that would at least enable the admin to examine any directory
or file that had the groups and others permissions set to a minimum of read-only (read and
executable for directories).

The find command has an option called -perm which allows one to search for files with
permissions matching a given mode. If nothing else, such an ability would catch out careless
users since most users are not aware that their account has hidden directories such as .netscape.
An admin ought to make users aware of security issues beforehand though.
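
For example, the following commands would list any files readable by others, and any directories
which others can both read and search, under /home/students (the octal modes used with -perm
are standard, but the exact behaviour is described in the find man page):

find /home/students -type f -perm -004 -print
find /home/students -type d -perm -005 -print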

Backup Media.

Can an admin search the data residing on backup media? (DAT, CDR, ZIP, DLT, etc.) After all,
the data is no longer inside the normal home account area. In my opinion yes, since root owns all
files anyway (though I've never done such a search), but others might disagree. For that matter,
consider the tar commands commonly used to perform backups: a full backup accesses every file
on the file system by default (ie. including all users' files, whatever the permissions may be), so
are backups a problem area?

Yet again, one can easily see how legal grey areas emerge concerning the use of computing
technologies.

Conclusion.

Until the law is made clearer and brought up-to-date (unlikely) the best an admin can do is
consult any available internal legal team, deciding policy based on any advice given.

References:

1. "Ethernet Collisions on Silicon Graphics Systems", SGI Pipeline magazine (support info bulletin),
July/August 1998 (NB: URL not accessible to those without a software support contract):
2. http://support.sgi.com/surfzone/content/pipeline/html/19980404EthernetC
ollisions.html

3. The Moving Picture Company, Soho Square, London. Responsible for some or all of the special
effects in Daylight, The English Patient, Goldeneye, The Borrowers, and many other feature
films, music videos, adverts, etc. Hardware used: several dozen Octane workstations, many
Onyx2 graphics supercomputers, a 6.4TB Ampex disk rack with real-time Philips cinescan film-to-
digital-video converter (cinema resolution 70mm uncompressed video converter; 250K's worth),
Challenge L / Discrete Logic video server, a number of O2s, various older SGI models such as
Onyx RealityEngine2, Indigo2, Personal IRIS, etc., some high-end Apple Macs and a great deal of
dedicated video editing systems and VCRs, supported by a multi-gigabit network. I saw one NT
system which the admin said nobody used.

4. The SGI Tech/Advice Centre: Holland #1:


http://www.futuretech.vuurwerk.nl/
5. Worldwide Mirror Sites: Holland #2: http://sgi.webguide.nl/
6. Holland #3: http://sgi.webcity.nl/
7. USA: http://sgi.cartsys.net/
8. Canada: http://sgi-tech.unixology.com/
9.
10. The Data Protection Act 1984, 1998. Her Majesty's Stationary Office (HMSO):
http://www.hmso.gov.uk/acts/acts1984/1984035.htm
Detailed Notes for Day 2 (Part 4)
UNIX Fundamentals: Further Shell scripts.

for/do Loops.

The rebootlab script shown earlier could be rewritten using a for/do loop, a control structure
which allows one to execute a series of commands many times.

Rewriting the rebootlab script using a for/do loop doesn't make much difference to the
complexity of this particular script, but using more sophisticated shell code is worthwhile when
one is dealing with a large number of systems. Other benefits arise too; a suitable summary is
given at the end of this discussion.

The new version could be rewritten like this:

#!/bin/sh
for machine in akira ash cameron chan conan gibson indiana leon merlin \
               nikita ridley sevrin solo stanley warlock wolfen woo
do
echo $machine
rsh $machine init 6&
done

The '\' symbol is used to continue a line onto the next line. The 'echo' line displays a comment as
each machine is dealt with.

This version is certainly shorter, but whether or not it's easier to use in terms of having to modify
the list of host names is open to argument, as opposed to merely commenting out the relevant
lines in the original version. Even so, if one happened to be writing a script that was fairly
lengthy, eg. 20 commands to run on every system, then the above format is obviously much
more efficient.

Similarly, the remountmapleson script could be rewritten as follows:

#!/bin/sh
for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
               nikita ridley sevrin solo stanley warlock wolfen woo
do
echo $machine
rsh $machine "umount /mapleson && mount /mapleson"
done

Note that in this particular case, the command to be executed must be enclosed within quotes in
order for it to be correctly sent by rsh to the remote system. Quotes like this are normally not
needed; it's only because rsh is being used in this example that quotes are required.
Also note that the '&' symbol is not used this time. This is because the rebootlab procedure is
asynchronous, whereas I want the remountdir script to output its messages just one action at a
time.

In other words, for the rebootlab script, I don't care in what order the machines reboot, so each
rsh call is executed as a background process on the remote system, thus the rebootlab script
doesn't wait for each rsh call to return before progressing.

By contrast, the lack of a '&' symbol in remountdir's rsh command means the rsh call must finish
before the script can continue. As a result, if an unexpected problem occurs, any error message
will be easily noticed just by watching the output as it appears.

Sometimes a little forward thinking can be beneficial; suppose one might have reason to want to
do exactly the same action on some other NFS-mounted area, eg. /home, or /var/mail, then the
script could be modified to include the target directory as a single argument supplied on the
command line. The new script looks like this:

#!/bin/sh
for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
               nikita ridley sevrin solo stanley warlock wolfen woo
do
echo $machine
rsh $machine "umount $1 && mount $1"
done

The script would probably be renamed to remountdir (whatever) and run with:

remountdir /mapleson

or perhaps:

remountdir /home

if/then/else constructs.

But wait a minute, couldn't one use the whole concept of arguments to solve the problem of
communicating to the script exactly which hosts to deal with? Well, a rather useful feature of any
program is that it will always return a result of some kind. Whatever the output actually is, a
command always returns a result which is defined to be true or false in some way.

Consider the following command:

grep target database


If grep doesn't find 'target' in the file 'database', then no output is given. However, as a program
that has been called, grep has also returned a non-zero exit status, ie. a value of 'FALSE' - the fact
that grep does this is simply invisible during normal usage of the command.
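
The exit status of the most recently executed command can be examined directly via the special
shell variable $?, eg.:

grep target database
echo $?

The echo prints 0 (success, ie. 'TRUE') if a match was found, or a non-zero value otherwise.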

One can exploit this behaviour to create a much more elegant script for the remountdir
command. Firstly, imagine that I as an admin keep a list of currently active hosts in a file called
'live' (in my case, I'd probably keep this file in /mapleson/Admin/Machines). So, at the present
time, the file would contain the following:

yoda
akira
ash
cameron
chan
conan
gibson
indiana
leon
merlin
nikita
ridley
sevrin
solo
stanley
warlock
wolfen
woo

ie. the host called spock is not listed.

The remountdir script can now be rewritten using an if/then construct:

#!/bin/sh
for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
               spock nikita ridley sevrin solo stanley warlock wolfen woo
do
echo Checking $machine...
if grep $machine /mapleson/Admin/Machines/live; then
echo Remounting $1 on $machine...
rsh $machine "umount $1 && mount $1"
fi
done

This time, the complete list of hosts is always used in the script, ie. once the script is rewritten, it
doesn't need to be altered again. For each machine, the grep command searches the 'live' file for
the target name; if it finds the name, then the result is some output to the screen from grep, but
also a 'TRUE' condition, so the echo and rsh commands are executed. If grep doesn't find the
target host name in the live file then that host is ignored.
The result is a much more elegant and powerful script. For example, suppose some generous
agency decided to give the department a large amount of money for an extra 20 systems: the only
changes required are to add the names of the new hosts to remountdir's initial list, and to add the
names of any extra active hosts to the live file. Along similar lines, when spock finally is
returned to the lab, its name would be added to the live file, causing remountdir to deal with it in
the future.

Even better, each system could be setup so that, as long as it is active, the system tells the server
every so often that all is well (a simple script could achieve this). The server brings the results
together on a regular basis, constantly keeping the live file up-to-date. Of course, the server
includes its own name in the live file. A typical interval would be to update the live file every
minutes. If an extra program was written which used the contents of the live file to create some
kind of visual display, then an admin would know in less than a minute when a system had gone
down.

Naturally, commercial companies write professional packages which offer these kinds of
services and more, with full GUI-based monitoring, but at least it is possible for an admin to
create home-made scripts which would do the job just as well.

/dev/null.

There is still an annoying feature of the script though: if grep finds a target name in the live file,
the output from grep is visible on the screen which we don't really want to see. Plus, the umount
command will return a message if /mapleson wasn't mounted anyway. These messages clutter up
the main 'trace' messages.

To hide the messages, one of UNIX's special device files can be used. Amongst the various
device files in the /dev directory, one particularly interesting file is called /dev/null. This device
is known as a 'special' file; any data sent to the device is discarded, and the device always returns
zero bytes. Conceptually, /dev/null can be regarded as an infinite sponge - anything sent to it is
just ignored. Thus, for dealing with the unwanted grep output, one can simply redirect grep's
output to /dev/null.
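
For example (the second line also silences error messages by redirecting standard error, which
would hide the harmless umount complaint - but genuine errors too, which is why the final
version of the script below leaves the rsh output visible):

grep $machine /mapleson/Admin/Machines/live > /dev/null
rsh $machine "umount $1 && mount $1" > /dev/null 2>&1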

The vast majority of system script files use this technique, often many times even in a single
script.

Note: descriptions of all the special device files /dev are given in Appendix C of the online book,
"IRIX Admin: System Configuration and Operation".

Since grep returns nothing if a host name is not in the live file, a further enhancement is to
include an 'else' clause as part of the if construct so that a separate message is given for hosts that
are currently not active. Now the final version of the script looks like this:

#!/bin/sh
for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
               spock nikita ridley sevrin solo stanley warlock wolfen woo
do
echo Checking $machine...
if grep $machine /mapleson/Admin/Machines/live > /dev/null; then
echo Remounting $1 on $machine...
rsh $machine "umount $1 && mount $1"
else
echo $machine is not active.
fi
done

Running the above script with 'remountdir /mapleson' gives the following output:

Checking yoda...
Remounting /mapleson on yoda...
Checking akira...
Remounting /mapleson on akira...
Checking ash...
Remounting /mapleson on ash...
Checking cameron...
Remounting /mapleson on cameron...
Checking chan...
Remounting /mapleson on chan...
Checking conan...
Remounting /mapleson on conan...
Checking gibson...
Remounting /mapleson on gibson...
Checking indiana...
Remounting /mapleson on indiana...
Checking leon...
Remounting /mapleson on leon...
Checking merlin...
Remounting /mapleson on merlin...
Checking spock...
spock is not active.
Checking nikita...
Remounting /mapleson on nikita...
Checking ridley...
Remounting /mapleson on ridley...
Checking sevrin...
Remounting /mapleson on sevrin...
Checking solo...
Remounting /mapleson on solo...
Checking stanley...
Remounting /mapleson on stanley...
Checking warlock...
Remounting /mapleson on warlock...
Checking wolfen...
Remounting /mapleson on wolfen...
Checking woo...
Remounting /mapleson on woo...

Notice the output from grep is not shown, and the different response given when the script deals
with the host called spock.

Scripts such as this typically take around a minute or so to execute, depending on how quickly
each host responds.

The rebootlab script can also be rewritten along similar lines to take advantage of the new 'live'
file mechanism, but with an extra if/then structure to exclude yoda (the rebootlab script is only
meant to reboot the lab machines, not the server). The extra if/then construct uses the 'test'
command to compare the current target host name with the word 'yoda' - the rsh command is
only executed if the names do not match; otherwise, a message is given stating that yoda has
been excluded. Here is the new rebootlab script:

#!/bin/sh
for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
               spock nikita ridley sevrin solo stanley warlock wolfen woo
do
echo Checking $machine...
if grep $machine /mapleson/Admin/Machines/live > /dev/null; then
if test $machine != yoda; then
echo Rebooting $machine...
rsh $machine init 6&
else
echo Yoda excluded.
fi
else
echo $machine is not active.
fi
done

Of course, an alternative way would be to simply exclude 'yoda' from the opening 'for' line.
However, one might prefer to always use the same host name list in order to minimise the
amount of customisation between scripts, ie. to create a new script just copy an existing one and
modify the content after the for/do structure.

Notes:

 All standard shell commands and other system commands, programs, etc. can be used in
shell scripts, eg. one could use 'cd' to change the current working directory between
commands.
 An easy way to ensure that a particular command is used with the default or specifically
desired behaviour is to reference the command using an absolute path description, eg.
/bin/rm instead of just rm. This method is frequently found in system shell scripts. It also
ensures that the scripts are not confused by any aliases which may be present in the
executing shell.
 Instead of including a raw list of hosts in the script at beginning, one could use other
commands such as grep, awk, sed, perl and cut to obtain relevant host names from the
/etc/hosts file, one at a time. There are many possibilities.
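
Regarding the last note above, a minimal sketch might look like this (it assumes a conventional
/etc/hosts layout, ie. an IP address followed by the primary host name on each line, with comment
lines starting with '#'):

#!/bin/sh
for machine in `awk '$1 !~ /^#/ && NF > 1 && $2 != "localhost" {print $2}' /etc/hosts`
do
    echo Checking $machine...
done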

Typically, as an admin learns the existence of new commands, better ways of performing tasks
are thought of. This is perhaps one reason why UNIX is such a well-understood OS: the process
of improving on what has been done before has been going on for 30 years, largely because
much of the way UNIX works can be examined by the user (system script files, configuration
files, etc.) One can imagine the hive of activity at BTL and Berkeley in the early days, with
suggestions for improvements, additions, etc. pouring in from enthusiastic testers and volunteers.
Today, after so much evolution, most basic system scripts and other files are probably as good as
they're going to be, so efforts now focus on other aspects such as system service improvements,
new technology (eg. Internet developments, NSD), security enhancements, etc. Linux evolved in
a very similar way.

I learned shell programming techniques mostly by looking at existing system scripts and reading
the relevant manual pages. An admin's shell programming experience usually begins with simple
sequential scripts that do not include if/then structures, for loops, etc. Later on, a desire to be
more efficient gives one cause to learn new techniques, rewriting earlier work as better ideas are
formed.

Simple scripts can be used to perform a wide variety of tasks, and one doesn't have to make them
sophisticated or clever to get the job done - but with some insightful design, and a little
knowledge of how the more useful aspects of UNIX work, one can create extremely flexible
scripts that can include error checking, control constructs, progress messages, etc. written in a
way which does not require them to be modified, ie. external ideas, such as system data files, can
be used to control script behaviour; other programs and scripts can be used to extract information
from other parts of the system, eg. standard configuration files.

A knowledge of the C programming language is clearly helpful in writing shell scripts since the
syntax for shell programming is so similar. An excellent book for this is "C Programming in a
UNIX Environment", by Judy Kay & Bob Kummerfeld (Addison Wesley Publishing, 1989.
ISBN: 0 201 12912 4).

Other Useful Commands.

A command found in many of the numerous scripts used by any UNIX OS is 'test'; typically used
to evaluate logical expressions within 'if' clauses, test can determine the existence of files, status
of access permissions, type of file (eg. ordinary file, directory, symbolic link, pipe, etc.), whether
or not a file is empty (zero size), compare strings and integers, and other possibilities. See the
test man page for full details.
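
A few illustrative uses (these are all standard test options):

test -d /usr/local/bin        # true if the path exists and is a directory
test -s /var/adm/SYSLOG       # true if the file exists and is not empty
test "$LOGNAME" = root        # string comparison
test $# -eq 1                 # integer comparison: was exactly one argument given?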

For example, the test command could be used to include an error check in the rebootlab script, to
ascertain whether the live file is accessible:

#!/bin/sh
if test -r /mapleson/Admin/Machines/live; then
for machine in yoda akira ash cameron chan conan gibson indiana leon merlin \
               spock nikita ridley sevrin solo stanley warlock wolfen woo
do
echo Checking $machine...
if grep $machine /mapleson/Admin/Machines/live > /dev/null; then
if test $machine != yoda; then
echo Rebooting $machine...
rsh $machine init 6&
else
echo Yoda excluded.
fi
else
echo $machine is not active.
fi
done
else
echo Error: could not access live file, or file is not readable.
fi

NOTE: Given that 'test' is a system command...

% which test
/sbin/test

...any user who creates a program called test, or an admin who writes a script called test, will be
unable to execute the file unless one of the following is done:

 Use a complete pathname for the file, eg. /home/students/cmpdw/test


 Insert './' before the file name
 Alter the path definition ($PATH) so that the current directory is searched before /sbin
(dangerous! The root user should definitely not do this).

In my early days of learning C, I once worked on a C program whose source file I'd called
simply test.c - it took me an hour to realise why nothing happened when I ran the program
(obviously, I was actually running the system command 'test', which does nothing when given no
arguments except return an invisible 'false' exit status).

Problem Question 1.

Write a script which will locate all .capture.mv.* directories under /home and remove them
safely. You will not be expected to test this for real, but feel free to create 'mini' test directories if
required by using mkdir.

Modify the script so that it searches a directory supplied as a single argument ($1).

Relevant commands: find, rm


Tips:

 Research the other possible options for rm which might be useful.


 Don't use your home directory to test out ideas. Use /tmp or /var/tmp.

Problem Question 2.

This is quite a complicated question. Don't feel you ought to be able to come up with an answer
after just one hour.

I want to be able to keep an eye on the amount of free disk space on all the lab machines. How
could this be done?

If a machine is running out of space, I want to be able to remove particular files which I know
can be erased without fear of adverse side effects, including:

 Unwanted user files left in /tmp and /var/tmp, ie. files such as movie files, image files,
sound files, but in general any file that isn't owned by root.
 System crash report files left in /var/adm/crash, in the form of unix.K and
vmcore.K.comp, where K is some digit.
 Unwanted old system log information in the file /var/adm/SYSLOG. Normally, the file is
moved to oSYSLOG minus the last 10 or 20 lines, and a new empty SYSLOG created
containing the aforementioned most recent 10 or 20 lines.

a. Write a script which will probe each system for information, showing disk space usage.

b. Modify the script (if necessary) so that it only reports data for the local system disk.

c. Add a means for saving the output to some sort of results file or files.

d. Add extra features to perform space-saving operations such as those described above.

Advanced:

e. Modify the script so that files not owned by root are only removed if the relevant user is not
logged onto the target system.

Relevant commands: grep, df, find, rm, tail, cd, etc.


UNIX Fundamentals: Application Development Tools.

A wide variety of commands, programs, tools and applications exist for application development
work on UNIX systems, just as for any system. Some come supplied with a UNIX OS as-
standard, some are free or shareware, while others are commercial packages.

An admin who has to manage a system which offers these services needs to be aware of their
existence because there are implications for system administration, especially with respect to
installed software.

This section does not explain how to use these tools (even though an admin would probably find
many of them useful for writing scripts, etc.) The focus here is on explaining what tools are
available and may exist on a system, where they are usually located (or should be installed if an
admin has to install non-standard tools), and how they might affect administration tasks and/or
system policy.

There tend to be several types of software tools:

1. Software executed usually via command line and written using simple editors, eg. basic
compilers such as cc, development systems such as the Sun JDK for Java.

Libraries for application development, eg. OpenGL, X11, Motif, Digital Media Libraries
- such library resources will include example source code and programs, eg. X11 Demo
Programs.

In both cases, online help documents are always included: man pages, online books, hints
& tips, local web pages either in /usr/share or somewhere else such as /usr/local/html.

2. Higher-level toolkits providing an easier way of programming with various libraries, eg.
Open Inventor. These are often just extra library files somewhere in /usr/lib and so don't
involve executables, though example programs may be supplied (eg. SceneViewer,
gview, ivview). Any example programs may be in custom directories, eg. SceneViewer is
in /usr/demos/Inventor, ie. users would have to add this directory to their path in order to
be able to run the program. These kinds of details are in the release notes and online
books. Other example programs may be in /usr/sbin (eg. ivview).
3. GUI-based application development systems for all manner of fields, eg. WorkShop Pro
CASE tools for C, C++, Ada, etc., CosmoWorlds for VRML, CosmoCreate for HTML,
CosmoCode for Java, RapidApp for rapid prototyping, etc. Executables are usually still
accessible by default (eg. cvd appears to be in /usr/sbin) but the actual programs are
normally stored in application-specific directories, eg. /usr/WorkShop, /usr/CosmoCode,
etc. (/usr/sbin/cvd is a link to /usr/WorkShop/usr/sbin/cvd). Supplied online help
documents are in the usual locations (/usr/share, etc.)
4. Shareware/Freeware programs, eg. GNU, Blender, XV, GIMP, XMorph, BMRT.
Sometimes such software comes supplied in a form that means one can install it
anywhere (eg. Blender) - it's up to the admin to decide where (/usr/local is the usual
place). Other type of software installs automatically to a particular location, usually
/usr/freeware or /usr/local (eg. GIMP). If the admin has to decide where to install the
software, it's best to follow accepted conventions, ie. place such software in /usr/local (ie.
executables in /usr/local/bin, libraries in /usr/local/lib, header files in /usr/local/include,
help documents in /usr/local/docs or /usr/local/html, source code in /usr/local/src). In all
cases, it's the admin's responsibility to inform users of any new software, how to use it,
etc.

The key to managing these different types of tools is consistency; don't put one shareware
program in /usr/local and then another in /usr/SomeCustomName. Users looking for
online source code, help docs, etc. will become confused. It also complicates matters
when one considers issues such as library and header file locations for compiling
programs.

Plus, consistency eases other aspects of administration, eg. if one always uses /usr/local
for 3rd-party software, then installing this software onto a system which doesn't yet have
it is a simple matter of copying the entire contents of /usr/local to the target machine.

It's a good idea to talk to users (perhaps by email), ask for feedback on topics such as
how easy it is to use 3rd-party software, are there further programs they'd like to have
installed to make their work easier, etc. For example, a recent new audio standard is
MPEG3 (MP3 for short); unknown to me until recently, there exists a freeware MP3
audio file player for SGIs. Unusually, the program is available off the Net in executable
form as just a single program file. Once I realised that users were trying to play MP3
files, I discovered the existence of the MP3 player and installed it in /usr/local/bin as
'mpg123'.

My personal ethos is that users come first where issues of carrying out their tasks are
concerned. Other areas such as security, etc. are the admin's responsibility though - such
important matters should either be left to the admin or discussed to produce some
statement of company policy, probably via consulation with users, managers, etc. For
everyday topics concerning users getting the most out of the system, it's wise for an
admin to do what she/he can to make users' lives easier.

General Tools (editors).

Developers always use editing programs for their work, eg. xedit, jot, nedit, vi, emacs,
etc. If one is aware that a particular editor is in use, then one should make sure that all
appropriate components of the relevant software are properly installed (including any
necessary patches and bug fixes), and interested users notified of any changes, newly
installed items, etc.

For example, the jot editor is popular with many SGI programmers because it has some
extra features for those programming in C, eg. an 'Electric C Mode'. However, a bug
exists in jot which can cause file corruption if jot is used to access files from an NFS-
mounted directory. Thus, if jot is being used, then one should install the appropriate patch
file to correct the bug, namely Patch 2051 (patch CDs are supplied as part of any
software support contract, but most patches can also be downloaded from SGI's ftp site).

Consider searching the vendor's web site for information about the program in question,
as well as the relevant USENET newsgroups (eg. comp.sys.sgi.apps, comp.sys.sgi.bugs).
It is always best to prevent problems by researching issues beforehand.

Whether or not an admin chooses to 'support' a particular editor is another matter; SGI
has officially switched to recommending the nedit editor for users now, but many still
prefer to use jot simply because of familiarity, eg. all these course notes have been typed
using jot. However, an application may 'depend' on minor programs like jot for particular
functions. Thus, one may have to install programs such as jot anyway in order to support
some other application (dependency).

An example in the case of the Ve24 network is the emacs editing system: I have chosen
not to support emacs because there isn't enough spare disk space available to install
emacs on the Indys which only have 549MB disks. Plus, the emacs editor is not a vendor-
supplied product, so my position is that it poses too many software management issues to
be worth using, ie. unknown bug status, file installation location issues, etc.

Locations: editors are always available by default; executables tend to be in /usr/sbin, so
users need not worry about changing their path definition in order to use them.

All other supplied-as-standard system commands and programs come under the heading
of general tools.

Compilers.

There are many different compilers which might have to be installed on a system, eg.:

Programming Language    Compiler Executable

C                       cc, gcc
C++                     CC
Ada                     ?
Fortran77               f77
Fortran90               f90
Pascal                  ?

Some UNIX vendors supply C and C++ compilers as standard, though licenses may be
required. If there isn't a supplied compiler, but users need one, then an admin can install
the GNU compilers which are free.

An admin must be aware that the release versions of software such as compilers is very
important to the developers who use them (this actually applies to all types of software).
Installing an update to a compiler might mean the libraries have fewer bugs, better
features, new features, etc., but it could also mean that a user's programs no longer
compile with the updated software. Thus, an admin should maintain a suitable
relationship with any users who use compilers and other similar resources, ie. keep each
other informed of relevant issues, changes being made or requested, etc.

Another possibility is to manage the system in such a way as to offer multiple versions of
different software packages, whether that is a compiler suite such as C development kit,
or a GUI-based application such as CosmoWorlds. Multiple versions of low-level tools
(eg. cc and associated libraries, etc.) can be supported by using directories with different
names, or NFS-mounting directories/disks containing software of different versions, and
so on. There are many possibilities - which one to use depends on the size of the network,
ease of management, etc.

Multiple versions of higher-level tools, usually GUI-based development environments
though possibly ordinary programs like Netscape, can be managed by using 'wrapper'
scripts: the admin sets an environment variable to determine which version of some
software package is to be the default; when a system is booted, the script is executed and
uses the environment variable to mount appropriate directories, execute any necessary
initialisation scripts, background daemons, etc. Thus, when a user logs in, they can use
exactly the same commands but find themselves using a different version of the software.
Even better, an admin can customise the setup so that users themselves can decide what
version they want to use; logging out and then logging back in again would then reset all
necessary settings, path definitions, command aliases, etc.
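
A greatly simplified sketch of the idea (the variable name and installation paths here are invented
purely for illustration; a real site-wide wrapper would typically also handle the mounting of
directories, startup of daemons, etc. as described above):

#!/bin/sh
# Run whichever version of 'somepackage' the environment variable selects.
if test -z "$PKG_VERSION"; then
    PKG_VERSION=2.0       # site-wide default
fi
exec /usr/local/somepackage-$PKG_VERSION/bin/somepackage "$@"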

MPC operates its network in this way. They use high-end professional film/video
effects/animation tools such as Power Animator, Maya, Flame, etc. for their work, but the
network actually has multiple versions of each software package available so that
animators and artists can use the version they want, eg. for compatibility reasons, or
personal preferences for older vs. newer features. MPC uses wrapper scripts of a type
which require a system reboot to change software version availability, though the systems
have been setup so that a user can initiate the reboot (I suspect the reboot method offers
better reliability).

Locations:

Executables are normally in /usr/sbin, libraries in /usr/lib, header files in /usr/include and
online documents, etc. in /usr/share. Note also that the release notes for such products
contain valuable information for administrators (setup advice) and users alike.

Debuggers.

Debugging programs are usually part of a compilation system, so everything stated above
for compilers applies to debuggers as well. However, it's perfectly possible for a user to
use a debugger that's part of a high-level GUI-based application development toolkit to
debug programs that are created using low-level tools such as jot and xedit. A typical
example on the Ve24 machines is students using the cvd program (from the WorkShop
Pro CASE Tools package) to debug their C programs, even though they don't use
anything else from the comprehensive suite of CASE tools (source code management,
version control, documentation management, rapid prototyping, etc.)

Thus, an admin must again be aware that users may be using features of high-level tools
for specific tasks even though most work is done with low-level tools. Hence, issues
concerning software updates arise, eg. changing software versions without user
consulation could cause problems for existing code.

High-level GUI-based Development Toolkits.

Usually vendor-supplied or commercial in nature, these toolkits include products such as
CosmoCode (Java development with GUI tools), RapidApp, etc. As stated above, there
are issues with respect to not carrying out updates without proper consideration to how
the changes may affect users who use the products, but the ramifications are usually
much less serious than low-level programs or shareware/freeware. This is because the
software supplier will deliberately develop new versions in such a way as to maximise
compatibility with older versions.

High-level toolkits sometimes rely on low-level toolkits (eg. CosmoCode depends on the
Sun JDK software), so an admin should also be aware that installing updates to low-level
toolkits may have implications for their higher-level counterparts.

High-level APIs (Application Programming Interfaces).

This refers to advanced library toolkits such as Open Inventor, ViewKit, etc. The actual
application developments tools used with these types of products are the same, whether
low-level or high-level (eg. cc and commands vs. WorkShop Pro CASE Tools). Thus,
high-level APIs are not executable programs in their own right; they are a suite of easier-
to-use libraries, header files, etc. which users can use to create applications designed at a
higher level of abstraction. Some example high-level APIs and their low-level
counterparts include:

Lower-level        Higher-level

OpenGL             Open Inventor
X11/Motif          ViewKit/Tk
ImageVision        Image Format Library,
                   Electronic Light Table.

This is not a complete list, and there may be more than one level of abstraction, eg. the VRML
file format was originally derived from a subset of Open Inventor.

Locations: high-level APIs tend to have their files stored in correspondingly named
directories in /usr/lib, /usr/include, etc. For example, Open Inventor files can be found in
/usr/lib/Inventor and /usr/include/Inventor. An exception is support files such as example
models, images, textures, etc. which will always be in /usr/share, but not necessarily in
specifically named locations, eg. the example 3D Inventor models are in
/usr/share/data/models.

Shareware and Freeware Software.

This category of software, eg. the GNU compiler system, is usually installed either in
/usr/local somewhere, or in /usr/freeware. Many shareware/freeware program don't have
to be installed in one of these two places (Blender is one such example) but it is best to
do so in order to maintain a consistent software management policy.

Since /usr/local and /usr/freeware are not normally referenced by the standard path
definition, an admin must ensure that relevant users are informed of any changes they
may have to make in order to access newly installed software. A typical notification
might be a recommendation of a how a user can modify her/his own .cshrc file so that
shells and other programs know where any new executable files, libraries, online
documents, etc. are stored.

Note that, assuming the presence of Internet access, users can easily download
freeware/shareware on their own and install it in their own directory so that it runs from
their home account area, or they could even install software in globally writeable places
such as /var/tmp. If this happens, it's common for an admin to become annoyed, but the
user has every right to install software in their own account area (unless it's against
company policy, etc.) A better response is to appreciate the user's need for the software
and offer to install it properly so that everyone can use it, unless some other factor is
more important.

Unlike vendor-supplied or commercial applications, newer versions of shareware and
freeware programs can often be radically different from older versions. GIMP is a good
example of this - one version introduced so many changes that it was barely comparable
to an older version. Users who utilise these types of packages might be annoyed if an
update is made without consulting them because:

o it's highly likely their entire working environment may be different in the new
version,
o features of the old version may no longer be available,
o aspects of the new version may be incompatible with the old version,
o etc.

Thus, shareware/freeware programs are a good example of where it might be better for
admins to offer more than one version of a software package, eg. all the files for Blender
V1.57 are stored in /usr/local/blender1.57_SGI_6.2_iris on akira and sevrin. When the
next version comes out (eg. V1.6), the files will be in /usr/local/blender1.6_SGI_6.2_iris
- ie. users can still use the old version if they wish.

Because shareware/free programs tend to be supplied as distinct modules, it's often easier
to support multiple versions of such software compared to vendor-supplied or
commercial packages.

Comments on Software Updates, Version Issues, etc.

Modern UNIX systems usually employ software installation techniques which operate in
such a way as to show any incompatibilities before installation (SGIs certainly operate
this way); the inst program (and thus swmgr too since swmgr is just a GUI interface to
inst) will not allow one to install software if there are conflicts present concerning
software dependency and compatibility. This feature of inst (and swmgr) to monitor
software installation issues applies only to software subsystems that can be installed and
removed using inst/swmgr, ie. those said to be in 'inst' format. Thankfully, large numbers
of freeware programs (eg. GIMP) are supplied in this format and so they can be managed
correctly. Shareware/Freeware programs do not normally offer any means by which one
can detect possible problems before installation or removal, unless the authors have been
kind enough to supply some type of analysis script or program.

Of course, there is nothing to stop an admin using low-level commands such as cp, tar,
mv, etc. to manually install problematic files by copying them from a CD, or another
system, but to do so is highly unwise as it would invalidate the inst database structure
which normally acts as a highly accurate and reliable record of currently installed
software. If an admin must make custom changes, an up-to-date record of these changes
should be maintained.
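
A simple way of keeping such a record (the log file name and the example entry below are both
invented for illustration) is to append a dated note every time a manual change is made:

echo "`date`: copied replacement libXm by hand into /usr/lib - no inst image" >> /var/adm/local.changes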

To observe inst/swmgr in action, either enter 'inst' or 'swmgr' at the command prompt (or
select 'Software Manager' from the Toolchest which runs swmgr). swmgr is the easier to
understand because of its intuitive interface.

Assuming the use of swmgr, once the application window has appeared, click on 'Manage
Installed Software'. swmgr loads the inst database information, reading the installation
history, checking subsystem sizes, calculating dependencies, etc. The inst system is a
very effective and reliable way of managing software.
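
As a rough sketch of the equivalent command-line usage (the /CDROM path assumes a CD mounted
in the usual place):

versions | more        # list the currently installed products recorded in the inst database
swmgr &                # the GUI software manager, run from a desktop session
inst -f /CDROM/dist    # text-mode inst, reading a software distribution from a mounted CD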

Most if not all modern UNIX systems will employ a software installation and
management system such as inst, or a GUI-based equivalent.

Summary.

As an administrator, one should not need to know how to use the software products
which users have access to (though it helps in terms of being able to answer simple
questions), but one should:

o be aware of where the relevant files are located,
o understand issues concerning revision control,
o notify users of any steps they must take in order to access new software or
features,
o aid users in being able to use the products efficiently (eg. using /tmp or /var/tmp
for working temporarily with large files or complex tasks),
o have a consistent strategy for managing software products.

These issues become increasingly important as systems become more complex, eg.
multiple vendor platforms, hundreds of systems connected across multiple departments,
etc. One solution for companies with multiple systems and more than one admin is to
create a system administration committee whose responsibilities could include
coordinating site policies, dealing with security problems, sharing information, etc.

Detailed Notes for Day 3 (Part 1)
UNIX Fundamentals: Installing an Operating System and/or Software.

Installation Rationale.

Installing an OS is a common task for an admin to perform, usually because of the
acquisition of a new system or the installation of a new disk.

Although any UNIX variant should be perfectly satisfactory once it has been installed,
sometimes the admin or a user has a particular problem which requires, for example, a different
system configuration (and thus perhaps a reinstall to take account of any major hardware
changes), or a different OS version for compatibility testing, access to more up-to-date features,
etc. Alternatively, a serious problem or accidental mistake might require a reinstallation, eg.
corrupted file system, damaged disk, or an unfortunate use of the rm command (recall the
example given in the notes for Day 2, concerning the dangers of the 'find' command); although
system restoration via backups is an option, often a simple reinstall is more convenient and
faster.

Whatever the reason, an admin must be familiar with the procedure for installing an OS on the
platform for which she/he is responsible.

Installation Interface and Tools.

Most UNIX systems have two interfaces for software installation: a high-level mode where an
admin can use some kind of GUI-based tool, and a low-level mode which employs the command
line shell. The GUI tool normally uses the command line version for the actual installation
operations. In the case of SGI's IRIX, the low-level program is called 'inst', while the GUI
interface to inst is called 'swmgr' - the latter can be activated from the 'Toolchest' on the desktop
or entered as a command. Users can also run swmgr, but only in 'read-only' mode, ie. non-root
users cannot use inst or swmgr to install or remove software.

For general software installation tasks (new/extra applications, updates, patches, etc.) the GUI
tool can normally be used, but for installing an OS, virtually every UNIX platform will require
the admin to not only use the low-level tool for the installation, but also carry out the installation
in a 'lower' (restricted) access mode, ie. a mode where only the very basic system services are
operating: no user-related processes are running, the end-user GUI interface is not active, no
network services are running, etc. For SGI's IRIX, this mode is called 'miniroot'.

Major updates to the OS are also usually carried out in miniroot mode - this is because a fully
operational system will have services running which could be altered by a major OS change, ie.
it would be risky to perform any such change in anything but the equivalent of miniroot.

It is common for this restricted miniroot mode to be selected during bootup, perhaps by pressing
the ESC key when prompted. In the case of SGI systems, the motherboard PROM chip includes
a hard-coded GUI interface mechanism called ARCS which displays a mouse-driven menu on
bootup. This provides the admin with a user-friendly way of performing low-level system
administration tasks, eg. installing the OS from miniroot, running hardware diagnostics,
accessing a simple PROM shell called a Command Monitor for performing low-level actions
such as changing PROM settings (eg. which SCSI ID to treat as the system disk), etc.

Systems without graphics boards, such as servers, provide the same menu but in text-only form,
usually through a VT or other compatible text display terminal driven from the serial port. Note
that SGI's VisualWorkstation machine (an NT system) also uses the ARCS GUI interface - a first
for any NT system (ie. no DOS at all for low-level OS operations).

Not many UNIX vendors offer a GUI menu system like ARCS for low-level tasks - SGI is one of
the few who do, probably because of a historical legacy of making machines for the visual arts
and sciences. Though the ARCS system is perhaps unique, after one has selected 'Software
Installation' the procedure progresses to a stage where the interface does become the more
familiar text-based use of inst (ie. the text information just happens to be presented within a
GUI-style window).

Very early UNIX platforms were not so friendly when it came to offering an easy method for
installing the OS, especially in the days of older storage media such as 5.25" disks, magnetic
tapes, etc. However, some vendors did a good job, eg. the text-only interface for installing HP-
UX on Hewlett Packard machines (eg. HP9000/730) is very user-friendly, allowing the admin to
use the cursor arrow keys to select options, activate tasks, etc. During installation, constantly
updated information shows how the installation is progressing: current file being installed,
number of files installed so far, number of files remaining, amount of disk space used up so far,
disk space remaining, percentage equivalents for all these, and even an estimate of how much
longer the installation will take before completion (surprisingly, inst doesn't provide this last
piece of information as it is running, though one can make good estimates or find out how long
it's going to take from a 3rd-party information source).

The inst program gives progress output equivalent to most of the above by showing the current
software subsystem being installed, which sub-unit of which subsystem, and what percentage of
the overall operation has been done so far.

Perhaps because of the text-only interface which is at the heart of installing any UNIX variant,
installing an OS can be a little daunting at first, but the actual procedure itself is very easy. Once
an admin has installed an OS once, doing it again quickly becomes second nature. The main
reason the task can seem initially confusing is that the printed installation guides are often too
detailed, ie. the supplied documents have to assume that the person carrying out the installation
may know nothing at all about what they're doing. Thankfully, UNIX vendors have recognised
this fact and so nowadays any such printed material also contains a summary installation guide
for experts and those who already know the general methods involved - this is especially useful
when performing an OS update as opposed to an original OS installation.

OS Source Media.

Years ago, an OS would be stored on magnetic tape or 5.25" disks. Today, one can probably
state with confidence that CDROMs are used by every vendor. For example, SGI's IRIX 6.2
comes on 2 CDROMs; IRIX 6.5 uses 4 CDROMs, but this is because 6.5 can be used with any
machine from SGI's entire current product line, as well as many older systems - thus, the basic
CD set must contain the data for all relevant systems even though an actual installation will only
use a small subset of the data from the CDs (typically less than one CD's worth).

In the future, it is likely that vendors will switch to DVDs due to higher capacities and faster
transfer rates.

Though a normal OS installation uses some form of original OS media, UNIX actually allows
one to install an OS (or any software) via some quite unique ways. For example, one could copy
the data from the source media (I shall assume CDROM) to a fast UltraSCSI disk drive. Since
disks offer faster transfer rates and access times, using a disk as a source medium enables a faster
installation, as well as removing the need for swapping CDROMs around during the installation
process. This is essentially a time-saving feature but is also very convenient, eg. no need to carry
around many CDROMs (remember that after an OS installation, an admin may have to install
extra software, applications, etc. from other CDs).

A completely different option is to install the OS using a storage device which is attached to a
remote machine across a network. This may sound strange, ie. the idea that a machine without an
OS can access a device on a remote system and use that as an OS installation source. It's
something which is difficult but not impossible with PCs (I'm not sure whether a Linux PC
would support this method). A low-level communications protocol called bootp (Internet
Bootstrap Protocol), supported by all traditional UNIX variants, is used to facilitate
communication across the network. As long as the remote system has been configured to allow
another system to access its local device as a source for remote OS installation, then the remote
system will effectively act as an attached storage medium.

However, most admins will rarely if ever have to install an OS this way for small networks,
though it may be more convenient for larger networks. Note that IRIX systems are supplied by
default with the bootp service disabled in the /etc/inetd.conf file (the contents of this file control
various network services). Full details on how to use the bootp service for remote OS installation
are normally provided by the vendor in the form of an online book or reference page. In the case
of IRIX, see the section entitled, "Booting across the Network" in Chapter 10 of the online book,
"IRIX Admin: System Configuration and Operation".

Note: this discussion does not explain every single step of installing an OS on an SGI system,
though the method will be demonstrated during the practical session if time permits. Instead, the
focus here is on management issues which surround an OS installation, especially those
techniques which can ease the installation task. Because of the SGI-related technical site I run, I
have already created extremely detailed installation guides for IRIX 6.2 [1] and IRIX 6.5 [2]
which also include tables of example installation times (these two documents are included for
future reference). The installation times obtained were used to conduct a CPU and CDROM
performance analysis [3]. Certain lessons were learned from this analysis which are also relevant
to installing an OS - these are explained later.

Installing an OS on multiple systems.

Using a set of CDs to install an OS can take quite some time (15 to 30 minutes is a useful
approximation). If an admin has many machines to install, there are several techniques for
cutting the amount of time required to install the OS on all the machines.

The most obvious method is for all machines to install via a remote network device, but this
could actually be very slow, limited partly by network speed but also by the way in which
multiple systems would all be trying to access the same device (eg. CDROM) at the same time. It
would only really be effective for a situation where the network was very fast and the device - or
devices, there could be more than one - was also fast.

An example would be the company MPC; as explained in previous lectures, their site
configuration is extremely advanced. The network they employ is so fast that it can saturate the
typical 100Mbit Ethernet port of a modern workstation like Octane. MPC's storage systems
include many high-end RAID devices capable of delivering data at hundreds of MB/sec rates
(this kind of bandwidth is needed for editing broadcast-quality video and assuring that animators
can load complete scene databases without significant delay).

Thus, the admin at MPC can use some spare RAID storage to install an OS on a system across
the network. When this is done, the limiting factor which determines how long the installation
takes is the computer's main CPU(s) and/or its Ethernet port (100MBit), the end result of which
is that an installation can take mere minutes. In reality, the MPC admin uses an even faster
technique for installing an OS, which is discussed in a moment.

At the time of my visit, MPC was using a high-speed crossbar switching 288Mbit/sec network
(ie. multiple communications links through the routers - each machine could be supplied with up
to 36MB/sec). Today they use multiple gigabit links (HiPPI) and other supporting devices. But
not everyone has the luxury of having such equipment.

Disk Cloning [1].

If an admin only has a single machine to deal with, the method used may not matter too much,
but often the admin has to deal with many machines. A simple technique which saves a huge
amount of time is called 'disk cloning'. This involves installing an OS onto a single system ('A')
and then copying (ie. cloning) the contents of that system's disk onto other disks. The first
installation might be carried out by any of the usual means (CDROM, DAT, network, etc.), after
which any extra software is also installed; in the case of SGIs, this would mean the admin
starting up the system into a normal state of operation, logging in as root and using swmgr to
install extra items. At this point, the admin may wish to make certain custom changes as well, eg.
installing shareware/freeware software, etc. This procedure could take more than an hour or two
if there is a great deal of software to install.

Once the initial installation has finished, then begins the cloning process. On SGIs, this is
typically done as follows (other UNIX systems will be very similar if not identical):

1. Place the system disk from another system B into system A, installed at, for example,
SCSI ID 2 (B's system disk would be on SCSI ID 1 in the case of SGIs; SCSI ID 0 is
used for the SCSI controller). Boot up the system.
2. Login as root. Use fx to initialise the B disk to be a new 'root' (ie. system) disk; create a
file system on it; mount the disk on some partition on A's disk such as /disk2.
3. Copy the contents of disk A to disk B using a command such as tar (a hedged sketch is
given after this list). Details of how to do this with example tar commands are given in
the reference guides [1] [2].
4. Every system disk contains special volume header information which is required in order
to allow it to behave as a bootable device. tar cannot copy this information since it does
not reside on the main data partition of the disk in the form of an ordinary file, so the next
step is to copy the volume header data from A to B using a special command for that
purpose. In the case of SGIs, the relevant program is called dvhtool (device volume
header tool).
5. Shut down system A; remove the B disk; place the B disk back into system B,
remembering to change its SCSI ID back to 1. If further cloning is required, insert
another disk into system A on SCSI ID 2, and (if needed) a further disk into system B,
also set to SCSI ID 2. Reboot both systems.
6. System A will reboot as normal. At bootup time, although system B already has a kernel
file available (/unix), all of its files will be recognised as new (ie. changed), so system B
will also create a new kernel file (/unix.install) and then boot up normally, ready for
login. Reboot system B once more so that the new kernel file is made the current kernel
file.
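
The following is a hedged sketch of the copy in step 3 only - the device name, mount point and
directory list are illustrative, and the reference guides [1] [2] give the exact commands:

mkdir /disk2
mount /dev/dsk/dks0d2s0 /disk2       # assumed device name: controller 0, SCSI ID 2, root partition

cd /
# Copy named top-level directories rather than "/" itself, so that /proc, /disk2
# and other mount points are not swept up in the copy:
tar cf - unix bin dev etc lib lib32 sbin stand usr var | (cd /disk2; tar xpf -)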

At this stage, what one has effectively created is a situation comprising two systems as described
in Step 1, instead of only one such system which existed before the cloning process. Thus, one
could now repeat the process again, creating four systems ready to use or clone again as desired.
Then eight, sixteen, thirty two and so on. This is exactly the same way biological cells divide, ie.
binary fission. Most people are familiar with the idea that repeatedly doubling the number of a
thing can create a great many things in a short space of time, but the use of such a technique for
installing an operating system on many machines means an admin can, for example, completely
configure over one hundred machines in less than five hours! The only limiting factor, as the
number of machines to deal with increases, is the amount of help available by others to aid in the
swapping of disks, typing of commands, etc. In the case of the 18 Indys in Ve24, the last
complete reinstall I did on my own took less than three hours.

Note: the above procedure assumes that each cloning step copies one disk onto just a single other
disk - this is because I'm using the Indy as an example, ie. Indy only has internal space for one
extra disk. But if a system has the available room, then many more disks could be installed on
other SCSI IDs (3, 4, 5, etc.) resulting in each cloning step creating three, four, etc. disks from
just one. This is only possible because one can run multiple tar copy commands at the same time.
Of course, one could use external storage devices to connect extra disks. There's no reason why a
system with two SCSI controllers (Indigo2, O2, Octane, etc.) couldn't use external units to clone
the system disk to 13 other disks at the same time; for a small network, such an ability could
allow the reinstallation of the entire system in a single step!

Using a Backup Image.

If a system has been backed up onto a medium such as DAT tape, one could in fact use that tape
for installing a fresh OS onto a different disk, as opposed to the more usual use of the tape for
data restoration purposes.

The procedure would be similar to some of the steps in disk cloning, ie. install a disk on SCSI ID
2, initialise, and use tar to extract the DAT straight to the disk. However, the volume header
information would have to come from the original system since it would not be present on the
tape, and only one disk could be written to at a time from the tape. Backup media are usually
slower than disks too.

Installing a New Version of an OS (Major Updates).

An admin will often have to install updates to various OS components as part of the normal
routine of installing software patches, bug fixes, new features, security fixes, etc. as they arrive
in CD form from the vendor concerned. These can almost always be installed using the GUI
method (eg. swmgr) unless specifically stated otherwise for some reason. However, if an admin
wishes to change a machine which already has an OS installed to a completely new version
(whether a newer version or an older one), then other issues must be considered.

Although it is perfectly possible to upgrade a system to a newer OS, an existing system will often
have so much software installed, with a whole range of configuration files, that a straight upgrade
to a new OS revision may not work very well. The upgrade would probably succeed, but what
usually happens is that the admin has to resolve installation conflicts before the procedure can
begin, which is an annoying waste of time. Further, some changes may even alter some fundamental aspect of the
system, in which case an upgrade on top of the existing OS would involve extra changes which
an admin would have to read up on first (eg. IRIX 6.2 uses a completely different file system to
IRIX 5.3: XFS vs. EFS).

Even if an update over an existing OS is successful, one can never really be sure that older files
which aren't needed anymore were correctly removed. To an admin, the system would 'feel' as if
the older OS was somehow still there, rather like an old layer of paint hidden beneath a new
gloss. This aspect of OS management is perhaps only psychological, but it can be important. For
example, if problems occurred later, an admin might waste time checking for issues concerning
the older OS which aren't relevant anymore, even though the admin theoretically knows such
checks aren't needed.
Thus, a much better approach is to perform a 'clean' installation when installing a new OS. A
typical procedure would be as follows:

1. Read all the relevant notes supplied with the new OS release so that any issues relevant to
how the system may be different with the new OS version are known beforehand, eg. if
any system services operate in a different way, or other factors (eg. new type of file
system, etc.)
2. Make a full system backup of the machine concerned.
3. Identify all the key files which make the system what it is, eg. /etc/sys_id, /etc/hosts, and
other configuration files/directories such as /var/named, /var/flexlm/license.dat, etc.
These could be placed onto a DAT, floptical, ZIP, or even another disk (a brief tar example
is sketched just after this procedure). Items such as
shareware/freeware software are probably best installed anew (read any documents
relevant to software such as this too).
4. Use the appropriate low-level method to reinitialise the system disk. For SGI IRIX
systems, this means using the ARCS bootup menu to select the Command Monitor, boot
off of the OS CDROM and use the fx program to reinitialise the disk as a root disk, use
mkfs to create a new file system (the old OS image is now gone), then reboot to access
the 'Install System Software' option from the ARCS menu.
5. Install the OS in the normal manner.
6. Use the files backed up in step 3 to change the system so that it adopts its usual identity
and configuration, bearing in mind any important features/caveats of the new OS release.

This is a safe and reliable way of ensuring a clean installation. Of course, the installation data
could come from a different medium or over a network from a remote system as described earlier.
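
A hedged example of step 3, writing a set of key files to the default tape device (the exact
list depends on which services the machine provides; /dev/tape is the usual default device name):

cd /
tar cvf /dev/tape etc/sys_id etc/hosts etc/fstab etc/passwd etc/group \
    var/named var/flexlm/license.dat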

Time-saving Tips.

When installing an OS or software from a CDROM, it's tempting to want to use the fastest
possible CDROM available. However, much of the process of installing software, whether the
task is an OS installation or not, involves operations which do not actually use the CDROM. For
example, system checks need to be made before the installation can begin (eg. available disk
space), hundreds of file structures need to be created on the disk, installation images need to be
uncompressed in memory once they have been retrieved from the CDROM, installed files need
to be checked as the installation progresses (checksums), and any post-installation tasks
performed such as compiling any system software indices.

As a result, perhaps 50% of the total installation time may involve operations which do not
access the CDROM. Thus, using a faster CDROM may not speed up the overall installation to
any great degree. This effect is worsened if the CPU in the system is particularly old or slow, ie.
a slow CPU may not be able to take full advantage of an old CDROM, never mind a new one.

In order for a faster CDROM to make any significant difference, the system's CPU must be able
to take advantage of it, and a reasonably large proportion of an installation procedure must
actually consist of accessing the CDROM.
For example, consider the case of installing IRIX 6.5 on two different Indys - one with a slow
CPU, the other with a better CPU - comparing any benefit gained from using a 32X CDROM
instead of a 2X CDROM [3]. Here is a table of installation times, in hours, minutes and seconds,
along with percentage speedups.

                       2X CDROM   32X CDROM   %Speedup

100MHz R4600PC Indy:   1:18:36    1:12:11      8.2%
200MHz R4400SC Indy:   0:52:35    0:45:24     13.7%

(data for a 250MHz R4400SC Indigo2 shows the speedup would rise to 15.2% - a valid
comparison since Indy and Indigo2 are almost identical in system design)

In other words, the better the main CPU, the better the speedup obtained by using a faster
CDROM.

This leads on to the next very useful tip for installing software (OS or otherwise)...

Temporary Hardware Swaps.

The example above divided the columns in order to obtain the speedup for using a faster
CDROM, but it should be obvious looking at the table that a far greater speedup can be obtained
by using a better CPU:

Percentage speedup from using a 200MHz R4400SC CPU instead of a 100MHz R4600PC:

2X CDROM with Indy:   33.1%
32X CDROM with Indy:  37.1%

In other words, no matter what CDROM is used, an admin can save approximately a third of the
normal installation time just by temporarily swapping the best possible CPU into the target
system! And of course, the saving is maximised by using the fastest CDROM available too, or
other installation source such as a RAID containing the CDROM images.

For example, if an admin has to carry out a task which would normally be expected to take, say,
three hours on the target system, then a simple component swap could save over an hour of
installation time. From an admin's point of view, that means getting the job done quicker (more
time for other tasks), and from a management point of view that means lower costs and better
efficiency, ie. less wages money spent on the admin doing that particular task.

Some admins might have to install OS images as part of their job, eg. performance analysis or
configuring systems to order. Thus, saving as much time as possible could result in significant
daily productivity improvements.

The Effects of Memory Capacity.

During the installation of software or an OS, the system may consume large amounts of memory
in order to, for example, uncompress installation images from the CDROM, process existing
system files during a patch update, recompile system file indices, etc. If the target system does
not have enough physical memory, then swap space (otherwise known as virtual memory) will
have to be used. Since software installation is a disk and memory intensive task, this can
massively slow down the installation or removal procedure (the latter can happen too because
complex file processing may be required in order to restore system files to an earlier state prior
to the installation of the software items being removed).

Thus, just as it can be helpful to temporarily swap a better CPU into the target system and use a
faster CDROM if available, it is also a good idea to ensure the system has sufficient physical
memory for the task.

For example, I once had cause to install a large patch upgrade to the various compiler
subsystems on an Indy running IRIX 6.2 with 64MB RAM [1]. The update procedure seemed to
be taking far too long (15 minutes and still not finished). Noticing the unusually large amount of
disk activity compared to what I would normally expect, ie. noise coming from the disk, I
became suspicious and wondered whether the installation process was running out of memory. A
quick use of gmemusage showed the available memory to be very low (3MB) implying that
memory swapping was probably occurring. I halted the update procedure (easy to do with IRIX)
and cancelled the installation. After upgrading the system temporarily to 96MB RAM (using
32MB from another Indy) I ran the patch again. This time, the update was finished in less than
one minute! Using gmemusage showed the patch procedure required at least 40MB RAM free in
order to proceed without resorting to the use of swap space.

Summary.

1. Before making any major change to a system, make a complete backup just in case
something goes wrong. Read any relevant documents supplied with the software to be
installed, eg. release notes, caveats to installation, etc.
2. When installing an OS or other software, use the most efficient storage media available if
possible, eg. the OS CDs copied onto a disk. NB: using a disk OS image for installation
might mean repartitioning the disk so that the system regards the disk as a bootable
device, just like a CDROM. By default, SCSI disks do not have the same partition layout
as a typical CDROM. On SGIs using IRIX, the fx program is used to repartition disks.
3. If more than one system is involved, use methods such as disk cloning to improve the
efficiency of the procedure.
4. If possible, temporarily swap better system components into the target system in order to
reduce installation time and ensure adequate resources for the procedure (better CPU, lots
of RAM, fastest possible CDROM).
Caution: item 4 above might not be possible if the particular set of files which get installed are
determined by the presence of internal components. In the case of Indy, installing an R5000
series CPU would result in the installation of different low-level bootup CPU-initialisation
libraries compared to R4600 or R4400 (these latter two CPUs can use the same libraries, but any
R5000 CPU uses newer libraries). Files relevant to these kinds of issues are located in directories
such as /var/sysgen.

Patch Files.

Installing software updates to parts of the OS or application software is a common task for
admins. In general, patch files should not be installed unless they are needed, but sometimes an
admin may not have any choice, eg. for security reasons, or Y2K compliance.

Typically, patch updates are supplied on CDs in two separate categories (these names apply to
SGIs; other UNIX vendors probably use a similar methodology):

1. Required/Recommended patches.
2. Fix-on-Fail Patches.

Item 1 refers to patches which the vendor suggests the admin should definitely install. Typically,
a CD containing such patches is accessed with inst/swmgr and an automatic installation carried
out, ie. the admin lets the system work out which of the available required/recommended patches
should be installed. This concept is known as installing a 'patch set'. When discussing system
problems or issues with others (eg. technical support, or colleagues on the Net), the admin can
then easily describe the OS state as being a particular revision modified by a particular dated
patch set, eg. IRIX 6.5 with the April 1999 Patch Set.

Item 2 refers to patches which only concern specific problems or issues, typically a single patch
file for each problem. An admin should not install such patches unless they are required, ie. they
are selectively installed as and when is necessary. For example, an unmodified installation of
IRIX 6.2 contains a bug in the 'jot' editor program which affects the way in which jot accesses
files across an NFS-mounted directory (the bug can cause jot to erase the file). To fix the bug,
one installs patch number 2051 which is shown in the inst/swmgr patch description list as 'Jot fix
for mmapping', but there's no need to install the patch if a machine running 6.2 is not using NFS.

Patch Inheritance.

As time goes by, it is common for various bug fixes and updates from a number of patches to be
brought together into a 'rollup' patch. Also, a patch file may contain the same fixes as an earlier
patch plus some other additional fixes. Two issues arise from this:

1. If one is told to install a patch file of a particular number (eg. advice gained from
someone on a newsgroup), it is usually the case that any later patch which has been
declared to be a replacement for the earlier patch can be used instead. This isn't always
the case, perhaps due to specific hardware issues of a particular system, but in general a
fix for a problem will be described as 'install patch <whatever> or later'. The release
notes for any patch file will describe what hardware platforms and OS revisions that
patch is intended for, what patches it replaces, what bugs are fixed by the patch (official
bug code numbers included), what other known bugs still exist, and what workarounds
can be used to temporarily solve the remaining problems.
2. When a patch is installed, a copy of the affected files prior to installation, called a 'patch
history', is created and safely stored away so that if ever the patch has to be removed at a
later date, the system can restore the relevant files to the state they were in before the
patch was first installed. Thus, installing patch files consumes disk space - how much
depends on the patch concerned. The 'versions' command with the 'removehist' option can
be used to remove the patch history for a particular patch, recovering disk space, eg.:

versions removehist patchSG0001537

would remove the patch history file for patch number 1537. To remove all patch
histories, the command to use is:

versions removehist "*"

Conflicts.

When installing patches, especially of the Fix-on-Fail variety, an admin can come across a
situation where a patch to be installed (A) is incompatible with one already present on the system
(B). This usually happens when an earlier problem was dealt with using a more up-to-date patch
than was actually necessary. The solution is to either remove B, then install an earlier but
perfectly acceptable patch C and finally install A, or find a more up-to-date patch D which
supersedes A and is thus compatible with B.

Note: if the history file for a patch has been removed in order to save disk space, then it will not
be possible to remove that patch from the system. Thus, if an admin encounters the situation
described above, the only possible solution will be to find the more up-to-date patch D.

Exploiting Patch File Release Notes.

The release notes for patches can be used to identify which patches are compatible, as well as
ascertain other useful information, especially to check whether a particular patch is the right one
an admin is looking for (patch titles can sometimes be somewhat obscure). Since the release
notes exist on the system in text form (stored in /usr/relnotes), one can use the grep command to
search the release notes for information by hand, using appropriate commands. The commands
'relnotes' and 'grelnotes' can be used to view release notes.

relnotes outputs only text. Without arguments, it shows a summary of all installed products for
which release notes are available. One can then supply a product name - relnotes will respond
with a list of chapter titles for that product. Finally, specifying a product name and a chapter
number will output the actual text notes for the chosen chapter, or one can use '*' to display all
chapters for a product. grelnotes gives the same information in a browsable format displayed in a
window, ie. grelnotes is a GUI interface to relnotes. See the man pages for these commands for
full details.

relnotes actually uses the man command to display information, ie. the release notes files are
stored in the same compressed text format ('pack') used by online manual pages (man uses the
'unpack' command to decompress the text data). Thus, in order to grep-search through a release
notes file, the file must first be uncompressed using the unpack command. This is a classic
example of where the UNIX shell becomes very powerful, ie. one could write a shell script using
a combination of find, ls, grep, unpack and perhaps other commands to allow one to search for
specific items in release notes.
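
A minimal sketch of such a script is shown below. It assumes the notes live under /usr/relnotes
as packed (.z) text files, as described above, and uses pcat (from the same pack/unpack family)
to print each packed file without altering it; treat it as a starting point rather than a
finished tool.

#!/bin/sh
# relgrep - search every installed release-notes file for a pattern (sketch only).
pattern="$1"
if [ -z "$pattern" ]; then
    echo "Usage: relgrep <pattern>"
    exit 1
fi
for f in `find /usr/relnotes -type f -name '*.z' -print`; do
    # pcat writes the uncompressed text to stdout without touching the original
    if pcat "$f" 2>/dev/null | grep -i "$pattern" > /dev/null; then
        echo "Match: $f"
    fi
done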

Although the InfoSearch tool supplied with IRIX 6.5 allows one to search release notes, IRIX 6.2
does not have InfoSearch, so an admin might decide that writing such a shell script would prove
very useful. Incidentally, this is exactly the kind of useful script which ends up being made
available on the Net for free so that anyone can use it. For all I know, such a script already exists.
Over time, entire collections of useful scripts are gathered together and eventually released as
freeware (eg. GNU shell script tools). An admin should examine any such tools to see if they
could be useful - a problem which an admin has to deal with may already have been solved by
someone else two decades earlier.

Patch Subsystem Components.

Like any other software product, a patch file is a software subsystem usually containing several
sub-units, or components. When manually selecting a patch for installation, inst/swmgr may tag
all sub-units for installation even if certain sub-units are not applicable (this can happen for an
automatic selection too, perhaps because inst selects all of a patch's components by default). If
this happens, any conflicts present will be displayed, preventing the admin from accidentally
installing unwanted or irrelevant items. Remember that an installation cannot begin until all
conflicts are resolved, though an admin can override this behaviour if desired.

Thus, when manually installing a patch file (or files), I always check the individual sub-units to
see what they are. In this way, I can prevent conflicts from arising in the first place by not
selecting subsystems which I know are not relevant, eg. 64bit libraries which aren't needed for a
system with a 32bit memory address kernel like Indy (INFO: all SGIs released after the Indigo
R3000 in 1991 do 64bit processing, but the main kernel file does not need to be compiled using
64bit addressing extensions unless the system is one which might have a very large amount of
memory, eg. an Origin2000 with 16GB RAM). Even when no conflicts are present, I always
check the selected components to ensure no 'older version' items have been selected.

References:

1. Disk and File System Administration:
   http://www.futuretech.vuurwerk.nl/disksfiles.html

2. How to Install IRIX 6.5:
   http://www.futuretech.vuurwerk.nl/6.5inst.html

3. SGI General Performance Comparisons:
   http://www.futuretech.vuurwerk.nl/perfcomp.html

Detailed Notes for Day 3 (Part 2)
UNIX Fundamentals: Organising a network with a server.

This discussion explains basic concepts rather than detailed ideas such as specific 'topologies' to
use with large networks, or how to organise complex distributed file systems, or subdomains and
address spaces - these are more advanced issues which most admins won't initially have to deal
with, and if they do then the tasks are more likely to be done as part of a team.

The SGI network in Ve24 is typical of a modern UNIX platform in how it is organised. The key
aspects of this organisation can be summarised as follows:

 A number of client machines and a server are connected together using a hub (24-port in
this case) and a network comprised of 10Mbit Ethernet cable (100Mbit is more common
in modern systems, with Gigabit soon to enter the marketplace more widely).
 Each client machine has its own unique identity, a local disk with an installed OS and a
range of locally installed application software for use by users.
 The network has been configured to have its own subdomain name of a form that
complies with the larger organisation of which it is just one part (UCLAN).
 The server has an external connection to the Internet.
 User accounts are stored on the server, on a separate external disk. Users who login to the
client machines automatically find their own files available via the use of the NFS
service.
 Users can work with files in their home directory (which accesses the server's external
disk across the network) or use the temporary directories on a client machine's local disk
for better performance.
 Other directories are NFS mounted from the server in order to save disk space and to
centralise certain services (eg. /usr/share, /var/mail, /var/www).
 Certain aspects of the above are customised in places. Most networks are customised in
certain ways depending on the requirements of users and the decisions taken by the
admin and management. In this case, specifics include:
o Some machines have better hardware internals, allowing for software installation
setups that offer improved user application performance and services, eg. bigger
disk permits /usr/share to be local instead of NFS-mounted, and extra vendor
software, shareware and freeware can be installed.
o The admin's account resides on an admin machine which is effectively also a
client, but with minor modifications, eg. tighter security with respect to the rest of
the network, and the admin's personal account resides on a disk attached to the
admin machine. NFS is used to export the admin's home account area to the
server and all other clients; custom changes to the admin's account definition
allows the admin account to be treated just like any other user account (eg.
accessible from within /home/staff).
o The server uses a Proxy server in order to allow the client machines to access the
external connection to the Internet.
o Ordinary users cannot login to the server, ensuring that the server's resources are
reserved for system services instead of running user programs. Normally, this
would be a more important factor if the server was a more powerful system than
the clients (typical of modern organisations). In the case of the Ve24 network
though, the server happens to have the same 133MHz R4600PC CPU as the client
machines. Staff can login to the server however - an ability based on assumed
privilege.
o One client machine is using a more up-to-date OS version (IRIX 6.5) in order to
permit the use of a ZIP drive, a device not fully supported by the OS version used
on the other clients (IRIX 6.2). ZIP drives can be used with 6.2 at the command-
line level, but the GUI environment supplied with 6.2 does not fully support ZIP
devices. In order to support 6.5 properly, the client with the ZIP drive has more
memory and a larger disk (most of the clients have 549MB system disks -
insufficient to install 6.5 which requires approximately 720MB of disk space for a
default installation).
o etc.

This isn't a complete list, but the above are the important examples.

Exactly how an admin configures a network depends on what services are to be provided, how
issues such as security and access control are dealt with, Internet issues, available disk space and
other resources, peripherals provided such as ZIP, JAZ, etc., and of course any policy directives
decided by management.

My own personal ethos is, in general, to put users first. An example of this ethos in action is that
/usr/share is made local on any machine which can support it - accesses to such a local directory
occur much faster than across a network to an NFS-mounted /usr/share on a server. Thus,
searching for man pages, accessing online books, using the MIDI software, etc. is much more
efficient/faster, especially when the network or server is busy.
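
For comparison, an NFS-mounted /usr/share appears in a client's /etc/fstab as something along
these lines (the options shown are illustrative rather than the exact ones used on the Ve24
clients):

yoda:/usr/share   /usr/share   nfs   ro,bg   0 0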

NFS Issues.

Many admins will make application software NFS-mounted, but this results in slower
performance (unless the network is fast and the server capable of supplying as much data as can
be handled by the client, eg. 100Mbit Ethernet, etc.) However, NFS-mounted application
directories do make it easier to manage software versions, updates, etc. Traditional client/server
models assume applications are stored on a server, but this is an old ethos that was designed
without any expectation that the computing world would eventually use very large media files,
huge applications, etc. Throwing application data across a network is a ridiculous waste of
bandwidth and, in my opinion, should be avoided where possible (this is much more important
for slower networks though, eg. 10Mbit).

In the case of the Ve24 network, other considerations also come into play because of hardware-
related factors, eg. every NFS mount point employed by a client system uses up some memory
which is needed to handle the operational overhead of dealing with accesses to that mount point.
Adding more mount points means using more memory on the client; for an Indy with 32MB
RAM, using as many as a dozen mount points can result in the system running out of memory (I
tried this in order to offer more application software on the systems with small disks, but 32MB
RAM isn't enough to support lots of NFS-mounted directories, and virtual memory is not an
acceptable solution). This is a good example of how system issues should be considered when
deciding on the hardware specification of a system. As with any computer, it is unwise to equip a
UNIX system with insufficient resources, especially with respect to memory and disk space.

Network Speed.

Similarly, the required speed of the network will depend on how the network will be used. What
applications will users be running? Will there be a need to support high-bandwidth data such as
video conferencing? Will applications be NFS-mounted or locally stored? What kind of system
services will be running? (eg. web servers, databases, image/document servers, etc.) What about
future expansion? All these factors and more will determine whether typical networking
technologies such as 10Mbit, 100Mbit or Gigabit Ethernet are appropriate, or whether a different
networking system such as ATM should be used instead. For example, MPC uses a fast-
switching high-bandwidth network due to the extensive use of data-intensive applications which
include video editing, special effects, rendering and animation.

After installation, commands such as netstat, osview, ping and ttcp can be used to monitor
network performance. Note that external companies, and vendor suppliers, can offer advice on
suggested system topologies. For certain systems (eg. high-end servers), specific on-site
consultation and analysis may be part of the service.

Storage.

Deciding on appropriate storage systems and capacities can be a daunting task for a non-trivial
network. Small networks such as the SGI network I run can easily be dealt with simply by
ensuring that the server and clients all have large disks, that there is sufficient disk space for user
accounts, and a good backup system is used, eg. DDS3 DAT. However, more complex networks
(eg. banks, commercial businesses, etc.) usually need huge amounts of storage space, use very
different types of data with different requirements (text, audio, video, documents, web pages,
images, etc.), and must consider a whole range of issues which will determine what kind of
storage solution is appropriate, eg.:

 preventing data loss,
 sufficient data capacity with room for future expansion,
 interrupt-free fast access to data,
 failure-proof (eg. backup hub units/servers/UPS),
 etc.

A good source of advice may be the vendor supplying the systems hardware, though note that
3rd-party storage solutions can often be cheaper, unless there are other reasons for using a
vendor-sourced storage solution (eg. architectural integration).
See the article listed in reference [1] for a detailed discussion on these issues.

Setting up a network can thus be summarised as follows:

 Decide on the desired final configuration (consultation process, etc.)
 Install the server with default installations of the OS. Install the clients with a default or
expanded/customised configuration as desired.
 Construct the hardware connections.
 Modify the relevant setup files of a single client and the server so that one can rlogin to
the server from the client and use GUI-based tools to perform further system
configuration and administration tasks.
 Create, modify or install the files necessary for the server and clients to act as a coherent
network, eg. /etc/hosts, .rhosts, etc. (an example /etc/hosts fragment is sketched just
after this list).
 Setup other services such as DNS, NIS, etc.
 Setup any client-specific changes such as NFS mount points, etc.
 Check all aspects of security and access control, eg. make sure guest accounts are
blocked if required, all client systems have a password for the root account, etc. Use any
available FAQ (Frequently Asked Questions) files or vendor-supplied information as a
source of advice on how to deal with these issues. Very usefully, IRIX 6.5 includes a
high-level tool for controlling overall system and network security - the tool can be (and
normally is) accessed via a GUI interface.
 Begin creating group entries in /etc/group ready for user accounts, and finally the user
accounts themselves.
 Setup any further services required, eg. Proxy server for Internet access.
 etc.

The above have not been numbered in a rigid order since the tasks carried out after the very first
step can usually be performed in a different order without affecting the final configuration. The
above is only a guide.
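
As a small illustration of the /etc/hosts step mentioned above, entries of roughly the following
form would be replicated on the server and every client (the addresses and the domain name are
invented for this example; the host names are those used elsewhere in these notes):

192.168.1.1    yoda.comp.uclan.ac.uk    yoda      # server
192.168.1.2    akira.comp.uclan.ac.uk   akira     # client
192.168.1.3    sevrin.comp.uclan.ac.uk  sevrin    # client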

Quotas.

Employing disk quotas is a common practice amongst administrators as a means of controlling
disk space usage by users. It is easy to assume that a really large disk capacity would mean an
admin need not bother with quotas, but unfortunately an old saying definitely holds true: "Data
will expand to fill the space available."

Users are lazy where disk space is concerned, perhaps because it is not their job to manage the
system as a whole. If quotas are not present on a system, most users simply don't bother deleting
unwanted files. Alternatively, the quota management software can be used as an efficient disk
accounting system by setting up quotas for a file system without using limit enforcement.

IRIX employs a quota management system that is common amongst many UNIX variants.
Examining the relevant commands (consult the 'SEE ALSO' section from the 'quotas' man page),
IRIX's quota system appears to be almost identical to that employed by, for example, HP-UX
(Hewlett Packard's UNIX OS). There probably are differences between the two implementations,
eg. issues concerning supported operations on particular types of file system, but in this case the
quota system is typical of the kind of OS service which is very similar or identical across all
UNIX variants. An important fact is that the quota software is part of the overall UNIX OS,
rather than some hacked 3rd-party software addon.

Quota software allows users to determine their current disk usage, and enables an admin to
monitor available resources, how long a user is over their quota, etc. Quotas can be used not only
to limit the amount of available disk space a user has, but also the number of files (inodes) which
a user is permitted to create.

Quotas consist of soft limits and hard limits. If a user's disk usage exceeds the soft limit, a
warning is given on login, but the user can still create files. If disk usage continues to increase,
the hard limit is the point beyond which the user will not be able to use any more disk space, at
least until the usage is reduced so that it is sufficiently below the hard limit once more.

Like most system services, how to setup quotas is explained fully in the relevant online book,
"IRIX Admin: Disks and Filesystems". What follows is a brief summary of how quotas are setup
under IRIX. Of more interest to an admin are the issues which surround quota management -
these are discussed shortly.

To activate quotas on a file system, an extra option is added to the relevant entry in the /etc/fstab
file so that the desired file system is set to have quotas imposed on all users whose accounts
reside on that file system. For example, without quotas imposed, the relevant entry in yoda's
/etc/fstab file looks like this:

/dev/dsk/dks4d5s7 /home xfs rw 0 0

With quotas imposed, this entry is altered to be:

/dev/dsk/dks4d5s7 /home xfs rw,quota 0 0

Next, the quotaon command is used to activate quotas on the root file system. A reboot causes
the quota software to automatically detect that quotas should be imposed on /home and so the
quota system is turned on for that file system.

The repquota command is used to display quota statistics for each user. The edquota command is
used to change quota values for a single user, or multiple users at once. With the -i option,
edquota can also read in quota information from a file, allowing an admin to set quota limits for
many users with a single command. With the -e option, repquota can output the current quota
statistics to a file in a format that is suitable for use with edquota's -i option.
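
Typical hedged examples follow (the user names are invented, and the exact argument forms
should be checked against the man pages):

repquota /home                      # report usage and limits for every user on /home
edquota fred                        # edit one user's limits interactively (uses $EDITOR)
edquota -p fred jim sheila          # copy fred's limits to jim and sheila
repquota -e /home > /var/tmp/qdump  # dump current limits in a form edquota can re-read
edquota -i /var/tmp/qdump           # reapply the (possibly edited) limits in bulk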

Note: the editor used by edquota is vi by default, but an admin can change this by setting an
environment variable called 'EDITOR', eg.:

setenv EDITOR jot -f


The -f option forces jot to run in the foreground. This is necessary because the editor used by
edquota must run in the foreground, otherwise edquota will simply see an empty file instead of
quota data.

Ordinary users cannot change quota limits.

Quota Management Issues.

Most users do not like disk quotas. They are perceived as the information equivalent of a
straitjacket. However, quotas are usually necessary in order to keep disk usage to a sensible level
and to maintain a fair usage amongst all users.

As a result, the most important decision an admin must make regarding quotas is what limit to
actually set for users, either as a whole or individually.

The key to amicable relations between an admin and users is flexibility, eg. start with a small to
moderate limit for all (eg. 20MB). If individuals then need more space, and they have good
reason to ask, then an admin should increase the user's quota (assuming space is available).

Exactly what quota to set in the first instance can be decided by any sensible/reasonable schema.
This is the methodology I originally adopted:

 The user disk is 4GB. I don't expect to ever have more than 100 users, so I set the initial
quota to 40MB each.

In practice, as expected, some users need more, but most do not. Thus, erring on the side of
caution while also being flexible is probably the best approach.

Today, because the SGI network has a system with a ZIP drive attached, and the SGIs offer
reliable Internet access to the WWW, many students use the Ve24 machines solely for
downloading data they need, copying or moving the data onto ZIP for final transfer to their PC
accounts, or to a machine at home. Since the ZIP drive is a 100MB device, I altered the quotas to
50MB each, but am happy to change that to 100MB if anyone needs it (this allows for a
complete ZIP image to be downloaded if required), ie. I am tailoring quota limits based on a
specific hardware-related user service issue.

If a user exceeds their quota, warnings are given. If they ask for more disk space, an admin
would normally enquire as to whether the user genuinely needs more space, eg.:

 Does the user have unnecessary files lying around in their home directory somewhere?
For example, movie files from the Internet, unwanted mail files, games files, object files
or core dump files left over from application development, media files created by
'playing' with system tools (eg. the digital camera). What about their Netscape cache?
Has it been set to too high a value? Do they have hidden files they're not aware of, eg.
.capture.tmp.* directories, capture.mv files, etc.? Can the user employ compression
methods to save space? (gzip, pack, compress)

If a user has removed all unnecessary files, but is still short of space, then unless there is some
special reason for not increasing their quota, an admin should provide more space. Exceptions
could include, for example, a system which has a genuine overall shortage of storage space. In
such a situation, it is common for an admin to ask users to compress their files if possible, using
the 'gzip', 'compress' or 'pack' commands. Users can use tar to create archives of many files prior
to compression. There is a danger with asking users to compress files though: eventually, extra
storage has to be purchased; once it has been, many users start uncompressing many of the files
they earlier compressed. To counter this effect, any increase in storage space being considered
should be large, say an order of magnitude, or at the very least a factor of 3 or higher (I'm a firm
believer in future-proofing).

Note that the find command can be used to locate files which are above a certain size, eg. those
that are particularly large or in unexpected places. Users can use the du command to examine
how much space their own directories and files are consuming.
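
For example (the user name and thresholds are purely illustrative):

find /home -size +5000000c -print                  # files larger than roughly 5MB ('c' means bytes)
du -k /home/students/fred | sort -rn | head -20    # fred's twenty largest directories, in KB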

Note: if a user exceeds their hard quota limit whilst in the middle of a write operation such as
using an editor, the user will find it impossible to save their work. Unfortunately, quitting the
editor at that point will lose the contents of the file because the editor will have opened a file for
writing already, ie. the opened file will have zero contents. The man page for quotas describes
the problem along with possible solutions that a user can employ:

"In most cases, the only way for a user to recover from over-quota conditions is to abort
whatever activity is in progress on the filesystem that has reached its limit, remove
sufficient files to bring the limit back below quota, and retry the failed program.

However, if a user is in the editor and a write fails because of an over quota situation, that
is not a suitable course of action. It is most likely that initially attempting to write the file
has truncated its previous contents, so if the editor is aborted without correctly writing the
file, not only are the recent changes lost, but possibly much, or even all, of the contents
that previously existed.

There are several possible safe exits for a user caught in this situation. He can use the
editor ! shell escape command (for vi only) to examine his file space and remove surplus
files. Alternatively, using csh, he can suspend the editor, remove some files, then resume
it. A third possibility is to write the file to some other filesystem (perhaps to a file on
/tmp) where the user's quota has not been exceeded. Then after rectifying the quota
situation, the file can be moved back to the filesystem it belongs on."

It is important that users be made aware of these issues if quotas are installed. This is also
another reason why I constantly remind users that they can use /tmp and /var/tmp for temporary
tasks. One machine in Ve24 (Wolfen) has an extra 549MB disk available which any user can
write to, just in case a particularly complex task requiring a lot of disk space must be carried out,
eg. movie file processing.
Naturally, an admin can write scripts of various kinds to monitor disk usage in detailed ways, eg.
regularly identify the heaviest consumers of disk resources; one could place the results into a
regularly updated file for everyone to see, ie. a publicly readable "name and shame" policy (not a
method I'd use unless absolutely necessary, eg. when individual users are abusing the available
space for downloading game files).

UNIX Fundamentals: Installing/removing internal/external hardware.

As explained in this course's introduction to UNIX, the traditional hardware platforms which run
UNIX OSs have a legacy of top-down integrated design because of the needs of the market areas
the systems are sold into.

Because of this legacy, much of the toil normally associated with hardware modifications is
removed. To a great extent, an admin can change the hardware internals of a machine without
ever having to be concerned with system setup files. Most importantly, low-level issues akin to
IRQ settings in PCs are totally irrelevant with traditional UNIX hardware platforms. By
traditional I mean the long line of RISC-based systems from the various UNIX vendors such as
Sun, IBM, SGI, HP, DEC and even Intel. This ease of use does not of course apply to ordinary
PCs running those versions of UNIX which can be used with PCs, eg. Linux, OpenBSD,
FreeBSD, etc.; for this category of system, the OS issues will be simpler (presumably), but the
presence of a bottom-up-designed PC hardware platform presents the usual problems of
compatible components, device settings, and other irritating low-level issues.

This discussion uses the SGI Indy as an example system. If circumstances allow, a more up-to-
date example using the O2 system will also be briefly demonstrated in the practical session.
Hardware from other UNIX vendors will likely be similar in terms of ease-of-access and
modification, though it has to be said that SGI has been an innovator in this area of design.

Many system components can be added to, or removed from a machine, or swapped between
machines, without an admin having to change system setup files in order to make the system run
smoothly after any alterations. Relevant components include:

 Memory units,
 Disk drives (both internal and external),
 Video or graphics boards that do not alter how the system would handle relevant
processing operations.
 CPU subsystems which use the same instruction set and hardware-level initialisation
libraries as are already installed.
 Removable storage devices, eg. ZIP, JAZ, Floptical, SyQuest, CDROM, DVD (where an
OS is said to support it), DAT, DLT, QIC, etc.
 Any option board which does not impact on any aspect of existing system operation not
related to the option board itself, eg. video capture, network expansion (Ethernet, HiPPI,
TokenRing, etc.), SCSI expansion, PCI expansion, etc.
Further, the physical layout means the admin does not have to fiddle with numerous cables and
wires. The only cables present in Indy are the two short power supply cables, and the internal
SCSI device ribbon cable with its associated power cord. No cables are present for graphics
boards, video options, or other possible expansion cards. Some years after the release of the
Indy, SGI's O2 design allows one to perform all these sorts of component changes without
having to fiddle with any cables or screws at all (the only exception being any PCI expansion,
which most O2 users will probably never use anyway).

This integrated approach is certainly true of Indy. The degree to which such an ethos applies to
other specific UNIX hardware platforms will vary from system to system. I should imagine
systems such as Sun's Ultra 5, Ultra 10 and other Ultra-series workstations are constructed in a
similar way.

One might expect that any system could have a single important component replaced without
affecting system operation to any great degree, even though this is usually not the case with PCs,
but it may come as a far greater surprise that an entire set of major internal items can be changed
or swapped from one system to another without having to alter configuration files at all.

Even when setup files do have to be changed, the actual task normally only involves either a
simple reinstall of certain key OS software sub-units (the relevant items will be listed in
accompanying documentation and release notes), or the installation of some additional software
to support any new hardware-level system features. In some cases, a hardware alteration might
require a software modification to be made from miniroot if the software concerned was of a
type involved in normal system operation, eg. display-related graphics libraries which controlled
how the display was handled given the presence of a particular graphics board revision.

The main effect of this flexible approach is that an admin has much greater freedom to:

 modify systems as required, perhaps on a daily basis (eg. the way my external disk is
attached to and removed from the admin machine every single working day),
 experiment with hardware configurations, eg. performance analysis (a field I have
extensively studied with SGIs [2]),
 configure temporary setups for various reasons (eg. demonstration systems for visiting
clients),
 effect maintenance and repairs, eg. cleaning, replacing a power supply, etc.

All this without the need for time-consuming software changes, or the irritating necessity to
consult PC-targeted advice guides about devices (eg. ZIP) before changes are made.

Knowing the scope of this flexibility with respect to a system will allow an admin to plan tasks
in a more efficient manner, resulting in better management of available time.

An example of the above with respect to the SGI Indy would be as follows (this is an imaginary
demonstration of how the above concepts could be applied in real-life):
 An extensive component swap between two Indys, plus new hardware installed.

Background information:

CPUs.

All SGIs use a design method which involves supplying a CPU and any necessary secondary
cache plus interface ASICs on a 'daughterboard', or 'daughtercard'. Thus, replacing a CPU merely
involves changing the daughtercard, ie. no fiddling with complex CPU insertion sockets, etc.
Daughtercards in desktop systems can be replaced in seconds, certainly no more than a minute or
two.

The various CPUs available for Indy can be divided into two categories: those which support
everything up to and including the MIPS III instruction set, and those which support all these
plus the MIPS IV instruction set.

The R4000, R4600 and R4400 CPUs all use MIPS III and are initialised on bootup with the same
low-level data files, ie. the files stored in /var/sysgen. This covers the following CPUs:

100MHz R4000PC (no L2)
100MHz R4000SC (1MB L2)
100MHz R4600PC (no L2)
133MHz R4600PC (no L2)
133MHz R4600SC (512K L2)
100MHz R4400SC (1MB L2)
150MHz R4400SC (1MB L2)
175MHz R4400SC (1MB L2)
200MHz R4400SC (1MB L2)

Thus, two Indys with any of the above CPUs can have their CPUs swapped without having to
alter system software.

Similarly, the MIPS IV CPUs:

150MHz R5000PC (no L2)
150MHz R5000SC (512K L2)
180MHz R5000SC (512K L2)

can be treated as interchangeable between systems in the same way.

The difference between an Indy which uses a newer vs. older CPU is that the newer CPUs
require a more up-to-date version of the system PROM chip to be installed on the motherboard (a
customer who orders an upgrade is supplied with the newer PROM if required).

Video/Graphics Boards.

Indy can have three different boards which control display output:
8bit XL
24bit XL
24bit XZ

8bit and 24bit XL are designed for 2D applications. They are identical except for the addition of
more VRAM to the 24bit version. XZ is designed for 3D graphics and so requires a slightly
different set of graphics libraries to be installed in order to permit proper use.
Thus, with respect to the XL version, an 8bit XL card can be swapped with a 24bit XL card with
no need to alter system software.

Indy can have two other video options:

 IndyVideo (provides video output ports as well as extra input ports),
 CosmoCompress (hardware-accelerated MJPEG video capture board).

IndyVideo does not require the installation of any extra software in order to be used.
CosmoCompress does require some additional software to be installed (CosmoCompress
compression API and libraries). Thus, IndyVideo could be installed without any post-installation
software changes. swmgr can be used to install the CosmoCompress software after the option
card has been installed.

Removable Media Devices.

As stated earlier, no software modifications are required, unless specifically stated by the vendor.
Once a device has its SCSI ID set appropriately and is installed, it is recognised automatically and
a relevant icon placed on the desktop for users to exploit. Some devices may require a group of
DIP switches to be configured on the outside of the device, but that is all (settings to use for a
particular system will be found in the supplied device manual). The first time I used a DDS3
DAT drive (Sony SDT9000) with an Indy, the only setup required was to set four DIP switches
on the underside of the DAT unit to positions appropriate for use with an SGI (as detailed on the
first page of the DAT manual). Connecting the DAT unit to the Indy, booting up and logging in,
the DAT was immediately usable (icon available, etc.) No setup files, no software to install, etc.
The first time I used a 32X CDROM (Toshiba CD-XM-6201B) not even DIP switches had to be
set.

System Disks, Extra Disks.

Again, installed disks are detected automatically and the relevant device files in /dev initialised
to be treated as the communication points with the devices concerned. After bootup, the fx, mkfs
and mount commands can be used to configure and mount new disks, while disks which already
have a valid file system installed can be mounted immediately. GUI tools are available for
performing these actions too.
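As a rough sketch, bringing a new option disk into service might involve the following (the
device name dks0d2s7 assumes a disk on SCSI controller 0 at ID 2, and the mount point is
invented; fx would only be needed if the disk had to be repartitioned first):

mkfs /dev/dsk/dks0d2s7                # create a new XFS filesystem on the option partition
mkdir /disk2                          # create a mount point
mount /dev/dsk/dks0d2s7 /disk2        # mount the new filesystem

# For a permanent mount, an equivalent line can be added to /etc/fstab, eg.:
# /dev/dsk/dks0d2s7  /disk2  xfs  rw,raw=/dev/rdsk/dks0d2s7  0 0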

Thus, consider two Indys:


System A                  System B

200MHz R4400SC            100MHz R4600PC
24bit XL                  8bit XL
128MB RAM                 64MB RAM
2GB disk                  1GB disk
IRIX 6.2                  IRIX 6.2

Suppose an important company visitor is expected the next morning at 11am and the admin is
asked to quickly prepare a decent demonstration machine, using a budget provided by the
visiting company to cover any changes required (as a gift, any changes can be permanent).

The admin orders the following extra items for next-day delivery:

 A new 4GB SCSI disk (Seagate Barracuda 7200rpm)
 IndyVideo board
 Floptical drive
 ZIP drive
 32X Toshiba CDROM (external)
 DDS3 Sony DAT drive (external)

The admin decides to make the following changes (Steps 1 and 2 are carried out immediately; in
order to properly support the ZIP drive, the admin needs to use IRIX 6.5 on B. The support
contract means the CDs are already available.):

1. Swap the main CPU, graphics board and memory components between systems A and B.
2. Remove the 1GB disk from System B and install it as an option disk in System A. The
admin uses fx and mkfs to redefine the 1GB disk as an option drive, deciding to use the
disk for a local /usr/share partition (freeing up perhaps 400MB of space from System A's
2GB disk).
3. The order arrives the next morning at 9am (UNIX vendors usually use couriers such as
Fedex and DHL, so deliveries are normally very reliable). The 4GB disk is installed into
System B (empty at this point) and the CDROM connected to the external SCSI port
(SCSI ID 3). The admin then installs IRIX 6.5 onto the 4GB disk, a process which takes
approximately 45 minutes. The system is powered down ready for the final hardware
changes.
4. The IndyVideo board is installed in System B (sits on top of the 24bit XL board, 2 or 3
screws involved, no cables), along with the internal Floptical drive above the 4GB disk
(SCSI ID set to 2). The DAT drive (SCSI ID set to 4) is daisy chained to the external
CDROM. The ZIP drive is daisy chained to the DAT (SCSI ID 5 by default selector,
terminator enabled). This can all be done in less than five minutes.
5. The system is rebooted, the admin logs in as root. All devices are recognised
automatically and icons for each device (ZIP, CDROM, DAT, Floptical) are immediately
present on the desktop and available for use. Final additional software installations can
begin, ready for the visitor's arrival. An hour should be plenty of time to install specific
application(s) or libraries that might be required for the visit.
I am confident that steps 1 and 2 could be completed in less than 15 minutes. Steps 3, 4 and 5
could be completed in little more than an hour. Throughout the entire process, no OS or
software changes have to be made to either System A, or to the 6.5 OS installed on System B's
new 4GB disk after initial installation (ie. the ZIP, DAT and Floptical were not attached to System B
when the OS was installed, but they are correctly recognised by the default 6.5 OS when the
devices are added afterwards).

If time permits and interest is sufficient, almost all of this example can be demonstrated live (the
exception is the IndyVideo board; such a board is not available for use with the Ve24 system at
the moment).

How does the above matter from an admin's point of view? The answer is confidence and lack of
stress. I could tackle a situation such as described here in full confidence that I would not have to
deal with any matters concerning device drivers, interrupt addresses, system file modifications,
etc. Plus, I can be sure the components will work perfectly with one another, constructed as they
are as part of an integrated system design. In short, this integrated approach to system design
makes the admin's life substantially easier.

The Visit is Over.

Afterwards, the visitor donates funds for a CosmoCompress board and an XZ board set. Ordered
that day, the boards arrive the next morning. The admin installs the CosmoCompress board into
System B (2 or 3 more screws and that's it). Upon bootup, the admin installs the
CosmoCompress software from the supplied CD with swmgr. With no further system changes,
all the existing supplied software tools (eg. MediaRecorder) can immediately utilise the new
hardware compression board.

The 8bit XL board is removed from System A and replaced with the XZ board set. Using inst
accessed via miniroot, the admin reinstalls the OS graphics libraries so that the appropriate
libraries are available to exploit the new board. After rebooting the system, all existing software
written in OpenGL automatically runs ten times faster than before, without modification.

Summary.

Read available online books and manual pages on general hardware concepts thoroughly.

Get to know the system - every machine will either have its own printed hardware guide, or an
equivalent online book.

Practice hardware changes before they are required for real.

Consult any Internet-based information sources, especially newsgroup posts, 3rd-party web sites
and hardware-related FAQ files.
When performing installations, follow all recommended procedures, eg. use an anti-static strap
to eliminate the risk of static discharge damaging system components (especially important for
handling memory items, but also just as relevant to any other device).

Construct a hardware maintenance strategy for cleansing and system checking, eg. examine all
mice on a regular basis to ensure they are dirt-free, use an air duster once a month to clear away
accumulated dust and grime, clean the keyboards every two months, etc.

Be flexible. System management policies are rarely static, eg. a sudden change in the frequency
of use of a system might mean cleansing tasks need to be performed more often, eg. cleaning
monitor screens.

If you're not sure what the consequences of an action might be, call the vendor's hardware
support service and ask for advice. Questions can be extremely detailed if need be - this kind of
support is what such support services are paid to offer, so make good use of them.

Before making any change to a system, whether hardware or software, inform users if possible.
This is probably more relevant to software changes (eg. if a machine needs to be rebooted, use
'wall' to notify any users logged onto the machine at the time, ie. give them time to log off; if
they don't, go and see why they haven't), but giving advance notice is still advisable for hardware
changes too, eg. if a system is being taken away for cleaning and reinstallation, a user may want
to retrieve files from /var/tmp prior to the system's removal, so place a notice up a day or so
beforehand if possible.

References:

1. "Storage for the network", Network Week, Vol4 No.31, 28th April 1999, pp. 25 to 29, by
Marshall Breeding.
2. SGI General Performance Comparisons: http://www.futuretech.vuurwerk.nl/perfcomp.html
Detailed Notes for Day 3 (Part 3)
UNIX Fundamentals: Typical system administration tasks.

Even though the core features of a UNIX OS are handled automatically, there are still some jobs
for an admin to do. Some examples are given here, but not all will be relevant for a particular
network or system configuration.

Data Backup.

A fundamental aspect of managing any computer system, UNIX or otherwise, is the backup of
user and system data for possible retrieval purposes in the case of system failure, data corruption,
etc. Users depend on the admin to recover files that have been accidentally erased, or lost due to
hardware problems.

Backup Media.

Backup devices may be locally connected to a system, or remotely accessible across a network.
Typical backup media types include:

 1/4" cartridge tape, 8mm cartridge tape (used infrequently today)


 DAT (very common)
 DLT (where lots of data must be archived)
 Floptical, ZIP, JAZ, SyQuest (common for user-level backups)

Backup tapes, disks and other media should be well looked after in a secure location [3].

Backup Tools.

Software tools for archiving data include low-level format-independent tools such as dd, file and
directory oriented tools such as tar and cpio, filesystem-oriented tools such as bru, standard
UNIX utilities such as dump and restore (cannot be used with XFS filesystems - use xfsdump
and xfsrestore instead), etc., and high-level tools (normally commercial packages) such as IRIS
NetWorker. Some tools include a GUI frontend interface.

The most commonly used program is tar, which is also widely used for the distribution of
shareware and freeware software. Tar allows one to gather together a number of files and
directories into a single 'tar archive' file which by convention should always have a '.tar' suffix.
By specifying a device such as a DAT instead of an archive file, tar can thus be used to archive
data directly to a backup medium.

Tar files can also be compressed, usually with the .gz format (gzip and gunzip) though there are
other compression utilities (compress, pack, etc.) Backup and restoration speed can be improved
by compressing files before any archiving process commences. Some backup devices have built-
in hardware compression abilities. Note that files such as MPEG movies and JPEG images are
already in a compressed format, so compressing these prior to backup is pointless.
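For example, a tar archive of a user directory could be compressed before being written to tape
(the paths are illustrative):

tar cvf /var/tmp/pub.tar /home/pub    # gather the directory into a single archive file
gzip -9 /var/tmp/pub.tar              # compress it, producing /var/tmp/pub.tar.gz
gunzip /var/tmp/pub.tar.gz            # decompress again before extraction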

Straightforward networks and systems will almost always use a DAT drive as the backup device
and tar as the software tool. Typically, the 'cron' job scheduling system is used to execute a
backup at regular intervals, usually overnight. Cron is discussed in more detail below.

Backup Strategy.

Every UNIX guide will recommend the adoption of a 'backup strategy', ie. a combination of
hardware and software related management methods determined to be the most suitable for the
site in question.

A backup strategy should be rigidly adhered to once in place. Strict adherence allows an admin
to reliably assess whether lost or damaged data is recoverable when a problem arises.

Exactly how an admin performs backups depends upon the specifics of the site in question.
Regardless of the chosen strategy, at least two full sets of reasonably current backups should
always be maintained. Users should also be encouraged to make their own backups, especially
with respect to files which are changed and updated often.

What/When to Backup.

How often a backup is made depends on the system's frequency of use. For a system like the
Ve24 SGI network, a complete backup of user data every night, plus a backup of the server's
system disk once a week, is fairly typical. However, if a staff member decided to begin important
research with commercial implications on the system, I might decide that an additional backup at
noon each day should also be performed, or even hourly backups of just that person's account.

Usually, a backup archives all user or system data, but this may not be appropriate for some sites.
For example, an artist or animator may only care about their actual project files in their ~/Maya
project directory (Maya is a professional Animation/Rendering package) rather than the files
which define their user environment, etc. Thus, an admin might decide to only backup every
users' Maya projects directory. This would, for example, have the useful side effect of excluding
data such as the many files present in a user's .netscape/cache directory. In general though, all of
a user's account is archived.

If a change is to be made to a system, especially a server change, then separate backups should
be performed before and after the change, just in case anything goes wrong.

Since root file systems do not change very much, they can be backed up less frequently, eg. once
per week. An exception might be if the admin wishes to keep a reliable record of system access
logs which are part of the root file system, eg. those located in the files (for example):
/var/adm/SYSLOG
/var/netscape/suitespot/proxy-sysname-proxy/logs

The latter of the two would be relevant if a system had a Proxy server installed, ie. 'sysname'
would be the host name of the system. Backing up /usr and /var instead of the entire / root
directory is another option - the contents of /usr and /var change more often than many other
areas of the overall file system, eg. users' mail is stored in /var/mail and most executable
programs are under /usr.

In some cases, it isn't necessary to backup an entire root filesystem anyway. For example, the
Indys in Ve24 all have more or less identical installations: all Indys with a 549MB disk have the
same disk contents as each other, likewise for those with 2GB disks. The only exception is
Wolfen which uses IRIX 6.5 in order to provide proper support for an attached ZIP drive. Thus, a
backup of one of the client Indys need only concern specific key files such as /etc/hosts,
/etc/sys_id, /var/flexlm/license.dat, etc. However, this policy may not work too well for servers
(or even clients) because:

 an apparently small change, eg. adding a new user, installing a software patch, can affect
many files,
 the use of GUI-based backup tools does not aid an admin in remembering which files
have been archived.

For this reason, most admins will use tar, or a higher-level tool like xfsdump.
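For instance, a full (level 0) dump of the root filesystem to a locally attached tape, and the
corresponding restore, might look like this (a sketch only; consult the xfsdump and xfsrestore
man pages for the full option sets):

xfsdump -l 0 -f /dev/tape /           # level 0 (full) dump of / to the default tape device
xfsrestore -f /dev/tape /             # restore the dump from tape back into /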

Note that because restoring data from a DAT device is slower than copying data directly from
disk to disk (especially modern UltraSCSI disks), an easier way to restore a client's system disk -
where all clients have identical disk contents - is to clone the disk from another client and then
alter the relevant files; this is what I do if a problem occurs.

Other backup devices can be much faster though [1], eg. DLT9000 tape streamer, or
military/industrial grade devices such as the DCRsi 240 Digital Cartridge Recording System
(30MB/sec) as was used to backup data during the development of the 777 aircraft, or the
Ampex DIS 820i Automated Cartridge Library (scalable from 25GB to 6.4TB max capacity,
80MB/sec sustained record rate, 800MB/sec search/read rate, 30 seconds maximum search time
for any file), or just a simple RAID backup which some sites may choose to use.

It's unusual to use another disk as a backup medium, but not unheard of. Theoretically, it's the
fastest possible backup medium, so if there's a spare disk available, why not? Some sites may
even have a 'mirror' system whereby a backup server B copies exactly the changes made to an
identical file system on the main server A; in the event of serious failure, server B can take over
immediately. SGI's commercial product for this is called IRIS FailSafe, with a switchover time
between A and B of less than a millisecond. Fail-safe server configurations like this are the
ultimate form of backup, ie. all files are being backed up in real-time, and the support hardware
has a backup too. Any safety-critical installation will probably use such methods.

Special power supplies might be important too, eg. a UPS (Uninterruptible Power Supply) which
gives some additional power for a few minutes to an hour or more after a power failure and
notifies the system to facilitate a safe shutdown, or a dedicated backup power generator could be
used, eg. hospitals, police/fire/ambulance, airtraffic control, etc.

Note: systems managed by more than one admin should be backed up more often; admin policies
should be consistent.

Incremental Backup.

This method involves only backing up files which have changed since the previous backup,
based on a particular schedule. An incremental schema offers the same degree of 'protection' as
an entire system backup and is faster since fewer files are archived each time, which means
faster restoration time too (fewer files to search through on a tape).

An example schedule is given in the online book, "IRIX Admin: Backup, Security, and
Accounting':

"An incremental scheme for a particular filesystem
looks something like this:

1. On the first day, back up the entire filesystem. This is a monthly backup.

2. On the second through seventh days, back up only the files that changed from the
previous day. These are daily backups.

3. On the eighth day, back up all the files that changed the previous week. This is a
weekly backup.

4. Repeat steps 2 and 3 for four weeks (about one month).

5. After four weeks (about a month), start over, repeating steps 1 through 4.

You can recycle daily tapes every month, or whenever you feel safe
about doing so. You can keep the weekly tapes for a few months.
You should keep the monthly tapes for about one year before
recycling them."
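A sketch of how such a schema might be implemented with the tools already mentioned (paths
are illustrative; xfsdump's numeric dump levels map naturally onto monthly, weekly and daily
dumps):

xfsdump -l 0 -f /dev/tape /home       # monthly full dump (level 0)
xfsdump -l 1 -f /dev/tape /home       # weekly dump of changes since the last level 0
xfsdump -l 2 -f /dev/tape /home       # daily dump of changes since the last level 1

Alternatively, find and cpio can be combined to archive only the files changed in the last 24 hours:

find /home -type f -mtime -1 -print | cpio -o > /dev/tape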

Backup Using a Network Device.

It is possible to archive data to a remote backup medium by specifying the remote host name
along with the device name. For example, an ordinary backup to a locally attached DAT might
look like this:

tar cvf /dev/tape /home/pub

Or, if the default tape device is the only relevant device present, the f option and device name
can be omitted:

tar cv /home/pub

For a remote device, simply add the remote host name before the file/directory path:

tar cvf yoda:/dev/tape /home/pub

Note that if the tar command is trying to access a backup device which is not made by the source
vendor, then '/dev/tape' may not work. In such cases, an admin would have to use a suitable
lower-level device file, ie. one of the files in /dev/rmt - exactly which one can be determined by
deciding on the required functionality of the device, as explained in the relevant device manual,
along with the SCSI controller ID and SCSI device ID.

Sometimes a particular user account name may have to be supplied when accessing a remote
device, eg.:

tar cvf guest@yoda:/dev/tape /home/pub

This example wouldn't actually work on the Ve24 network since all guest accounts are locked
out for security reasons, except on Wolfen. However, an equivalent use of the above syntax can
be demonstrated using Wolfen's ZIP drive and the rcp (remote copy) command:

rcp -r /home/pub guest.guest1@wolfen:/zip

Though note that the above use of rcp would not retain file time/date creation/modification
information when copying the files to the ZIP disk (tar retains all information).

Automatic Backup With Cron.

The job scheduling system called cron can be used to automatically perform backups, eg.
overnight. However, such a method should not be relied upon - nothing is better than someone
manually executing/observing a backup, ensuring that the procedure worked properly, and
correctly labelling the tape afterwards.

If cron is used, a typical entry in the root cron jobs schedule file (/var/spool/cron/crontabs/root)
might look like this:

0 3 * * * /sbin/tar cf /dev/tape /home

This would execute a backup to a locally attached backup device at 3am every morning. Of
course, the admin would have to ensure a suitable media was loaded before leaving at the end of
each day.

This is a case where the '&&' operator can be useful: in order to ensure no subsequent operation
could alter the backed-up data, the 'eject' command could be employed thus:

0 3 * * * /sbin/tar cf /dev/tape /home && eject /dev/tape


Only after the tar command has finished will the backup media be ejected. Notice there is no 'v'
option in these tar commands (verbose mode). Why bother? Nobody will be around to see the
output. However, an admin could modify the command to record the output for later reading:

0 3 * * * /sbin/tar cvf /dev/tape /home > /var/tmp/tarlog && eject /dev/tape

Caring for Backup Media.

This is important, especially when an admin is responsible for backing up commercially
valuable, sensitive or confidential data.

Any admin will be familiar with the usual common-sense aspects of caring for any storage
medium, eg. keeping media away from strong magnetic fields, extremes of temperature and
humidity, etc., but there are many other factors too. The "IRIX Admin: Backup, Security, and
Accounting' guide contains a good summary of all relevant issues:

"Storage of Backups

Store your backup tapes carefully. Even if you create backups on more durable media,
such as optical disks, take care not to abuse them. Set the write protect switch on tapes
you plan to store as soon as a tape is written, but remember to unset it when you are ready
to overwrite a previously-used tape.

Do not subject backups to extremes of temperature and humidity, and keep tapes away
from strong electromagnetic fields. If there are a large number of workstations at your
site, you may wish to devote a special room to storing backups.

Store magnetic tapes, including 1/4 in. and 8 mm cartridges, upright. Do not store tapes
on their sides, as this can deform the tape material and cause the tapes to read incorrectly.

Make sure the media is clearly labeled and, if applicable, write-protected. Choose a label-
color scheme to identify such aspects of the backup as what system it is from, what level
of backup (complete versus partial), what filesystem, and so forth.

To minimize the impact of a disaster at your site, such as a fire, you may want to store
main copies of backups in a different building from the actual workstations. You have to
balance this practice, though, with the need to have backups handy for recovering files.

If backups contain sensitive data, take the appropriate security precautions, such as
placing them in a locked, secure room. Anyone can read a backup tape on a system that
has the appropriate utilities.
How Long to Keep Backups

You can keep backups as long as you think you need to. In practice, few sites keep
system backup tapes longer than about a year before recycling the tape for new backups.
Usually, data for specific purposes and projects is backed up at specific project
milestones (for example, when a project is started or finished).

As site administrator, you should consult with your users to determine how long to keep
filesystem backups.

With magnetic tapes, however, there are certain physical limitations. Tape gradually loses
its flux (magnetism) over time. After about two years, tape can start to lose data.

For long-term storage, re-copy magnetic tapes every year to year-and-a-half to prevent
data loss through deterioration. When possible, use checksum programs, such as the
sum(1) utility, to make sure data hasn't deteriorated or altered in the copying process. If
you want to reliably store data for several years, consider using optical disk.

Guidelines for Tape Reuse

You can reuse tapes, but with wear, the quality of a tape degrades. The more important
the data, the more precautions you should take, including using new tapes.

If a tape goes bad, mark it as "bad" and discard it. Write "bad" on the tape case before
you throw it out so that someone doesn't accidentally try to use it. Never try to reuse an
obviously bad tape. The cost of a new tape is minimal compared to the value of the data
you are storing on it."

Backup Performance.

Sometimes data archive/extraction speed may be important, eg. a system critical to a commercial
operation fails and needs restoring, or a backup/archive must be made before a deadline.

In these situations, it is highly advisable to use a fast backup medium, eg. DDS3 DAT instead of
DDS1 DAT.

For example, an earlier lecture described a situation where a fault in the Ve24 hub caused
unnecessary fault-hunting. As part of that process, I restored the server's system disk from a
backup tape. At the time, the backup device was a DDS1 DAT. Thus, to restore some 1.6GB of
data from a standard 2GB capacity DAT tape, I had to wait approximately six hours for the
restoration to complete (since the system was needed the next morning, I stayed behind well into
the night to complete the operation).
The next day, it was clear that using a DDS1 was highly inefficient and time-wasting, so a DDS3
DAT was purchased immediately. Thus, if the server ever has to be restored from DAT again,
and despite the fact it now has a larger disk (4GB with 2.5GB of data typically present), even a
full restoration would only take three hours instead of six (with 2.5GB used, the restoration
would finish in less than two hours). Tip: as explained in the lecture on hardware modifications
and installations, consider swapping a faster CPU into a system in order to speedup a backup or
restoration operation - it can make a significant difference [2].

Hints and Tips.

 Keep tape drives clean. Newer tapes deposit more dirt than old ones.
 Use du and df to check that a media will have enough space to store the data. Consider
using data compression options if space on the media is at a premium (some devices may
have extra device files which include a 'c' in the device name to indicate it supports
hardware compression/decompression, eg. a DLT drive whose raw device file is
/dev/rmt/tps0d5vc). There is no point using compression options if the data being
archived is already compressed with pack, compress, gzip, etc. or is naturally compressed
anyway, eg. an MPEG movie, JPEG image, etc.
 Use good quality media. Do not use ordinary audio DAT tapes with DAT drives for
computer data backup; audio DAT tapes are of a lower quality than DAT tapes intended
for computer data storage.
 Consider using any available commands to check beforehand that a file system to be
backed up is not damaged or corrupted (eg. fsck). This will be more relevant to older file
system types and UNIX versions, eg. fsck is not relevant to XFS filesystems (IRIX 6.x
and later), but may be used with EFS file systems (IRIX 5.3 and earlier). This is less important
when dealing with a small number of items.
 Label all backups, giving full details, eg. date, time, host name, backup command used
(so you or another admin will know how to extract the files later), general contents
description, and your name if the site has more than one admin with responsibility for
backup procedures.
 Verify a backup after it is made; some commands require specific options, while others
provide a means of listing the contents of a media, eg. the -t option used with tar.
 Write-protect a media after a backup has finished.
 Keep a tally on the media of how many times it has been used.
 Consider including an index file at the very start of the backup on the media, eg.:
 ls -AlhFR /home > /home/0000index && tar cv /home

Note: such index files can be large.

 Exploit colour code schemes to denote special attributes, eg. daily vs. weekly vs. monthly
tapes.
 Be aware of any special issues which may be relevant to the type of data being backed
up. For example, movie files can be very large; on SGIs, tar requires the K option in
order to archive files larger than 2GB. Use of this option may mean the archived media is
not compatible with another vendor's version of tar.
 Consult the online guides. Such guides often have a great deal of advice, examples, etc.

tar is a powerful command with a wide range of available options and is used on UNIX systems
worldwide. It is typical of the kind of UNIX command for which an admin is well advised to
read through the entire man page. Other commands in this category include find, rm, etc.

Note: if compatibility between different versions of UNIX is an issue, one can use the lower-
level dd command which allows one to specify more details about how the data is to be dealt
with as it is sent to or received from a backup device, eg. changing the block size of the data. A
related command is 'mt' which can be used to issue specific commands to a magnetic tape device,
eg. print device details and default block size.
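A few illustrative uses of these commands (device and file names as used elsewhere in these notes):

tar tvf /dev/tape                          # list the contents of the archive currently on tape
mt -f /dev/tape status                     # query the drive's status and settings
mt -f /dev/tape rewind                     # rewind the tape
dd if=/dev/tape of=/var/tmp/image bs=64k   # raw copy from tape with an explicit block size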

If problems occur during backup/restore operations, remember to check /var/adm/SYSLOG for
any relevant error messages (useful if one cannot be present to monitor the operation in person).

Restoring Data from Backup Media.

Restoring non-root-filesystem data is trivial: just use the relevant extraction tool, eg.:

tar xvf /dev/tape

However, restoring the root '/' partition usually requires access to an appropriate set of OS CD(s)
and a full system backup tape of the / partition. Further, many OSs may insist that backup and
restore operations at the system level must be performed with a particular tool, eg. Backup and
Restore. If particular tools were required but not used to create the backup, or if the system
cannot boot to a state where normal extraction tools can be used (eg. damage to the /usr section
of the filesystem) then a complete reinstallation of the OS must be done, followed by the
extraction of the backup media on top of the newly created filesystem using the original tool.

Alternatively, a fresh OS install can be done, then a second empty disk inserted on SCSI ID 2,
set up as a root disk, the backup media extracted onto the second disk, then the volume header
copied over using dvhtool or another command relevant to the OS being used (this procedure is
similar to disk cloning). Finally, a quick swap of the disks so that the second disk is on SCSI ID
1 and the system is back to normal. I personally prefer this method since it's "cleaner", ie. one
can never be sure that extracting files on top of an existing file system will result in a final
filesystem that is genuinely identical to the original. By using a second disk in this way, the
psychological uncertainty is removed.

Just like backing up data to a remote device, data can be restored from a remote device as well.
An OS 'system recovery' menu will normally include an option to select such a restoration
method - a full host:/path specification is required.

Note that if a filesystem was archived with a leading / symbol, eg.:

tar cvf /dev/tape /home/pub/movies/misc


then an extraction may fail if an attempt is made to extract the files without changing the
equivalent extraction path, eg. if a student called cmpdw entered the following command with
such a tape while in their home directory:

tar xvf /dev/tape

then the command would fail since students cannot write to the top level of the /home directory.

Thus, the R option can be used (or equivalent option for other commands) to remove leading /
symbols so that files are extracted into the current directory, ie. if cmpdw entered:

tar xvfR /dev/tape

then tar would place the /home data from the tape into cmpdw's home directory, ie. cmpdw
would see a new directory with the name:

/home/students/cmpdw/home

Other Typical Daily Tasks.

From my own experience, these are the types of task which most admins will likely carry out
every day:

 Check disk usage across the system.


 Check system logs for important messages, eg. system errors and warnings, possible
suspected access attempts from remote systems (hackers), suspicious user activity, etc.
This applies to web server logs too (use script processing to ease analysis).
 Check root's email for relevant messages (eg. printers often send error messages to root in
the form of an email).
 Monitor system status, eg. all systems active and accessible (ping).
 Monitor system performance, eg. server load, CPU-hogging processes running in
background that have been left behind by a careless user, packet collision checks,
network bandwidth checks, etc.
 Ensure all necessary system services are operating correctly.
 Tour the facilities for general reasons, eg. food consumed in rooms where such activity is
prohibited, users who have left themselves logged in by mistake, a printer with a paper
jam that nobody bothered to report, etc. Users are notoriously bad at reporting physical
hardware problems - the usual response to a problem is to find an alternative
system/device and let someone else deal with it.
 Dealing with user problems, eg. "Somebody's changed my password!" (ie. the user has
forgotten their password). Admins should be accessible by users, eg. a public email
address, web feedback form, post box by the office, etc. Of course, a user can always
send an email to the root account, or to the admin's personal account, or simply visit the
admin in person. Some systems, like Indy, may have additional abilities, eg. video
conferencing: a user can use the InPerson software to request a live video/audio link to
the admin's system, allowing 2-way communication (see the inperson man page). Other
facilities such as the talk command can also be employed to contact the admin, eg. at a
remote site. It's up to the admin to decide how accessible she/he should be - discourage
trivial interruptions.
 Work on improving any relevant aspect of system, eg. security, services available to users
(software, hardware), system performance tuning, etc.
 Cleaning systems if they're dirty; a user will complain about a dirty monitor screen or
sticking mouse behaviour, but they'll never clean them for you. Best to prevent
complaints via regular maintenance. Consider other problem areas that may be hidden,
eg. blowing loose toner out of a printer with an air duster can.
 Learning more about UNIX in general.
 Taking necessary breaks! A tired admin will make mistakes.

This isn't a complete list, and some admins will doubtless have additional responsibilities, but the
above describes the usual daily events which define the way I manage the Ve24 network.
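As a sketch, a few of these routine checks could be gathered into a short script run each morning
(the client names other than wolfen are invented):

#!/bin/sh
# Quick daily checks (illustrative only)
df -k                                 # disk usage on the local system
tail -40 /var/adm/SYSLOG              # recent system log entries
for host in wolfen client1 client2
do
        if ping -c 1 $host > /dev/null 2>&1
        then
                echo "$host is up"
        else
                echo "$host is NOT responding"
        fi
done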

Useful file: /etc/motd

The contents of this file will be echoed to stdout whenever a user activates a login shell. Thus,
the message will be shown when:

 a user first logs in (contents in all visible shell windows),


 a user accesses another system using commands such as rlogin and telnet,
 a user creates a new console shell window; from the man page for console, "The console
provides the operator interface to the system. The operating system and system utility
programs display error messages on the system console."

The contents of /etc/motd are not displayed when the user creates a new shell using 'xterm', but are
displayed when winterm is used. The means by which xterm/winterm are executed are irrelevant
(icon, command, Toolchest, etc.)

The motd file can be used as a simple way to notify users of any developments. Be careful of
allowing its contents to become out of date though. Also note that the file is local to each system,
so maintaining a consistent motd between systems might be necessary, eg. a script to copy the
server's motd to all clients.
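A minimal sketch of such a script (the client names other than wolfen are invented; rcp requires
the appropriate trust relationships, eg. entries in /etc/hosts.equiv or .rhosts, to be in place):

#!/bin/sh
# Copy the server's message-of-the-day file to each client
for host in wolfen client1 client2
do
        rcp /etc/motd ${host}:/etc/motd
done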

Another possible way to inform users of noteworthy news is the xconfirm command, which could be
included within startup scripts, user setup files, etc. From the xconfirm man page:

"xconfirm displays a line of text for each -t argument specified (or a file when the -file
argument is used), and a button for each -b argument specified. When one of the buttons
is pressed, the label of that button is written to xconfirm's standard output. The enter key
activates the specified default button. This provides a means of communication/feedback
from within shell scripts and a means to display useful information to a user from an
application. Command line options are available to specify geometry, font style, frame
style, modality and one of five different icons to be presented for tailored visual feedback
to the user."

For example, xconfirm could be used to interactively warn the user if their disk quota has been
exceeded.
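Based on the options described in the man page extract above, such a warning might look
something like this (the wording is invented):

xconfirm -t "Your disk quota has been exceeded." \
         -t "Please remove or compress unwanted files." -b OK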

UNIX Fundamentals: System bootup and shutdown, events, daemons.

SGI's IRIX is based on System V with BSD enhancements. As such, the way an IRIX system
boots up is typical of many UNIX systems. Some interesting features of UNIX can be discovered
by investigating how the system starts up and shuts down.

After power on and initial hardware-level checks, the first major process to execute is the UNIX
kernel file /unix, though this doesn't show up in any process list as displayed by commands such
as ps.

The kernel then starts the init program to begin the bootup sequence, ie. init is the first visible
process to run on any UNIX system. One will always observe init with a process ID of 1:

% ps -ef | grep init | grep -v grep
    root     1     0  0 21:01:57 ?        0:00  /etc/init

init is used to activate, or 'spawn', other processes. The /etc/inittab file is used to determine what
processes to spawn.

The lecture on shell scripts introduced the init command, in a situation where a system was made
to reboot using:

init 6

The number is called a 'run level', ie. a software configuration of the system under which only a
selected group of processes exist. Which processes correspond to which run level is defined in
the /etc/inittab file.

A system can be in any one of eight possible run levels: 0 to 6, s and S (the latter two are
identical). The states which most admins will be familiar with are 0 (total shutdown and power
off), 1 (enter system administration mode), 6 (reboot to default state) and S (or s) for 'single-user'
mode, a state commonly used for system administration. The /etc/inittab file contains an
'initdefault' state, ie. the run level to enter by default, which is normally 2, 3 or 4. 2 is the most
common, ie. the full multi-user state with all processes, daemons and services activated.
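The current run level of a system can be checked at any time with the who command, eg.:

who -r          # report the current run level and when it was entered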

The /etc/inittab file is constructed so that any special initialisation operations, such as mounting
filesystems, are executed before users are allowed to access the system.

The init man page has a very detailed description of these first few steps of system bootup. Here
is a summary:
An initial console shell is created with which to begin spawning processes. The fact that a shell is
used this early in the boot cycle is a good indication of how closely related shells are to UNIX in
general.

The scripts which init uses to manage processes are stored in the /etc/init.d directory. During
bootup, the files in /etc/rc2.d are used to bring up system processes in the correct order (the
/etc/rc0.d directory is used for shutdown - more on that later). These files are actually links to the
equivalent script files in /etc/init.d.

The files in /etc/rc2.d (the 2 presumably corresponding to run level 2 by way of a naming
convention) all begin with S followed by two digits (S for 'Spawn' perhaps), causing them to be
executed in a specific order as determined by the first 3 characters of each file name (alphanumeric).
Thus, the first file run in the console shell is /etc/rc2.d/S00announce (a link to /etc/init.d/announce
- use 'more' or load this file into an editor to see what it does). init will run the script with
appropriate arguments depending on whether the procedure being followed is a startup or
shutdown, eg. 'start', 'stop', etc.

The /etc/config directory is used by each script in /etc/init.d to decide what it should do.
/etc/config contains files which correspond to files found in /etc/rc2.d with the same name. These
/etc/config files contain simply 'on' or 'off'. The chkconfig command is used to test the
appropriate file by each script, returning true or false depending on its contents and thus
determining whether the script does anything. An admin uses chkconfig to set the various files'
contents to on or off as desired, eg. to switch a system into stand-alone mode, turn off all
network-related services on the next reboot:

chkconfig network off
chkconfig nfs off
chkconfig yp off
chkconfig named off
init 6

Enter chkconfig on its own to see the current configuration states.

Lower-level functions are performed first, beginning with a SCSI driver check to ensure that the
system disk is going to be accessed correctly. Next, key file systems are mounted. Then the
following steps occur, IF the relevant /etc/config file contains 'on' for any step which depends on
that fact:

 A check to see if any system crash files are present (core dumps) and if so to send a
message to stdout.
 Display company trademark information if present; set the system name.
 Begin system activity reporting daemons.
 Create a new OS kernel if any system changes have been made which require it (this is
done by testing whether or not any of the files in /var/sysgen are newer than the /unix
kernel file).
 Configure and activate network ports.
 etc.
Further services/systems/tasks to be activated if need be include ip-aliasing, system auditing,
web servers, license server daemons, core dump manager, swap file configuration, mail daemon,
removal of /tmp files, printer daemon, higher-level web servers such as Netscape Administration
Server, cron, PPP, device file checks, and various end-user and application daemons such as the
midi sound daemon which controls midi library access requests.

This isn't a complete list, and servers will likely have more items to deal with than clients, eg.
starting up DNS, NIS, security & auditing daemons, quotas, internet routing daemons, and more
than likely a time daemon to serve as a common source of current time for all clients.

It should be clear that the least important services are executed last - these usually concern user-
related or application-related daemons, eg. AppleTalk, Performance Co-Pilot, X Windows
Display Manager, NetWare, etc.

Even though a server or client may initiate many background daemon processes on bootup,
during normal system operation almost all of them are doing nothing at all. A process which isn't
doing anything is said to be 'idle'. Enter:

ps -ef

The 'C' column shows the activity level of each process. No matter when one checks, almost all
the C entries will be zero. UNIX background daemons only use CPU time when they have to, ie.
they remain idle until called for. This allows a process which truly needs CPU cycles to make
maximum use of available CPU time.

The scripts in /etc/init.d may startup other services if necessary as well. Extra
configuration/script files are often found in /etc/config in the form of a file called
servicename.options, where 'servicename' is the name of the normal script run by init.

Note: the 'verbose' file in /etc/config is used by scripts to dynamically redefine whether the echo
command is used to output progress messages. Each script checks whether verbose mode is on
using the chkconfig command; if on, then a variable called $ECHO is set to 'echo'; if off,
$ECHO is set to something which is interpreted by a shell to mean "ignore everything that
follows this symbol", so setting verbose mode to off means every echo command in every script
(which uses the $ECHO test and set procedure) will produce no output at all - a simple, elegant
and clean way of controlling system behaviour.
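A simplified sketch of the idiom (not the exact code used in the real scripts):

if chkconfig verbose; then
        ECHO=echo
else
        ECHO=:          # ':' is the shell's null command - its arguments are ignored
fi

$ECHO "Starting example daemon..."      # produces output only when verbose is on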

When shutting a system down, the behaviour described above is basically just reversed. Scripts
contained in the /etc/rc0.d directory perform the necessary actions, with the name prefixes
determining execution order. Once again, the first three characters of each file name decide the
alphanumeric order in which to execute the scripts; 'K' probably stands for 'Kill'. The files in
/etc/rc0.d shutdown user/application-related daemons first, eg. the MIDI daemon. Comparing the
contents of /etc/rc2.d and /etc/rc0.d, it can be seen that their contents are mirror images of each
other.

The alphanumeric prefixes used for the /etc/rc*.d directories are defined in such a way as to
allow extra scripts to be included in those directories, or rather links to relevant scripts in
/etc/init.d. Thus, a custom 'static route' (to force a client to always route externally via a fixed
route) can be defined by creating new links from /etc/rc2.d/S31network and
/etc/rc0.d/K39network, to a custom file called network.local in /etc/init.d.
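A minimal sketch of what such a network.local script might contain (the gateway address is
purely illustrative):

#!/bin/sh
# /etc/init.d/network.local - add a fixed default route on startup
case "$1" in
start)
        route add default 192.168.1.254
        ;;
stop)
        route delete default 192.168.1.254
        ;;
esac

The links themselves would then be created with:

ln -s /etc/init.d/network.local /etc/rc2.d/S31network
ln -s /etc/init.d/network.local /etc/rc0.d/K39network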

There are many numerical gaps amongst the files, allowing for great expansion in the number of
scripts which can be added in the future.

References:

1. Extreme Technologies: http://www.futuretech.vuurwerk.nl/extreme.html
2. DDS1 vs. DDS3 DAT Performance Tests:
   http://www.futuretech.vuurwerk.nl/perfcomp.html#DAT1
   http://www.futuretech.vuurwerk.nl/perfcomp.html#DAT2
   http://www.futuretech.vuurwerk.nl/perfcomp.html#DAT3
   http://www.futuretech.vuurwerk.nl/perfcomp.html#DAT4
3. "Success With DDS Media", Hewlett Packard, Edition 1, February 1991.
Detailed Notes for Day 3 (Part 4)
UNIX Fundamentals: Security and Access Control.

General Security.

Any computer system must be secure, whether it's connected to the Internet or not. Some issues
may be irrelevant for Intranets (isolated networks which may or may not use Internet-style
technologies), but security is still important for any internal network, if only to protect against
employee grievances or accidental damage. Crucially, a system should not be expanded to
include external network connections until internal security has been dealt with, and individual
systems should not be added to a network until they have been properly configured (unless the
changes are of a type which cannot be made until the system is physically connected).

However, security is not an issue which can ever be finalised; one must constantly maintain an
up-to-date understanding of relevant issues and monitor the system using the various available
tools such as 'last' (display recent logins; there are many other available tools and commands).

In older UNIX variants, security mostly involved configuring the contents of various
system/service setup files. Today, many UNIX OSs offer the admin a GUI-frontend security
manager to deal with security issues in a more structured way. In the case of SGI's IRIX, version
6.5 has such a GUI tool, but 6.2 does not. The GUI tool is really just a convenient way of
gathering together all the relevant issues concerning security in a form that is easier to deal with
(ie. less need to look through man pages, online books, etc.) The security issues themselves are
still the same.

UNIX systems have a number of built-in security features which offer a reasonably acceptable
level of security without the need to install any additional software. UNIX gives users a great
deal of flexibility in how they manage and share their files and data; such convenience may be
incompatible with an ideal site security policy, so decisions often have to be taken about how
secure a system is going to be - the more secure a system is, the less flexible it becomes for
users.

Older versions of any UNIX variant will always be less secure than newer ones. If possible, an
admin should always try to use the latest version in order to obtain the best possible default
security. For example, versions of IRIX as old as 5.3 (circa 1994) had some areas of subtle
system functionality rather open by default (eg. some feature or service turned on), whereas
versions later than 6.0 turned off the features to improve the security of a default installation -
UNIX vendors began making these changes in order to comply with the more rigorous standards
demanded by the Internet age.

Standard UNIX security features include:

1. File ownership,
2. File permissions,
3. System activity monitoring tools, eg. who, ps, log files,
4. Encryption-based, password-protected user accounts,
5. An encryption program (crypt) which any user can exploit.

Figure 60. Standard UNIX security features.

All except the last item above have already been discussed in previous lectures.

The 'crypt' command can be used by the admin and users to encrypt data, using an encryption
key supplied as an argument. Crypt employs an encryption schema based on similar ideas used in
the German 'Enigma' machine in WWII, although crypt's implementation of the mathematical
equivalent is much more complex, like having a much bigger and more sophisticated Enigma
machine. Crypt is a satisfactorily secure program; the man page says, "Methods of attack on such
machines are known, but not widely; moreover the amount of work required is likely to be
large."

However, since crypt requires the key to be supplied as an argument, commands such as ps could
be used by others to observe the command in operation, and hence the key. This is crypt's only
weakness. See the crypt man page for full details on how crypt is used.
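
As an illustration (the key and filenames are invented for the example, and in practice the key
would be visible to anyone running ps at the time), a file can be encrypted and later recovered as
follows; crypt applies the same transformation in both directions, so the same command decrypts:

crypt MySecretKey < report.txt > report.enc
rm report.txt
crypt MySecretKey < report.enc > report.txt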

Responsibility.

Though an admin has to implement security policies and monitor the system, ordinary users are
no less responsible for ensuring system security in those areas where they have influence and can
make a difference. Besides managing their passwords carefully, users should control the
availability of their data using appropriate read, write and execute file permissions, and be aware
of the security issues surrounding areas such as accessing the Internet.

Security is not just software and system files though. Physical aspects of the system are also
important and should be noted by users as well as the admin.

Thus:

 Any item not secured with a lock, cable, etc. can be removed by anyone who has physical
access.
 Backups should be securely stored.
 Consider the use of video surveillance equipment and some form of metal-key/key-
card/numeric-code entry system for important areas.
 Account passwords enable actions performed on the system to be traced. All accounts should
have passwords. Badly chosen passwords, and old passwords, can compromise security. An
admin should consider using password-cracking software to ensure that poorly chosen
passwords are not in use.
 Group permissions for files should be set appropriately (user, group, others).
 Guest accounts can be used anonymously; if a guest account is necessary, the tasks which can
be carried out when logged in as guest should be restricted. Having open guest accounts on
multiple systems which do not have common ordinary accounts is unwise - it allows users to
anonymously exchange data between such systems when their normal accounts would not
allow them to do so. Accounts such as guest can be useful, but they should be used with care,
especially if they are left with no password.
 Unused accounts should be locked out, or backed up and removed.
 If a staff member leaves the organisation, passwords should be changed to ensure such former
users do not retain access.
 Sensitive data should not be kept on systems with more open access such as anonymous ftp and
modem dialup accounts.
 Use of the su command amongst users should be discouraged. Its use may be legitimate, but it
encourages lax security (ordinary users have to exchange passwords in order to use su). Monitor
the /var/adm/sulog file for any suspicious use of su.
 Ensure that key files owned by a user are writeable only by that user, thus preventing 'trojan
horse' attacks. This also applies to root-owned files/dirs, eg. /, /bin, /usr/bin, /etc, /var, and so
on. Use find and other tools to locate directories that are globally writeable - if such a directory
is a user's home directory, consider contacting the user for further details as to why their home
directory has been left so open. For added security, use an account-creation schema which sets
users' home directories to not be readable by groups or others by default.
 Instruct users not to leave logged-in terminals unattended. The xlock command is available to
secure an unattended workstation but its use for long periods may be regarded as inconsiderate
by other users who are not able to use the terminal, leading to the temptation of rebooting the
machine, perhaps causing the logged-in user to lose data.
 Only vendor-supplied software should be fully trusted. Commercial 3rd-party software should
be ok as long as one has confidence in the supplier, but shareware or freeware software must
be treated with care, especially if such software is in the form of precompiled ready-to-run
binaries (precompiled non-vendor software might contain malicious code). Software distributed
in source code form is safer, but caution is still required, especially if executables have to be
owned by root and installed using the set-UID feature in order to run. Set-UID and set-GID
programs have legitimate uses, but because they are potentially harmful, their presence on a
system should be minimised. The find command can be used to locate such files (see the example
commands after this list), while older file system types (eg. EFS) can be searched with commands
such as ncheck.
 Network hardware can be physically tapped to eavesdrop on network traffic. If security must be
particularly tight, keep important network hardware secure (eg. locked cupboard) and regularly
check other network items (cables, etc.) for any sign of attack. Consider using specially secure
areas for certain hardware items, and make it easy to examine cabling if possible (keep an up-to-
date printed map to aid checks). Fibre-optic cables are harder to interfere with, eg. FDDI.
Consider using video surveillance technologies in such situations.
 Espionage and sabotage are issues which some admins may have to be aware of, especially
where commercially sensitive or government/police-related work data is being manipulated.
Simple example: could someone see a monitor screen through a window using a telescope?
What about RF radiation? Remote scanners can pick up stray monitor emissions, so consider
appropriate RF shielding (Faraday Cage). What about insecure phone lines? Could someone,
even an ordinary user, attach a modem to a system and dial out, or allow someone else to dial
in?
 Keep up-to-date with security issues; monitor security-related sites such as www.rootshell.com,
UKERNA, JANET, CERT, etc. [7]. Follow any extra advice given in vendor-specific security FAQ
files (usually posted to relevant 'announce' or 'misc' newsgroups, eg. comp.sys.sgi.misc). Most
UNIX vendors also have an anonymous ftp site from which customers can obtain security
patches and other related information. Consider joining any specialised mailing lists that may be
available.
 If necessary tasks are beyond one's experience and capabilities, consider employing a vendor-
recommended external security consultancy team.
 Exploit any special features of the UNIX system being used, eg. at night, an Indy's digital camera
could be used to send single frames twice a second across the network to a remote system for
subsequent compression, time-stamping and recording. NB: this is a real example which SGI
once helped a customer to do in order to catch some memory thieves.

Figure 61. Aspects of a system relevant to security.
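
As an illustration of the find-based checks mentioned above (a sketch; option syntax may differ
slightly between UNIX versions), the first command below lists set-UID executables, the second
lists directories which are writeable by anyone:

find / -type f -perm -4000 -print
find / -type d -perm -2 -print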

Since basic security on UNIX systems relies primarily on login accounts, passwords, file
ownership and file permissions, proper administration and adequate education of users is
normally sufficient to provide adequate security for most sites. Lapses in security are usually
caused by human error, or improper use of system security features. Extra security measures such
as commercial security-related software are not worth considering if even basic features are not
used or are compromised through incompetence.

An admin can alter the way in which failed login attempts are dealt with by configuring the
/etc/default/login file. There are many possibilities and options - see the 'login' reference page for
details (man login). For example, an effective way to enhance security is to make repeated
guessing of account passwords an increasingly slow process by penalising further login attempts
with ever increasing delays between login failures. Note that GUI-based login systems may not
support features such as this, though one can always deactivate them via an appropriate
chkconfig command.
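
For example, on an SGI system the graphical login can usually be turned off with a chkconfig
flag similar to the following (the flag name should be verified against the output of 'chkconfig'
on the machine concerned):

chkconfig visuallogin off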

Most UNIX vendors offer the use of hardware-level PROM passwords to provide an extra level
of security, ie. a password is required from any user who attempts to gain access to the low-level
hardware PROM-based 'Command Monitor', giving greater control over who can carry out
admin-level actions. While PROM passwords cannot prevent physical theft (eg. someone
stealing a disk and accessing its data by installing it as an option drive on another system), they
do limit the ability of malicious users to boot a system using their own program or device (a
common flaw with Mac systems), or otherwise harm the system at its lowest level. If the PROM
password has been forgotten, the root user can reset it. If both are lost, then one will usually have
to resort to setting a special jumper on the system motherboard, or temporarily removing the
PROM chip altogether (the loss of power to the chip resets the password).

Shadow Passwords

If the /etc/passwd file can be read by users, then there is scope for users to take a copy away to
be brute-force tested with password-cracking software. The solution is to use a shadow password
file called /etc/shadow - this is a copy of the ordinary password file (/etc/passwd) which cannot
be accessed by non-root users. When in use, the password fields in /etc/passwd are replaced with
an 'x'. All the usual password-related programs work in the same way as before, though shadow
passwords are dealt with in a different way for systems using NIS (this is because NIS keeps all
password data for ordinary users in a different file called /etc/passwd.nis). Users won't notice any
difference when shadow passwords are in use, except that they won't be able to see the encrypted
form of their password anymore.

The use of shadow passwords is activated simply by running the 'pwconv' program (see the man
page for details). Shadow passwords are in effect as soon as this command has been executed.
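
A sketch of the effect (the account shown is invented): before pwconv, the encrypted password
sits in the second field of /etc/passwd; afterwards that field holds just an 'x' and the encrypted
string is kept in /etc/shadow, readable only by root:

# grep harry /etc/passwd
harry:Hj3k9Lq2mB7cE:1021:20:Harry Smith:/home/harry:/bin/csh
# pwconv
# grep harry /etc/passwd
harry:x:1021:20:Harry Smith:/home/harry:/bin/csh
# grep harry /etc/shadow
harry:Hj3k9Lq2mB7cE:10967::::::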

Password Ageing.

An admin can force passwords to age automatically, ensuring that users must set a new password
at desired intervals, or no earlier than a certain interval, or even immediately. The passwd
command is used to control the various available options. Note that NIS does not support
password ageing.
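
As an example (a sketch using the standard System V passwd options; check 'man passwd' for
those supported on a particular system), the following forces the user harry to choose a new
password at his next login, and thereafter to change it at least every 30 days but no more than
once every 7 days:

passwd -f harry
passwd -x 30 -n 7 harry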

Choosing Passwords.

Words from the dictionary should not be used, nor should obvious items such as film characters
and titles, names of relatives, car number plates, etc. Passwords should include obscure
characters, digits and punctuation marks. Consider using and mixing words from other
languages, eg. Finnish, Russian, etc.

An admin should not use the same root password for more than one system, unless there is good
reason.

When a new account is created, a password should be set there and then. If the user is not
immediately present, a default password such as 'password' might be used in the expectation that
the user will log in immediately and change it to something more suitable. An admin should
lock out the account if the password isn't changed after some duration: replace the password entry
for the user concerned in the /etc/passwd file with anything that contains at least one character
that is not used by the encryption schema, eg. '*'.
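
For example (an invented account), a locked entry might look like this - since '*' can never be
produced by the encryption schema, no password will ever match:

harry:*:1021:20:Harry Smith:/home/harry:/bin/csh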

Modern UNIX systems often include a minimum password length and may insist on certain rules
about what a password can be, eg. at least one digit.

Network Security.

As with other areas of security, GUI tools may be available for controlling network-related
security issues, especially those concerning the Internet. Since GUI tools may vary between
different UNIX OSs, this discussion deals mainly with the command line tools and related files.

Reminder: there is little point in tightening network security if local security has not yet been
dealt with, or is lax.

Apart from the /etc/passwd file, the other important files which control network behaviour are:
/etc/hosts.equiv     A list of trusted hosts.

.rhosts              A list of hosts that are allowed access to a
                     specific user account.

Figure 62. Files relevant to network behaviour.

These three files determine whether a host will accept an access request from programs such as
rlogin, rcp, rsh, or rdist. Both hosts.equiv and .rhosts have reference pages (use 'man hosts.equiv'
and 'man rhosts').

Suppose a user on host A attempts to access a remote host B. As long as the hosts.equiv file on B
contains the host name of A, and B's /etc/passwd lists A's user ID as a valid account, then no
further checks occur and the access is granted (all successful logins are recorded in
/var/adm/SYSLOG). The hosts.equiv file used by the Ve24 Indys contains the following:

localhost
yoda.comp.uclan.ac.uk
akira.comp.uclan.ac.uk
ash.comp.uclan.ac.uk
cameron.comp.uclan.ac.uk
chan.comp.uclan.ac.uk
conan.comp.uclan.ac.uk
gibson.comp.uclan.ac.uk
indiana.comp.uclan.ac.uk
leon.comp.uclan.ac.uk
merlin.comp.uclan.ac.uk
nikita.comp.uclan.ac.uk
ridley.comp.uclan.ac.uk
sevrin.comp.uclan.ac.uk
solo.comp.uclan.ac.uk
spock.comp.uclan.ac.uk
stanley.comp.uclan.ac.uk
warlock.comp.uclan.ac.uk
wolfen.comp.uclan.ac.uk
woo.comp.uclan.ac.uk
milamber.comp.uclan.ac.uk

Figure 63. hosts.equiv file used by the Ve24 Indys.

Thus, once logged into one of the Indys, a user can rlogin directly to any of the other Indys
without having to enter their password again, and can execute rsh commands, etc. A staff
member logged into Yoda can log in to any of the Ve24 Indys too (students cannot do this).

The hosts.equiv files on Yoda and Milamber are completely different, containing only references
to each other as needed. Yoda's hosts.equiv file contains:

localhost
milamber.comp.uclan.ac.uk
Figure 64. hosts.equiv file for yoda.

Thus, Yoda trusts Milamber. However, Milamber's hosts.equiv only contains:

localhost

Figure 65. hosts.equiv file for milamber.

ie. Milamber doesn't trust Yoda, the rationale being that even if Yoda's root security is
compromised, logging in to Milamber as root is blocked. Hence, even if a hack attack damaged
the server and Ve24 clients, I would still have at least one fully functional secure machine with
which to tackle the problem upon its discovery.

Users can extend the functionality of hosts.equiv by using a .rhosts file in their home directory,
enabling or disabling access based on host names, group names and specific user account names.

The root login only uses the /.rhosts file if one is present - /etc/hosts.equiv is ignored.

NOTE: an entry for root in /.rhosts on a local system allows root users on a remote system to
gain local root access. Thus, including the root name in /.rhosts is unwise. Instead, file transfers
can be more securely dealt with using ftp via a guest account, or through an NFS-mounted
directory. An admin should be very selective as to the entries included in root's .rhosts file.

A user's .rhosts file must be owned by either the user or root. If it is owned by anyone else, or if
the file permissions are such that it is writeable by someone else, then the system ignores the
contents of the user's .rhosts file by default.

An admin may decide it's better to bar the use of .rhosts files completely, perhaps because an
external network of unknown security status is connected. The .rhosts files can be barred by
adding a -l option to the rshd line in /etc/inetd.conf (use 'man rshd' for further details).
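
A sketch of what the modified rshd entry in /etc/inetd.conf might then look like (the default
arguments on the existing line vary between IRIX versions, so the -l should be appended to
whatever is already there rather than the line being replaced wholesale):

shell stream tcp nowait root /usr/etc/rshd rshd -l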

Thus, the relationship between the 20 different machines which form the SGI network I run is as
follows:

 All the Indys in Ve24 trust each other, as well as Yoda and Milamber.
 Yoda only trusts Milamber.
 Milamber doesn't trust any system.

With respect to choosing root passwords, I decided to use the following configuration:

 All Ve24 systems have the same root password and the same PROM password.
 Yoda and Milamber have their own separate passwords, distinct from all others.

This design has two deliberate consequences:

 Ordinary users have flexible access between the Indys in Ve24,


 If the root account of any of the Ve24 Indys is compromised, the unauthorised user will not be
able to gain access to Yoda or Milamber as root. However, the use of NFS compromises such a
schema since, for example, a root user on a Ve24 Indy could easily alter any files in /home,
/var/mail, /usr/share and /mapleson.

With respect to the use of identical root and PROM passwords on the Ve24 machines: because Internet
access (via a proxy server) has recently been setup for users, I will probably change the schema in order
to hinder brute force attacks.

The /etc/passwd File and NIS.

The NIS service enables users to login to a client by including the following entry as the last line
in the client's /etc/passwd file:

+::0:0:::

Figure 66. Additional line in /etc/passwd enabling NIS.

For simplicity, a + on its own can be used. I prefer to use the longer version so that if I want to
make changes, the fields to change are immediately visible.

If a user logs on with an account ID which is not listed in the /etc/passwd file as a local account,
then such an entry at the end of the file instructs the system to try and get the account
information from the NIS server, ie. Yoda. Since Yoda and Milamber do not include this extra
line in /etc/passwd, students cannot login to them with their own ID anyway, no matter the
contents of .rhosts and hosts.equiv.

inetd and inetd.conf

inetd is the 'Internet Super-server'. inetd listens for requests for network services, executing the
appropriate program for each request.

inetd is started on bootup by the /etc/init.d/network script (called by the /etc/rc2.d/S30network
link via the init process). It reads its configuration information from /etc/inetd.conf.

By using a super-daemon in this way, a single daemon is able to invoke other daemons when
necessary, reducing system load and using resources such as memory more efficiently.

The /etc/inetd.conf file controls how various network services are configured, eg. logging
options, debugging modes, service restrictions, the use of the bootp protocol for remote OS
installation, etc. An admin can control services and logging behaviour by customising this file. A
reference page is available with complete information ('man inetd').

Services communicate using 'port' numbers, rather like separate channels on a CB radio.
Blocking the use of certain port numbers is a simple way of preventing a particular service from
being used. Network/Internet services and their associated port numbers are contained in the
/etc/services database. An admin can use the 'fuser' command to identify which processes are
currently using a particular port, eg. to see the current use of TCP port 25:

fuser 25/tcp

On Yoda, an output similar to the following would be given:

yoda # fuser 25/tcp
25/tcp:        855o
yoda # ps -ef | grep 855 | grep -v grep
    root   855     1  0   Apr 27 ?      5:01 /usr/lib/sendmail -bd -q15m

Figure 67. Typical output from fuser.

Insert (a quick example of typical information hunting): an admin wants to do the same on the
ftp port, but can't remember the port number. Solution: use grep to find the port number from
/etc/services:

yoda 25# grep ftp /etc/services
ftp-data    20/tcp
ftp         21/tcp
tftp        69/udp
sftp        115/tcp
yoda 26# fuser 21/tcp
21/tcp:        255o
yoda 28# ps -ef | grep 255 | grep -v grep
    root    255     1  0   Apr 27 ?      0:04 /usr/etc/inetd
 senslm    857   255  0   Apr 27 ?     11:44 fam
    root  11582   255  1 09:49:57 pts/1  0:01 rlogind

An important aspect of the inetd.conf file is the user name field which determines which user ID
each process runs under. Changing this field to a less privileged ID (eg. nobody) enables system
service processes to be given lower access permissions than root, which may be useful for further
enhancing security. Notice that services such as http (the WWW) are normally already set to run
as nobody. Proxy servers should also run as nobody, otherwise http requests may be able to
retrieve files such as /etc/passwd (however, some systems may have the nobody user defined so
that it cannot run programs, so another user may have to be used - an admin can make one up).

Another common modification made to inetd.conf in order to improve security is to restrict the
use of the finger command, eg. with -S to prevent login status, home directory and shell
information from being given out. Or more commonly the -f option is used which forces any
finger request to just return the contents of a file, eg. yoda's entry for the finger service looks like
this:

finger stream tcp nowait guest /usr/etc/fingerd fingerd -f /etc/fingerd.message

Figure 68. Blocking the use of finger in the /etc/inetd.conf file.

Thus, any remote user who executes a finger request to yoda is given a brief message [3].

If changes are made to the inetd.conf file, then inetd must be notified of the changes, either by
rebooting the system or via the following command (which doesn't require a reboot afterwards):

killall -HUP inetd

Figure 69. Instructing inetd to restart itself (using killall).

In general, a local trusted network is less likely to require a highly restricted set of services, ie.
modifying inetd.conf becomes more important when connecting to external networks, especially
the Internet. Thus, an admin should be aware that creating a very secure inetd.conf file on an
isolated network or Intranet may be unduly harsh on ordinary users.

X11 Windows Network Access

The X Windows system is a window system available for a wide variety of different computer
platforms which use bitmap displays [8]. Its development is managed by the X Consortium, Inc.
On SGI IRIX systems, the X Windows server daemon is called 'Xsgi' and conforms to Release 6
of the X11 standard (X11R6).

The X server, Xsgi, manages the flow of user/application input and output requests to/from client
programs using a number of interprocess communication links. The xdm daemon acts as the
display manager. Usually, user programs are running on the same host as the X server, but X
Windows also supports the display of client programs which are actually running on remote
hosts, even systems using completely different OSs and hardware platforms, ie. X is network-
transparent.

The X man page says:

"X supports overlapping hierarchical subwindows and text and


graphics operations, on both monochrome and color displays."

One unique side effect of this is that access to application mouse menus is independent of
application focus, requiring only a single mouse click for such actions. For example, suppose
two application windows are visible on screen:

 a jot editor session containing an unsaved file (eg. /etc/passwd.nis),

 a shell window which is partially obscuring the jot window.

With the shell window selected, the admin is about to run /var/yp/ypmake to reparse the password
database file, but realises the file isn't saved. Moving the mouse over the partially hidden jot window,
the admin holds down the right mouse button: this brings up jot's right-button menu (which may or may
not be partly on top of the shell window even though the jot window is at the back) from which the
admin clicks on 'Save'; the menu disappears, the file is saved, but the shell window is still on top of the
jot window, ie. their relative front/back positions haven't changed during the operation.

The ability of X to process screen events independently of which application window is currently
in focus is a surprisingly useful time-saving feature. Every time a user does an action like this, at
least one extraneous mouse click is prevented; this can be shown by comparing to MS Windows
interfaces:

 Under Win95 and Win98, trying to access an application's right-button menu when the
application's window is currently not in focus requires at least two extraneous mouse clicks: the
first click brings the application in focus (ie. to the front), the second brings up the menu, and a
third (perhaps more if the original application window is now completely hidden) brings the
original application window back to the front and in focus. Thus, X is at least 66% more efficient
for carrying out this action compared to Win95/Win98.
 Under WindowsNT, attempting the same action requires at least one extraneous mouse click:
the first click brings the application in focus and reveals the menu, and a second (perhaps more,
etc.) brings the original application window back to the front and in focus. Thus, X is at least 50%
more efficient for carrying out this action compared to NT.

The same effect can be seen when accessing middle-mouse menus or actions under X, eg. text can be
highlighted and pasted to an application with the middle-mouse button even when that application is
not in focus and not at the front. This is a classic example of how much more advanced X is over
Microsoft's GUI interface technologies, even though X is now quite old. X also works in a way which links
to graphics libraries such as OpenGL.

Note that most UNIX-based hardware platforms use video frame buffer configurations which
allow a large number of windows to be present without causing colour map swapping or other
side effects, ie. the ability to have multiple overlapping windows is a feature supported in
hardware, eg. Indigo2 [6].

X is a widely used system, with emulators available for systems which don't normally use X, eg.
Windows Exceed for PCs.

Under the X Window System, users can run programs transparently on remote hosts that are part
of the local network, and can even run applications on remote hosts across the Internet with the
windows displayed locally if all the various necessary access permissions have been correctly set
at both ends. An 'X Display Variable' is used to denote which host the application should attempt
to display its windows on. Thus, assuming a connection with a remote host to which one had
authorised telnet access (eg. haarlem.vuurwerk.nl), from a local host whose domain name is
properly visible on the Internet (eg. thunder.uclan.ac.uk), then the local display of applications
running on the remote host is enabled with a command such as:

haarlem% setenv DISPLAY thunder.uclan.ac.uk:0.0


I've successfully used this method while at Heriot Watt to run an xedit editor on a remote system
in England but with the xedit window itself displayed on the monitor attached to the system I
was physically using in Scotland.

The kind of inter-system access made possible by X has nothing to do with login accounts,
passwords, etc. and is instead controlled via the X protocols. The 'X' man page has full details,
but note: the man page for X is quite large.

A user can utilise the xhost command to control access to their X display, eg. 'xhost -' restricts
access to the hosts already on the access list, while 'xhost +harry' grants access from the host harry.

Note that system-level commands and files which relate to xhost and X in general are stored in
/var/X11/xdm.

Firewalls [4].

A firewall is a means by which a local network of trusted hosts can be connected to an external
untrusted network, such as the Internet, in a more secure manner than would otherwise be the
case. 'Firewall' is a conceptual idea which refers to a combination of hardware and software steps
taken to setup a desired level of security; although an admin can setup a firewall via basic steps
with as-supplied tools, all modern systems have commercial packages available to aid in the task
of setting up a firewall environment, eg. Gauntlet for IRIX systems.

As with other security measures, there is a tradeoff between ease of monitoring/administration,
the degree of security required, and the wishes/needs of users. A drawback of firewalls is when a
user has a legitimate need to access packets which are filtered out - an alternative is to have each
host on the local network configured according to a strict security regime.

The simplest form of a firewall is a host with more than one network interface, called a dual-
homed host [9]. Such hosts effectively exist on two networks at once. By configuring such a host
in an appropriate manner, it acts as a controllable obstruction between the local and external
network, eg. the Internet.
A firewall does not affect the communications between hosts on an internal network; only the
way in which the internal network interacts with the external connection is affected. Also, the
presence of a firewall should not be used as an excuse for having less restrictive security
measures on the internal network.

One might at first think that Yoda could be described as a firewall, but it is not, for a variety of
reasons. Ideally, a firewall host should be treated thus:

 no ordinary user accounts (root admin only, with a different password),
 as few services as possible (the more services are permitted, the greater is the chance of a
security hole; newer, less-tested software is more likely to be at risk) and definitely no NIS or
NFS,
 constantly monitored for access attempts and unusual changes in files, directories and software
(commands: w, ps, 'versions changed', etc.),
 log files regularly checked (and not stored on the firewall host!),
 no unnecessary applications,
 no anonymous ftp!

Yoda breaks several of these guidelines, so it cannot be regarded as a firewall, even though a range of
significant security measures are in place. Ideally, an extra host should be used, eg. an Indy (additional
Ethernet card required to provide the second Ethernet port), or a further server such as Challenge S. A
simple system like Indy is sufficient though, or other UNIX system such as an HP, Sun, Dec, etc. - a Linux
PC should not be used though since Linux has too many security holes in its present form. [1]

Services can be restricted by making changes to files such as /etc/inetd.conf, /etc/services, and
others. Monitoring can be aided via the use of free security-related packages such as COPS - this
package can also check for bad file permission settings, poorly chosen passwords, system setup
file integrity, root security settings, and many other things. COPS can be downloaded from:
ftp://ftp.cert.org/pub/tools/cops

Monitoring a firewall host is also a prime candidate for using scripts to automate the monitoring
process.

Other free tools include Tripwire, a file and directory integrity checker:

ftp://ftp.cert.org/pub/tools/tripwire

With Tripwire, files are monitored and compared to information stored in a database. If files
change when they're supposed to remain static according to the database, the differences are
logged and flagged for attention. If used regularly, eg. via cron, action can be taken immediately
if something happens such as a hacking attempt.
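
For example, a root crontab entry along the following lines (the install path and report location
are assumptions for illustration) would run a check every night at 3am:

0 3 * * * /usr/local/bin/tripwire > /var/tmp/tripwire.report 2>&1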

Firewall environments often include a router - a high speed packet filtering machine installed
either privately or by the ISP providing the external connection. Usually, a router is installed
inbetween a dual-homed host and the outside world [9]. This is how yoda is connected, via a
router whose address is 193.61.250.33, then through a second router at 193.61.250.65 before
finally reaching the JANET gateway at Manchester.

Routers are not very flexible (eg. no support for application-level access restriction systems such
as proxy servers), but their packet-filtering abilities do provide a degree of security, eg. the router
at 193.61.250.33 only accepts packets on the 193.61.250.* address space.

However, because routers can block packet types, ports, etc. it is possible to be overly restrictive
with their use, eg. yoda cannot receive USENET packets because they're blocked by the router.
In such a scenario, users must resort to using WWW-based news services (eg. DejaNews) which
are obviously less secure than running and managing a locally controlled USENET server, as
well as being more wasteful of network resources.

Accessing sites on the web poses similar security problems to downloading and using Internet-
sourced software, ie. the source is untrusted, unless vendor-verified with checksums, etc. When a
user accesses a site and attempts to retrieve data, what happens next cannot be predicted, eg. a
malicious executable program could be downloaded (this is unlikely to damage root-owned files,
but users could lose data if they're not careful). Users should be educated on these issues, eg.
turning off Java script features and disallowing cookies if necessary.

If web access is of particular concern with regard to security, one solution is to restrict web
access to just a limited number of internal hosts.

Anonymous ftp.

An anonymous FTP account allows a site to make information available to anyone, while still
maintaining control over access issues. Users can login to an anonymous FTP account as
'anonymous' or 'ftp'. The 'chroot' command is used to put the user in the home directory for
anonymous ftp access (~ftp), preventing access to other parts of the filesystem. A firewall host
should definitely not have an anonymous FTP account. A site should not provide such a service
unless absolutely necessary, but if it does then an understanding of how the anonymous FTP
access system works is essential to ensuring site security, eg. preventing outside agents from
using the site as a transfer point for pirated software. How an anon FTP account is used should
be regularly monitored.

Details of how to setup an anon FTP account can usually be found in a vendor's online
information; for IRIX, the relevant source is the section entitled, "Setting Up an Anonymous
FTP Account" in chapter three of the, "IRIX Admin: Networking and Mail" guide.

UNIX Fundamentals: Internet access: files and services. Email.

For most users, the Internet means the World Wide Web ('http' service), but this is just one
service out of many, and was in fact a very late addition to the Internet as a whole. Before the
advent of the web, Internet users were familiar with and used a wide range of services, including:

 telnet (interactive login sessions on remote hosts),

 ftp (file/data transfer using continuous connections),

 tftp (file/data transfer using temporary connections)

 NNTP (Internet newsgroups, ie. USENET)

 SMTP (email)

 gopher (remote host data searching and retrieval system)

 archie (another data-retrieval system)

 finger (probe remote site for user/account information)

 DNS (Domain Name Service)

Exactly which services users can use is a decision best made by consultation, though some users
may have a genuine need for particular services, eg. many public database systems on sites such
as NASA are accessed by telnet only.

Disallowing a service automatically improves security, but the main drawback will always be a
less flexible system from a user's point of view, ie. a balance must be struck between the need for
security and the needs of users. However, such discussions may be irrelevant if existing site
policies already state what is permitted, eg. UCLAN's campus network has no USENET service,
so users exploit suitable external services such as DejaNews [2].

For the majority of admins, the most important Internet service which should be appropriately
configured with respect to security is the web, especially considering today's prevalence of Java,
Java Script, and browser cookie files. It is all too easy for a modern web user to give out a
surprising amount of information about the system they're using without ever knowing it.
Features such as cookies and Java allow a browser to send a substantial amount of information to
a remote host about the user's environment (machine type, OS, browser type and version, etc.);
there are sites on the web which an admin can use to test how secure a user's browser
environment is - the site will display as much information as it can extract using all methods, so
if such sites can only report very little or nothing in return, then that is a sign of good security
with respect to user-side web issues.

There are many good web server software systems available, eg. Apache. Some even come free,
or are designed for local Intranet use on each host. However, for enhanced security, a site should
use a professional suite of web server software such as Netscape Enterprise Server; these
packages come with more advanced control mechanisms and security management features, the
configuration of which is controlled by GUI-based front-end servers, eg. Netscape
Administration Server. Similarly, lightweight proxy servers are available, but a site should use a
professional solution, eg. Netscape Proxy Server. The GUI administration of web server software
makes it much easier for an admin to configure security issues such as access and service
restrictions, permitted data types, blocked sites, logging settings, etc.

Example: after the proxy server on the SGI network was installed, I noticed that users of the
campus-wide PC network were using Yoda as a proxy server, which would give them a faster
service than the University's proxy server. A proxy server which is accessible in this way is said
to be 'open'. Since all accesses from the campus PCs appear in the web logs as if they originate
from the Novix security system (ie. there is no indication of individual workstation or user), any
illegal activity would be untraceable. Thus, I decided to prevent campus PCs from using Yoda as
a proxy. The mechanism employed to achieve this was the ipfilterd program, which I had heard
of before but not used.

ipfilterd is a network packet-filtering daemon which screens all incoming IP packets based on
source/destination IP address, physical network interface, IP protocol number, source/destination
TCP/UDP port number, required service type (eg. ftp, telnet, etc.) or a combination of these. Up
to 1000 filters can be used. To improve efficiency, a configurable memory caching mechanism is
used to retain recently decided filter verdicts for a specified duration.

ipfilterd operates by using a searchable database of packet-filtering clauses stored in the
/etc/ipfilterd.conf file. Each incoming packet is compared with the filters in the file one at a time
until a match is found; if no match occurs, the packet is rejected by default. Since filtering is a
line-by-line database search process, the order in which filters are listed is important, eg. a reject
clause to exclude a particular source IP address from Ethernet port ec0 would have no effect if an
accept clause was earlier in the file that accepted all IP data from ec0, ie. in this case, the reject
should be listed before the accept. IP addresses may be specified in hex, dot format (eg.
193.61.255.4 - see the man page for 'inet'), host name or fully-qualified host name.

With IRIX 6.2, ipfilterd is not installed by default. After consulting with SGI to identify the
appropriate source CD, the software was installed, /etc/ipfilterd.conf defined, and the system
activated with:

chkconfig -f ipfilterd on
reboot

Since there was no ipfilterd on/off flag file in /etc/config by default, the -f forces the creation of
such a file with the given state.

Filters in the /etc/ipfilterd.conf file consist of a keyword and an expression denoting the type of
filter to be used; available keywords are:

 accept Accept all packets matching this filter

 reject Discard all packets matching this filter (silently)

 grab Grab all packets matching this filter

 define Define a new macro

ipfilterd supports macros, with no limit to the number of macros used.

Yoda's /etc/ipfilterd.conf file looks like this:


#
# ipfilterd.conf
# $Revision: 1.3 $
#
# Configuration file for ipfilterd(1M) IP layer packet filtering.
# Lines that begin with # are comments and are ignored.
# Lines begin with a keyword, followed either by a macro definition or
# by an optional interface filter, which may be followed by a protocol filter.
# Both macros and filters use SGI's netsnoop(1M) filter syntax.
#
# The currently supported keywords are:
# accept : accept all packets matching this filter
# reject : silently discard packets matching this filter
# define : define a new macro to add to the standard netsnoop macros
#
# See the ipfilterd(1M) man page for examples of filters and macros.
#
# The network administrator may find the following macros useful:
#
define ip.netAsrc (src&0xff000000)=$1
define ip.netAdst (dst&0xff000000)=$1
define ip.netBsrc (src&0xffff0000)=$1
define ip.netBdst (dst&0xffff0000)=$1
define ip.netCsrc (src&0xffffff00)=$1
define ip.netCdst (dst&0xffffff00)=$1
define ip.notnetAsrc not((src&0xff000000)=$1)
define ip.notnetAdst not((dst&0xff000000)=$1)
define ip.notnetBsrc not((src&0xffff0000)=$1)
define ip.notnetBdst not((dst&0xffff0000)=$1)
define ip.notnetCsrc not((src&0xffffff00)=$1)
define ip.notnetCdst not((dst&0xffffff00)=$1)
#
# Additional macros:
#
# Filters follow:
#
accept -i ec0
reject -i ec3 ip.src 193.61.255.21 ip.dst 193.61.250.34
reject -i ec3 ip.src 193.61.255.22 ip.dst 193.61.250.34
accept -i ec3

Any packet coming from an SGI network machine is immediately accepted (traffic on the ec0
network interface). The web logs contained two different source IP addresses for accesses
coming from the campus PC network. These are rejected first if detected; a final accept clause is
then included so that all other types of packet are accepted.

The current contents of Yoda's ipfilterd.conf file do mean that campus PC users will not be
able to access Yoda as a web server either, ie. requests to www.comp.uclan.ac.uk by legitimate
users will be blocked too. Thus, the above contents of the file are experimental. Further
refinement is required so that accesses to Yoda's web pages are accepted, while requests which
try to use Yoda as a proxy to access non-UCLAN sites are rejected. This can be done by using
the ipfilterd-expression equivalent of the following if/then C-style statement:

if ((source IP is campus PC) and (destination IP is not Yoda)) then
    reject packet;

Using ipfilterd has system resource implications. Filter verdicts stored in the ipfilterd cache by
the kernel take up memory; if the cache size is increased, more memory is used. A longer cache
and/or a larger number of filters means a greater processing overhead before each packet is dealt
with. Thus, for busy networks, a faster processor may be required to handle the extra load, and
perhaps more RAM if an admin increases the ipfilterd kernel cache size. In order to monitor such
issues and make decisions about resource implications as a result of using ipfilterd, the daemon
can be executed with the -d option which causes extra logging information about each filter to be
added to /var/adm/SYSLOG, ie. an /etc/config/ipfilterd.options file should be created, containing
'-d'.
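
ie. something like:

echo "-d" > /etc/config/ipfilterd.options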

As well as using programs like 'top' and 'ps' to monitor CPU loading and memory usage, log files
should be monitored to ensure they do not become too large, wasting disk space (the same
applies to any kind of log file). System logs are 'rotated' automatically to prevent this from
happening, but other logs created by 3rd-party software usually are not; such log files are not
normally stored in /var/adm either. For example, the proxy server logs are in this directory:

/var/netscape/suitespot/proxy-sysname-proxy/logs

If an admin wishes to retain the contents of older system logs such as /var/adm/oSYSLOG, then
the log file could be copied to a safe location at regular intervals, eg. once per night (the old log
file could then be emptied to save space).

A wise policy would be to create scripts which process the logs, summarising the data in a more
intuitive form. General shell script methods and programs such as grep can be used for this.
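
A minimal sketch of such a script (the log file name 'access' and the position of the client
address field are assumptions which should be checked against the actual log format):

#!/bin/sh
# Summarise a proxy access log: count requests per client address,
# assuming the address is the first field on each line.
LOG=/var/netscape/suitespot/proxy-sysname-proxy/logs/access
awk '{ print $1 }' $LOG | sort | uniq -c | sort -rn | head -20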

The above is just one example of the typical type of problem and its consequences that admins
come up against when managing a system:

 The first problem was how to give SGI network users Internet access, the solution to which was
a proxy server. Unfortunately, this allowed campus-PC users to exploit Yoda as an open proxy,
so ipfilterd was then employed to prevent such unauthorised use.

Thus, as stated in the introduction, managing system security is an ongoing, dynamic process.

Another example problem: in 1998, I noticed that some students were not using the SGIs (or not
asking if they could) because they thought the machines were turned off, ie. the monitor power-
saving feature would blank out the screen after some duration. I decided to alter the way the
Ve24 Indys behaved so that monitor power-saving would be deactivated during the day, but
would still happen overnight.

The solution I found was to modify the /var/X11/xdm/Xlogin file. This file contains a section
controlling monitor power-saving using the xset command, which normally looks like this:
#if [ -x /usr/bin/X11/xset ] ; then
# /usr/bin/X11/xset s 600 3600
#fi

If these lines are uncommented (the hash symbols removed), a system whose monitor supports
power-saving will tell the monitor to power down after ten minutes of inactivity once the last user
logs out. With the lines still commented out, modern SGI monitors use power-saving by default
anyway.

I created two new files in /var/X11/xdm:

-rwxr-xr-x 1 root sys 1358 Oct 28 1998 Xlogin.powersaveoff*
-rwxr-xr-x 1 root sys 1361 Oct 28 1998 Xlogin.powersaveon*

They are identical except for the section concerning power-saving. Xlogin.powersaveoff
contains:

if [ -x /usr/bin/X11/xset ] ; then
/usr/bin/X11/xset s 0 0
fi

while Xlogin.powersaveon contains:

#if [ -x /usr/bin/X11/xset ] ; then
# /usr/bin/X11/xset s 0 0
#fi

The two '0' parameters supplied to xset in the Xlogin.powersaveoff file have a special effect (see
the xset man page for full details): the monitor is instructed to disable all power-saving features.

The cron system is used to switch between the two files when no one is present: every night at
9pm and every morning at 8am, followed by a reboot after the copy operation is complete. The
entries from the file /var/spool/cron/crontabs/cron on any of the Ve24 Indys are thus:

# Alternate monitor power-saving. Turn it on at 9pm. Turn it off at 8am.
0 21 * * * /bin/cp /var/X11/xdm/Xlogin.powersaveon /var/X11/xdm/Xlogin && init 6&
#
0 8 * * * /bin/cp /var/X11/xdm/Xlogin.powersaveoff /var/X11/xdm/Xlogin && init 6&

Hence, during the day, the SGI monitors are always on with the login logo/prompt visible -
students can see the Indys are active and available for use; during the night, the monitors turn
themselves off due to the new xset settings. The times at which the Xlogin changes are made
were chosen so as to occur when other cron jobs would not be running. Students use the Indys
each day without ever noticing the change, unless they happen to be around at the right time to
see the peculiar sight of 18 Indys all rebooting at once.

Static Routes.

A simple way to enable packets from clients to be forwarded through an external connection is
via the use of a 'static route'. A file called /etc/init.d/network.local is created with a simple script
that adds a routing definition to the current routing database, thus enabling packets to be
forwarded to their destination. To ensure the script is executed on bootup or shutdown, extra
links are added to the /etc/rc0.d and /etc/rc2.d directories (the following commands need only be
executed once as root):

ln -s /etc/init.d/network.local /etc/rc0.d/K39network
ln -s /etc/init.d/network.local /etc/rc2.d/S31network

Yoda once had a modem link to 'Demon Internet' for Internet access. A static route was used to
allow SGI network clients to access the Internet via the link. The contents of
/etc/init.d/network.local (supplied by SGI) was:

#!/sbin/sh
#Tag 0x00000f00
IS_ON=/sbin/chkconfig
case "$1" in
'start')
if $IS_ON network; then
/usr/etc/route add default 193.61.252.1 1
fi ;;

'stop')
/usr/etc/route delete default 193.61.252.1 ;;

*)
echo "usage: $0 {start|stop}"
;;
esac

Note the use of chkconfig to ensure that a static route is only installed on bootup if the network is
defined as active.

The other main files for controlling Internet access are /etc/services and /etc/inetd.conf. These
were discussed earlier.

Internet Access Policy.

Those sites which choose to allow Internet access will probably want to minimise the degree to
which someone outside the site can access internal services. For example, users may be able to
telnet to remote hosts from a company workstation, but should the user be able to successfully
telnet to that workstation from home in order to continue working? Such an ability would
obviously be very useful to users, and indeed administrators, but there are security implications
which may be prohibitive.

For example, students who have accounts on the SGI network cannot login to Yoda because the
/etc/passwd file contains /dev/null as their default shell, ie. they can't login because their account
'presence' on Yoda itself does not have a valid shell - another cunning use of /dev/null. The
/etc/passwd.nis file has the main user account database, so users can logon to the machines in
Ve24 as desired. Thus, with the use of /dev/null in the password file's shell field, students cannot
login to Yoda via telnet from outside UCLAN. Staff accounts on the SGI network do not have
/dev/null in the shell field, so staff can indeed login to Yoda via telnet from a remote host.
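
For example (an invented student entry), the shell field is the last one on the line:

s9912345:x:3042:150:SGI Network Student:/home/students/s9912345:/dev/null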

Ideally, I'd like students to be able to telnet to a Ve24 machine from a remote host, but this is not
yet possible for reasons explained in Appendix A (detailed notes for Day 2 Part 1).

There are a number of Internet sites which are useful sources of information on Internet issues,
some relating to specific areas such as newsgroups. In fact, USENET is an excellent source of
information and advice on dealing with system management, partly because of pre-prepared FAQ
files, but also because of the many experts who read and post to the newsgroups. Even if site
policy means users can't access USENET, an admin should exploit the service to obtain relevant
admin information.

A list of some useful reference sites is given in Appendix C.

Example Questions:

1. The positions of the 'accept ec0' and 'reject' lines in /etc/ipfilterd.conf could be swapped around
without affecting the filtering logic. So why is the ec0 line listed first? The 'netstat -i' command
(executed on Yoda) may be useful here.
2. What would an appropriate ipfilterd.conf filter (or filters) look like which blocked unauthorised
use of Yoda as a proxy to connect to an external site but still allowed access to Yoda's own web
pages via www.comp.uclan.ac.uk? Hint: the netsnoop command may be useful.

Course summary.

This course has focused on what an admin needs to know in order to run a UNIX system. SGI
systems running IRIX 6.2 have been used as an example UNIX platform, with occasional
mention of IRIX 6.5 as an example of how OSs evolve.

Admins are, of course, ordinary users too, though they often do not use the same set of
applications that other users do. Though an admin needs to know things an ordinary user does
not, occasionally users should be made aware of certain issues, eg. web browser cookie files,
choosing appropriate passwords etc.

Like any modern OS, UNIX has a vast range of features and services. This course has not by any
means covered them all (that would be impossible to do in just three days, or even thirty).
Instead, the basic things a typical admin needs to know have been introduced, especially the
techniques used to find information when needed, and how to exploit the useful features of
UNIX for daily administration.

Whatever flavour of UNIX an admin has to manage, a great many issues are always the same,
eg. security, Internet concepts, etc. Thus, an admin should consider purchasing relevant reference
books to aid in the learning process. When writing shell scripts, knowledge of the C
programming language is useful; since UNIX is the OS being used, a C programming book
(mentioned earlier) which any admin will find particularly useful is:

"C Programming in a UNIX Environment"

Judy Kay & Bob Kummerfeld, Addison Wesley Publishing, 1989.


ISBN: 0 201 12912 4

For further information on UNIX or related issues, read/post to relevant newsgroups using
DejaNews; example newsgroups are given in Appendix D.

Background Notes:

1. UNIX OSs like IRIX can be purchased in a form that passes the US Department of Defence's
Trusted-B1 security regulations (eg. 'Trusted IRIX'), whereas Linux doesn't come anywhere near
such rigorous security standards as yet. The only UNIX OS (and in fact the only OS of any kind)
which passes all of the US DoD's toughest security regulations is Unicos, made by Cray Research
(a subsidiary of SGI). Unicos and IRIX will be merged sometime in the future, creating the first
widely available commercial UNIX OS that is extremely secure - essential for fields such as
banking, local and national government, military, police (and other emergency/crime services),
health, research, telecoms, etc.

References:

2. DejaNews USENET Newsgroups, Reading/Posting service:

http://www.dejanews.com/

4. "Firewalls: Where there's smoke...", Network Week, Vol4, No. 12, 2nd December
1998, pp. 33 to 37.

5. Gauntlet 3.2 for IRIX Internet Firewall Software:

http://www.sgi.com/solutions/internet/products/gauntlet/

6. Framebuffer and Clipping Planes, Indigo2 Technical Report, SGI, 1994:

http://www.futuretech.vuurwerk.nl/i2sec4.html#4.3
http://www.futuretech.vuurwerk.nl/i2sec5.html#5.6.3

7. Useful security-related web sites:

UKERNA: http://www.ukerna.ac.uk/
JANET: http://www.ja.net/
CERT: http://www.cert.org/
RootShell: http://www.rootshell.com/
2600: http://www.2600.com/mindex.html

8. "About the X Window System", part of X11.org:

http://www.X11.org/wm/index.shtml

9. Images are from the online book, "IRIX Admin: Backup, Security, and Accounting.",
Chapter 5.

Appendix B:

3. Contents of /etc/fingerd.message:

Sorry, the finger service is not available from this host.

However, thankyou for your interest in the Department of
Computing at the University of Central Lancashire.

For more information, please see:

http://www.uclan.ac.uk/
http://www.uclan.ac.uk/facs/destech/compute/comphom.htm

Or contact Ian Mapleson at mapleson@gamers.org

Regards,

Ian.

Senior Technician,
Department of Computing,
University of Central Lancashire,
Preston,
England,
PR1 2HE.

mapleson@gamers.org
Tel: (+44 -0) 1772 893297
Fax: (+44 -0) 1772 892913

Doom Help Service (DHS): http://doomgate.gamers.org/dhs/


SGI/Future Technology/N64: http://sgi.webguide.nl/
BSc Dissertation (Doom): http://doomgate.gamers.org/dhs/diss/

Appendix C:

Example web sites useful to administrators:

AltaVista: http://altavista.digital.com/cgi-bin/query?pg=aq
Webcrawler: http://webcrawler.com/
Lycos: http://www.lycos.com/
Yahoo: http://www.yahoo.com/
DejaNews: http://www.dejanews.com/
SGI Support: http://www.sgi.com/support/
SGI Tech/Advice Center: http://www.futuretech.vuurwerk.nl/sgi.html
X Windows: http://www.x11.org/
Linux Home Page: http://www.linux.org/
UNIXHelp for Users: http://unixhelp.ed.ac.uk/
Hacker Security Update: http://www.securityupdate.com/
UnixVsNT: http://www.unix-vs-nt.org/
RootShell: http://www.rootshell.com/
UNIX System Admin (SunOS): http://sunos-wks.acs.ohio-state.edu/sysadm_course/html/sysadm-1.html

Appendix D:

Example newsgroups useful to administrators:

comp.security.unix
comp.unix.admin
comp.sys.sgi.admin
comp.sys.sun.admin
comp.sys.next.sysadmin
comp.unix.aix
comp.unix.cray
comp.unix.misc
comp.unix.questions
comp.unix.shell
comp.unix.solaris
comp.unix.ultrix
comp.unix.wizards
comp.unix.xenix.misc
comp.sources.unix
comp.unix.bsd.misc
comp.unix.sco.misc
comp.unix.unixware.misc
comp.sys.hp.hpux
comp.unix.sys5.misc
comp.infosystems.www.misc

Detailed Notes for Day 3 (Part 5)
Project: Indy/Indy attack/defense (IRIX 5.3 vs. IRIX 6.5)

The aim of this practical session, which lasts two hours, is to give some experience of how an
admin typically uses a UNIX system to investigate a problem, locate information, construct and
finally implement a solution. The example problem used will likely require:

 the use of online information (man pages, online books, release notes, etc.),
 writing scripts and exploiting shell script methods as desired,
 the use of a wide variety of UNIX commands,
 identifying and exploiting important files/directories,

and so on. A time limit on the task is included to provide some pressure, as often happens in
real-world situations.

The problem situation is a simulated hacker attack/defense. Two SGI Indys are directly
connected with an Ethernet cable; one Indy, referred to here as Indy X, is running an older
version of IRIX, namely IRIX 5.3 (1995), while the other (Indy Y) is running a much newer
version, IRIX 6.5 (1998).

Students will be split into two groups (A and B) of 3 or 4 persons each. For the first hour, group
A is placed with Indy X, while group B is with Indy Y. For the second hour, the situation is
reversed. Essentially, each group must try to hack the other group's system, locate and steal some
key information (described below), and finally cripple the enemy machine. However, since both
groups are doing this, each group must also defend against attack. Whether a group focuses on
attack or defense, or a mixture of both, is for the group's members to decide during the
preparatory stage.

The first hour is dealt with as follows:

 For the first 35 minutes, each group uses the online information and any available notes
to form a plan of action. During this time, the Ethernet cable between Indys X and Y
is not connected, and separate 'Research' Indys are used for this investigative stage in
order to prevent any preparatory measures being made on the target machines. Printers
will be available if printouts are desired.
 After a short break of 5 minutes to prepare/test the connection between the two Indys and
move the groups to Indys X and Y, the action begins. Each group must try to hack into
the other group's Indy, exploiting any suspected weaknesses, whilst also defending
against the other group's attack. In addition, the hidden data must be found, retrieved, and
the enemy copy erased. The end goal is to shut down the enemy system after retrieving
the hidden data. How the shutdown is effected is entirely up to the group members.

At the end of the hour, the groups are reversed so that group B will now use an Indy running
IRIX 5.3, while group A will use an Indy running IRIX 6.5. The purpose of this second attempt
is to demonstrate how an OS evolves and changes over time with respect to security and OS
features, especially in terms of default settings, online help, etc.

Indy Specifications.

Both systems will have default installations of the respective OS version, with only minor
changes to files so that they are aware of each other's existence (/etc/hosts, and so on).
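
As an illustration, the kind of minimal /etc/hosts entries involved might look like the
following (the hostnames and addresses here are purely hypothetical examples, not the actual
settings used on the course machines):

  # /etc/hosts - each Indy lists itself and the other machine so that
  # names can be resolved without any DNS or NIS service.
  127.0.0.1    localhost
  192.168.1.1  indyx    # Indy X, IRIX 5.3
  192.168.1.2  indyy    # Indy Y, IRIX 6.5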

All systems will have identical hardware (133MHz R4600PC CPU, 64MB RAM, etc.) except for
disk space: Indys with IRIX 6.5 will use 2GB disks, while Indys with IRIX 5.3 will use 549MB
disks. Neither system will have any patches installed from any vendor CD updates.
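
Since these are SGI systems, each group can confirm the hardware of its own machine with the
hinv ('hardware inventory') command, which lists the CPU, memory, disks and other devices
present:

  hinv    # summary of installed hardware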

The hidden data which must be located and stolen from the enemy machine by each group is the
Blender V1.57 animation and rendering archive file for IRIX 6.2:

blender1.57_SGI_6.2_iris.tar.gz

Size: 1228770 bytes.

For a particular Indy, the file will be placed in an appropriate directory in the file system, the
precise location of which will only be made known to the group using that Indy - how an
attacking group locates the file is up to the attackers to decide.
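
Although the search strategy is left entirely to the attackers, one obvious starting point is
the find command. A minimal sketch, assuming sh-style redirection and that the defenders have
not renamed the file, might be:

  find / -name 'blender*' -print 2> /dev/null
  find / -size 1228770c -print 2> /dev/null    # search by exact size in bytes instead of name

Redirecting standard error to /dev/null simply hides the error messages produced for
directories which the attacking account has no permission to examine; of course, a defending
group may rename, compress or hide the file, in which case a more creative search is needed.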

It is expected that groups will complete the task ahead of schedule; any spare time will be used
for a discussion of relevant issues:

 The wisdom of relying on default settings for security, etc.
 How to detect hacking in progress, especially if an unauthorised person is carrying out
actions as root (some commands that can help are sketched after this list).
 Whose responsibility is it to ensure security? The admin or the user?
 If a hacker is 'caught', what kind of evidence would be required to secure a conviction?
How reliable is the evidence?
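
As a starting point for the detection question, standard commands, most of which appear
earlier in these notes, can reveal who is logged in and what the system is doing; this is only
a rough sketch, not a full intrusion-detection procedure:

  w                   # who is logged on and what they are doing (Figure 57)
  last | more         # recent login history, including where logins came from
  ps -ef | more       # look for unexpected processes, especially any running as root
  netstat -a | more   # current network connections to and from the system

Comparing such output against what is normal for the machine is often the quickest way to spot
something suspicious.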

END OF COURSE.
Figure Index for Detailed Notes.

Day 1:

Figure 1. A typical root directory shown by 'ls'.
Figure 2. The root directory shown by 'ls -F /'.
Figure 3. Important directories visible in the root directory.
Figure 4. Key files for the novice administrator.
Figure 5. Output from 'man -f file'.
Figure 6. Hidden files shown with 'ls -a /'.
Figure 7. Manipulating an NFS-mounted file system with 'mount'.
Figure 8. The various available shells.
Figure 9. The commands used most often by any user.
Figure 10. Editor commands.
Figure 11. The next most commonly used commands.
Figure 12. File system manipulation commands.
Figure 13. System Information and Process Management Commands.
Figure 14. Software Management Commands.
Figure 15. Application Development Commands.
Figure 16. Online Information Commands (all available from the 'Toolchest').
Figure 17. Remote Access Commands.
Figure 18. Using chown to change both user ID and group ID.
Figure 19. Handing over file ownership using chown.

Day 2:

Figure 20. IP Address Classes: bit field and width allocations.
Figure 21. IP Address Classes: supported network types and sizes.
Figure 22. The contents of the /etc/hosts file used on the SGI network.
Figure 23. Yoda's /etc/named.boot file.
Figure 24. The example named.boot file in /var/named/Examples.
Figure 25. A typical find command.
Figure 26. Using cat to quickly create a simple shell script.
Figure 27. Using echo to create a simple one-line shell script.
Figure 28. An echo sequence without quote marks.
Figure 29. The command fails due to * being treated as a wildcard by the shell.
Figure 30. Using a backslash to avoid confusing the shell.
Figure 31. Using find with the -exec option to execute rm.
Figure 32. Using find with the -exec option to execute ls.
Figure 33. Redirecting the output from find to a file.
Figure 34. A simple script with two lines.
Figure 35. The simple rebootlab script.
Figure 36. The simple remountmapleson script.
Figure 37. The daily tasks of an admin.
Figure 38. Using df without options.
Figure 39. The -k option with df to show data in K.
Figure 40. Using df to report usage for the file system containing a given directory.
Figure 41. Using du to report usage for several directories/files.
Figure 42. Restricting du to a single directory.
Figure 43. Forcing du to ignore symbolic links.
Figure 44. Typical output from the ps command.
Figure 45. Filtering ps output with grep.
Figure 46. top shows a continuously updated output.
Figure 47. The IRIX 6.5 version of top, giving extra information.
Figure 48. System information from osview.
Figure 49. CPU information from osview.
Figure 50. Memory information from osview.
Figure 51. Network information from osview.
Figure 51. Miscellaneous information from osview.
Figure 52. Results from ttcp between two hosts on a 10Mbit network.
Figure 53. The output from netstat.
Figure 54. Example use of the ping command.
Figure 55. The output from rup.
Figure 56. The output from uptime.
Figure 57. The output from w showing current user activity.
Figure 58. Obtaining full domain addresses from w with the -W option.
Figure 59. The output from rusers, showing who is logged on where.

Day 3:

Figure 60. Standard UNIX security features.
Figure 61. Aspects of a system relevant to security.
Figure 62. Files relevant to network behaviour.
Figure 63. hosts.equiv files used by Ve24 Indys.
Figure 64. hosts.equiv file for yoda.
Figure 65. hosts.equiv file for milamber.
Figure 66. Additional line in /etc/passwd enabling NIS.
Figure 67. Typical output from fuser.
Figure 68. Blocking the use of finger in the /etc/inetd.conf file.
Figure 69. Instructing inetd to restart itself (using killall).
