Draft Version 475 (Tue Sep 13 21:33:01 2005)

Introduction

- Most modern tools have a graphical user interface (GUI)
  - Because they're easier to use
- But command-line user interfaces (CLUIs) still have their place
  - Easier (faster) to build new CLUI tools
  - Easier to combine CLUI tools than GUI tools
  - Higher action-to-keystroke ratio
    - Once you're over the (steeper) learning curve
  - Easier to see and understand what the computer is doing on your behalf
    - Which is part of what this course is about
- Simplest way to introduce the file system, I/O redirection, and process control
  - Don't worry if you don't understand these terms right now; you will
- [Ray & Ray 2003] is a good introduction for newcomers
  - And despite its age, [Kernighan & Pike 1984] is still useful
- How to tell if you can skip this lecture
  - Do you know what a shell is?
  - Do you know the difference between an absolute path and a relative path?
  - Do you know what a process is?
  - Do you know what a pipe is?
  - Do you know what $PATH is?
  - Do you know what rwxr-xr-x means?

Operating Systems and Shells

- The operating system is the program that controls a computer's hardware
  - Automatically loaded when the computer boots up
  - Handles input and output (keyboard, mouse, disk, networks, ...)
  - Manages files and directories
  - Keeps track of who you are, and what you're allowed to do
- A shell is a program that manages a user's interactions with the operating system
  - Reads commands from the keyboard
  - Figures out what other programs to run
    - E.g., editor, compiler, browser, ...
  - If they produce plain text output, the shell displays it
    - Though most modern applications send drawing commands to the operating system instead

Figure 3.1: Operating System and Shell

- Many different shells have been written (remember, they're just programs)
  - sh was the first for Unix, and is still the standard
    - I.e., the lowest common denominator
  - We'll use bash (the Bourne again shell) in this course
    - Available just about everywhere
    - Even on Windows (thanks to Cygwin)
- The operating system automatically runs the shell for you when you log in
  - You can run multiple copies side by side in separate windows, just as you'd run multiple browsers

Lots of Little Tools

- Good tools are small, extensible frameworks
  - Only a handful of commands are built into the shell
  - The rest are implemented as programs in their own right
  - Users can create new commands that are indistinguishable from those that shipped with the system
- The shell:
  - Figures out what program you want to run
  - Tells the operating system to run that program
  - Gives that program access to the keyboard and screen
  - Passes it any parameters you've specified
  - Waits until it finishes
  - Repeats

A Few Simple Commands

- cat: concatenate files to standard output
  - I.e., show their contents

    $ cat todo.txt
    write lecture on how the shell works
    debug examples for version control lecture
    pick up cat food
    return Sally's DVDs

- date: print the time and date

    $ date
    Wed Jan 25 15:09:27 EST 2005

- ls: list files and directories

    $ ls /home/gvwilson/swc
    README.txt admin data exer img lec license.txt publ soln src util tmpl todo.txt

  - Important note: by default, ls does not show anything whose name begins with "."
    - This is why . and ..
      (the current directory and its parent) don't show up
    - We'll see in a moment how to get these to show up
- mv: move (rename) files

    $ mv /home/gvwilson/swc/README.txt /home/gvwilson/swc/notes.txt
    $ ls
    admin data exer img lec license.txt notes.txt publ soln src util tmpl todo.txt

  - Yeah, I know, they're not particularly memorable names.
- cd ("change directory") changes what directory you're currently in
  - And ls without arguments lists the contents of the current directory, so:

    $ cd /home/gvwilson/swc
    $ ls
    admin data exer img lec license.txt notes.txt publ soln src util tmpl todo.txt

- wc: count words
  - Actually counts the lines, words, and characters in a file

    $ wc license.txt
    44 272 1293 license.txt

Flags

- Control the behavior of commands by giving them extra arguments (or flags)
  - By convention, flags start with "-", as in "-c" or "-l"
- Examples
  - Show directories with trailing slash (ls -F)

    $ ls -F
    LICENSE.txt  admin/    data/  exer/  lec/  publ/    soln/  util/
    Makefile     cgi-bin/  etc/   img/   pdf/  scraps/  src/   web/

  - Sort files by extension (ls -X)

    $ ls -X
    cat_todo.txt.out  ls_x.txt.out        wildcards_off.txt.out  ls_a.txt     pipe_3.txt      wildcards_off.txt
    date.txt.out      redirect_1.txt.out  wildcards_on.txt.out   ls_f.txt     redirect_1.txt  wildcards_on.txt
    echo.txt.out      redirect_2.txt.out  wordcount.txt.out      ls_home.txt  redirect_2.txt  wordcount.txt
    ls_a.txt.out      redirect_3.txt.out  cat_todo.txt           ls_x.txt     redirect_3.txt
    ls_f.txt.out      rename.txt.out      date.txt               pipe_1.txt   rename.txt
    ls_home.txt.out   wc_l.txt.out        echo.txt               pipe_2.txt   wc_l.txt

  - Show all files and directories, including those whose names begin with "." (ls -a)

    $ ls -a
    . .. .svn admin data exer img lec publ soln src util tmpl README.txt license.txt todo.txt

  - Count lines only in all text files

    $ wc -l license.txt readme.txt todo.txt
    44 license.txt
    18 readme.txt
    4 todo.txt

- man command displays the manual page for a command
  - Lists all of the flags and their meanings
  - More of a reference guide than a tutorial.
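
To try the flags above without touching real files, the following is a minimal sketch that works in a throwaway directory. The directory comes from mktemp, and the names data, todo.txt, and .hidden are made up for illustration; none of them are part of the course files.

```shell
# Create a scratch directory (mktemp -d picks an unused name) and move into it.
demo=$(mktemp -d)
cd "$demo"

# Hypothetical contents: one subdirectory, one three-line file, one dot-file.
mkdir data
printf 'one\ntwo\nthree\n' > todo.txt
touch .hidden

ls              # default listing: data and todo.txt, no dot-files
ls -F           # same names, but the directory is marked with a trailing "/"
ls -a           # also shows ., .., and .hidden
wc -l todo.txt  # -l restricts wc to the line count
```

Running man ls afterwards lists many more flags than the three shown here.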

Wildcards

- Some characters (called wildcards) mean special things to the shell
  - "*" matches zero or more characters
  - "?" matches any single character
  - "~" on its own means "my home directory"
  - "~bhargan" means "Bhargan's home directory"
- The shell expands wildcards before running the command

    $ ls
    README.txt admin data exer img lec license.txt publ soln src util tmpl todo.txt
    $ ls *.txt
    README.txt license.txt todo.txt

- Note: it's the shell that expands wildcards, not the individual commands
  - There's no way for rm to know whether it was invoked as rm *.txt or rm contract.txt thesis.txt payclaim.txt

Files and Directories

- Data is stored in files
  - By convention, files have two-part names, like notes.txt or home.html
  - Most operating systems allow you to associate a filename extension with an application
    - E.g., ".txt" is associated with an editor, and ".html" with a web browser
  - But this is all just convention: you can call files (almost) anything you want
- Files are stored in directories (often called folders)
  - Directories can contain other directories, too
  - Results in the familiar directory tree

Figure 3.2: A Directory Tree

- Everything in a directory must have a unique name
  - Items can have the same name, but only if they're in different directories
- A collection of files and directories is called a file system
  - On Unix, the file system has a unique root directory called /
    - Every other directory is a child of it, or a child of a child, etc.
  - On Windows, every drive has its own root directory
    - So C:\home\gvwilson\notes.txt is different from J:\home\gvwilson\notes.txt
    - When you're using Cygwin, you can also write C:\home\gvwilson as c:/home/gvwilson
    - Or as /cygdrive/c/home/gvwilson
    - Some Unix programs give ":" a special meaning, so Cygwin needed a way to write paths without it.
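
Returning briefly to the Wildcards slide: one way to convince yourself that the shell, not the command, expands wildcards is to hand a pattern to echo, which never looks at the file system. This is a small sketch in a scratch directory; notes.md is a made-up file name, and the single-quoted form at the end is an aside that goes slightly beyond what the slide covers.

```shell
# Work in a throwaway directory so the patterns match a known set of files.
demo=$(mktemp -d)
cd "$demo"
touch contract.txt thesis.txt payclaim.txt notes.md

# echo only prints its arguments, so any file names it prints
# must have been filled in by the shell before echo started.
echo *.txt       # prints: contract.txt payclaim.txt thesis.txt
echo ?????.md    # five "?" match the five letters of "notes": prints notes.md

# Aside: single quotes switch expansion off, so echo sees the pattern itself.
echo '*.txt'     # prints: *.txt
```

The same reasoning applies to rm: by the time it runs, it only ever sees the expanded list of names.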
- A path is a description of how to find something in a file system
  - An absolute path describes a location from the root directory down
    - Equivalent to a street address
    - Always starts with "/"
    - E.g., /home/gvwilson is my home directory, and /courses/swc/lec/shell.swc is this file
  - A relative path describes how to find something from some other location
    - Equivalent to saying, "Four blocks north, and seven east"
    - E.g., from /courses/swc, the relative path to this file is lec/shell.swc
- Special symbols:
  - "." means "the current directory"
  - ".." means "the directory immediately above this one"
    - Also called the parent directory
    - In /courses/swc/data, .. is /courses/swc
    - In /courses/swc/data/elements, .. is /courses/swc/data

Figure 3.3: Parent Directories

Input and Output

- A running program is called a process
  - A particular program
  - With its own virtual memory
  - Inherits the rights of whoever ran it
- By default, every process has three connections to the outside world:
  - Standard input (stdin): connected to the keyboard
  - Standard output (stdout): connected to the screen
  - Standard error (stderr): also connected to the screen
    - Used for error messages

Figure 3.4: The Shell Running a Program

- You can tell the shell to connect stdin and stdout to files instead
  - command < inputFile reads from inputFile
  - command > outputFile writes to outputFile
    - Only "normal" output goes to the file, not error messages
  - command < inputFile > outputFile does both

Figure 3.5: Redirecting Standard Input and Output

- Example: save number of lines in all text files to words.len

    $ wc -l *.txt > words.len    # nothing appears
    $ ls -t
    words.len data/ exer/ todo.txt img/ lec/ publ/ admin/ src/ util/ README.txt license.txt soln/ tmpl/
    $ cat words.len
    44 license.txt
    18 readme.txt
    4 todo.txt

  - "-t" means "sort by modification time (most recent first)"
- Example: sort < data.txt reads lines from data.txt and sorts them
  - Redundant with many command-line tools, since they let you specify names of input files anyway

Pipes

- Suppose you want to use the output of one program as the input of another
  - E.g., use wc to see how big files are, then sort -n to sort numerically
- Option 1: send output of first command to a temporary file, then read from that file

    $ wc *.txt > temp
    $ sort -n temp

- Option 2: use a pipe to connect the first program's output to the second one's input
  - Written as "|"

    $ wc *.txt | sort -n

  - More efficient and less error prone
- Can chain any number of commands together
  - And combine with input and output redirection
  - E.g., find the five shortest text files in a directory:

    $ wc *.txt | sort -n | head -5 > shortest.files

Figure 3.6: Pipes

- Any program that reads from standard input, and writes to standard output, can use redirection and pipes
  - Programs that do this are often called filters

Environment Variables

- Like any other program, the shell has variables
  - Since they define a user's environment, they are usually called environment variables
  - Usually all upper case
- Type set at the command prompt to get a listing:

    $ set
    ANT_HOME=C:/apache-ant-1.6.2
    BASH=/usr/bin/bash
    COLUMNS=80
    COMPUTERNAME=ISHAD
    HISTFILESIZE=500

- Get a particular variable's value by putting a "$" in front of its name
  - E.g., the shell replaces "$HOME" with the current user's home directory
  - Often use the echo command to print this out

    $ echo $HOME
    /home/gvwilson

  - Question: why must you type echo $HOME, and not just $HOME?
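
As a small illustration of the points above, this sketch contrasts a plain word with a variable reference, and uses a pipe to shorten the output of set. The values printed depend entirely on your own account and machine, so no expected output is shown for them.

```shell
# Without "$", HOME is just a word and is passed to echo unchanged.
echo HOME

# With "$", the shell substitutes the variable's value before echo runs.
echo $HOME

# set can print a very long listing; piping it through head
# keeps only the first five lines.
set | head -5
```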
- To set or reset a variable's value temporarily, use this:

    $ export VARNAME=value

  - Only affects the current shell (and programs run from it)
- To set a variable's value automatically when you log in, set it in ~/.bashrc
  - Remember, "~" is a shortcut meaning "your home directory"
  - For me, right now, ~/.bashrc is /home/gvwilson/.bashrc
- Important environment variables

    Name | Typical Value | Notes
    HOME | /home/gvwilson | The current user's home directory
    HOMEDRIVE | C: | The current user's home drive (Windows only)
    HOSTNAME | "ishad" | This computer's name
    HOSTTYPE | "i686" | What kind of computer this is
    OS | "Windows_NT" | What operating system is running
    PATH | "/home/gvwilson/bin:/usr/local/bin:/usr/bin:/bin:/Python24/" | Where to look for programs
    PWD | /home/gvwilson/swc/lec | Present working directory (sometimes CWD, for current working directory)
    SHELL | /bin/bash | What shell is being run
    TEMP | /tmp | Where to store temporary files
    USER | "gvwilson" | The current user's ID

Table 3.1: Important Environment Variables

How the Shell Finds Programs

- The most important of these variables is PATH
  - The search path that tells the shell where to look for programs
- When you type a command like tabulate, the shell:
  - Splits $PATH on colons to get a list of directories
  - Looks for the program in each directory, in left-to-right order
  - Runs the first one that it finds
- Example
  - PATH is /home/gvwilson/bin:/usr/local/bin:/usr/bin:/bin:/Python24
  - Both /usr/local/bin/tabulate and /home/gvwilson/bin/tabulate exist
  - /home/gvwilson/bin/tabulate will be run when you type tabulate at the command prompt
  - Can run the other one by specifying the path, instead of just the command name
- Warning: it is common to include .
  in your path
  - This allows you to run a program in the current directory just by typing whatever, instead of ./whatever
  - But it also means you can never be quite sure what program a command will invoke
  - Though you can use the command which program_name, which will tell you
- Common entries in PATH include:
  - /bin, /usr/bin: core tools like ls
    - Note: the word "bin" comes from "binary", which is geekspeak for "a compiled program"
  - /usr/local/bin: optional (but common) tools, like gcc
  - $HOME/bin: tools you have built for yourself
    - Remember, $HOME means "the user's home directory"
    - So this is equivalent to ~/bin
- Cygwin does things a little differently
  - Uses the notation /cygdrive/c/somewhere instead of Windows' c:/somewhere
    - The colon in c:/somewhere would clash with the colons in the PATH variable
  - By default, Cygwin treats c:/cygwin as the root of its file system
    - So /home/aturing is a synonym for c:/cygwin/home/aturing
  - Yes, it can be confusing, but remember: we're trying to run one operating system's tools on top of another

Ownership and Permission: Unix

- On Unix, every user belongs to one or more groups
  - The groups command will show you which ones you are in
- Every file is owned by a particular user and a particular group
- Owner can assign different read, write, and execute permissions to user, group, and others
  - Read: can look at contents, but not modify them
  - Write: can modify contents
  - Execute: can run the file (e.g., it's a program)
- ls -l will show all of this information
  - (Along with the file's size and a few other things)
  - Permissions displayed as three rwx triples
  - Missing permissions shown by "-"
  - Example: rw-rw-r-- means "user and group can read and write; everyone else can read; no one can execute"
- Change permissions using chmod
  - Example: chmod u+x something.exe gives the user execute permission to something.exe
  - Example: chmod o-r notes.txt takes away the world's read permission for notes.txt
- Permissions mean something a little different for directories
  - Execute
    permission means you can "go into" a directory, but does not mean you can read its contents
  - So if a directory called tools has permission rwx--x--x (i.e., owner can do anything, but everyone else only has execute permission), then:
    - If someone other than the owner does ls tools, permission is denied
    - But if there's a useful program called tools/findanswers, other users can still run it

Ownership and Permission: Windows

- Of course, it all works differently on Windows
  - Not better or worse, just differently
- Windows XP uses access control lists (ACLs)
  - Instead of describing users as "file owner, group member, or something else", ACLs let you specify exactly what any particular user, or set of users, can do to a file, directory, device, etc.
- Older versions of Windows (such as Windows 95 and Windows 98) are fundamentally insecure, and shouldn't be used
- Cygwin does its best to make the Windows model look like Unix's
  - If you trip over the differences, please consult a system administrator

Basic Tools

    man     Documentation for commands.
    cat     Concatenate and display text files.
    cd      Change working directory.
    chmod   Change file and directory permissions.
    clear   Clear the screen.
    cp      Copy files and directories.
    date    Display the current date and time.
    diff    Show differences between two text files.
    echo    Print arguments.
    env     Show environment variables.
    head    Display the first few lines of a file.
    ls      List files and directories.
    mkdir   Make directories.
    more    Page through a text file.
    mv      Move (rename) files and directories.
    od      Display the bytes in a file.
    passwd  Change your password.
    pwd     Print current working directory.
    rm      Remove files.
    rmdir   Remove directories.
    sort    Sort lines.
    tail    Display the last few lines of a file.
    uniq    Remove duplicate lines.
    wc      Count lines, words, and characters in a file.
    which   Locate a command.

Table 3.2: Basic Command-Line Tools

Examples

- Give several examples of how to do useful things with basic tools (#84)

More Advanced Tools

    du      Print the disk space used by files and directories.
    find    Find files that match a pattern.
    grep    Print lines matching a pattern.
    gunzip  Uncompress a file.
    gzip    Compress a file.
    lpr     Send a file to a printer.
    lprm    Remove a print job from a printer's queue.
    lpq     Check the status of a printer's queue.
    ps      Display running processes.
    tar     Archive files.
    which   Find the path to a program.
    who     See who is logged in.
    xargs   Execute a command for each line of input.

Table 3.3: Advanced Command-Line Tools

Exercises

Exercise 3.1: Suppose you are in your home directory, and ls shows you this:

    Makefile biography.txt data enrolment.txt programs thesis

What argument(s) do you have to give to ls to get it to put a trailing slash after the names of subdirectories, like this:

    Makefile biography.txt data/ enrolment.txt programs/ thesis/

If you run ls data, it shows:

    earth.txt jupiter.txt mars.txt mercury.txt saturn.txt venus.txt

What command should you run to get the following output:

    data/earth.txt data/jupiter.txt data/mars.txt data/mercury.txt data/saturn.txt data/venus.txt

What if you want this (note that an extra entry is being displayed):

    total 7
    drwxr-xr-x 7 someone    0 May 6 08:27 .svn
    -rw-r--r-- 1 someone 2396 May 6 08:38 earth.txt
    -rw-r--r-- 1 someone 1263 May 6 08:38 jupiter.txt
    -rw-r--r-- 1 someone 1015 May 6 08:43 mars.txt
    -rw-r--r-- 1 someone  946 May 6 08:41 mercury.txt
    -rw-r--r-- 1 someone 1714 May 6 08:40 saturn.txt
    -rw-r--r-- 1 someone  881 May 6 08:40 venus.txt

Note: the command will display your user ID, rather than someone. On some machines, the command will also display a group ID. Ignore these differences for the purpose of this question.

Exercise 3.2: According to the listing of the data directory above, who can read the file mercury.txt?
Who can write it (i.e., change its contents or delete it)? When was mercury.txt last changed? What command would you run to allow everyone to edit or delete the file?

Exercise 3.3: Suppose you want to remove all files whose names (not including their extensions) are of length 3, start with the letter a, and have .txt as extension. What command would you use? For example, if the directory contains three files a.txt, abc.txt, and abcd.txt, the command should remove abc.txt, but not the other two files.

Exercise 3.4: What does the command cd ~ do? What about cd ~gvwilson?

Exercise 3.5: What's the difference between the commands cd HOME and cd $HOME?

Exercise 3.6: Suppose you want to list the names of all the text files in the data directory that contain the word "carpentry". What command or commands could you use?

Exercise 3.7: Suppose you have written a program called analyze. What command or commands could you use to display the first ten lines of its output? What would you use to display lines 50-100? To send lines 50-100 to a file called tmp.txt?

Exercise 3.8: The command ls data > tmp.txt writes a listing of the data directory's contents into tmp.txt. Anything that was in the file before the command was run is overwritten. What command could you use to append the listing to tmp.txt instead?

Exercise 3.9: What command(s) would you use to find out how many subdirectories there are in the lectures directory?

Exercise 3.10: What does rm *.ch do? What about rm *.[ch]?

Exercise 3.11: What command(s) could you use to find out how many instances of a program are running on your computer at once? For example, if you are on Windows, what would you do to find out how many instances of svchost.exe are running? On Unix, what would you do to find out how many instances of bash are running?

Exercise 3.12: What do the commands pushd, popd, and dirs do? Where do their names come from?

Exercise 3.13: How would you send the file earth.txt to the default printer?
How would you check it made it (other than wandering over to the printer and standing there)?

Exercise 3.14: A colleague asks for your data files. How would you archive them to send as one file? How could you compress them?

Exercise 3.15: The instructor wants you to use a hitherto unknown command for manipulating files. How would you get help on this command?

Exercise 3.16: You're worried your data files can be read by your nemesis, Dr. Evil. How would you check whether or not he can, and if necessary change permissions so only you can read or write the files?

Exercise 3.17: You have changed a text file on your home PC, and mailed it to the university terminal. What steps can you take to see what changes you may have made, compared with a master copy in your home directory?

Exercise 3.18: How would you change your password?

Exercise 3.19: grep is one of the more useful tools in the toolbox. It finds lines in files that match a pattern and prints them out. For example, assume I have files earth.txt and venus.txt containing lines like this:

    Name: Earth
    Period: 365.26 days
    Inclination: 0.00
    Eccentricity: 0.02

If I type grep Period *.txt in that directory, I get:

    earth.txt:Period: 365.26 days
    venus.txt:Period: 224.70 days

Search strings can use regular expressions, which will be discussed in a later lecture. grep takes many options as well; for example, grep -c /bin/bash /etc/passwd reports how many lines in /etc/passwd (the Unix password file) contain the string /bin/bash, which in turn tells me how many users are using bash as their shell.

Suppose all you wanted was a list of the files that contained lines matching a pattern, rather than the matches themselves. What flag or flags would you give to grep? What if you wanted the line numbers of matching lines?

Exercise 3.20: diff finds and displays the differences between two files. It works best if both files are plain text (i.e., not images or Excel spreadsheets).
By default, it shows the differences in groups, like this:

    3c3,4
    < Inclination: 0.00
    ---
    > Inclination: 0.00 degrees
    > Satellites: 1

(The rather cryptic header "3c3,4" means that line 3 of the first file must be changed to get lines 3-4 of the second.) What flag(s) should you give diff to tell it to ignore changes that just insert or delete blank lines? What if you want to ignore changes in case (i.e., treat lowercase and uppercase letters as the same)?

Copyright 2005, Python Software Foundation. See License for details.