
Searching Files on UNIX

Searching Files Using UNIX grep


The grep program is a standard UNIX utility that searches through a set of files for an arbitrary text pattern, specified through a regular expression. Also check the man pages for egrep and fgrep. The MPE equivalents are MPEX and Magnet, both third-party products. By default, grep is case-sensitive (use -i to ignore case). By default, grep ignores the context of a string (use -w to match words only). By default, grep shows the lines that match (use -v to show those that don't match).
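The three defaults above can be tried on a throwaway file. This is a minimal sketch: the file name sample.txt and its four lines are invented for illustration, and it assumes a grep that supports -w (the common GNU and BSD versions do).

```shell
# Build a small sample file in a temporary directory (hypothetical content).
tmpdir=$(mktemp -d)
printf '%s\n' 'Smug cat' 'a smug dog' 'smuggler at sea' 'plain line' > "$tmpdir/sample.txt"

# Default: case-sensitive, and 'smug' matches anywhere, even inside 'smuggler'.
default_hits=$(grep 'smug' "$tmpdir/sample.txt")

# -i ignores case, so 'Smug cat' is matched as well.
nocase_hits=$(grep -i 'smug' "$tmpdir/sample.txt")

# -w matches whole words only, so 'smuggler' is skipped.
word_hits=$(grep -w 'smug' "$tmpdir/sample.txt")

# -v inverts the match: show the lines that do NOT contain 'smug'.
inverse_hits=$(grep -v 'smug' "$tmpdir/sample.txt")

printf 'default:\n%s\n\n-i:\n%s\n\n-w:\n%s\n\n-v:\n%s\n' \
  "$default_hits" "$nocase_hits" "$word_hits" "$inverse_hits"

rm -rf "$tmpdir"
```

The options can be combined, for example grep -i -w 'smug' to match the whole word in either case.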

Understanding Regular Expressions


Regular Expressions are a feature of UNIX. They describe a pattern to match, a sequence of characters, not words, within a line of text. Here is a quick summary of the special characters used in the grep tool and their meaning:

^ (Caret)        = match expression at the start of a line, as in ^A.
$ (Dollar Sign)  = match expression at the end of a line, as in A$.
\ (Back Slash)   = turn off the special meaning of the next character, as in \^.
[ ] (Brackets)   = match any one of the enclosed characters, as in [aeiou]. Use Hyphen "-" for a range, as in [0-9].
[^ ]             = match any one character except those enclosed in [ ], as in [^0-9].
. (Period)       = match a single character of any value, except end of line.
* (Asterisk)     = match zero or more of the preceding character or expression.
\{x,y\}          = match x to y occurrences of the preceding.
\{x\}            = match exactly x occurrences of the preceding.
\{x,\}           = match x or more occurrences of the preceding.
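A few of the special characters from the summary, tried against a small file. This is only a sketch: chars.txt and its contents are made up for the demonstration, and the variable names are arbitrary.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'Apple pie' 'pie for A' 'room 42' 'aaa' 'aa' > "$tmpdir/chars.txt"

# ^ anchors the match at the start of the line.
starts_with_A=$(grep '^A' "$tmpdir/chars.txt")

# $ anchors the match at the end of the line.
ends_with_A=$(grep 'A$' "$tmpdir/chars.txt")

# [0-9] matches any one digit.
has_digit=$(grep '[0-9]' "$tmpdir/chars.txt")

# [^0-9]* between ^ and $ means the whole line is digit-free; -c counts such lines.
no_digit_count=$(grep -c '^[^0-9]*$' "$tmpdir/chars.txt")

# \{3\} asks for exactly three of the preceding character.
exactly_three=$(grep '^a\{3\}$' "$tmpdir/chars.txt")

# \{2,3\} asks for two to three of the preceding character.
two_or_three=$(grep '^a\{2,3\}$' "$tmpdir/chars.txt")

printf '%s\n' "$starts_with_A" "$ends_with_A" "$has_digit" \
  "$no_digit_count" "$exactly_three" "$two_or_three"

rm -rf "$tmpdir"
```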

As an MPE user, you may find regular expressions difficult to use at first. Please persevere, because they are used in many UNIX tools, from more to perl. Unfortunately, some tools use simple regular expressions, others use extended regular expressions, and some extended features have been merged into simple tools, so that it looks as if every tool has its own syntax. Not only that, regular expressions use the same characters as shell wildcarding, but they are not used in exactly the same way. What do you expect of an operating system built by graduate students?

Since you usually type regular expressions within shell commands, it is good practice to enclose the regular expression in single quotes (') to stop the shell from expanding it before passing the argument to your search tool. Here are some examples using grep:

grep smug files            {search files for lines with 'smug'}
grep '^smug' files         {'smug' at the start of a line}
grep 'smug$' files         {'smug' at the end of a line}
grep '^smug$' files        {lines containing only 'smug'}
grep '\^s' files           {lines containing '^s', "\" escapes the ^}
grep '[Ss]mug' files       {search for 'Smug' or 'smug'}
grep 'B[oO][bB]' files     {search for BOB, Bob, BOb or BoB}
grep '^$' files            {search for blank lines}
grep '[0-9][0-9]' file     {search for pairs of numeric digits}
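The advice about single quotes is easiest to see by watching what the shell does to an unquoted pattern. The sketch below works in a scratch directory; the file names alpha, beta and data.txt are invented, and echo is used only to reveal what the command would receive.

```shell
start_dir=$PWD
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch alpha beta
printf '%s\n' 'abba' '' 'smug' 'very smug' > data.txt

# Unquoted: the shell treats [ab]* as a filename wildcard and
# replaces it with the matching file names before the command runs.
unquoted=$(echo [ab]*)

# Single-quoted: the pattern reaches the command untouched.
quoted=$(echo '[ab]*')

# With quotes, grep receives the regular expression as written.
ab_only=$(grep '^[ab][ab]*$' data.txt)   # lines made only of a's and b's
exact=$(grep '^smug$' data.txt)          # lines containing only 'smug'
blank_count=$(grep -c '^$' data.txt)     # number of blank lines

printf '%s\n' "$unquoted" "$quoted" "$ab_only" "$exact" "$blank_count"

cd "$start_dir" || exit 1
rm -rf "$tmpdir"
```

Had the grep pattern been left unquoted here, grep would have been handed the names alpha and beta instead of the expression.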

Back Slash "\" is used to escape the next symbol, that is, to turn off the special meaning that it has. To look for a Caret "^" at the start of a line, the expression is ^\^. Period "." matches any single character. So b.b will match "bob", "bib", "b-b", etc.

Asterisk "*" does not mean the same thing in regular expressions as in wildcarding; it is a modifier that applies to the preceding single character, or expression such as [0-9]. An asterisk matches zero or more of what precedes it. Thus [A-Z]* matches any number of upper-case letters, including none, while [A-Z][A-Z]* matches one or more upper-case letters.

The vi editor uses \< and \> to match the beginning and end of a word. A word boundary is either the edge of the line or any character except a letter, digit or underscore "_". To look for if, but skip stiff, the expression is \<if\>. For the same logic in grep, invoke it with the -w option.

And remember that regular expressions are case-sensitive. If you don't care about the case, the expression to match "if" would be [Ii][Ff], where the characters in square brackets define a character set from which the pattern must match one character. Alternatively, you could also invoke grep with the -i option to ignore case. Here are a few more examples of grep to show you what can be done:

grep '^From: ' /usr/mail/$USER     {list your mail}
grep '[a-zA-Z]'                    {any line with at least one letter}
grep '[^a-zA-Z0-9]'                {anything not a letter or number}
grep '[0-9]\{3\}-[0-9]\{4\}'       {999-9999, like phone numbers}
grep '^.$'                         {lines with exactly one character}
grep '"smug"'                      {'smug' within double quotes}
grep '"*smug"*'                    {'smug', with or without quotes}
grep '^\.'                         {any line that starts with a Period "."}
grep '^\.[a-z][a-z]'               {line start with "." and 2 lc letters}
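The points about -w, character sets, and the asterisk can be checked side by side, together with two of the patterns listed above. This is a sketch on invented sample lines (more.txt is a hypothetical file); LC_ALL=C is set as an assumption so that [A-Z] means plain ASCII upper-case letters regardless of the local language settings.

```shell
# Assumption: force the C locale so letter ranges behave the same everywhere.
LC_ALL=C
export LC_ALL

tmpdir=$(mktemp -d)
printf '%s\n' 'if x then' 'stiff upper lip' 'IF loud' \
  'call 555-1234' 'say "smug"' 'plain smug' > "$tmpdir/more.txt"

# -w keeps 'if' as a whole word, so 'stiff' is skipped.
word_if=$(grep -w 'if' "$tmpdir/more.txt")

# One character set per letter matches either case, but also matches inside 'stiff'.
anycase_if=$(grep '[Ii][Ff]' "$tmpdir/more.txt")

# [A-Z][A-Z]* needs at least one upper-case letter.
upper=$(grep '[A-Z][A-Z]*' "$tmpdir/more.txt")

# [A-Z]* alone is satisfied by zero letters, so every line counts as a match.
star_count=$(grep -c '[A-Z]*' "$tmpdir/more.txt")

# Three digits, a hyphen, four digits.
phone=$(grep '[0-9]\{3\}-[0-9]\{4\}' "$tmpdir/more.txt")

# Zero or more double quotes on either side of 'smug'.
smug=$(grep '"*smug"*' "$tmpdir/more.txt")

printf '%s\n' "$word_if" "$anycase_if" "$upper" "$star_count" "$phone" "$smug"

rm -rf "$tmpdir"
```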

Wildcards in Filenames
UNIX allows wildcards in almost all commands; it is actually a feature of the shell. Caution: UNIX also uses the wildcard characters in pattern matching, but the meaning is only similar, not identical. MPE allows wildcards in the Listf, Store, Restore, and Purge (new feature) commands.

UNIX Wildcards

?      any single character, except a leading dot
*      zero or more characters, except a leading dot
[ ]    defines a class of characters ( - for range, ! to exclude)

UNIX Examples:

[abc]??       3 character filename beginning with "a", "b", or "c".
[1-9][A-Z]    2 character filename starting with a digit from 1 to 9, and ending with an uppercase letter.
[!A-Z]??      3 character filename that does not begin with an uppercase letter.
*e[0-9]f      any file ending with "e", a single number, and "f".
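The UNIX wildcard examples can be tried safely in a scratch directory, using echo to show what the shell expands each pattern to. The file names below are invented for the demonstration, and LC_ALL=C is an assumption made so that [A-Z] covers only ASCII upper-case letters and the names come back in a predictable order.

```shell
# Assumption: force the C locale for predictable ranges and sort order.
LC_ALL=C
export LC_ALL

start_dir=$PWD
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch abc b12 Zed 1A 9z file7f .hidden

three_abc=$(echo [abc]??)        # 3 characters, first one a, b or c
digit_upper=$(echo [1-9][A-Z])   # a digit 1-9 followed by an upper-case letter
not_upper=$(echo [!A-Z]??)       # 3 characters, first one not upper-case
e_digit_f=$(echo *e[0-9]f)       # ends with "e", one digit, and "f"
all_visible=$(echo *)            # * skips names with a leading dot

printf '%s\n' "$three_abc" "$digit_upper" "$not_upper" "$e_digit_f" "$all_visible"

cd "$start_dir" || exit 1
rm -rf "$tmpdir"
```

Note that .hidden never appears: neither * nor a bracket class matches a leading dot.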

MPE Wildcards

Remember that the MPE Name Space upshifts all filenames, although you can create files in the POSIX Name Space by preceding them with dot . or slash /.

@      anything, including nothing
#      a single numeric digit
?      a single alphanumeric character
[ ]    defines a class of characters ( - for range, no ! to exclude); :Listfile only

MPE Examples:

@fix@       any filename containing the string "FIX" (MPE name space implied).
./@fix@     any filename containing lowercase "fix" (:Listfile only, POSIX name space).
[ABC]??     3 character filename beginning with "A", "B", or "C" (:Listfile only).
