
4.1. The grep Command

4.1.1 The Meaning of grep

The name grep can be traced back to the ex editor. If you invoked that editor and wanted to search for a string, you would type at the ex prompt:


: /pattern/p


The first line containing the string pattern would be printed by the "p" (print) command. If you wanted all the lines that contained pattern to be printed, you would type:


: g/pattern/p


When g precedes pattern, it means "all lines in the file," that is, "perform the command globally."

Because the search pattern is called a regular expression, we can substitute RE for pattern and the command reads


: g/RE/p


And there you have it: the meaning of grep and the origin of its name. It means "globally search for the regular expression (RE) and print out the line." The nice part of using grep is that you do not have to invoke an editor to perform a search, and you do not need to enclose the regular expression in forward slashes. It is much faster than using ex or vi.

4.1.2 How grep Works

The grep command searches for a pattern of characters in a file or multiple files. If the pattern contains whitespace, it must be quoted. The pattern is either a quoted string or a single word,[1] and all other words following it are treated as filenames. Grep sends its output to the screen and does not change or affect the input file in any way.

[1] A word is also called a token.

FORMAT


grep word filename filename


Example 4.1.

grep Tom /etc/passwd


EXPLANATION

Grep will search for the pattern Tom in a file called /etc/passwd. If successful, the line from the file will appear on the screen; if the pattern is not found, there will be no output at all; and if the file is not a legitimate file, an error will be sent to the screen. If the pattern is found, grep returns an exit status of 0, indicating success; if the pattern is not found, the exit status returned is 1; and if the file is not found, the exit status is 2.
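The quoting rule described above can be sketched as follows. This is a minimal illustration, not part of the original example: the sample file and its contents are hypothetical and are created in a temporary location so that the commands do not depend on the contents of /etc/passwd.

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'Tom Savage:408' 'Tommy Jones:510' 'Betty Tom:212' > "$sample"

# A single word needs no quotes: every line containing Tom is selected.
one_word=$(grep Tom "$sample")

# A pattern containing a space must be quoted. Without the quotes,
# grep would treat Savage as the name of a file to search.
two_words=$(grep 'Tom Savage' "$sample")

printf '%s\n' "$one_word"
printf '%s\n' "$two_words"

rm -f "$sample"
```

The first search selects all three lines, because Tommy also contains the characters Tom; the second selects only the line containing Tom Savage.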

The grep program can get its input from standard input or a pipe, as well as from files. If you forget to name a file, grep will assume it is getting input from standard input (the keyboard) and will wait until you type something. If the input comes from a pipe, the output of the preceding command is sent as input to grep, and grep prints each line that matches the pattern to the screen.

Example 4.2.

ps -ef | grep root


EXPLANATION

The output of the ps command (ps -ef displays all processes running on this system) is sent to grep and all lines containing root are printed.
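Because the output of ps differs from system to system, the same idea can be tried with fixed input. In this sketch the three input lines are made up for illustration; printf stands in for ps, and grep is given no filename, so it reads whatever arrives through the pipe.

```shell
# grep is given no filename, so it reads standard input, which here
# is the output of printf arriving through the pipe.
matches=$(printf '%s\n' 'root 1 init' 'tom 212 vi' 'root 340 cron' | grep root)

# Only the two lines containing root are kept.
printf '%s\n' "$matches"
```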

4.1.3 Metacharacters

A metacharacter is a character that represents something other than itself. ^ and $ are examples of metacharacters.

The grep command supports a number of regular expression metacharacters (see Table 4.1) to help further define the search pattern. It also provides a number of options (see Table 4.2) to modify the way it does its search or displays lines. For example, you can provide options to turn off case sensitivity, display line numbers, display errors only, and so on.

Example 4.3.

grep -n  '^jack:' /etc/passwd


Table 4.1. grep's Regular Expression Metacharacters

| Metacharacter | Function | Example | What It Matches |
|---|---|---|---|
| ^ | Beginning-of-line anchor | '^love' | Matches all lines beginning with love. |
| $ | End-of-line anchor | 'love$' | Matches all lines ending with love. |
| . | Matches any one character | 'l..e' | Matches lines containing an l, followed by two characters, followed by an e. |
| * | Matches zero or more of the character preceding the asterisk | ' *love' | Matches lines with zero or more spaces, followed by the pattern love. |
| [ ] | Matches one character in the set | '[Ll]ove' | Matches lines containing love or Love. |
| [^] | Matches one character not in the set | '[^A-K]ove' | Matches lines containing a character that is not in the range A through K, followed by ove. |
| \< | Beginning-of-word anchor | '\<love' | Matches lines containing a word that begins with love. |
| \> | End-of-word anchor | 'love\>' | Matches lines containing a word that ends with love. |
| \(..\) | Tags matched characters | '\(love\)ing' | Tags the marked portion in a register to be remembered later as number 1. To reference it later, use \1 to repeat the pattern. Up to nine tags may be used, numbered from the leftmost tag in the pattern. For example, the pattern love is saved in register 1 to be referenced later as \1. |
| x\{m\}, x\{m,\}, x\{m,n\}[a] | Repetition of character x: m times, at least m times, or between m and n times | 'o\{5\}', 'o\{5,\}', 'o\{5,10\}' | Matches if the line has 5 consecutive occurrences of o, at least 5 consecutive occurrences of o, or between 5 and 10 consecutive occurrences of o. |

[a] The \{ \} metacharacters are not supported on all versions of UNIX or all pattern-matching utilities; they usually work with vi and grep.
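A few of the metacharacters in Table 4.1 can be tried against a small file. The following is a sketch under stated assumptions: the file contents are invented for illustration, and the word anchors and \{ \} repetition are assumed to be supported by the local grep, which the footnote above warns is not always the case. The -c option is used so that each search reports a count of matching lines rather than the lines themselves.

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'love is grand' 'I love you' 'Love letters' \
              'gloves' 'true love' 'hellooooo' > "$sample"

begins=$(grep -c '^love' "$sample")       # lines beginning with love
ends=$(grep -c 'love$' "$sample")         # lines ending with love
either=$(grep -c '[Ll]ove' "$sample")     # lines containing love or Love
word=$(grep -c '\<love\>' "$sample")      # love as a whole word
five_o=$(grep -c 'o\{5\}' "$sample")      # five consecutive o characters

echo "begins=$begins ends=$ends either=$either word=$word five_o=$five_o"

rm -f "$sample"
```

Note that gloves is counted by '[Ll]ove' but not by '\<love\>', because the word anchors require love to stand alone as a word.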

Table 4.2. grep's Options

| Option | What It Does |
|---|---|
| -b | Precedes each line by the block number on which it was found. This is sometimes useful in locating disk block numbers by context. |
| -c | Displays a count of matching lines rather than displaying the lines that match. |
| -h | Does not display filenames. |
| -i | Ignores the case of letters in comparisons (i.e., upper- and lowercase are considered identical). |
| -l | Lists only the names of files with matching lines (once), separated by newline characters. |
| -n | Precedes each line by its relative line number in the file. |
| -s | Works silently, that is, displays nothing except error messages. This is useful for checking the exit status. |
| -v | Inverts the search to display only lines that do not match. |
| -w | Searches for the expression as a word, as if surrounded by \< and \>. This applies to grep only. (Not all versions of grep support this feature; e.g., SCO UNIX does not.) |


EXPLANATION

In Example 4.3, grep searches the /etc/passwd file for lines that begin with jack followed by a colon. Because of the -n option, each matching line is printed preceded by its line number in the file.
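Several of the options in Table 4.2 can be compared side by side. This sketch uses a hypothetical file in passwd-like format, created in a temporary location; the -w option is assumed to be available, which the table notes is not true of every version of grep.

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'jack:x:101' 'Jack Sprat:x:102' 'jackson:x:103' 'root:x:0' > "$sample"

numbered=$(grep -n '^jack:' "$sample")   # matching line, preceded by its line number
count=$(grep -c 'jack' "$sample")        # how many lines contain jack
nocase=$(grep -i -c 'jack' "$sample")    # same count, ignoring case
inverted=$(grep -v -c 'jack' "$sample")  # how many lines do not contain jack
as_word=$(grep -w 'jack' "$sample")      # jack as a whole word only
names=$(grep -l 'jack' "$sample")        # name of each file with a match

printf '%s\n' "$numbered" "$count" "$nocase" "$inverted" "$as_word" "$names"

rm -f "$sample"
```

With -w, the line for jackson is skipped even though it contains the characters jack, because jack is not a complete word there.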

4.1.4 grep and Exit Status

The grep command is very useful in shell scripts, because it always returns an exit status to indicate whether it was able to locate the pattern or the file you were looking for. If the pattern is found, grep returns an exit status of 0, indicating success; if grep cannot find the pattern, it returns 1 as its exit status; and if the file cannot be found, grep returns an exit status of 2. (Other UNIX utilities that search for patterns, such as sed and awk, do not use the exit status to indicate the success or failure of locating a pattern; they report failure only if there is a syntax error in a command.)

In the following example, john is not found in the /etc/passwd file.

Example 4.4.

1   % grep 'john' /etc/passwd          # john is not in the passwd file

2   % echo $status      (csh)

    1


or


2   $ echo $?           (sh, ksh)

     1


EXPLANATION

  1. Grep searches for john in the /etc/passwd file, and if successful, grep exits with a status of 0. If john is not found in the file, grep exits with 1. If the file is not found, an exit status of 2 is returned.

  2. The C shell variable, status, and the Bourne/Korn shell variable, ?, are assigned the exit status of the last command that was executed.
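The three exit statuses can be captured and used in a Bourne-style shell script along the following lines. This is a sketch, not the only way to do it: the temporary file stands in for /etc/passwd, its contents and the variable names are invented for illustration, and output is sent to /dev/null so that only the exit status matters.

```shell
# Hypothetical stand-in for /etc/passwd, created only for this illustration.
passwd_copy=$(mktemp)
printf '%s\n' 'root:x:0:0' 'tom:x:1001:1001' > "$passwd_copy"

# Pattern found: grep exits with 0, so the status variable keeps its 0.
found_status=0
grep 'tom' "$passwd_copy" > /dev/null || found_status=$?

# Pattern not found: grep exits with 1.
missing_status=0
grep 'john' "$passwd_copy" > /dev/null || missing_status=$?

# File not found: grep exits with 2 (the error message is discarded).
nofile_status=0
grep 'john' "$passwd_copy.nosuchfile" > /dev/null 2>&1 || nofile_status=$?

echo "found=$found_status missing=$missing_status nofile=$nofile_status"

# The exit status can also drive a conditional directly.
if grep 'tom' "$passwd_copy" > /dev/null
then
    echo "tom has an entry"
else
    echo "tom has no entry"
fi

rm -f "$passwd_copy"
```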
