
4.2. grep Examples with Regular Expressions

The following datafile, used for the examples in this section, is repeated periodically for your convenience.

% cat datafile

northwest             NW        Charles Main           3.0   .98   3       34

western               WE        Sharon Gray            5.3   .97   5       23

southwest             SW        Lewis Dalsass          2.7   .8    2       18

southern              SO        Suan Chin              5.1   .95   4       15

southeast             SE        Patricia Hemenway      4.0   .7    4       17

eastern               EA        TB Savage              4.4   .84   5       20

northeast             NE        AM Main Jr.            5.1   .94   3       13

north                 NO        Margot Weber           4.5   .89   5       9

central               CT        Ann Stephens           5.7   .94   5       13


Example 4.5.

% grep NW datafile

northwest             NW        Charles Main           3.0   .98   3       34


EXPLANATION

Prints all lines containing the regular expression NW in a file called datafile.

Example 4.6.

% grep NW d*

datafile: northwest   NW       Charles Main            3.0   .98    3     34

db: northwest         NW        Joel Craig             30     40    5     123


EXPLANATION

Prints all lines containing the regular expression NW in all files starting with a d. The shell expands d* to all files that begin with a d. In this case, the filenames retrieved are db and datafile.
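
The division of labor between the shell and grep can be checked directly. The following is a minimal sketch, assuming a scratch directory that holds only two one-line files named datafile and db (their contents are taken from the output above; the tab separators are an assumption).

```shell
# Work in a scratch directory so that d* can only match our two files.
cd "$(mktemp -d)" || exit 1
printf 'northwest\tNW\tCharles Main\t3.0\t.98\t3\t34\n' > datafile
printf 'northwest\tNW\tJoel Craig\t30\t40\t5\t123\n' > db

# The shell, not grep, expands d* before grep ever runs.
expanded=$(echo d*)
echo "grep receives: NW $expanded"

# With more than one file, each matching line is prefixed with its filename.
matches=$(grep NW d*)
echo "$matches"

# -l prints only the names of the files that contain a match.
names=$(grep -l NW d*)
echo "$names"
```

Because grep never sees the d* itself, quoting it (grep NW 'd*') would make grep look for a file literally named d*.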

Example 4.7.

% grep '^n' datafile

northwest            NW       Charles Main          3.0   .98    3    34

northeast            NE       AM Main Jr.           5.1   .94    3    13

north                NO       Margot Weber          4.5   .89    5     9


EXPLANATION

Prints all lines beginning with an n. The caret (^) is the beginning-of-line anchor.

Example 4.8.

% grep '4$' datafile

northwest             NW       Charles Main            3.0   .98    3     34


EXPLANATION

Prints all lines ending with a 4. The dollar sign ($) is the end-of-line anchor.
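
The two anchors can also be used in the same pattern. The sketch below rebuilds the datafile in a scratch directory (the row helper and the tab-separated layout are assumptions made for illustration) and then combines ^ and $ in one expression.

```shell
# Recreate the chapter's datafile; tab-separated fields are an assumption.
cd "$(mktemp -d)" || exit 1
row() { printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' "$@"; }
{
  row northwest NW 'Charles Main'      3.0 .98 3 34
  row western   WE 'Sharon Gray'       5.3 .97 5 23
  row southwest SW 'Lewis Dalsass'     2.7 .8  2 18
  row southern  SO 'Suan Chin'         5.1 .95 4 15
  row southeast SE 'Patricia Hemenway' 4.0 .7  4 17
  row eastern   EA 'TB Savage'         4.4 .84 5 20
  row northeast NE 'AM Main Jr.'       5.1 .94 3 13
  row north     NO 'Margot Weber'      4.5 .89 5 9
  row central   CT 'Ann Stephens'      5.7 .94 5 13
} > datafile

# Lines that begin with n AND end with 4; .* spans everything in between.
both=$(grep '^n.*4$' datafile)
echo "$both"

# -c prints a count of matching lines instead of the lines themselves.
s_count=$(grep -c '^s' datafile)
echo "lines starting with s: $s_count"
```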

Example 4.9.

% grep TB Savage datafile

grep: Savage: No such file or directory

datafile: eastern     EA       TB Savage               4.4   .84     5     20


EXPLANATION

Because the first argument is the pattern and all of the remaining arguments are filenames, grep will search for TB in a file called Savage and a file called datafile. To search for TB Savage, see the next example.

Example 4.10.

% grep 'TB Savage' datafile

eastern               EA        TB Savage                 4.4   .84    5   20


EXPLANATION

Prints all lines containing the pattern TB Savage. Without quotes (in this example, either single or double quotes will do), the whitespace between TB and Savage would cause grep to search for TB in a file called Savage and a file called datafile, as in the previous example.
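
The effect of quoting can be seen by counting the arguments the shell hands to a command. This is a small sketch; count_args is a hypothetical helper written for this illustration, and the one-line input stands in for the datafile.

```shell
# Hypothetical helper: report how many arguments the shell passed in.
count_args() { echo $#; }

unquoted=$(count_args TB Savage datafile)    # pattern, file, file
quoted=$(count_args 'TB Savage' datafile)    # pattern, file

# The same rule applies when the pattern is stored in a variable.
name='TB Savage'
var_quoted=$(count_args "$name" datafile)    # stays one argument
var_split=$(count_args $name datafile)       # split at the space on purpose

echo "unquoted=$unquoted quoted=$quoted var_quoted=$var_quoted var_split=$var_split"

# Quoting the variable keeps the whole name together as the pattern.
found=$(printf 'eastern\tEA\tTB Savage\n' | grep "$name")
echo "$found"
```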

Example 4.11.

% grep '5\..' datafile

western               WE        Sharon Gray           5.3   .97   5    23

southern              SO        Suan Chin             5.1   .95   4    15

northeast             NE        AM Main Jr.           5.1   .94   3    13

central               CT        Ann Stephens          5.7   .94   5    13


EXPLANATION

Prints all lines containing the number 5, followed by a literal period and any single character. The dot metacharacter matches any single character unless it is escaped with a backslash. When escaped, the dot is no longer a metacharacter but represents itself, a literal period.
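
To see what the backslash changes, the escaped and unescaped forms can be compared on the same file. The sketch below rebuilds the datafile (the row helper and tab-separated fields are assumptions) and counts matching lines for each form, plus the -F option, which treats the whole pattern as a fixed string.

```shell
# Recreate the chapter's datafile; tab-separated fields are an assumption.
cd "$(mktemp -d)" || exit 1
row() { printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' "$@"; }
{
  row northwest NW 'Charles Main'      3.0 .98 3 34
  row western   WE 'Sharon Gray'       5.3 .97 5 23
  row southwest SW 'Lewis Dalsass'     2.7 .8  2 18
  row southern  SO 'Suan Chin'         5.1 .95 4 15
  row southeast SE 'Patricia Hemenway' 4.0 .7  4 17
  row eastern   EA 'TB Savage'         4.4 .84 5 20
  row northeast NE 'AM Main Jr.'       5.1 .94 3 13
  row north     NO 'Margot Weber'      4.5 .89 5 9
  row central   CT 'Ann Stephens'      5.7 .94 5 13
} > datafile

# Unescaped: 5 followed by ANY two characters (a tab and a digit qualify).
dot_any=$(grep -c '5..' datafile)

# Escaped: 5, a literal period, then any one character.
dot_literal=$(grep -c '5\..' datafile)

# -F disables metacharacters entirely, so "5." means 5 and a period.
fixed=$(grep -c -F '5.' datafile)

echo "5.. matches $dot_any lines; 5\\.. matches $dot_literal; -F 5. matches $fixed"
```

The unescaped form also picks up the eastern and north lines, where a lone 5 is followed by a separator and the next field.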

% cat datafile

northwest             NW        Charles Main           3.0   .98   3       34

western               WE        Sharon Gray            5.3   .97   5       23

southwest             SW        Lewis Dalsass          2.7   .8    2       18

southern              SO        Suan Chin              5.1   .95   4       15

southeast             SE        Patricia Hemenway      4.0   .7    4       17

eastern               EA        TB Savage              4.4   .84   5       20

northeast             NE        AM Main Jr.            5.1   .94   3       13

north                 NO        Margot Weber           4.5   .89   5       9

central               CT        Ann Stephens           5.7   .94   5       13


Example 4.12.

% grep '\.5' datafile

north                 NO        Margot Weber           4.5   .89    5   9


EXPLANATION

Prints any line containing a literal period followed by a 5. The backslash removes the special meaning of the dot.

Example 4.13.

% grep '^[we]' datafile

western                WE        Sharon Gray             5.3   .97    5   23

eastern                EA        TB Savage               4.4   .84    5   20


EXPLANATION

Prints lines beginning with either a w or an e. The caret (^) is the beginning-of-line anchor, and either one of the characters in the brackets will be matched.

Example 4.14.

% grep '[^0-9]' datafile

northwest              NW      Charles Main          3.0    .98    3      34

western                WE      Sharon Gray           5.3    .97    5      23

southwest              SW      Lewis Dalsass         2.7    .8     2      18

southern               SO      Suan Chin             5.1    .95    4      15

southeast              SE      Patricia Hemenway     4.0    .7     4      17

eastern                EA      TB Savage             4.4    .84    5      20

northeast              NE      AM Main Jr.           5.1    .94    3      13

north                  NO      Margot Weber          4.5    .89    5      9

central                CT      Ann Stephens          5.7    .94    5      13


EXPLANATION

Prints all lines containing at least one nondigit. Because every line has at least one nondigit, all lines are printed. (To select lines on some other basis, such as lines that do not match a pattern, see the -v option in Table 4.2 on page 84.)
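
A negated bracket expression and the -v option are easy to confuse. The following sketch uses a three-line sample (made up for this illustration, not taken from the datafile) to show that they answer different questions.

```shell
# Three sample lines: all digits, no digits, and a mixture.
sample=$(printf '%s\n' '123' 'abc' 'a1')

# [^0-9] asks: does the line contain at least one character that is not a digit?
has_nondigit=$(printf '%s\n' "$sample" | grep '[^0-9]')

# -v with [0-9] asks: does the line contain no digit at all?
no_digit=$(printf '%s\n' "$sample" | grep -v '[0-9]')

echo "contains a nondigit:"; echo "$has_nondigit"
echo "contains no digit:";   echo "$no_digit"
```

The mixed line a1 appears in the first result but not in the second.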

Example 4.15.

% grep '[A-Z][A-Z] [A-Z]' datafile

eastern               EA        TB Savage               4.4   .84     5    20

northeast             NE        AM Main Jr.             5.1   .94     3    13


EXPLANATION

Prints all lines containing two capital letters followed by a space and a capital letter; e.g., TB Savage and AM Main.

Example 4.16.

% grep 'ss* ' datafile

northwest             NW       Charles Main            3.0   .98     3     34

southwest             SW       Lewis Dalsass           2.7   .8      2     18


EXPLANATION

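Prints all lines containing an s followed by zero or more consecutive occurrences of the letter s and a space; in other words, one or more s characters followed by a space. Finds, for example, the s at the end of Charles and of Lewis, each of which is followed by a space.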

Example 4.17.

% grep '[a-z]\{9\}' datafile

northwest             NW       Charles Main              3.0   .98    3     34

southwest             SW       Lewis Dalsass             2.7   .8     2     18

southeast             SE       Patricia Hemenway         4.0   .7     4     17

northeast             NE       AM Main Jr.               5.1   .94    3     13


EXPLANATION

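Prints all lines containing at least nine consecutive lowercase letters, for example, northwest, southwest, southeast, and northeast. The \{9\} matches exactly nine repetitions of the preceding expression, but because the pattern is not anchored, a longer run of lowercase letters also contains a match.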
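
The difference between an unanchored and an anchored repetition count can be sketched on a short word list (the three words are sample input chosen for this illustration).

```shell
# Sample words of 5, 9, and 12 lowercase letters.
words=$(printf '%s\n' 'north' 'northwest' 'northeastern')

# Unanchored: any line that CONTAINS nine consecutive lowercase letters.
unanchored=$(printf '%s\n' "$words" | grep '[a-z]\{9\}')

# Anchored at both ends: the whole line must be exactly nine letters.
exact=$(printf '%s\n' "$words" | grep '^[a-z]\{9\}$')

# \{m,n\} gives a range: between five and nine letters, inclusive.
range=$(printf '%s\n' "$words" | grep '^[a-z]\{5,9\}$')

echo "unanchored:"; echo "$unanchored"
echo "exact:";      echo "$exact"
echo "range:";      echo "$range"
```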

Example 4.18.

% grep '\(3\)\.[0-9].*\1    *\1' datafile

northwest             NW       Charles Main            3.0   .98     3     34


EXPLANATION

Prints the line if it contains a 3 followed by a period and a digit, followed by any number of characters (.*), another 3 (originally tagged), any number of tabs (the whitespace before the * in the pattern stands for a tab), and another 3. Because the 3 was enclosed in escaped parentheses, \(3\), it can later be referenced with \1. The \1 means that this was the first expression to be tagged with the \( \) pair.
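
Tagging is easier to follow on a smaller case. The sketch below uses four surnames from the datafile as sample input and tags a single character, so that \1 must repeat whatever that character turned out to be.

```shell
# Four surnames from the datafile, one per line.
names=$(printf '%s\n' 'Dalsass' 'Savage' 'Hemenway' 'Weber')

# \([a-z]\)\1 : any lowercase letter immediately followed by the same letter.
doubled=$(printf '%s\n' "$names" | grep '\([a-z]\)\1')

# \(e\).\1 : an e, any one character, then the tagged e again.
e_any_e=$(printf '%s\n' "$names" | grep '\(e\).\1')

echo "doubled letter:"; echo "$doubled"
echo "e?e:";            echo "$e_any_e"
```

In the first pattern \1 does not mean "any lowercase letter again"; it means the exact text that the tagged expression matched.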

% cat datafile

northwest             NW        Charles Main           3.0   .98   3       34

western               WE        Sharon Gray            5.3   .97   5       23

southwest             SW        Lewis Dalsass          2.7   .8    2       18

southern              SO        Suan Chin              5.1   .95   4       15

southeast             SE        Patricia Hemenway      4.0   .7    4       17

eastern               EA        TB Savage              4.4   .84   5       20

northeast             NE        AM Main Jr.            5.1   .94   3       13

north                 NO        Margot Weber           4.5   .89   5       9

central               CT        Ann Stephens           5.7   .94   5       13


Example 4.19.

% grep '\<north' datafile

northwest           NW       Charles Main            3.0   .98     3    34

northeast           NE       AM Main Jr.             5.1   .94     3    13

north               NO       Margot Weber            4.5   .89     5     9


EXPLANATION

Prints all lines containing a word starting with north. The \< is the beginning-of-word anchor.

Example 4.20.

% grep '\<north\>' datafile

north                NO        Margot Weber           4.5   .89     5     9


EXPLANATION

Prints the line if it contains the word north. The \< is the beginning-of-word anchor, and the \> is the end-of-word anchor.
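
The two word anchors can be compared side by side. This is a sketch on three sample lines; it assumes a grep that supports \< and \> as well as the -w option (GNU and BSD grep do; neither feature is required by POSIX).

```shell
# Sample input: north alone, as a prefix, and as a word inside a sentence.
lines=$(printf '%s\n' 'north' 'northwest' 'up north today')

# \<north : north at the START of a word; northwest still qualifies.
begins=$(printf '%s\n' "$lines" | grep -c '\<north')

# \<north\> : north as a complete word.
whole=$(printf '%s\n' "$lines" | grep -c '\<north\>')

# -w asks for whole-word matches without writing the anchors by hand.
with_w=$(printf '%s\n' "$lines" | grep -c -w 'north')

echo "begins=$begins whole=$whole with_w=$with_w"
```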

Example 4.21.

% grep '\<[a-z].*n\>' datafile

northwest             NW        Charles Main           3.0   .98     3    34

western               WE        Sharon Gray            5.3   .97     5    23

southern              SO        Suan Chin              5.1   .95     4    15

eastern               EA        TB Savage              4.4   .84     5    20

northeast             NE        AM Main Jr.            5.1   .94     3    13

central               CT        Ann Stephens           5.7   .94     5    13


EXPLANATION

Prints all lines containing a word starting with a lowercase letter, followed by any number of characters, and ending with an n at the end of a word. Watch the .* symbol. It matches zero or more of any character, including whitespace, so the match can begin in one word and end in another.
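
How much the .* contributes becomes visible when it is replaced by an expression that cannot cross whitespace. The sketch below rebuilds the datafile (the row helper and tab-separated fields are assumptions) and counts matching lines for both patterns; it assumes a grep that supports the \< and \> anchors.

```shell
# Recreate the chapter's datafile; tab-separated fields are an assumption.
cd "$(mktemp -d)" || exit 1
row() { printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' "$@"; }
{
  row northwest NW 'Charles Main'      3.0 .98 3 34
  row western   WE 'Sharon Gray'       5.3 .97 5 23
  row southwest SW 'Lewis Dalsass'     2.7 .8  2 18
  row southern  SO 'Suan Chin'         5.1 .95 4 15
  row southeast SE 'Patricia Hemenway' 4.0 .7  4 17
  row eastern   EA 'TB Savage'         4.4 .84 5 20
  row northeast NE 'AM Main Jr.'       5.1 .94 3 13
  row north     NO 'Margot Weber'      4.5 .89 5 9
  row central   CT 'Ann Stephens'      5.7 .94 5 13
} > datafile

# .* may run across several words: northwest ... Main counts as a match.
spanning=$(grep -c '\<[a-z].*n\>' datafile)

# [a-z]* cannot cross whitespace, so one all-lowercase word must end in n.
single_word=$(grep -c '\<[a-z][a-z]*n\>' datafile)

echo "spanning=$spanning single_word=$single_word"
```

Only western, southern, and eastern survive the stricter pattern; the other three lines matched earlier only because .* reached across to Main or Ann.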
