
6.10. Review

The examples in this section use the following sample database, called datafile, repeated periodically for your convenience. In the database, the input field separator, FS, is whitespace, the default. The number of fields, NF, is 8. The number may vary from line to line, but in this file it is fixed. The record separator, RS, is the newline, so each line of the file is one record. Awk keeps track of the number of the current record in the NR variable. The output field separator, OFS, is a space. When a comma separates the items in a print statement, the items are separated by the value of OFS, a space, when the line is printed.

% cat datafile

northwest              NW       Joel Craig             3.0   .98   3    4

western                WE       Sharon Kelly           5.3   .97   5    23

southwest              SW       Chris Foster           2.7   .8    2    18

southern               SO       May Chin               5.1   .95   4    15

southeast              SE       Derek Johnson          4.0   .7    4    17

eastern                EA       Susan Beal             4.4   .84   5    20

northeast              NE       TJ Nichols             5.1   .94   3    13

north                  NO       Val Shultz             4.5   .89   5    9

central                CT       Sheri Watson           5.7   .94   5    13
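
As a quick check of the built-in variables described above, the following sketch prints NR, NF, and the first field of each record. It assumes a POSIX-compatible awk installed under the name awk (substitute nawk where that is the name on your system) and recreates only the first three records of datafile in a temporary file. The shell variable names datafile and result are illustrative.

```shell
# Recreate the first three records of datafile in a temporary file (illustrative subset).
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
northwest   NW   Joel Craig     3.0  .98  3   4
western     WE   Sharon Kelly   5.3  .97  5  23
southwest   SW   Chris Foster   2.7  .8   2  18
EOF

# NR is the number of the current record; NF is the number of fields in it.
# Each comma in the print statement is replaced by OFS, a single space by default.
result=$(awk '{print NR, NF, $1}' "$datafile")
printf '%s\n' "$result"

rm -f "$datafile"
```

Each output line shows the record number, the field count (8 for every record here), and the region name.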


6.10.1 Simple Pattern Matching

Example 6.30.

nawk '/west/'  datafile

northwest             NW        Joel Craig            3.0   .98   3     4

western               WE        Sharon Kelly          5.3   .97   5    23

southwest             SW        Chris Foster          2.7   .8    2    18


EXPLANATION

All lines containing the pattern west are printed.

Example 6.31.

nawk '/^north/' datafile

northwest              NW      Joel Craig              3.0   .98   3     4

northeast              NE      TJ Nichols              5.1   .94   3     13

north                  NO      Val Shultz              4.5   .89   5     9


EXPLANATION

All lines beginning with the pattern north are printed.

Example 6.32.

nawk '/^(no|so)/' datafile

northwest             NW      Joel Craig            3.0    .98    3     4

southwest             SW      Chris Foster          2.7    .8     2     18

southern              SO      May Chin              5.1    .95    4     15

southeast             SE      Derek Johnson         4.0    .7     4     17

northeast             NE      TJ Nichols            5.1    .94    3     13

north                 NO      Val Shultz            4.5    .89    5     9


EXPLANATION

All lines beginning with the pattern no or so are printed.

6.10.2 Simple Actions

Example 6.33.

nawk '{print $3, $2}' datafile

Joel NW

Sharon WE

Chris SW

May SO

Derek SE

Susan EA

TJ NE

Val NO

Sheri CT


EXPLANATION

The output field separator, OFS, is a space by default. The comma between $3 and $2 is translated to the value of the OFS. The third field is printed, followed by a space and the second field.
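
To see that the comma stands for the value of OFS rather than for a literal space, OFS can be given a visible value. This is a minimal sketch, assuming a POSIX-compatible awk named awk that supports the -v assignment option (nawk does; old awk may not). The colon and the two-record excerpt are arbitrary choices for illustration.

```shell
# Two records from datafile, written to a temporary file (illustrative subset).
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
northwest   NW   Joel Craig     3.0  .98  3   4
western     WE   Sharon Kelly   5.3  .97  5  23
EOF

# -v OFS=":" sets the output field separator before any input is read,
# so the comma between $3 and $2 now expands to a colon instead of a space.
result=$(awk -v OFS=":" '{print $3, $2}' "$datafile")
printf '%s\n' "$result"

rm -f "$datafile"
```

The two lines printed are Joel:NW and Sharon:WE.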

% cat datafile

northwest              NW       Joel Craig             3.0   .98   3    4

western                WE       Sharon Kelly           5.3   .97   5    23

southwest              SW       Chris Foster           2.7   .8    2    18

southern               SO       May Chin               5.1   .95   4    15

southeast              SE       Derek Johnson          4.0   .7    4    17

eastern                EA       Susan Beal             4.4   .84   5    20

northeast              NE       TJ Nichols             5.1   .94   3    13

north                  NO       Val Shultz             4.5   .89   5    9

central                CT       Sheri Watson           5.7   .94   5    13


Example 6.34.

nawk '{print $3 $2}' datafile

JoelNW

SharonWE

ChrisSW

MaySO

DerekSE

SusanEA

TJNE

ValNO

SheriCT


EXPLANATION

The third field is followed by the second field. Because no comma separates $3 and $2, the two values are concatenated and the output is displayed without a space between them.

Example 6.35.

nawk 'print $1' datafile

nawk: syntax error at source line 1

 context is

         >>> print <<<  $1

nawk: bailing out at source line 1


EXPLANATION

This is the nawk (new awk) error message. Nawk error messages are much more verbose than those of the old awk. In this program, the curly braces are missing in the action statement.

Example 6.36.

awk 'print $1' datafile

awk: syntax error near line 1

awk: bailing out near line 1


EXPLANATION

This is the awk (old awk) error message. Old awk programs were difficult to debug because almost all errors produced this same message. The curly braces are missing in the action statement.

Example 6.37.

nawk '{print $0}' datafile

northwest              NW       Joel Craig             3.0   .98   3    4

western                WE       Sharon Kelly           5.3   .97   5    23

southwest              SW       Chris Foster           2.7   .8    2    18

southern               SO       May Chin               5.1   .95   4    15

southeast              SE       Derek Johnson          4.0   .7    4    17

eastern                EA       Susan Beal             4.4   .84   5    20

northeast              NE       TJ Nichols             5.1   .94   3    13

north                  NO       Val Shultz             4.5   .89   5    9

central                CT       Sheri Watson           5.7   .94   5    13


EXPLANATION

Each record is printed. $0 holds the current record.

Example 6.38.

nawk '{print "Number of fields: "NF}' datafile

Number of fields: 8

Number of fields: 8

Number of fields: 8

Number of fields: 8

Number of fields: 8

Number of fields: 8

Number of fields: 8

Number of fields: 8

Number of fields: 8


EXPLANATION

There are 8 fields in each record. The built-in awk variable NF holds the number of fields and is reset for each record.

% cat datafile

northwest              NW       Joel Craig             3.0   .98   3    4

western                WE       Sharon Kelly           5.3   .97   5    23

southwest              SW       Chris Foster           2.7   .8    2    18

southern               SO       May Chin               5.1   .95   4    15

southeast              SE       Derek Johnson          4.0   .7    4    17

eastern                EA       Susan Beal             4.4   .84   5    20

northeast              NE       TJ Nichols             5.1   .94   3    13

north                  NO       Val Shultz             4.5   .89   5    9

central                CT       Sheri Watson           5.7   .94   5    13


6.10.3 Regular Expressions in Pattern and Action Combinations

Example 6.39.

nawk '/northeast/{print $3, $2}' datafile

TJ NE


EXPLANATION

If the record contains (or matches) the pattern northeast, the third field, followed by the second field, is printed.

Example 6.40.

nawk '/E/' datafile

western              WE       Sharon Kelly           5.3   .97   5    23

southeast            SE       Derek Johnson          4.0   .7    4    17

eastern              EA       Susan Beal             4.4   .84   5    20

northeast            NE       TJ Nichols             5.1   .94   3    13


EXPLANATION

If the record contains an uppercase E, the entire record is printed. The match is case sensitive, so a lowercase e alone does not select a record.

Example 6.41.

nawk '/^[ns]/{print $1}' datafile

northwest

southwest

southern

southeast

northeast

north


EXPLANATION

If the record begins with an n or s, the first field is printed.

Example 6.42.

nawk '$5 ~ /\.[7-9]+/' datafile

southwest             SW        Chris Foster          2.7   .8    2    18

central               CT        Sheri Watson          5.7   .94   5    13


EXPLANATION

If the fifth field ($5) contains a literal period followed by one or more digits in the range 7 through 9, the record is printed.
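
The restriction to $5 matters here. The sketch below, which assumes a POSIX-compatible awk named awk and recreates datafile in a temporary file, runs the same regular expression once against the fifth field and once against the whole record. The variable names field_match and record_match are illustrative.

```shell
# Recreate datafile in a temporary file.
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
northwest   NW   Joel Craig      3.0  .98  3   4
western     WE   Sharon Kelly    5.3  .97  5  23
southwest   SW   Chris Foster    2.7  .8   2  18
southern    SO   May Chin        5.1  .95  4  15
southeast   SE   Derek Johnson   4.0  .7   4  17
eastern     EA   Susan Beal      4.4  .84  5  20
northeast   NE   TJ Nichols      5.1  .94  3  13
north       NO   Val Shultz      4.5  .89  5   9
central     CT   Sheri Watson    5.7  .94  5  13
EOF

# Match restricted to the fifth field: only 2.7 and 5.7 qualify.
field_match=$(awk '$5 ~ /\.[7-9]+/ {print $1}' "$datafile")

# Same regular expression tested against the whole record ($0):
# every record matches, because each sixth field begins with .7, .8, or .9.
record_match=$(awk '/\.[7-9]+/ {print $1}' "$datafile")

printf 'field match:\n%s\n\nrecord match:\n%s\n' "$field_match" "$record_match"

rm -f "$datafile"
```

The field match lists southwest and central; the record match lists all nine regions.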

Example 6.43.

nawk '$2 !~ /E/{print $1, $2}' datafile

northwest NW

southwest SW

southern SO

north NO

central CT


EXPLANATION

If the second field does not contain the pattern E, the first field followed by the second field ($1, $2) is printed.

Example 6.44.

nawk '$3 ~ /^Joel/{print $3 " is a nice guy."}' datafile

Joel is a nice guy.


EXPLANATION

If the third field ($3) begins with the pattern Joel, the third field is printed, followed by the string " is a nice guy." The string begins with a space because awk concatenates the two values exactly as written; without it, the output would read Joelis a nice guy.

Example 6.45.

nawk '$8 ~ /[0-9][0-9]$/{print $8}' datafile

23

18

15

17

20

13

13


EXPLANATION

If the eighth field ($8) ends in two digits, it is printed.

% cat datafile

northwest              NW       Joel Craig             3.0   .98   3    4

western                WE       Sharon Kelly           5.3   .97   5    23

southwest              SW       Chris Foster           2.7   .8    2    18

southern               SO       May Chin               5.1   .95   4    15

southeast              SE       Derek Johnson          4.0   .7    4    17

eastern                EA       Susan Beal             4.4   .84   5    20

northeast              NE       TJ Nichols             5.1   .94   3    13

north                  NO       Val Shultz             4.5   .89   5    9

central                CT       Sheri Watson           5.7   .94   5    13


Example 6.46.

nawk '$4 ~ /Chin$/{print "The price is $" $8 "."}' datafile

The price is $15.


EXPLANATION

If the fourth field ($4) ends with Chin, the string enclosed in double quotes ("The price is $"), the eighth field ($8), and the string containing a period are printed.

Example 6.47.

nawk '/TJ/{print $0}' datafile

northeast             NE         TJ Nichols            5.1    .94     3    13


EXPLANATION

If the record contains the pattern TJ, $0 (the record) is printed.

6.10.4 Input Field Separators

Use the following datafile2 for Examples 6.48 through 6.52.

% cat datafile2

Joel Craig:northwest:NW:3.0:.98:3:4

Sharon Kelly:western:WE:5.3:.97:5:23

Chris Foster:southwest:SW:2.7:.8:2:18

May Chin:southern:SO:5.1:.95:4:15

Derek Johnson:southeast:SE:4.0:.7:4:17

Susan Beal:eastern:EA:4.4:.84:5:20

TJ Nichols:northeast:NE:5.1:.94:3:13

Val Shultz:north:NO:4.5:.89:5:9

Sheri Watson:central:CT:5.7:.94:5:13


Example 6.48.

nawk '{print $1}' datafile2

Joel

Sharon

Chris

May

Derek

Susan

TJ

Val

Sheri


EXPLANATION

The default input field separator is whitespace. The first field ($1) is printed.

Example 6.49.

nawk -F: '{print $1}' datafile2

Joel Craig

Sharon Kelly

Chris Foster

    <more output here>

Val Shultz

Sheri Watson


EXPLANATION

The -F option specifies the colon as the input field separator. The first field ($1) is printed.

Example 6.50.

nawk '{print "Number of fields: "NF}' datafile2

Number of fields: 2

Number of fields: 2

Number of fields: 2

    <more of the same output here>

Number of fields: 2

Number of fields: 2


EXPLANATION

Because the field separator is the default (whitespace), the number of fields for each record is 2. The only space is between the first and last name.

Example 6.51.

nawk -F: '{print "Number of fields: "NF}' datafile2

Number of fields: 7

Number of fields: 7

Number of fields: 7

    <more of the same output here>

Number of fields: 7

Number of fields: 7


EXPLANATION

Because the field separator is a colon, the number of fields in each record is 7.

Example 6.52.

nawk -F"[ :]" '{print $1, $2}' datafile2

Joel Craig

Sharon Kelly

Chris Foster

May Chin

Derek Johnson

Susan Beal

TJ Nichols

Val Shultz

Sheri Watson


EXPLANATION

Multiple field separators can be specified with nawk as a regular expression. Either a space or a colon will be designated as a field separator. The first and second fields ($1, $2), which are now the first and last names, are printed. (The square brackets must be quoted to prevent the shell from trying to interpret them as shell metacharacters.)
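
The effect of the bracket expression on field numbering can be checked by printing NF next to the first three fields. This is a minimal sketch, assuming a POSIX-compatible awk named awk and a two-record excerpt of datafile2. The " | " delimiters are only there to make the field boundaries visible.

```shell
# Two records from datafile2, written to a temporary file (illustrative subset).
datafile2=$(mktemp)
cat > "$datafile2" <<'EOF'
Joel Craig:northwest:NW:3.0:.98:3:4
Sharon Kelly:western:WE:5.3:.97:5:23
EOF

# With FS set to [ :], the space and every colon split the record, so the
# first and last names become $1 and $2 and the region moves to $3.
result=$(awk -F'[ :]' '{print NF ": " $1 " | " $2 " | " $3}' "$datafile2")
printf '%s\n' "$result"

rm -f "$datafile2"
```

Each record now has 8 fields instead of 7, and the output lines read 8: Joel | Craig | northwest and 8: Sharon | Kelly | western.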

6.10.5 awk Scripting

The following datafile is used for the next example.

% cat datafile

northwest              NW       Joel Craig             3.0   .98   3    4

western                WE       Sharon Kelly           5.3   .97   5    23

southwest              SW       Chris Foster           2.7   .8    2    18

southern               SO       May Chin               5.1   .95   4    15

southeast              SE       Derek Johnson          4.0   .7    4    17

eastern                EA       Susan Beal             4.4   .84   5    20

northeast              NE       TJ Nichols             5.1   .94   3    13

north                  NO       Val Shultz             4.5   .89   5    9

central                CT       Sheri Watson           5.7   .94   5    13


Example 6.53.

cat nawk.sc1

# This is a comment

# This is my first nawk script

1    /^north/{print $1, $2, $3}

2    /^south/{print "The " $1 " district."}



3    nawk -f nawk.sc1 datafile

     northwest NW Joel

     The southwest district.

     The southern district.

     The southeast district.

     northeast NE TJ

     north NO Val


EXPLANATION

  1. If the record begins with the pattern north, the first, second, and third fields ($1, $2, $3) are printed.

  2. If the record begins with the pattern south, the string "The ", the value of the first field ($1), and the string " district." are printed.

  3. The -f option precedes the name of the nawk script file, followed by the input file that is to be processed.
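
The script-file workflow above can be reproduced end to end as follows. This is a sketch under stated assumptions: a POSIX-compatible awk named awk, a scratch directory created with mktemp -d, and a three-record excerpt of datafile. The callout numbers 1 to 3 from the listing are not part of the script and are left out.

```shell
# Scratch directory for the script and the data (illustrative location).
workdir=$(mktemp -d)

# The script file: one pattern/action pair per line, comments start with #.
cat > "$workdir/nawk.sc1" <<'EOF'
# This is a comment
# This is my first nawk script
/^north/{print $1, $2, $3}
/^south/{print "The " $1 " district."}
EOF

# Three records from datafile: one per pattern, plus one that matches neither.
cat > "$workdir/datafile" <<'EOF'
northwest   NW   Joel Craig     3.0  .98  3   4
southwest   SW   Chris Foster   2.7  .8   2  18
central     CT   Sheri Watson   5.7  .94  5  13
EOF

# -f names the script file; the input file follows it.
result=$(awk -f "$workdir/nawk.sc1" "$workdir/datafile")
printf '%s\n' "$result"

rm -rf "$workdir"
```

The central record matches neither pattern, so only two lines are printed: northwest NW Joel and The southwest district.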
