
6.16. Review

The examples in this section, unless noted otherwise, use the following datafile, repeated periodically for your convenience.

% cat datafile

northwest           NW        Joel Craig           3.0    .98    3    4

western             WE        Sharon Kelly         5.3    .97    5    23

southwest           SW        Chris Foster         2.7    .8     2    18

southern            SO        May Chin             5.1    .95    4    15

southeast           SE        Derek Johnson        4.0    .7     4    17

eastern             EA        Susan Beal           4.4    .84    5    20

northeast           NE        TJ Nichols           5.1    .94    3    13

north               NO        Val Shultz           4.5    .89    5    9

central             CT        Sheri Watson         5.7    .94    5    13


6.16.1 Increment and Decrement Operators

Example 6.105.

% nawk '/^north/{count += 1; print count}' datafile

1

2

3


EXPLANATION

If the record begins with the regular expression north, a user-defined variable, count, is created; count is incremented by 1 and its value is printed.

Example 6.106.

% nawk '/^north/{count++; print count}' datafile

1

2

3


EXPLANATION

The auto-increment operator increments the user-defined variable count by 1. The value of count is printed.

Example 6.107.

% nawk '{x = $7--; print "x = "x ", $7 = "$7}' datafile

x = 3, $7 = 2

x = 5, $7 = 4

x = 2, $7 = 1

x = 4, $7 = 3

x = 4, $7 = 3

x = 5, $7 = 4

x = 3, $7 = 2

x = 5, $7 = 4

x = 5, $7 = 4


EXPLANATION

After the value of the seventh field ($7) is assigned to the user-defined variable x, the auto-decrement operator decrements the seventh field by 1. The value of x and the seventh field are printed.
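
The difference between the postfix and prefix forms of the decrement operator can be sketched as follows. This is a minimal illustration, assuming a POSIX awk is installed as awk (the text uses nawk); the shell variables line, post, and pre are hypothetical names, and the single record is copied from the first line of datafile.

```shell
# One record shaped like the first line of datafile (whitespace-separated fields).
line='northwest NW Joel Craig 3.0 .98 3 4'

# Postfix: x receives the old value of $7, then $7 is decremented.
post=$(printf '%s\n' "$line" | awk '{x = $7--; print "x = " x ", $7 = " $7}')

# Prefix: $7 is decremented first, so x receives the new value.
pre=$(printf '%s\n' "$line" | awk '{x = --$7; print "x = " x ", $7 = " $7}')

echo "$post"   # x = 3, $7 = 2
echo "$pre"    # x = 2, $7 = 2
```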

6.16.2 Built-In Variables

Example 6.108.

% nawk '/^north/{print "The record number is " NR}' datafile

The record number is 1

The record number is 7

The record number is 8


EXPLANATION

If the record begins with the regular expression north, the string The record number is and the value of NR (record number) are printed.

Example 6.109.

% nawk '{print NR, $0}' datafile

1 northwest           NW        Joel Craig           3.0    .98    3    4

2 western             WE        Sharon Kelly         5.3    .97    5    23

3 southwest           SW        Chris Foster         2.7    .8     2    18

4 southern            SO        May Chin             5.1    .95    4    15

5 southeast           SE        Derek Johnson        4.0    .7     4    17

6 eastern             EA        Susan Beal           4.4    .84    5    20

7 northeast           NE        TJ Nichols           5.1    .94    3    13

8 north               NO        Val Shultz           4.5    .89    5    9

9 central             CT        Sheri Watson         5.7    .94    5    13


EXPLANATION

The value of NR, the number of the current record, and the value of $0, the entire record, are printed.

% cat datafile

northwest           NW        Joel Craig           3.0    .98    3    4

western             WE        Sharon Kelly         5.3    .97    5    23

southwest           SW        Chris Foster         2.7    .8     2    18

southern            SO        May Chin             5.1    .95    4    15

southeast           SE        Derek Johnson        4.0    .7     4    17

eastern             EA        Susan Beal           4.4    .84    5    20

northeast           NE        TJ Nichols           5.1    .94    3    13

north               NO        Val Shultz           4.5    .89    5    9

central             CT        Sheri Watson         5.7    .94    5    13


Example 6.110.

% nawk 'NR==2,NR==5{print NR, $0}' datafile

2 western             WE        Sharon Kelly           5.3    .97    5    23

3 southwest           SW        Chris Foster           2.7     .8    2    18

4 southern            SO        May Chin               5.1    .95    4    15

5 southeast           SE        Derek Johnson          4.0     .7    4    17


EXPLANATION

If the value of NR is in the range between 2 and 5 (record numbers 2–5), the number of the record (NR) and the record ($0) are printed.
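
A range pattern switches on at the record where the first expression is true and switches off after the record where the second is true, so both end records are included. The sketch below assumes a POSIX awk installed as awk and uses six made-up one-word records rather than datafile; the variable name range is hypothetical.

```shell
# Six one-word records; awk numbers them 1 through 6 in NR as it reads them.
range=$(printf '%s\n' a b c d e f | awk 'NR==2,NR==4 {print NR, $0}')

# Records 2, 3, and 4 are printed; both ends of the range are included.
echo "$range"
```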

Example 6.111.

% nawk '/^north/{print NR, $1, $2, $NF, RS}' datafile

1 northwest NW 4



7 northeast NE 13



8 north NO 9


EXPLANATION

If the record begins with the regular expression north, the number of the record (NR), followed by the first field, the second field, the value of the last field (NF preceded by a dollar sign), and the value of RS (a newline) are printed. Because the print function generates a newline by default, RS will generate another newline, resulting in double spacing between records.

Use the following datafile2 for Examples 6.112 and 6.113.

% cat datafile2

Joel Craig:northwest:NW:3.0:.98:3:4

Sharon Kelly:western:WE:5.3:.97:5:23

Chris Foster:southwest:SW:2.7:.8:2:18

May Chin:southern:SO:5.1:.95:4:15

Derek Johnson:southeast:SE:4.0:.7:4:17

Susan Beal:eastern:EA:4.4:.84:5:20

TJ Nichols:northeast:NE:5.1:.94:3:13

Val Shultz:north:NO:4.5:.89:5:9

Sheri Watson:central:CT:5.7:.94:5:13


Example 6.112.

% nawk -F: 'NR == 5{print NF}' datafile2

7


EXPLANATION

The field separator is set to a colon at the command line with the -F option. If the number of the record (NR) is 5, the number of fields (NF) is printed.

Example 6.113.

% nawk 'BEGIN{OFMT="%.2f";print 1.2456789,12E-2}' datafile2

1.25 0.12


EXPLANATION

OFMT, the output format variable for the print function, is set so that floating-point numbers are printed with two digits to the right of the decimal point. The numbers 1.2456789 and 12E-2 are printed in the new format.
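
As a rough illustration of where OFMT applies, the sketch below prints a whole number alongside the two values from the example and then prints one value with printf. It assumes a POSIX awk installed as awk; the variable name ofmt is hypothetical.

```shell
# OFMT is used when print outputs a number that is not a whole number.
# Whole numbers are printed as integers, and printf uses only its own format string.
ofmt=$(awk 'BEGIN{
    OFMT = "%.2f"
    print 1.2456789, 12E-2, 17
    printf "%.4f\n", 1.2456789
}')

echo "$ofmt"   # first line: 1.25 0.12 17   second line: 1.2457
```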

% cat datafile

northwest           NW        Joel Craig           3.0    .98    3    4

western             WE        Sharon Kelly         5.3    .97    5    23

southwest           SW        Chris Foster         2.7    .8     2    18

southern            SO        May Chin             5.1    .95    4    15

southeast           SE        Derek Johnson        4.0    .7     4    17

eastern             EA        Susan Beal           4.4    .84    5    20

northeast           NE        TJ Nichols           5.1    .94    3    13

north               NO        Val Shultz           4.5    .89    5    9

central             CT        Sheri Watson         5.7    .94    5    13


Example 6.114.

% nawk '{$9 = $6 * $7; print $9}' datafile

2.94

4.85

1.6

3.8

2.8

4.2

2.82

4.45

4.7


EXPLANATION

The result of multiplying the sixth field ($6) and the seventh field ($7) is stored in a new field, $9, and printed. There were eight fields; now there are nine.

Example 6.115.

% nawk '{$10 = 100; print NF, $9, $0}' datafile

10  northwest        NW        Joel Craig            3.0    .98   3    4     100

10  western          WE        Sharon Kelly          5.3    .97   5    23    100

10  southwest        SW        Chris Foster          2.7    .8    2    18    100

10  southern         SO        May Chin              5.1    .95   4    15    100

10  southeast        SE        Derek Johnson         4.0    .7    4    17    100

10  eastern          EA        Susan Beal            4.4    .84   5    20    100

10  northeast        NE        TJ Nichols            5.1    .94   3    13    100

10  north            NO        Val Shultz            4.5    .89   5    9     100

10  central          CT        Sheri Watson          5.7    .94   5    13    100


EXPLANATION

A tenth field ($10) is assigned 100 for each record. This is a new field. The ninth field ($9) does not exist, so it will be considered a null field. The number of fields is printed (NF), followed by the value of $9, the null field, and the entire record ($0). The value of the tenth field is 100.
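
The effect of assigning to a field beyond NF can be seen more easily on a short record. The sketch below assumes a POSIX awk installed as awk; the three-field input and the variable names grown and dashed are made up for illustration.

```shell
# Assigning to $5 on a three-field record creates an empty $4 and a new $5,
# raises NF to 5, and rebuilds $0.
grown=$(echo 'a b c' | awk '{$5 = "e"; print NF; print $0}')

# The rebuilt record is joined with OFS, so a different OFS appears between fields.
dashed=$(echo 'a b c' | awk 'BEGIN{OFS="-"} {$5 = "e"; print $0}')

echo "$grown"    # 5, then: a b c  e   (the two spaces surround the empty $4)
echo "$dashed"   # a-b-c--e
```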

6.16.3 BEGIN Patterns

Example 6.116.

% nawk 'BEGIN{print "---------EMPLOYEES---------"}'

---------EMPLOYEES---------


EXPLANATION

The BEGIN pattern is followed by an action block. The action is to print out the string ---------EMPLOYEES--------- before opening the input file. Note that an input file has not been provided and awk does not complain because any action preceded by BEGIN occurs first, even before awk looks for an input file.

Example 6.117.

% nawk 'BEGIN{print "\t\t---------EMPLOYEES-------\n"}\

  {print $0}' datafile

                               ---------EMPLOYEES-------

northwest             NW       Joel Craig             3.0    .98   3    4

western               WE       Sharon Kelly           5.3    .97   5    23

southwest             SW       Chris Foster           2.7    .8    2    18

southern              SO       May Chin               5.1    .95   4    15

southeast             SE       Derek Johnson          4.0    .7    4    17

eastern               EA       Susan Beal             4.4    .84   5    20

northeast             NE       TJ Nichols             5.1    .94   3    13

north                 NO       Val Shultz             4.5    .89   5    9

central               CT       Sheri Watson           5.7    .94   5    13


EXPLANATION

The BEGIN action block is executed first. The title ---------EMPLOYEES------- is printed. The second action block prints each record in the input file. When breaking lines, the backslash is used to suppress the carriage return. Lines can be broken at a semicolon or a curly brace.

The following datafile2 is used for Example 6.118.

% cat datafile2

Joel Craig:northwest:NW:3.0:.98:3:4

Sharon Kelly:western:WE:5.3:.97:5:23

Chris Foster:southwest:SW:2.7:.8:2:18

May Chin:southern:SO:5.1:.95:4:15

Derek Johnson:southeast:SE:4.0:.7:4:17

Susan Beal:eastern:EA:4.4:.84:5:20

TJ Nichols:northeast:NE:5.1:.94:3:13

Val Shultz:north:NO:4.5:.89:5:9

Sheri Watson:central:CT:5.7:.94:5:13


Example 6.118.

% nawk 'BEGIN{ FS=":";OFS="\t"};/^Sharon/{print $1, $2, $7 }' datafile2

Sharon Kelly       western     23


EXPLANATION

The BEGIN action block is used to initialize variables. The FS variable (field separator) is assigned a colon. The OFS variable (output field separator) is assigned a tab (\t). After processing the BEGIN action block, awk opens datafile2 and starts reading input from the file. If a record begins with the regular expression Sharon, the first, second, and seventh fields ($1, $2, $7) are printed. Each field in the output is separated by a tab.
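
OFS is inserted only where print arguments are separated by commas; arguments written side by side are concatenated with nothing between them. The sketch below assumes a POSIX awk installed as awk, feeds it one record copied from datafile2, and uses the hypothetical variable names rec, with_comma, and no_comma.

```shell
rec='Sharon Kelly:western:WE:5.3:.97:5:23'

# Comma between arguments: the fields are joined with OFS (a tab here).
with_comma=$(printf '%s\n' "$rec" | awk 'BEGIN{FS=":"; OFS="\t"} {print $1, $2}')

# No comma: the two fields are concatenated and OFS is not used.
no_comma=$(printf '%s\n' "$rec" | awk 'BEGIN{FS=":"; OFS="\t"} {print $1 $2}')

echo "$with_comma"   # Sharon Kelly<TAB>western
echo "$no_comma"     # Sharon Kellywestern
```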

6.16.4 END Patterns

The following datafile is used for Examples 6.119 and 6.120.

% cat datafile

northwest           NW        Joel Craig           3.0    .98    3    4

western             WE        Sharon Kelly         5.3    .97    5    23

southwest           SW        Chris Foster         2.7    .8     2    18

southern            SO        May Chin             5.1    .95    4    15

southeast           SE        Derek Johnson        4.0    .7     4    17

eastern             EA        Susan Beal           4.4    .84    5    20

northeast           NE        TJ Nichols           5.1    .94    3    13

north               NO        Val Shultz           4.5    .89    5    9

central             CT        Sheri Watson         5.7    .94    5    13


Example 6.119.

% nawk 'END{print "The total number of records is " NR}' datafile

The total number of records is 9


EXPLANATION

After awk has finished processing the input file, the statements in the END block are executed. The string The total number of records is is printed, followed by the value of NR, the number of the last record.

Example 6.120.

% nawk '/^north/{count++}END{print count}' datafile

3


EXPLANATION

If the record begins with the regular expression north, the user-defined variable count is incremented by one. When awk has finished processing the input file, the value stored in the variable count is printed.
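
One detail worth sketching: if no record matches, count is never created, and printing it in the END block produces an empty string rather than 0. Adding 0 forces a numeric result. The example assumes a POSIX awk installed as awk; the three sample records and the variable names data, hits, empty, and zero are made up for illustration.

```shell
data=$(printf '%s\n' 'northwest NW' 'western WE' 'north NO')

# Two records begin with north, so count reaches 2.
hits=$(printf '%s\n' "$data" | awk '/^north/{count++} END{print count}')

# Nothing matches: count is uninitialized and prints as an empty string.
empty=$(printf '%s\n' "$data" | awk '/^zzz/{count++} END{print "[" count "]"}')

# Adding 0 makes awk treat the uninitialized variable as the number 0.
zero=$(printf '%s\n' "$data" | awk '/^zzz/{count++} END{print count + 0}')

echo "$hits $empty $zero"   # 2 [] 0
```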

6.16.5 awk Script with BEGIN and END

The following datafile2 is used for Example 6.121.

% cat datafile2

Joel Craig:northwest:NW:3.0:.98:3:4

Sharon Kelly:western:WE:5.3:.97:5:23

Chris Foster:southwest:SW:2.7:.8:2:18

May Chin:southern:SO:5.1:.95:4:15

Derek Johnson:southeast:SE:4.0:.7:4:17

Susan Beal:eastern:EA:4.4:.84:5:20

TJ Nichols:northeast:NE:5.1:.94:3:13

Val Shultz:north:NO:4.5:.89:5:9

Sheri Watson:central:CT:5.7:.94:5:13


Example 6.121.

    # Second awk script-- awk.sc2

1   BEGIN{ FS=":"

       print "  NAME\t\tDISTRICT\tQUANTITY"

       print "___________________________________________\n"

    }



2      {print $1"\t  " $3"\t\t" $7}

       {total+=$7}

       /north/{count++}



3   END{

       print "---------------------------------------------"

       print "The total quantity is " total

       print "The number of northern salespersons is " count "."

    }



(The Output)

4  % nawk -f awk.sc2 datafile2

     NAME DISTRICT  QUANTITY

   ___________________________________________

   Joel Craig       NW       4

   Sharon Kelly     WE       23

   Chris Foster     SW       18

   May Chin         SO       15

   Derek Johnson    SE       17

   Susan Beal       EA       20

   TJ Nichols       NE       13

   Val Shultz       NO       9

   Sheri Watson     CT       13

   ---------------------------------------------

   The total quantity is 132

   The number of northern salespersons is 3.


EXPLANATION

  1. The BEGIN block is executed first. The field separator (FS) is set. Header output is printed.

  2. The body of the awk script contains statements that are executed for each line of input coming from datafile2.

  3. Statements in the END block are executed after the input file has been closed, i.e., before awk exits.

  4. At the command line, the nawk program is executed. The -f option is followed by the script name, awk.sc2, and then by the input file, datafile2.

The remaining examples in this section use the following datafile, repeated periodically for your convenience.

% cat datafile

northwest           NW        Joel Craig           3.0    .98    3    4

western             WE        Sharon Kelly         5.3    .97    5    23

southwest           SW        Chris Foster         2.7    .8     2    18

southern            SO        May Chin             5.1    .95    4    15

southeast           SE        Derek Johnson        4.0    .7     4    17

eastern             EA        Susan Beal           4.4    .84    5    20

northeast           NE        TJ Nichols           5.1    .94    3    13

north               NO        Val Shultz           4.5    .89    5    9

central             CT        Sheri Watson         5.7    .94    5    13


6.16.6 The printf Function

Example 6.122.

% nawk '{printf "$%6.2f\n",$6 * 100}' datafile

$ 98.00

$ 97.00

$ 80.00

$ 95.00

$ 70.00

$ 84.00

$ 94.00

$ 89.00

$ 94.00


EXPLANATION

The printf function formats each value as a floating-point number, right-justified (the default) in a field six characters wide. That width includes the decimal point and the two digits to its right. The value is rounded to two decimal places before it is printed, and the dollar sign in the format string is printed literally.
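
The way the width and precision interact can be sketched with three values chosen for illustration. The example assumes a POSIX awk installed as awk; the square brackets are added only to make the padding visible, and the variable name fmt is hypothetical.

```shell
fmt=$(awk 'BEGIN{
    printf "[%6.2f]\n", 98        # padded on the left to six characters
    printf "[%6.2f]\n", 3.14159   # rounded to two decimal places
    printf "[%6.2f]\n", 1234.5    # wider than six: the field grows, nothing is cut
}')

echo "$fmt"
# [ 98.00]
# [  3.14]
# [1234.50]
```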

Example 6.123.

% nawk '{printf "|%-15s|\n",$4}' datafile

|Craig          |

|Kelly          |

|Foster         |

|Chin           |

|Johnson        |

|Beal           |

|Nichols        |

|Shultz         |

|Watson         |


EXPLANATION

A left-justified, 15-space string is printed. The fourth field ($4) is printed enclosed in vertical bars to illustrate the spacing.
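
For comparison, the sketch below prints the same string once left-justified (%-15s) and once right-justified (%15s). It assumes a POSIX awk installed as awk; the input word and the variable name just are chosen for illustration.

```shell
# The minus sign pads on the right; without it, the padding goes on the left.
just=$(echo 'Craig' | awk '{printf "|%-15s|%15s|\n", $1, $1}')

echo "$just"   # |Craig          |          Craig|
```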

6.16.7 Redirection and Pipes

Example 6.124.

% nawk '/north/{print $1, $3, $4 > "districts"}' datafile

% cat districts

northwest Joel Craig

northeast TJ Nichols

north Val Shultz


EXPLANATION

If the record contains the regular expression north, the first, third, and fourth fields ($1, $3, $4) are printed to an output file called districts. Once the file is opened, it remains open until closed or the program terminates. The filename "districts" must be enclosed in double quotes.

Example 6.125.

% nawk '/south/{print $1, $2, $3 >> "districts"}' datafile

% cat districts

southwest SW Chris

southern SO May

southeast SE Derek


EXPLANATION

If the record contains the pattern south, the first, second, and third fields ($1, $2, $3) are appended to the output file districts.
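
The difference between > and >> across separate awk runs can be sketched as follows. The example assumes a POSIX awk installed as awk and a mktemp command that supports -d; it works in a scratch directory, and the file name sample and the variables dir, both, and last are hypothetical.

```shell
dir=$(mktemp -d)                 # scratch directory so no real file is touched
cd "$dir" || exit 1
printf '%s\n' 'northwest NW' 'southwest SW' > sample

awk '/north/{print $1 > "districts"}' sample    # creates (or truncates) districts
awk '/south/{print $1 >> "districts"}' sample   # appends to what is already there
both=$(cat districts)

awk '/south/{print $1 > "districts"}' sample    # a new run with > truncates first
last=$(cat districts)

echo "$both"   # northwest, then southwest
echo "$last"   # southwest
cd / && rm -rf "$dir"
```

Within a single awk run, > truncates the file only when it is first opened; later print statements to the same file name add to it.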

% cat datafile

northwest           NW        Joel Craig           3.0    .98    3    4

western             WE        Sharon Kelly         5.3    .97    5    23

southwest           SW        Chris Foster         2.7    .8     2    18

southern            SO        May Chin             5.1    .95    4    15

southeast           SE        Derek Johnson        4.0    .7     4    17

eastern             EA        Susan Beal           4.4    .84    5    20

northeast           NE        TJ Nichols           5.1    .94    3    13

north               NO        Val Shultz           4.5    .89    5    9

central             CT        Sheri Watson         5.7    .94    5    13


6.16.8 Opening and Closing a Pipe

Example 6.126.

# awk script using pipes -- awk.sc3

1   BEGIN{

2       printf " %-22s%s\n", "NAME", "DISTRICT"

        print "--------------------------------------"

3   }

4   /west/{count++}

5   {printf "%s %s\t\t%-15s\n", $3, $4, $1| "sort +1" }

6   END{

7       close("sort +1")

        printf "The number of sales persons in the western "

        printf "region is " count "."    }



(The Output)

    % nawk -f awk.sc3 datafile

1   NAME                    DISTRICT

2   --------------------------------------------------

3   Susan Beal              eastern

    May Chin                southern

    Joel Craig              northwest

    Chris Foster            southwest

    Derek Johnson           southeast

    Sharon Kelly            western

    TJ Nichols              northeast

    Val Shultz              north

    Sheri Watson            central

    The number of sales persons in the western region is 3.


EXPLANATION

  1. The special BEGIN pattern is followed by an action block. The statements in this block are executed first, before awk processes the input file.

  2. The printf function displays the string NAME as a 22-character, left-justified string, followed by the string DISTRICT, which is printed as is because its %s conversion specifies no field width.

  3. The BEGIN block ends.

  4. Now awk will process the input file, one line at a time. If the pattern west is found, the action block is executed, i.e., the user-defined variable count is incremented by one. The first time awk encounters the count variable, it will be created and given an initial value of 0.

  5. The printf function formats and sends its output to a pipe. After all of the output has been collected, it will be sent to the sort command.

  6. The END block is started.

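  7. The pipe (sort +1) must be closed with exactly the same command string that opened it; in this example, sort +1. Closing the pipe makes awk wait until sort has finished and written its output. Otherwise, sort does not finish until awk exits, and the output of the END statements may appear before the sorted lines.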
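
The ordering effect of closing the pipe can be sketched with a smaller script. The example assumes a POSIX awk installed as awk and uses plain sort instead of sort +1, because the +1 key syntax is obsolete and may not be accepted by current sort implementations; the three input names come from datafile, and the variable name piped is hypothetical.

```shell
piped=$(printf '%s\n' 'Joel Craig' 'Susan Beal' 'May Chin' | awk '
    { print $2, $1 | "sort" }          # every record goes into one sort process
    END {
        close("sort")                  # wait for sort to finish and emit its output
        print "records sorted: " NR    # written only after the sorted lines
    }')

echo "$piped"
# Beal Susan
# Chin May
# Craig Joel
# records sorted: 3
```

The string passed to close must match the command string used after the pipe symbol character for character.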
