
4.4. Command Substitution

From the discussion so far, we've seen two ways of getting values into variables: by assignment statements and by the user supplying them as command-line arguments (positional parameters). There is another way: command substitution, which allows you to use the standard output of a command as if it were the value of a variable. You will soon see how powerful this feature is.

The syntax of command substitution is:[9]

$(UNIX command)

[9] Bourne and C shell users should note that the command substitution syntax of those shells, `UNIX command` (with backward quotes, or grave accents), is also supported by bash for backward compatibility reasons. However, it is harder to read and less conducive to nesting.

The command inside the parentheses is run, and anything the command writes to standard output is returned as the value of the expression. These constructs can be nested, i.e., the UNIX command can contain command substitutions.

Here are some simple examples:

  • The value of $(pwd) is the current directory (same as the environment variable $PWD).

  • The value of $(ls $HOME) is the names of all files in your home directory.

  • The value of $(ls $(pwd)) is the names of all files in the current directory.

  • The value of $(< alice) is the contents of the file alice with any trailing newlines removed.[10]

    [10] Not available in versions of bash prior to 2.02.

  • To find out detailed information about a command if you don't know where its file resides, type ls -l $(type -path -all command-name). The -all option makes type report every place the name is found, and -path restricts the output to the pathnames of executable files, ignoring keywords, built-ins, etc. (The short forms of these options are -a and -p.)

  • If you want to edit (with vi) every chapter of your book on bash that has the phrase "command substitution," assuming that your chapter files all begin with ch, you could type:

    vi $(grep -l 'command substitution' ch*)

    The -l option to grep prints only the names of files that contain matches.
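The simple examples above can be tried safely in a scratch directory. The following is a minimal sketch, not part of the original text: the temporary directory and the contents of alice are made up for the demonstration.

```shell
# Work in a throwaway directory so the example does not touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Create a small file that ends with several extra newlines.
printf 'Curiouser and curiouser!\n\n\n' > alice

here=$(pwd)                # the output of pwd becomes the value
text=$(< alice)            # bash: contents of alice, trailing newlines removed
listing=$(ls "$(pwd)")     # nested substitution: the inner $(pwd) runs first
old_style=`pwd`            # backquote form, kept for backward compatibility

echo "text:    [$text]"
echo "listing: $listing"
[ "$here" = "$old_style" ] && echo "both forms agree"
```

The brackets around $text make it easy to see that the trailing newlines in the file did not survive the substitution.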

Command substitution, like variable expansion, is done within double quotes. Therefore, our rule in Chapter 1 and Chapter 3 about using single quotes for strings unless they contain variables will now be extended: "When in doubt, use single quotes, unless the string contains variables or command substitutions, in which case use double quotes."
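The quoting rule is easy to check for yourself. This short sketch uses basename on an arbitrary path purely so that the result is predictable; any command would do.

```shell
# Single quotes keep the text literal; double quotes allow the substitution.
literal='The last component is $(basename /usr/local/bin)'
expanded="The last component is $(basename /usr/local/bin)"

echo "$literal"     # the characters $(basename ...) are printed unchanged
echo "$expanded"    # the command runs and its output replaces the construct
```

The first echo prints the construct itself, while the second prints the sentence ending in bin.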

Command substitution helps us with the solution to the next programming task, which relates to the album database in Task 4-1.

Task 4-5

The file used in Task 4-1 is actually a report derived from a bigger table of data about albums. This table consists of several columns, or fields, to which a user refers by names like "artist," "title," "year," etc. The columns are separated by vertical bars (|, the same as the UNIX pipe character). To deal with individual columns in the table, field names need to be converted to field numbers.

Suppose there is a shell function called getfield that takes the field name as argument and writes the corresponding field (or column) number on the standard output. Use this routine to help extract a column from the data table.


The cut utility is a natural for this task. cut is a data filter: it extracts columns from tabular data. If you supply the numbers of columns you want to extract from the input, cut will print only those columns on the standard output. Columns can be character positions or—relevant in this example—fields that are separated by TAB characters or other delimiters.[11] Assume that the data table in our task is a file called albums and that it looks like this:

Depeche Mode|Speak and Spell|Mute Records|1981
Depeche Mode|Some Great Reward|Mute Records|1984
Depeche Mode|101|Mute Records|1989
Depeche Mode|Violator|Mute Records|1990
Depeche Mode|Songs of Faith and Devotion|Mute Records|1993
...

[11] Some older BSD-derived systems don't have cut, but you can use awk instead. Whenever you see a command of the form: cut -fN -dC filename, use this instead: awk -FC '{print $N}' filename.

Here is how we would use cut to extract the fourth (year) column:

cut -f4 -d\| albums

The -d argument is used to specify the character used as field delimiter (TAB is the default). The vertical bar must be backslash-escaped so that the shell doesn't try to interpret it as a pipe.
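Backslash-escaping is not the only way to protect the vertical bar; single quotes work equally well. The sketch below feeds two sample rows (copied from the albums layout) through both forms and through the awk equivalent mentioned in the footnote, so the three results can be compared.

```shell
# Two sample rows in the same layout as the albums file.
sample='Depeche Mode|Violator|Mute Records|1990
Depeche Mode|101|Mute Records|1989'

# Backslash-escaped delimiter, as in the text.
years_backslash=$(printf '%s\n' "$sample" | cut -f4 -d\|)

# Single-quoted delimiter: same effect, arguably easier to read.
years_quoted=$(printf '%s\n' "$sample" | cut -f4 -d'|')

# awk equivalent for systems without cut.
years_awk=$(printf '%s\n' "$sample" | awk -F'|' '{print $4}')

echo "$years_backslash"
```

All three variables end up holding the same two lines, 1990 and 1989.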

From this line of code and the getfield routine, we can easily derive the solution to the task. Assume that the first argument to getfield is the name of the field the user wants to extract. Then the solution is:

fieldname=$1
cut -f$(getfield $fieldname) -d\| albums

If we called this script with the argument year, the output would be:

1981
1984
1989
1990
1993
...
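The task only supposes that getfield exists. One way it might look is sketched here, assuming the four columns are named artist, title, label, and year; the name label is a guess for the third column, and extract_field is a hypothetical wrapper around the two-line solution above.

```shell
# Hypothetical getfield: maps a field name to its column number.
# The field names (especially "label") are assumptions for this sketch.
getfield() {
    case "$1" in
        artist) echo 1 ;;
        title)  echo 2 ;;
        label)  echo 3 ;;
        year)   echo 4 ;;
        *)      echo "getfield: unknown field: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: print one named column of a data file.
# The second argument is optional and defaults to the file albums.
extract_field() {
    fieldname=$1
    file=${2:-albums}
    # Stop early if the field name is not recognized.
    fieldnum=$(getfield "$fieldname") || return 1
    cut -f"$fieldnum" -d'|' "$file"
}
```

With these definitions, extract_field year produces the same output as the script above, and an unrecognized field name yields an error message instead of a confusing complaint from cut.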

Task 4-6 shows another small task that makes use of cut.

Task 4-6

Send a mail message to everyone who is currently logged in.


The command who tells you who is logged in (as well as which terminal they're on and when they logged in). Its output looks like this:

root     tty1         Oct 13 12:05
michael  tty5         Oct 13 12:58
cam      tty23        Oct 13 11:51
kilrath  tty25        Oct 13 11:58

The fields are separated by spaces, not TABs. Since we need the first field, we can get away with using a space as the field separator in the cut command. (Otherwise we'd have to use the option to cut that uses character columns instead of fields.) To provide a space character as an argument on a command line, you can surround it by quotes:

$ who | cut -d' ' -f1

With the above who output, this command's output would look like this:

root
michael
cam
kilrath

This leads directly to a solution to the task. Just type:

$ mail $(who | cut -d' ' -f1)

The command mail root michael cam kilrath will run and then you can type your message.
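A user who is logged in on more than one terminal appears on several lines of who output, and so would be named more than once on the mail command line. A possible refinement, sketched here with a hypothetical helper called unique_logins and made-up sample input, is to pass the names through sort -u.

```shell
# Hypothetical helper: reads who-style lines on standard input and
# prints each login name once.
unique_logins() {
    cut -d' ' -f1 | sort -u
}

# Sample input standing in for real who output; cam is logged in twice.
sample='root     tty1         Oct 13 12:05
michael  tty5         Oct 13 12:58
cam      tty23        Oct 13 11:51
cam      tty25        Oct 13 11:58'

recipients=$(printf '%s\n' "$sample" | unique_logins)

# Left unquoted on purpose so the names are joined by single spaces.
echo $recipients
```

This prints cam michael root. On a live system the corresponding command would be mail $(who | unique_logins).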

Task 4-7 is another task that shows how useful command pipelines can be in command substitution.

Task 4-7

The ls command gives you pattern-matching capability with wildcards, but it doesn't allow you to select files by modification date. Devise a mechanism that lets you do this.


Here is a function that allows you to list all files that were last modified on the date you give as argument. Once again, we choose a function for speed reasons. No pun is intended by the function's name:

function lsd
{
    date=$1
    ls -l | grep -i "^.\{41\}$date" | cut -c55-
}

This function depends on the column layout of the ls -l command. In particular, it depends on dates starting in column 42 and filenames starting in column 55. If this isn't the case in your version of UNIX, you will need to adjust the column numbers.[12]

[12] For example, ls -l on SunOS 4.1.x has dates starting in column 33 and filenames starting in column 46.

We use the grep search utility to match the date given as argument (in the form Mon DD, e.g., Jan 15 or Oct  6, the latter having two spaces) to the output of ls -l. This gives us a long listing of only those files whose dates match the argument. The -i option to grep allows you to use all lowercase letters in the month name, while the rather fancy argument means, "Match any line that contains 41 characters followed by the function argument." For example, typing lsd 'jan 15' causes grep to search for lines that match any 41 characters followed by jan 15 (or Jan 15).[13]

[13] Some older BSD-derived versions of UNIX (without System V extensions) do not support the \{N\} construct. For this example, write out 41 periods in a row (one for each character that precedes the date) instead of using \{N\}.

The output of grep is piped through our ubiquitous friend cut to retrieve the filenames only. The argument to cut tells it to extract characters in column 55 through the end of the line.
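Because real ls -l layouts vary, the column arithmetic is easiest to check on a synthetic line. In this sketch, printf pads the made-up leading fields to exactly 41 characters so that the date begins in column 42 and the filename in column 55, matching the layout the function assumes.

```shell
# Build a synthetic ls -l style line: 41 characters of leading fields,
# then a 12-character date and time, a space, and the filename.
line=$(printf '%-41s%s %s' '-rw-r--r--  1 cam  users  1024' 'Jan 15 12:05' 'report.txt')

# Skip 41 arbitrary characters, match the date case-insensitively,
# then keep everything from column 55 to the end of the line.
match=$(printf '%s\n' "$line" | grep -i '^.\{41\}jan 15' | cut -c55-)

echo "$match"
```

The pipeline prints report.txt. To adapt it to a different ls -l layout, change 41 to one less than the column where the date starts, and 55 to the column where filenames start.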

With command substitution, you can use this function with any command that accepts filename arguments. For example, if you want to print all files in your current directory that were last modified today, and today is January 15th, you could type:

$ lp $(lsd 'jan 15')

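The output of lsd is on multiple lines (one for each filename), but this is not a problem: the shell splits the result of a command substitution into separate arguments wherever it finds one of the characters in the shell variable IFS (see earlier in this chapter), which contains LINEFEED by default. lp therefore receives each filename as a separate argument.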
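The splitting can be observed directly by counting arguments. The sketch below assumes IFS still has its default value; count_args is a hypothetical helper, and the filenames stand in for lsd output.

```shell
# Hypothetical helper: print how many arguments were received.
count_args() {
    echo $#
}

# Stand-in for lsd output: one filename per line.
names=$(printf '%s\n' ch01.txt ch02.txt ch03.txt)

unquoted=$(count_args $names)      # split at each LINEFEED: three arguments
quoted=$(count_args "$names")      # double quotes suppress splitting: one

# Space is also in the default IFS, so one name with a space becomes two.
spaced=$(count_args $(printf '%s\n' 'my notes.txt'))

echo "unquoted: $unquoted, quoted: $quoted, spaced: $spaced"
```

This prints unquoted: 3, quoted: 1, spaced: 2. The last figure is the catch: a filename that contains a space is also split, so this technique is only reliable for names without whitespace.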
