3.4. Shell Variables

There are several characteristics of your environment that you may want to customize but that cannot be expressed as an on/off choice. Characteristics of this type are specified in shell variables. Shell variables can specify everything from your prompt string to how often the shell checks for new mail.

Like an alias, a shell variable is a name that has a value associated with it. bash keeps track of several built-in shell variables; shell programmers can add their own. By convention, built-in variables should have names in all capital letters. bash does, however, have two exceptions.^[7] The syntax for defining variables is somewhat similar to the syntax for aliases:

^[7] Versions prior to 2.0 have many more lowercase built-in variables. Most of these are now obsolete, the functionality having been moved to the shopt command.

varname=value

There must be no space on either side of the equal sign, and if the value is more than one word, it must be surrounded by quotes. To use the value of a variable in a command, precede its name by a dollar sign ($).
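As a brief sketch of these rules (the variable names and values below are made up for illustration):

```shell
# Hypothetical variable names, used only for illustration.
editor=vi                          # one word: no quotes needed
greeting="hello there"             # more than one word: quotes required
summary="$editor says $greeting"   # $ substitutes each variable's value
echo "$summary"
```

Writing editor = vi, with spaces around the equal sign, would instead make the shell try to run a command called editor.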

You can delete a variable with the command unset varname. Normally this isn't useful, since all variables that don't exist are assumed to be null, i.e., equal to the empty string "". But if you use the set option nounset, which causes the shell to indicate an error when it encounters an undefined variable, then you may be interested in unset.
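The difference that nounset makes can be seen in a small experiment. This is only a sketch: the subshell is there so that the error does not end the surrounding script, and the variable names plain and nounset_result are invented for the example.

```shell
wonderland=alice
unset wonderland

# Without nounset, the deleted variable expands to the empty string.
plain="[$wonderland]"
echo "$plain"

# With nounset, expanding it is an error. The parentheses run the
# attempt in a subshell so the failure stays contained.
if (set -o nounset; : "$wonderland") 2>/dev/null; then
    nounset_result=ok
else
    nounset_result=error
fi
echo "$nounset_result"
```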

The easiest way to check a variable's value is to use the echo built-in command. All echo does is print its arguments, but not until the shell has evaluated them. This includes—among other things that will be discussed later—taking the values of variables and expanding filename wildcards. So, if the variable wonderland has the value alice, typing:

$ echo "$wonderland"

will cause the shell to simply print alice. If the variable is undefined, the shell will print a blank line. A more verbose way to do this is:

$ echo "The value of \$varname is \"$varname\"."

The first dollar sign and the inner double quotes are backslash-escaped (i.e., preceded with \ so the shell doesn't try to interpret them—see Chapter 1) so they appear literally in the output, which for the above example would be:

The value of $wonderland is "alice".
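Put together as something you could paste into a shell, the example might look like the following sketch (the variable message is introduced here only so that the result can be inspected):

```shell
wonderland=alice
# \$ and \" are escaped, so they appear literally; the unescaped
# $wonderland is replaced by the variable's value.
message="The value of \$wonderland is \"$wonderland\"."
echo "$message"
```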

3.4.1. Variables and Quoting

Notice that we used double quotes around variables (and strings containing them) in these echo examples. In Chapter 1, we said that some special characters inside double quotes are still interpreted, while none are interpreted inside single quotes.

A special character that "survives" double quotes is the dollar sign—meaning that variables are evaluated. It's possible to do without the double quotes in some cases; for example, we could have written the above echo command this way:

$ echo The value of \$varname is \"$varname\".

But double quotes are more generally correct. Here's why. Suppose we did this:

$ fred='Four spaces between these    words.'

Then if we entered the command echo $fred, the result would be:

Four spaces between these words.

What happened to the extra spaces? Without the double quotes, the shell splits the string into words after substituting the variable's value, as it normally does when it processes command lines. The double quotes circumvent this part of the process (by making the shell think that the whole quoted string is a single word).

Therefore the command echo "$fred" prints this:

Four spaces between these    words.

The distinction between single and double quotes becomes particularly important when we start dealing with variables that contain user or file input later on.

Double quotes also allow other special characters to work, as we'll see in Chapter 4, Chapter 6, and Chapter 7. But for now, we'll revise the "When in doubt, use single quotes" rule in Chapter 1 by adding, "...unless a string contains a variable, in which case you should use double quotes."
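The three quoting choices can be compared side by side. In this sketch the results are captured in variables (unquoted, quoted, and single are names chosen for the example) rather than printed directly:

```shell
fred='Four spaces between these    words.'

unquoted=$(echo $fred)     # word splitting collapses the run of spaces
quoted=$(echo "$fred")     # double quotes keep the value as a single word
single=$(echo '$fred')     # single quotes suppress the substitution entirely

echo "$unquoted"
echo "$quoted"
echo "$single"
```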

3.4.2. Built-In Variables

As with options, some built-in shell variables are meaningful to general UNIX users, while others are arcana for hackers. We'll look at the more generally useful ones here, and we'll save some of the more obscure ones for later chapters. Again, Appendix B contains a complete list.

3.4.2.1 Editing mode variables

Several shell variables relate to the command-line editing modes that we saw in the previous chapter. These are listed in Table 3-4.

Table 3-4. Editing mode variables

Variable          Meaning
HISTCMD           The history number of the current command.
HISTCONTROL       A list of values, separated by colons (:), chosen from the following. ignorespace: lines beginning with a space are not entered into the history list. ignoredups: lines matching the last history line are not entered. erasedups: all previous lines matching the current line are removed from the history list before the line is saved. ignoreboth: enables both ignorespace and ignoredups.^[8]
HISTIGNORE        A list of patterns, separated by colons (:), used to decide which command lines to save in the history list. Patterns are considered to start at the beginning of the command line and must fully specify the line, i.e., no wildcard (*) is implicitly appended. The patterns are checked against the line after HISTCONTROL is applied. An ampersand (&) matches the previous line. An explicit & may be generated by escaping it with a backslash.^[9]
HISTFILE          Name of history file in which the command history is saved. The default is ~/.bash_history.
HISTFILESIZE      The maximum number of lines to store in the history file. The default is 500. When this variable is assigned a value, the history file is truncated, if necessary, to the given number of lines.
HISTSIZE          The maximum number of commands to remember in the command history. The default is 500.
HISTTIMEFORMAT    If it is set and not null, its value is used as a format string for strftime(3) to print the time stamp associated with each history entry displayed by the history command. Time stamps are written to the history file so they may be preserved across shell sessions.^[10]
FCEDIT            Pathname of the editor to use with the fc command.

^[8] history_control is synonymous with HISTCONTROL in versions of bash prior to 2.0. Versions prior to 1.14 only define history_control. ignoreboth is not available in bash versions prior to 1.14. HISTCONTROL is a colon-separated list, and erasedups has been added in bash 3.0 and later.

^[9] This variable is not available in versions of bash prior to 2.0.

^[10] This variable is not available in versions of bash prior to 3.0.

In the previous chapter, we saw how bash numbers commands. To find out the current command number in an interactive shell, you can use the HISTCMD variable. Note that if you unset HISTCMD, it will lose its special meaning, even if you subsequently set it again.

We also saw in the last chapter how bash keeps the history list in memory and saves it to a file when you exit a shell session. The variables HISTFILESIZE and HISTSIZE allow you to set the maximum number of lines that the shell saves in the history file, and the maximum number of lines to "remember" in the history list, i.e., the lines that it displays with the history command.

Suppose you wanted to maintain a small history file in your home directory. By setting HISTFILESIZE to 100, you immediately cause the history file to allow a maximum of 100 lines. If it is already larger than the size you specify, it will be truncated.

HISTSIZE works in the same way, but only on the history that the current shell has in memory. When you exit an interactive shell, HISTSIZE will be the maximum number of lines saved in your history file. If you have already set HISTFILESIZE to be less than HISTSIZE, the saved list will be truncated.

You can also cut down on the size of your history file and history list by using the HISTCONTROL variable. This is a colon-separated list of values. If it includes ignorespace, any commands that you type that start with a space won't appear in the history. Even more useful is the ignoredups option, which discards consecutive duplicate entries from the history list. Suppose you want to monitor the size of a file with ls as it is being created. Normally, every time you type ls it will appear in your history. If you set HISTCONTROL to ignoredups, only the first ls will appear in the history.

The variable HISTIGNORE allows you to specify a list of patterns which the command line is checked against. If the command line matches one of the patterns, it is not entered into the history list. You can also request that it ignore duplicates by using the pattern &.

For example, suppose you didn't want any command starting with l, nor any duplicates, to appear in the history. Setting HISTIGNORE to l*:& will do just that. Just as with other pattern matching we have seen, the wildcard after the l will match any command line starting with that letter.
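Collected in one place, the history settings discussed so far might look like this in a startup file. The particular numbers are illustrative, not recommendations, and which startup file they belong in is left as an assumption.

```shell
# Illustrative values only; adjust to taste.
HISTSIZE=500                          # commands remembered in memory
HISTFILESIZE=100                      # lines kept in the history file
HISTCONTROL=ignorespace:ignoredups    # same effect as ignoreboth
HISTIGNORE='l*:&'                     # skip commands starting with l, and repeats
```

The quotes around l*:& matter: unquoted, the shell would treat the & as the operator that puts a command in the background.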

Another useful variable is HISTTIMEFORMAT, which prepends a time stamp to each history entry showing when the command was executed. If it is unset or the value is null then no time stamp is written. If a format is given then time stamps are inserted using the specified format as part of the history and are shown with the history command.

The time stamp formats are shown in Table 3-5. Some of the results will be displayed using the particular format for the underlying locale, e.g., weekday names will be translated into the language being used on the system.

Table 3-5. Time stamp formats

Format    Replaced by
%a        The locale's abbreviated weekday name
%A        The locale's full weekday name
%b        The locale's abbreviated month name
%B        The locale's full month name
%c        The locale's appropriate date and time representation
%C        The century number (the year divided by 100 and truncated to an integer) as a decimal number [00-99]
%d        The day of the month as a decimal number [01-31]
%D        The date in American format; the same value as %m/%d/%y
%e        The day of the month as a decimal number [1-31]; a single digit is preceded by a space
%h        The same as %b
%H        The hour (24-hour clock) as a decimal number [00-23]
%I        The hour (12-hour clock) as a decimal number [01-12]
%j        The day of the year as a decimal number [001-366]
%m        The month as a decimal number [01-12]
%M        The minute as a decimal number [00-59]
%n        A newline character
%p        The locale's equivalent of either a.m. or p.m.
%r        The time in a.m. and p.m. notation; in the POSIX locale this is equivalent to %I:%M:%S %p
%R        The time in 24-hour notation (%H:%M)
%S        The second as a decimal number [00-61]
%t        A tab character
%T        The time (%H:%M:%S)
%u        The weekday as a decimal number [1-7], with 1 representing Monday
%U        The week number of the year (Sunday as the first day of the week) as a decimal number [00-53]
%V        The week number of the year (Monday as the first day of the week) as a decimal number [01-53]; if the week containing 1 January has four or more days in the new year, then it is considered week 1; otherwise, it is the last week of the previous year, and the next week is week 1
%w        The weekday as a decimal number [0-6], with 0 representing Sunday
%W        The week number of the year (Monday as the first day of the week) as a decimal number [00-53]; all days in a new year preceding the first Monday are considered to be in week 0
%x        The locale's appropriate date representation
%X        The locale's appropriate time representation
%y        The year without century as a decimal number [00-99]
%Y        The year with century as a decimal number
%Z        The timezone name or abbreviation, or by nothing if no timezone information exists
%%        %

If you wanted to have the date and time with each history entry, you could put:

HISTTIMEFORMAT="%y/%m/%d %T "

then the output of the history command would look something like:

...
78 04/11/26 17:14:05 HISTTIMEFORMAT="%y/%m/%d %T "
79 04/11/26 17:14:08 ls -l
80 04/11/26 17:14:09 history

If the history has never had a date format set before then all of the entries prior to setting the variable will get the time stamp of the time the variable was set. If you set HISTTIMEFORMAT to null and then set it to a format, the previous time stamps are retained and displayed in the new format.
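One way to preview what a format string will produce, before committing it to HISTTIMEFORMAT, is to hand it to the date command, which accepts the same conversions on typical systems. This is a sketch of that idea; the variable names fmt and stamp are invented for the example.

```shell
# The same format string as in the HISTTIMEFORMAT example.
fmt="%y/%m/%d %T "

# date(1) understands these conversions too, so it can preview the stamp.
stamp=$(date +"$fmt")
echo "${stamp}history"
```

The trailing space in the format is what separates the time stamp from the command in the history listing.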

3.4.2.2 Mail variables

Since the mail program is not running all the time, there is no way for it to inform you when you get new mail; therefore the shell does this instead.^[11] The shell can't actually check for incoming mail, but it can look at your mail file periodically and determine whether the file has been modified since the last check. The variables listed in Table 3-6 let you control how this works.

^[11] BSD UNIX users should note that the biff command on those systems does a better job of informing you about new mail; while bash only prints "you have new mail" messages right before it prints command prompts, biff can do so at any time.

Table 3-6. Mail variables

Variable     Meaning
MAIL         Name of file to check for incoming mail
MAILCHECK    How often, in seconds, to check for new mail (default 60 seconds)
MAILPATH     List of filenames, separated by colons (:), to check for incoming mail

Under the simplest scenario, you use the standard UNIX mail program, and your mail file is /usr/mail/yourname or something similar. In this case, you would just set the variable MAIL to this filename if you want your mail checked:

MAIL=/usr/mail/yourname

If your system administrator hasn't already done it for you, put a line like this in your .bash_profile.

However, some people use nonstandard mailers that use multiple mail files; MAILPATH was designed to accommodate this. bash will use the value of MAIL as the name of the file to check, unless MAILPATH is set; in which case, the shell will check each file in the MAILPATH list for new mail. You can use this mechanism to have the shell print a different message for each mail file: for each mail filename in MAILPATH, append a question mark followed by the message you want printed.

For example, let's say you have a mail system that automatically sorts your mail into files according to the username of the sender. You have mail files called /usr/mail/you/martin, /usr/mail/you/geoffm, /usr/mail/you/paulr, etc. You define your MAILPATH as follows:

MAILPATH=/usr/mail/you/martin:/usr/mail/you/geoffm:\
/usr/mail/you/paulr

If you get mail from Martin Lee, the file /usr/mail/you/martin will change. bash will notice the change within one minute and print the message:

You have new mail in /usr/mail/you/martin

If you are in the middle of running a command, the shell will wait until the command finishes (or is suspended) to print the message. To customize this further, you could define MAILPATH to be:

MAILPATH="\
/usr/mail/you/martin?You have mail from Martin.:\
/usr/mail/you/geoffm?Mail from Geoff has arrived.:\
/usr/mail/you/paulr?There is new mail from Paul."

The backslashes at the end of each line allow you to continue your command on the next line. But be careful: you can't indent subsequent lines. Now, if you get mail from Martin, the shell will print:

You have mail from Martin.
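If the list of mail files is long, the same kind of value can be assembled in a loop instead of being typed out by hand. The following is only a sketch: the loop, the variable maildir, the message wording, and the 30-second interval are all assumptions made for illustration.

```shell
maildir=/usr/mail/you            # the example mail directory from the text
MAILPATH=
for sender in martin geoffm paulr; do
    # Append "file?message", putting a colon between entries.
    MAILPATH="${MAILPATH:+$MAILPATH:}$maildir/$sender?New mail from $sender."
done
MAILCHECK=30                     # check every 30 seconds instead of every 60
echo "$MAILPATH"
```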

You can also use the variable $_ in the message to print the name of the current mail file. For example:

MAILPATH='/usr/mail/you?You have some new mail in $_'

When new mail arrives, this will print the line:

You have some new mail in /usr/mail/you

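Mail checking can be switched off altogether by unsetting MAILCHECK. The related mailwarn option to the shopt command does something different: when it is set, bash also warns you if a mail file has been accessed since the last time it was checked.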

3.4.2.3 Prompting variables

If you have seen enough experienced UNIX users at work, you may already have realized that the shell's prompt is not engraved in stone. Many of these users have all kinds of things encoded in their prompts. It is possible to put useful information into the prompt, including the date and the current directory. We'll give you some of the information you need to modify your own here; the rest will come in the next chapter.

Actually, bash uses four prompt strings. They are stored in the variables PS1, PS2, PS3, and PS4. The first of these is called the primary prompt string; it is your usual shell prompt, and its default value is "\s-\v\$ ".^[12] Many people like to set their primary prompt string to something containing their login name. Here is one way to do this:

^[12] In versions of bash prior to 2.0, the default is "bash\$ ".

PS1="\u--> "

The \u tells bash to insert the name of the current user into the prompt string. If your user name is alice, your prompt string will be "alice--> ". If you are a C shell user and, like many such people, are used to having a history number in your prompt string, bash can do this similarly to the C shell: if the sequence \! is used in the prompt string, it will substitute the history number. Thus, if you define your prompt string to be:

PS1="\u \!--> "

then your prompts will be like alice 1-->, alice 2-->, and so on.

But perhaps the most useful way to set up your prompt string is so that it always contains your current directory. This way, you needn't type pwd to remember where you are. Here's how:

PS1="\w--> "

Table 3-7 lists the prompt customizations that are available.^[13]

^[13] \a, \e, \H, \T, \@, \v, and \V are not available in versions prior to 2.0. \D was introduced in bash 2.05b.

Table 3-7. Prompt string customizations

Command        Meaning
\a             The ASCII bell character (007)
\A             The current time in 24-hour HH:MM format
\d             The date in "Weekday Month Day" format
\D{format}     The format is passed to strftime(3) and the result is inserted into the prompt string; an empty format results in a locale-specific time representation; the braces are required
\e             The ASCII escape character (033)
\H             The hostname
\h             The hostname up to the first "."
\j             The number of jobs currently managed by the shell
\l             The basename of the shell's terminal device name
\n             A newline
\r             A carriage return
\s             The name of the shell
\T             The current time in 12-hour HH:MM:SS format
\t             The current time in 24-hour HH:MM:SS format
\@             The current time in 12-hour a.m./p.m. format
\u             The username of the current user
\v             The version of bash (e.g., 2.00)
\V             The release of bash; the version and patchlevel (e.g., 2.00.0)
\w             The current working directory
\W             The basename of the current working directory
\#             The command number of the current command
\!             The history number of the current command
\$             If the effective UID is 0, print a #, otherwise print a $
\nnn           Character code in octal
\\             Print a backslash
\[             Begin a sequence of non-printing characters, such as terminal control sequences
\]             End a sequence of non-printing characters
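Several of these sequences can be combined in one prompt string. The combination below is just one possibility, offered as a sketch. It uses single quotes, because inside double quotes the shell itself would turn \$ into a plain $ at assignment time and the "# for root" behavior would be lost.

```shell
# User name, short hostname, working directory, then $ (or # for root).
PS1='\u@\h:\w\$ '

# A variant with the time of day, using \D{format} and a strftime format:
# PS1='[\D{%H:%M}] \u:\W\$ '
```

For user alice on a host called wonderland (a hypothetical name), working in /tmp, the first setting would give a prompt like alice@wonderland:/tmp$ followed by a space.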

PS2 is called the secondary prompt string; its default value is >. It is used when you type an incomplete line and hit RETURN, as an indication that you must finish your command. For example, assume that you start a quoted string but don't close the quote. Then if you hit RETURN, the shell will print > and wait for you to finish the string:

$ echo "This is a long line,          # PS1 for the command
> which is terminated down here"     # PS2 for the continuation
$                                    # PS1 for the next command

PS3 and PS4 relate to shell programming and debugging. They will be explained in Chapter 5 and Chapter 9.

3.4.2.4 Command search path

Another important variable is PATH, which helps the shell find the commands you enter.

As you probably know, every command you use is actually a file that contains code for your machine to run.^[14] These files are called executable files or just executables for short. They are stored in various directories. Some directories, like /bin or /usr/bin, are standard on all UNIX systems; some depend on the particular version of UNIX you are using; some are unique to your machine; if you are a programmer, some may even be your own. In any case, there is no reason why you should have to know where a command's executable file is in order to run it.

^[14] Unless it's a built-in command (one of those shown in boldface, like cd and echo), in which case the code is simply part of the executable file for the entire shell.

That is where PATH comes in. Its value is a list of directories that the shell searches every time you enter a command;^[15] the directory names are separated by colons (:), just like the files in MAILPATH.

^[15] Unless the command name contains a slash (/), in which case the search does not take place.

For example, if you type echo $PATH, you will see something like this:

/bin:/usr/bin:/usr/local/bin:/usr/X386/bin

Why should you care about your path? There are two main reasons. First, once you have read the later chapters of this book and you try writing your own shell programs, you will want to test them and eventually set aside a directory for them. Second, your system may be set up so that certain restricted commands' executable files are kept in directories that are not listed in PATH. For example, there may be a directory /usr/games in which there are executables that are verboten during regular working hours.

Therefore you may want to add directories to your PATH. Let's say you have created a bin directory under your login directory, which is /home/you, for your own shell scripts and programs. To add this directory to your PATH so that it is there every time you log in, put this line in your .bash_profile:

PATH=$PATH":/home/you/bin"

This line sets PATH to whatever it was before, followed immediately by a colon and /home/you/bin.

This is the safe way of doing it. When you enter a command, the shell searches directories in the order they appear in PATH until it finds an executable file. Therefore, if you have a shell script or program whose name is the same as an existing command, the shell will use the existing command—unless you type in the command's full pathname to make it clear. For example, if you have created your own version of the more command in the above directory and your PATH is set up as in the last example, you will need to type /home/you/bin/more (or just ~/bin/more) to get your version.
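If a startup file might be read more than once, a slightly more defensive version of the same idea appends the directory only when it is not already listed. This is a sketch rather than anything the shell requires; the variable mybin is a name chosen for the example.

```shell
mybin=/home/you/bin              # the example directory from the text
case ":$PATH:" in
    *":$mybin:"*) ;;             # already listed: leave PATH alone
    *) PATH=$PATH:$mybin ;;      # otherwise append it at the end
esac
echo "$PATH"
```

Wrapping PATH in colons before matching lets one pattern catch the directory whether it appears first, last, or in the middle.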

The more reckless way of resetting your path is to put your own directory before the other directories:

PATH="/home/you/bin:"$PATH

This is unsafe because you are trusting that your own version of the more command works properly. But it is also risky for a more important reason: system security. If your PATH is set up in this way, you leave open a "hole" that is well known to computer crackers and mischief makers: they can install "Trojan horses" and do other things to steal files or do damage. (See Chapter 10 for more details.) Therefore, unless you have complete control of (and confidence in) everyone who uses your system, use the first of the two methods of adding your own command directory.

If you need to know which directory a command comes from, you need not look at directories in your PATH until you find it. The shell built-in command type prints the full pathname of the command you give it as argument, or just the command's name and its type if it's a built-in command itself (like cd), an alias, or a function (as we'll see in Chapter 4).
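A few ways of asking type about a command are sketched below. The -t and -P options are not covered in the text above; in bash, -t prints a single word describing the kind of command and -P forces a search of PATH. The variable names are invented for the example.

```shell
type cd                    # reports that cd is a shell builtin
kind_cd=$(type -t cd)      # one-word description of what cd is
path_ls=$(type -P ls)      # full pathname of ls, found by searching PATH
echo "$kind_cd"
echo "$path_ls"
```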

3.4.2.5 Command hashing

You may be thinking that having to go and find a command in a large list of possible places would take a long time, and you'd be right. To speed things up, bash uses what is known as a hash table.

Every time the shell goes and finds a command in the search path, it enters it in the hash table. If you then use the command again, bash first checks the hash table to see if the command is listed. If it is, it uses the path given in the table and executes the command; otherwise, it just has to go and look for the command in the search path.

You can see what is currently in the hash table with the command hash:

$ hash
hits    command
   2    /bin/cat
   1    /usr/bin/stat
   2    /usr/bin/less
   1    /usr/bin/man
   2    /usr/bin/apropos
   2    /bin/more
   1    /bin/ln
   3    /bin/ls
   1    /bin/ps
   2    /bin/vi

This shows not only the hashed commands but also how many times they have been executed (the hits) during the current login session.

Supplying a command name to hash forces the shell to look up the command in the search path and enter it in the hash table. You can also make bash "forget" what is in the hash table by using hash -r to remove everything in the table or hash -d name to remove the specified name.^[16] Another option, -p, allows you to enter a command into the hash table, even if the command doesn't exist.^[17]

^[16] The -d option is not available in versions of bash prior to 2.05b.

^[17] The -p option is not available in versions of bash prior to 2.0.

Command hashing can be turned on and off with the hashall option to set. In general use, there shouldn't be any need to turn it off.
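The hash options described above can be tried out in sequence. This sketch also uses -t, an option of the bash hash built-in that is not mentioned in the text; it prints the pathname remembered for a name. The variable remembered exists only so the result can be inspected.

```shell
hash -r                      # empty the hash table
hash ls                      # search PATH for ls and remember where it is
remembered=$(hash -t ls)     # -t prints the remembered pathname
echo "$remembered"
hash -d ls                   # forget ls again
```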

Don't be too concerned about the details of hashing. The command hashing and lookup is all done by bash without you knowing it's taking place.

3.4.2.6 Directory search path and variables

CDPATH is a variable whose value, like that of PATH, is a list of directories separated by colons. Its purpose is to augment the functionality of the cd built-in command.

By default, CDPATH isn't set (meaning that it is null), and when you type cd dirname, the shell will look in the current directory for a subdirectory that is called dirname.^[18] If you set CDPATH, you give the shell a list of places to look for dirname; the list may or may not include the current directory.

^[18] This search is disabled when dirname starts with a slash. It is also disabled when dirname starts with ./ or ../.

Here is an example. Consider the alias for the long cd command from earlier in this chapter:

alias cdvoy='cd sipp/demo/animation/voyager'

Now suppose there were a few directories under this directory to which you need to go often; they are called src, bin, and doc. You define your CDPATH like this:

CDPATH=:~/sipp/demo/animation/voyager

In other words, you define your CDPATH to be the empty string (meaning the current directory) followed by ~/sipp/demo/animation/voyager.

With this setup, if you type cd doc, then the shell will look in the current directory for a (sub)directory called doc. Assuming that it doesn't find one, it looks in the directory ~/sipp/demo/animation/voyager. The shell finds the doc directory there, so you go directly there.
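The search can be tried safely in a scratch area. In this sketch a temporary directory tree stands in for ~/sipp/demo/animation/voyager; the names base, elsewhere, and landed are invented for the example.

```shell
# A throwaway directory tree stands in for ~/sipp/demo/animation/voyager.
base=$(mktemp -d)
mkdir -p "$base/voyager/doc" "$base/elsewhere"

CDPATH=:$base/voyager
cd "$base/elsewhere"

# There is no doc directory here, so cd falls back to the CDPATH entry.
# When a CDPATH entry is used, cd prints the directory it chose.
cd doc >/dev/null
landed=$PWD
echo "$landed"
```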

If you often find yourself going to a specific group of directories as you work on a particular project, you can use CDPATH to get there quickly. Note that this feature is only useful if you keep CDPATH up to date as your work habits change.

bash provides another shorthand mechanism for referring to directories; if you set the shell option cdable_vars using shopt,^[19] any argument supplied to the cd command that is not a directory is assumed to be the name of a variable whose value is the directory to change to.

^[19] In versions of bash prior to 2.0, cdable_vars is a shell variable that you can set and unset.

We might define the variable anim to be ~/sipp/demo/animation/voyager. If we set cdable_vars and then type:

cd anim

the current directory will become ~/sipp/demo/animation/voyager.
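A self-contained version of this, with a temporary directory standing in for the voyager directory (an assumption made so the sketch can be run anywhere), might look like:

```shell
shopt -s cdable_vars
anim=$(mktemp -d)     # throwaway directory standing in for the voyager directory
cd /

# "anim" is not a directory under /, so cd tries the variable anim instead
# and prints the directory it changed to.
cd anim >/dev/null
echo "$PWD"
```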

3.4.2.7 Miscellaneous variables

We have covered the shell variables that are important from the standpoint of customization. There are also several that serve as status indicators and for various other miscellaneous purposes. Their meanings are relatively straightforward; the more basic ones are summarized in Table 3-8.

Table 3-8. Status variables

Variable         Meaning
HOME             Name of your home (login) directory
SECONDS          Number of seconds since the shell was invoked
BASH             Pathname of this instance of the shell you are running
BASH_VERSION     The version number of the shell you are running
BASH_VERSINFO    An array of version information for the shell you are running
PWD              Current directory
OLDPWD           Previous directory before the last cd command

The shell sets the values of these variables, except HOME (which is set by the login process: login, rshd, etc.). The first five are set at login time, the last two whenever you change directories. Although you can also set their values, just like any other variables, it is difficult to imagine any situation where you would want to. In the case of SECONDS, if you set it to a new value it will start counting from the value you give it, but if you unset SECONDS it will lose its special meaning, even if you subsequently set it again.
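A short experiment shows some of these variables changing. It is only a sketch; the variables start and elapsed, and the starting value of 100, are chosen for the example.

```shell
start=$PWD
cd /
echo "now in $PWD, came from $OLDPWD"

SECONDS=100      # restart the counter from 100
sleep 1
elapsed=$SECONDS
echo "$elapsed"
```

After the cd, OLDPWD holds the directory the shell started in, and elapsed is at least 100 because SECONDS keeps counting from the value it was given.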
