7.1. I/O Redirectors

In Chapter 1, you learned about the shell's basic I/O redirectors: >, <, and |. Although these are enough to get you through 95% of your UNIX life, you should know that bash supports many other redirectors. Table 7-1 lists them, including the three we've already seen. Although some of the rest are broadly useful, others are mainly for systems programmers.

Notice that some of the redirectors in Table 7-1 contain a digit n, and that their descriptions contain the term file descriptor; we'll cover that in a little while.

The first two new redirectors, >> and >|, are simple variations on the standard output redirector >. The >> redirector appends to the output file (instead of overwriting it) if it already exists; otherwise it acts exactly like >. A common use of >> is for adding a line to an initialization file (such as .bashrc or .mailrc) when you don't want to bother with a text editor. For example:

    $ cat >> .bashrc
    alias cdmnt='mount -t iso9660 /dev/sbpcd /cdrom'
    ^D

As we saw in Chapter 1, cat without an argument uses standard input as its input. This allows you to type the input and end it with CTRL-D on its own line. The alias line will be appended to the file .bashrc if it already exists; if it doesn't, the file is created with that one line.

Recall from Chapter 3 that you can prevent the shell from overwriting a file with > file by typing set -o noclobber. The >| redirector overrides noclobber; it's the "Do it anyway, dammit!" redirector.

The redirector <> is mainly meant for use with device files (in the /dev directory), i.e., files that correspond to hardware devices such as terminals and communication lines. Low-level systems programmers can use it to test device drivers; otherwise, it's not very useful.

The rest of the redirectors are useful only in special situations, and you are unlikely to need them most of the time.

7.1.1. Here-documents

The << label redirector essentially forces the input to a command to be the shell's standard input, which is read until there is a line that contains only label. The input in between is called a here-document. Here-documents aren't very interesting when used from the command prompt. In fact, using one there is the same as the normal use of standard input except for the label. We could use a here-document to simulate the mail facility. When you send a message to someone with the mail utility, you end the message with a dot (.). The body of the message is saved in a file, msgfile:

    $ cat >> msgfile << .
    > this is the text of
    > our message.
    > .

Here-documents are meant to be used from within shell scripts; they let you specify "batch" input to programs. A common use of here-documents is with simple text editors like ed. Task 7-1 is a programming task that uses a here-document in this way.
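Before turning to the task, here is a minimal sketch of what "batch" input looks like inside a script. The function name sorted_fruit, the label END, and the three input lines are made up for illustration; the only point is that sort reads everything up to the line containing just END as its standard input.

```shell
#!/bin/bash
# Hypothetical helper: feeds three fixed lines to sort via a here-document.
# Everything between "<< END" and the line consisting only of END becomes
# the standard input of sort.
sorted_fruit() {
    sort << END
pear
apple
mango
END
}

sorted_fruit
```

Run as a script, this prints the three words in sorted order without anything being typed at the terminal, which is what makes here-documents useful for driving programs such as ed.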
We can use ed to delete the header lines. To do this, we need to know something about the syntax of mail messages; specifically, that there is always a blank line between the header lines and the message text. The ed command 1,/^[ ]*$/d does the trick: it means, "Delete from line 1 until the first blank line." We also need the ed commands w (write the changed file) and q (quit). Here is the code that solves the task:

    ed $1 << EOF
    1,/^[ ]*$/d
    w
    q
    EOF

The shell does parameter (variable) substitution and command substitution on text in a here-document, meaning that you can use shell variables and commands to customize the text. A good example of this is the bashbug script, which sends a bug report to the bash maintainer (see Chapter 11). Here is a stripped-down version:

    MACHINE="i586"
    OS="linux-gnu"
    CC="gcc"
    CFLAGS=" -DPROGRAM='bash' -DHOSTTYPE='i586' -DOSTYPE='linux-gnu' \
    -DMACHTYPE='i586-pc-linux-gnu' -DSHELL -DHAVE_CONFIG_H -I. \
    -I. -I./lib -g -O2"
    RELEASE="2.01"
    PATCHLEVEL="0"
    RELSTATUS="release"
    MACHTYPE="i586-pc-linux-gnu"

    TEMP=/tmp/bbug.$$

    case "$RELSTATUS" in
    alpha*|beta*) BUGBASH=chet@po.cwru.edu ;;
    *) BUGBASH=bug-bash@prep.ai.mit.edu ;;
    esac

    BUGADDR="${1-$BUGBASH}"

    UN=
    if (uname) >/dev/null 2>&1; then
        UN=`uname -a`
    fi

    cat > $TEMP <<EOF
    From: ${USER}
    To: ${BUGADDR}
    Subject: [50 character or so descriptive subject here (for reference)]

    Configuration Information [Automatically generated, do not change]:
    Machine: $MACHINE
    OS: $OS
    Compiler: $CC
    Compilation CFLAGS: $CFLAGS
    uname output: $UN
    Machine Type: $MACHTYPE

    bash Version: $RELEASE
    Patch Level: $PATCHLEVEL
    Release Status: $RELSTATUS

    Description:
        [Detailed description of the problem, suggestion, or complaint.]

    Repeat-By:
        [Describe the sequence of events that causes the problem to occur.]

    Fix:
        [Description of how to fix the problem. If you don't know a
        fix for the problem, don't include this section.]
    EOF

    vi $TEMP

    mail $BUGADDR < $TEMP

The first eight lines are generated when bashbug is installed.
The shell will then substitute the appropriate values for the variables in the text whenever the script is run.

The redirector << has two variations. First, you can prevent the shell from doing parameter and command substitution by surrounding the label in single or double quotes. In the above example, if you used the line cat > $TEMP <<'EOF', then text like $USER and $MACHINE would remain untouched (defeating the purpose of this particular script).

The second variation is <<-, which deletes leading TABs (but not blanks) from the here-document and the label line. This allows you to indent the here-document's text, making the shell script more readable:

    cat > $TEMP <<-EOF
    	From: ${USER}
    	To: ${BUGADDR}
    	Subject: [50 character or so descriptive subject here]

    	Configuration Information [Automatically generated, do not change]:
    	Machine: $MACHINE
    	OS: $OS
    	Compiler: $CC
    	Compilation CFLAGS: $CFLAGS
    	...
    	EOF

Choose your label carefully so that it doesn't appear as an actual input line.

A slight variation on this is provided by the here string. It takes the form <<<word; the word is expanded and supplied on the standard input.

7.1.2. File Descriptors

The next few redirectors in Table 7-1 depend on the notion of a file descriptor. Like the device files used with <>, this is a low-level UNIX I/O concept that is of interest only to systems programmers, and then only occasionally. You can get by with a few basic facts about them; for the whole story, look at the entries for read(), write(), fcntl(), and others in Section 2 of the UNIX manual. You might wish to refer to UNIX Power Tools by Shelley Powers, Jerry Peek, Tim O'Reilly, and Mike Loukides (O'Reilly).

File descriptors are integers starting at 0 that refer to particular streams of data associated with a process. When a process starts, it usually has three file descriptors open. These correspond to the three standard streams: standard input (file descriptor 0), standard output (1), and standard error (2).
If a process opens additional files for input or output, they are assigned to the next available file descriptors, starting with 3.

By far the most common use of file descriptors with bash is in saving standard error in a file. For example, if you want to save the error messages from a long job in a file so that they don't scroll off the screen, append 2> file to your command. If you also want to save standard output, append > file1 2> file2. This leads to another programming task.
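Before turning to that task, the two forms just described can be tried on a small scale. The sketch below is illustrative only: noisy is a hypothetical stand-in for a long job, and the file names errors, file1, and file2 live in a scratch directory created with mktemp so that no real files are touched.

```shell
#!/bin/bash
# Hypothetical stand-in for a long job: one line to standard output,
# one line to standard error.
noisy() {
    echo "result"
    echo "warning" >&2
}

dir=$(mktemp -d)    # scratch directory for the example files

# Save only standard error; standard output still reaches the terminal.
noisy 2> "$dir/errors"

# Save standard output and standard error in two separate files.
noisy > "$dir/file1" 2> "$dir/file2"

# Show what ended up where.
echo "errors: $(cat "$dir/errors")"
echo "file1:  $(cat "$dir/file1")"
echo "file2:  $(cat "$dir/file2")"
```

The first call leaves the normal output on the screen and captures only the error message; the second call captures each stream in its own file.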
We'll call this script start. The code is very terse:

    "$@" > logfile 2>&1 &

This line executes whatever command and parameters follow start. (The command cannot contain pipes or output redirectors.) It sends the command's standard output to logfile. Then, the redirector 2>&1 says, "send standard error (file descriptor 2) to the same place as standard output (file descriptor 1)." Since standard output is redirected to logfile, standard error will go there too. The final & puts the job in the background so that you get your shell prompt back.

As a small variation on this theme, we can send both standard output and standard error into a pipe instead of a file: command 2>&1 | ... does this. (Make sure you understand why.) Here is a script that sends both standard output and standard error to the logfile (as above) and to the terminal:

    "$@" 2>&1 | tee logfile &

The command tee takes its standard input and copies it to standard output and the file given as argument.

These scripts have one shortcoming: you must remain logged in until the job completes. Although you can always type jobs (see Chapter 1) to check on progress, you can't leave your terminal until the job finishes, unless you want to risk a breach of security.[1] We'll see how to solve this problem in the next chapter.
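The reasoning behind 2>&1 in the start script can be checked with a small experiment. This is a sketch under stated assumptions, not part of the scripts above: noisy is a hypothetical command that writes one line to each stream, and a scratch file from mktemp stands in for logfile. It relies on the shell processing redirections from left to right, and on a pipe being connected before a command's own redirections are processed.

```shell
#!/bin/bash
# Hypothetical stand-in for a job that writes to both streams.
noisy() {
    echo "out"
    echo "err" >&2
}

log=$(mktemp)    # scratch file standing in for logfile

# Order used by start: standard output goes to the file first, then
# standard error is pointed at the same place. Nothing is left for the
# command substitution to capture.
captured_a=$(noisy > "$log" 2>&1)
log_a=$(cat "$log")

# Reversed order: standard error is pointed at the current standard
# output (here, the command substitution) before standard output is
# moved to the file, so the two streams end up in different places.
captured_b=$(noisy 2>&1 > "$log")
log_b=$(cat "$log")

# With a pipe, standard output is already connected to the pipe when
# 2>&1 is processed, so both streams flow into the next command.
piped=$(noisy 2>&1 | tr 'a-z' 'A-Z')

printf 'a: captured=[%s] log=[%s]\n' "$captured_a" "$log_a"
printf 'b: captured=[%s] log=[%s]\n' "$captured_b" "$log_b"
printf 'pipe: [%s]\n' "$piped"

rm -f "$log"
```

In case a, both lines land in the file. In case b, only the standard output line does, which is why the order > logfile 2>&1 matters in start.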
The other file-descriptor-oriented redirectors (e.g., <&n) are usually used for reading input from (or writing output to) more than one file at the same time. We'll see an example later in this chapter. Otherwise, they're mainly meant for systems programmers, as are <&- (force standard input to close) and >&- (force standard output to close).

Before we leave this topic, we should just note that 1> is the same as >, and 0< is the same as <. If you understand this, then you probably know all you need to know about file descriptors.
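As a small preview of reading from more than one file at the same time, here is a minimal sketch. The file names left and right, their contents, and the choice of descriptors 3 and 4 are assumptions made for illustration; the final lines also check the claim that 1> behaves like >.

```shell
#!/bin/bash
dir=$(mktemp -d)    # scratch directory for the example files
printf 'a\nb\n' > "$dir/left"
printf '1\n2\n' > "$dir/right"

# Open each file for reading on its own descriptor. Descriptors 3 and 4
# are chosen explicitly here; 0, 1, and 2 are already taken.
exec 3< "$dir/left" 4< "$dir/right"

# Read one line from each file per iteration, using <&n to take input
# from descriptor n instead of standard input.
paired=""
while read -r l <&3 && read -r r <&4; do
    paired="${paired:+$paired }$l=$r"
done

# Close both descriptors once we are done with them.
exec 3<&- 4<&-

echo "$paired"

# 1> names standard output explicitly, so these two lines do the same thing.
echo hello 1> "$dir/one"
echo hello > "$dir/two"
cmp -s "$dir/one" "$dir/two" && echo "1> and > produced identical files"
```

The loop stops as soon as either file runs out of lines, so files of unequal length are paired only up to the shorter one.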