
5.4. Regular Expressions

Sed, like grep, searches for patterns in files using regular expressions (REs) and a variety of metacharacters shown in Table 5.3 on page 132. Regular expressions are patterns of characters enclosed in forward slashes for searches and substitutions.


sed -n '/RE/p' filename

sed -n '/love/p' filename



sed 's/RE/replacement string/' filename

sed 's/love/like/' filename
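The two forms can be tried on a small input. This is a minimal sketch, assuming a POSIX shell; the file name sample.txt and its contents are hypothetical, made up for illustration. Note that when -n is given, a substitution prints nothing unless the p flag is added to it.

```shell
# Build a small hypothetical input file in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'I love sed' 'grep is fine' 'love is all' > "$tmp/sample.txt"

# Print only the lines that match the RE love.
matched=$(sed -n '/love/p' "$tmp/sample.txt")

# Replace the first love on each line with like; without -n, every line is printed.
replaced=$(sed 's/love/like/' "$tmp/sample.txt")

printf '%s\n' "$matched"
printf '%s\n' "$replaced"
rm -rf "$tmp"
```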


Table 5.3. sed's Regular Expression Metacharacters

| Metacharacter | Function | Example | What It Matches |
|---|---|---|---|
| ^ | Beginning-of-line anchor | /^love/ | Matches all lines beginning with love. |
| $ | End-of-line anchor | /love$/ | Matches all lines ending with love. |
| . | Matches one character, but not the newline character | /l..e/ | Matches lines containing an l, followed by two characters, followed by an e. |
| * | Matches zero or more of the preceding character | / *love/ | Matches lines with zero or more spaces, followed by the pattern love. |
| [ ] | Matches one character in the set | /[Ll]ove/ | Matches lines containing love or Love. |
| [^ ] | Matches one character not in the set | /[^A-KM-Z]ove/ | Matches lines containing one character that is not A through K or M through Z, followed by ove. |
| \(..\) | Saves matched characters | s/\(love\)able/\1er/ | Tags the marked portion and saves it as tag number 1. Use \1 to reference it later. Up to nine tags may be used, numbered from the leftmost part of the pattern. In the example, love is saved in register 1 and recalled in the replacement string, so loveable is replaced with lover. |
| & | Saves search string so it can be remembered in the replacement string | s/love/**&**/ | The ampersand represents the search string. The string love will be replaced with itself surrounded by asterisks; i.e., love will become **love**. |
| \< | Beginning-of-word anchor | /\<love/ | Matches lines containing a word that begins with love. |
| \> | End-of-word anchor | /love\>/ | Matches lines containing a word that ends with love. |
| x\{m\} | Repetition of character x, m times | /o\{5\}/ | Matches lines containing 5 consecutive occurrences of o. |
| x\{m,\} | Repetition of character x, at least m times | /o\{5,\}/ | Matches lines containing at least 5 consecutive occurrences of o. |
| x\{m,n\} [a] | Repetition of character x, between m and n times | /o\{5,10\}/ | Matches lines containing between 5 and 10 consecutive occurrences of o. |


[a] Not dependable on all versions of UNIX or all pattern-matching utilities; usually works with vi and grep.

To change the regular expression delimiter, precede some character, say c, with a backslash, and then follow it with the regular expression and that same character. For example,


sed -n '/love/p' filename


prints all lines containing love.

To change the delimiter:


sed -n '\cREcp' filename


where c represents the character to delimit the regular expression (RE) in place of forward slashes.

Example 5.2.

1   % sed -n '/12\/10\/04/p' datafile

2   % sed -n '\x12/10/04xp' datafile      # sed lets you change the delimiter


EXPLANATION

  1. When forward slashes are part of the regular expression, they must be backslashed so they won't be confused with the forward slashes that delimit it.

  2. The forward slashes are replaced by the letter x. This makes it easier when the regular expression contains forward slashes.
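The two approaches in Example 5.2 can be compared side by side. This is a sketch under assumptions: the file dates.txt and its contents are hypothetical, and the comma is an arbitrary choice of delimiter. The last command relies on the s command accepting any character after the s as its delimiter, with no leading backslash.

```shell
# Build a small hypothetical input file containing dates with slashes.
tmp=$(mktemp -d)
printf '%s\n' 'paid 12/10/04' 'paid 01/02/05' > "$tmp/dates.txt"

# Default delimiter: every slash inside the RE must be backslashed.
escaped=$(sed -n '/12\/10\/04/p' "$tmp/dates.txt")

# Custom address delimiter: backslash, then the new delimiter (a comma here).
custom=$(sed -n '\,12/10/04,p' "$tmp/dates.txt")

# Substitution with # as the delimiter; the slashes need no escaping.
swapped=$(sed -n 's#12/10/04#12-10-04#p' "$tmp/dates.txt")

printf '%s\n' "$escaped" "$custom" "$swapped"
rm -rf "$tmp"
```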

If you recall, the grep command returns a zero exit status if a pattern is found in a file, and 1 if it is not found. The exit status of the sed command, however, is zero whether or not the pattern being searched for is found. The exit status is nonzero only when sed itself fails, such as when the command contains a syntax error. (See "Error Messages and Exit Status" on page 131.) In the following example, both grep and sed search for the regular expression, John, in a file.

Regular expressions can also be made part of an address as shown in "Addressing" on page 128.

Example 5.3.

1   % grep 'John' datafile         # grep searches for John

2   % echo $status

    1



3   % sed -n '/John/p' datafile    # sed searches for John

4   % echo $status

    0


EXPLANATION

  1. With grep, the regular expression John is not enclosed in a delimiter.

  2. The exit status of the grep command is zero if the pattern John was found, and nonzero if not. Here the status is 1, so John was not found.

  3. The sed command will print all lines containing the RE pattern, John.

  4. Even though the pattern John is not found in the file, the exit status is zero because the syntax was okay.
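The same comparison can be sketched for a Bourne-style shell, where the exit status is held in $? rather than the C shell's $status. The file names.txt and its contents are hypothetical. Because sed reports success even when nothing matches, the last step shows one possible workaround: testing whether sed printed anything.

```shell
# Build a small hypothetical input file that does not contain John.
tmp=$(mktemp -d)
printf '%s\n' 'Mary Smith' 'Bob Jones' > "$tmp/names.txt"

# grep: capture the status without aborting if the shell runs with set -e.
grep_status=0
grep 'John' "$tmp/names.txt" || grep_status=$?

# sed: the status reflects whether sed ran, not whether the RE matched.
sed_status=0
sed -n '/John/p' "$tmp/names.txt" || sed_status=$?

echo "grep: $grep_status, sed: $sed_status"

# Workaround: treat empty output from sed as "not found".
if [ -n "$(sed -n '/John/p' "$tmp/names.txt")" ]; then
    found=yes
else
    found=no
fi
echo "found: $found"
rm -rf "$tmp"
```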
