5.4. Regular Expressions
Sed, like grep, searches for patterns in files using regular expressions (REs) and a variety of metacharacters shown in Table 5.3 on page 132. Regular expressions are patterns of characters enclosed in forward slashes for searches and substitutions.
sed -n '/RE/p' filename
sed -n '/love/p' filename
sed 's/RE/replacement string/' filename
sed 's/love/like/' filename
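As a minimal sketch of these two forms, the following Bourne-style shell commands build a small scratch file and run a search and a substitution against it. The file contents and variable names are invented for illustration only. Note that -n suppresses sed's default output, so the result of a substitution is displayed only if -n is omitted or the p flag is appended to the s command.

```shell
# Create a scratch file; its contents are made up for this illustration.
tmpfile=$(mktemp)
printf '%s\n' 'I love sed' 'grep is fine' 'we love awk' > "$tmpfile"

# Search: -n suppresses default output, p prints only the matching lines.
matched=$(sed -n '/love/p' "$tmpfile")
echo "$matched"

# Substitute: without -n, every line is printed, changed or not.
replaced=$(sed 's/love/like/' "$tmpfile")
echo "$replaced"

# With -n, the p flag prints only the lines on which a substitution occurred.
changed=$(sed -n 's/love/like/p' "$tmpfile")
echo "$changed"

rm -f "$tmpfile"
```

The original file is never modified; sed writes its results to standard output.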
Table 5.3. sed's Regular Expression Metacharacters

| Metacharacter | Function | Example | What It Matches |
|---|---|---|---|
| ^ | Beginning-of-line anchor | /^love/ | Matches all lines beginning with love. |
| $ | End-of-line anchor | /love$/ | Matches all lines ending with love. |
| . | Matches any one character except the newline character | /l..e/ | Matches lines containing an l, followed by any two characters, followed by an e. |
| * | Matches zero or more occurrences of the preceding character | / *love/ | Matches lines containing zero or more spaces followed by the pattern love. |
| [ ] | Matches one character in the set | /[Ll]ove/ | Matches lines containing love or Love. |
| [^ ] | Matches one character not in the set | /[^A-KM-Z]ove/ | Matches lines containing any character other than A through K or M through Z, followed by ove. |
| \(..\) | Saves matched characters | s/\(love\)able/\1er/ | Tags the marked portion and saves it as tag number 1; use \1 to reference it later. Up to nine tags may be used, numbered from the leftmost part of the pattern. Here love is saved in register 1 and recalled in the replacement string, so loveable is replaced with lover. |
| & | Recalls the matched search string in the replacement string | s/love/**&**/ | The ampersand represents the search string. The string love is replaced with itself surrounded by asterisks; i.e., love becomes **love**. |
| \< | Beginning-of-word anchor | /\<love/ | Matches lines containing a word that begins with love. |
| \> | End-of-word anchor | /love\>/ | Matches lines containing a word that ends with love. |
| x\{m\} | Repetition of character x exactly m times | /o\{5\}/ | Matches lines containing 5 consecutive occurrences of o. |
| x\{m,\} | Repetition of character x at least m times | /o\{5,\}/ | Matches lines containing at least 5 consecutive occurrences of o. |
| x\{m,n\} | Repetition of character x between m and n times | /o\{5,10\}/ | Matches lines containing between 5 and 10 consecutive occurrences of o. |
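A few of these metacharacters can be tried directly from the command line. The sketch below feeds a handful of invented sample lines to sed through a pipe; the sample text and the variable names are assumptions made for this illustration. The word anchors \< and \> are left out because not every version of sed supports them.

```shell
# Sample input is invented for this illustration.
input='love
Love story
glove
lovely day
loveable
loooooove'

# ^ anchors the match at the start of the line.
starts=$(printf '%s\n' "$input" | sed -n '/^love/p')

# [Ll] accepts either an uppercase or a lowercase first letter.
either=$(printf '%s\n' "$input" | sed -n '/^[Ll]ove/p')

# \(...\) saves "love"; \1 recalls it in the replacement string.
tagged=$(printf '%s\n' "$input" | sed -n 's/\(love\)able/\1er/p')

# & stands for the entire matched string.
starred=$(printf '%s\n' "$input" | sed -n 's/^love$/**&**/p')

# \{5,\} requires at least five consecutive o characters.
repeated=$(printf '%s\n' "$input" | sed -n '/o\{5,\}/p')

printf '%s\n' "$starts" "$either" "$tagged" "$starred" "$repeated"
```

Each command uses -n together with p so that only the lines affected by the pattern are shown.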
To change the regular expression delimiter, precede the new delimiter character, say c, with a backslash, and follow the regular expression with that same character. With the default forward-slash delimiters,
sed -n '/love/p' filename
prints all lines containing love.
To change the delimiter:
sed -n '\cREcp' filename
where c represents the character to delimit the regular expression (RE) in place of forward slashes.
Example 5.2.
% sed -n '/12\/10\/04/p' datafile
% sed -n '\x12/10/04xp' datafile    # sed lets you change the delimiter
EXPLANATION
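When forward slashes are part of the regular expression, as in the first command, each one must be preceded by a backslash so that it is not confused with the forward slashes that delimit the expression. In the second command, the delimiting forward slashes are replaced by the letter x, so the slashes inside the regular expression need no escaping. This makes the command easier to read when the regular expression contains forward slashes.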
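The two forms can be compared side by side with a short sketch. The input lines and variable names here are hypothetical. The last command goes slightly beyond the example above: in a substitution, the character that follows s serves as the delimiter, so no leading backslash is needed there.

```shell
# Hypothetical lines containing slash-separated dates.
data='order 12/10/04 shipped
order 01/02/05 pending'

# Default delimiter: every slash inside the RE must be escaped.
escaped=$(printf '%s\n' "$data" | sed -n '/12\/10\/04/p')

# Custom delimiter for a search: a backslash, then the chosen character (x).
custom=$(printf '%s\n' "$data" | sed -n '\x12/10/04xp')

# In a substitution, the character after s becomes the delimiter (# here).
subst=$(printf '%s\n' "$data" | sed 's#12/10/04#2004-12-10#')

printf '%s\n' "$escaped" "$custom" "$subst"
```

Choose a delimiter character that does not occur in the regular expression or the replacement string; otherwise it must itself be escaped.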
Regular expressions can also be made part of an address, as shown in "Addressing" on page 128.
If you recall, the grep command returns a zero exit status if a pattern is found in a file, and 1 if it is not found. The exit status of the sed command, however, will be zero, whether or not the pattern being searched for is found. The only time the exit status will be nonzero is when the command contains a syntax error. (See "Error Messages and Exit Status" on page 131.) In the following example, both grep and sed search for the regular expression, John, in a file.
Example 5.3.
% grep 'John' datafile    # grep searches for John
% echo $status
1
% sed -n '/John/p' datafile    # sed searches for John
% echo $status
0
EXPLANATION
With grep, the regular expression John is not enclosed in delimiters. The exit status of the grep command is zero if the pattern John is found and nonzero if it is not; here it is 1, so John does not appear in datafile. The sed command prints all lines containing the regular expression John, which in this case is none. Even though the pattern is not found in the file, sed's exit status is zero because the command's syntax is correct.
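The same comparison can be sketched in a Bourne-style shell, where the exit status is held in $? rather than the C shell's $status. The scratch file and its contents are invented for this illustration. Because sed's exit status does not reveal whether a match occurred, one possible workaround, shown at the end, is to test whether sed produced any output.

```shell
# Scratch file with made-up contents; it does not contain "John".
datafile=$(mktemp)
printf '%s\n' 'Betty Boop' 'Igor Chevsky' > "$datafile"

# grep: the exit status is 1 because no line matches.
grep_status=0
grep 'John' "$datafile" || grep_status=$?

# sed: the exit status is 0 even though nothing matched or was printed.
sed -n '/John/p' "$datafile"
sed_status=$?

# To branch on a match when using sed, test its output instead.
if [ -n "$(sed -n '/John/p' "$datafile")" ]; then
    found=yes
else
    found=no
fi

echo "grep: $grep_status  sed: $sed_status  found: $found"
rm -f "$datafile"
```

When only a yes-or-no answer is needed, grep is usually the simpler tool, since its exit status already carries that information.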