
3.2. Combining Regular Expression Metacharacters

Now that the basic regular expression metacharacters have been explained, they can be combined into more complex expressions. In each example below, the regular expression enclosed in forward slashes is the search string, and it is matched against every line of the text file.

Example 3.9.

Note: The line numbers are NOT part of the text file. The vertical bars mark the left and right margins.

   ---------------------------------------------------------------

1  |Christian Scott lives here and will put on a Christmas party.|

2  |There are around 30 to 35 people invited.                    |

3  |They are:                                                    |

4  |                                                          Tom|

5  |Dan                                                          |

6  |   Rhonda Savage                                             |

7  |Nicky and Kimberly.                                          |

8  |Steve, Suzanne, Ginger and Larry.                            |

   ---------------------------------------------------------------


EXPLANATION

  1. /^[A-Z]..$/ Will find all lines beginning with a capital letter, followed by exactly two characters of any kind, followed by a newline. Will find Dan on line 5.

  2. /^[A-Z][a-z ]*3[0-5]/ Will find all lines beginning with an uppercase letter, followed by zero or more lowercase letters or spaces, followed by the digit 3 and another digit between 0 and 5. Will find line 2.

  3. /[a-z]*\./ Will find lines containing zero or more lowercase letters, followed by a literal period. Because zero letters are allowed, any line that contains a period matches. Will find lines 1, 2, 7, and 8.

  4. /^ *[A-Z][a-z][a-z]$/ Will find a line that begins with zero or more spaces (tabs do not count as spaces), followed by an uppercase letter, two lowercase letters, and a newline. Will find Tom on line 4 and Dan on line 5.

  5. /^[A-Za-z]*[^,][A-Za-z]*$/ Will find a line that begins with zero or more uppercase and/or lowercase letters, followed by a single character that is not a comma, followed by zero or more upper- or lowercase letters and a newline. Will find line 5.
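The five searches above can be checked mechanically. What follows is a minimal sketch in Python, not the editor or grep session used in this chapter. It assumes that Python's re module treats these particular patterns the same way the UNIX utilities do, that the margin bars and line numbers have been stripped, and that Tom on line 4 is preceded only by spaces. The names lines, patterns, and matching_line_numbers are illustrative.

```python
import re

# The eight lines of the text file from Example 3.9, without the margin bars.
# Assumption: line 4 is padded with spaces (not tabs) so that Tom sits at the right margin.
lines = [
    "Christian Scott lives here and will put on a Christmas party.",
    "There are around 30 to 35 people invited.",
    "They are:",
    " " * 58 + "Tom",
    "Dan",
    "   Rhonda Savage",
    "Nicky and Kimberly.",
    "Steve, Suzanne, Ginger and Larry.",
]

# The five search strings, written without the enclosing slashes.
patterns = [
    r"^[A-Z]..$",                 # expected: line 5
    r"^[A-Z][a-z ]*3[0-5]",       # expected: line 2
    r"[a-z]*\.",                  # expected: lines 1, 2, 7, 8
    r"^ *[A-Z][a-z][a-z]$",       # expected: lines 4, 5
    r"^[A-Za-z]*[^,][A-Za-z]*$",  # expected: line 5
]


def matching_line_numbers(pattern):
    """Return the 1-based numbers of the lines in which the pattern is found."""
    return [n for n, line in enumerate(lines, start=1) if re.search(pattern, line)]


found = [matching_line_numbers(p) for p in patterns]

for pattern, numbers in zip(patterns, found):
    print(f"/{pattern}/ matches lines {numbers}")
```

Running the sketch prints one line per search string, which can be compared against the line numbers given in the explanation.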

3.2.1 More Regular Expression Metacharacters

The following metacharacters are not necessarily portable across all utilities using regular expressions, but can be used in the vi editor and some versions of sed and grep. There is an extended set of metacharacters available with egrep and awk, which will be discussed in later sections.

Example 3.10.

    (Beginning-of-word (\<) and end-of-word (\>) anchors)

    % vi textfile

    -------------------------------------------------------------

    Unusual occurrences happened at the fair.

--> Patty won fourth place in the 50 yard dash square and fair.

    Occurrences like this are rare.

    The winning ticket is 55222.

    The ticket I got is 54333 and Dee got 55544.

    Guy fell down while running around the south bend in his last event.

    ~

    ~

    ~

    /\<fourth\>/

    -------------------------------------------------------------


EXPLANATION

Will find the word fourth on any line that contains it. The \< is the beginning-of-word anchor and the \> is the end-of-word anchor. A word can be separated by spaces, end in punctuation, start at the beginning of a line, end at the end of a line, and so forth.
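The effect of the word anchors can be tried outside the editor. The sketch below uses Python, which has no \< or \> and instead writes both anchors as \b; treating \b as equivalent is an assumption that holds for a pattern such as this one, which starts and ends with a letter. The second test line is made up for illustration and is not part of textfile.

```python
import re

# \b is Python's word-boundary anchor, standing in for both \< and \> here.
word = re.compile(r"\bfourth\b")

line = "Patty won fourth place in the 50 yard dash square and fair."
hypothetical = "He spoke fourthly and lastly."  # made-up line, not in textfile

# fourth stands alone as a word in the first line ...
found_in_line = word.search(line) is not None

# ... but in fourthly the letters "ly" follow, so the end-of-word anchor fails.
found_in_hypothetical = word.search(hypothetical) is not None

print(found_in_line, found_in_hypothetical)
```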

Example 3.11.

    % vi textfile

    -------------------------------------------------------------

    Unusual occurrences happened at the fair.

--> Patty won fourth place in the 50 yard dash square and fair.

    Occurrences like this are rare.

    The winning ticket is 55222.

    The ticket I got is 54333 and Dee got 55544.

--> Guy fell down while running around the south bend in his last event.

    ~

    ~

    ~

    /\<f.*th\>/

    -------------------------------------------------------------


EXPLANATION

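Will find any word (or group of words) that begins with an f, is followed by zero or more of any character (.*), and ends with th. Because .* matches as much text as it can, the match may span several words, as it does on the last line of the file, where it runs from fell through south.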
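To see exactly which text such a search selects, the pattern can be run over the six lines of textfile. This is a sketch in Python, with \b standing in for the \< and \> anchors (an assumption about equivalence, reasonable here because the pattern begins and ends with a letter). The names textfile and matches are illustrative.

```python
import re

textfile = [
    "Unusual occurrences happened at the fair.",
    "Patty won fourth place in the 50 yard dash square and fair.",
    "Occurrences like this are rare.",
    "The winning ticket is 55222.",
    "The ticket I got is 54333 and Dee got 55544.",
    "Guy fell down while running around the south bend in his last event.",
]

# A word starting with f, then anything, then th at the end of a word.
pattern = re.compile(r"\bf.*th\b")

# Map each matching line number to the text that was actually matched.
matches = {}
for number, line in enumerate(textfile, start=1):
    m = pattern.search(line)
    if m:
        matches[number] = m.group(0)

for number, text in matches.items():
    print(number, repr(text))
```

On the second line the match is the single word fourth; on the sixth it stretches across several words, because .* keeps consuming characters up to the last th that ends a word.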

Example 3.12.

(Remembered patterns \( and \))

    % vi textfile (Before substitution)

    -------------------------------------------------------------

    Unusual occurences happened at the fair.

    Patty won fourth place in the 50 yard dash square and fair.

    Occurences like this are rare.

    The winning ticket is 55222.

    The ticket I got is 54333 and Dee got 55544.

    Guy fell down while running around the south bend in his last event.

    ~

    ~

    ~

1   :1,$s/\([Oo]ccur\)ence/\1rence/

    -------------------------------------------------------------





    % vi textfile (After substitution)

    -------------------------------------------------

--> Unusual occurrences happened at the fair.

    Patty won fourth place in the 50 yard dash square and fair.

--> Occurrences like this are rare.

    The winning ticket is 55222.

    The ticket I got is 54333 and Dee got 55544.

    Guy fell down while running around the south bend in his last event.

    ~

    ~

    ~

    -------------------------------------------------------------


EXPLANATION

  1. The editor searches for the entire string occurence (intentionally misspelled) or Occurence, and if it is found, the portion of the pattern enclosed in parentheses is tagged (i.e., either occur or Occur is tagged). Because this is the first pattern tagged, it is called tag 1, and the matched text is stored in a memory register called register 1. On the replacement side, the contents of the register are substituted for \1, and the rest of the word, rence, is appended to it. We started with occurence and ended up with occurrence. See Figure 3.1.

    Figure 3.1. Remembered patterns and tags.
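The same tagging idea can be sketched in Python. Two assumptions apply: Python writes the tagged group with plain parentheses rather than \( and \), and the bracketed class is meant to hold the letters O and o. The replacement side still uses \1. The names before and after are illustrative.

```python
import re

# The two misspelled lines from the "before" screen.
before = [
    "Unusual occurences happened at the fair.",
    "Occurences like this are rare.",
]

# ([Oo]ccur) is tag 1; \1 puts it back, and rence supplies the missing r.
after = [re.sub(r"([Oo]ccur)ence", r"\1rence", line) for line in before]

for line in after:
    print(line)
```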


Example 3.13.

  % vi textfile (Before substitution)

    -------------------------------------------------------------

    Unusual occurrences happened at the fair.

    Patty won fourth place in the 50 yard dash square and fair.

    Occurrences like this are rare.

    The winning ticket is 55222.

    The ticket I got is 54333 and Dee got 55544.

    Guy fell down while running around the south bend in his last event.

    ~

    ~

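    ~

1   :s/\(square\) and \(fair\)/\2 and \1/

    -------------------------------------------------------------





    % vi textfile (After substitution)

    -------------------------------------------------------------

    Unusual occurrences happened at the fair.

--> Patty won fourth place in the 50 yard dash fair and square.

    Occurrences like this are rare.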

    The winning ticket is 55222.

    The ticket I got is 54333 and Dee got 55544.

    Guy fell down while running around the south bend in his last event.

    ~

    ~

    ~

    -------------------------------------------------------------


EXPLANATION

  1. The editor searches for the regular expression square and fair, and tags square as 1 and fair as 2. On the replacement side, the contents of register 2 are substituted for \2 and the contents of register 1 are substituted for \1, so the two words trade places and the phrase becomes fair and square. See Figure 3.2.

    Figure 3.2. Using more than one tag.
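Swapping two tagged patterns can be sketched the same way in Python, again assuming plain parentheses in place of \( and \). The variable names are illustrative.

```python
import re

line = "Patty won fourth place in the 50 yard dash square and fair."

# (square) is tag 1 and (fair) is tag 2; the replacement emits them in reverse order.
swapped = re.sub(r"(square) and (fair)", r"\2 and \1", line)

print(swapped)
```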


Example 3.14.

(Repetition of patterns ( \{n\} ))

    % vi textfile

    -------------------------------------------

    Unusual occurrences happened at the fair.

    Patty won fourth place in the 50 yard dash square and fair.

    Occurrences like this are rare.

--> The winning ticket is 55222.

    The ticket I got is 54333 and Dee got 55544.

    Guy fell down while running around the south bend in his last event.

    ~

    ~

    ~

    ~

1   /5\{2\}2\{3\}\./

    -------------------------------------------------------------


EXPLANATION

  1. Searches for lines containing two occurrences of the number 5, followed by three occurrences of the number 2, followed by a literal period.
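The repetition count can also be tried in Python, with one syntax difference to keep in mind: Python writes the braces without backslashes, so \{2\} becomes {2}. That translation is an assumption of this sketch, and the names textfile and hits are illustrative.

```python
import re

textfile = [
    "Unusual occurrences happened at the fair.",
    "Patty won fourth place in the 50 yard dash square and fair.",
    "Occurrences like this are rare.",
    "The winning ticket is 55222.",
    "The ticket I got is 54333 and Dee got 55544.",
    "Guy fell down while running around the south bend in his last event.",
]

# Two 5s, then three 2s, then a literal period.
pattern = re.compile(r"5{2}2{3}\.")

hits = [line for line in textfile if pattern.search(line)]

for line in hits:
    print(line)
```

The line containing 55544 is not selected: it has the two 5s, but they are followed by a third 5 and two 4s rather than by three 2s.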
