
Searching a String

This example searches a given string for the existence of a pattern and returns all of the matching strings. This is very easy code, but it's such a useful little program that it's worthwhile to demonstrate it.

First, I need to decide exactly what I mean by "return." Return what? In this case, I decide to return an ArrayList of matching Strings, because I want the Strings to be in the order in which they were found, and an ArrayList maintains the order in which elements were inserted. Also, I like the idea of returning a well-defined data structure, in case the client wants to, say, step through that structure and examine the data further.

I also decide that I want the client to be able to pass in Pattern.compile flags such as Pattern.MULTILINE and Pattern.DOTALL. It doesn't really cost me anything in the way of additional complexity, and it's a nice feature for the client. At this point, it's worthwhile to get a preliminary method signature written down. I come up with this:

    public static ArrayList searchString(
        String content, String searchPattern, int flags
    ) throws IOException
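To make the purpose of the flags parameter concrete, here is a minimal sketch of how Pattern.compile flags change what a pattern matches, and how several flags combine with a bitwise OR. The class name FlagDemo and the helper countMatches are hypothetical names chosen for illustration; they are not part of the RegexUtil class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original listings.
class FlagDemo {

    // Counts how many times searchPattern matches content under the given flags.
    static int countMatches(String content, String searchPattern, int flags) {
        Matcher matcher = Pattern.compile(searchPattern, flags).matcher(content);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String content = "one\ntwo\nthree";

        // With no flags (0), ^ anchors only at the start of the whole input.
        System.out.println(countMatches(content, "^\\w+", 0));

        // With MULTILINE, ^ also anchors after each line terminator.
        System.out.println(countMatches(content, "^\\w+", Pattern.MULTILINE));

        // Flags are bit masks, so they can be combined with bitwise OR.
        // DOTALL lets . match the line terminators between "one" and "three".
        System.out.println(countMatches(
            content, "^one.*three$", Pattern.MULTILINE | Pattern.DOTALL));
    }
}
```

Because the flags travel as a single int, the searchString signature needs only one extra parameter no matter how many options the client combines.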

Now I'm ready to start writing my method. My first pass looks like Listing 5-5.

Listing 5-5: First Pass at the searchString Method
Start example
01     public static ArrayList searchString(
02         String content,
03         String searchPattern,
04         int flags
05     )
06     throws IOException
07     {
08         ArrayList retval = new ArrayList();
09         Pattern pattern = null;

10         //compile the pattern
11         if (flags > -1)
12         {
13             pattern = Pattern.compile(searchPattern, flags);
14         }
15         else
16         {
17             pattern = Pattern.compile(searchPattern);
18         }

19         //extract the matcher for the pattern
20         Matcher matcher = pattern.matcher(content);

21         //iterate through all of the matches, and add
22         //each one to the ArrayList
23         while (matcher.find())
24         {
25             //extract the matching string
26             String tmp = matcher.group();
27             //append the matching string
28             //to the list
29             retval.add(tmp);
30         }

31         return retval;
32     }

End example
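To try the first pass out, the method can be dropped into a small class and called from main. The following is a sketch under a few assumptions: it uses generics and the diamond operator, so it needs Java 7 or later rather than J2SE 1.4; it omits the throws clause for brevity; and the class name FirstPassDemo and the sample input are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class used only to exercise the first-pass logic.
class FirstPassDemo {

    // Same logic as the first pass, written with generics:
    // returns every matching substring, in the order found.
    static List<String> searchString(String content, String searchPattern, int flags) {
        List<String> retval = new ArrayList<>();

        // A negative flags value means "compile without flags".
        Pattern pattern = (flags > -1)
            ? Pattern.compile(searchPattern, flags)
            : Pattern.compile(searchPattern);

        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            retval.add(matcher.group());
        }
        return retval;
    }

    public static void main(String[] args) {
        String content = "cat bat CAT rat";

        // Case-insensitive search: prints [cat, bat, CAT]
        System.out.println(searchString(content, "[cb]at", Pattern.CASE_INSENSITIVE));

        // No flags: prints [cat, bat]
        System.out.println(searchString(content, "[cb]at", -1));
    }
}
```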

Listing 5-5 isn't terrible. It finds all the relevant matching substrings and returns them in order. I run a few sample tests and find that it works as expected. But it does leave something to be desired. It doesn't really tell me where the string was found, and it might be nice if it were overloaded, so the client isn't forced to pass in a flag if they don't need one.
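For reference, the kind of overload mentioned here could look like the sketch below. It is only an illustration: the two-argument method and the class name OverloadDemo are hypothetical, and the code assumes Java 7 or later for generics and the diamond operator. The two-argument version simply delegates with -1, which the three-argument version treats as "no flags."

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class showing one way to overload searchString.
class OverloadDemo {

    // Convenience overload: the client does not have to supply flags.
    static List<String> searchString(String content, String searchPattern) {
        return searchString(content, searchPattern, -1);
    }

    // Full version: a negative flags value compiles the pattern without flags.
    static List<String> searchString(String content, String searchPattern, int flags) {
        Pattern pattern = (flags > -1)
            ? Pattern.compile(searchPattern, flags)
            : Pattern.compile(searchPattern);

        List<String> retval = new ArrayList<>();
        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            retval.add(matcher.group());
        }
        return retval;
    }

    public static void main(String[] args) {
        // Prints [1, 2, 3]
        System.out.println(searchString("a1 b2 c3", "\\d"));
    }
}
```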

I decide that for this generation, the client can live without the overloading. However, I do think the client has a right to ask for the position at which each matching string was found. Thus, I modify the code so that it returns a Map. Each entry in the Map uses the starting character index of a match as its key (stored as a String or an Integer; I haven't decided which yet) and the matching substring as its value. Modifying the code, I come up with Listing 5-6. Apart from the new return type on line 1 and the dropped throws IOException clause, which was never needed because nothing in the method throws that exception, the only significant changes are on lines 7, 25, and 29. By the way, I decided to use a LinkedHashMap on line 7, because I wanted to preserve the order in which the matching Strings were found. LinkedHashMap is a J2SE 1.4 addition to the Map family that preserves the insertion order of its elements.

Listing 5-6: Modified searchString Method Belonging in the RegexUtil Class
Start example
01  public static Map searchString(
02      String content,
03      String searchPattern,
04      int flags
05  )
06  {
07      Map retval = new LinkedHashMap();
08      Pattern pattern = null;

09      //compile the pattern
10      if (flags > -1)
11      {
12          pattern = Pattern.compile(searchPattern, flags);
13      }
14      else
15      {
16          pattern = Pattern.compile(searchPattern);
17      }

18      //extract the matcher for the pattern
19      Matcher matcher = pattern.matcher(content);

20      //iterate through all of the matches, and add
21      //each one to the map
22      while (matcher.find())
23      {
24          //extract the match and its position
25          int position = matcher.start();
26          String tmp = matcher.group();
27          //insert the matching string and position
28          //into the map.
29          retval.put(String.valueOf(position), tmp);
30      }

31      return retval;
32  }
End example
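From the client's side, the returned Map can be walked in the order the matches were found, or queried by position. The sketch below assumes Java 7 or later for generics and the diamond operator, and the class name MapSearchDemo and the sample input are hypothetical. Note that matcher.start() reports a character index into the input, counted from zero.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class used to exercise the Map-returning version.
class MapSearchDemo {

    // Maps the start index of each match (as a String) to the matched text.
    static Map<String, String> searchString(String content, String searchPattern, int flags) {
        Map<String, String> retval = new LinkedHashMap<>();

        Pattern pattern = (flags > -1)
            ? Pattern.compile(searchPattern, flags)
            : Pattern.compile(searchPattern);

        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            retval.put(String.valueOf(matcher.start()), matcher.group());
        }
        return retval;
    }

    public static void main(String[] args) {
        Map<String, String> matches = searchString("cat bat rat", "[a-z]at", -1);

        // LinkedHashMap iterates in insertion order, so this prints
        // 0 -> cat, 4 -> bat, 8 -> rat, one per line.
        for (Map.Entry<String, String> entry : matches.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }

        // Keys are Strings, so a lookup by position uses the String form: prints bat
        System.out.println(matches.get("4"));
    }
}
```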

I decide to make the position a String, to make dealing with the output easier. I don't want to require the client to handle the keys too carefully, so Strings will do for now.
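For comparison, here is a sketch of the Integer-keyed alternative I considered. It is not the version used in RegexUtil; the class name IntegerKeyDemo is hypothetical, and the code assumes Java 7 or later for generics, autoboxing, and the diamond operator. The trade-off is that numeric keys make arithmetic on positions direct, while String keys are simpler to print or pass along as text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant that keys the result by Integer instead of String.
class IntegerKeyDemo {

    // Maps the start index of each match to the matched text.
    static Map<Integer, String> searchString(String content, String searchPattern, int flags) {
        Map<Integer, String> retval = new LinkedHashMap<>();

        Pattern pattern = (flags > -1)
            ? Pattern.compile(searchPattern, flags)
            : Pattern.compile(searchPattern);

        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            retval.put(matcher.start(), matcher.group());
        }
        return retval;
    }

    public static void main(String[] args) {
        Map<Integer, String> matches = searchString("cat bat", "[a-z]at", -1);

        // With numeric keys, the end offset of each match is simple arithmetic.
        for (Map.Entry<Integer, String> entry : matches.entrySet()) {
            int start = entry.getKey();
            int end = start + entry.getValue().length();
            System.out.println(entry.getValue() + " spans [" + start + ", " + end + ")");
        }
    }
}
```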

