"Everything, you see, makes sense, if you take the trouble to work out the rationale."
— Piers Anthony
Regular expressions, or regex for short, describe text. They are a mechanism by which you can tell the Java Virtual Machine (JVM) how to find and potentially manipulate text for you. In this chapter, I'll examine and contrast the traditional approach of describing text with the regex approach.
For example, imagine you need to validate e-mail addresses. The verbal directions for doing so might be something along the lines of "Make sure the e-mail address contains an at (@) symbol." You could probably handle this task with a single line of Java code:
if (email.indexOf("@") > 0) { return true; }
So far, so good. Suppose additional requirements creep in, though, as they invariably do. Now you also need to make sure that all e-mail addresses end with the .org extension. So you amend your code as follows:
if ((email.indexOf("@") > 0) && (email.endsWith(".org"))) { return true; }
But the requirements continue to creep. You now need all e-mail addresses to be of the form firstname_lastname, so you use the StringTokenizer to tokenize the e-mail address, extract the part before the @, look for the underscore (_) character, tokenize the strings around that, and so on. Pretty soon, you have some convoluted code for what should be a fairly straightforward operation.
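To make the point concrete, here is a rough sketch of where the hand-rolled approach might end up. The class name ManualEmailCheck and the helpers isValid and isLetters are hypothetical, and treating "firstname" and "lastname" as runs of letters is an assumption on my part rather than a stated requirement.

```java
import java.util.StringTokenizer;

public class ManualEmailCheck {

    // Hypothetical helper: checks firstname_lastname@something.org by hand.
    static boolean isValid(String email) {
        // The two original requirements: an @ past the first character, and a .org ending.
        if (email.indexOf("@") <= 0 || !email.endsWith(".org")) {
            return false;
        }

        // Tokenize around the @ and expect a local part plus a domain.
        StringTokenizer atTokens = new StringTokenizer(email, "@");
        if (atTokens.countTokens() != 2) {
            return false;
        }
        String localPart = atTokens.nextToken();

        // Tokenize the local part around the underscore and expect two names.
        // Note: StringTokenizer collapses repeated delimiters, so this check is loose.
        StringTokenizer nameTokens = new StringTokenizer(localPart, "_");
        if (nameTokens.countTokens() != 2) {
            return false;
        }
        return isLetters(nameTokens.nextToken()) && isLetters(nameTokens.nextToken());
    }

    // Assumption: a "name" is one or more letters.
    static boolean isLetters(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isLetter(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("john_smith@example.org"));
        System.out.println(isValid("johnsmith@example.org"));
    }
}
```

Even this version leaves gaps (it never inspects the domain beyond its ending), and each new requirement adds another branch.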
The use of regular expressions can greatly simplify and condense this process. With regular expressions, you could write the following:
In English, this means "Look for one or more letters, followed by an _, followed by one or more letters, followed by an @, followed by one or more letters, followed by .org." Notice that a period precedes the o in "org".
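A pattern that fits this description could be sketched as follows. The class name RegexEmailCheck and the method isValid are hypothetical, and using [A-Za-z] for "letters" is an assumption. The backslash before the period makes it match a literal dot rather than any character, and it is doubled because it sits inside a Java string literal.

```java
import java.util.regex.Pattern;

public class RegexEmailCheck {

    // One or more letters, an underscore, one or more letters, an @,
    // one or more letters, then a literal ".org".
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z]+_[A-Za-z]+@[A-Za-z]+\\.org");

    // Hypothetical helper: true only if the whole string fits the pattern.
    static boolean isValid(String email) {
        return EMAIL.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("john_smith@example.org"));  // prints true
        System.out.println(isValid("johnsmith@example.org"));   // prints false
        System.out.println(isValid("john_smith@exampleXorg"));  // prints false
    }
}
```

Because matches() tests the entire input, nothing may come before the first name or after .org.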
Don't be concerned if the syntax isn't completely clear to you right now; making it clear is the aim of this book. This chapter explores the underlying concepts of Java regex, with an emphasis on actually forming and using the regex syntax. It's a complete introduction to regular expressions, and it also serves as a preamble to the next chapter. Chapter 2, in turn, exhaustively documents the J2SE regex object model.