
Chapter 7. Perl

Perl has been featured prominently in this book, and with good reason. It is popular, extremely rich with regular expressions, freely and readily obtainable, easily approachable by the beginner, and available for a remarkably wide variety of platforms, including pretty much all flavors of Windows, Unix, and the Mac.

Some of Perl's programming constructs superficially resemble those of C or other traditional programming languages, but the resemblance stops there. The way you wield Perl to solve a problem — The Perl Way — is different from traditional languages. The overall layout of a Perl program often uses traditional structured and object-oriented concepts, but data processing often relies heavily on regular expressions. In fact, I believe it is safe to say that regular expressions play a key role in virtually all Perl programs. This includes everything from huge 100,000-line systems, right down to simple one-liners, like

     % perl -pi -e 's{([-+]?\d+(\.\d*)?)F\b}{sprintf "%.0fC",($1-32)*5/9}eg' *.txt

which goes through *.txt files and replaces Fahrenheit values with Celsius ones (reminiscent of the first example from Chapter 2).
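To make the one-liner easier to read, here is a minimal sketch of the same substitution written out as a small script. The subroutine name fahrenheit_to_celsius and the sample sentence are assumptions made for illustration; only the regex and the sprintf replacement come from the one-liner. The /x modifier is added so the pattern can be spread out and commented; /e (evaluate the replacement as Perl code) and /g (replace every match) are the same as above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the text): applies the one-liner's
# substitution to a single string and returns the converted copy.
sub fahrenheit_to_celsius {
    my ($text) = @_;
    $text =~ s{
        ( [-+]? \d+ (\.\d*)? )   # capture 1: optional sign, digits, optional fraction
        F \b                     # a literal F that ends a word
    }{
        sprintf "%.0fC", ($1 - 32) * 5 / 9
    }egx;
    return $text;
}

# Sample input, made up for this sketch.
print fahrenheit_to_celsius("Water boils at 212F and freezes at 32F."), "\n";
```

The -p and -i switches of the one-liner wrap this same substitution in a loop over each line of each file and write the result back in place; the sketch leaves that part out and works on a string instead.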

In This Chapter

This chapter looks at everything regex about Perl,[1] including details of its regex flavor and the operators that put them to use. This chapter presents the regex-relevant details from the ground up, but I assume that you have at least a basic familiarity with Perl. (If you've read Chapter 2, you're already familiar enough to at least start using this chapter.) I'll often use, in passing, concepts that have not yet been examined in detail, and I won't dwell much on non-regex aspects of the language. It might be a good idea to keep the Perl documentation handy, or perhaps O'Reilly's Programming Perl.

[1] This book covers features of Perl as of Version 5.8.

Perhaps more important than your current knowledge of Perl is your desire to understand more. This chapter is not light reading by any measure. Because it's not my aim to teach Perl from scratch, I am afforded a luxury that general books about Perl do not have: I don't have to omit important details in favor of weaving one coherent story that progresses unbroken through the whole chapter. Some of the issues are complex, and the details thick; don't be worried if you can't take it all in at once. I recommend first reading the chapter through to get the overall picture, and returning in the future to use it as a reference as needed.

To help guide your way, here's a quick rundown of how this chapter is organized:

  • "Perl's Regex Flavor" (see Section 7.2) looks at the rich set of metacharacters supported by Perl regular expressions, along with additional features afforded to raw regex literals.

  • "Regex Related Perlisms" (see Section 7.3) looks at some aspects of Perl that are of particular interest when using regular expressions. Dynamic scoping and expression context are covered in detail, with a strong bent toward explaining their relationship with regular expressions.

  • Regular expressions are not useful without a way to apply them, so the following sections provide all the details of Perl's sometimes magical regex controls:

    "The qr/···/ Operator and Regex Objects" (see Section 7.4)

    "The Match Operator" (see Section 7.5)

    "The Substitution Operator" (see Section 7.6)

    "The Split Operator" (see Section 7.7)

  • "Fun with Perl Enhancements" (see Section 7.8) goes over a few Perl-only enhancements to Perl's regular-expression repertoire, including the ability to execute arbitrary Perl code during the application of a regular expression.

  • "Perl Efficiency Issues" (see Section 7.9) delves into an area close to every Perl programmer's heart. Perl uses a Traditional NFA match engine, so you can feel free to start using all the techniques from Chapter 6 right away. There are, of course, Perl-specific issues that can greatly affect in what way, and how quickly, Perl applies your regexes. We'll look at them here.
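As a rough preview of the four regex controls named in the rundown above (qr/···/, match, substitution, and split), the following sketch applies each one to the same string. The sample data and variable names are made up for illustration, and the sketch shows only the most basic form of each operator.

```perl
use strict;
use warnings;

# Illustrative data only; none of these names come from the text.
my $line = "temp=72F; humidity=40; wind=12";

# qr/.../ builds a regex object that can be reused or interpolated.
my $pair = qr/(\w+)=(\d+)/;

# Match operator: reports success and sets $1 and $2 from the first match.
if ($line =~ $pair) {
    print "first pair: $1 = $2\n";
}

# Substitution operator: copy the line, then upper-case every key.
(my $shouted = $line) =~ s/(\w+)=/\U$1\E=/g;
print "$shouted\n";

# Split operator: break the line on a semicolon plus optional whitespace.
my @fields = split /;\s*/, $line;
print scalar(@fields), " fields\n";
```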

Perl in Earlier Chapters

Perl is touched on throughout most of this book:

  • Chapter 2 contains an introduction to Perl, with many regex examples.

  • Chapter 3 contains a section on Perl history (see Section 3.1.1.7), and touches on numerous regex-related issues that apply to Perl, such as character-encoding issues (including Unicode; see Section 3.3.2), match modes (see Section 3.3.3), and a long overview of metacharacters (see Section 3.4).

  • Chapter 4 is a key chapter that demystifies the Traditional NFA match engine found in Perl. Chapter 4 is extremely important to Perl users.

  • Chapter 5 contains many examples, discussed in the light of Chapter 4. Many of the examples are in Perl, but even those not presented in Perl apply to Perl.

  • Chapter 6 is an important chapter to the user of Perl interested in efficiency.

In the interest of clarity for those not familiar with Perl, I often simplified Perl examples in these earlier chapters, writing in as much of a self-documenting pseudo-code style as possible. In this chapter, I'll try to present examples in a more Perlish style of Perl.
