4.7 Summary

If you understood everything in this chapter the first time you read it, you probably didn't need to read it in the first place. It's heady stuff, to say the least. It took me quite a while to understand it, and then longer still to really understand it. I hope this one concise presentation makes it easier for you. I've tried to keep the explanation simple without falling into the trap of oversimplification (an unfortunately all-too-common occurrence which hinders true understanding).

This chapter has a lot in it, so I've included a lot of section references in the following summary, for when you'd like to quickly check back on something.

There are two underlying technologies commonly used to implement a regex match engine, "regex-directed NFA" (see Section 4.3.1) and "text-directed DFA" (see Section 4.3.2). The abbreviations are spelled out in Section 4.3.3. Combine the two technologies with the POSIX standard (see Section 4.6.2), and for practical purposes, there are three types of engines: Traditional NFA, POSIX NFA, and DFA (POSIX or not).
To get the most out of a utility, you need to understand which type of engine it uses, and craft your regular expressions appropriately. The most common type is the Traditional NFA, followed by the DFA. Table 4-1 lists a few common tools and their engine types, and the section "Testing the Engine Type" (see Section 4.1.4) shows how you can test the type yourself.

One overriding rule regardless of engine type: matches starting sooner take precedence over matches starting later. This is due to how the engine's "transmission" tests the regex at each point in the string (see Section 4.2.2).

For the match attempt starting at any given spot:

DFA Text-Directed Engines
NFA Regex-Directed Engines
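
As a rough illustration of two points from this summary (earlier-starting matches win, and the engine type can be probed with an ordered-alternation test), here is a minimal sketch using Python's built-in `re` module, which is a backtracking, regex-directed engine. The helper names `first_match` and `looks_like_traditional_nfa` and the sample strings are assumptions chosen for this example, not something taken from the chapter.

```python
import re


def first_match(pattern, text):
    """Return (start_index, matched_text) for the first match, or None."""
    m = re.search(pattern, text)
    return (m.start(), m.group()) if m else None


def looks_like_traditional_nfa():
    """Probe the engine with an alternation whose first branch is a prefix
    of the second. An engine with ordered alternation stops at 'nfa';
    a DFA or POSIX NFA would report the longer 'nfa not'."""
    m = re.search(r"nfa|nfa not", "nfa not")
    return m.group() == "nfa"


# 'belly' is the third alternative, but it starts earliest in the text,
# so it is the match that gets reported.
sample = "The dragging belly indicates your cat is too fat"
print(first_match(r"fat|cat|belly|your", sample))

# Python's re uses ordered alternation, so this prints True.
print(looks_like_traditional_nfa())
```

The order of the alternatives does not change the first result: the engine tries every alternative at one starting position before moving on to the next, so whichever word appears first in the text is the one reported.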
Understanding the concepts and practices covered in this chapter is the foundation for writing correct and efficient regular expressions, which just happens to be the subject of the next two chapters.