< Free Open Study > |
4.8 Quiz Answers

4.7.1 Quiz Answer

Answer to the question in Section 4.2.2.

Remember, the regex is tried completely each time, so fat|cat|belly|your matches 'belly' in 'The dragging belly indicates your cat is too fat' rather than 'fat', even though fat is listed first among the alternatives. Sure, the regex could conceivably match fat and the other alternatives, but since they are not the earliest possible match (the match starting furthest to the left), they are not the one chosen. The entire regex is attempted completely from one spot before moving along the string to try again from the next spot, and in this case that means trying each alternative fat, cat, belly, and your at each position before moving on.

4.7.2 Quiz Answer

Answer to the question in Section 4.2.4.3.

When ^.*([0-9]+) is applied to 'Copyright 2003.', what is captured by the parentheses?

The desire is to get the last whole number, but it doesn't work. As before, .* is forced to relinquish some of what it had matched because the subsequent [0-9]+ requires a match to be successful. In this example, that means unmatching the final period and '3', which then allows [0-9] to match. That's governed by +, so matching just once fulfills its minimum, and now facing '.' in the string, it finds nothing else to match.

Unlike before, though, there's then nothing further that must match, so .* is not forced to give up the 0 or any other digits it might have matched. Were .* to do so, the [0-9]+ would certainly be a grateful and greedy recipient, but nope, first come first served. Greedy constructs give up something they've matched only when forced. In the end, $1 gets only '3'.

If this feels counter-intuitive, realize that [0-9]+ is at most one match away from [0-9]*, which is in the same league as .*. Substituting that into ^.*([0-9]+), we get ^.*(.*) as our regex, which looks suspiciously like the ^Subject:•(.*).* example from Section 4.2.4.2, where the second .* was guaranteed to match nothing.
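Both of the answers above are easy to check in any backtracking (traditional NFA) engine. The following is a minimal sketch that assumes Python's standard re module as the test bed, which is a choice made here for illustration and not a tool the text relies on. The variable names are hypothetical.

```python
import re

# Quiz 4.7.1: being listed first among the alternatives does not beat
# an earlier starting position in the string.
text = "The dragging belly indicates your cat is too fat"
first = re.search(r"fat|cat|belly|your", text)
print(first.group(0), first.start())  # belly 13

# Quiz 4.7.2: the greedy .* gives back only what it is forced to,
# so the capturing parentheses end up with a single digit.
m = re.search(r"^.*([0-9]+)", "Copyright 2003.")
print(repr(m.group(1)))  # '3'
print(repr(m.group(0)))  # 'Copyright 2003'
```

Python exposes the captured text as m.group(1) where the text writes $1.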
4.7.3 Quiz Answer

Answer to the question in Section 4.4.4.1.

When matching [0-9]* against 'a•1234•num', would 'a•1234•num' be part of a saved state?

The answer is "no." I posed this question because the mistake is commonly made. Remember, a component that has star applied can always match. If that's the entire regex, it can always match anywhere. This certainly includes the attempt when the transmission applies the engine the first time, at the start of the string. In this case, the regex matches at the very start of 'a•1234•num' and that's the end of it; it never even gets as far as the digits.

In case you missed this, there's still a chance for partial credit. Had there been something in the regex after the [0-9]* that kept an overall match from happening before the engine got to the digits, then indeed, the attempt of the '1' would also create the state in question.
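The "matches at the start and stops" behavior can be observed directly, even though the saved states themselves are internal to the engine. This is a small sketch assuming Python's re module, and assuming that the • in the examples stands for an ordinary space. The second pattern, with a literal ' num' appended, is a hypothetical illustration of "something in the regex after the [0-9]*".

```python
import re

subject = "a 1234 num"

# [0-9]* alone succeeds immediately at position 0 by matching nothing,
# so the engine never reaches the digits.
alone = re.search(r"[0-9]*", subject)
print(alone.span(), repr(alone.group(0)))  # (0, 0) ''

# With a required ' num' afterwards, the attempts at positions 0 and 1
# fail, and the transmission moves on until the digits are reached.
forced = re.search(r"[0-9]* num", subject)
print(forced.span(), repr(forced.group(0)))  # (2, 10) '1234 num'
```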
4.7.4 Quiz Answer

Answer to the question in Section 4.5.6.1.1.

What does (?>.*?)···. match?

It can never match anything. At best, it's a fairly complex way to accomplish nothing! *? is the lazy *, and governs a dot, so the first path it attempts is the skip-the-dot path, saving the try-the-dot state for later, if required. But the moment that state has been saved, it's thrown away because matching exits the atomic grouping, so the skip-the-dot path is the only one ever taken. If something is always skipped, it's as if it's not there at all.
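To see the effect in practice, here is a hedged sketch in Python. Python's re module accepts (?>...) directly only from version 3.11 onward, so the sketch substitutes an emulation built from a lookahead and a backreference, (?=(.*?))\1, which behaves atomically because the engine does not backtrack into a lookahead once it has succeeded. The names EMULATED and consumed_by_atomic_lazy are hypothetical.

```python
import re
import sys

# Emulated atomic group: the lookahead captures what .*? matches first
# (nothing), and \1 then consumes exactly that, leaving no saved state.
EMULATED = r"(?=(.*?))\1"

def consumed_by_atomic_lazy(text):
    """Return the text consumed by the emulated (?>.*?) at the start of text."""
    return re.match(EMULATED, text).group(0)

consumed = [consumed_by_atomic_lazy(s) for s in ["", "abc", "Copyright 2003."]]
print(consumed)  # ['', '', '']

# A plain lazy .*? can grow when forced; the atomic version cannot,
# so a following 'c' would have to sit at the very start of the string.
plain = re.match(r".*?c", "abc")
atomic = re.match(EMULATED + "c", "abc")
print(plain.group(0), atomic)  # abc None

# On Python 3.11+, the native syntax gives the same result.
if sys.version_info >= (3, 11):
    native = re.match(r"(?>.*?)c", "abc")
    print(native)  # None
```

In every case the group consumes no text, which is the sense in which it "can never match anything."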