What are differences among greedy, reluctant, and possessive quantifiers in Java patterns?

Greedy, reluctant, and possessive quantifiers all specify how many times an expression may repeat; they differ in how the matcher chooses among the allowed repetition counts. A greedy quantifier first matches as many repetitions as possible and gives characters back (backtracks) only if the rest of the pattern cannot otherwise match. A reluctant quantifier first matches as few repetitions as possible and takes more only when the rest of the pattern cannot otherwise match. A possessive quantifier matches as many repetitions as possible, like a greedy one, but never gives any of them back, even if that makes the overall match fail.

Greedy    Reluctant   Possessive   Meaning
X?        X??         X?+          X, once or not at all
X*        X*?         X*+          X, zero or more times
X+        X+?         X++          X, one or more times
X{n}      X{n}?       X{n}+        X, exactly n times
X{n,}     X{n,}?      X{n,}+       X, at least n times
X{n,m}    X{n,m}?     X{n,m}+      X, at least n but not more than m times
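To make the three column forms concrete, here is a minimal sketch that applies the bounded quantifier a{2,3} in each form to a short input. The class name QuantifierForms, the helper firstMatch, and the input "aaaa" are illustrative assumptions, not part of the original FAQ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class QuantifierForms {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "aaaa";
        System.out.println(firstMatch("a{2,3}", input));   // greedy: takes three, prints aaa
        System.out.println(firstMatch("a{2,3}?", input));  // reluctant: stops at two, prints aa
        System.out.println(firstMatch("a{2,3}+", input));  // possessive: takes three, prints aaa
    }
}
```

When nothing follows the quantifier, the greedy and possessive forms return the same text; they differ only once a later part of the pattern needs characters back.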

For example, take the string xxfoooooooooofoo and see how the greedy, reluctant, and possessive quantifiers differ in their matching strategy:

Given the regex (\w)*(.foo), the first part (\w)* is greedy, so it consumes the entire string. At this point the second part (.foo) cannot match because no input is left, and the overall expression cannot succeed. The matcher therefore backs off one character at a time until the rightmost occurrence of ofoo (which matches the second part .foo) has been given back, at which point the match succeeds and the search ends. The matched string is xxfoooooooooofoo.
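The greedy walk-through can be checked with java.util.regex. This is a small sketch; the class name GreedyDemo and the method match are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class GreedyDemo {
    // Returns {whole match, text captured by the second group}, or null if nothing matches.
    static String[] match(String input) {
        Matcher m = Pattern.compile("(\\w)*(.foo)").matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(), m.group(2) };
    }

    public static void main(String[] args) {
        String[] result = match("xxfoooooooooofoo");
        System.out.println(result[0]); // the whole input string
        System.out.println(result[1]); // ofoo, the part handed back to (.foo)
    }
}
```

Group 2 shows exactly which four characters the greedy part had to give up before the overall match could succeed.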

Given the regex (\w)*?(.foo), the first part (\w)*? is reluctant, so it starts by consuming nothing. Because the first four characters (xxfo) do not match .foo, the matcher is forced to swallow the first letter (an "x"); .foo then matches xfoo, which produces the first match, xxfoo. The matcher continues from the end of that match until the entire string is exhausted and finds a second match, oooooooofoo.
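Because the reluctant pattern yields more than one match, a find() loop is the natural way to observe it. The sketch below collects every match; ReluctantDemo and findAll are names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class ReluctantDemo {
    // Collects every successive match of the reluctant pattern in input.
    static List<String> findAll(String input) {
        Matcher m = Pattern.compile("(\\w)*?(.foo)").matcher(input);
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Prints one match per line: first xxfoo, then the rest of the input.
        for (String match : findAll("xxfoooooooooofoo")) {
            System.out.println(match);
        }
    }
}
```

Each call to find() resumes where the previous match ended, which is why the second match starts right after xxfoo.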

Given the regex (\w)*+(.foo), the first part (\w)*+ is possessive and consumes the entire string, so nothing is left for the second part .foo. Unlike a greedy quantifier, a possessive quantifier never backtracks, so the attempt fails as soon as .foo fails to match; retrying from later start positions fails the same way, and no match is found. If we change the string to xxfoooooooooo=foo, the first part grabs xxfoooooooooo (it stops at "=", which is not a word character) and the second part .foo matches the remaining =foo. The match result is xxfoooooooooo=foo. Use a possessive quantifier when you want to seize all of something without ever backing off; it will outperform the equivalent greedy quantifier in cases where the match is not immediately found.
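The two possessive cases, the original string and the variant containing "=", can be compared side by side. This is a sketch only; PossessiveDemo and firstMatch are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class PossessiveDemo {
    // Returns the first match of the possessive pattern, or null if there is none.
    static String firstMatch(String input) {
        Matcher m = Pattern.compile("(\\w)*+(.foo)").matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // All word characters: (\w)*+ keeps everything, so (.foo) never gets input.
        System.out.println(firstMatch("xxfoooooooooofoo"));   // null
        // "=" is not a word character, so (\w)*+ stops before it and (.foo) matches "=foo".
        System.out.println(firstMatch("xxfoooooooooo=foo"));  // the whole input string
    }
}
```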

