How to use backreferences in regular expressions?

Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (abc) creates a single capturing group. The part of the input string that matches the capturing group is saved so it can be recalled later through a "backreference". For instance, when (abc) is applied to the input string abcdefg, the first three characters, abc, are saved for later recall.
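As a minimal sketch of how a capturing group saves its text, the following uses Python's re module. The choice of Python is an assumption, since the article names no particular language, and the pattern (abc)defg is an illustrative extension of the (abc) example above.

```python
import re

# (abc) is a single capturing group; "defg" is matched literally.
match = re.match(r"(abc)defg", "abcdefg")

# group(0) is the whole match; group(1) is the text saved by group number one.
whole = match.group(0)
saved = match.group(1)

print(whole)  # abcdefg
print(saved)  # abc
```

The saved text is what a backreference such as \1 would recall later in the same pattern.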

Backreferences enable the programmer to refer back to the saved matching strings. A backreference is specified in the regular expression as a backslash (\) followed by a digit indicating the number of the group to be recalled. Capturing groups are numbered by scanning the regular expression from left to right and counting the opening parentheses. The first opening parenthesis starts group number one, the second starts group number two, and so on. Non-capturing parentheses, such as groups beginning with (?:, are not counted. For example, the regex (a(?:bc)(de))uvw\2xyz\1 contains two capturing groups:

  1. (a(?:bc)(de))
  2. (de)

The (?:bc) is a non-capturing group and is not counted. To make this easier to understand, you can think of the regular expression (a(?:bc)(de))uvw\2xyz\1 as equivalent to (a(?:bc)(de))uvw(de)xyz(a(?:bc)(de)). This works here because every group contains only literal characters, so the text that a backreference repeats is always the same. The regex (a(?:bc)(de))uvw\2xyz\1 matches abcdeuvwdexyzabcde.
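The group numbering and the match described above can be checked with a short sketch, again assuming Python's re module. The second pattern, (\w)\1, is not from the article; it is a hypothetical example added to show that a backreference repeats the text that was captured, not the sub-pattern itself.

```python
import re

# The regex from the article: two capturing groups, one non-capturing group.
pattern = re.compile(r"(a(?:bc)(de))uvw\2xyz\1")
match = pattern.fullmatch("abcdeuvwdexyzabcde")

group_count = pattern.groups  # number of capturing groups in the pattern
group1 = match.group(1)       # text saved by (a(?:bc)(de))
group2 = match.group(2)       # text saved by (de)

print(group_count)  # 2
print(group1)       # abcde
print(group2)       # de

# Illustrative pattern: \1 must repeat exactly what (\w) captured.
repeat = re.compile(r"(\w)\1")
same_char = repeat.fullmatch("aa") is not None
other_char = repeat.fullmatch("ab") is not None

print(same_char)   # True
print(other_char)  # False
```

Replacing \1 with a copy of (\w) would accept "ab" as well, so expanding a backreference into its sub-pattern is only safe when the group can match exactly one string, as in the article's example.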
