Tips n Tracks


Using Regular Expression Metacharacters with .NET – Lookahead and Lookbehind Zero-Width Assertions


Perl 5 introduced two very powerful constructs: "lookahead" and "lookbehind". Collectively, these are called "lookaround". Like other zero-width assertions, such as the start-of-line and word-boundary anchors, they match positions in the text rather than actual text. The difference is that lookarounds do examine characters, but then give up the match and return only the result: match or no match. That is why they are called "assertions". They do not consume characters in the string; they only assert whether a match is possible or not. Lookarounds allow you to create regular expressions that are impossible to write without them, or that would become very long-winded without them.
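To make the "does not consume characters" point concrete, here is a minimal sketch. It uses Python's re module purely as a stand-in (an assumption of this example, not something the article uses); the lookahead syntax shown is the same one the .NET Regex class accepts. The sample string "quit" is also only illustrative.

```python
import re

# Illustrative sample text (an assumption, not taken from the article).
text = "quit"

# The lookahead checks that "u" follows, but "u" is not part of the match.
m = re.search(r"q(?=u)", text)
matched = m.group()   # "q"
span = m.span()       # (0, 1): only one character was consumed

# After the lookahead, matching resumes right after "q",
# so the pattern's "i" is compared against "u" and the match fails.
no_match = re.search(r"q(?=u)i", text)   # None

# To get past the "u", the pattern has to match it explicitly.
full = re.search(r"q(?=u)ui", text).group()   # "qui"

print(matched, span, no_match, full)
```

The second search failing is the usual sign that an assertion is zero-width: it tests the text at a position without moving past it.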

Positive Lookaround

Positive lookarounds are the so-called lookahead and lookbehind assertions. They look for things that come before or after the current match without including them in the match. It is important to understand that these expressions match a position, like "^" or "\b", and never match any text. For this reason, they are known as "zero-width assertions". They are best illustrated by example:

"(?=exp)" is the "zero-width positive lookahead assertion". It matches a position in the text that precedes a given suffix, but doesn’t include the suffix in the match:

\b\w+(?=ing\b) The beginning of words ending with "ing" (the part before the "ing")
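The pattern above can be tried out with a short sketch. Python's re module is used here as a stand-in for the .NET Regex class (an assumption of this example; the pattern itself is copied from the text and uses syntax both engines share). The sample sentence is made up for illustration.

```python
import re

# Pattern from the text: a word's leading part, provided "ing" ends the word.
pattern = re.compile(r"\b\w+(?=ing\b)")

# Illustrative sample sentence (an assumption, not from the article).
text = "I am singing while you are dancing"

stems = pattern.findall(text)
print(stems)  # ['sing', 'danc']

# The suffix is still in the text right after the match; it was only asserted.
first = pattern.search(text)
following = text[first.end():first.end() + 3]
print(following)  # ing
```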

"(?<=exp)" is the "zero-width positive lookbehind assertion". It matches the position following a prefix, but doesn’t include the prefix in the match:

(?<=\bre)\w+\b The end of words starting with "re" (the part after the "re")
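A similar sketch for the lookbehind pattern, again using Python's re module as a stand-in for .NET (an assumption of this example). The sample sentence is hypothetical; "tree" is included to show that "re" in the middle of a word does not count, because the "\b" inside the lookbehind requires "re" to start the word.

```python
import re

# Pattern from the text: the rest of a word, provided the word starts with "re".
pattern = re.compile(r"(?<=\bre)\w+\b")

# Illustrative sample sentence (an assumption, not from the article).
text = "please rewrite and reload the tree"

tails = pattern.findall(text)
print(tails)  # ['write', 'load']

# The prefix sits just before the match start and is not part of the match.
first = pattern.search(text)
preceding = text[first.start() - 2:first.start()]
print(preceding)  # re
```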

Negative Lookaround

Earlier, we saw how to search for a character that is not a specific character or not a member of a character class. What if we simply want to verify that a character is not present, but don’t want to match anything? For example, what if we are searching for words in which the letter "q" is not followed by the letter "u"? We could try:

\b\w*q[^u]\w*\b Words with "q" followed by NOT "u"

Run the example and you will see that it fails when "q" is the last letter of a word, as in "Iraq". This is because "[^u]" always matches a character. If "q" is the last character of the word, "[^u]" will match the whitespace character that follows, so in the example the expression ends up matching two whole words. Negative lookaround solves this problem because it matches a position and does not consume any text. As with positive lookaround, it can also be used to match the position of an arbitrarily complex subexpression, rather than just a single character. We can now do a better job:

\b\w*q(?!u)\w*\b Search for words with "q" not followed by "u"

We used the "zero-width negative lookahead assertion", "(?!exp)", which succeeds only if the suffix "exp" is not present.
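The difference between the two attempts can be seen side by side in a small sketch. Python's re module stands in for the .NET Regex class here (an assumption of this example); both patterns are copied from the text, and the sample strings are made up for illustration.

```python
import re

# First attempt from the text: a negated character class after "q".
naive = re.compile(r"\b\w*q[^u]\w*\b")
# Improved version from the text: a negative lookahead after "q".
fixed = re.compile(r"\b\w*q(?!u)\w*\b")

# Illustrative sample text (an assumption, not from the article).
text = "Iraq exports oil, and the queen quit"

naive_hits = naive.findall(text)
fixed_hits = fixed.findall(text)
print(naive_hits)  # ['Iraq exports']  ([^u] consumed the space)
print(fixed_hits)  # ['Iraq']

# With "q" as the very last character there is nothing left for [^u] to match.
naive_at_end = naive.findall("Iraq")
fixed_at_end = fixed.findall("Iraq")
print(naive_at_end)  # []
print(fixed_at_end)  # ['Iraq']
```

In both runs "queen" and "quit" are skipped, as intended; only the handling of a word-final "q" differs.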

Table 1: The Four Types of Regular Expression Lookaround

Type                  Regex      Definition (or Use)
Positive Lookahead    (?=exp)    Matches any position followed by the suffix exp. Succeeds if exp can match to the right.
Positive Lookbehind   (?<=exp)   Matches any position preceded by the prefix exp. Succeeds if exp can match to the left.
Negative Lookahead    (?!exp)    Matches any position not followed by the suffix exp. Succeeds if exp cannot match to the right.
Negative Lookbehind   (?<!exp)   Matches any position not preceded by the prefix exp. Succeeds if exp cannot match to the left.
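
The four forms in Table 1 can be compared on one string in a brief sketch. As before, Python's re module is a stand-in for the .NET Regex class (an assumption of this example), and the sample text and the four patterns are hypothetical illustrations rather than patterns from the article.

```python
import re

# Illustrative sample text (an assumption, not from the article).
text = "cost $100 for 200 units"

results = {
    # Positive lookahead: digits followed by " units".
    "positive lookahead": re.findall(r"\d+(?= units)", text),
    # Positive lookbehind: digits preceded by "$".
    "positive lookbehind": re.findall(r"(?<=\$)\d+", text),
    # Negative lookahead: whole numbers NOT followed by " units".
    "negative lookahead": re.findall(r"\b\d+\b(?! units)", text),
    # Negative lookbehind: whole numbers NOT preceded by "$".
    "negative lookbehind": re.findall(r"\b(?<!\$)\d+\b", text),
}

for name, found in results.items():
    print(name, found)
```

The "\b" anchors in the two negative patterns are a deliberate choice: without them, the engine could satisfy the negative assertion by matching only part of a number, for example the "00" inside "$100".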

Chetan loves blogging. He regularly blogs at http://www.tipsntracks.com. You can connect with Chetan on Twitter, Facebook and Google Plus...
