Leverage Regular Expressions (Continued)

After you've created the Regex object, you can enumerate all occurrences of the searched pattern in a source string by iterating over the collection of results the Matches method returns:

Dim text As String = "Anne Bob Eric"
Dim m As Match
For Each m In re.Matches(text)
   Console.WriteLine("{0} at index " _
      & "{1}", m.Value, m.Index)
Next

Figure 1. The Regex Class Hierarchy is Small.

The previous code displays these lines in the console window:

Anne at index 0
Eric at index 9
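
For readers who want to run the loop on its own, here is a minimal, self-contained sketch. The search pattern is an assumption made for illustration (words that begin with an uppercase vowel); the pattern used to create the re object earlier in the article may differ. The module name is likewise hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module MatchesDemo
   Sub Main()
      ' Assumed pattern, for illustration only: a word boundary,
      ' an uppercase vowel, then any remaining word characters.
      Dim re As New Regex("\b[AEIOU]\w*")
      Dim text As String = "Anne Bob Eric"
      Dim m As Match
      ' Matches returns a MatchCollection; each Match exposes
      ' the matched text (Value) and its position (Index).
      For Each m In re.Matches(text)
         Console.WriteLine("{0} at index {1}", m.Value, m.Index)
      Next
   End Sub
End Module
```

With this assumed pattern, "Bob" is skipped because it doesn't start with a vowel, so the program prints the same two lines shown above.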

The System.Text.RegularExpressions namespace exposes only a small number of classes and methods (see Figure 1). The real challenge in putting regular expressions to good use is learning how to build the search pattern. Regular expressions in .NET are a superset of the regular expressions that VBScript 5.0 and later support, so you can leverage your prior knowledge of this topic. I've built an application that lets you test a regular expression against a text file (see Figure 2, and download the code here).

Figure 2. Test Your Regular Expressions.

Become Familiar With Regex Hierarchy
The most common and useful constructs you can use in the search pattern fall into a small number of categories (see Table 1). Character escapes provide a means to insert nonprintable characters in the search string; for example, you can use this sequence to search for the string "Visual Basic .NET" preceded by a tab character and followed by a carriage-return/line-feed pair:

\tVisual Basic \.NET\r\n
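
One way to try this pattern is a small console program along the following lines. Treat it as a sketch rather than part of the downloadable sample: the module name, variable names, and test strings are made up for illustration. Because VB string literals don't treat the backslash specially, the pattern reaches the Regex engine exactly as typed.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module EscapeDemo
   Sub Main()
      ' The pattern from the text: tab, the literal string, then CR-LF.
      Dim pattern As String = "\tVisual Basic \.NET\r\n"
      ' Hypothetical test string built with the VB constants.
      Dim source As String = "Title:" & vbTab & "Visual Basic .NET" & vbCrLf
      Console.WriteLine(Regex.IsMatch(source, pattern))   ' True

      ' The same pattern with the dot left unescaped also accepts
      ' any other character in that position.
      Dim loose As String = "\tVisual Basic .NET\r\n"
      Dim other As String = vbTab & "Visual Basic xNET" & vbCrLf
      Console.WriteLine(Regex.IsMatch(other, loose))      ' True
      Console.WriteLine(Regex.IsMatch(other, pattern))    ' False
   End Sub
End Module
```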

Notice that you need to escape the dot character, because an unescaped dot has a special meaning inside regular expressions: it matches any character. Constructs in the character-class category are also simple to grasp. For example, this sequence matches a nonalphanumerical character, followed by a letter, two more alphanumerical characters, a digit, and a white space, such as the ",ndx2 " or " A123 " sequences:

\W[A-Za-z]\w\w\d\s

Notice that the leading nonalphanumerical character and the trailing white space are part of the match—a detail that becomes important if you want to replace the found substring. (I'll cover replace functions later.)
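
The following sketch shows what that detail means in practice. The source string and module name are hypothetical, and the call to the Regex.Replace method is used here only to make the point visible.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CharClassDemo
   Sub Main()
      Dim re As New Regex("\W[A-Za-z]\w\w\d\s")
      ' Hypothetical source string containing the ",ndx2 " sequence.
      Dim source As String = "x,ndx2 y"

      Dim m As Match = re.Match(source)
      ' The brackets make the leading comma and trailing space visible.
      Console.WriteLine("[{0}] at index {1}, length {2}", _
         m.Value, m.Index, m.Length)
      ' Prints: [,ndx2 ] at index 1, length 6

      ' Replacing the match also replaces the comma and the space.
      Console.WriteLine(re.Replace(source, "#"))
      ' Prints: x#y
   End Sub
End Module
```

If you want to keep the surrounding characters, the pattern or the replacement string has to account for them.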
