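A couple of days ago I interviewed a programmer who said he had used regular expressions at work and knew them fairly well. When I asked which namespace you need to import to use regular expressions and which classes are involved, he hemmed and hawed and could not answer, and he could not write a regular expression to validate a phone number either. I really wonder how a programmer like that got through two or three years on the job.
Back to the topic: here is a short introduction to using regular expressions in .NET.
To use regular expressions in .NET, import the System.Text.RegularExpressions namespace. Create a regular expression object: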
Regex regex = new Regex(pattern);
Match the regular expression against a string:
Match match = regex.Match(input);
MatchCollection matches = regex.Matches(input);
bool isMatch = regex.IsMatch(input);
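Match returns the first match as a single Match object (check its Success property to see whether anything matched). Matches returns a MatchCollection containing every non-overlapping match. IsMatch returns a bool that says whether the pattern matches anywhere in the input string.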
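The difference between the three methods is easiest to see side by side. The following is a minimal sketch, not taken from the original text: the class name MatchMethodsDemo, its helper methods, the \d+ pattern, and the sample input are all hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchMethodsDemo
{
    // Hypothetical pattern for illustration: one or more digits.
    private static readonly Regex Digits = new Regex(@"\d+");

    // Match: returns the first match only. When nothing matches,
    // Success is false and Value is the empty string.
    public static string FirstNumber(string input)
    {
        Match match = Digits.Match(input);
        return match.Value;
    }

    // Matches: returns every non-overlapping match.
    public static int CountNumbers(string input)
    {
        MatchCollection matches = Digits.Matches(input);
        return matches.Count;
    }

    // IsMatch: only reports whether the pattern occurs at all.
    public static bool HasNumber(string input)
    {
        return Digits.IsMatch(input);
    }

    public static void Main()
    {
        string input = "order 12 ships 3 boxes on day 45";
        Console.WriteLine(FirstNumber(input));          // 12
        Console.WriteLine(CountNumbers(input));         // 3
        Console.WriteLine(HasNumber(input));            // True
        Console.WriteLine(HasNumber("no digits here")); // False
    }
}
```

If all you need is a yes/no answer, IsMatch is the natural choice, since it does not require you to inspect a Match object afterwards.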
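The Match class holds the result of one match. Its Groups property is a GroupCollection of Group objects.
A Group represents the result of a single capturing group. Because of quantifiers, a capturing group can capture zero, one, or more substrings within a single match, so Group exposes a collection of Capture objects.
A Capture represents one substring from a single successful capture.
Group inherits from Capture, and as a Capture it represents the last substring captured by its group.
That is, for an instance group of the Group class: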
int captureCount = group.Captures.Count;
group.Value is equal to group.Captures[captureCount - 1].Value, provided the group captured at least once (captureCount > 0).
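To make the relationship between Group and Capture concrete, here is a small sketch in which one capturing group sits inside a quantifier and therefore captures several times within a single match. The class name GroupCaptureDemo, the MatchWords helper, the pattern, and the input string are hypothetical, chosen only to illustrate the point.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupCaptureDemo
{
    // Hypothetical pattern: group 1, (\w+), is repeated by the outer "+",
    // so it captures once per word within a single match.
    public static Group MatchWords(string input)
    {
        Match match = Regex.Match(input, @"^(?:(\w+)\s?)+$");
        return match.Groups[1];
    }

    public static void Main()
    {
        Group group = MatchWords("one two three");

        // Every substring the group captured, in order.
        foreach (Capture capture in group.Captures)
        {
            Console.WriteLine("[{0}] at {1}", capture.Value, capture.Index);
        }
        // [one] at 0
        // [two] at 4
        // [three] at 8

        // The Group itself reports only the last capture.
        int captureCount = group.Captures.Count;
        Console.WriteLine(captureCount);  // 3
        Console.WriteLine(group.Value);   // three
        Console.WriteLine(group.Value == group.Captures[captureCount - 1].Value); // True
    }
}
```

When a group can repeat and you need every repetition, read group.Captures instead of group.Value.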
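Here are a few examples of regular expressions in use.
Use a regular expression to check whether a string has the correct format for a currency value.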
using System;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main ()
    {
        // Define a regular expression for currency values:
        // optional minus sign, one or more digits, then optionally
        // a decimal point followed by exactly two digits.
        Regex rx = new Regex(@"^-?\d+(\.\d{2})?$");

        // Define some test strings.
        string[] tests = {"-42", "19.99", "0.001", "100 USD",
                          ".34", "0.34", "1,052.21"};

        // Check each test string against the regular expression.
        foreach (string test in tests)
        {
            if (rx.IsMatch(test))
            {
                Console.WriteLine("{0} is a currency value.", test);
            }
            else
            {
                Console.WriteLine("{0} is not a currency value.", test);
            }
        }
    }
}
// The example displays the following output to the console:
// -42 is a currency value.
// 19.99 is a currency value.
// 0.001 is not a currency value.
// 100 USD is not a currency value.
// .34 is not a currency value.
// 0.34 is a currency value.
// 1,052.21 is not a currency value.
Use a regular expression to find repeated words in a string.
using System;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main ()
    {
        // Define a regular expression for repeated words.
        Regex rx = new Regex(@"\b(?<word>\w+)\s+(\k<word>)\b",
                             RegexOptions.Compiled | RegexOptions.IgnoreCase);

        // Define a test string. There are two spaces between "fox" and "fox",
        // which is what the reported positions 20 and 25 imply.
        string text = "The the quick brown fox  fox jumped over the lazy dog dog.";

        // Find matches.
        MatchCollection matches = rx.Matches(text);

        // Report the number of matches found.
        Console.WriteLine("{0} matches found in:\n {1}",
                          matches.Count,
                          text);

        // Report on each match.
        foreach (Match match in matches)
        {
            // groups[0] is the whole match; groups[1] is the unnamed group
            // holding the second occurrence of the word.
            GroupCollection groups = match.Groups;
            Console.WriteLine("'{0}' repeated at positions {1} and {2}",
                              groups["word"].Value,
                              groups[0].Index,
                              groups[1].Index);
        }
    }
}
// The example produces the following output to the console:
// 3 matches found in:
//  The the quick brown fox  fox jumped over the lazy dog dog.
// 'The' repeated at positions 0 and 4
// 'fox' repeated at positions 20 and 25
// 'dog' repeated at positions 50 and 54
Use Capture objects to display the members of each group of a regular expression match on the console.
using System;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main ()
    {
        // The input string is missing from the original snippet; this value
        // is inferred from the output listed below.
        string text = "One fish two fish red fish blue fish";
        string pat = @"(?<1>\w+)\s+(?<2>fish)\s*";

        // Create the regular expression.
        Regex r = new Regex(pat, RegexOptions.IgnoreCase);

        // Match the regular expression pattern against a text string.
        Match m = r.Match(text);
        while (m.Success)
        {
            // Display the current match and its capture set.
            Console.WriteLine("Match=[" + m + "]");
            CaptureCollection cc = m.Captures;
            foreach (Capture c in cc)
            {
                Console.WriteLine("Capture=[" + c + "]");
            }

            // Display Group1 and its capture set.
            Group g1 = m.Groups[1];
            Console.WriteLine("Group1=[" + g1 + "]");
            foreach (Capture c1 in g1.Captures)
            {
                Console.WriteLine("Capture1=[" + c1 + "]");
            }

            // Display Group2 and its capture set.
            Group g2 = m.Groups[2];
            Console.WriteLine("Group2=[" + g2 + "]");
            foreach (Capture c2 in g2.Captures)
            {
                Console.WriteLine("Capture2=[" + c2 + "]");
            }

            // Advance to the next match.
            m = m.NextMatch();
        }
    }
}
// The example displays the following output:
// Match=[One fish ]
// Capture=[One fish ]
// Group1=[One]
// Capture1=[One]
// Group2=[fish]
// Capture2=[fish]
// Match=[two fish ]
// Capture=[two fish ]
// Group1=[two]
// Capture1=[two]
// Group2=[fish]
// Capture2=[fish]
// Match=[red fish ]
// Capture=[red fish ]
// Group1=[red]
// Capture1=[red]
// Group2=[fish]
// Capture2=[fish]
// Match=[blue fish]
// Capture=[blue fish]
// Group1=[blue]
// Capture1=[blue]
// Group2=[fish]
// Capture2=[fish]