
Regular expression basics in C#

1. Introduction: A regular expression (RE) is a very powerful language for describing text and for manipulating it. An RE is normally applied to a string, that is, to a sequence of characters. Take the string Mot, Hai, Ba, Bon, NEVERLAND. With an RE you can return any or all of its substrings (Hai or Bon, for example), or produce modified versions of those substrings (Mot or BoN, for example). An RE is a text pattern built from two kinds of elements: literals and metacharacters.

A literal is simply a character (a-z, for example) that you want to match against the target string. A metacharacter is a special character that acts as a command to the RE parser.

Now let us build a regular expression:
Code:


^(From|To|Subject|Date):

This RE matches any substring that starts a line with From, To, Subject, or Date and is followed by a colon (:). The ^ tells the RE parser that the text you are looking for must begin at the start of a line. In .NET, ^ matches only at the start of the whole input by default; it matches at the start of every line when the RegexOptions.Multiline option is set. The words From, To, Subject, and Date are literals, and the metacharacters (, ), and | group the literals and indicate that any one of the alternatives may match. The ^ is a metacharacter as well. You can therefore read the line:
Code:
^(From|To|Subject|Date):

as follows: match any substring that begins at the start of a line, continues with one of the four literal strings From, To, Subject, or Date, and is then followed by a colon.

2. Commonly used metacharacters (essential to know):
. : any single character except the newline character \n.
\d : a digit, equivalent to [0-9] for ASCII text
\D : a non-digit character
\s : a whitespace character, equivalent to [ \f\n\r\t\v] for ASCII text

\S : a non-whitespace character, equivalent to [^ \f\n\r\t\v]
\w : a word character (letters, digits, and the underscore _), equivalent to [a-zA-Z_0-9] for ASCII text
\W : a non-word character, equivalent to [^a-zA-Z_0-9]
^ : start of the string (or of a line when RegexOptions.Multiline is set)
$ : end of the string (or of a line when RegexOptions.Multiline is set)
\A : start of the string
\z : end of the string
| : separates alternatives, like a logical or (keep this one in mind when you need to combine several conditions)
[abc] : matches one character from the set, that is a, b, or c.
[a-z] : matches one character in the range a to z; the - acts as the range separator.
[^abc] : matches one character that is not in the set, that is anything except a, b, or c.
() : defines a group (subexpression) that is treated as a single element of the pattern. For example, ((a(b))c) matches abc and captures the groups abc, ab, and b.
? : matches the preceding element 0 or 1 times. For example, A?B matches B or AB.
* : matches the preceding element 0 or more times. A*B matches B, AB, AAB, ...
+ : matches the preceding element 1 or more times. A+B matches AB, AAB, ...
{n} : n is a number; matches the preceding element exactly n times. For example, A{2} matches exactly two A characters.
{n,} : matches the preceding element n or more times. A{2,} matches AA, AAA, ...
{m,n} : matches the preceding element between m and n times. A{2,4} matches AA, AAA, AAAA.

3. Classes for working with regular expressions in .NET: .NET offers an object-oriented approach to string matching and replacement with REs. System.Text.RegularExpressions is the namespace in the .NET base class library that contains all the objects related to REs. Here is a brief introduction to these classes:

1.Regex: The Regex class represents an immutable (read-only) regular expression. It also provides static methods that let you use a regular expression without explicitly creating a Regex object. For example:

Code:
string pattern = @"\s2000";
Regex myRegex = new Regex(pattern);
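The constructor also has an overload that takes a RegexOptions value. That matters for the header pattern from section 1, because in .NET the ^ anchor matches only at the very start of the input unless RegexOptions.Multiline is set. The sketch below illustrates the difference; the sample mail text, the HeaderDemo class, and the CountHeaders helper are made-up names for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeaderDemo
{
    // Counts the header lines matched by the section 1 pattern under the given options.
    public static int CountHeaders(string text, RegexOptions options)
    {
        Regex headerRegex = new Regex(@"^(From|To|Subject|Date):", options);
        return headerRegex.Matches(text).Count;
    }

    public static void Main()
    {
        // Hypothetical input: three header lines followed by one body line.
        string mail = "From: an@example.com\nTo: binh@example.com\nSubject: Hello\nBody text";

        // Without Multiline, ^ anchors only at the start of the whole string.
        Console.WriteLine(CountHeaders(mail, RegexOptions.None));      // 1
        // With Multiline, ^ also anchors right after every newline.
        Console.WriteLine(CountHeaders(mail, RegexOptions.Multiline)); // 3
    }
}
```

If the input is processed one line at a time, the default options are enough, since each line is then its own input string.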

Here are some of the members of the Regex class:
-Properties:
+Options: returns the options that were passed to the Regex constructor.
+RightToLeft: gets a value indicating whether the regular expression searches from right to left.
-Methods:
+GetGroupNames: returns an array of the capturing group names of the RE.
+GetGroupNumbers: returns an array of the capturing group numbers that correspond to the group names.
+GroupNameFromNumber: gets the group name that corresponds to the specified group number.
+IsMatch: returns a bool indicating whether the RE finds a match in the input string.
+Match: searches the input string for an occurrence of the RE and returns the first result as a single Match object.
+Matches: searches the input string for all occurrences of the RE and returns every successful match, as if Match had been called repeatedly.
+Replace: replaces the occurrences of the pattern defined by the RE with a specified replacement string.
+Split: splits the input string into an array of substrings at the positions defined by an RE match.
+Unescape: converts any escaped characters in the input string back to their unescaped form.
The following example uses the Split method of the Regex class to split a string:

Code:
string chuoi = "Mot, Hai, Ba, Bon, NEVERLAND.";
// build the pattern
// rule: split at a single space, or at a comma followed by a space
string pattern = " |, ";
Regex myRegex = new Regex(pattern);
string[] sKetQua = myRegex.Split(chuoi);
foreach (string subString in sKetQua)
{
    Console.WriteLine(subString);
}

And this is its output:
Code:
Mot
Hai
Ba
Bon
NEVERLAND.

As you can see, the Regex constructor takes a pattern string as its argument. Regex.Split() works much like String.Split(): it returns an array of strings, using the matches of the pattern held in myRegex as the delimiters.

2.The Match class: This class represents the result of a single RE match operation. The small example below uses the Match method of the Regex class to return a Match object for the first match in the input string. The Match.Success property tells you whether a match was found.

Code:
string chuoi = "123abcd456bdabc";
string pattern = "abc";
Regex myRegex = new Regex(pattern);
Match m = myRegex.Match(chuoi);
if (m.Success)
{
    Console.WriteLine("Tim thay chuoi con {0} o vi tri thu {1} trong chuoi", m.Value, m.Index);
}
else
{
    Console.WriteLine("Khong tim thay chi ca");
}

The output is:
Code:
Tim thay chuoi con abc o vi tri thu 3 trong chuoi
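The Regex class description earlier mentions static methods and lists IsMatch and Replace next to Match. Here is a short sketch of both, called statically so that no Regex object has to be constructed. The input strings, the StaticDemo class, and its two helper methods are illustrative assumptions, not part of the original examples.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StaticDemo
{
    // True when the input contains at least one digit.
    public static bool HasDigit(string input)
    {
        return Regex.IsMatch(input, @"\d");
    }

    // Collapses every run of whitespace into a single space.
    public static string NormalizeSpaces(string input)
    {
        return Regex.Replace(input, @"\s+", " ");
    }

    public static void Main()
    {
        Console.WriteLine(HasDigit("123abcd456bdabc"));        // True
        Console.WriteLine(HasDigit("abc"));                    // False
        Console.WriteLine(NormalizeSpaces("Mot,   Hai,\tBa")); // Mot, Hai, Ba
    }
}
```

The static calls are convenient for one-off checks; when the same pattern is applied many times, creating one Regex instance and reusing it, as the examples in this article do, keeps the pattern in one place.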

3.The MatchCollection class: This class represents a sequence of successful, non-overlapping matches as an immutable collection; it has no public constructor. MatchCollection objects are returned by the Regex.Matches method. Put simply, you can think of a MatchCollection as an array of Match objects. For example:
Code:
static void Main(string[] args)
{
    // collection that holds the matches
    MatchCollection mc;
    // a test string
    string chuoi = "I like money, like woman and like C#";
    // build the pattern
    string pattern = "like";
    // create a Regex object,
    // passing the pattern string to the constructor
    Regex myRegex = new Regex(pattern);
    // use the Matches method of myRegex
    // to find the matches and the index of each match
    mc = myRegex.Matches(chuoi);
    foreach (Match m in mc)
    {
        Console.WriteLine("Chuoi con '{0}' xuat hien o chi muc {1}", m.Value, m.Index);
    }
}

We get the following output:

Code:
Chuoi con 'like' xuat hien o chi muc 2
Chuoi con 'like' xuat hien o chi muc 14
Chuoi con 'like' xuat hien o chi muc 29

Using a MatchCollection: two useful properties of a Match object are its length and its value, which can be read as in the following example:
Code:
static void Main(string[] args)
{
    // collection that holds the matches
    MatchCollection mc;
    // a test string
    string chuoi = "This is a example string.";
    // build the pattern
    // rule: find one or more non-whitespace characters
    // followed by a whitespace character
    string pattern = @"\S+\s";
    // create a Regex object,
    // passing the pattern string to the constructor
    Regex myRegex = new Regex(pattern);
    // use the Matches method of myRegex
    // to find the matches and the length of each match
    mc = myRegex.Matches(chuoi);
    for (int i = 0; i < mc.Count; i++)
    {
        Console.WriteLine("The match[{0}]: '{1}' co chieu dai la {2}", i, mc[i].Value, mc[i].Length);
    }
}

\S looks for a non-whitespace character, and the + that follows means one or more of the preceding element. \s (lowercase s) stands for a whitespace character. Put together, the pattern says: find one or more non-whitespace characters followed by a whitespace character. The output of the example is:
Code:
The match[0]: 'This ' co chieu dai la 5
The match[1]: 'is ' co chieu dai la 3
The match[2]: 'a ' co chieu dai la 2
The match[3]: 'example ' co chieu dai la 8

The last word, string., is not found because it ends with a period and the input stops there, so no whitespace character follows it.

4.The Group class: It is often convenient to group subexpressions of a match so that you can parse out pieces of the matched string. For example, you may want to match IP addresses and group together all the IP addresses found anywhere in a string. The Group class lets you create groups of matches based on RE syntax, and represents the results of a single grouping expression. A grouping expression names a group and supplies an RE; any substring that matches the RE is added to the group. For example, to create an ip group you could write an RE that specifies one or more digits or dots followed by a whitespace character:
Code:
@"(?<ip>(\d|\.)+)\s"

The Match class derives from Group and has a collection named Groups that contains all the groups the Match found. The Group class represents the results of a single capturing group. Because a Group can capture zero, one, or more strings in a single match, it contains a collection of Capture objects. Because Group inherits from Capture, the last substring captured can be accessed directly. Group instances are returned by the Match.Groups property, indexed as Match.Groups[group number], or as Match.Groups["group name"] when the (?<groupname>) grouping construct is used. The following example uses nested grouping constructs to capture substrings into groups:
Code:
static void Main(string[] args)
{
    string pattern = "(a(b))c";
    string chuoi = "abdabc";
    // the pattern defines the substrings abc, ab, b
    Regex myRegex = new Regex(pattern);
    Match m = myRegex.Match(chuoi);
    // iterate by Count so that an empty capture cannot end the loop early
    for (int i = 0; i < m.Groups.Count; i++)
    {
        Console.WriteLine("{0} co chieu dai {1}", m.Groups[i].Value, m.Groups[i].Length);
    }
}

Output:
Code:
abc co chieu dai 3
ab co chieu dai 2
b co chieu dai 1

The following code uses named grouping constructs (name and value) to capture substrings from a string that holds data in the form DATANAME:VALUE, which the RE splits at the colon (:):
Code:
static void Main(string[] args)
{
    string pattern = @"^(?<name>\w+):(?<value>\w+)";
    Regex myRegex = new Regex(pattern);
    Match m = myRegex.Match("Section:119900");
    // iterate by Count so that an empty capture cannot end the loop early
    for (int i = 0; i < m.Groups.Count; i++)
    {
        Console.WriteLine("{0} co chieu dai {1}", m.Groups[i].Value, m.Groups[i].Length);
    }
}

Output:
Code:
Section:119900 co chieu dai 14
Section co chieu dai 7
119900 co chieu dai 6

The named groups hold the following values:
Code:
m.Groups["name"].Value = "Section"
m.Groups["value"].Value = "119900"
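Named groups can also be read back by name instead of by position, which tends to survive later changes to the pattern better than an index loop. Below is a minimal sketch that reuses the pattern and input from the example; the NamedGroupDemo class and the ParseField helper are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Returns { name, value } for a DATANAME:VALUE string, or null when it does not match.
    public static string[] ParseField(string input)
    {
        Match m = Regex.Match(input, @"^(?<name>\w+):(?<value>\w+)");
        if (!m.Success)
        {
            return null;
        }
        return new[] { m.Groups["name"].Value, m.Groups["value"].Value };
    }

    public static void Main()
    {
        string[] field = ParseField("Section:119900");
        Console.WriteLine("name  = {0}", field[0]); // name  = Section
        Console.WriteLine("value = {0}", field[1]); // value = 119900
    }
}
```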

A practical use of the Group class:
Code:
static void Main(string[] args)
{
    // a sample string
    string chuoi = "04:03:27 127.0.0.0 sinhvienit.net";
    // group time = one or more digits or colons,
    // followed by a whitespace character
    string timePattern = @"(?<time>(\d|\:)+)\s";
    string ipPattern = @"(?<ip>(\d|\.)+)\s";
    string sitePattern = @"(?<site>\S+)";
    string pattern = timePattern + ipPattern + sitePattern;
    Regex myRegex = new Regex(pattern);
    // get the collection of matches
    MatchCollection matches = myRegex.Matches(chuoi);
    foreach (Match match in matches)
    {
        if (match.Length != 0)
        {
            Console.WriteLine("\nMatch: {0}", match.ToString());
            Console.WriteLine("\nTime: {0}", match.Groups["time"]);
            Console.WriteLine("\nIP: {0}", match.Groups["ip"]);
            Console.WriteLine("\nSite: {0}", match.Groups["site"]);
        }
    }
}

Output:
Code:
Match: 04:03:27 127.0.0.0 sinhvienit.net
Time: 04:03:27
IP: 127.0.0.0
Site: sinhvienit.net

In the example above, we first create a string to search:
Code:


string chuoi = "04:03:27 127.0.0.0 sinhvienit.net";

This string might be one of many recorded in a web server log file, or the result of a database query. This simple example has three columns, time, IP, and site, each separated by a space. You want to build a single Regex object that finds strings of this kind and breaks them into three groups: time, ip, and site:
Code:
string timePattern = @"(?<time>(\d|\:)+)\s";
string ipPattern = @"(?<ip>(\d|\.)+)\s";
string sitePattern = @"(?<site>\S+)";
string pattern = timePattern + ipPattern + sitePattern;
Regex myRegex = new Regex(pattern);

Let us focus on the characters that form a group. The parentheses () create a group. Everything between the opening parenthesis (just before the ?) and the closing parenthesis (after the + in this case) is a single group.
Code:
@"(?<time>(\d|\:)+)\s"

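The string ?<time> names the group time, and the group is associated with the text matched by the regular expression (\d|\:)+, which reads as one or more digits or colons. The \s after the closing parenthesis requires a whitespace character after the group without including it in the group. Likewise, ?<ip> names the ip group and ?<site> names the site group.

As in the earlier examples, this example asks for a collection of all the matches:
Code: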


MatchCollection matches = myRegex.Matches(chuoi);

Next, iterate over the matches collection to pull out each Match object:
Code:
foreach (Match match in matches)

If the Length of the match is greater than 0, a match was found. The example then prints the entire match:
Code:
Console.WriteLine("\nMatch: {0}", match.ToString());

Next, get the time group from the match's Groups collection and print its contents:
Code:


Console.WriteLine("\nTime: {0}", match.Groups["time"]);

Output:
Code:

Time: 04:03:27

The ip and site groups are handled the same way, which gives:
Code:
IP: 127.0.0.0
Site: sinhvienit.net

5.The GroupCollection class: This class represents a collection of captured groups and returns the set of groups captured in a single match. The collection is read-only and has no public constructor. GroupCollection instances are returned by the Match.Groups property. The following example finds and prints the number of groups captured by an RE. For how to extract the individual captures in each member of a group collection, see the CaptureCollection example in section 7.
Code:
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Regex myRegex = new Regex("(a(b))c");
            Match m = myRegex.Match("abdabc");
            Console.WriteLine("So nhom duoc tim thay la: {0}", m.Groups.Count);
        }
    }
}

Output:
Code:
So nhom duoc tim thay la: 3
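Beyond Count, a GroupCollection can be walked by index. The following sketch uses the same pattern and input and lists each group's name, value, and position. Regex.GroupNameFromNumber is the method from the Regex member list; the GroupListDemo class and the DescribeGroups helper are made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupListDemo
{
    // Builds one description line per group of the first match.
    public static List<string> DescribeGroups(string pattern, string input)
    {
        Regex regex = new Regex(pattern);
        Match m = regex.Match(input);
        List<string> lines = new List<string>();
        for (int i = 0; i < m.Groups.Count; i++)
        {
            Group g = m.Groups[i];
            // For unnamed groups the "name" is simply the group number as text.
            string name = regex.GroupNameFromNumber(i);
            lines.Add(string.Format("Group {0}: '{1}' at index {2}", name, g.Value, g.Index));
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in DescribeGroups("(a(b))c", "abdabc"))
        {
            Console.WriteLine(line);
        }
        // Group 0: 'abc' at index 3
        // Group 1: 'ab' at index 3
        // Group 2: 'b' at index 4
    }
}
```

Group 0 is always the whole match, which is why a pattern with two pairs of parentheses reports three groups.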

6.The Capture class: This class holds the result of a single capture of a subexpression.

7.The CaptureCollection class: Each time a Regex object matches a subexpression, a Capture instance is created and added to a CaptureCollection. Each Capture object represents a single capture. Each group has its own capture collection, holding the items that matched the subexpression associated with the group. The CaptureCollection class represents a sequence of captured substrings and returns the set of captures made by a single capturing group. The Captures property, an object of the CaptureCollection class, is provided as a member of the Match and Group classes to give easy access to the set of captured substrings. For example, if you use the regular expression ((a(b))c)+ (where the + means one or more repetitions) to capture matches from the string abcabcabc, the CaptureCollection of each capturing group contains three members. The following example uses the regular expression (Abc)+ to find one or more matches in the string XYZAbcAbcAbcXYZAbcAb, and uses the Captures property to return the multiple captured substrings of a group:
Code:
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string chuoi = "XYZAbcAbcAbcXYZAbcAb";
            string pattern = "(Abc)+";
            Regex myRegex = new Regex(pattern);
            Match m = myRegex.Match(chuoi);
            GroupCollection gc = m.Groups;
            CaptureCollection cc;
            Console.WriteLine("So nhom thu luom duoc = {0}", gc.Count.ToString());
            Console.WriteLine();
            for (int i = 0; i < gc.Count; i++)
            {
                cc = gc[i].Captures;
                Console.WriteLine("So capture = " + cc.Count.ToString());
                for (int j = 0; j < cc.Count; j++)
                {
                    Console.WriteLine(cc[j] + " bat dau tu ky tu " + cc[j].Index);
                }
                Console.WriteLine();
            }
        }
    }
}

Output:
Code:
So nhom thu luom duoc = 2

So capture = 1
AbcAbcAbc bat dau tu ky tu 3

So capture = 3
Abc bat dau tu ky tu 3
Abc bat dau tu ky tu 6
Abc bat dau tu ky tu 9
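The nested-group pattern with + that this section mentions before the example can be checked the same way. The sketch below counts the captures recorded for every group when ((a(b))c)+ is applied to abcabcabc; the CaptureCountDemo class and the CaptureCounts helper are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureCountDemo
{
    // Returns the number of captures recorded for each group of the first match.
    public static int[] CaptureCounts(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        int[] counts = new int[m.Groups.Count];
        for (int i = 0; i < m.Groups.Count; i++)
        {
            counts[i] = m.Groups[i].Captures.Count;
        }
        return counts;
    }

    public static void Main()
    {
        int[] counts = CaptureCounts("((a(b))c)+", "abcabcabc");
        // Group 0 is the whole match (one capture); each of the three
        // parenthesized groups is captured once per repetition.
        Console.WriteLine(string.Join(", ", counts)); // 1, 3, 3, 3
    }
}
```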

Using the CaptureCollection class: The key property of a Capture object is Length, the length of the captured substring. When you ask a Match for its length, it is Capture.Length that you get, because Match derives from Group, and Group in turn derives from Capture. Typically you will find only a single Capture in a CaptureCollection, but that need not be the case. Consider what happens when you parse a string in which the company name may appear in either of two places. To collect these names into a single group, create the ?<company> group in two places in the regular expression pattern.
Code:
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string chuoi = "05:04:27 NEVERLAND 192.168.10.1 TNHH";
            string pattern = @"(?<time>(\d|\:)+)\s" +
                             @"(?<company>\S+)\s" +
                             @"(?<ip>(\d|\.)+)\s" +
                             @"(?<company>\S+)";
            Regex myRegex = new Regex(pattern);
            MatchCollection mc = myRegex.Matches(chuoi);
            foreach (Match match in mc)
            {
                if (match.Length != 0)
                {
                    Console.WriteLine("Match: {0}", match.ToString());
                    Console.WriteLine("Time: {0}", match.Groups["time"]);
                    Console.WriteLine("IP: {0}", match.Groups["ip"]);
                    Console.WriteLine("Company: {0}", match.Groups["company"]);
                    Console.WriteLine();
                    foreach (Capture cap in match.Groups["company"].Captures)
                    {
                        Console.WriteLine("cap: {0}", cap.ToString());
                    }
                }
            }
        }
    }
}

The following line iterates over the Captures collection of the company group:
Code:


foreach(Capture cap in match.Groups["company"].Captures)

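The compiler begins by finding the collection to iterate over. match is an object that has a collection named Groups. The Groups collection has an indexer that takes a string and returns a single Group object, and that Group in turn has a Captures property. The following expression therefore returns the CaptureCollection of the company group:
Code: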
match.Groups["company"].Captures

In turn, the foreach loop walks the Captures collection, takes each element, and assigns it to the loop variable cap, which is of type Capture. The output shows two capture elements: NEVERLAND and TNHH. The second capture overrides the first as the value of the group, so printing the group shows only TNHH, but the Captures collection still holds both captured values. The output of the example is:
Code:
Match: 05:04:27 NEVERLAND 192.168.10.1 TNHH
Time: 05:04:27
IP: 192.168.10.1
Company: TNHH

cap: NEVERLAND
cap: TNHH

Ultrapico Expresso v3.0.3276 - a useful tool for working with regular expressions


Product Description:
- Build complex regular expressions by selecting components from a palette
- Test expressions against real or sample input data
- Display all matches in a tree structure, showing captured groups, and all captures within a group
- Build replacement strings and test the match and replace functionality
- Highlight matched text in the input data
- Test automatically for syntax errors
- Generate Visual Basic or C# code
- Save and restore data in a project file
- Maintain and expand a library of frequently used regular expressions

Operating System Requirements: This product is designed to run on the following operating systems: Windows XP, Windows 2000, Windows NT, Windows 98

Additional Requirements (Windows versions): NT 4 SP 6, 2003 SP 1, XP AMD 64-bit, XP 64-bit SP 1, NT 4 SP 2, 2000 SP 1, 2003 64-bit, 2003 AMD 64-bit, XP 64-bit SP 2, NT 4 SP 3, 2000 SP 2, Server 2003 x64 R2, 2000, 2003 64-bit SP 1, Vista AMD 64-bit, XP Itanium 64-bit, NT 4 SP 4, 2000 SP 3, NT 4, XP 32-bit, XP SP 1, Server 2003 x86 R2, ME, 2003 Itanium 64-bit, NT 4 SP 5, 2000 SP 4, Vista 32-bit, XP 64-bit, NT 4 SP 1, Server 2008 x64, NT 3, Server 2008 x86, XP, Server 2008, 2003, Vista Itanium 64-bit, XP Itanium 64-bit SP 1, 2003 32-bit, XP Itanium 64-bit SP 2, XP SP 2, 95, 98, Vista, NT, 2003 Itanium 64-bit SP 1, XP Pro


Home: http://www.ultrapico.com Download: http://www.mediafire.com/file/5trodz....Keygen-Lz0-S/
