Regular Expressions Cookbook
-
2010
. . . . .: , 2010. 608 ., .
ISBN 978-5-93286-181-3
100 , .
, : C#, Java,
JavaScript, Perl, PHP, Python, Ruby VB.NET. : URL , ,
,
HTML, XML, CSV .
, , , , . , ,
, .
ISBN 978-0-596-52068-7 ()
-, 2010
Authorized translation of the English edition © 2009 O'Reilly. This translation is published and sold by permission of O'Reilly, Inc., the owner of all rights to publish and sell the same.
, . , , .
-. 199034, -, 16 , 7,
. (812) 380-5007, www.symbol.ru. N 000054 25.12.98.
005-93, 2; 953000 .
30.10.2009. 70100 1/16. .
38 . . 2000 .
199034, -, 9 , 12.
.
, . , . .
, . ,
. , , ,
, ,
-. , , .
, , ,
, , , ,
.
, ,
. ,
,
.
,
, ,
,
. , ,
.
. -
- . , .
, ,
Perl, , . Perl ,
Perl .
,
, . , , , .
,
,.
, , , , .
. ,
.
- ,
.
,
, . ,
, ,
, . ,
. ,
.
, . , - , regex ( ),
, . , , , .
3 ,
. ,
,
.
.NET, Java, JavaScript, PCRE, Perl, Python Ruby . , .
. ,
, , .
, ( 3) ,
C#, Java, JavaScript, PHP, Perl,
Python, Ruby VB.NET. . , , , ,
, ,
.
, . , - .
1 ,
, .
2 ,
.
3
,
, .
4
, , , , .
5 , ,
.
6 ,
.
7 URL, , ,
Windows .
8 HTML, XML, CSV (comma-separated
values , ) INI.
, URL, , .
;
, ;
, , , . , .
, , ,
, .
, .
, , .
, ,
. .
, .
...
, , . , .
CR , LF CRLF
, .
.
.
. , . ,
, . O'Reilly. , . .
,
.
, ISBN. : Regular Expressions Cookbook by Jan Goyvaerts and Steven Levithan. Copyright 2009
Jan Goyvaerts and Steven Levithan, 978-0-596-52068-7.
permissions@oreilly.com.
, , :
O'Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 ( )
707-829-0515 ( )
707-829-0104 ()
, :
http://www.regexcookbook.com
http://oreilly.com/catalog/9780596520687
:
bookquestions@oreilly.com
, ,
O'Reilly :
http://www.oreilly.com
(Andy Oram), O'Reilly Media, Inc., , . (Jeffrey Friedl), (Zak Greant), (Nikolaj Lindberg) (Ian Morse), ,
.
, - ,
, .
, :
4 8.
. , ( , ), , . , , ,
, , ,
.
, , , , .
, .
, ,
;
, ; , , -
; .
, , .
, , . (). , (backtracking).
,
grep, . , , Perl .
(). . -
: , ,
.
, , . , , , , . , 4.1, , ,
.
,
, , 1:
1
: http://regex.info/blog/2006-09-15/247.
-, , : -, ,
, .
, , , ,
, . . . , .
, .
. . ,
,
, . , ,
, , ,
, . .
, . , , ?
, , , Perl.
Perl. , .
. regex regexp, regexes .
-
. , .
, . ,
.
.
3; , . ,
. , , , .
,
, . Perl. . , , ,
.
, .
-
, . , ,
, .
, ,
. , , ,
, :
Perl
, Perl , .
Perl 5.6, 5.8 5.10.
, , , Perl ,
Perl. , Perl, .
, ,
. Perl.
PCRE
PCRE Perl-Compatible Regular Expressions (Perl- ),
C (Philip Hazel).
: http://www.pcre.org.
PCRE 4 7.
, PCRE Perl,
, , , Perl. , , ,
Perl , Perl.
PCRE . PHP Delphi. , Perl , , PCRE.
.NET
Microsoft .NET Framework Perl
System.Text.RegularExpressions. .NET 1.0 3.5. , System.Text.RegularExpressions: 1.0 2.0.
.NET 1.1, 3.0 3.5 Regex
.
,
.NET, C#, VB.NET, Delphi for .NET COBOL.NET,
.NET. ,
.NET, , , .NET, , Perl. Visual Studio (VS). VS - , ,
Perl.
Java
Java 4 , , java.util.regex. Java. , ,
Perl- , ,
C. java.util.regex, Java 4, 5 6.
,
Java , , , Java.
JavaScript
JavaScript , ECMA-262 3.
ECMAScript, - JavaScript JScript. : Internet
Explorer 5.5 8.0, Firefox, Opera Safari ECMA-262.
.
, .
- ,
-, JavaScript
,
. Microsoft VBScript Adobe ActionScript.
Python
Python re. Python 2.4 2.5.
Python .
Ruby
Ruby
, Perl. Ruby 1.8 1.9. Ruby 1.8 ,
Ruby. Ruby 1.9 Oniguruma.
Ruby 1.8 Oniguruma, Ruby 1.9
Ruby.
Ruby Ruby 1.8, Oniguruma
Ruby 1.9.
, Ruby
,
a++ . Ruby 1.8, ,
, Ruby 1.9
, .
Oniguruma
Ruby 1.8, ,
. , , (?m) ,
(?s).
, . , ,
, , , .
, , , .
, , , . , 2.20 2.21.
,
2.22. 3 3.16 , .
, , , . .
, . .
,
.
, .
, ,
, -
, .NET 2.0,
.
Java
java.util.regex . Java 4, 5 6.
.
JavaScript
JavaScript
, , ECMA-262 3.
Python
re Python
sub(). Python Python. Python 2.4 2.5. Python
.
Ruby
Ruby , .
Ruby 1.8 1.9. Ruby 1.8
, Ruby,
Ruby 1.9 Oniguruma. Ruby 1.8 Oniguruma, Ruby 1.9
Ruby.
Ruby Ruby 1.8,
Oniguruma Ruby 1.9.
Ruby 1.8 1.9 , , Ruby 1.9
. , Ruby 1.9.
,
-
. , ,
( UNIX).
.
3 , .
,
. 3.1.
,
.
, , , , .
.
RegexBuddy
RegexBuddy (. 1.1) , ,
, . , ,
.
RegexBuddy (Jan Goyvaerts), . RegexBuddy , RegexBuddy
O'Reilly.
(. 1.1) ,
, , RegexBuddy. . .
- , , RegexBuddy.
, .
Create () .
. , Insert Token ( ),
. , ,
RegexBuddy .
Figure 1.1. RegexBuddy
Test
() Highlight () RegexBuddy , .
,
, :
List All ( )
.
Replace ()
Replace (), ,
. Replace () Test () ,
.
Split () ( Test, )
, .
List All ( )
Update Automatically ( ),
.
, ( )
, Test ()
, , Debug (). RegexBuddy Debug () . , , , .
,
, .
Use ()
. RegexBuddy
. .
, GREP (, ) .
, , , RegexBuddy
Paste (), ,
. RegexBuddy
,
,
.
, Copy (), , .
Library () .
. .
Forum () Login ().
RegexBuddy , ,
OK
RegexBuddy. .
RegexBuddy Windows 98, ME, 2000, XP Vista. Linux Apple
VMware, Parallels, CrossOver
Office , , WINE.
RegexBuddy http://www.regexbuddy.com/RegexBuddyCookbook.exe. , .
RegexPal
RegexPal (. 1.2) -, .
(Steven Levithan), . ,
, -. RegexPal JavaScript.
JavaScript, -,
-.
- ,
http://www.regexpal.com Enter regex here. RegexPal
, . RegexPal ,
JavaScript. -
, RegexPal
.
Figure 1.2. RegexPal
-
- . , 3 , . , , .
regex.larsolavtorvik.com
(Lars Olav Torvik) , ,
http://regex.larsolavtorvik.com (. 1.3).
Figure 1.3. regex.larsolavtorvik.com
. , , Result ().
- Ajax,
.
, PHP , .
.
. , , .
,
, , . 3.
Nregex
http://www.nregex.com (. 1.4) -, , (David Seruyange) .NET.
, , ,
.NET 1.x, , .
. Regular Expression
( ), . , , If I just had $5.00
then "she" wouldn't be so @#$! mad..
-, URL Load Target From URL ( URL) Load (), . , , Browse (), Load (), .
Matches & Replacements (
), , , . - Replacement String ( ),
. (...).
.NET, , . ,
Figure 1.4. Nregex
Rubular
http://www.rubular.com -, (. 1.5), Ruby 1.8.
Figure 1.5. Rubular
myregexp.com
(Sergey Evdokimov) Java,
. http://www.myregexp.com (. 1.6) -
. Java-, . , Java 4 ( ). -
Figure 1.6. myregexp.com
, :
Find ()
, . Matcher.find() Java.
Match ()
, . ,
. String.matches() Matcher.matches().
Split ()
, String.split() Pattern.split()
.
Replace ()
,
String.replaceAll() Matcher.replaceAll().
,
, , http://www.myregexp.com. Eclipse, IntelliJ IDEA.
reAnimator
reAnimator http://osteele.com/tools/reanimator
(. 1.7), (Oliver Steele), .
,
,
.
, reAnimator, . , . ,
reAnimator,
, , ,
. , , reAnimator, .
. 20.
Figure 1.7. reAnimator
Pattern ()
Edit (). Pattern () Set ().
Input ().
, , . , , . , . ,
.
reAnimator , ,
^ $ .
, .
Expresso
Expresso ( espresso (), ) .NET,
. http://www.ultrapico.com/Expresso.htm. .NET 2.0
.
60- .
, Expresso .
, Ultrapico
. .
Expresso . 1.8. Regular Expression ( ), , . . Regex Analyzer ( )
. .
Design Mode ( )
,
Ignore Case ( ). , . , , Undock
() , . ,
Test Mode ( ).
Test Mode ( ) Run Match ( ),
Search Results ( ) .
. , .
Expression Library ( )
. , Run Match (-
Figure 1.8. Expresso
). Library () .
The Regulator
Regulator, http://sourceforge.net/projects/regulator,
.NET, . .NET 2.0 . , .NET 1.x. Regulator ,
.
grep
grep g/re/p, ed
UNIX, , .
UNIX, grep,
. UNIX, Linux OS X, man grep ,
.
Windows, , grep, .
PowerGREP
PowerGREP (Jan Goyvaerts), . , ,
grep Windows (. 1.10).
PowerGREP
, , . RegexBuddy
JGsoft.
,
Clear () Action () Search (), Action (). File Selector ( ) , File Selector ( ) Include File or Folder ( ) Include Folder and Subfolders ( ).
Execute () Action () .
, Action type ( ), Action (), Search-andreplace ( ). Search () Replace ().
. .
PowerGREP ,
. ,
, grep, -
Figure 1.10. PowerGREP
PowerGREP,
.
PowerGREP Windows 98, ME, 2000, XP
Vista.
http://www.powergrep.com/PowerGREPCookbook.exe. ,
15 . , Results (),
, .
Windows Grep
Windows Grep (http://www.wingrep.com) grep- Windows. (. 1.11),
, .
POSIX ERE.
, , , . Windows Grep -
(shareware), , , , .
, Search ()
Search ().
Options () : Beginner Mode ( ) Expert Mode ( ). , .
Windows Grep
, . , ,
.
, All Matches ( )
View ().
, Search () Replace ().
RegexRenamer
RegexRenamer (. 1.12)
grep-. ,
. http://regexrenamer.sourceforge.net. RegexRenamer 2.0 Microsoft .NET.
Match (), Replace (). , /i,
/g ,
.
/x ,
,
.
, , .
. , . , -
Figure 1.12. RegexRenamer
, , .
,
. , EditPad Pro,
, . .
, :
Boxer Text Editor (PCRE)
Dreamweaver (JavaScript)
EditPad Pro ( ,
, . RegexBuddy
JGsoft)
Multi-Edit (PCRE, Perl)
NoteTab (PCRE)
UltraEdit (PCRE)
, , , . , , , , . , ,
. ,
,
. , , , 2.1.
,
. ,
.
,
, , Mastering Regular Expressions
(Jeffrey E. F. Friedl),
O'Reilly,1 .
,
. .
,
. , ,
, 3- . . .
.: -, 2008.
4 8,
.
- , . . 3,
,
. ,
, , . 22.
2.1. Match Literal Text
The punctuation characters in the ASCII table are: !"#\$%&'\(\)\*\+,-\./:;<=>\?@\[\\]\^_`\{\|}~
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, $()*+.?
[\^{|, . , Mary had a little lamb, Maryhadalittlelamb . , Regular Expression ( ) .
, , .
,
, , .
\$\(\)\*\+\.\?\[\\\^\{\|
$()*+.?[\^{|
, : ], - }.
,
[, } {.
}. , [ ], 2.3.
-
, , , .
-
, .
, ,
, .
, .
. ,
,
.
The punctuation characters in the ASCII table are: \Q!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~\E
:
: Java 6, PCRE, Perl
\Q...\E ,
, \.\.\..
, Java 4 5 , , .
,
\Q...\E ,
PCRE, Perl Java 6.
Java 6, ,
PCRE Perl.
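Python is not among the flavors listed for \Q...\E, but its standard `re.escape()` function does a comparable job by adding the backslashes for you. The following is a minimal sketch, assuming Python 3; the variable names are illustrative only.

```python
import re

# The literal text from this recipe, including every ASCII punctuation
# character.
literal = ("The punctuation characters in the ASCII table are: "
           "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~")

# re.escape() escapes the metacharacters, so the whole string is
# treated as literal text by the regex engine.
pattern = re.compile(re.escape(literal))

subject = "Quote: " + literal + " (end of quote)"
match = pattern.search(subject)
found = match.group() if match else None
```

Escaping programmatically is useful when the literal text comes from user input rather than from a pattern typed by hand.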
ascii
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
(?i)ascii
:
: .NET, Java, PCRE, Perl, Python, Ruby
. regex regex, Regex, REGEX
ReGeX. regex
, .
. , , ,
. 3.4, , , , .
, (?i) , (?i)regex .
.NET, Java, PCRE, Perl, Python Ruby.
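As an illustration of the two ways to turn on case insensitivity, the sketch below uses Python's `re` module, once with a flag set outside the pattern and once with the (?i) mode modifier. The subject string is made up for the example.

```python
import re

subject = "Regex, REGEX, regex and ReGeX"

# Option set outside the pattern, as a flag passed to the API.
by_flag = re.findall(r"regex", subject, re.IGNORECASE)

# Option set inside the pattern with the (?i) mode modifier.
# Recent Python versions require a global modifier like this one
# to appear at the very start of the pattern.
by_modifier = re.findall(r"(?i)regex", subject)
```

Both calls return the same four matches, so the choice between them is mostly a matter of where the pattern comes from.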
2.10 ,
.
.
2.3 5.14.
2.2. Match Nonprintable Characters
, ,
ASCII: bell, escape,
form feed, line feed, carriage return, horizontal tab, vertical tab. ASCII: 07, 1B, 0C,
0A, 0D, 09, 0B.
\a\e\f\n\r\t\v
:
: .NET, Java, PCRE, Python, Ruby
\x07\x1B\f\n\r\t\v
:
: .NET, Java, JavaScript, PCRE, Python, Ruby
\a\e\f\n\r\t\x0B
:
: .NET, Java, PCRE, Perl, Python, Ruby
ASCII .
. ,
. . 2.1 .
Table 2.1. Nonprinting characters

Representation  Meaning          Hexadecimal
\a              bell             0x07
\e              escape           0x1B
\f              form feed        0x0C
\n              line feed        0x0A
\r              carriage return  0x0D
\t              horizontal tab   0x09
\v              vertical tab     0x0B
ECMA-262 \a \e .
JavaScript ,
\a \e . Perl \v ( ), Perl (\x0B) (\013) .
, , , .
26
\cG\x1B\cL\cJ\cM\cI\cK
:
: .NET, Java, JavaScript, PCRE, Perl, Ruby 1.9
\cA \cZ 26 ,
1 26 ASCII. c . , ,
,
. Java .
, , Ctrl . Ctrl-H (backspace).
\cH .
7-
\x07\x1B\x0C\x0A\x0D\x09\x0B
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
\x ,
,
ASCII. . 2.1 \x00 \x7F ASCII. ,
, .
\x80 \xFF ,
. \x80 \xFF .
, 2.7.
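The hexadecimal form of the solution can be checked with a short Python sketch. It uses only \xFF escapes because Python's `re` module may not accept \e; the subject string spells the same seven control characters with Python string escapes.

```python
import re

# Hexadecimal escapes for bell, escape, form feed, line feed,
# carriage return, horizontal tab and vertical tab.
pattern = re.compile(r"\x07\x1B\x0C\x0A\x0D\x09\x0B")

# The same seven characters written as Python string escapes.
subject = "\a\x1b\f\n\r\t\v"

matched = pattern.fullmatch(subject) is not None
codes = [hex(ord(ch)) for ch in subject]
```

The `codes` list makes the mapping from Table 2.1 visible: each escape corresponds to exactly one character code.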
. 2.1. ASCII
.
2.7.
2.3. Match One of Many Characters
,
calendar, , . -
,
a e. , . ,
, .
calendar
c[ae]l[ae]nd[ae]r
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
[a-fA-F0-9]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
[^a-fA-F0-9]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, , .
, .
, a e. . calendar, a, e a.
.
: \, ^, - ]. Java .NET [
. . [$()*+.?{|] , .
, ,
. , -
.
, .
, . [][^-]
, ,
JavaScript, . ;
: [\]\[\^\-] . .
-
, , (
). , 2.2, , . , . , [\r\n]
(\r) (\n).
(^) ,
. ,
. , .
(-) , . ,
, , , ,
.
, , ASCII .
[A-z] ,
ASCII A z. ; , [A-Z\[\\\]\^_`a-z]
, . , , .
, [z-a] , .
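The character classes above can be tried out with a few lines of Python. This is only a sketch; the word list is invented for the example.

```python
import re

words = ["calendar", "celendar", "calandar", "calender", "colander"]

# Each [ae] class matches exactly one character: either a or e.
misspelling = re.compile(r"c[ae]l[ae]nd[ae]r")
accepted = [w for w in words if misspelling.fullmatch(w)]

# Ranges follow character-code order, so [A-z] also covers the
# punctuation characters that sit between Z and a in ASCII.
too_wide = re.fullmatch(r"[A-z]", "_") is not None

# g lies outside the ranges a-f, A-F and 0-9.
hex_digit = re.fullmatch(r"[a-fA-F0-9]", "g") is not None
```

The `too_wide` check shows why a range such as [A-z] is best avoided: it silently accepts the underscore and five other punctuation characters.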
[a-fA-F\d]
:
: .NET, Java, PCRE, Perl, Python, Ruby
, ,
. ,
. \d [\d]
. , , , ,
. , \D
, , [^\d] .
\w .
, . , . , ,
. \W , .
\s . , . .NET, Perl
JavaScript \s , . ,
JavaScript \s , \d \w
ASCII. \S , \s .
\b . \b ,
. , \b , \w ,
ASCII, \w ASCII, .
. 70, 2.6.
(?i)[A-F0-9]
:
: .NET, Java, PCRE, Perl, Python, Ruby
(?i)[^A-F0-9]
:
: .NET, Java, PCRE, Perl, Python, Ruby
,
( 3.4) ( 2.1),
. ,
.
JavaScript , (?i) . JavaScript, /i .
,
.NET
[a-zA-Z0-9-[g-zG-Z]]
, . - , , , g z.
: [class-[subtract]] .
, , . , \p{IsThai}
. \P{N} ,
Number.
[\p{IsThai}-[\P{N}]] 10 .
,
Java
[a-f[A-F][0-9]]
[a-f[A-F[0-9]]]
Java .
.
, . , , :
[\w&&[a-fA-F0-9\s]]
.
. . , . , [g-zG-Z_] , :
[a-zA-Z0-9&&[^g-zG-Z]]
. - ,
, , g z. : [class&&[^subtract]] .
,
, , . , \p{InThai} . \p{N} , Number.
[\p{InThai}&&[\p{N}]] 10 .
\p , 2.7.
.
2.1, 2.2 2.7.
2.4. Match Any Character
, ,
.
, , . , .
,
.
: ( )
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
.
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
[\s\S]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
. .
, .
, , ,
. ,
, , , . , ,
. , 3.21 .
(Larry Wall), Perl, Perl
,
(\n).
, , .
. , .
,
, . . Perl (single line mode), Java
(dot all mode).
3.4 . , , . , .
JavaScript, , . 2.3, \s , \S ,
\s . ,
[\s\S] , , . [\d\D] [\w\W] .
, . \d\d.\d\d.\d\d
. 05/06/08,
99/99/99. , 12345678.
,
, .
. \d\d[/.\-]\d\d[/.\-]\d\d -
, . -
99/99/99, 12345678 .
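The difference between the dot and an explicit character class is easy to verify. A small Python sketch, using the sample strings mentioned in this discussion:

```python
import re

# The dot accepts any character as a separator, including a digit.
loose = re.compile(r"\d\d.\d\d.\d\d")

# The character class accepts only a slash, a dot or a hyphen.
strict = re.compile(r"\d\d[/.\-]\d\d[/.\-]\d\d")

samples = ["05/06/08", "99/99/99", "12345678"]
loose_hits = [s for s in samples if loose.fullmatch(s)]
strict_hits = [s for s in samples if strict.fullmatch(s)]
```

Neither pattern validates the date itself, but only the stricter one rejects the plain eight-digit number.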
,
.
. , , , -
.
, . .
(?s).
:
: .NET, Java, PCRE, Perl, Python
(?m).
:
: Ruby
, . 2.1
. 51 , JavaScript .
(?s)
.NET, Java, PCRE, Perl Python. s
single line ( ), Perl .
,
Ruby . Ruby (?m) . , . Ruby 1.9 (?m) . (?m) Perl 2.5.
.
2.3, 3.4 3.21.
2.5. Match Something at the Start and/or the End of a Line
. alpha,
. omega, . begin,
.
end, .
^alpha
: ( ^ $
)
: .NET, Java, JavaScript, PCRE, Perl, Python
\Aalpha
:
: .NET, Java, PCRE, Perl, Python, Ruby
omega$
: ( ^ $
)
: .NET, Java, JavaScript, PCRE, Perl, Python
omega\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
^begin
: ^ $
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
end$
: ^ $
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^ , $ , \A , \Z \z . - ,
,
.
,
.
,
. , ,
one two, ,
four:
one
two
four
one
two
four.
\A , .
, .
\A , , . A
.
JavaScript \A .
^ \A
, ^ $
.
, Ruby. Ruby
.
JavaScript,
^ \A . \A ,
.
\Z \z . \Z \z , ,
.
\Z \z ,
. \Z
, , . , omega\Z , . ,
, ; \Z . \z , ,
.
$ \Z ,
, ^ $
.
, Ruby. Ruby . \Z , $ , .
,
Perl. , $/ ( ) \n, Perl , ( ):
$line = <>;
Perl $line.
, endofinput.\z ,
.
endofinput.\Z endofinput.$ , .
, Perl :
chomp $line;
. ( chomp
- .)
JavaScript, \Z $ . \Z
,
.
^ , \A . Ruby ^
. ,
. .
, .
, , 2.4. ,
. .
^ .
,
, , . \n^ , ^
\n .
$
,
\Z . Ruby $ . ,
.
$ . (, , .) $\n , $ \n .
, . ,
.
, .
. \A \Z - . ^ $ ^ $ -
.
. \A\Z , , . \A\z
. ^$ ^ $ .
(?m)^begin
:
: .NET, Java, PCRE, Perl, Python
(?m)end$
:
: .NET, Java, PCRE, Perl, Python
^ $
, . JavaScript 2.1 . 51.
(?m) ^ $
.NET, Java, PCRE, Perl Python. m
multiline () , Perl ^ $ .
, ,
Ruby . (?m) Ruby
.
Ruby (?m)
^ $ . Ruby ^ $
.
, Ruby
^ $
. JavaScript,
.
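The anchors can be compared side by side in Python. This is a sketch on the four-line sample used earlier in this recipe (one, two, an empty line, four). One Python-specific point to keep in mind: Python's \Z matches only at the very end of the subject, which corresponds to \z in the other flavors.

```python
import re

subject = "one\ntwo\n\nfour"

# Without (?m), ^ matches only at the start of the whole subject.
first_only = re.findall(r"^\w+", subject)

# With (?m), ^ and $ also match at line breaks.
every_line = re.findall(r"(?m)^\w+", subject)
line_ends = re.findall(r"(?m)\w+$", subject)

trailing = "the end\n"
# $ tolerates one final line break; Python's \Z does not.
dollar = re.search(r"end$", trailing) is not None
strict_end = re.search(r"end\Z", trailing) is not None
```

The empty third line contributes no match because \w+ needs at least one word character.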
(?-m) , , ,
\A \Z .
.
3.4 3.21.
2.6. Match Whole Words
, cat
My cat is brown,
category bobcat. ,
cat staccato,
.
\bcat\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
\Bcat\B
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
\b . . . \b ,
, , .
, \b :
,
.
, .
, , .
, , ,
. ,
, , . \b
, \b . \b \bx !\b . \b x\b
\b! . x\bx
!\b! .
, , \bcat\b . \b , c , . \b , t
, .
. \b , .
,
. ,
,
. \b (?m) , ,
^ $
.
\B ,
\b . , \B ,
.
, \B :
,
.
, .
, .
.
\Bcat\B cat
staccato, My cat is brown, category bobcat.
,
( staccato, category bobcat, My cat
is brown), \Bcat cat\B \Bcat|cat\B . \Bcat cat
staccato bobcat. cat\B cat category ( staccato, \Bcat ). 2.8.
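The behavior of \b and \B on the four sample words can be confirmed with a short Python sketch; each word is tested on its own to keep the results easy to read.

```python
import re

words = ["cat", "category", "bobcat", "staccato"]

# cat as a whole word only.
whole = [w for w in words if re.search(r"\bcat\b", w)]

# cat with word characters on both sides.
inside = [w for w in words if re.search(r"\Bcat\B", w)]

# cat with a word character on at least one side.
partial = [w for w in words if re.search(r"\Bcat|cat\B", w)]
```

Note that `partial` includes staccato as well, because the first alternative already succeeds there.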
, . , . 2.3
. 57 , \w , . , \b .
, , , \b \B, , - , .
.NET Perl .
, ,
.
Python . ,
ASCII, , UNICODE U. \b \w .
Java .
\w ASCII.
\b
. Java \b\w\b , , . \b\b , \b . \w+
, \w ASCII.
.
2.3.
2.7. Unicode Code Points, Properties, Blocks, and Scripts
(), .
;
,
. 2.1.
,
, Currency Symbol ( ). .
,
, Greek Extended.
,
, .
, , ,
.
\u2122
:
: .NET, Java, JavaScript, Python
Python,
, : u"\u2122".
\x{2122}
:
: PCRE, Perl, Ruby 1.9
PCRE UTF-8,
PHP UTF-8 /u. Ruby 1.8
.
,
\p{Sc}
:
: .NET, Java, PCRE, Perl, Ruby 1.9
PCRE UTF-8,
PHP UTF-8 /u. JavaScript Python . Ruby 1.8 .
\p{IsGreekExtended}
:
: .NET, Perl
\p{InGreekExtended}
:
: Java, Perl
JavaScript, PCRE, Python Ruby
.
\p{Greek}
:
: PCRE, Perl, Ruby 1.9
PCRE 6.5
, , UTF-8. PHP UTF-8 /u. .NET, JavaScript Python
. Ruby 1.8 .
\X
:
: PCRE, Perl
PCRE Perl , , , .
\P{M}\p{M}*
:
: .NET, Java, PCRE, Perl, Ruby 1.9
PCRE UTF-8,
PHP UTF-8 /u. JavaScript Python . Ruby 1.8 .
.
, , ,
. ,
, .
U+2122 . \u2122
\x{2122} , .
\u . , U+0000
U+FFFF. \x
, U+000000 U+10FFFF.
U+00E0 \x{E0} , \x{00E0} . U+100000 -
.
.
,
, .
. 30 , 7 :
\p{L}
.
\p{Ll}
, .
\p{Lu}
, .
\p{Lt}
, ,
.
\p{Lm}
, .
\p{Lo}
,
.
\p{M}
, ( , , ).
\p{Mn}
, ,
(, , ).
\p{Mc}
, , ( ).
\p{Me}
, (, , ).
\p{Z}
-.
\p{Zs}
,
.
\p{Zl}
- U+2028.
\p{Zp}
- U+2029.
\p{S}
, ,
, , .
\p{Sm}
.
\p{Sc}
.
\p{Sk}
() .
\p{So}
,
,
.
\p{N}
.
\p{Nd}
0 9 , .
\p{Nl}
, , .
\p{No}
,
0...9 ( ).
\p{P}
.
\p{Pd}
.
\p{Ps}
.
\p{Pe}
\p{Pi}
.
\p{Pf}
.
\p{Pc}
, , .
\p{Po}
, , ,
, , .
\p{C}
.
\p{Cc}
0x00…0x1F ASCII 0x80…0x9F Latin-1.
\p{Cf}
, .
\p{Co}
, .
\p{Cs}
UTF-16.
\p{Cn}
, .
(). [\P{Ll}\P{Lu}\P{Lt}\P{Lm}\P{Lo}] ,
. \P{Ll} , Lu ( , Ll), \P{Lu}
, Ll.
.
. .
U+0000 U+FFFF 105 :
U+0000…U+007F  \p{InBasic_Latin}
U+0080…U+00FF  \p{InLatin-1_Supplement}
U+0100…U+017F  \p{InLatin_Extended-A}
U+0180…U+024F  \p{InLatin_Extended-B}
U+0250…U+02AF  \p{InIPA_Extensions}
U+02B0…U+02FF  \p{InSpacing_Modifier_Letters}
U+0300…U+036F  \p{InCombining_Diacritical_Marks}
U+0370…U+03FF  \p{InGreek_and_Coptic}
U+0400…U+04FF  \p{InCyrillic}
U+0500…U+052F  \p{InCyrillic_Supplementary}
U+0530…U+058F  \p{InArmenian}
U+0590…U+05FF  \p{InHebrew}
U+0600…U+06FF  \p{InArabic}
U+0700…U+074F  \p{InSyriac}
U+0780…U+07BF  \p{InThaana}
U+0900…U+097F  \p{InDevanagari}
U+0980…U+09FF  \p{InBengali}
U+0A00…U+0A7F  \p{InGurmukhi}
U+0A80…U+0AFF  \p{InGujarati}
U+0B00…U+0B7F  \p{InOriya}
U+0B80…U+0BFF  \p{InTamil}
U+0C00…U+0C7F  \p{InTelugu}
U+0C80…U+0CFF  \p{InKannada}
U+0D00…U+0D7F  \p{InMalayalam}
U+0D80…U+0DFF  \p{InSinhala}
U+0E00…U+0E7F  \p{InThai}
U+0E80…U+0EFF  \p{InLao}
U+0F00…U+0FFF  \p{InTibetan}
U+1000…U+109F  \p{InMyanmar}
U+10A0…U+10FF  \p{InGeorgian}
U+1100…U+11FF  \p{InHangul_Jamo}
U+1200…U+137F  \p{InEthiopic}
U+13A0…U+13FF  \p{InCherokee}
U+1400…U+167F  \p{InUnified_Canadian_Aboriginal_Syllabics}
U+1680…U+169F  \p{InOgham}
U+16A0…U+16FF  \p{InRunic}
U+1700…U+171F  \p{InTagalog}
U+1720…U+173F  \p{InHanunoo}
U+1740…U+175F  \p{InBuhid}
U+1760…U+177F  \p{InTagbanwa}
U+1780…U+17FF  \p{InKhmer}
U+1800…U+18AF  \p{InMongolian}
U+1900…U+194F  \p{InLimbu}
U+1950…U+197F  \p{InTai_Le}
U+19E0…U+19FF  \p{InKhmer_Symbols}
U+1D00…U+1D7F  \p{InPhonetic_Extensions}
U+1E00…U+1EFF  \p{InLatin_Extended_Additional}
U+1F00…U+1FFF  \p{InGreek_Extended}
U+2000…U+206F  \p{InGeneral_Punctuation}
U+2070…U+209F  \p{InSuperscripts_and_Subscripts}
U+20A0…U+20CF  \p{InCurrency_Symbols}
U+20D0…U+20FF  \p{InCombining_Diacritical_Marks_for_Symbols}
U+2100…U+214F  \p{InLetterlike_Symbols}
U+2150…U+218F  \p{InNumber_Forms}
U+2190…U+21FF  \p{InArrows}
U+2200…U+22FF  \p{InMathematical_Operators}
U+2300…U+23FF  \p{InMiscellaneous_Technical}
U+2400…U+243F  \p{InControl_Pictures}
U+2440…U+245F  \p{InOptical_Character_Recognition}
U+2460…U+24FF  \p{InEnclosed_Alphanumerics}
U+2500…U+257F  \p{InBox_Drawing}
U+2580…U+259F  \p{InBlock_Elements}
U+25A0…U+25FF  \p{InGeometric_Shapes}
U+2600…U+26FF  \p{InMiscellaneous_Symbols}
U+2700…U+27BF  \p{InDingbats}
U+27C0…U+27EF  \p{InMiscellaneous_Mathematical_Symbols-A}
U+27F0…U+27FF  \p{InSupplemental_Arrows-A}
U+2800…U+28FF  \p{InBraille_Patterns}
U+2900…U+297F  \p{InSupplemental_Arrows-B}
U+2980…U+29FF  \p{InMiscellaneous_Mathematical_Symbols-B}
U+2A00…U+2AFF  \p{InSupplemental_Mathematical_Operators}
U+2B00…U+2BFF  \p{InMiscellaneous_Symbols_and_Arrows}
U+2E80…U+2EFF  \p{InCJK_Radicals_Supplement}
U+2F00…U+2FDF  \p{InKangxi_Radicals}
U+2FF0…U+2FFF  \p{InIdeographic_Description_Characters}
U+3000…U+303F  \p{InCJK_Symbols_and_Punctuation}
U+3040…U+309F  \p{InHiragana}
U+30A0…U+30FF  \p{InKatakana}
U+3100…U+312F  \p{InBopomofo}
U+3130…U+318F  \p{InHangul_Compatibility_Jamo}
U+3190…U+319F  \p{InKanbun}
U+31A0…U+31BF  \p{InBopomofo_Extended}
U+31F0…U+31FF  \p{InKatakana_Phonetic_Extensions}
U+3200…U+32FF  \p{InEnclosed_CJK_Letters_and_Months}
U+3300…U+33FF  \p{InCJK_Compatibility}
U+3400…U+4DBF  \p{InCJK_Unified_Ideographs_Extension_A}
U+4DC0…U+4DFF  \p{InYijing_Hexagram_Symbols}
U+4E00…U+9FFF  \p{InCJK_Unified_Ideographs}
U+A000…U+A48F  \p{InYi_Syllables}
U+A490…U+A4CF  \p{InYi_Radicals}
U+AC00…U+D7AF  \p{InHangul_Syllables}
U+D800…U+DB7F  \p{InHigh_Surrogates}
U+DB80…U+DBFF  \p{InHigh_Private_Use_Surrogates}
U+DC00…U+DFFF  \p{InLow_Surrogates}
U+E000…U+F8FF  \p{InPrivate_Use_Area}
U+F900…U+FAFF  \p{InCJK_Compatibility_Ideographs}
U+FB00…U+FB4F  \p{InAlphabetic_Presentation_Forms}
U+FB50…U+FDFF  \p{InArabic_Presentation_Forms-A}
U+FE00…U+FE0F  \p{InVariation_Selectors}
U+FE20…U+FE2F  \p{InCombining_Half_Marks}
U+FE30…U+FE4F  \p{InCJK_Compatibility_Forms}
U+FE50…U+FE6F  \p{InSmall_Form_Variants}
U+FE70…U+FEFF  \p{InArabic_Presentation_Forms-B}
U+FF00…U+FFEF  \p{InHalfwidth_and_Fullwidth_Forms}
U+FFF0…U+FFFF  \p{InSpecials}
. ,
, , 100% . .
Currency . Basic_Latin
Latin-1_Supplement. Currency
Symbol.
\p{InCurrency} \p{Sc} .
, \p{Cn} .
.
\p{IsBlockName} .NET
Perl. Java \p{InBlockName} .
Perl Is,
In, . Perl \p{Script} \p{IsScript} , \p{InScript} .
, , .
. U+FFFF :
\p{Common}
\p{Arabic}
\p{Armenian}
\p{Bengali}
\p{Bopomofo}
\p{Braille}
\p{Buhid}
\p{CanadianAboriginal}
\p{Cherokee}
\p{Cyrillic}
\p{Devanagari}
\p{Ethiopic}
\p{Georgian}
\p{Greek}
\p{Gujarati}
\p{Gurmukhi}
\p{Han}
\p{Hangul}
\p{Hanunoo}
\p{Hebrew}
\p{Hiragana}
\p{Inherited}
\p{Kannada}
\p{Katakana}
\p{Khmer}
\p{Lao}
\p{Latin}
\p{Limbu}
\p{Malayalam}
\p{Mongolian}
\p{Myanmar}
\p{Ogham}
\p{Oriya}
\p{Runic}
\p{Sinhala}
\p{Syriac}
\p{Tagalog}
\p{Tagbanwa}
\p{TaiLe}
\p{Tamil}
\p{Telugu}
\p{Thaana}
\p{Thai}
\p{Tibetan}
\p{Yi}
,
. , Thai,
. , Latin,
. . , Japanese,
Hiragana, Katakana, Han Latin,
.
, , Common. , , ,
.
, .
U+0061 a , U+00E0 a . , .
U+0300 . . ,
U+0061 U+0300, , , , U+00E0. U+0300
U+0061.
, , . ,
,
, , .
, ,
, , , , . , . , .
U+0061 U+0300, , Java, \u0061\u0300,
U+0061, a U+0300. .. .
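The two encodings of à discussed here can be compared in Python. The sketch below is an illustration rather than part of the recipe: it relies on the standard `unicodedata` module to normalize the subject before matching, which is one possible way to make a single-code-point regex find the two-code-point form.

```python
import re
import unicodedata

precomposed = "\u00E0"   # à stored as one code point
decomposed = "a\u0300"   # a followed by a combining grave accent

pattern = re.compile(r"\u00E0")
hits_before = [pattern.search(s) is not None
               for s in (precomposed, decomposed)]

# NFC normalization composes the pair into U+00E0 before matching.
normalized = unicodedata.normalize("NFC", decomposed)
hits_after = pattern.search(normalized) is not None
```

Normalizing the subject is a design choice for flavors such as Python's `re` that lack \X and \p{M}; flavors that support those tokens can match the combining sequence directly.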
\P \p . ,
\P{Sc} , Currency Symbol. \P
, \p ,
, .
, \u , \x , \p \P , .
, , ,
, . , , , ,
(U+2122):
[\p{Pi}\p{Pf}\x{2122}]
:
: .NET, Java, PCRE, Perl, Ruby 1.9
, , ,
, . :
,
. , Greek Extended U+1F00 U+1FFF:
[\u1F00-\u1FFF]
:
: .NET, Java, JavaScript, Python
[\x{1F00}-\x{1FFF}]
:
: PCRE, Perl, Ruby 1.9
. ,
, . Greek:
[\u0370-\u0373\u0375\u0376-\u0377\u037A\u037B-\u037D\u0384\u0386
\u0388-\u038A\u038C\u038E-\u03A1\u03A3-\u03E1\u03F0-\u03F5\u03F6
\u03F7-\u03FF\u1D26-\u1D2A\u1D5D-\u1D61\u1D66-\u1D6A\u1DBF\u1F00-\u1F15
\u1F18-\u1F1D\u1F20-\u1F45\u1F48-\u1F4D\u1F50-\u1F57\u1F59\u1F5B\u1F5D
\u1F5F-\u1F7D\u1F80-\u1FB4\u1FB6-\u1FBC\u1FBD\u1FBE\u1FBF-\u1FC1
\u1FC2-\u1FC4\u1FC6-\u1FCC\u1FCD-\u1FCF\u1FD0-\u1FD3\u1FD6-\u1FDB
\u1FDD-\u1FDF\u1FE0-\u1FEC\u1FED-\u1FEF\u1FF2-\u1FF4\u1FF6-\u1FFC
\u1FFD-\u1FFE\u2126]
:
: .NET, Java, JavaScript, Python
,
Greek http://www.unicode.org/Public/UNIDATA/Scripts.txt :
1. ;.* . ,
.
2. ^ ,
^ $
\u
\u. \.\.
-\u .
3. , , \s+ , . . , \u / \u , ,
Scripts.txt.
,
, .
, .
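The three search-and-replace steps above can also be sketched as one small Python function. The sample lines only imitate the layout of Scripts.txt (code point or range, semicolon, script name, comment); they are not copied from the real file, and `build_class` is a hypothetical helper name.

```python
import re

# Illustrative lines in the layout used by Scripts.txt; the comments
# after "#" are placeholders, not text from the real file.
sample_lines = [
    "0370..0373    ; Greek # sample range",
    "0375          ; Greek # sample single code point",
    "03A3..03E1    ; Greek # sample range",
    "0400..0481    ; Cyrillic # a different script, skipped",
]

# Only four-digit code points are accepted, because \uFFFF takes
# exactly four hexadecimal digits.
LINE = re.compile(r"^([0-9A-F]{4})(?:\.\.([0-9A-F]{4}))?\s*;\s*(\w+)")

def build_class(lines, script):
    """Turn matching lines into a \\uFFFF-style character class."""
    parts = []
    for line in lines:
        m = LINE.match(line)
        if m is None or m.group(3) != script:
            continue
        start, end = m.group(1), m.group(2)
        parts.append("\\u" + start + ("-\\u" + end if end else ""))
    return "[" + "".join(parts) + "]"

greek_class = build_class(sample_lines, "Greek")
```

Lines with code points beyond U+FFFF do not match `LINE` and are skipped, which mirrors the limitation of the \uFFFF syntax.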
\x{}
:
1. ;.* . ,
.
2. ^
^ $
\x{,
\x{. \.\.
}-\x{ .
3. , , \s+ , } .
. ,
\x{ / \x{ ,
Scripts.txt.
:
[\x{0370}-\x{0373}\x{0375}\x{0376}-\x{0377}\x{037A}\x{037B}-\x{037D}
\x{0384}\x{0386}\x{0388}-\x{038A}\x{038C}\x{038E}-\x{03A1}
\x{03A3}-\x{03E1}\x{03F0}-\x{03F5}\x{03F6}\x{03F7}-\x{03FF}
\x{1D26}-\x{1D2A}\x{1D5D}-\x{1D61}\x{1D66}-\x{1D6A}\x{1DBF}
\x{1F00}-\x{1F15}\x{1F18}-\x{1F1D}\x{1F20}-\x{1F45}\x{1F48}-\x{1F4D}
\x{1F50}-\x{1F57}\x{1F59}\x{1F5B}\x{1F5D}\x{1F5F}-\x{1F7D}
\x{1F80}-\x{1FB4}\x{1FB6}-\x{1FBC}\x{1FBD}\x{1FBE}\x{1FBF}-\x{1FC1}
\x{1FC2}-\x{1FC4}\x{1FC6}-\x{1FCC}\x{1FCD}-\x{1FCF}\x{1FD0}-\x{1FD3}
\x{1FD6}-\x{1FDB}\x{1FDD}-\x{1FDF}\x{1FE0}-\x{1FEC}\x{1FED}-\x{1FEF}
\x{1FF2}-\x{1FF4}\x{1FF6}-\x{1FFC}\x{1FFD}-\x{1FFE}\x{2126}
\x{10140}-\x{10174}\x{10175}-\x{10178}\x{10179}-\x{10189}
\x{1018A}\x{1D200}-\x{1D241}\x{1D242}-\x{1D244}\x{1D245}]
:
: PCRE, Perl, Ruby 1.9
.
http://www.unicode.org - Unicode
Consortium, , , .
, .
Unicode Explained,
(Jukka K. Korpela) (O'Reilly).
, , , .
, . ASCII .
2.8. Match One of Several Alternatives
Mary|Jane|Sue
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
. Mary|Jane|Sue Mary,
Jane, Sue.
, .
, ,
, . , . 1 , -
, . , , , , , ,
. , , , , , 1.
Perl, ,
, .
, .
Mary|Jane|Sue Mary,
Jane, and Sue went to Marys house,
Mary .
, Find Next ( ) , Mary . .
Jane , . Sue . .
,
.
J,
Mary . , J, Jane , Jane .
, Jane
, Mary,
Mary Jane . . . ,
, .
,
Sue.
Mary. , , , house.
,
. ,
Jane|Janet ,
Her name is Janet. . , Jane Janet Her name is Janet , .
, . , , ,
. ,
.
Jane Janet .
, :
Janet|Jane . ,
: , . , .
, \bJane\b|\bJanet\b \bJanet\b|\bJane\b
Janet Her name is Janet. . .
2.12 : \bJanet?\b .
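The order in which alternatives are tried can be observed directly. A Python sketch on the sentences used in this recipe:

```python
import re

subject = "Mary, Jane, and Sue went to Mary's house"
names = re.findall(r"Mary|Jane|Sue", subject)

sentence = "Her name is Janet."
# The first alternative that succeeds wins, even if a later one
# would match more text.
short_first = re.search(r"Jane|Janet", sentence).group()
long_first = re.search(r"Janet|Jane", sentence).group()

# Word boundaries make the order of the alternatives irrelevant.
bounded = re.search(r"\bJane\b|\bJanet\b", sentence).group()
```

Python's `re` is one of the regex-directed engines described here, so the leftmost alternative is preferred rather than the longest one.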
.
2.9.
2.9. Group and Capture Parts of the Match
, Mary, Jane
Sue, . , .
,
yyyy-mm-dd, ,
. , , . , ,
, .
, 9999-99-99, .
\b(Mary|Jane|Sue)\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
\b(\d\d\d\d)-(\d\d)-(\d\d)\b
: None
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, , . , : \bMary|Jane|Sue\b ,
: \bMary , Jane Sue\b .
Jane Her
name is Janet.
J
Janet ,
. . , Mary , . , Jane , . . \b .
e t . J .
, .
Mary-Jane-Sue
, . , , :
\b(\d\d\d\d)-(\d\d)-(\d\d)\b .
yyyy-mm-dd.
\b\d\d\d\d-\d\d\d\d\b . -
, , .
\b(\d\d\d\d)-(\d\d)-(\d\d)\b
. , .
(\d\d\d\d) 1.
(\d\d) 2. (\d\d) 3.
, , , , .
2008-05-24, 2008
, 05
24 .
. 2.10 , . 2.21 , . 3.9, , ,
.
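As a quick illustration of capturing groups, the following Python sketch retrieves the three parts of the date 2008-05-24 and, for contrast, counts the groups of a pattern that uses a noncapturing group. The surrounding sentences are made up.

```python
import re

date = re.compile(r"\b(\d\d\d\d)-(\d\d)-(\d\d)\b")
m = date.search("Released on 2008-05-24.")
year, month, day = m.groups()

# A noncapturing group still limits the alternation to the names,
# but it stores nothing and is not numbered.
names = re.compile(r"\b(?:Mary|Jane|Sue)\b")
group_count = names.groups
janet = names.search("Her name is Janet.")
```

Recipe 3.9 covers retrieving captured text in the other programming languages.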
\b(Mary|Jane|Sue)\b .
:
\b(?:Mary|Jane|Sue)\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
(?: . ) .
, .
, , . : , ,
.
.
( 2.10), ( 2.21) ( 3.9), ,
, .
,
.
2.1 , .NET, Java, PCRE, Perl Ruby , : sensitive(?i)caseless(?-i)sensitive . , (?i) ,
.
:
\b(?i:Mary|Jane|Sue)\b
:
: .NET, Java, PCRE, Perl, Ruby
sensitive(?i:caseless)sensitive
:
: .NET, Java, PCRE, Perl, Ruby
, ,
, .
. ,
(?i:...)
.
, : (?ism:group) . , , : (?-ism:group)
. (?i-sm:group)
(i) (s) ^ $
(m). 2.4 2.5.
.
2.10, 2.11, 2.21 3.9.
2.10. Match Previously Matched Text Again
,
yyyy-mm-dd. ,
, . , 2008-08-08. , , , . ,
9999-99-99, . .
\b\d\d(\d\d)-\1-\1\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, , . , 2.9.
.
, . 10 99 \10 \99 .
\01 . .
, \xFF .
\d\d 08 , . 08
1. , . . : 08. .
, . .
. , 2008-08-08. -
08.
( 2.12) ( 2.13), . ,
.
2008-05-24 2007-07-07, , \b\d\d(\d\d) 2008, 08
( ) . . 08 05 .
, . . , 0 , \1
.
,
\b\d\d(\d\d) 2007,
07. .
07 . ,
,
. 2007-07-07.
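The walkthrough above can be reproduced with a few lines of Python. The list of dates is illustrative; the last call searches the same two-date subject that the discussion steps through.

```python
import re

# \1 must match exactly the text captured by the first group.
magic = re.compile(r"\b\d\d(\d\d)-\1-\1\b")

dates = ["2008-08-08", "2007-07-07", "2008-05-24", "9999-99-99"]
magical = [d for d in dates if magic.fullmatch(d)]

# The first date fails at the backreference, so the engine moves on.
found = magic.search("2008-05-24 2007-07-07").group()
```

As the problem statement notes, the regex does not check that the date is valid, which is why 9999-99-99 is accepted.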
, . \b\d\d\1-(\d\d)-\1
\b\d\d\1-\1-(\d\d)\b . , . JavaScript,
, ,
.
, , ,
, . , ,
. (^)\1 , ^ , ; \1 . , .
JavaScript ,
. JavaScript
, , ,
JavaScript, ,
, , , .
\b\d\d\1-\1-(\d\d)\b JavaScript
12--34.
.
2.9, 2.11, 2.21 3.9.
2.11. Capture and Name Parts of the Match
,
yyyy-mm-dd ,
. , , . , , , . ,
year, month day.
,
yyyy-mm-dd. , , .
, 2008-08-08. (08 ) magic.
, , , . -
, 9999-99-99, .
\b(?<year>\d\d\d\d)-(?<month>\d\d)-(?<day>\d\d)\b
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\b(?'year'\d\d\d\d)-(?'month'\d\d)-(?'day'\d\d)\b
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\b(?P<year>\d\d\d\d)-(?P<month>\d\d)-(?P<day>\d\d)\b
:
: PCRE 4 and later, Perl 5.10, Python
\b\d\d(?<magic>\d\d)-\k<magic>-\k<magic>\b
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\b\d\d(?'magic'\d\d)-\k'magic'-\k'magic'\b
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\b\d\d(?P<magic>\d\d)-(?P=magic)-(?P=magic)\b
:
: PCRE 4 and later, Perl 5.10, Python
2.9 2.10
. ,
. , .
.
, .
.
. ,
, .
Python , .
: (?P<name>regex) .
, \w . (?P<name> ,
) .
Regex .NET
, . (?<name>regex) Python, P.
, \w . (?<name> , ) .
.NET
Python .NET. Perl 5.10 Oniguruma Ruby 1.9.
PCRE Python , Perl . PCRE 7, Perl 5.10, .NET,
Python. , - PCRE
, Perl 5.10 Python. PCRE Perl 5.10 .NET
Python .
, .
PHP
PHP,
PCRE, Python.
,
.NET Ruby, .NET,
. PHP/PCRE
Python. , PCRE,
,
. , .NET Ruby,
P .
PCRE 7 Perl 5.10 Python, , . , PCRE
PHP.
. ,
,
.
.
Python name (?P=name) . , , , .
. (?P=name) , \1 .
.NET \k<name>
\k'name' . . , ,
, , .
.
, . Perl 5.10 Ruby 1.9 .NET, , .NET
. , .
, .
.
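The Python variant of the named-capture syntax, (?P<name>...) together with the named backreference (?P=name), can be exercised as follows. The subject strings are invented for the example.

```python
import re

date = re.compile(
    r"\b(?P<year>\d\d\d\d)-(?P<month>\d\d)-(?P<day>\d\d)\b")
parts = date.search("Due 2008-05-24").groupdict()

# (?P=magic) matches the same text that the group named magic captured.
magic = re.compile(r"\b\d\d(?P<magic>\d\d)-(?P=magic)-(?P=magic)\b")
found = magic.search("2008-05-24 2007-07-07")
magic_text = found.group("magic")
```

Named groups are still numbered as well, so `found.group(1)` returns the same text as `found.group("magic")`.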
2.9, 2.10, 2.21 3.9.
2.12. Repeat Part of the Regex a Certain Number of Times
, :
( , 100 ).
32- .
32- h.
,
. .
\b\d{100}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
\b[a-f0-9]{1,8}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
\b[a-f0-9]{1,8}h?\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
\d*\.\d+(e\d+)?
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
{n} , n ,
n . -
{0} ,
. ab{0}c ac.
{1} ,
. ab{1}c abc .
{n,m} , n m , n. \b[a-f0-9]{1,8}\b
. .
2.13.
n m , . \b\d{100,100}\b
\b\d{100}\b .
{n,} , n , . , ,
.
\d{0,} , , \d* . . ,
, .
, n , , , . h{0,1} h
. h,
h{0,1} . h{0,1} ,
, h. h
( h).
h? , h{0,1} . , , , .
, .
,
, . Perl , , Perl
. , , , . ,
. , (? .
,
, . (?:abc){3}
abcabcabc .
. (e\d+)? e, , . , , .
. 2.9, ,
, , . (\d\d){1,3} , . . 123456, 56, 56 . , 12 34, .
(\d\d){3} , \d\d\d\d(\d\d) . ,
, , , ,
, -
: ((?:\d\d){1,3}) . ,
. : ((\d\d){1,3}) .
123456, \1 123456, \2 56.
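A short Python sketch can confirm both the quantified hexadecimal pattern and the behavior of a repeated capturing group on 123456. The subject text for the first pattern is made up.

```python
import re

# One to eight hexadecimal digits with an optional h suffix.
hex_word = re.compile(r"\b[a-f0-9]{1,8}h?\b")
hits = hex_word.findall("values: ff 1234abcdh 123456789 xyz")

# The inner group is overwritten on each repetition, so it ends up
# holding only the last pair of digits; the outer group holds all six.
nested = re.fullmatch(r"((\d\d){1,3})", "123456").groups()

# The optional exponent group may or may not take part in the match.
floating = re.fullmatch(r"\d*\.\d+(e\d+)?", ".5e10") is not None
```

The nine-digit number is rejected because no more than eight digits may precede the closing word boundary.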
.NET , . Value
, , 56,
.
56, CaptureCollection , 56, 34 12.
.
2.9, 2.13, 2.14.
2.13. Choose Minimal or Maximal Repetition
, <p> </p>
XHTML . XHTML.
<p>.*?</p>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 2.12,
, , ,
.
XHTML ( XML , ). XHTML:
<p>
The very <em>first</em> task is to find the beginning of a paragraph.
</p>
<p>
Then you have to find the end of the paragraph
</p>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
. , 2.12.
<p>
, .* . , . .
, .*
, . : .*
XHTML, .
.* , < . . :
.
, , .
. ,
, , .
< , , .* .
< . ,
, <
. , < . <
, .
< , / . . , </p> .
? ,
XHTML, <p> </p>. , XHTML, <p> </p>,
.
.
, : *? , +? , ?? {7,42}? .
,
. , .
,
. , , .
<p>.*?</p> ,
XHTML. <p> , .*? . </p>
<p>, . .*? , . </p> , .*? .
, </p> .*?
. , .*?
XHTML.
* *?
. ,
.
. .
, , . , , 2.12, - , . , ,
, , , . \d , \b , \d , ,
( ).
\d+\b \d+?\b
.
,
-.
\d+?\b 1234X
\d+?
1. \b
1 2 . \d+?
12, \b .
, \d+? 1234, -
\b . \d+? , \d
X .
.
\d+ ,
,
. . , \b\d+\b , . , \b\d++\b ,
.
.
2.8, 2.9, 2.12, 2.14 2.15.
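The difference between greedy and lazy repetition can be observed with a small sketch, assuming Python's re module with re.DOTALL standing in for the "dot matches line breaks" option. The two-paragraph subject is a shortened, made-up stand-in for the XHTML sample in this recipe.

```python
import re

xhtml = "<p>first</p>\n<p>second</p>"

# Greedy: .* runs to the end of the subject and backtracks to the LAST </p>.
greedy = re.findall(r"<p>.*</p>", xhtml, re.DOTALL)

# Lazy: .*? expands one character at a time and stops at the FIRST </p>.
lazy = re.findall(r"<p>.*?</p>", xhtml, re.DOTALL)

# When only one match is possible, greedy and lazy find the same text;
# they merely try the permutations in a different order.
same = (re.search(r"\d+\b", "1234 X").group()
        == re.search(r"\d+?\b", "1234 X").group())
```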
2.14. Eliminate Needless Backtracking
, .
.
\b\d+\b ,
\b\d+?\b . .
. . , .
\b\d++\b
Regex options: None
Regex flavors: Java, PCRE, Perl 5.10, Ruby 1.9
.
.
\b(?>\d+)\b
Regex options: None
Regex flavors: .NET, Java, PCRE, Perl, Ruby
.
,
, .
JavaScript Python , .
.
: . ,
,
,
. .
,
. , : *+ , ++ , ?+ {7,42}+ .
.
,
, .
1.
,
, . , ,
2.5. , 4,
456.
,
\b
.
( ) \b 2 3, 1 2.
.
\b(?>\d+)\b () 123abc 456 . \d+ 123. , \d+ , . \b , ,
. , 456.
, , .
, ,
. x++ (?>x+) , .
, .
,
,
.
\w++\d++ (?>\w+\d+) .
\w++\d++ , (?>\w+)(?>\d+) ,
abc123. \w++
abc123,
\d++ .
, , \d++ . .
(?>\w+\d+) , . . , . abc123 \w+ abc123. . \d+ , \w+ . \d+ 3. ,
\w+ \d+ .
, . .
.
, ,
, .
, , . ,
.
.
2.12 2.15.
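This recipe lists Python among the flavors without possessive quantifiers or atomic groups. As a hedged sketch, the same effect can be approximated with a capturing lookahead followed by a backreference, because the engine does not backtrack into a finished lookaround. The emulation below is an assumption of this example, not a technique from the recipe's own solution; newer Python releases may offer native syntax, which the sketch does not rely on.

```python
import re

subject = "abc123"

# Ordinary greedy quantifiers: \w+ gives back "3" so that \d+ can match.
plain = re.fullmatch(r"\w+\d+", subject)

# Emulated \w++\d++: the lookahead captures all of "abc123", \1 consumes it,
# and nothing is left for the digits. No backtracking into the lookahead.
atomic = re.fullmatch(r"(?=(\w+))\1(?=(\d+))\2", subject)

# Emulated \b(?>\d+)\b: "123abc" is rejected after a single attempt per position.
number = re.search(r"\b(?=(\d+))\1\b", "123abc 456")
```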
2.15. Prevent Runaway Repetition
,
HTML, html, head, title
body .
HTML, .
<html>(?>.*?<head>)(?>.*?<title>)(?>.*?</title>)
(?>.*?</head>)(?>.*?<body[^>]*>)(?>.*?</body>).*?</html>
Regex options: Case insensitive, dot matches line breaks
Regex flavors: .NET, Java, PCRE, Perl, Ruby
JavaScript Python . .
JavaScript Python
, , .
, :
<html>.*?<head>.*?<title>.*?</title>
.*?</head>.*?<body[^>]*>.*?</body>.*?</html>
Regex options: Case insensitive, dot matches line breaks
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
HTML, . .*? , . , ,
. 2.4 2.13.
, , - HTML.
</html>.
, .*? .
</html> , .*?
. ,
.
O(n^7)
, . , .
,
128 ,
.
.
, .
,
.
,
,
. ,
.
, .
.*? , </body> . </html> ,
html.
.*? ,
.
- , .
O(n), . , .
,
(x+x+)+y xxxxxxxxxx.
,
x. ,
.
x, Perl .
, , Perl ,
.
O(2^n). y , x+ , . , , ,
x+ xxx, x+ x, x+ x. x 1024 .
32, 4 , , , , .
, xx+y , , . .
, ,
. ,
.
() , , , .
.
2.13 2.14.
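The nested-quantifier regex (x+x+)+y and its flat rewrite xx+y from this recipe accept exactly the same strings; only the work done on a failed match differs. The sketch below, assuming Python's re module, checks that agreement on a few short subjects. The subjects are kept short on purpose, since the nested form grows exponentially slower on longer runs of x without a y.

```python
import re

nested = re.compile(r"(x+x+)+y")  # nested quantifiers: exponential on failure
flat = re.compile(r"xx+y")        # same language, linear on failure

subjects = ["xxy", "xxxxxxxxxxy", "xy", "xxxxxxxxxx"]

# Both regexes should agree on whether each subject contains a match.
verdicts = [(bool(nested.search(s)), bool(flat.search(s))) for s in subjects]
agree = all(a == b for a, b in verdicts)
```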
2.16. Test for a Match Without Adding It to the Overall Match
, <b> </b>
HTML,
. , My <b>cat</b> is furry,
cat.
(?<=<b>)\w+(?=</b>)
Regex options: Case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby 1.9
JavaScript and Ruby 1.8 support the lookahead (?=</b>) but not the lookbehind (?<=<b>).
, ,
,
. -
, ,
.
.
, , . (?<=text) .
(?<= . , text , .
, , (?<=<b>) , .
, ,
, , ,
. My <b>cat</b> is furry (?<=<b>) , c.
,
. <b>
c. , , . , , , . c . <b> . , ,
. .
,
. . , . (?=regex) . (?=
. , , regex .
(?!regex) . , , , ,
, , ,
.
.
. ,
. , .
(?<!text) . ,
, , .
.
, , . ,
, .
. , .
. . : ,
, .
. Perl, Python Ruby 1.9 ,
. (?<=one|two|three|fortytwo|gr[ae]y) ,
.
PCRE Java .
, . ,
, * , + {42,} .
PCRE Java , .
. ,
,
, .
, .
.NET 1,
. ,
.
,
, - , -
.
, .
2.3 , . 58 ( 2.3) , Thai.
.NET
Java.
, ()
Thai ( ).
:
(?=\p{Thai})\p{N}
Regex options: None
Regex flavors: PCRE, Perl, Ruby 1.9
, , 2.7.
, .
, RegexBuddy,
, () , RegexOptions.RightToLeft, .NET, .
(?=\p{Thai})\p{N} , , , .
Thai ( \p{Thai} ), . ,
.
, , .
, , ,
. . 2.15 .
.
, , , . , ,
.
( ,
) . ,
, , . ,
, . , , .
,
, ,
.
:
(?=(\d+))\w+\1
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
123x12. \d+ 12 , \w+ 3x,
, , \1 12.
.
.
\d+ 123.
. , , , 123, .
12 \1 ,
123 12, .
.
\w+ , , , \d+ , . .
,
Python JavaScript, .
,
, .
.
(<b>)(\w+)(?=</b>)
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
<b> .
, \w+ , .
My <b>cat</b> is
furry, <b>cat. <b>, cat.
, cat (
<b>), , ,
, , .
,
,
. ,
, . ,
, .
2.21.
, . . , , , .
, ,
( \z $ ). . , , .
JavaScript :
var mainregexp = /\w+(?=<\/b>)/;
var lookbehind = /<b>$/;
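if (match = mainregexp.exec("My <b>cat</b> is furry")) {
    // Found a word before a closing tag </b>
    var potentialmatch = match[0];
    var leftContext = match.input.substring(0, match.index);
    if (lookbehind.exec(leftContext)) {
        // Lookbehind matched:
        // potentialmatch occurs between a pair of <b> tags
    } else {
        // Lookbehind failed:
        // potentialmatch is no good
    }
} else {
    // Unable to find a word before a closing tag </b>
}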
.
5.5 5.6.
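The lookaround regexes from this recipe can be exercised with a short sketch, assuming Python's re module. It covers the lookbehind/lookahead solution, the capturing-group alternative, and the (?=(\d+))\w+\1 example that illustrates why lookaround is atomic.

```python
import re

subject = "My <b>cat</b> is furry"

# Lookbehind and lookahead test for the tags without adding them to the match.
word = re.search(r"(?<=<b>)\w+(?=</b>)", subject, re.IGNORECASE)

# Alternative without lookbehind: match the opening tag and capture the word.
groups = re.search(r"(<b>)(\w+)(?=</b>)", subject, re.IGNORECASE)

# Lookaround is atomic: at each start position the group inside the lookahead
# keeps its first (longest) capture, so \1 never finds a second copy of it.
atomic = re.search(r"(?=(\d+))\w+\1", "123x12")
```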
2.17. Match One of Two Alternatives Based on a Condition
, one, two
three, .
.
\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}(?(1)|(?!))(?(2)|(?!))(?(3)|(?!))
Regex options: None
Regex flavors: .NET, JavaScript, PCRE, Perl, Python
Java Ruby . Java Ruby ( ) ,
,
.
\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, . . then else . , - . .
then else . .
, then . , , .
, (?!),
else. ,
, , . (?(1)|(?!)) , -.
,
,
.
.NET .
(?(name)then|else) , name .
, ,
(a)?b(?(1)c|d) . , , abc|bd) .
a, .
. , , ,
. a ,
-. .
(a?) , .
, .
a .
a , b . .
,
( ), c . d .
, (a)?b(?(1)c|d)
ab, c, b,
d.
.NET, PCRE Perl, Python, . (?(?=if)then|else) (?=if) , . , 2.16.
, then . else .
, then
else ,
if ,
.
. , , , then else.
, ,
, : (?=if)then|(?!if)else . ,
then .
.
. , if ,
, (?=if)
.
else . ,
if .
.
2.9 2.16.
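The conditional (a)?b(?(1)c|d) discussed above can be checked directly. This is a minimal sketch assuming Python's re module, which accepts the (?(1)then|else) form; the four subject strings are chosen for illustration.

```python
import re

# (?(1)c|d): if group 1 took part in the match, require "c", otherwise "d".
cond = re.compile(r"(a)?b(?(1)c|d)")

# fullmatch shows which complete strings the conditional accepts.
results = {s: bool(cond.fullmatch(s)) for s in ["abc", "bd", "abd", "bc"]}

# search may still find "bd" inside "abd" by starting at the second character.
inside = cond.search("abd")
```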
2.18. Add Comments to a Regular Expression
\d{4}-\d{2}-\d{2} yyyy-mmdd, . , , .
,
, .
\d{4}    # Year
-        # Separator
\d{2}    # Month
-        # Separator
\d{2}    # Day
Regex options: Free-spacing
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby
. , , .
, , JavaScript, , .
,
. .
.NET RegexOptions.IgnorePatternWhitespace. Java Pattern.COMMENTS. Python re.VERBOSE, Perl Ruby /x.
. (#), . ,
( , ).
, , .
, [#]
\# .
, , , ,
. ,
[] \ . \x20 \u0020 \x{0020} . , \t . \r\n (Windows) \n
(UNIX/Linux/OS X).
. .
, .
, .
Java,
, . Java .
Java . Java ,
. [] [#] . \u0020 \# .
(?#Year)\d{4}(?#Separator)-(?#Month)\d{2}-(?#Day)\d{2}
Regex options: None
Regex flavors: .NET, PCRE, Perl, Python, Ruby
- , (?#comment) . (?# ) .
, JavaScript, , . Java .
(?x)\d{4}    # Year
-            # Separator
\d{2}        # Month
-            # Separator
\d{2}        # Day
Regex options: None
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby
,
(?x) . , (?x) .
.
2.1
. 51.
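A short free-spacing sketch, assuming Python's re.VERBOSE flag as the way to turn the option on. The second pattern is a made-up illustration of how a literal space and a literal # have to be written once whitespace and comments are ignored.

```python
import re

# Whitespace is ignored and "#" starts a comment that runs to the end of the line.
date_re = re.compile(r"""
    \d{4}   # Year
    -       # Separator
    \d{2}   # Month
    -       # Separator
    \d{2}   # Day
""", re.VERBOSE)

is_date = date_re.fullmatch("2008-08-08") is not None

# A literal space or "#" must be escaped or placed in a character class.
spaced = re.compile(r"\d+ [ ] \# [#]", re.VERBOSE)
literal_ok = spaced.fullmatch("42 ##") is not None
```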
2.19. Insert Literal Text into the Replacement Text
,
, : $%\*$1\1.
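$%\*$$1\1
Replacement text flavors: .NET, JavaScript
\$%\\*\$1\\1
Replacement text flavor: Java
$%\*\$1\\1
Replacement text flavor: PHP
\$%\*\$1\\1
Replacement text flavor: Perl
$%\*$1\\1
Replacement text flavors: Python, Ruby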
, .
, , . .
,
,
,
. $1 / \1 . 2.21 , .
, ,
.
.NET JavaScript
.NET JavaScript .
, .
. , , , , , ,
. , .
, ,
.
, .
$$%\*$$1\1
: .NET, JavaScript
, .NET ,
. .NET ${group} . JavaScript .
Java
Java
. . , .
PHP
PHP , , , .
. , , \\\\ . .
Perl
Perl : . , , $1 , Perl . ,
, .
, Perl \1 . , ,
. , , , .
. , , \\\\ . .
Python Ruby
Python Ruby . , ,
.
Python \1 \9 \g< . .
Ruby ,
, , , .
,
. , , \\\\ . .
,
.
.
, replace()
, . ,
, , , , , .
RegexBuddy , .
, .
, , . .
.
3.14.
2.20. Insert the Regex Match into the Replacement Text
, URL
HTML, , URL .
, URL http:, . , Please visit http://www.regexcookbook.com Please visit <a href=http://www.regexcookbook.com>http://www.
regexcookbook.com</a>.
http:\S+
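Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
<a href="$&">$&</a>
Replacement text flavors: .NET, JavaScript, Perl
<a href="\0">\0</a>
Replacement text flavors: PHP, Ruby
<a href="\&">\&</a>
Replacement text flavor: Ruby
<a href="\g<0>">\g<0></a>
Replacement text flavor: Python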
, .
, - ,
Python.
Perl $& .
Perl .
.NET Ruby
, ,
. .
.
1
3.15.
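The URL-linking replacement from this recipe, sketched with Python's re.sub and the \g<0> token for the overall match. The subject string is the one used in the problem statement.

```python
import re

text = "Please visit http://www.regexcookbook.com"

# \g<0> inserts the overall regex match into the replacement text.
linked = re.sub(r"http:\S+", r'<a href="\g<0>">\g<0></a>', text)
```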
2.21. Insert Part of the Regex Match into the Replacement Text
10
, 1234567890.
, (123) 456-7890.
\b(\d{3})(\d{3})(\d{4})\b
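Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
($1) $2-$3
Replacement text flavors: .NET, Java, JavaScript, PHP, Perl
(${1}) ${2}-${3}
Replacement text flavors: .NET, PHP, Perl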
2.10 , , . , , . , , .
, Python Ruby, \1 ,
.
Perl $1 , . PHP
.
Perl $1 , .
, . .NET, Java,
JavaScript PHP $1
.
. 3.
$10
, ,
99 .
$10 \10 . 10, , 0 .
, . , , , . $23 23,
.
,
3 .
, ,
. $4 \4 , .
.
Java Python ,
. . (
.) , $4 \4
, . 2.19.
,
\b(?<area>\d{3})(?<exchange>\d{3})(?<number>\d{4})\b
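Regex options: None
Regex flavors: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\b(?'area'\d{3})(?'exchange'\d{3})(?'number'\d{4})\b
Regex options: None
Regex flavors: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\b(?P<area>\d{3})(?P<exchange>\d{3})(?P<number>\d{4})\b
Regex options: None
Regex flavors: PCRE 4 and later, Perl 5.10, Python
(${area}) ${exchange}-${number}
Replacement text flavor: .NET
(\g<area>) \g<exchange>-\g<number>
Replacement text flavor: Python
(\k<area>) \k<exchange>-\k<number>
Replacement text flavor: Ruby 1.9
(\k'area') \k'exchange'-\k'number'
Replacement text flavor: Ruby 1.9
($1) $2-$3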
,
.NET, Python Ruby 1.9 , .
.NET Python
, .
.
Ruby
, .
Ruby 1.9 : \k<group> \kgroup . .
.
1
2.9, 2.10, 2.11 3.15.
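The phone-number reformatting from this recipe, as a minimal sketch with Python's re.sub. It shows numbered backreferences, named backreferences via \g<name>, and the \g<1>0 form that keeps a following digit from being read as part of the group number.

```python
import re

# Numbered groups reinserted with \1, \2 and \3.
numbered = re.sub(r"\b(\d{3})(\d{3})(\d{4})\b", r"(\1) \2-\3", "1234567890")

# The same replacement with named groups and \g<name>.
named = re.sub(
    r"\b(?P<area>\d{3})(?P<exchange>\d{3})(?P<number>\d{4})\b",
    r"(\g<area>) \g<exchange>-\g<number>",
    "1234567890",
)

# \g<1>0 is group 1 followed by a literal "0"; \10 would refer to group 10.
unambiguous = re.sub(r"(\d)", r"\g<1>0", "7")
```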
2.22. Insert Match Context into the Replacement Text
, , ,
,
. , BeforeMatchAfter
Match, BeforeBeforeMatchAfterAfter, BeforeBeforeBeforeMatchAfterAfterAfter.
$`$`$&$'$'
Replacement text flavors: .NET, Perl
\`\`\&\'\'
Replacement text flavor: Ruby
$`$`$&$'$'
Replacement text flavor: JavaScript
, . : ,
. ,
, , , , , . , .
.NET Perl $` , $ $_ ,
. Perl ,
, , . $` .
U.S. , 1, . $ .
. , U.S.,
Enter. $_
. .NET Perl JavaScript
$` $ . JavaScript , .
, $& .
.
1
3.15.
,
. ,
; .
.
, ,
.
.
,
, , , . , - , , , ,
.
4 8 , .
. , , ,
,
.
,
, . . , ,
. ,
.
. 21 , .
. 25 , , . , , .
,
. , .
,
.
, .
C#
C# Microsoft
.NET. System.Text.RegularExpressions
.NET.
C# 1.0 3.5 Visual Studio
2002 2008.
VB.NET
VB.NET Visual Basic.NET,
, Visual Basic 2002 , Visual Basic 6 . Visual Basic Microsoft .NET. System.Text.RegularExpressions .NET. Visual Basic 2002
2008.
Java
Java 4 , java.util.regex. -
ereg
PHP
, PHP 5.3.0.
POSIX ERE.
.
POSIX ERE Ruby 1.9
PCRE. ,
ereg, mb_ereg preg. preg Perl
( 3.1).
Perl
, Perl , .
, m// s/// Perl, Perl. Perl 5.6,
5.8 5.10.
Python
Python re.
, ,
Python. Python 2.4 2.5.
Ruby
Ruby
. Ruby 1.8
1.9. Ruby
. Ruby 1.9
Onigurama, , Ruby 1.8. , ,
. 22.
Ruby 1.8 1.9.
, Ruby 1.9. Ruby,
, , , Ruby Onigurama. Ruby 1.8
Onigurama.
, , ,
.
, , .
ActionScript
ActionScript
ECMA-262, Adobe. 3.0,
ActionScript ECMA-262v3. JavaScript. ,
ActionScript JavaScript. , JavaScript,
ActionScript.
C
C . , , , ,
PCRE, . C http://
www.pcre.org. ,
.
C++
C++ . , , , , PCRE, .
C API
- C++,
PCRE (
http://www.pcre.org).
Delphi for Win32
Delphi
Win32 .
VCL,
. ,
PCRE. Delphi
C , VCL, PCRE,
. .exe.
, JavaScript, ECMA-262v3.
Internet Explorer 5.5 . ,
Windows XP
Vista, Windows, ,
Internet Explorer 5.5 . , Windows .
Visual
Basic, Project|References
VB. Microsoft VBScript
Regular Expressions 5.5,
Microsoft VBScript Regular Expressions 1.0. , 5.5, 1.0. 1.0 , .
, . View|Object Browser.
Object Browser VBScript_RegExp_55.
3.1. Literal Regular Expressions in Source Code
[$"'\n\d/\\] ,
. , , , , , 0 9, . .
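C#
As a normal string:
"[$\"'\n\\d/\\\\]"
As a verbatim string:
@"[$""'\n\d/\\]"
VB.NET
"[$""'\n\d/\\]"
Java
"[$\"'\n\\d/\\\\]"
JavaScript
/[$"'\n\d\/\\]/
PHP
'%[$"\'\n\d/\\\\]%'
Perl
Pattern-matching operator:
/[\$"'\n\d\/\\]/
m![\$"'\n\d/\\]!
Substitution operator:
s![\$"'\n\d/\\]!!
Python
Raw triple-quoted string:
r"""[$"'\n\d/\\]"""
Normal string:
"[$\"'\n\\d/\\\\]"
Ruby
Literal regex delimited with forward slashes:
/[$"'\n\d\/\\]/
Literal regex delimited with punctuation of your choice:
%r![$"'\n\d/\\]!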
( , ), .
. , RegexBuddy
RegexPal,
. ,
, .
, . ,
,
, ,
. , - . , , , . , . , , .
.
.
C#
C# Regex() - Regex. , , .
C# . , , C++ Java. . , ,
\n . RegexOptions.IgnorePatternWhitespace
( 3.4) , 2.18, \n \\n.
\n ,
. \\n , \n , .
@ . , . -
, . @\n \n , , . \n ,
. .
:
C#, .
VB.NET
VB.NET
Regex() - Regex. , , .
Visual Basic .
. - .
Java
Java
Pattern.compile()
String. , ,
.
Java . .
, ,
\n , , \uFFFF .
JavaScript
JavaScript . . , .
, RegExp
, , .
, .
PHP
, preg PHP, . JavaScript Perl, .
.
ereg mb_ereg. Perl - PCRE
PHP .
Perl. , ,
Perl /regex/, preg PHP
/regex/. Perl, . -
, . , -, . ,
, .
, ,
Perl JavaScript Ruby.
PHP .
( ),
, , . .
,
.
Perl
Perl .
, . -
. - , , , $ @, ,
.
,
, m. (, ),
, m{regex}. , . .
. , , ,
.
.
m s . : s[regex]
[replace].
: s/regex/replace/.
Perl .
m/I am $name/ $name Jan, IamJan . ,
Perl $ ,
.
,
( 2.5).
. Perl
, ,
, ,
, , , . m/^regex$/ , , regex .
@ , Perl .
,
Perl,
.
Python
re Python
.
, Python. , , .
(raw) .
Python -
. . r\d+ , \\d+,
.
, , . ,
, . ,
Python .
.
, 2.7, .
, u.
, \n.
. re. , . \n -
.
re.VERBOSE ( 3.4) , 2.18, \n
\\n r\n . \n , . \\n
r\n , \n , .
, r\n, , . -
, \n ,
, .
Ruby
Ruby .
. , .
, r% .
, Regex
, , .
, .
Ruby JavaScript, , Ruby
Regexp, JavaScript
RegExp, camel caps.
.
2.3 ,
, .
3.4 , , .
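To make the difference between raw and normal Python strings concrete, here is a small sketch. It assumes the character class under discussion is [$"'\n\d/\\], with both quote characters included; the sample subject is made up.

```python
import re

raw = r"\d+\n"      # backslashes reach the re module untouched
normal = "\\d+\\n"  # every backslash doubled for the string literal

same_text = raw == normal            # both spell the same five characters
hits = re.findall(raw, "12\n34\n")   # the regex escape \n matches a line feed

# A raw triple-quoted string can hold both kinds of quote without escaping.
charclass = re.compile(r"""[$"'\n\d/\\]""")
found = charclass.findall('a$b"c\\d7')
```

Raw strings keep the regex readable because what appears in the source is what the regex engine receives.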
3.2. Import the Regular Expression Library
, .
, ,
.
C#
using System.Text.RegularExpressions;
VB.NET
Imports System.Text.RegularExpressions
Java
import java.util.regex.*;
Python
import re
. -, .
, .
. .
C#
C# using, ,
. , System.Text.RegularExpressions.Regex()
Regex().
VB.NET
VB.NET Imports,
, -
. , System.Text.RegularExpressions.
Regex() Regex().
Java
Java , java.
util.regex.
JavaScript
JavaScript .
PHP
preg PHP
4.2.0 .
Perl
Perl
.
Python
Python, re.
Ruby
Ruby .
3.3. Creating Regular Expression Objects
-
, .
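C#
If you know the regex to be correct:
Regex regexObj = new Regex("regex pattern");
If the regex is provided by the end user (UserInput being a string variable):
try {
    Regex regexObj = new Regex(UserInput);
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
VB.NET
If you know the regex to be correct:
Dim RegexObj As New Regex("regex pattern")
If the regex is provided by the end user (UserInput being a string variable):
Try
    Dim RegexObj As New Regex(UserInput)
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try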
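Java
If you know the regex to be correct:
Pattern regex = Pattern.compile("regex pattern");
If the regex is provided by the end user (userInput being a string variable):
try {
    Pattern regex = Pattern.compile(userInput);
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
To use the regex on a string, create a Matcher:
Matcher regexMatcher = regex.matcher(subjectString);
To use the regex on another string, create a new Matcher as just shown, or reuse the existing one:
regexMatcher.reset(anotherSubjectString);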
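JavaScript
Literal regular expression in your code:
var myregexp = /regex pattern/;
Regular expression retrieved from user input, as a string stored in the variable userinput:
var myregexp = new RegExp(userinput);
Perl
$myregex = qr/regex pattern/
Regular expression retrieved from user input, as a string stored in the variable $userinput:
$myregex = qr/$userinput/
Python
reobj = re.compile("regex pattern")
Regular expression retrieved from user input, as a string stored in the variable userinput:
reobj = re.compile(userinput)
Ruby
Literal regular expression in your code:
myregexp = /regex pattern/;
Regular expression retrieved from user input, as a string stored in the variable userinput:
myregexp = Regexp.new(userinput);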
, .
. , , , ,
. , , . , -
, ,
.
.NET
C# VB.NET System.Text.RegularExpressions.Regex.
: .
, Regex() ArgumentException. , . , ,
.
, , ,
.
, . , , ,
, .
, Regex. . Regex, , Regex, .
- ,
Regex, . Regex , , 15 . , Regex.CacheSize.
.
.
, , , .
Java
Java Pattern
. Pattern.compile(), : .
,
Pattern.compile() PatternSyntaxException.
, .
,
,
. , , ,
.
,
. , , ,
, .
, Pattern , String. , ,
. .
.
Pattern .
Matcher. Matcher,
matcher() .
matcher() .
matcher() , . ,
, ,
. Pattern
Matcher . , Pattern.compile()
.
, Matcher,
reset(), . , Matcher.
reset() Matcher, , , ,
regexMatcher.reset(nextString).find().
JavaScript
, 3.2,
. , .
(, , ), RegExp(). ,
. JavaScript RegExp , .
, JavaScript
.
, , . .
PHP
PHP . ,
- , preg .
preg 4 096 . , , ,
, , , ,
Perl
Perl qr// . , ,
3.1, , m,
qr.
Perl .
qr//
. 3.5.
qr// , (,
). qr/$regexstring/
,
$regexstring.
m/$regexstring/ , m/$regexstring/o . /o 3.4.
Python
Python, re, compile(), .
, compile(). , re,
compile(), .
compile() 100 . 100 . , .
, ,
re. , compile().
Ruby
, 3.2, .
,
.
(, , ), Regexp.new()
Regexp.compile(). , . Ruby Regexp ,
.
, Ruby .
,
,
.
.
CIL
C#
Regex regexObj = new Regex("regex pattern", RegexOptions.Compiled);
VB.NET
Dim RegexObj As New Regex("regex pattern", RegexOptions.Compiled)
.
3.1, 3.2 3.4.
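The other languages in this recipe wrap user-supplied patterns in exception handling. A comparable sketch for Python is shown below; the helper name compile_user_regex is hypothetical, and re.error is the exception the re module raises for an invalid pattern.

```python
import re

def compile_user_regex(pattern):
    """Return a compiled regex, or None if the pattern has a syntax error."""
    try:
        return re.compile(pattern)
    except re.error:
        # Syntax error in the regular expression
        return None

good = compile_user_regex(r"\d+")
bad = compile_user_regex(r"(unclosed")

# A compiled object can be reused on many subject strings.
reused = [bool(good.search(s)) for s in ["a1", "bb", "c3"]]
```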
3.4. Setting Regular Expression Options
: , ,
^ $ .
C#
Regex regexObj = new Regex("regex pattern",
    RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase |
    RegexOptions.Singleline | RegexOptions.Multiline);
VB.NET
Dim RegexObj As New Regex("regex pattern",
    RegexOptions.IgnorePatternWhitespace Or RegexOptions.IgnoreCase Or
    RegexOptions.Singleline Or RegexOptions.Multiline)
Java
Pattern regex = Pattern.compile("regex pattern",
    Pattern.COMMENTS | Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE |
    Pattern.DOTALL | Pattern.MULTILINE);
JavaScript
Literal regular expression in your code:
var myregexp = /regex pattern/im;
Regular expression retrieved from user input, as a string:
var myregexp = new RegExp(userinput, "im");
PHP
$regexstring = '/regex pattern/simx';
Perl
m/regex pattern/simx;
Python
reobj = re.compile("regex pattern",
    re.VERBOSE | re.IGNORECASE |
    re.DOTALL | re.MULTILINE)
Ruby
Literal regular expression in your code:
myregexp = /regex pattern/mix;
Regular expression retrieved from user input, as a string:
myregexp = Regexp.new(userinput,
    Regexp::EXTENDED | Regexp::IGNORECASE |
    Regexp::MULTILINE);
, ,
, ,
.
. ,
, , . .
,
. .
, .
.NET
Regex() . RegexOptions.
Free-spacing: RegexOptions.IgnorePatternWhitespace
Case insensitive: RegexOptions.IgnoreCase
Dot matches line breaks: RegexOptions.Singleline
^ and $ match at line breaks: RegexOptions.Multiline
Java
Pattern.compile() . Pattern
.
,
|.
Free-spacing: Pattern.COMMENTS
Case insensitive: Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE
Dot matches line breaks: Pattern.DOTALL
^ and $ match at line breaks: Pattern.MULTILINE
,
, , , .
Pattern.CASE_INSENSITIVE,
A Z.
, . , Pattern.UNICODE_CASE,
,
, -
JavaScript
, - JavaScript, RegExp , . /i /m,
.
.
RegExp()
. , . .
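Free-spacing: Not supported by JavaScript
Case insensitive: /i
Dot matches line breaks: Not supported by JavaScript
^ and $ match at line breaks: /m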
PHP
3.1 , preg PHP ,
, , .
, .
,
,
. /x, ,
.
Free-spacing: /x
Case insensitive: /i
Dot matches line breaks: /s
^ and $ match at line breaks: /m
Perl
-
.
/x, ,
.
Free-spacing: /x
Case insensitive: /i
Dot matches line breaks: /s
^ and $ match at line breaks: /m
Python
compile() ( )
. |,
, re. re, , .
.
. .
,
, .
, ,
Perl.
Free-spacing: re.VERBOSE or re.X
Case insensitive: re.IGNORECASE or re.I
Dot matches line breaks: re.DOTALL or re.S
^ and $ match at line breaks: re.MULTILINE or re.M
Ruby
- Ruby, Regexp , . /i /m,
. .
Regexp.new()
. nil, , Regexp, or.
Free-spacing: /x or Regexp::EXTENDED
Case insensitive: /i or Regexp::IGNORECASE
Dot matches line breaks: /m or Regexp::MULTILINE.
Ruby m multi line, ,
s single line.
^ $ : c Ruby .
.
\A \Z .
,
.NET
RegexOptions.ExplicitCapture , , . (group) (?:group) .
,
, (?:group) .
RegexOptions.ExplicitCapture (?n) . 2.9, 2.11 .
Java
Java Pattern.CANON_EQ,
. . 81,
. -
, Pattern.UNIX_LINES
\n , ,
. , .
JavaScript
,
, , /g,
global ().
PHP
/u PCRE ,
UTF-8.
, \p{FFFF} \p{L} .
2.7. PCRE , .
/U , . .* , .*? . /U .* , .*? .
,
, /U, PHP. ,
/U /u, - . .
Perl
(, -
, ),
/g, global ().
, m/I am
$name/, Perl
, $name . /o.
m/I am $name/o
, . $name , . , , 3.3.
Python
Python , ( 2.6)
\w , \d \s , ( 2.3). ASCII, .
re.LOCALE, re.L, . , ,
. , , , .
re.UNICODE, re.U,
. , , , . , , , .
Ruby
Regexp.new()
, , . ,
, .
.
,
. .
:
n
None (). , . ASCII.
e
EUC .
s
Shift-JIS.
u
UTF-8,
, ( ,
).
, /n, /e, /s /u.
. /x, /i /m.
/s Ruby Perl, Java .NET. Ruby
/s Shift-JIS. Perl
.
Ruby /m.
.
, , 2. .
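Free-spacing: Recipe 2.18
Case insensitive: Recipe 2.1 (p. 51)
Dot matches line breaks: Recipe 2.4
^ and $ match at line breaks: Recipe 2.5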
3.1 3.3 . .
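A compact sketch of all four options working together, assuming Python's re flags. The pattern and subject are made up so that each flag is actually needed for the match to succeed.

```python
import re

flags = re.VERBOSE | re.IGNORECASE | re.DOTALL | re.MULTILINE

reobj = re.compile(r"""
    ^ start    # ^ matches after a line break (re.MULTILINE)
    . end $    # the dot matches the line break itself (re.DOTALL)
""", flags)

# "START" and "END" match despite the case difference (re.IGNORECASE), and the
# spaces and comments in the pattern are ignored (re.VERBOSE).
match = reobj.search("intro\nSTART\nEND\n")

# Without the flags, the same pattern text does not match this subject.
no_flags = re.search(r"^start.end$", "intro\nSTART\nEND\n")
```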
3.5. Test Whether a Match Can Be Found Within a Subject String
,
. . , regexpattern The regex pattern can be found. - , , .
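C#
For a quick one-off test, use the static call:
bool foundMatch = Regex.IsMatch(subjectString, "regex pattern");
If the regex is provided by the end user, use the static call with full exception handling:
bool foundMatch = false;
try {
    foundMatch = Regex.IsMatch(subjectString, UserInput);
} catch (ArgumentNullException ex) {
    // Cannot pass null as the regular expression or subject string
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
To use the same regex repeatedly, construct a Regex object:
Regex regexObj = new Regex("regex pattern");
bool foundMatch = regexObj.IsMatch(subjectString);
If the regex is provided by the end user, use the Regex object with full exception handling:
bool foundMatch = false;
try {
    Regex regexObj = new Regex(UserInput);
    try {
        foundMatch = regexObj.IsMatch(subjectString);
    } catch (ArgumentNullException ex) {
        // Cannot pass null as the regular expression or subject string
    }
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}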
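VB.NET
For a quick one-off test, use the static call:
Dim FoundMatch = Regex.IsMatch(SubjectString, "regex pattern")
If the regex is provided by the end user, use the static call with full exception handling:
Dim FoundMatch As Boolean
Try
    FoundMatch = Regex.IsMatch(SubjectString, UserInput)
Catch ex As ArgumentNullException
    'Cannot pass Nothing as the regular expression or subject string
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try
To use the same regex repeatedly, construct a Regex object:
Dim RegexObj As New Regex("regex pattern")
Dim FoundMatch = RegexObj.IsMatch(SubjectString)
If the regex is provided by the end user, use the Regex object with full exception handling:
Dim FoundMatch As Boolean
Try
    Dim RegexObj As New Regex(UserInput)
    Try
        FoundMatch = RegexObj.IsMatch(SubjectString)
    Catch ex As ArgumentNullException
        'Cannot pass Nothing as the regular expression or subject string
    End Try
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try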
Java
The only way to test for a partial match is to create a Matcher:
Pattern regex = Pattern.compile("regex pattern");
Matcher regexMatcher = regex.matcher(subjectString);
boolean foundMatch = regexMatcher.find();
If the regex is provided by the end user, use exception handling:
boolean foundMatch = false;
try {
    Pattern regex = Pattern.compile(UserInput);
    Matcher regexMatcher = regex.matcher(subjectString);
    foundMatch = regexMatcher.find();
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
JavaScript
if (/regex pattern/.test(subject)) {
    // Successful match
} else {
    // Match attempt failed
}
PHP
if (preg_match('/regex pattern/', $subject)) {
    # Successful match
} else {
    # Match attempt failed
}
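Perl
With the subject string held in the special variable $_:
if (m/regex pattern/) {
    # Successful match
} else {
    # Match attempt failed
}
With the subject string held in the variable $subject:
if ($subject =~ m/regex pattern/) {
    # Successful match
} else {
    # Match attempt failed
}
Using a precompiled regular expression:
$regex = qr/regex pattern/;
if ($subject =~ $regex) {
    # Successful match
} else {
    # Match attempt failed
}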
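Python
For a quick one-off test, use the global function:
if re.search("regex pattern", subject):
    # Successful match
    pass
else:
    # Match attempt failed
    pass
To use the same regex repeatedly, use a compiled object:
reobj = re.compile("regex pattern")
if reobj.search(subject):
    # Successful match
    pass
else:
    # Match attempt failed
    pass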
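Ruby
if subject =~ /regex pattern/
    # Successful match
else
    # Match attempt failed
end
This code does exactly the same thing:
if /regex pattern/ =~ subject
    # Successful match
else
    # Match attempt failed
end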
.
,
, .
, , - . , .
,
.
. , (,
), .
C# VB.NET
Regex
IsMatch(), . IsMatch() . . , . null.
IsMatch() ArgumentNullException.
,
Regex.IsMatch(), Regex.
,
. ,
IsMatch() ArgumentException. , true , , false
.
, , Regex, IsMatch() . . , . , , , .
, .
. IsMatch()
ArgumentOutOfRangeException.
, , .
IsMatch() , . ,
Regex.Match(subject, start, stop) Success Match.
3.8.
Java
, Matcher,
3.3. find()
Matcher.
String.matches(), Pattern.matches() Matcher.matches() , , .
JavaScript
, test() .
.
regexp.test() true, , false
.
PHP
preg_match() . : , .
, preg_match() 1,
,
0.
,
preg_match(), .
Perl
Perl m//
, . //m
$_.
m// , =~,
. true,
,
false .
, !~,
=~.
Python
re search(), . ,
.
.
re.search() re.compile() search() .
: .
, search()
MatchObject. ,
search() None. if MatchObject ,
True, None , False. , , MatchObject.
search() match().
match() . match()
.
Ruby
=~ .
, . . , nil.
, Regexp String.
Ruby 1.8 ,
. Ruby 1.9
, . 3.9.
Ruby
=~, .
Perl, Ruby =~,
Ruby 1.9,
, - .
.
3.6 3.7.
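The distinction drawn above between search() and match() in Python can be seen in a few lines. The subject string is the one from the problem statement of this recipe.

```python
import re

subject = "The regex pattern can be found"

# search() scans the whole subject for a match.
found_anywhere = re.search(r"regex pattern", subject) is not None

# match() only attempts the regex at the very start of the subject.
found_at_start = re.match(r"regex pattern", subject) is not None

# A compiled object offers the same two methods.
reobj = re.compile(r"regex pattern")
compiled_found = bool(reobj.search(subject))
```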
3.6. Test Whether a Regex Matches the Subject String Entirely
,
. , , , . , ,
regexpattern , , regex pattern, The regex pattern can be found.
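C#
For a quick one-off test, use the static call:
bool foundMatch = Regex.IsMatch(subjectString, @"\Aregex pattern\Z");
To use the same regex repeatedly, construct a Regex object:
Regex regexObj = new Regex(@"\Aregex pattern\Z");
bool foundMatch = regexObj.IsMatch(subjectString);
VB.NET
For a quick one-off test, use the static call:
Dim FoundMatch = Regex.IsMatch(SubjectString, "\Aregex pattern\Z")
To use the same regex repeatedly, construct a Regex object:
Dim RegexObj As New Regex("\Aregex pattern\Z")
Dim FoundMatch = RegexObj.IsMatch(SubjectString)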
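Java
To test only one string, use the static call:
boolean foundMatch = subjectString.matches("regex pattern");
To use the same regex on multiple strings, compile it and create a Matcher:
Pattern regex = Pattern.compile("regex pattern");
Matcher regexMatcher = regex.matcher(subjectString);
boolean foundMatch = regexMatcher.matches();
JavaScript
if (/^regex pattern$/.test(subject)) {
    // Successful match
} else {
    // Match attempt failed
}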
PHP
if (preg_match('/\Aregex pattern\Z/', $subject)) {
    # Successful match
} else {
    # Match attempt failed
}
Perl
if ($subject =~ m/\Aregex pattern\Z/) {
    # Successful match
} else {
    # Match attempt failed
}
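Python
For a quick one-off test, use the global function:
if re.match(r"regex pattern\Z", subject):
    # Successful match
    pass
else:
    # Match attempt failed
    pass
To use the same regex repeatedly, use a compiled object:
reobj = re.compile(r"regex pattern\Z")
if reobj.match(subject):
    # Successful match
    pass
else:
    # Match attempt failed
    pass
Ruby
if subject =~ /\Aregex pattern\Z/
    # Successful match
else
    # Match attempt failed
end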
,
- . , - ,
. , , , , .
IP- , .
, $ \Z ,
( 3.21),
. 2.5,
, .
C# VB.NET
Regex .NET , , .
,
\A , ,
\Z , . ,
. , one|two|three ,
: \A(?:one|two|three)\Z .
, , IsMatch(), .
Java
Java matches(). , . ,
.
matches() String
. true false
.
matches() Pattern : , . Pattern.matches() CharSequence.
String.matches() Pattern.matches().
, String.matches() Pattern.matches(),
, Pattern.
compile(regex).matcher(subjectString).matches(). , (, )
.
. , PatternSyntaxException.
-
JavaScript
JavaScript , ,
.
,
^ ,
$ . /m . /m
. /m .
regexp.test(), .
PHP
PHP , , . ,
\A , , \Z , . , . ,
one|two|three ,
: \A(?:one|two|three)\Z .
, ,
preg_match(), .
Perl
Perl ,
. ,
,
\A , , \Z , . , . ,
one|two|three ,
: \A(?:one|two|three)\Z .
, , ,
.
Python
match() search(),
. ,
match() , . , match() None. search(), ,
, ,
.
match() , . , , .
,
, \Z , .
Ruby
Regex Ruby , ,
.
,
\A , ,
\Z , . , .
, one|two|three , : \A(?:one|two|three)\Z .
, , =~, .
.
, ,
2.5.
2.8 2.9 .
- , . ,
, , .
, 3.5.
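The advice above about wrapping alternation in a non-capturing group before adding anchors can be sketched in Python. The helper name matches_entirely is hypothetical; it relies on re.match anchoring the start and on \Z anchoring the very end of the subject.

```python
import re

def matches_entirely(pattern, subject):
    # re.match anchors at the start; (?:...)\Z anchors the end, and the group
    # makes the anchor apply to every alternative rather than only the last.
    return re.match(r"(?:%s)\Z" % pattern, subject) is not None

whole = matches_entirely("one|two|three", "two")
partial = matches_entirely("one|two|three", "twofold")

# Without the group, \Z binds only to "three", so "twofold" slips through.
unguarded = re.match(r"one|two|three\Z", "twofold") is not None
```

Where it is available, re.fullmatch() performs the same whole-string test without manual anchors.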
3.7. Retrieve the Matched Text
,
. . , . , \d+ Do you like 13 or 42?
13.
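C#
For a quick one-off match, use the static call:
string resultString = Regex.Match(subjectString, @"\d+").Value;
If the regex is provided by the end user, use the static call with full exception handling:
string resultString = null;
try {
    resultString = Regex.Match(subjectString, @"\d+").Value;
} catch (ArgumentNullException ex) {
    // Cannot pass null as the regular expression or subject string
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
To use the same regex repeatedly, construct a Regex object:
Regex regexObj = new Regex(@"\d+");
string resultString = regexObj.Match(subjectString).Value;
If the regex is provided by the end user, use the Regex object with full exception handling:
string resultString = null;
try {
    Regex regexObj = new Regex(@"\d+");
    try {
        resultString = regexObj.Match(subjectString).Value;
    } catch (ArgumentNullException ex) {
        // Cannot pass null as the subject string
    }
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}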
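VB.NET
For a quick one-off match, use the static call:
Dim ResultString = Regex.Match(SubjectString, "\d+").Value
If the regex is provided by the end user, use the static call with full exception handling:
Dim ResultString As String = Nothing
Try
    ResultString = Regex.Match(SubjectString, "\d+").Value
Catch ex As ArgumentNullException
    'Cannot pass Nothing as the regular expression or subject string
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try
To use the same regex repeatedly, construct a Regex object:
Dim RegexObj As New Regex("\d+")
Dim ResultString = RegexObj.Match(SubjectString).Value
If the regex is provided by the end user, use the Regex object with full exception handling:
Dim ResultString As String = Nothing
Try
    Dim RegexObj As New Regex("\d+")
    Try
        ResultString = RegexObj.Match(SubjectString).Value
    Catch ex As ArgumentNullException
        'Cannot pass Nothing as the subject string
    End Try
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try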
Java
Create a Matcher to run the search and store the result:
String resultString = null;
Pattern regex = Pattern.compile("\\d+");
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
    resultString = regexMatcher.group();
}
If the regex is provided by the end user, use full exception handling:
String resultString = null;
try {
    Pattern regex = Pattern.compile("\\d+");
    Matcher regexMatcher = regex.matcher(subjectString);
    if (regexMatcher.find()) {
        resultString = regexMatcher.group();
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
JavaScript
var result = subject.match(/\d+/);
if (result) {
    result = result[0];
} else {
    result = '';
}
PHP
if (preg_match('/\d+/', $subject, $groups)) {
    $result = $groups[0];
} else {
    $result = '';
}
Perl
if ($subject =~ m/\d+/) {
    $result = $&;
} else {
    $result = '';
}
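Python
For a quick one-off match, use the global function:
matchobj = re.search("regex pattern", subject)
if matchobj:
    result = matchobj.group()
else:
    result = ""
To use the same regex repeatedly, use a compiled object:
reobj = re.compile("regex pattern")
matchobj = reobj.search(subject)
if matchobj:
    result = matchobj.group()
else:
    result = ""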
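Ruby
You can use the =~ operator and its magic $& variable:
if subject =~ /regex pattern/
    result = $&
else
    result = ""
end
Alternatively, you can call the match() method on a Regexp object:
matchobj = /regex pattern/.match(subject)
if matchobj
    result = matchobj[0]
else
    result = ""
end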
, ,
, . , , . ,
.
.NET
Regex .NET -,
, .
Match(), Match.
Match Value, , .
,
Match, Value
.
Match(). ,
. null. Match()
ArgumentNullException.
, .
. .
, ArgumentException.
, , Regex
Match() . . ,
. , ,
. , . .
Match() ArgumentOutOfRangeException.
Java
, ,
Matcher,
3.3, find() . find() true, group() , , .
find() false, group() , IllegalStateException.
Matcher.find() . , ,
. , .
, IndexOutOfBoundsException.
, find() , ,
. find() Pattern.
matcher() Matcher.reset(), .
JavaScript
string.match()
. ,
. , string.match() regexp.
, string.match()
null. ,
.
, ,
null
.
, string.match()
.
, .
/g. string.match()
, 3.10.
PHP
preg_match(), , , , . preg_match() 1, . . 3.9.
Perl
m// , .
, $&, , .
.
Python
3.5 search().
MatchObject, search(), . , , group() .
Ruby
3.8 $~ MatchData. , .
, .
$& , .
$~[0],
.
.
3.5, 3.8, 3.9, 3.10 3.11.
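The \d+ example from the problem statement of this recipe, wrapped in a small Python helper. The function name first_match is hypothetical; returning an empty string when nothing matches mirrors the solutions above.

```python
import re

def first_match(pattern, subject):
    """Return the text of the first match, or an empty string if there is none."""
    matchobj = re.search(pattern, subject)
    return matchobj.group() if matchobj else ""

first = first_match(r"\d+", "Do you like 13 or 42?")
nothing = first_match(r"\d+", "no digits here")
```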
3.8. Determine the Position and Length of the Match
, , , .
, .
C#
For a quick one-off match, use the static call:
int matchstart = -1, matchlength = -1;
Match matchResult = Regex.Match(subjectString, @"\d+");
if (matchResult.Success) {
    matchstart = matchResult.Index;
    matchlength = matchResult.Length;
}
To use the same regex repeatedly, construct a Regex object:
int matchstart = -1, matchlength = -1;
Regex regexObj = new Regex(@"\d+");
Match matchResult = regexObj.Match(subjectString);
if (matchResult.Success) {
    matchstart = matchResult.Index;
    matchlength = matchResult.Length;
}
VB.NET
For a quick one-off match, use the static call:
Dim MatchStart = -1
Dim MatchLength = -1
Dim MatchResult = Regex.Match(SubjectString, "\d+")
If MatchResult.Success Then
    MatchStart = MatchResult.Index
    MatchLength = MatchResult.Length
End If
To use the same regex repeatedly, construct a Regex object:
Dim MatchStart = -1
Dim MatchLength = -1
Dim RegexObj As New Regex("\d+")
Dim MatchResult = RegexObj.Match(SubjectString)
If MatchResult.Success Then
    MatchStart = MatchResult.Index
    MatchLength = MatchResult.Length
End If
Java
int matchStart = -1, matchLength = -1;
Pattern regex = Pattern.compile("\\d+");
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
    matchStart = regexMatcher.start();
    matchLength = regexMatcher.end() - matchStart;
}
JavaScript
var matchstart = -1;
var matchlength = -1;
var match = /\d+/.exec(subject);
if (match) {
    matchstart = match.index;
    matchlength = match[0].length;
}
PHP
if (preg_match('/\d+/', $subject, $groups, PREG_OFFSET_CAPTURE)) {
    $matchstart = $groups[0][1];
    $matchlength = strlen($groups[0][0]);
}
Perl
if ($subject =~ m/\d+/g) {
    $matchlength = length($&);
    $matchstart = length($`);
}
Python
For a quick one-off match, use the global function:
matchobj = re.search(r"\d+", subject)
if matchobj:
    matchstart = matchobj.start()
    matchlength = matchobj.end() - matchstart
To use the same regex repeatedly, use a compiled object:
reobj = re.compile(r"\d+")
matchobj = reobj.search(subject)
if matchobj:
    matchstart = matchobj.start()
    matchlength = matchobj.end() - matchstart
Ruby
You can use the =~ operator and its magic $~ variable:
if subject =~ /regex pattern/
    matchstart = $~.begin(0)
    matchlength = $~.end(0) - matchstart
end
Alternatively, you can call the match() method on a Regexp object:
matchobj = /regex pattern/.match(subject)
if matchobj
    matchstart = matchobj.begin(0)
    matchlength = matchobj.end(0) - matchstart
end
.NET
Regex.Match(), .
Index Length Match, Regex.Match().
Index , . -
, Index . ,
Index . Index . , .
, , \Z , .
Length . , . , , \b , .
, Regex.
Match() Match, Index Length . . ,
\A , . Match.Index Match.Length, ,
.
Match.Success.
Java
, Matcher.find(), . find() true, Matcher.start()
, . end() , . , , , . start()
end(), find(),
IllegalStateException.
JavaScript
exec() regexp .
. index , . , index
. . ,
length.
,
regexp.exec() null.
lastIndex ,
exec(), . JavaScript lastIndex , regexp. regexp.lastIndex .
- ( 3.11). ,
, ,
match.index match[0].length.
PHP
, ,
, preg_
match(). , PREG_OFFSET_CAPTURE. ,
preg_match() , 1.
, , , . PREG_OFSET_
CAPTURE, . ( ), , , ( ). : , .
,
, .
strlen(), . 1 , , .
Perl
, $&, . , $`, , .
Python
start() MatchObject , . end()
, . , .
start() end() ,
. , start(1) , end(2) . Python
99 . 0 . start() end()
, , ( 99), IndexError. , , , start() end() -1.
, span() .
Ruby
3.5 =~. $~ MatchData.
.
,
=~; , -
- .
, match() Regexp.
. ,
MatchData nil . ,
MatchData $~, MatchData, . MatchData . 3.7
3.9 , , .
begin() , . end() -
, . offset() . . , , 0.
, ,
. , begin(1) .
length() size(), . ,
MatchData ,
, 3.9.
.
3.5 3.9.
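A brief Python sketch of the start(), end() and span() methods described in this recipe, using the \d+ example subject from Recipe 3.7. Positions are zero-based offsets into the subject string.

```python
import re

subject = "Do you like 13 or 42?"
matchobj = re.search(r"\d+", subject)

matchstart = matchobj.start()               # offset of the first matched character
matchlength = matchobj.end() - matchstart   # end() points just past the match
both = matchobj.span()                      # (start, end) as a tuple

# Slicing the subject with these offsets gives back the matched text.
recovered = subject[matchstart:matchstart + matchlength]
```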
3.9. Retrieve Part of the Matched Text
3.7, ,
,
.
, , 2.9.
, Please visit http://www.regexcookbook.com for more
information http://([a-z0-9.-]+)
http://www.regexcookbook.com. ,
,
www.regexcookbook.com. , , .
. 7 , URL.
C#
For a quick one-off match, use the static call:
string resultString = Regex.Match(subjectString,
    "http://([a-z0-9.-]+)").Groups[1].Value;
To use the same regex repeatedly, construct a Regex object:
Regex regexObj = new Regex("http://([a-z0-9.-]+)");
string resultString = regexObj.Match(subjectString).Groups[1].Value;
VB.NET
For a quick one-off match, use the static call:
Dim ResultString = Regex.Match(SubjectString,
    "http://([a-z0-9.-]+)").Groups(1).Value
To use the same regex repeatedly, construct a Regex object:
Dim RegexObj As New Regex("http://([a-z0-9.-]+)")
Dim ResultString = RegexObj.Match(SubjectString).Groups(1).Value
Java
String resultString = null;
Pattern regex = Pattern.compile("http://([a-z0-9.-]+)");
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
    resultString = regexMatcher.group(1);
}
JavaScript
var result = "";
var match = /http:\/\/([a-z0-9.-]+)/.exec(subject);
if (match) {
    result = match[1];
} else {
    result = "";
}
PHP
if (preg_match('%http://([a-z0-9.-]+)%', $subject, $groups)) {
    $result = $groups[1];
} else {
    $result = '';
}
Perl
if ($subject =~ m!http://([a-z0-9.-]+)!) {
    $result = $1;
} else {
    $result = '';
}
Python
For a quick one-off match, use the global function:
matchobj = re.search("http://([a-z0-9.-]+)", subject)
if matchobj:
    result = matchobj.group(1)
else:
    result = ""
To use the same regex repeatedly, use a compiled object:
reobj = re.compile("http://([a-z0-9.-]+)")
matchobj = reobj.search(subject)
if matchobj:
    result = matchobj.group(1)
else:
    result = ""
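Ruby
You can use the =~ operator and its magic numbered variables, such as $1:
if subject =~ %r!http://([a-z0-9.-]+)!
    result = $1
else
    result = ""
end
Alternatively, you can call the match() method on a Regexp object:
matchobj = %r!http://([a-z0-9.-]+)!.match(subject)
if matchobj
    result = matchobj[1]
else
    result = ""
end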
2.10 2.21 ,
,
.
, .
.
. , ,
,
, ,
.
. , , , , . , , , .
.NET
, - Regex.Match(), 3.7. Match
Groups. GroupCollection. , . Groups[1] , Groups[2]
, .
Groups Group . Group , Match,
Groups. Match.Groups[1].Value
, ,
, Match.Value
. Match.Groups[1].Index
Match.Groups[1].Length , . Index Length 3.8.
Groups[0] , . Match.Value Match.Groups[0].Value .
Groups
. ,
Groups[-1] Group, -1. Success. Groups[-1].Success false.
, Match.Groups.Count. Count
, Count
.NET: , , . Groups Groups[0] Groups[1].
Groups.Count 2.
Java
,
, , ,
. group(), start() end() Matcher
. ,
, , .
.
, . , , IndexOutOfBoundsException. , , group(n) null,
start(n) end(n) -1.
JavaScript
, exec() .
. , , ,
.
, regexp.exec() null.
PHP
3.7 , , , preg_match().
preg_match() 1, . .
, , .
.
.
preg_match() PREG_OFFSET_CAPTURE, , -
. .
. ,
.
Perl
m// ,
. $1, $2, $3 , , .
Python
,
3.7. group() . group(1)
, , group(2)
, . Python 99 . 0
. , , group()
IndexError. , , group()
None.
group() ,
. .
, groups() MatchObject. None , . groups()
, None
, .
, groups()
groupdict(). groupdict() ,
None , .
Ruby
3.8 $~ MatchData. , ,
, . 1,
. .
$1, $2 , . $1 $~[1];
. $2 .
,
.
C#
For a quick one-off match, use the static call:
string resultString = Regex.Match(subjectString,
    "http://(?<domain>[a-z0-9.-]+)").Groups["domain"].Value;
To use the same regex repeatedly, construct a Regex object:
Regex regexObj = new Regex("http://(?<domain>[a-z0-9.-]+)");
string resultString = regexObj.Match(subjectString).Groups["domain"].Value;
C# Group
. Groups , . , .NET , . Match.
Groups[nosuchgroup].Success false.
VB.NET
For a quick one-off match, use the static call:
Dim ResultString = Regex.Match(SubjectString,
    "http://(?<domain>[a-z0-9.-]+)").Groups("domain").Value
To use the same regex repeatedly, construct a Regex object:
Dim RegexObj As New Regex("http://(?<domain>[a-z0-9.-]+)")
Dim ResultString = RegexObj.Match(SubjectString).Groups("domain").Value
VB.NET Group
. Groups , . , .NET , . Match.
Groups[nosuchgroup].Success false.
PHP
if (preg_match('%http://(?P<domain>[a-z0-9.-]+)%', $subject, $groups)) {
    $result = $groups['domain'];
} else {
    $result = '';
}
, $groups .
. , .
$groups[0] , $groups[1] $groups[domain]
, .
Perl
if ($subject =~ m!http://(?<domain>[a-z0-9.-]+)!) {
    $result = $+{'domain'};
} else {
    $result = '';
}
Perl,
5.10. $+
. Perl
. $1 $+{name}
, .
Python
matchobj = re.search("http://(?P<domain>[a-z0-9.-]+)", subject)
if matchobj:
    result = matchobj.group("domain")
else:
    result = ""
, group() .
.
2.9, .
2.11, .
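The following sketch pulls the pieces of this recipe together for Python: retrieving a group by number and by name, and the None result for a group that did not participate in the match. The subject is the example sentence from the problem statement.

```python
import re

subject = "Please visit http://www.regexcookbook.com for more information"
matchobj = re.search(r"http://(?P<domain>[a-z0-9.-]+)", subject)

by_number = matchobj.group(1)        # groups are numbered from 1
by_name = matchobj.group("domain")   # the same group, looked up by name
whole = matchobj.group(0)            # group 0 is the overall match

# A group that did not participate in the match is reported as None.
optional = re.search(r"(a)?b", "b").group(1)
```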
3.10. Retrieve a List of All Matches
, . , ,
.
. , The lucky numbers are 7, 13, 16, 42, 65, and 99 \d+ : 7, 13,
16, 42, 65 99.
,
, .
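C#
For a quick one-off search, use the static call:
MatchCollection matchlist = Regex.Matches(subjectString, @"\d+");
To use the same regex repeatedly, construct a Regex object:
Regex regexObj = new Regex(@"\d+");
MatchCollection matchlist = regexObj.Matches(subjectString);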
VB.NET
Static call:
Dim MatchList = Regex.Matches(SubjectString, "\d+")
With a Regex object:
Dim RegexObj As New Regex("\d+")
Dim MatchList = RegexObj.Matches(SubjectString)
Java
List<String> resultList = new ArrayList<String>();
Pattern regex = Pattern.compile("\\d+");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    resultList.add(regexMatcher.group());
}
JavaScript
var list = subject.match(/\d+/g);
PHP
preg_match_all('/\d+/', $subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];
Perl
@result = $subject =~ m/\d+/g;
,
, . 2.9.
Python
Global function:
result = re.findall(r"\d+", subject)
Compiled object:
reobj = re.compile(r"\d+")
result = reobj.findall(subject)
Ruby
result = subject.scan(/\d+/)
.NET
Matches() Regex , . MatchCollection, .
.
. null. Matches()
ArgumentNullException.
, Matches(). , . .
, Regex
Matches().
. , . , , , . , . . Matches()
ArgumentOutOfRangeException.
, , .
Matches(), , . Regex.Match(subject, start, stop) ,
,
.
Java
Java , . , -
JavaScript
string.match(), ,
3.7. , : /g. 3.4.
/g match() . list[0]
, list[1] . list.length. string.match() null, .
.
/g string.match() .
,
, 3.11.
PHP
PHP preg_match(), . preg_match_all() . , , .
preg_match_all()
preg_match(): , , , , . , , .
PREG_PATTERN_ORDER, PREG_SET_ORDER.
, PREG_PATTERN_
ORDER.
PREG_PATTERN_ORDER
, , . , -
. preg_match(). , , , preg_match(),
, preg_match_all().
, preg_match_all().
, ,
PREG_PATTERN_ORDER .
, PREG_PATTERN_ORDER
. , preg_match_
all(%http://([a-z0-9.-]+)%, $subject, $result) $result[1] URL, .
PREG_SET_ORDER,
, . , preg_match_all(). , , . PREG_SET_ORDER, $result[0] , preg_match().
PREG_OFFSET_CAPTURE PREG_
PATTERN_ORDER PREG_SET_ORDER. , PREG_OFFSET_CAPTURE preg_match()
.
, , .
Perl
3.4 , /g,
. /g .
-, .
-
, . , . -
,
. , . 2.9.
Python
findall() re. , . .
re.findall() re.compile(), findall() . : .
findall() , re.findall().
,
findall() . ,
findall() . , .
,
.
,
findall(), . , ,
.
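How the result of findall() depends on the number of capturing groups is easy to check by experiment. This sketch reuses the "lucky numbers" sentence from the problem statement; the patterns with groups and the variable names are made up for illustration.

```python
import re

subject = "The lucky numbers are 7, 13, 16, 42, 65, and 99"

# No capturing groups: a list with the overall matches.
numbers = re.findall(r"\d+", subject)

# One capturing group: a list with the text of that group only.
tens = re.findall(r"(\d)\d\b", subject)

# Several capturing groups: a list of tuples, one tuple per match.
digit_pairs = re.findall(r"\b(\d)(\d)\b", subject)

# The findall() method of a compiled object also accepts a start position.
reobj = re.compile(r"\d+")
late_numbers = reobj.findall(subject, subject.index("42"))
```

With a single group the overall match is no longer returned, so wrap the whole pattern in a group (or make the groups noncapturing) when the full match text is needed.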
Ruby
scan() String
. . scan() ,
.
, scan() . .
scan() .
. -
, . .
, .
Ruby scan() , . , .
See Also
Recipes 3.7, 3.11, and 3.12.
3.11.
,
, .
.
C#
Static call:
Match matchResult = Regex.Match(subjectString, @"\d+");
while (matchResult.Success) {
    // Process the match stored in matchResult here
    matchResult = matchResult.NextMatch();
}
With a Regex object:
Regex regexObj = new Regex(@"\d+");
Match matchResult = regexObj.Match(subjectString);
while (matchResult.Success) {
    // Process the match stored in matchResult here
    matchResult = matchResult.NextMatch();
}
VB.NET
Static call:
Dim MatchResult = Regex.Match(SubjectString, "\d+")
While MatchResult.Success
    'Process the match stored in MatchResult here
    MatchResult = MatchResult.NextMatch
End While
With a Regex object:
Dim RegexObj As New Regex("\d+")
Dim MatchResult = RegexObj.Match(SubjectString)
While MatchResult.Success
    'Process the match stored in MatchResult here
    MatchResult = MatchResult.NextMatch
End While
Java
Pattern regex = Pattern.compile("\\d+");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    // Process the match stored in regexMatcher here
}
JavaScript
,
, exec() :
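var regex = /\d+/g;
var match = null;
while (match = regex.exec(subject)) {
  // Don't let browsers such as Firefox get stuck in an infinite loop
  // on zero-length matches
  if (match.index == regex.lastIndex) regex.lastIndex++;
  // Process the match stored in the match variable here
}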
, ,
:
var regex = /\d+/g;
var match = null;
while (match = regex.exec(subject)) {
  // Process the match stored in the match variable here
}
PHP
preg_match_all('/\d+/', $subject, $result, PREG_PATTERN_ORDER);
for ($i = 0; $i < count($result[0]); $i++) {
    # Matched text = $result[0][$i];
}
Perl
while ($subject =~ m/\d+/g) {
    # matched text = $&
}
Python
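Global function:
for matchobj in re.finditer(r"\d+", subject):
    # Process the match stored in matchobj here
    pass
Compiled object:
reobj = re.compile(r"\d+")
for matchobj in reobj.finditer(subject):
    # Process the match stored in matchobj here
    pass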
Ruby
subject.scan(/\d+/) {|match|
  # Process the match stored in the match variable here
}
.NET
3.7 , - Match() Regex .
, ,
Match() . Match() Match,
matchResult.
Success matchResult true, .
Match . 3.7
Value, 3.8 Index Length 3.9
- Groups.
, NextMatch() matchResult. Match.
NextMatch() Match, Regex.
Match(). .
, matchResult.
NextMatch(), matchResult
. matchResult.Success, , NextMatch() . NextMatch()
, Match, Success false.
matchResult, while
NextMatch().
NextMatch() Match, .
Match .
NextMatch() .
, Regex.
Match(). Match .
Regex.Match() ,
. Regex.Match() , Match
.
Match.NextMatch()
, Match,
Regex.Match(). Regex , Regex.Match()
,
.
Java
Java . while find(), 3.7. find() , Matcher, ,
.
JavaScript
, /g.
3.4. while (regexp.exec()) , regexp = /\d+/g.
/\d+/, while (regexp.exec()) ,
.
, while (/\d+/g.exec()) ( /g)
, ,
JavaScript, while.
, . ,
.
, regexp.exec(),
3.8 3.9. , exec() . , .
/g
lastIndex regexp, exec() . , .
exec() lastIndex. lastIndex
, .
lastIndex . ECMA-262v3 JavaScript ,
exec() lastIndex -
, . , , , ,
.
, ( JavaScript), ,
, .
3.8 lastIndex ,
Internet Explorer .
Firefox ECMA-262v3, , regexp.
exec() .
. , re = /^.*$/gm; while (re.exec()),
, Firefox .
, 1
lastIndex, exec() .
, JavaScript. , , .
string.
match() ( 3.14)
string.replace() ( 3.10). , lastIndex, ECMA-262v3 , lastIndex 1 .
PHP
preg_match() ,
, . 3.8 preg_match()
$matchstart + $matchlength
,
, preg_match() 0.
3.18.
preg_match() -
. preg_match()
.
preg_match_all(), ,
.
Perl
3.4 , /g, . /g
, .
while . ,
$& ( 3.7),
while.
Python
finditer() re ,
. , . , .
re.finditer() re.compile(), finditer() . : .
finditer() ,
re.finditer(). ,
finditer() . , . , . , .
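Because finditer() yields match objects rather than strings, details such as the position of each match stay available inside the loop. A minimal sketch, assuming the "lucky numbers" sentence from recipe 3.10 as the subject; the variable names are illustrative only.

```python
import re

subject = "The lucky numbers are 7, 13, 16, 42, 65, and 99"
reobj = re.compile(r"\d+")

# Each item is a match object, so the text and its position can both be read.
spans = [(m.group(), m.start(), m.end()) for m in reobj.finditer(subject)]

# On a compiled object, the optional pos and endpos arguments restrict
# the search to a part of the subject string.
first_comma = subject.index(",")
last_comma = subject.rindex(",")
middle = [m.group() for m in reobj.finditer(subject, first_comma, last_comma)]
```

The iterator finds matches lazily, so breaking out of the loop early avoids searching the rest of the subject.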
Ruby
scan() String
. , .
,
.
.
, .
, . . , .
subject.scan(/(a)(b)(c)/) {|a, b, c|
  # a, b, and c hold the text matched by the three capturing groups
}
, , , .
, ,
nil.
,
,
. .
, :
subject.scan(/(a)(b)(c)/) {|abc|
  # abc[0], abc[1], and abc[2] hold the text
  # matched by the three capturing groups
}
See Also
Recipes 3.7, 3.8, 3.10, and 3.12.
3.12.
3.10 , , , .
, , ( ) .
, , 13.
C#
Static call:
StringCollection resultList = new StringCollection();
Match matchResult = Regex.Match(subjectString, @"\d+");
while (matchResult.Success) {
    if (int.Parse(matchResult.Value) % 13 == 0) {
        resultList.Add(matchResult.Value);
    }
    matchResult = matchResult.NextMatch();
}
With a Regex object:
StringCollection resultList = new StringCollection();
Regex regexObj = new Regex(@"\d+");
Match matchResult = regexObj.Match(subjectString);
while (matchResult.Success) {
    if (int.Parse(matchResult.Value) % 13 == 0) {
        resultList.Add(matchResult.Value);
    }
    matchResult = matchResult.NextMatch();
}
VB.NET
Static call:
Dim ResultList = New StringCollection
Dim MatchResult = Regex.Match(SubjectString, "\d+")
While MatchResult.Success
    If Integer.Parse(MatchResult.Value) Mod 13 = 0 Then
        ResultList.Add(MatchResult.Value)
    End If
    MatchResult = MatchResult.NextMatch
End While
With a Regex object:
Dim ResultList = New StringCollection
Dim RegexObj As New Regex("\d+")
Dim MatchResult = RegexObj.Match(SubjectString)
While MatchResult.Success
    If Integer.Parse(MatchResult.Value) Mod 13 = 0 Then
        ResultList.Add(MatchResult.Value)
    End If
    MatchResult = MatchResult.NextMatch
End While
Java
List<String> resultList = new ArrayList<String>();
Pattern regex = Pattern.compile("\\d+");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    if (Integer.parseInt(regexMatcher.group()) % 13 == 0) {
        resultList.add(regexMatcher.group());
    }
}
JavaScript
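var list = [];
var regex = /\d+/g;
var match = null;
while (match = regex.exec(subject)) {
  // Don't let browsers such as Firefox get stuck in an infinite loop
  // on zero-length matches
  if (match.index == regex.lastIndex) regex.lastIndex++;
  // Keep the match stored in the match variable if it is a multiple of 13
  if (match[0] % 13 == 0) {
    list.push(match[0]);
  }
}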
PHP
$list = array();
preg_match_all('/\d+/', $subject, $matchdata, PREG_PATTERN_ORDER);
for ($i = 0; $i < count($matchdata[0]); $i++) {
    if ($matchdata[0][$i] % 13 == 0) {
        $list[] = $matchdata[0][$i];
    }
}
Perl
while ($subject =~ m/\d+/g) {
    if ($& % 13 == 0) {
        push(@list, $&);
    }
}
Python
Global function:
list = []
for matchobj in re.finditer(r"\d+", subject):
    if int(matchobj.group()) % 13 == 0:
        list.append(matchobj.group())
Compiled object:
list = []
reobj = re.compile(r"\d+")
for matchobj in reobj.finditer(subject):
    if int(matchobj.group()) % 13 == 0:
        list.append(matchobj.group())
Ruby
list = []
subject.scan(/\d+/) {|match|
  list << match if (Integer(match) % 13 == 0)
}
. \d+ , , ,
, .
- , , 13, , , , , .
, , . .
, , . , 13 .
, . , .
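The division of labour in this recipe (the regex finds the candidates, procedural code applies the arithmetic test) can be tried on the "lucky numbers" sentence from recipe 3.10. This is an illustrative sketch; the list comprehension is just one possible way to write the loop.

```python
import re

subject = "The lucky numbers are 7, 13, 16, 42, 65, and 99"

# The regex only finds runs of digits; plain code decides which ones to keep.
multiples_of_13 = [
    m.group()
    for m in re.finditer(r"\d+", subject)
    if int(m.group()) % 13 == 0
]
```

Expressing "divisible by 13" inside the regular expression itself would be far harder to write and to read than this one-line test.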
See Also
Recipes 3.7, 3.10, and 3.11.
3.13.
, .
,
.
, HTML,
, <b>. , . ,
<b> , . , 1 <b>2</b> 3 4 <b>5 6 7</b>
: 2, 5, 6 7.
C#
StringCollection resultList = new StringCollection();
Regex outerRegex = new Regex("<b>(.*?)</b>", RegexOptions.Singleline);
Regex innerRegex = new Regex(@"\d+");
// Find the first section
Match outerMatch = outerRegex.Match(subjectString);
while (outerMatch.Success) {
    // Get the matches within the section
    Match innerMatch = innerRegex.Match(outerMatch.Groups[1].Value);
    while (innerMatch.Success) {
        resultList.Add(innerMatch.Value);
        innerMatch = innerMatch.NextMatch();
    }
    // Find the next section
    outerMatch = outerMatch.NextMatch();
}
VB.NET
Dim ResultList = New StringCollection
Dim OuterRegex As New Regex("<b>(.*?)</b>", RegexOptions.Singleline)
Dim InnerRegex As New Regex("\d+")
Dim OuterMatch = OuterRegex.Match(SubjectString)
While OuterMatch.Success
    Dim InnerMatch = InnerRegex.Match(OuterMatch.Groups(1).Value)
    While InnerMatch.Success
        ResultList.Add(InnerMatch.Value)
        InnerMatch = InnerMatch.NextMatch
    End While
    OuterMatch = OuterMatch.NextMatch
End While
Java
Using two matchers; works with Java 4 and later:
List<String> resultList = new ArrayList<String>();
Pattern outerRegex = Pattern.compile("<b>(.*?)</b>", Pattern.DOTALL);
Pattern innerRegex = Pattern.compile("\\d+");
Matcher outerMatcher = outerRegex.matcher(subjectString);
while (outerMatcher.find()) {
    Matcher innerMatcher = innerRegex.matcher(outerMatcher.group());
    while (innerMatcher.find()) {
        resultList.add(innerMatcher.group());
    }
}
More efficient (innerMatcher is created only once), but requires Java 5 or later:
List<String> resultList = new ArrayList<String>();
Pattern outerRegex = Pattern.compile("<b>(.*?)</b>", Pattern.DOTALL);
Pattern innerRegex = Pattern.compile("\\d+");
Matcher outerMatcher = outerRegex.matcher(subjectString);
Matcher innerMatcher = innerRegex.matcher(subjectString);
while (outerMatcher.find()) {
    innerMatcher.region(outerMatcher.start(), outerMatcher.end());
    while (innerMatcher.find()) {
        resultList.add(innerMatcher.group());
    }
}
JavaScript
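var result = [];
var outerRegex = /<b>([\s\S]*?)<\/b>/g;
var innerRegex = /\d+/g;
var outerMatch = null;
while (outerMatch = outerRegex.exec(subject)) {
  if (outerMatch.index == outerRegex.lastIndex) outerRegex.lastIndex++;
  // Search for digits only inside the text matched by the outer regex
  var innerMatches = outerMatch[1].match(innerRegex);
  if (innerMatches) result = result.concat(innerMatches);
}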
PHP
$list = array();
preg_match_all('%<b>(.*?)</b>%s', $subject, $outermatches,
               PREG_PATTERN_ORDER);
for ($i = 0; $i < count($outermatches[0]); $i++) {
    if (preg_match_all('/\d+/', $outermatches[0][$i], $innermatches,
                       PREG_PATTERN_ORDER)) {
        $list = array_merge($list, $innermatches[0]);
    }
}
Perl
while ($subject =~ m!<b>(.*?)</b>!gs) {
    push(@list, ($& =~ m/\d+/g));
}
, ( \d+ ) , .
2.9.
Python
list = []
innerre = re.compile(r"\d+")
for outermatch in re.finditer("(?s)<b>(.*?)</b>", subject):
    list.extend(innerre.findall(outermatch.group(1)))
Ruby
list = []
subject.scan(/<b>(.*?)<\/b>/m) {|outergroups|
  list += outergroups[0].scan(/\d+/)
}
, .
, , , , , . , .
.
. ,
, , ,
. . ,
, 1.
. HTML- <b> <b>(.*?)</b> 2.
\d+ .
, :
\d+(?=(?:(?!<b>).)*</b>)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
, . , , ; -
, . .
,
.
,
, .
,
, . JavaScript <b>
([\s\S]*?)</b> .
, ,
.
, , , . , <b>(.*?)</b> ,
;
,
, .
, , , . . HTML- <b> ,
.
,
3.11. , , , ,
, .
.
, 3.10. ,
,
.
, ( )
. , .
, , .
, ,
.
, . , . , (HTML- <b>) , , , <b>. , ,
, ,
.
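The two approaches discussed here, a pair of nested regexes and the single lookahead regex quoted earlier in this recipe, can be run side by side on the sample string from the problem statement. A sketch for comparison only; the variable names are arbitrary.

```python
import re

subject = "1 <b>2</b> 3 4 <b>5 6 7</b>"

# Approach 1: the outer regex finds the sections, the inner regex
# searches through the text of each section.
outer = re.compile(r"<b>(.*?)</b>", re.DOTALL)
inner = re.compile(r"\d+")
nested = []
for outermatch in outer.finditer(subject):
    nested.extend(inner.findall(outermatch.group(1)))

# Approach 2: one regex with lookahead, as quoted in this recipe.
single = re.findall(r"\d+(?=(?:(?!<b>).)*</b>)", subject, re.DOTALL)
```

Both give the same result on this input. The lookahead version scans ahead to the next closing tag for every candidate number, which is why it slows down as the sections grow longer.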
See Also
Recipes 3.8, 3.10, and 3.11.
3.14.
before after .
C#
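Static call:
string resultString = Regex.Replace(subjectString, "before", "after");
Static call with full exception handling:
string resultString = null;
try {
    resultString = Regex.Replace(subjectString, "before", "after");
} catch (ArgumentNullException ex) {
    // Cannot pass null as the regular expression,
    // subject string, or replacement text
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
With a Regex object:
Regex regexObj = new Regex("before");
string resultString = regexObj.Replace(subjectString, "after");
With a Regex object and full exception handling:
string resultString = null;
try {
    Regex regexObj = new Regex("before");
    try {
        resultString = regexObj.Replace(subjectString, "after");
    } catch (ArgumentNullException ex) {
        // Cannot pass null as the subject string
        // or replacement text
    }
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}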
VB.NET
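Static call:
Dim ResultString = Regex.Replace(SubjectString, "before", "after")
Static call with full exception handling:
Dim ResultString As String = Nothing
Try
    ResultString = Regex.Replace(SubjectString, "before", "after")
Catch ex As ArgumentNullException
    'Cannot pass null as the regular expression,
    'subject string, or replacement text
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try
With a Regex object:
Dim RegexObj As New Regex("before")
Dim ResultString = RegexObj.Replace(SubjectString, "after")
With a Regex object and full exception handling:
Dim ResultString As String = Nothing
Try
    Dim RegexObj As New Regex("before")
    Try
        ResultString = RegexObj.Replace(SubjectString, "after")
    Catch ex As ArgumentNullException
        'Cannot pass null as the subject string or replacement text
    End Try
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try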
Java
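Using String.replaceAll():
String resultString = subjectString.replaceAll("before", "after");
Using String.replaceAll() with full exception handling:
try {
    String resultString = subjectString.replaceAll("before", "after");
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
} catch (IllegalArgumentException ex) {
    // Syntax error in the replacement text (unescaped $ signs?)
} catch (IndexOutOfBoundsException ex) {
    // Non-existent backreference used in the replacement text
}
Using a Matcher:
Pattern regex = Pattern.compile("before");
Matcher regexMatcher = regex.matcher(subjectString);
String resultString = regexMatcher.replaceAll("after");
Using a Matcher with full exception handling:
String resultString = null;
try {
    Pattern regex = Pattern.compile("before");
    Matcher regexMatcher = regex.matcher(subjectString);
    try {
        resultString = regexMatcher.replaceAll("after");
    } catch (IllegalArgumentException ex) {
        // Syntax error in the replacement text (unescaped $ signs?)
    } catch (IndexOutOfBoundsException ex) {
        // Non-existent backreference used in the replacement text
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}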
JavaScript
result = subject.replace(/before/g, "after");
PHP
$result = preg_replace('/before/', 'after', $subject);
Perl
$_,
$_:
s/before/after/g;
$subject, $subject:
$subject =~ s/before/after/g;
$subject, $result:
($result = $subject) =~ s/before/after/g;
Python
Global function:
result = re.sub("before", "after", subject)
Compiled object:
reobj = re.compile("before")
result = reobj.sub("after", subject)
Ruby
result = subject.gsub(/before/, 'after')
.NET
.NET Regex.
Replace(). Replace() 10 .
.
MatchEvaluator 3.16.
, Replace(), , , -
.
null. Replace() ArgumentNullException. Replace() , .
, . . . .
, ArgumentException.
,
, Regex, Replace() . , . .
Replace() Regex
, . ,
. Replace()
,
.
, , .
, . , Replace(subject, replacement, 3) ,
. , . , , . ,
. -1, . -1, Replace() ArgumentOutOfRangeException.
, ,
, .
, , , , . , . -
. Replace() ArgumentOutOfRangeException.
Match(), Replace() ,
, .
Java
, replaceFirst() replaceAll() . :
. : Pattern.
compile(before).matcher(subjectString).replaceFirst(after) Pattern.
compile(before).matcher(subjectString).replaceAll(after).
, Matcher,
3.3, replaceFirst()
replaceAll() , .
, . PatternSyntaxException
Pattern.compile(), String.replaceFirst() String.replaceAll(), .
ArgumentException replaceFirst() replaceAll(),
.
, , IndexOutOfBoundsException.
JavaScript
, replace() .
, . replace() .
, /g. , 3.4. /g .
PHP
, preg_replace().
, , .
, .
.
-1, . 0, . , preg_replace() , .
, .
, .
.
preg_replace() ,
.
,
preg_replace() ,
.
, preg_replace()
, . ,
. , ( ) . preg_replace()
. , , .
, ksort(),
preg_replace().
$replace :
$regex[0] = '/a/';
$regex[1] = '/b/';
$regex[2] = '/c/';
$replace[2] = '3';
$replace[1] = '2';
$replace[0] = '1';
echo preg_replace($regex, $replace, "abc");
ksort($replace);
echo preg_replace($regex, $replace, "abc");
preg_replace() 321,
, .
ksort() 123, . ksort() , (true false)
preg_replace().
Perl
Perl s///
. s/// ,
$_, $_.
, =~, .
. ,
.
s/// , .
, ,
, .
Perl ,
.
,
/g, 3.4.
Perl .
Python
, sub() re. ,
, . sub() .
re.sub() re.compile() sub() . : .
sub() ,
. ,
.
, . , . ,
.
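The optional argument of sub() that limits the number of replacements can be seen in a short experiment. The subject string below is made up for illustration; subn() is included as a related standard-library function that also reports how many replacements were made.

```python
import re

subject = "before, before, and before"

# Without a limit, every match is replaced.
replaced_all = re.sub("before", "after", subject)

# count=1 replaces only the first match.
replaced_one = re.sub("before", "after", subject, count=1)

# subn() returns the new string together with the number of replacements.
text, how_many = re.subn("before", "after", subject)
```

In all cases the original subject string is left unchanged, because Python strings are immutable and sub() returns a new string.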
Ruby
, gsub() String. ,
. ,
. , gsub() .
gsub() , . ,
gsub!(). , gsub!() nil.
, ,
.
.
1
3.15.
3.15.
,
. ,
, , 2.9.
, , , .
C#
Static call:
string resultString = Regex.Replace(subjectString, @"(\w+)=(\w+)", "$2=$1");
With a Regex object:
Regex regexObj = new Regex(@"(\w+)=(\w+)");
string resultString = regexObj.Replace(subjectString, "$2=$1");
VB.NET
Static call:
Dim ResultString = Regex.Replace(SubjectString, "(\w+)=(\w+)", "$2=$1")
With a Regex object:
Dim RegexObj As New Regex("(\w+)=(\w+)")
Dim ResultString = RegexObj.Replace(SubjectString, "$2=$1")
Java
Using String.replaceAll():
String resultString = subjectString.replaceAll("(\\w+)=(\\w+)", "$2=$1");
Using a Matcher:
Pattern regex = Pattern.compile("(\\w+)=(\\w+)");
Matcher regexMatcher = regex.matcher(subjectString);
String resultString = regexMatcher.replaceAll("$2=$1");
JavaScript
result = subject.replace(/(\w+)=(\w+)/g, "$2=$1");
PHP
$result = preg_replace('/(\w+)=(\w+)/', '$2=$1', $subject);
Perl
$subject =~ s/(\w+)=(\w+)/$2=$1/g;
Python
Global function:
result = re.sub(r"(\w+)=(\w+)", r"\2=\1", subject)
Compiled object:
reobj = re.compile(r"(\w+)=(\w+)")
result = reobj.sub(r"\2=\1", subject)
Ruby
result = subject.gsub(/(\w+)=(\w+)/, '\2=\1')
(\w+)=(\w+)
. , , , , ,
.
,
, , ,
. -.
. 1, 2.21 , .
.NET
.NET Regex.Replace(), ,
.
.NET,
2.21.
Java
Java replaceFirst()
replaceAll(), . Java, .
JavaScript
JavaScript string.replace(), .
JavaScript, .
PHP
PHP preg_replace(),
. PHP, .
Perl
Perl replace s/regex/replace/ . $&, $1, $2 ,
3.7 3.9. , , .
Perl. ,
.
, , ,
. , , $1 \1. $1 .
Python
Python sub(),
.
Python, .
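The replacement-text syntax that sub() accepts can be tried on a small input. The sample string is an assumption made for this sketch; the \g<1> form shown last is standard Python syntax for a backreference that is followed by a literal digit.

```python
import re

subject = "width=100 height=200"

# Numbered backreferences swap the two halves of each name=value pair.
swapped = re.sub(r"(\w+)=(\w+)", r"\2=\1", subject)

# The same swap with named groups.
swapped_named = re.sub(
    r"(?P<left>\w+)=(?P<right>\w+)", r"\g<right>=\g<left>", subject
)

# \g<1>0 means "group 1 followed by a literal 0"; \10 would mean group 10.
padded = re.sub(r"(\d+)", r"\g<1>0", subject)
```

Raw strings keep the backslashes intact, so \2 reaches the regex module as a backreference instead of being turned into a control character by the string literal.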
Ruby
Ruby String.gsub(),
. Ruby, .
Ruby , $1, . Ruby gsub(). -
gsub() ,
. $1, , gsub().
, \1 . gsub() . . , ,
. \1 \\1
, \1
0x01.
C#
Static call:
string resultString = Regex.Replace(subjectString,
    @"(?<left>\w+)=(?<right>\w+)", "${right}=${left}");
With a Regex object:
Regex regexObj = new Regex(@"(?<left>\w+)=(?<right>\w+)");
string resultString = regexObj.Replace(subjectString, "${right}=${left}");
VB.NET
Static call:
Dim ResultString = Regex.Replace(SubjectString,
    "(?<left>\w+)=(?<right>\w+)", "${right}=${left}")
With a Regex object:
Dim RegexObj As New Regex("(?<left>\w+)=(?<right>\w+)")
Dim ResultString = RegexObj.Replace(SubjectString, "${right}=${left}")
PHP
$result = preg_replace('/(?P<left>\w+)=(?P<right>\w+)/', '$2=$1', $subject);
Perl
$subject =~ s/(?<left>\w+)=(?<right>\w+)/$+{right}=$+{left}/g;
Perl,
5.10. $+
, . , .
Python
Global function:
result = re.sub(r"(?P<left>\w+)=(?P<right>\w+)", r"\g<right>=\g<left>",
                subject)
Compiled object:
reobj = re.compile(r"(?P<left>\w+)=(?P<right>\w+)")
result = reobj.sub(r"\g<right>=\g<left>", subject)
Ruby
result = subject.gsub(/(?<left>\w+)=(?<right>\w+)/, '\k<right>=\k<left>')
.
1,
.
2.21, , .
3.16. ,
, . , .
, ,
.
C#
Static call:
string resultString = Regex.Replace(subjectString, @"\d+",
    new MatchEvaluator(ComputeReplacement));
With a Regex object:
Regex regexObj = new Regex(@"\d+");
string resultString = regexObj.Replace(subjectString,
    new MatchEvaluator(ComputeReplacement));
ComputeReplacement. , :
public String ComputeReplacement(Match matchResult) {
    int twiceasmuch = int.Parse(matchResult.Value) * 2;
    return twiceasmuch.ToString();
}
VB.NET
Static call:
Dim MyMatchEvaluator As New MatchEvaluator(AddressOf ComputeReplacement)
Dim ResultString = Regex.Replace(SubjectString, "\d+", MyMatchEvaluator)
With a Regex object:
Dim RegexObj As New Regex("\d+")
Dim MyMatchEvaluator As New MatchEvaluator(AddressOf ComputeReplacement)
Dim ResultString = RegexObj.Replace(SubjectString, MyMatchEvaluator)
ComputeReplacement. , :
Public Function ComputeReplacement(ByVal MatchResult As Match) As String
    Dim TwiceAsMuch = Integer.Parse(MatchResult.Value) * 2
    Return TwiceAsMuch.ToString()
End Function
Java
StringBuffer resultString = new StringBuffer();
Pattern regex = Pattern.compile("\\d+");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    Integer twiceasmuch = Integer.parseInt(regexMatcher.group()) * 2;
    regexMatcher.appendReplacement(resultString, twiceasmuch.toString());
}
regexMatcher.appendTail(resultString);
JavaScript
var result = subject.replace(/\d+/g,
function(match) { return match * 2; }
);
PHP
Named callback function:
$result = preg_replace_callback('/\d+/', 'compute_replacement', $subject);
function compute_replacement($groups) {
    return $groups[0] * 2;
}
Anonymous callback function:
$result = preg_replace_callback(
    '/\d+/',
    create_function(
        '$groups',
        'return $groups[0] * 2;'
    ),
    $subject
);
Perl
$subject =~ s/\d+/$& * 2/eg;
Python
Global function:
result = re.sub(r"\d+", computereplacement, subject)
Compiled object:
reobj = re.compile(r"\d+")
result = reobj.sub(computereplacement, subject)
computereplacement. ,
sub().
def computereplacement(matchobj):
    return str(int(matchobj.group()) * 2)
Ruby
result = subject.gsub(/\d+/) {|match|
  Integer(match) * 2
}
,
.
- , , .
C#
3.14
Regex.Replace(), .
, . Regex(), Replace() , .
MatchEvaluator. -, , .
new MatchEvaluator().
- MatchEvaluator() .
, , System.Text.RegularExpressions.Match .
Match, - Regex.Match(), .
Replace() MatchEvaluator
, , , .
.
Match. , , matchResult.Value.
, , matchResult.Groups[].
- , matchResult.Value. null ,
( ).
VB.NET
3.14
Regex.Replace(), .
, . Dim ,
Replace() , .
MatchEvaluator. , , .
MatchEvaluator Dim. MatchEvaluator() AddressOf,
-. AddressOf
, .
, MatchEvaluator,
System.Text.
RegularExpressions.Match .
Match, - Regex.
Match(), . ,
ByVal.
Replace() MatchEvaluator
, , , .
.
Match. , , MatchResult.Value.
, , MatchResult.Groups[].
- , MatchResult.Value. Nothing , (. . ).
Java
Java . , 3.11. appendReplacement()
Matcher. find() , appendTail(). appendReplacement() appendTail() .
appendReplacement() .
StringBuffer
.
, find(). , $1. ,
IllegalArgumentException.
, IndexOutOfBoundsException. appendReplacement()
find(),
IllegalStateException.
appendReplacement() . -, , ,
. , , . -,
,
.
- , . -
,
appendReplacement() .
, ,
appendReplacement(). -
appendReplacement() ,
,
.
, appendTail().
, , appendReplacement().
JavaScript
JavaScript , .
, , string.replace() , .
, .
, , ,
.
. , , .
.
, ,
JavaScript , , . JavaScript .
PHP
preg_replace_callback() , preg_replace(), 3.14. , , , .
.
preg_replace_callback() , , . , , -
create_function().
( -, ).
, preg_replace_callback() , . .
,
. ,
.
Perl
s/// , m//: /e. /e, execute
(),
, ,
Perl, , . $&
. .
Python
sub() Python
. .
, . ,
MatchObject, , search().
( ) . 3.7 3.9.
.
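A callback passed to sub() receives a match object, so it can use capturing groups as well as the overall match. The following sketch shows both; the function names and sample strings are invented for illustration.

```python
import re

def double_number(matchobj):
    # The callback receives a match object and must return a string.
    return str(int(matchobj.group()) * 2)

def scale_named(matchobj):
    # Capturing groups are available inside the callback as usual.
    value = int(matchobj.group("value")) * 2
    return "%s=%d" % (matchobj.group("name"), value)

doubled = re.sub(r"\d+", double_number, "7, 13, and 42")
scaled = re.sub(
    r"(?P<name>\w+)=(?P<value>\d+)", scale_named, "width=100 height=200"
)
```

The string returned by the callback is inserted literally, so backslashes in it are not interpreted as backreferences.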
Ruby
gsub() String
: .
.
gsub() . . -
,
nil, .
, ,
. , , $~, $&
$1. 3.7, 3.8 3.9.
, \1 . .
See Also
Recipes 3.9 and 3.15.
3.17.
, .
.
, HTML, <b>. <b> <before> <after>. , before <b>first before</b>
before <b>before before</b> : before
<b>first after</b> before <b>after after</b>.
C#
Regex outerRegex = new Regex("<b>.*?</b>", RegexOptions.Singleline);
Regex innerRegex = new Regex("before");
string resultString = outerRegex.Replace(subjectString,
    new MatchEvaluator(ComputeReplacement));
public String ComputeReplacement(Match matchResult) {
    // Run the inner search-and-replace on each match of the outer regex
    return innerRegex.Replace(matchResult.Value, "after");
}
VB.NET
Dim
Dim
Dim
Dim
Java
StringBuffer resultString = new StringBuffer();
Pattern outerRegex = Pattern.compile("<b>.*?</b>");
Pattern innerRegex = Pattern.compile("before");
Matcher outerMatcher = outerRegex.matcher(subjectString);
while (outerMatcher.find()) {
    outerMatcher.appendReplacement(resultString,
        innerRegex.matcher(outerMatcher.group()).replaceAll("after"));
}
outerMatcher.appendTail(resultString);
JavaScript
var result = subject.replace(/<b>.*?<\/b>/g,
  function(match) {
    return match.replace(/before/g, "after");
  }
);
PHP
$result = preg_replace_callback('%<b>.*?</b>%',
                                'replace_within_tag', $subject);
function replace_within_tag($groups) {
    return preg_replace('/before/', 'after', $groups[0]);
}
Perl
$subject =~ s%<b>.*?</b>%($match = $&) =~ s/before/after/g; $match;%eg;
Python
innerre = re.compile("before")
def replacewithin(matchobj):
    return innerre.sub("after", matchobj.group())
result = re.sub("<b>.*?</b>", replacewithin, subject)
Ruby
innerre = /before/
result = subject.gsub(/<b>.*?<\/b>/) {|match|
  match.gsub(innerre, 'after')
}
3.16 , , .
. ,
<b>, , 3.14. ,
.
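The combination of an outer regex, a callback, and an inner search-and-replace can be checked end to end with the sample string from the problem statement. A minimal sketch; the re.DOTALL flag is an addition here so that bold sections spanning several lines are also handled.

```python
import re

subject = "before <b>first before</b> before <b>before before</b>"

innerre = re.compile("before")

def replacewithin(matchobj):
    # Run the inner search-and-replace on the text of one outer match only.
    return innerre.sub("after", matchobj.group())

result = re.sub(r"<b>.*?</b>", replacewithin, subject, flags=re.DOTALL)
```

Text outside the bold tags never reaches the inner regex, so the two standalone occurrences of "before" are left alone.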
See Also
Recipes 3.11, 3.13, and 3.16.
3.18.
, .
, . ,
, .
, HTML, () ( ), HTML.
ASCII, - HTML. , text
class=middletext/span text
span
text span class=middletext/span text.
C#
string resultString = null;
Regex outerRegex = new Regex("<[^<>]*>");
Regex innerRegex = new Regex("\"([^\"]*)\"");
// Find the first section
int lastIndex = 0;
Match outerMatch = outerRegex.Match(subjectString);
while (outerMatch.Success) {
    // Search and replace through the text between this match
    // and the previous one
    string textBetween =
        subjectString.Substring(lastIndex, outerMatch.Index - lastIndex);
    resultString = resultString +
        innerRegex.Replace(textBetween, "\u201C$1\u201D");
    lastIndex = outerMatch.Index + outerMatch.Length;
    // Copy the text in the section unchanged
    resultString = resultString + outerMatch.Value;
    // Find the next section
    outerMatch = outerMatch.NextMatch();
}
// Search and replace through the remainder
// after the last regex match
string textAfter = subjectString.Substring(lastIndex,
    subjectString.Length - lastIndex);
resultString = resultString + innerRegex.Replace(textAfter,
    "\u201C$1\u201D");
VB.NET
Dim ResultString As String = Nothing
Dim OuterRegex As New Regex("<[^<>]*>")
Dim InnerRegex As New Regex("""([^""]*)""")
Dim LastIndex = 0
Dim OuterMatch = OuterRegex.Match(SubjectString)
While OuterMatch.Success
    Dim TextBetween = SubjectString.Substring(LastIndex,
        OuterMatch.Index - LastIndex)
    ResultString = ResultString + InnerRegex.Replace(TextBetween,
        ChrW(&H201C) + "$1" + ChrW(&H201D))
    LastIndex = OuterMatch.Index + OuterMatch.Length
    ResultString = ResultString + OuterMatch.Value
    OuterMatch = OuterMatch.NextMatch
End While
Dim TextAfter = SubjectString.Substring(LastIndex,
    SubjectString.Length - LastIndex)
ResultString = ResultString +
    InnerRegex.Replace(TextAfter, ChrW(&H201C) + "$1" + ChrW(&H201D))
Java
StringBuffer resultString = new StringBuffer();
Pattern outerRegex = Pattern.compile("<[^<>]*>");
Pattern innerRegex = Pattern.compile("\"([^\"]*)\"");
Matcher outerMatcher = outerRegex.matcher(subjectString);
int lastIndex = 0;
while (outerMatcher.find()) {
    // Search and replace through the text between this match
    // and the previous one
    String textBetween = subjectString.substring(lastIndex,
                                                 outerMatcher.start());
    Matcher innerMatcher = innerRegex.matcher(textBetween);
    resultString.append(innerMatcher.replaceAll("\u201C$1\u201D"));
    lastIndex = outerMatcher.end();
    // Append the regex match itself, unchanged
    resultString.append(outerMatcher.group());
}
// Search and replace through the remainder
// after the last regex match
String textAfter = subjectString.substring(lastIndex);
Matcher innerMatcher = innerRegex.matcher(textAfter);
resultString.append(innerMatcher.replaceAll("\u201C$1\u201D"));
JavaScript
var result = "";
var outerRegex = /<[^<>]*>/g;
var innerRegex = /"([^"]*)"/g;
var outerMatch = null;
var lastIndex = 0;
while (outerMatch = outerRegex.exec(subject)) {
  if (outerMatch.index == outerRegex.lastIndex) outerRegex.lastIndex++;
  // Search and replace through the text between this match
  // and the previous one
  var textBetween = subject.substring(lastIndex, outerMatch.index);
  result = result + textBetween.replace(innerRegex, "\u201C$1\u201D");
  lastIndex = outerMatch.index + outerMatch[0].length;
  // Append the regex match itself, unchanged
  result = result + outerMatch[0];
}
// Search and replace through the remainder
// after the last regex match
var textAfter = subject.substr(lastIndex);
result = result + textAfter.replace(innerRegex, "\u201C$1\u201D");
PHP
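$result = '';
$lastindex = 0;
while (preg_match('/<[^<>]*>/', $subject, $groups, PREG_OFFSET_CAPTURE,
                  $lastindex)) {
    $matchstart = $groups[0][1];
    $matchlength = strlen($groups[0][0]);
    // Search and replace through the text between this match
    // and the previous one
    $textbetween = substr($subject, $lastindex, $matchstart-$lastindex);
    $result .= preg_replace('/"([^"]*)"/', '“$1”', $textbetween);
    // Append the regex match itself, unchanged
    $result .= $groups[0][0];
    // Move the starting position for the next match
    $lastindex = $matchstart + $matchlength;
    if ($matchlength == 0) {
        // Don't get stuck in an infinite loop
        // if the regex allows zero-length matches
        $lastindex++;
    }
}
// Search and replace through the remainder
// after the last regex match
$textafter = substr($subject, $lastindex);
$result .= preg_replace('/"([^"]*)"/', '“$1”', $textafter);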
Perl
use encoding "utf-8";
$result = '';
while ($subject =~ m/<[^<>]*>/g) {
    $match = $&;
    $textafter = $';
    ($textbetween = $`) =~ s/"([^"]*)"/\x{201C}$1\x{201D}/g;
    $result .= $textbetween . $match;
}
$textafter =~ s/"([^"]*)"/\x{201C}$1\x{201D}/g;
$result .= $textafter;
Python
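innerre = re.compile('"([^"]*)"')
result = ""
lastindex = 0
for outermatch in re.finditer("<[^<>]*>", subject):
    # Search and replace through the text between this match
    # and the previous one
    textbetween = subject[lastindex:outermatch.start()]
    result += innerre.sub(u"\u201C\\1\u201D", textbetween)
    lastindex = outermatch.end()
    # Append the regex match itself, unchanged
    result += outermatch.group()
# Search and replace through the remainder
# after the last regex match
textafter = subject[lastindex:]
result += innerre.sub(u"\u201C\\1\u201D", textafter)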
Ruby
result = ''
textafter = ''
subject.scan(/<[^<>]*>/) {|match|
  textafter = $'
  textbetween = $`.gsub(/"([^"]*)"/, '“\1”')
  result += textbetween + match
}
result += textafter.gsub(/"([^"]*)"/, '“\1”')
3.13 , , ( )
( ).
.
, ,
, , . ,
, , .
, -
. , ^ ,
, , , ^ .
, ,
.
. <[^<>]*> , , . HTML.
, HTML -
, ( ) . 3.11. , , , , , .
, 3.14. , , . . , .
,
,
.
([^]*) ,
, ,
, .
.
, . U+201C U+201D.
. Visual Studio 2008, , .
\u201C \x{201C} ,
, , . , ,
, . , , ,
. , C# Java \u201C
, VB.NET
. VB.NET
ChrW().
Perl Ruby
Perl Ruby , , .
$` ( ) , , $ ( ) , . , , . $` , .
Python
, . ,
encode(), :
print result.encode('1252')
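The Python solution for this recipe targets Python 2 (u"..." literals and the print statement). For readers on Python 3, where every str is Unicode, the same technique might look like the sketch below. The function name curl_quotes_outside_tags and the sample HTML are hypothetical and chosen only for illustration.

```python
import re

def curl_quotes_outside_tags(subject):
    """Replace straight double quotes with curly ones, skipping HTML tags."""
    tagre = re.compile(r"<[^<>]*>")
    quotere = re.compile(r'"([^"]*)"')
    pieces = []
    lastindex = 0
    for tagmatch in tagre.finditer(subject):
        # Search and replace through the text before this tag.
        textbetween = subject[lastindex:tagmatch.start()]
        pieces.append(quotere.sub("\u201C\\1\u201D", textbetween))
        # Copy the tag itself unchanged.
        pieces.append(tagmatch.group())
        lastindex = tagmatch.end()
    # Search and replace through the remainder after the last tag.
    pieces.append(quotere.sub("\u201C\\1\u201D", subject[lastindex:]))
    return "".join(pieces)

html = '"text" <span class="middle">"text"</span> "text"'
converted = curl_quotes_outside_tags(html)
```

Collecting the pieces in a list and joining them once avoids repeated string concatenation, which matters only for large inputs.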
See Also
Recipes 3.11, 3.13, and 3.16.
3.19.
.
,
.
, , HTML,
. Ilike<b>bold</b>and<i>italic
</i>fonts : Ilike, bold, and, italic
fonts.
C#
Static call:
string[] splitArray = Regex.Split(subjectString, "<[^<>]*>");
Static call with full exception handling:
string[] splitArray = null;
try {
    splitArray = Regex.Split(subjectString, "<[^<>]*>");
} catch (ArgumentNullException ex) {
    // Cannot pass null as the regular expression
    // or subject string
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
With a Regex object:
Regex regexObj = new Regex("<[^<>]*>");
string[] splitArray = regexObj.Split(subjectString);
With a Regex object and full exception handling:
string[] splitArray = null;
try {
    Regex regexObj = new Regex("<[^<>]*>");
    try {
        splitArray = regexObj.Split(subjectString);
    } catch (ArgumentNullException ex) {
        // Cannot pass null as the subject string
    }
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
VB.NET
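Static call:
Dim SplitArray = Regex.Split(SubjectString, "<[^<>]*>")
Static call with full exception handling:
Dim SplitArray As String()
Try
    SplitArray = Regex.Split(SubjectString, "<[^<>]*>")
Catch ex As ArgumentNullException
    'Cannot pass null as the regular expression or subject string
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try
With a Regex object:
Dim RegexObj As New Regex("<[^<>]*>")
Dim SplitArray = RegexObj.Split(SubjectString)
With a Regex object and full exception handling:
Dim SplitArray As String()
Try
    Dim RegexObj As New Regex("<[^<>]*>")
    Try
        SplitArray = RegexObj.Split(SubjectString)
    Catch ex As ArgumentNullException
        'Cannot pass null as the subject string
    End Try
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try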
Java
Using String.split():
String[] splitArray = subjectString.split("<[^<>]*>");
Using String.split() with full exception handling:
try {
    String[] splitArray = subjectString.split("<[^<>]*>");
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
Using a Pattern object:
Pattern regex = Pattern.compile("<[^<>]*>");
String[] splitArray = regex.split(subjectString);
Using a Pattern object with full exception handling:
String[] splitArray = null;
try {
    Pattern regex = Pattern.compile("<[^<>]*>");
    splitArray = regex.split(subjectString);
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
JavaScript
string.split():
result = subject.split(/<[^<>]*>/);
, ,
string.split() .
:
var list = [];
var regex = /<[^<>]*>/g;
var match = null;
var lastIndex = 0;
while (match = regex.exec(subject)) {
  // Don't let browsers such as Firefox get stuck in an infinite loop
  // on zero-length matches
  if (match.index == regex.lastIndex) regex.lastIndex++;
  // Add the text before the match
  list.push(subject.substring(lastIndex, match.index));
  lastIndex = match.index + match[0].length;
}
// Add the remainder after the last match
list.push(subject.substr(lastIndex));
PHP
$result = preg_split('/<[^<>]*>/', $subject);
Perl
@result = split(m/<[^<>]*>/, $subject);
Python
Global function:
result = re.split("<[^<>]*>", subject)
Compiled object:
reobj = re.compile("<[^<>]*>")
result = reobj.split(subject)
Ruby
result = subject.split(/<[^<>]*>/)
, 3.10. , .
.
C# VB.NET
, .NET Regex.Split().
, , . null. Split() ArgumentNullException.
Split() .
, . .
. , ArgumentException.
,
, Regex, Split() . .
Split() Regex
, . ,
. Split()
, .
, , . , regexObj.Split(subject, 3) ,
. Split() , , . .
, , Split()
,
. regexObj.Split(subject, 1) , . regexObj.
Split(subject, 0) ,
Split() .
, Split()
ArgumentOutOfRangeException.
, , ,
, . , ,
, ,
. ,
.
, ,
, .
, , , .
. Split()
Java
, split() . .
Pattern.compile(regex).split(subjectString).
, Pattern Pattern.compile().
. split() Pattern, .
Matcher. Matcher split().
Pattern.split() , String.split() .
, . , Pattern.split(subject, 3) , . split() , , .
. , , split() ,
. Pattern.split(subject, 1) , .
,
. ,
, - ,
.
,
.
Java . , , Pattern.split()
. , . , ,
. Java
.
JavaScript
JavaScript, split() . , ,
split() .
. .
, . ,
. /g ( 3.4) .
, -
split(), JavaScript.
, , . , , , . , , split(),
( 2.9).
JavaScript . , .
, , JavaScript,
. , , .
3.12.
, , .
, ,
3.8.
PHP
preg_split().
, . , $_.
, . , preg_split($regex, $subject, 3) ,
. preg_split() ,
,
.
. ,
, preg_split() ,
. -1, .
,
. ,
, - ,
. , .
preg_split() . ,
PREG_SPLIT_NO_EMPTY .
Perl
split(). , .
, . ,
split(/regex/, subject, 3) ,
. split() , , . . ,
, split() ,
.
, Perl . -, , . , , .
, , . , ($one,
$two, $three) = split(/,/)
$_ 4 .
,
. ,
, - ,
.
,
.
Python
, split() re. , . split()
.
re.split() re.split() split()
. : .
split()
.
,
. , .
, ,
. -
, .
. , .
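The behaviour of re.split() with and without a split limit, and with adjacent matches, can be verified quickly. This sketch assumes the sample sentence from the problem statement with its spaces restored; the variable names are arbitrary.

```python
import re

subject = "I like <b>bold</b> and <i>italic</i> fonts"

# Split along every HTML tag.
pieces = re.split(r"<[^<>]*>", subject)

# maxsplit limits the number of splits; the last item holds the
# unsplit remainder, including any tags left in it.
limited = re.split(r"<[^<>]*>", subject, maxsplit=2)

# Two adjacent matches produce an empty string between them.
with_empty = re.split(r"<[^<>]*>", "a<b></b>c")
```

Python keeps empty strings in the result, so filter them out afterwards if they are not wanted.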
Ruby
, split()
.
split() , , . , subject.split(re, 3) , . split()
,
,
. . ,
, split()
, . split(re, 1)
, .
,
. ,
, - ,
.
,
.
Ruby . , , split() .
,
. , ,
. Ruby
.
See Also
Recipe 3.20.
3.20. ,
.
, ,
.
, HTML
. , Ilike<b>bold</b>and<i>italic</i>fonts : Ilike, <b>, bold, </b>, and, <i>, italic, </i>
fonts.
C#
Static call:
string[] splitArray = Regex.Split(subjectString, "(<[^<>]*>)");
With a Regex object:
Regex regexObj = new Regex("(<[^<>]*>)");
string[] splitArray = regexObj.Split(subjectString);
VB.NET
Static call:
Dim SplitArray = Regex.Split(SubjectString, "(<[^<>]*>)")
With a Regex object:
Dim RegexObj As New Regex("(<[^<>]*>)")
Dim SplitArray = RegexObj.Split(SubjectString)
Java
List<String> resultList = new ArrayList<String>();
Pattern regex = Pattern.compile("<[^<>]*>");
Matcher regexMatcher = regex.matcher(subjectString);
int lastIndex = 0;
while (regexMatcher.find()) {
    resultList.add(subjectString.substring(lastIndex,
                                           regexMatcher.start()));
    resultList.add(regexMatcher.group());
    lastIndex = regexMatcher.end();
}
resultList.add(subjectString.substring(lastIndex));
JavaScript
var list = [];
var regex = /<[^<>]*>/g;
var match = null;
var lastIndex = 0;
while (match = regex.exec(subject)) {
  // Don't let browsers such as Firefox get stuck in an infinite loop
  // on zero-length matches
  if (match.index == regex.lastIndex) regex.lastIndex++;
  // Add the text before the match, and the match itself
  list.push(subject.substring(lastIndex, match.index), match[0]);
  lastIndex = match.index + match[0].length;
}
// Add the remainder after the last match
list.push(subject.substr(lastIndex));
PHP
    $result = preg_split('/(<[^<>]*>)/', $subject, -1,
                         PREG_SPLIT_DELIM_CAPTURE);
Perl
@result = split(m/(<[^<>]*>)/, $subject);
Python
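If you have only a few strings to split, you can use the global function:
    result = re.split("(<[^<>]*>)", subject)
To use the same regular expression repeatedly, use a compiled object:
    reobj = re.compile("(<[^<>]*>)")
    result = reobj.split(subject)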
Ruby
    list = []
    lastindex = 0
    subject.scan(/<[^<>]*>/) {|match|
        # The exclusive range also handles a match at the very start of the string
        list << subject[lastindex...$~.begin(0)]
        list << $&
        lastindex = $~.end(0)
    }
    list << subject[lastindex..subject.length()]
Java
Pattern.split() Java .
3.12, ,
. , , 3.8.
JavaScript
string.split() JavaScript
. JavaScript . , - , .
, ,
3.12
,
. ,
, 3.8.
PHP
preg_split()
PREG_SPLIT_DELIM_CAPTURE, . PREG_SPLIT_DELIM_CAPTURE PREG_SPLIT_NO_EMPTY
|.
, preg_split() , . , , , , ,
.
,
, , . ,
: Ilike, <b>, bold, </b>, and, <i> italic</i>fonts.
Perl
split() Perl
. ,
.
, split() ,
. split(/(<[^<>]*>)/, $subject, 4)
, , , . , , ,
Python
split() Python . , .
, split() , . split((<[^<>]*>), subject, 3)
, , , . , , ,
. , : I like, <b>, bold, </b>, and , <i> italic</i> fonts.
10 ,
split(regex, subject, 3) 34 .
Python . , .
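The Python behavior described above can be illustrated with a minimal sketch. It splits the sample HTML string with a capturing group so that the tags are kept, then repeats the split with a limit of 3. The helper name split_keeping_tags is an assumption made for this illustration and is not part of the recipe.

```python
import re

# The capturing group makes re.split() keep the matched tags in the result.
TAG = re.compile(r"(<[^<>]*>)")

def split_keeping_tags(subject, maxsplit=0):
    """Split subject along HTML tags, keeping the tags (hypothetical helper)."""
    return TAG.split(subject, maxsplit)

subject = "I like <b>bold</b> and <i>italic</i> fonts"
all_parts = split_keeping_tags(subject)      # no limit: nine strings
limited = split_keeping_tags(subject, 3)     # at most three splits: seven strings
print(all_parts)
print(limited)
```

With a limit of 3, the list holds three pairs of text and tag followed by the unsplit remainder, which is why it contains seven items rather than three.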
Ruby
String.split() Ruby .
3.12, ,
. , , 3.8.
.
2.9.
2.11.
3.21. Search Line by Line
grep
, (
) . , , .
C#
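If you have a multiline string, split it into an array of strings first, with each string in the array holding one line of text:
    string[] lines = Regex.Split(subjectString, "\r?\n");
Then, iterate over the array:
    Regex regexObj = new Regex("regex pattern");
    for (int i = 0; i < lines.Length; i++) {
        if (regexObj.IsMatch(lines[i])) {
            // The regex matches lines[i]
        } else {
            // The regex does not match lines[i]
        }
    }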
VB.NET
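If you have a multiline string, split it into an array of strings first, with each string in the array holding one line of text:
    Dim Lines = Regex.Split(SubjectString, "\r?\n")
Then, iterate over the array:
    Dim RegexObj As New Regex("regex pattern")
    For i As Integer = 0 To Lines.Length - 1
        If RegexObj.IsMatch(Lines(i)) Then
            'The regex matches Lines(i)
        Else
            'The regex does not match Lines(i)
        End If
    Next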
Java
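If you have a multiline string, split it into an array of strings first, with each string in the array holding one line of text:
    String[] lines = subjectString.split("\r?\n");
Then, iterate over the array:
    Pattern regex = Pattern.compile("regex pattern");
    Matcher regexMatcher = regex.matcher("");
    for (int i = 0; i < lines.length; i++) {
        regexMatcher.reset(lines[i]);
        if (regexMatcher.find()) {
            // The regex matches lines[i]
        } else {
            // The regex does not match lines[i]
        }
    }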
JavaScript
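If you have a multiline string, split it into an array of strings first, with each string in the array holding one line of text. As mentioned in Recipe 3.19, some browsers drop empty lines from the array.
    var lines = subject.split(/\r?\n/);
Then, iterate over the array:
    var regexp = /regex pattern/;
    for (var i = 0; i < lines.length; i++) {
        if (lines[i].match(regexp)) {
            // The regex matches lines[i]
        } else {
            // The regex does not match lines[i]
        }
    }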
PHP
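If you have a multiline string, split it into an array of strings first, with each string in the array holding one line of text:
    $lines = preg_split('/\r?\n/', $subject);
Then, iterate over the array:
    foreach ($lines as $line) {
        if (preg_match('/regex pattern/', $line)) {
            // The regex matches $line
        } else {
            // The regex does not match $line
        }
    }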
Perl
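If you have a multiline string, split it into an array of strings first, with each string in the array holding one line of text:
    @lines = split(m/\r?\n/, $subject);
Then, iterate over the array:
    foreach $line (@lines) {
        if ($line =~ m/regex pattern/) {
            # The regex matches $line
        } else {
            # The regex does not match $line
        }
    }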
Python
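If you have a multiline string, split it into an array of strings first, with each string in the array holding one line of text:
    lines = re.split("\r?\n", subject)
Then, iterate over the array:
    reobj = re.compile("regex pattern")
    for line in lines[:]:
        if reobj.search(line):
            pass  # The regex matches line
        else:
            pass  # The regex does not match line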
Ruby
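If you have a multiline string, split it into an array of strings first, with each string in the array holding one line of text:
    lines = subject.split(/\r?\n/)
Then, iterate over the array:
    re = /regex pattern/
    lines.each { |line|
        if line =~ re
            # The regex matches line
        else
            # The regex does not match line
        end
    }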
, ,
, .
, , . , . ,
, .
. , ,
, , . , , , ,
.
, . , . . , ,
, ,
,
.
3.19 , ,
. \r\n
, , -
\r?\n .
,
Windows,
UNIX.
,
. , , , 3.5.
.
3.11 3.19.
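The two steps of this recipe (split into lines, then test each line) can be combined into a small grep-like function. The following is a minimal sketch under the assumption that the whole text fits in memory; the name grep_lines and its parameters are hypothetical and chosen only for this illustration.

```python
import re

def grep_lines(pattern, text, flags=0):
    """Return the lines of text in which pattern is found (hypothetical helper)."""
    regex = re.compile(pattern, flags)
    # Split on \r?\n so that Windows and UNIX line breaks are both handled.
    lines = re.split(r"\r?\n", text)
    # Keep only the lines on which the regex finds a match.
    return [line for line in lines if regex.search(line)]

text = "first line\r\nsecond entry\nthird line"
matching = grep_lines(r"line$", text)
print(matching)
```

Because each line is searched separately, anchors such as ^ and $ apply to the individual line without any multiline option.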
, . ,
, , ,
.
, , ,
.
,
. , , , . , .
, ,
, .
,
.
4.1. Validate Email Addresses
- , . ,
. .
.
@ :
^\S+@\S+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
\A\S+@\S+\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
, , @, , . , , @, , , ,
:
^[A-Z0-9+_.-]+@[A-Z0-9.-]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
\A[A-Z0-9+_.-]+@[A-Z0-9.-]+\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
.
, ,
RFC 2822, . , , ()
(|). , ,
, ,
SQL:
^[\w!#$%&'*+/=?`{|}~^.-]+@[A-Z0-9.-]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
\A[\w!#$%&'*+/=?`{|}~^.-]+@[A-Z0-9.-]+\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
,
, , ,
. ,
:
^[\w!#$%&'*+/=?`{|}~^-]+(?:\.[\w!#$%&'*+/=?`{|}~^-]+)*@
[A-Z0-9-]+(?:\.[A-Z0-9-]+)*$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
\A[\w!#$%&'*+/=?`{|}~^-]+(?:\.[\w!#$%&'*+/=?`{|}~^-]+)*@
[A-Z0-9-]+(?:\.[A-Z0-9-]+)*\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
: . , , , secondlevel.com thirdlevel.secondlevel.com. .com . , , . (.com) (.museum):
^[\w!#$%&'*+/=?`{|}~^-]+(?:\.[\w!#$%&'*+/=?`{|}~^-]+)*@
(?:[A-Z0-9-]+\.)+[A-Z]{2,6}$
:
: .NET, Java, PCRE, Perl, Python, Ruby
- ,
,
, , . ,
,
, . , , , .
, .
RFC 2822, , asdf@asdf.asdf .
,
, . asdf.
,
, john.doe@somewhere.com
, .
-, somewhere.com , , John Doe Delete , -.
, ,
. , , .
, ,
. , , #$%@.-, ,
, .
,
. 276 .
. , ,
, . , ,
,
,
. .
,
.
, com|net|org|mil|edu . , ,
.
, , . 2, 90% , .
, .
.
[A-Za-z] [A-Z] , .
.
X [Xx] .
2.3, \S \w . \S
, \w .
@ \. @ , . ,
. @
,
. , , 2.1.
[A-Z0-9.-] ,
, . A Z, 0 9,
. , , .
2.3, ,
: [\w!#$%&*+/=?`{|}~^.-] .
, 19 .
+ * , . .
. , [A-Z0-9.-]+ , , / .
, (?:[A-Z0-9-]+\.)+ , / ,
.
. , ,
, , .
2.12.
(?:group) .
,
. (group) , .
, ,
(?: ( .
- , , .
2.9.
^ $ , . , .
.
- drop database; -- joe@server.com haha!
.
-
, joe@server.
com . 2.5. ,
^ $ .
Ruby . , ,
Ruby, . , ,
^ $ , drop
database; -joe@server.com
haha!,
, \
A \Z .
,
, JavaScript. JavaScript \A \Z . 2.5.
^ $ , \A \Z
, . . , , JavaScript Ruby .
, , Ruby .
Ruby, \A
\Z , .
, .
, RegexBuddy.
. .
,
. . ^\S+@\S+$ :
, @ .
, ,
.
, .
, , , , ,
.
, , , ,
, ^ $ . . , , , , , asdf@asdf.as asdf@asdf.as99.
, , .
.
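To find email addresses within a larger body of text, replace the anchors ^ and $ with the word boundary \b. For instance, ^[A-Z0-9+_.-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}$ becomes \b[A-Z0-9+_.-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}\b.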
. 275
. 276. .
.
RFC 2822 , , . RFC 2822 http://www.ietf.org/rfc/rfc2822.txt.
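As a minimal sketch of how one of these regexes might be applied, the snippet below uses the variant that requires a top-level domain of two to six letters. The function name looks_like_email is an assumption for this illustration; \A and \Z are used instead of ^ and $ so that text on additional lines cannot slip through, and re.IGNORECASE stands in for the case insensitive option.

```python
import re

# Local part, "@", one or more dot-separated domain labels, then a 2-6 letter TLD.
EMAIL = re.compile(r"\A[A-Z0-9+_.-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}\Z", re.IGNORECASE)

def looks_like_email(address):
    """True if address has the basic shape of an email address (hypothetical helper)."""
    return EMAIL.match(address) is not None

print(looks_like_email("john.doe@somewhere.com"))
print(looks_like_email("asdf@asdf.as99"))
```

A match only says that the address is well formed. Whether a mailbox actually exists at that address can be established only by sending mail to it.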
4.2. Validate and Format North American Phone Numbers
, , , . : 1234567890, 123-456-7890,
123.456.7890, 123 456 7890, (123) 456 7890 . ,
(123) 456-7890,
.
,
- . ,
.
^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
($1) $2-$3
Replacement text flavors: .NET, Java, JavaScript, Perl, PHP
(\1) \2-\3
Replacement text flavors: Python, Ruby
C#
    Regex regexObj =
        new Regex(@"^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$");
    if (regexObj.IsMatch(subjectString)) {
        string formattedPhoneNumber =
            regexObj.Replace(subjectString, "($1) $2-$3");
    } else {
        // Invalid phone number
    }
JavaScript
    var regexObj = /^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$/;
    if (regexObj.test(subjectString)) {
        var formattedPhoneNumber =
            subjectString.replace(regexObj, "($1) $2-$3");
    } else {
        // Invalid phone number
    }
3.5 3.15.
. ,
-
(, ). , ,
:
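^        # Assert position at the beginning of the string.
\(       # Match a literal "("...
  ?      #   between zero and one time.
(        # Capture the enclosed match to backreference 1...
  [0-9]  #   Match a digit...
    {3}  #     exactly three times.
)        # End capturing group 1.
\)       # Match a literal ")"...
  ?      #   between zero and one time.
[-. ]    # Match one character from the set "-. "...
  ?      #   between zero and one time.
...      # [Match the remaining digits and separator.]
$        # Assert position at the end of the string.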
^ $ ,
, .
, . , ^
, $ . ,
, 123-456-78901.
, , -
,
. ,
, . \( \) , , . ,
. ,
.
, , , , , .
, .
. . [0-9] , . , ,
\d , ,
\d , ,
. \d 2.3.
[-.] , -. .
,
, [0-9] .
, . , [.\-] .
, . {3} , ,
. [0-9]{3}
[0-9][0-9][0-9] , . ( ) ,
. {0,1} .
, , . , , .
, , , ,
NANP (North American Numbering Plan ). NANP , 1. , ,
16 . .
, 10- . , , :
29,
08 .
, ,
, 29,
.
, ,
.
.
^\(?([2-9][0-8][0-9])\)?[-. ]?([2-9][0-9]{2})[-. ]?([0-9]{4})$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, . ,
. , , ; , , , .
:
\(?\b([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^ $, \b , - . , \b
, . ,
^ $ . ( \b ), ,
.
( 2.6).
,
. , ,
, . , ,
.
1
1, (
,
), ,
:
^(?:\+?1[-. ]?)?\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, , , +1
(123) 456-7890 1-123-456-7890. ,
(?:...) . ,
, , .
, , .
, ,
.
, ,
, $1 $2 ( ).
(?:\+?1[-.]?)?. 1
, 1
- (, ).
, ,
1 ,
-
, 1.
, , :
^(?:\(?([0-9]{3})\)?[-. ]?)?([0-9]{3})[-. ]?([0-9]{4})$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
($1)$2-$3 , , : () 1234567, . , , .
.
4.3 ,
.
(NANP)
, , 16 .
http://www.nanpa.com.
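The validate-then-reformat approach of this recipe can be sketched in Python as follows. The function name format_phone is hypothetical; the regex is the basic one from the solution, and the replacement text uses the Python backreference syntax (\1) \2-\3.

```python
import re

# Three capturing groups: area code, exchange, and the last four digits.
PHONE = re.compile(r"^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$")

def format_phone(number):
    """Return number as (123) 456-7890, or None if it does not match (hypothetical helper)."""
    if PHONE.match(number):
        return PHONE.sub(r"(\1) \2-\3", number)
    return None

print(format_phone("123.456.7890"))
print(format_phone("123-456-78901"))
```

Returning None for invalid input keeps validation and formatting in one place, so the caller never sees a partially formatted number.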
4.3. Validate International Phone Numbers
. , , .
^\+(?:[0-9] ?){6,14}[0-9]$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
JavaScript
function validate (phone) {
var regex = /^\+(?:[0-9] ?){6,14}[0-9]$/;
if (regex.test(phone)) {
//
} else {
//
}
}
3.5.
, , , . , , ITU-T E.123. , (
, )
. ,
(~) -
, ,
( ,
) . (ITU-T E.164) , 15 . .
, ,
. , \x20 :
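^         # Assert position at the beginning of the string.
\+        # Match a literal "+" character.
(?:       # Group but don't capture...
  [0-9]   #   Match a digit.
  \x20    #   Match a space character...
    ?     #     between zero and one time.
)         # End the noncapturing group.
  {6,14}  #   Repeat the preceding group between 6 and 14 times.
[0-9]     # Match a digit.
$         # Assert position at the end of the string.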
:
: .NET, Java, PCRE, Perl, Python, Ruby
^ $
, . , (?:...) , , . {6,14} , - .
[0-9] ( 6
14 7 15) ,
.
EPP
^\+[0-9]{1,3}\.[0-9]{4,14}(?:x.+)?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, (Extensible Provisioning Protocol, EPP).
(
2004 ) .
, .com, .info, .net, .org .us. ,
EPP , ( ) .
EPP +CCC.NNNNNNNNNNxEEEE, C , 1
3 , N 14 E
. , ,
. x .
.
4.2 .
ITU-T recommendation E.123 (Notation for national and international telephone numbers, e-mail addresses and Web addresses
,
-)
http://www.itu.int/rec/T-REC-E.123.
ITU-T Recommendation E.164 (The international public
telecommunication numbering plan
) http://www.itu.int/rec/TREC-E.164.
http://
www.itu.int/ITU-T/inr/nnp.
RFC 4933
EPP, .
RFC 4933 http://tools.ietf.org/html/rfc4933.
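Both patterns from this recipe can be tried side by side with a short sketch. The constant and function names below are assumptions made for the illustration; the regexes themselves are the international pattern from the solution and the EPP pattern from the variation.

```python
import re

# "+" followed by 7 to 15 digits, each optionally followed by one space.
INTERNATIONAL = re.compile(r"^\+(?:[0-9] ?){6,14}[0-9]$")
# EPP notation: +CCC.NNNNNNNNNNxEEEE, with the extension part optional.
EPP = re.compile(r"^\+[0-9]{1,3}\.[0-9]{4,14}(?:x.+)?$")

def is_international_number(phone):
    """True if phone matches the international pattern (hypothetical helper)."""
    return INTERNATIONAL.match(phone) is not None

def is_epp_number(phone):
    """True if phone matches the EPP pattern (hypothetical helper)."""
    return EPP.match(phone) is not None

print(is_international_number("+44 20 7946 0958"))
print(is_epp_number("+1.2125551234x42"))
```

The two notations are not interchangeable: the EPP form requires a period after the country code and allows no spaces, so a number valid in one form is rejected by the other pattern.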
4.4. Validate Traditional Date Formats
mm/
dd/yy, mm/dd/yyyy, dd/mm/yy dd/mm/yyyy. -
, , , ,
31 .
, :
^[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, :
^[0-3][0-9]/[0-3][0-9]/(?:[0-9][0-9])?[0-9][0-9]$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
m/d/yy mm/dd/yyyy. :
^(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
mm/dd/yyyy, :
^(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
d/m/yy dd/mm/yyyy. :
^(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
dd/mm/yyyy, :
^(3[01]|[12][0-9]|0[1-9])/(1[0-2]|0[1-9])/[0-9]{4}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, :
^(?:(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])|
(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9]))/(?:[0-9]{2})?[0-9]{2}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, :
^(?:(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])|
(3[01]|[12][0-9]|0[1-9])/(1[0-2]|0[1-9]))/[0-9]{4}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
^(?:
# m/d mm/dd
(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])
|
# d/m dd/mm
(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])
)
# /yy /yyyy
/(?:[0-9]{2})?[0-9]{2}$
:
: .NET, Java, PCRE, Perl, Python, Ruby
^(?:
# mm/dd
(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])
|
# dd/mm
(3[01]|[12][0-9]|0[1-9])/(1[0-2]|0[1-9])
)
# /yyyy
/[0-9]{4}$
:
: .NET, Java, PCRE, Perl, Python, Ruby
, - ,
, .
. -,
,
. , - 4/1 . - 1, . , , .
, .
, , , :
1 31. . 3[01]|[12][0-9]|0?[1-9] , 3, 0 1,
1 2, , 0, 1 9. , [1-9] , .
, , 0 9, ASCII .
6.
. ,
, , \d{2}/\d{2}/\d{4} .
, , 99/99/9999,
, . , .
, ,
0/0/00 31/31/2008. , - , ( 2.3), , ( 2.12), . (?:[0-9]{2})?[0-9]{2} , . [0-9]{2} . (?:[0-9]{2})?
. ( 2.9) , {2} . [0-9]{2}? ,
[0-9]{2} . ,
4 . . .
, {2} , .
3 6 1
12, 1 31. ( 2.8), .
, ,
.
,
. . JavaScript . , . , ,
12/31 31/12 ,
31/31.
, , , ^ $ .
. , , 12/12/2001
9912/12/200199. ,
, .
.
.
^ $ \b . :
\b(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
.
4.5, 4.6 4.7.
4.5. Accurately Validate Traditional Date Formats
mm/dd/yy, mm/
dd/yyyy, dd/mm/yy dd/mm/yyyy. , , 31 .
C#
, :
    DateTime foundDate;
    Match matchResult = Regex.Match(SubjectString,
        "^(?<month>[0-3]?[0-9])/(?<day>[0-3]?[0-9])/" +
        "(?<year>(?:[0-9]{2})?[0-9]{2})$");
    if (matchResult.Success) {
        int year = int.Parse(matchResult.Groups["year"].Value);
        // Expand two-digit years: 00-49 become 20xx, 50-99 become 19xx
        if (year < 50) year += 2000;
        else if (year < 100) year += 1900;
        try {
            foundDate = new DateTime(year,
                int.Parse(matchResult.Groups["month"].Value),
                int.Parse(matchResult.Groups["day"].Value));
        } catch {
            // Invalid date
        }
    }
, :
    DateTime foundDate;
    Match matchResult = Regex.Match(SubjectString,
        "^(?<day>[0-3]?[0-9])/(?<month>[0-3]?[0-9])/" +
        "(?<year>(?:[0-9]{2})?[0-9]{2})$");
    if (matchResult.Success) {
        int year = int.Parse(matchResult.Groups["year"].Value);
        // Expand two-digit years: 00-49 become 20xx, 50-99 become 19xx
        if (year < 50) year += 2000;
        else if (year < 100) year += 1900;
        try {
            foundDate = new DateTime(year,
                int.Parse(matchResult.Groups["month"].Value),
                int.Parse(matchResult.Groups["day"].Value));
        } catch {
            // Invalid date
        }
    }
Perl
, :
@daysinmonth = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
$validdate = 0;
if ($subject =~ m!^([0-3]?[0-9])/([0-3]?[0-9])/((?:[0-9]{2})?[0-9]{2})$!) {
$month = $1;
$day = $2;
$year = $3;
$year += 2000 if $year < 50;
$year += 1900 if $year < 100;
if ($month == 2 && $year % 4 == 0 && ($year % 100 != 0 ||
$year % 400 == 0)) {
$validdate = 1 if $day >= 1 && $day <= 29;
} elsif ($month >= 1 && $month <= 12) {
$validdate = 1 if $day >= 1 && $day <= $daysinmonth[$month-1];
}
}
, :
@daysinmonth = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
$validdate = 0;
if ($subject =~ m!^([0-3]?[0-9])/([0-3]?[0-9])/((?:[0-9]{2})?[0-9]{2})$!) {
$day = $1;
$month = $2;
$year = $3;
$year += 2000 if $year < 50;
$year += 1900 if $year < 100;
if ($month == 2 && $year % 4 == 0 && ($year % 100 != 0 ||
$year % 400 == 0)) {
$validdate = 1 if $day >= 1 && $day <= 29;
} elsif ($month >= 1 && $month <= 12) {
$validdate = 1 if $day >= 1 && $day <= $daysinmonth[$month-1];
}
}
, :
^(?:
# (29 )
(?<month>0?2)/(?<day>[12][0-9]|0?[1-9])
|
# 30
(?<month>0?[469]|11)/(?<day>30|[12][0-9]|0?[1-9])
|
# 31
(?<month>0?[13578]|1[02])/(?<day>3[01]|[12][0-9]|0?[1-9])
)
#
/(?<year>(?:[0-9]{2})?[0-9]{2})$
:
: .NET
^(?:
# (29 )
(0?2)/([12][0-9]|0?[1-9])
|
# 30
(0?[469]|11)/(30|[12][0-9]|0?[1-9])
|
# 31
(0?[13578]|1[02])/(3[01]|[12][0-9]|0?[1-9])
)
#
/((?:[0-9]{2})?[0-9]{2})$
:
: .NET, Java, PCRE, Perl, Python, Ruby
^(?:(0?2)/([12][0-9]|0?[1-9])|(0?[469]|11)/(30|[12][0-9]|0?[1-9])|
(0?[13578]|1[02])/(3[01]|[12][0-9]|0?[1-9]))/((?:[0-9]{2})?[0-9]{2})$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, :
^(?:
# (29 )
(?<day>[12][0-9]|0?[1-9])/(?<month>0?2)
|
# 30
(?<day>30|[12][0-9]|0?[1-9])/(?<month>0?[469]|11)
|
# 31
(?<day>3[01]|[12][0-9]|0?[1-9])/(?<month>0?[13578]|1[02])
)
#
/(?<year>(?:[0-9]{2})?[0-9]{2})$
:
: .NET
^(?:
# (29 )
([12][0-9]|0?[1-9])/(0?2)
|
# 30
(30|[12][0-9]|0?[1-9])/(0?[469]|11)
|
# 31
(3[01]|[12][0-9]|0?[1-9])/(0?[13578]|1[02])
)
#
/((?:[0-9]{2})?[0-9]{2})$
:
: .NET, Java, PCRE, Perl, Python, Ruby
^(?:([12][0-9]|0?[1-9])/(0?2)|(30|[12][0-9]|0?[1-9])/(0?[469]|11)|
(3[01]|[12][0-9]|0?[1-9])/(0?[13578]|1[02]))/((?:[0-9]{2})?[0-9]{2})$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, . , , //, ,
. , 0 39 .
mm/dd/yyyy dd/mm/yyyy , , .
, . . C#
DateTime, .NET.
. ,
. -
, , , .
,
, ,
. . , . 1 2. 30
3 4. 31
5 6. 7
.
.NET . .NET ( 2.11)
.
.NET , , month day, , . , , ,
, , . .
, , , , . , ,
. .
, . , , 2 2007 29 2008 ,
d/m/yy dd/mm/yyyy:
# 2 2007 29 2008
^(?:
# 2 2007 31 2007
(?:
# 2 31
(?<day>3[01]|[12][0-9]|0?[2-9])/(?<month>0?5)/(?<year>2007)
|
# 1 31
(?:
# 30
(?<day>30|[12][0-9]|0?[1-9])/(?<month>0?[69]|11)
|
# 31
(?<day>3[01]|[12][0-9]|0?[1-9])/(?<month>0?[78]|1[02])
)
/(?<year>2007)
)
|
# 1 2008 29 2008
(?:
# 1 29
(?<day>[12][0-9]|0?[1-9])/(?<month>0?8)/(?<year>2008)
|
# 1 30
(?:
#
(?<day>[12][0-9]|0?[1-9])/(?<month>0?2)
|
# 30
(?<day>30|[12][0-9]|0?[1-9])/(?<month>0?[46])
|
# 31
(?<day>3[01]|[12][0-9]|0?[1-9])/(?<month>0?[1357])
)
/(?<year>2008)
)
)$
:
: .NET, Java, PCRE, Perl, Python, Ruby
.
4.4, 4.6 4.7.
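The approach of the C# and Perl solutions (a simple regex to pick out the numbers, then procedural code to check the calendar) can also be sketched in Python. The function name parse_mdy is hypothetical, and leaving the calendar check to datetime.date is a design choice of this sketch rather than something the recipe prescribes.

```python
import re
from datetime import date

# Month, day, and a two- or four-digit year, separated by forward slashes.
DATE = re.compile(r"^([0-3]?[0-9])/([0-3]?[0-9])/((?:[0-9]{2})?[0-9]{2})$")

def parse_mdy(text):
    """Parse m/d/yy or mm/dd/yyyy; return a date, or None if invalid (hypothetical helper)."""
    match = DATE.match(text)
    if match is None:
        return None
    month, day, year = (int(group) for group in match.groups())
    # Same two-digit year pivot as the C# and Perl solutions.
    if year < 50:
        year += 2000
    elif year < 100:
        year += 1900
    try:
        # date() rejects impossible dates such as February 30th.
        return date(year, month, day)
    except ValueError:
        return None

print(parse_mdy("2/29/2008"))
print(parse_mdy("2/29/2007"))
```

Leap years and month lengths are handled by the date constructor, so the regex can stay short.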
4.6. Validate Traditional Time Formats
12- :
^(1[0-2]|0?[1-9]):([0-5]?[0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
24- :
^(2[0-3]|[01]?[0-9]):([0-5]?[0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 12- :
^(1[0-2]|0?[1-9]):([0-5]?[0-9]):([0-5]?[0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 24- :
^(2[0-3]|[01]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
. , .
, .
60 , 60 . , .
. [0-5]?[0-9] 0 5, 0 9. 0 59.
.
0 9. 10 00 09, . 2.3 2.12
, .
( 2.8).
,
. 12- , 0, 10 , 1, 0, 1 2.
1[0-2]|0?[1-9] . 24- ,
0 1, 10 ,
2, 0 3. ,
2[0-3]|[01]?[0-9] . 10 . , , .
, , . , , . 2.9 , .
3.9 , , .
, , , .
, . , , , ,
.
, , ,
,
^ $ . . , , 12:12 9912:1299.
, , .
.
.
^ $ \b . :
\b(2[0-3]|[01]?[0-9]):([0-5]?[0-9])\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
( 2.16). The
time is 16:08:42 sharp. , :
(?<![:\w])(2[0-3]|[01]?[0-9]):([0-5]?[0-9])(?![:\w])
:
: .NET, Java, PCRE, Perl, Python, Ruby 1.9
.
4.4, 4.5 4.7.
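The difference between the word boundary variant and the lookaround variant shows up when the text contains a time with seconds. The sketch below runs both regexes over a sample sentence; the variable names and the second sentence of the sample text are assumptions made for this illustration.

```python
import re

# Variant with word boundaries: also finds hours and minutes inside 16:08:42.
WORD_BOUNDARY = re.compile(r"\b(2[0-3]|[01]?[0-9]):([0-5]?[0-9])\b")
# Variant with lookaround: rejects matches adjacent to a colon or word character.
LOOKAROUND = re.compile(r"(?<![:\w])(2[0-3]|[01]?[0-9]):([0-5]?[0-9])(?![:\w])")

text = "The time is 16:08:42 sharp. Lunch is at 12:30."
loose = WORD_BOUNDARY.findall(text)
strict = LOOKAROUND.findall(text)
print(loose)
print(strict)
```

The word boundary version reports 16:08 as a time without seconds, while the lookaround version skips 16:08:42 entirely and finds only 12:30.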
4.7. Validate ISO 8601 Dates and Times
/ ISO 8601, . ,
XML Schema date, time dateTime
ISO 8601.
,
2008-08. :
^([0-9]{4})-(1[0-2]|0[1-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^(?<year>[0-9]{4})-(?<month>1[0-2]|0[1-9])$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
^(?P<year>[0-9]{4})-(?P<month>1[0-2]|0[1-9])$
:
: PCRE, Python
, 2008-08-30. . YYYYMMDD
YYYYMM-DD, ISO 8601:
^([0-9]{4})-?(1[0-2]|0[1-9])-?(3[0-1]|0[1-9]|[1-2][0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^(?<year>[0-9]{4})-?(?<month>1[0-2]|0[1-9])-?
(?<day>3[0-1]|0[1-9]|[1-2][0-9])$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
, 2008-08-30. .
YYYY-MMDD YYYYMM-DD. :
^([0-9]{4})(-)?(1[0-2]|0[1-9])(?(2)-)(3[0-1]|0[1-9]|[1-2][0-9])$
:
: .NET, PCRE, Perl, Python
, 2008-08-30. .
YYYY-MMDD YYYYMM-DD. :
^([0-9]{4})(?:(1[0-2]|0[1-9])|-?(1[0-2]|0[1-9])-?)
(3[0-1]|0[1-9]|[1-2][0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 2008-W35. :
^([0-9]{4})-?W(5[0-3]|[1-4][0-9]|0[1-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^(?<year>[0-9]{4})-?W(?<week>5[0-3]|[1-4][0-9]|0[1-9])$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
, 2008-W35-6. :
^([0-9]{4})-?W(5[0-3]|[1-4][0-9]|0[1-9])-?([1-7])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^(?<year>[0-9]{4})-?W(?<week>5[0-3]|[1-4][0-9]|0[1-9])-?(?<day>[1-7])$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
, 2008-243. :
^([0-9]{4})-?(36[0-6]|3[0-5][0-9]|[12][0-9]{2}|0[1-9][0-9]|00[1-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^(?<year>[0-9]{4})-?
(?<day>36[0-6]|3[0-5][0-9]|[12][0-9]{2}|0[1-9][0-9]|00[1-9])$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
, 17:21. :
^(2[0-3]|[01]?[0-9]):?([0-5]?[0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^(?<hour>2[0-3]|[01]?[0-9]):?(?<minute>[0-5]?[0-9])$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
, , 17:21:59. :
^(2[0-3]|[01]?[0-9]):?([0-5]?[0-9]):?([0-5]?[0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^(?<hour>2[0-3]|[01]?[0-9]):?(?<minute>[0-5]?[0-9]):?
(?<second>[0-5]?[0-9])$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
, Z, +07 +07:00. :
^(Z|[+-](?:2[0-3]|[01]?[0-9])(?::?(?:[0-5]?[0-9]))?)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, ,
17:21:59+07:00. .
:
^(2[0-3]|[01]?[0-9]):?([0-5]?[0-9]):?([0-5]?[0-9])
(Z|[+-](?:2[0-3]|[01]?[0-9])(?::?(?:[0-5]?[0-9]))?)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^(?<hour>2[0-3]|[01]?[0-9]):?(?<minute>[0-5]?[0-9]):?(?<sec>[0-5]?[0-9])
(?<timezone>Z|[+-](?:2[0-3]|[01]?[0-9])(?::?(?:[0-5]?[0-9]))?)$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
Date, with optional time zone, such as 2008-08-30 or 2008-08-30+07:00. Hyphens are required. This is the date type of XML Schema:
^(-?(?:[1-9][0-9]*)?[0-9]{4})-(1[0-2]|0[1-9])-(3[0-1]|0[1-9]|[1-2][0-9])
(Z|[+-](?:2[0-3]|[0-1][0-9]):[0-5][0-9])?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^(?<year>-?(?:[1-9][0-9]*)?[0-9]{4})-(?<month>1[0-2]|0[1-9])-
(?<day>3[0-1]|0[1-9]|[1-2][0-9])
(?<timezone>Z|[+-](?:2[0-3]|[0-1][0-9]):[0-5][0-9])?$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
, 01:45:36 01:45:36.123+07:00. time XML
Schema:
^(2[0-3]|[0-1][0-9]):([0-5][0-9]):([0-5][0-9])(\.[0-9]+)?
(Z|[+-](?:2[0-3]|[0-1][0-9]):[0-5][0-9])?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^(?<hour>2[0-3]|[0-1][0-9]):(?<minute>[0-5][0-9]):(?<second>[0-5][0-9])
(?<ms>\.[0-9]+)?(?<timezone>Z|[+-](?:2[0-3]|[0-1][0-9]):[0-5][0-9])?$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
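Date and time, with optional fractional seconds and time zone, such as 2008-08-30T01:45:36 or 2008-08-30T01:45:36.123Z. This is the dateTime type of XML Schema:
^(-?(?:[1-9][0-9]*)?[0-9]{4})-(1[0-2]|0[1-9])-(3[0-1]|0[1-9]|[1-2][0-9])
T(2[0-3]|[0-1][0-9]):([0-5][0-9]):([0-5][0-9])(\.[0-9]+)?
(Z|[+-](?:2[0-3]|[0-1][0-9]):[0-5][0-9])?$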
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^(?<year>-?(?:[1-9][0-9]*)?[0-9]{4})-(?<month>1[0-2]|0[1-9])-
(?<day>3[0-1]|0[1-9]|[1-2][0-9])T(?<hour>2[0-3]|[0-1][0-9]):
(?<minute>[0-5][0-9]):(?<second>[0-5][0-9])(?<ms>\.[0-9]+)?
(?<timezone>Z|[+-](?:2[0-3]|[0-1][0-9]):[0-5][0-9])?$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
ISO 8601 . , ,
,
, ISO 8601, . , XML Schema
. ,
, . , , . , (?:group) . ,
.
3.9 , , , .
,
, .
. .
.NET, PCRE 7, Perl 5.10 Ruby 1.9 (?<name>group) . PCRE Python,
, (?P<name>group) ,
P . 2.11 3.9.
. , 01
31. 32- 13- .
, 31 .
4.5 , .
, , ,
, 4.4 4.6.
.
4.4, 4.5 4.6.
4.8. Limit Input to Alphanumeric Characters
- .
. , . , , ,
. .
^[A-Z0-9]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Ruby
    if subject =~ /^[A-Z0-9]+$/i
        puts "Subject is alphanumeric"
    else
        puts "Subject is not alphanumeric"
    end
3.4 3.5.
, :
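^          # Assert position at the beginning of the string.
[A-Z0-9]   # Match a character from "A" to "Z" or from "0" to "9"...
  +        #   between one and unlimited times.
$          # Assert position at the end of the string.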
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^ $ , . , . + . , + * . *
, .
ASCII
128 7- ASCII. 33 :
^[\x00-\x7F]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
ASCII
,
ASCII, .
( 0x0A 0x0D, ) , \n (
) \r ( ):
^[\n\r\x20-\x7E]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
ISO-8859-1 Windows-1252
ISO-8859-1 Windows-1252 ( ANSI) , Latin-1 ( , ISO/IEC 8859-1).
0x80
0x9F. ISO-8859-1
, Windows-1252
.
,
, , ,
Windows.
, ISO8859-1 Windows-1252 ( ):
^[\x00-\x7F\xA0-\xFF]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
, [A-Z0-9] , . : \x00-\x7F \xA0-\xFF.
-
. -
,
, :
^[\p{L}\p{N}]+$
:
: .NET, Java, PCRE, Perl, Ruby 1.9
, , . , JavaScript, Python
Ruby 1.8. , PCRE, , PCRE UTF-8.
preg PHP ( PCRE) /u .
Python:
^[^\W_]+$
:
: Python
Python UNICODE U . , . \w ,
-
. \W ,
. , , ,
.1
.
4.9 , ,
.
( ) , , ( 2.16) ( ,
. 58 2.3).
4.9. Limit the Length of Text
, 1 10 A Z.
, , . , JavaScript length, , .
, , . , 1 10 , A Z. , ,
, AZ.
^[A-Z]{1,10}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Perl
    if ($ARGV[0] =~ /^[A-Z]{1,10}$/) {
        print "Input is valid\n";
    } else {
        print "Input is invalid\n";
    }
3.5.
,
:
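^         # Assert position at the beginning of the string.
[A-Z]     # Match one letter from "A" to "Z"...
  {1,10}  #   between 1 and 10 times.
$         # Assert position at the end of the string.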
: .NET, Java, PCRE, Perl, Python, Ruby
^ $ ,
,
10 . [A-Z] A Z, {1,10}
1 10 . , , ,
.
, [A-Z]
.
a z,
[A-Za-z]
. 3.4 , .
, , [A-z] .
,
. AZ az
ASCII .
[A-z] [A-Z[\]^_`a-z] .
, {1,10} ,
, , , , .
2.16, ( ) , , ^ $ , . ,
. , (?=...) , , -
, . , . :
^(?=.{1,10}$).*
:
: .NET, Java, PCRE, Perl, Python, Ruby
^(?=[\S\s]{1,10}$)[\S\s]*
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, $ , , . , . .* ( [\S\s]* , , JavaScript) , .
, , . , 3.4. JavaScript , , . , .
60 2.4.
, 10 100 :
^\s*(?:\S\s*){10,100}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
: .NET, Java, PCRE, Perl, Ruby 1.9
PCRE, PCRE UTF-8. PHP UTF-8
/u.
Separator
, \p{Z} , \s , . , \p{Z}
\s , . \s
0x09 0x0D (, , , ), Separator. \p{Z} \s .
{10,100} , . ,
. , .
, ,
, . , 10 100 , , ,
:
^\W*(?:\w+\b\W*){10,100}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
, ASCII, .
^[^\p{L}\p{N}_]*(?:[\p{L}\p{N}_]+\b[^\p{L}\p{N}_]*){10,100}$
:
: .NET, Java, Perl
^[^\p{L}\p{N}_]*(?:[\p{L}\p{N}_]+(?:[^\p{L}\p{N}_]+|$)){10,100}$
:
: .NET, Java, PCRE, Perl, Ruby 1.9
PCRE UTF-8. UTF-8
PHP /u.
, ( ) , ,
. 70,
2.6.
,
( \p{L} \p{N} ), , , \w
\W .
, , . \W ( [^\p{L}\p{N}_] )
, . , , , ,
\b \w \W ( [\p{L}\p{N}_] [^\p{L}\p{N}_] ), , .
,
.
(
PCRE Ruby 1.9) . ( ) ( ) , . , , , PCRE Ruby \b . Java \
b , , \w .
,
, , ASCII, JavaScript Ruby 1.8. ,
,
, :
^\s*(?:\S+(?:\s+|$)){10,100}$
:
: .NET, Java, JavaScript, Perl, PCRE, Python, Ruby
,
, . , , , ( far-reaching), , .
.
4.8 4.10.
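The word-count variation can be wrapped in a function that fills in the two limits. This is a minimal sketch; the name word_count_between and the decision to pass the limits as parameters are assumptions made for the illustration.

```python
import re

def word_count_between(text, low, high):
    """True if text holds between low and high words (hypothetical helper)."""
    # Same structure as ^\W*(?:\w+\b\W*){10,100}$ with the limits filled in.
    pattern = r"^\W*(?:\w+\b\W*){%d,%d}$" % (low, high)
    return re.search(pattern, text) is not None

print(word_count_between("one two three", 2, 4))
print(word_count_between("far-reaching", 2, 2))
```

As the recipe notes, a hyphenated word such as far-reaching counts as two words, because the hyphen is a nonword character.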
4.10. Limit the Number of Lines in Text
,
, .
, , , ,
. , . , , ,
MS-DOC/Windows ( \r\n ), Mac OS
( \r ) UNIX/Linux/OS X ( \n ).
.
, (?:...) , , (? ...) ,
, . , ( \A ^ ,
\z , \Z $ ). . :
\A(?>(?>\r\n?|\n)?[^\r\n]*){0,5}\z
:
: .NET, Java, PCRE, Perl, Ruby
\A(?:(?:\r\n?|\n)?[^\r\n]*){0,5}\Z
:
: Python
^(?:(?:\r\n?|\n)?[^\r\n]*){0,5}$
:
: JavaScript
PHP (PCRE)
    if (preg_match('/\A(?>(?>\r\n?|\n)?[^\r\n]*){0,5}\z/', $_POST['subject'])) {
        print 'Subject contains five or fewer lines';
    } else {
        print 'Subject contains more than five lines';
    }
3.5.
, ,
, , MS-DOS/Windows, Mac OS UNIX/
Linux/OS X,
. , , .
JavaScript, . JavaScript , .
:
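^           # Assert position at the beginning of the string.
(?:         # Group but don't capture...
  (?:       #   Group but don't capture...
    \r      #     Match a carriage return (CR, ASCII position 0x0D).
    \n      #     Match a line feed (LF, ASCII position 0x0A)...
      ?     #       between zero and one time.
   |        #    or...
    \n      #     Match a line feed character.
  )         #   End the noncapturing group.
    ?       #     Repeat the preceding group between zero and one time.
  [^\r\n]   #   Match any single character except CR or LF...
    *       #     between zero and unlimited times.
)           # End the noncapturing group.
  {0,5}     #   Repeat the preceding group between zero and five times.
$           # Assert position at the end of the string.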
:
: .NET, Java, PCRE, Perl, Python, Ruby
^ . , ,
, ,
.
, . , ( ). -
. ,
, .
( , , ).
, (?:[^\r\n]*(?:\r\n?|\n)?) ,
.
,
.
, ( \r\n , , MS-DOS/
Windows)
( \r ,
Mac OS)
(\n, ,
UNIX/Linux/OS X)
.
( , Python JavaScript) , . , , ,
(
2.15).
, . ^ $
. , , , ,
\A , \Z \z .
, , .
....
Perl ,
. Perl
$ , . $ Perl . , Perl
, : \Z \z .
\Z Perl ,
$ , , , ^ $ . \z , . , , , \z , , .
Perl , /. .NET, Java, PCRE Ruby , \Z \z , ,
Perl. Python \Z (
), ,
, \z (
) Perl. JavaScript z, , , , $ ( , ^ $ ).
\A .
, , JavaScript
( ).
, , , , . ,
, , .
, , , MSDOS/Windows, Mac OS UNIX/Linux/OS X. -
, . , .
\A(?>\R?\V*){0,5}\z
:
: PCRE 7 (with the PCRE_BSR_UNICODE option), Perl 5.10
\A(?>(?>\r\n?|[\n-\f\x85\x{2028}\x{2029}])?
[^\n-\r\x85\x{2028}\x{2029}]*){0,5}\z
:
: PCRE, Perl
\A(?>(?>\r\n?|[\n-\f\x85\u2028\u2029])?[^\n-\r\x85\u2028\u2029]*){0,5}\z
:
: .NET, Java, Ruby
\A(?:(?:\r\n?|[\n-\f\x85\u2028\u2029])?[^\n-\r\x85\u2028\u2029]*){0,5}\Z
:
: Python
^(?:(?:\r\n?|[\n-\f\x85\u2028\u2029])?[^\n-\r\x85\u2028\u2029]*){0,5}$
:
: JavaScript
, . 4.1 .
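Table 4.1. Line separators
Unicode sequence   Regex equivalent       Name                                       Used by
U+000D U+000A      \r\n                   Carriage return and line feed (CRLF)       Windows and MS-DOS text files
U+000A             \n                     Line feed (LF)                             UNIX, Linux, and OS X text files
U+000B             \v                     Line tabulation (vertical tab, VT)         (Rare)
U+000C             \f                     Form feed (FF)                             (Rare)
U+000D             \r                     Carriage return (CR)                       Mac OS text files
U+0085             \x85                   Next line (NEL)                            IBM mainframe text files (rare)
U+2028             \u2028 or \x{2028}     Line separator                             (Rare)
U+2029             \u2029 or \x{2029}     Paragraph separator                        (Rare)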
.
4.9.
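The Python form of the line-limit regex can be turned into a reusable check. In this minimal sketch, the function name has_at_most_lines and its max_lines parameter are assumptions; the pattern is the Python variant from the solution, with the upper limit filled in.

```python
import re

def has_at_most_lines(text, max_lines=5):
    """True if text consists of max_lines lines or fewer (hypothetical helper)."""
    # Each repetition consumes an optional line break (CRLF, CR, or LF)
    # followed by the rest of that line.
    pattern = r"\A(?:(?:\r\n?|\n)?[^\r\n]*){0,%d}\Z" % max_lines
    return re.match(pattern, text) is not None

print(has_at_most_lines("a\nb\r\nc\rd\ne"))
print(has_at_most_lines("1\n2\n3\n4\n5\n6"))
```

The first call mixes all three line break styles and still counts five lines, which is the point of matching \r\n?|\n rather than a single line break character.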
4.11. Validate Affirmative Responses
,
, .
,
, true, t, yes, y, okay, ok 1, .
,
.
^(?:1|t(?:rue)?|y(?:es)?|ok(?:ay)?)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
JavaScript
    var yes = /^(?:1|t(?:rue)?|y(?:es)?|ok(?:ay)?)$/i;
    if (yes.test(subject)) {
        alert("Yes");
    } else {
        alert("No");
    }
3.4 3.5.
,
. , ,
:
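^            # Assert position at the beginning of the string.
(?:          # Group but don't capture...
  1          #   Match a literal "1".
 |           #  or...
  t(?:rue)?  #   Match "t", optionally followed by "rue".
 |           #  or...
  y(?:es)?   #   Match "y", optionally followed by "es".
 |           #  or...
  ok(?:ay)?  #   Match "ok", optionally followed by "ay".
)            # End the noncapturing group.
$            # Assert position at the end of the string.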
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
. .
, ^(?:[1ty]|true|yes|ok(?:ay)?)$
. , ^(?:1|t|true|y|yes|ok|okay)$ , , , , , -
| ( ? ).
,
,
. .
, . , , , ^true|yes$ ,
, true
yes, . ^(?:true|yes)$
, true yes, .
.
5.2 5.3.
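For comparison with the JavaScript example, here is a minimal Python sketch of the same check. The function name is_affirmative is hypothetical; re.IGNORECASE plays the role of the /i flag.

```python
import re

# The noncapturing group keeps ^ and $ applied to every alternative.
YES = re.compile(r"^(?:1|t(?:rue)?|y(?:es)?|ok(?:ay)?)$", re.IGNORECASE)

def is_affirmative(answer):
    """True if answer is one of the accepted affirmative values (hypothetical helper)."""
    return YES.match(answer) is not None

print(is_affirmative("OKAY"))
print(is_affirmative("yess"))
```

Without the enclosing group, a pattern such as ^true|yes$ would accept any string that merely starts with true or ends with yes.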
4.12. Validate Social Security Numbers
, .
,
,
. ,
, , , . .
^(?!000|666)(?:[0-6][0-9]{2}|7(?:[0-6][0-9]|7[0-2]))-
(?!00)[0-9]{2}-(?!0000)[0-9]{4}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Python
    if re.match(r"^(?!000|666)(?:[0-6][0-9]{2}|7(?:[0-6][0-9]|7[0-2]))-"
                r"(?!00)[0-9]{2}-(?!0000)[0-9]{4}$", sys.argv[1]):
        print("SSN is valid")
    else:
        print("SSN is invalid")
3.5.
AAA-GG-SSSS:
. 000 666
, 772.
, 01 99.
,
0001 9999.
,
. , :
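^              # Assert position at the beginning of the string.
(?!000|666)    # Assert that neither "000" nor "666" can be matched here.
(?:            # Group but don't capture...
  [0-6]        #   Match a character in the range between "0" and "6".
  [0-9]{2}     #   Match a digit, exactly two times.
 |             #  or...
  7            #   Match a literal "7".
  (?:          #   Group but don't capture...
    [0-6]      #     Match a character in the range between "0" and "6".
    [0-9]      #     Match a digit.
   |           #    or...
    7          #     Match a literal "7".
    [0-2]      #     Match a character in the range between "0" and "2".
  )            #   End the noncapturing group.
)              # End the noncapturing group.
-              # Match a literal "-".
(?!00)         # Assert that "00" cannot be matched here.
[0-9]{2}       # Match a digit, exactly two times.
-              # Match a literal "-".
(?!0000)       # Assert that "0000" cannot be matched here.
[0-9]{4}       # Match a digit, exactly four times.
$              # Assert position at the end of the string.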
:
: .NET, Java, PCRE, Perl, Python, Ruby
^ $ ,
,
, . . - , , , , .
, . (?!000|666) , 000 666. 772.
,
, . -, ,
, 0 6,
,
000 666. : [0-6][0-9]{2} . , 7, , , , (?:[0-6][0-9]{2}|7) ,
.
, 7, ,
700 772, , ,
7. 0 6, . 7,
, 0 2.
, 7, 7(?:[0-6][0-9]|7[0-2]) , 7,
.
, ,
, (?:[0-6][0-9]{2}|7(?:[0-6][0-9]|7[0-2])) .
, 000 772.
, ^ $ . - .
\b(?!000|666)(?:[0-6][0-9]{2}|7(?:[0-6][0-9]|7[0-2]))-
(?!00)[0-9]{2}-(?!0000)[0-9]{4}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
.
http://www.socialsecurity.
gov ,
.
(Social Security
Number Verification Service, SSNVS), : http://www.socialsecurity.
gov/employer/ssnv.htm, , .
,
, , 6.5.
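The rules encoded in the regex (area number 001 to 772 except 666, group number 01 to 99, serial number 0001 to 9999) can be exercised with a few boundary values. The function name is_valid_ssn below is hypothetical, and the sample numbers are arbitrary test values chosen for this sketch.

```python
import re

SSN = re.compile(
    r"^(?!000|666)(?:[0-6][0-9]{2}|7(?:[0-6][0-9]|7[0-2]))-"  # area number
    r"(?!00)[0-9]{2}-"                                         # group number
    r"(?!0000)[0-9]{4}$"                                       # serial number
)

def is_valid_ssn(text):
    """True if text is formatted as a valid SSN per the regex (hypothetical helper)."""
    return SSN.match(text) is not None

print(is_valid_ssn("772-99-9999"))
print(is_valid_ssn("773-12-3456"))
```

The check covers formatting rules only. It cannot tell whether a number has actually been assigned to anyone.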
4.13. Validate ISBNs
(International Standard Book Number, ISBN), (ISBN-10) (ISBN-13) .
ISBN
. ISBN 978-0-596-52068-7, ISBN-13: 978-0-596-52068-7, 978 0 596 52068 7, 9780596520687,
ISBN-10 0-596-52068-9 0-596-52068-9
ISBN.
ISBN
, .
, , ISBN, .
ISBN-10:
^(?:ISBN(?:-10)?:? )?(?=[-0-9X ]{13}$|[0-9X]{10}$)[0-9]{1,5}[- ]?
(?:[0-9]+[- ]?){2}[0-9X]$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
ISBN-13:
^(?:ISBN(?:-13)?:? )?(?=[-0-9 ]{17}$|[0-9]{13}$)97[89][- ]?[0-9]{1,5}
[- ]?(?:[0-9]+[- ]?){2}[0-9]$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
ISBN-10 ISBN-13:
^(?:ISBN(?:-1[03])?:? )?(?=[-0-9 ]{17}$|[-0-9X ]{13}$|[0-9X]{10}$)
(?:97[89][- ]?)?[0-9]{1,5}[- ]?(?:[0-9]+[- ]?){2}[0-9X]$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
JavaScript
    // `regex` checks for ISBN-10 or ISBN-13 format
    var regex = /^(?:ISBN(?:-1[03])?:? )?(?=[-0-9 ]{17}$|[-0-9X ]{13}$|[0-9X]{10}$)(?:97[89][- ]?)?[0-9]{1,5}[- ]?(?:[0-9]+[- ]?){2}[0-9X]$/;
    if (regex.test(subject)) {
        // Remove the optional ISBN prefix and the separators,
        // then split into an array of characters
        var chars = subject.replace(/[- ]|^ISBN(?:-1[03])?:?/g, "").split("");
        // Remove the final ISBN digit from `chars`,
        // and assign it to `last`
        var last = chars.pop();
        var sum = 0;
        var digit = 10;
        var check;
        if (chars.length == 9) {
            // Compute the ISBN-10 check digit
            for (var i = 0; i < chars.length; i++) {
                sum += digit * parseInt(chars[i], 10);
                digit -= 1;
            }
            check = 11 - (sum % 11);
            if (check == 10) {
                check = "X";
            } else if (check == 11) {
                check = "0";
            }
        } else {
            // Compute the ISBN-13 check digit
            for (var i = 0; i < chars.length; i++) {
                sum += (i % 2 * 2 + 1) * parseInt(chars[i], 10);
            }
            check = 10 - (sum % 10);
            if (check == 10) {
                check = "0";
            }
        }
        if (check == last) {
            alert("Valid ISBN");
        } else {
            alert("Invalid ISBN check digit");
        }
    } else {
        alert("Invalid ISBN");
    }
Python
    import re
    import sys

    # `regex` checks for ISBN-10 or ISBN-13 format
    regex = re.compile("^(?:ISBN(?:-1[03])?:? )?(?=[-0-9 ]{17}$|"
                       "[-0-9X ]{13}$|[0-9X]{10}$)(?:97[89][- ]?)?[0-9]{1,5}[- ]?"
                       "(?:[0-9]+[- ]?){2}[0-9X]$")
    subject = sys.argv[1]
    if regex.search(subject):
        # Remove the optional ISBN prefix and the separators,
        # then split into a list of characters
        chars = list(re.sub("[- ]|^ISBN(?:-1[03])?:?", "", subject))
        # Remove the final ISBN digit from `chars`,
        # and assign it to `last`
        last = chars.pop()
        if len(chars) == 9:
            # Compute the ISBN-10 check digit
            val = sum((x + 2) * int(y) for x, y in enumerate(reversed(chars)))
            check = 11 - (val % 11)
            if check == 10:
                check = "X"
            elif check == 11:
                check = "0"
        else:
            # Compute the ISBN-13 check digit
            val = sum((x % 2 * 2 + 1) * int(y) for x, y in enumerate(chars))
            check = 10 - (val % 10)
            if check == 10:
                check = "0"
        if str(check) == last:
            print("Valid ISBN")
        else:
            print("Invalid ISBN check digit")
    else:
        print("Invalid ISBN")
3.5.
ISBN
. 10- ISBN ISO 2108 1970 . ISBN, 1 2007 , 13 .
ISBN-10 ISBN-13
, . ,
. . :
13- ISBN 978 979.
,
.
.
ISBN.
.
ISBN-10 0
9 X ( 10 ),
ISBN-13 0 9. ISBN , .
ISBN-10
ISBN-13, . ,
.
Java ,
:
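^                   # Assert position at the beginning of the string.
(?:                 # Group but don't capture...
  ISBN              #   Match the text "ISBN".
  (?:-1[03])?       #   Optionally match the text "-10" or "-13".
  :?                #   Optionally match a literal ":".
  \                 #   Match a space character (escaped).
)?                  # Repeat the group between zero and one time.
(?=                 # Assert that the following can be matched here...
  [-0-9\ ]{17}$     #   Match 17 hyphens, digits, and spaces,
                    #   then the end of the string. Or...
 |
  [-0-9X\ ]{13}$    #   Match 13 hyphens, digits, Xs, and spaces,
                    #   then the end of the string. Or...
 |
  [0-9X]{10}$       #   Match 10 digits and Xs, then the end of the string.
)                   # End the positive lookahead.
(?:                 # Group but don't capture...
  97[89]            #   Match the text "978" or "979".
  [-\ ]?            #   Optionally match a hyphen or space.
)?                  # Repeat the group between zero and one time.
[0-9]{1,5}          # Match a digit between one and five times.
[-\ ]?              # Optionally match a hyphen or space.
(?:                 # Group but don't capture...
  [0-9]+            #   Match a digit between one and unlimited times.
  [-\ ]?            #   Optionally match a hyphen or space.
){2}                # Repeat the group exactly two times.
[0-9X]              # Match a digit or "X".
$                   # Assert position at the end of the string.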
:
: .NET, Java, PCRE, Perl, Python, Ruby
(?:ISBN(?:-1[03])?:?)? , , ( , , ):
ISBN
ISBN-10
ISBN-13
ISBN:
ISBN-10:
ISBN-13:
( ISBN )
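Next, the positive lookahead (?=[-0-9 ]{17}$|[-0-9X ]{13}$|[0-9X]{10}$) enforces one of three options (separated by the | alternation operator) for the length and character set of the rest of the match. All three options end with the $ anchor, which ensures that there is no trailing text that does not fit one of the patterns:
[-0-9 ]{17}$
    Allows an ISBN-13 with four separators (17 characters in total)
[-0-9X ]{13}$
    Allows an ISBN-13 with no separators, or an ISBN-10 with three separators (13 characters in total)
[0-9X]{10}$
    Allows an ISBN-10 with no separators (10 characters in total)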
, ISBN, . (?:97[89][-]?)? 978 979, ISBN-13. , , ISBN-10. [0-9]{1,5}
[-]?
, . (?:[0-9]+[-]?){2} , , . , [0-9X]$
.
, ,
(-
X), , ISBN. ,
ISBN , ( , ISBN-10 ISBN-13). JavaScript Python.
, .
ISBN-10
ISBN-10
0 10 ( 10
X). :
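1. Multiply each of the first 9 digits by a number in the descending sequence from 10 to 2, and sum the results.
2. Divide the sum by 11.
3. Subtract the remainder (not the quotient) from 11.
4. If the result is 11, use the number 0; if it is 10, use the letter X.
Here is how the ISBN-10 check digit is derived for 0-596-52068-?:
Step 1:
    sum = 10×0 + 9×5 + 8×9 + 7×6 + 6×5 + 5×2 + 4×0 + 3×6 + 2×8
        =    0 +  45 +  72 +  42 +  30 +  10 +   0 +  18 +  16
        = 233
Step 2:
    233 ÷ 11 = 21, remainder 2
Step 3:
    11 − 2 = 9
Step 4:
    9 [no substitution required]
The check digit is 9, so the complete sequence is ISBN 0-596-52068-9.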
ISBN-13
ISBN-13
0 9 :
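1. Multiply each of the first 12 digits by 1 or 3, alternating as you move from left to right, and sum the results.
2. Divide the sum by 10.
3. Subtract the remainder (not the quotient) from 10.
4. If the result is 10, use the number 0.
Here is how the ISBN-13 check digit is derived for 978-0-596-52068-?:
Step 1:
    sum = 1×9 + 3×7 + 1×8 + 3×0 + 1×5 + 3×9 + 1×6 + 3×5 + 1×2 + 3×0 + 1×6 + 3×8
        =   9 +  21 +   8 +   0 +   5 +  27 +   6 +  15 +   2 +   0 +   6 +  24
        = 123
Step 2:
    123 ÷ 10 = 12, remainder 3
Step 3:
    10 − 3 = 7
Step 4:
    7 [no substitution required]
The check digit is 7, so the complete sequence is ISBN 978-0-596-52068-7.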
ISBN
ISBN-10 ISBN-13
,
ISBN . , ISBN
. -,
( 10- 13-
) -, ISBN :
\bISBN(?:-1[03])?:? (?=[-0-9 ]{17}$|[-0-9X ]{13}$|[0-9X]{10}$)
(?:97[89][- ]?)?[0-9]{1,5}[- ]?(?:[0-9]+[- ]?){2}[0-9X]\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
ISBN
ISBN-10, ISBN-13, .
( 2.17), ,
ISBN-10 ISBN-13 ISBN .
ISBN-10 ISBN-13,
. , ,
, ISBN-10 ISBN-13,
. :
^
(?:ISBN(-1(?:(0)|3))?:?\ )?
(?(1)
(?(2)
(?=[-0-9X ]{13}$|[0-9X]{10}$)
[0-9]{1,5}[- ]?(?:[0-9]+[- ]?){2}[0-9X]$
|
(?=[-0-9 ]{17}$|[0-9]{13}$)
97[89][- ]?[0-9]{1,5}[- ]?(?:[0-9]+[- ]?){2}[0-9]$
)
|
(?=[-0-9 ]{17}$|[-0-9X ]{13}$|[0-9X]{10}$)
(?:97[89][- ]?)?[0-9]{1,5}[- ]?(?:[0-9]+[- ]?){2}[0-9X]$
)
$
:
: .NET, PCRE, Perl, Python
.
ISBN Users Manual ISBN http://
www.isbn-international.org.
http://www.isbn-international.org/en/identifiers/allidentifiers.html
, 1 5 , ISBN.
4.14. Validate ZIP Codes
ZIP- ( ),
:
(ZIP + 4). 12345 12345-6789
1234, 123456, 123456789 1234-56789.
^[0-9]{5}(?:-[0-9]{4})?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
VB.NET
    If Regex.IsMatch(subjectString, "^[0-9]{5}(?:-[0-9]{4})?$") Then
        Console.WriteLine("Valid ZIP code")
    Else
        Console.WriteLine("Invalid ZIP code")
    End If
3.5.
ZIP-, :
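^           # Assert position at the beginning of the string.
[0-9]{5}    # Match a digit, exactly five times.
(?:         # Group but don't capture...
  -         #   Match a literal "-".
  [0-9]{4}  #   Match a digit, exactly four times.
)           # End the noncapturing group.
  ?         #   Make the group optional.
$           # Assert position at the end of the string.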
:
: .NET, Java, PCRE, Perl, Python, Ruby
,
. ZIP-
,
^ $ , \b[0-9]{5}(?:-[0-9]{4})?\b .
.
4.15, 4.16 4.17.
4.15. Validate Canadian Postal Codes
, , .
^(?!.*[DFIOQU])[A-VXY][0-9][A-Z] [0-9][A-Z][0-9]$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, , D, F, I, O, Q U. [A-VXY] , , W Z . , . , K1A
0B1, , .
.
4.14, 4.16 4.17.
4.16. Validate U.K. Postcodes
,
, .
^[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][ABD-HJLNP-UW-Z]{2}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
- , . ,
, , .
.
.
BS7666 http://www.govtalk.gov.uk/gdsc/
html/frames/PostCode.htm, .
4.14, 4.16 4.17.
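The three postal code patterns from Recipes 4.14 to 4.16 can be compared in one short sketch. The function names are hypothetical, and the Canadian and U.K. patterns are written with the single space between the two halves of the code that the examples in the text (such as K1A 0B1) contain.

```python
import re

ZIP = re.compile(r"^[0-9]{5}(?:-[0-9]{4})?$")
CANADIAN = re.compile(r"^(?!.*[DFIOQU])[A-VXY][0-9][A-Z] [0-9][A-Z][0-9]$")
UK = re.compile(r"^[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][ABD-HJLNP-UW-Z]{2}$")

def is_zip_code(text):
    """True for ZIP or ZIP + 4 codes (hypothetical helper)."""
    return ZIP.match(text) is not None

def is_canadian_postal_code(text):
    """True for Canadian postal codes such as K1A 0B1 (hypothetical helper)."""
    return CANADIAN.match(text) is not None

def is_uk_postcode(text):
    """True for U.K. postcodes such as SW1A 1AA (hypothetical helper)."""
    return UK.match(text) is not None

print(is_zip_code("12345-6789"))
print(is_canadian_postal_code("K1A 0B1"))
print(is_uk_postcode("SW1A 1AA"))
```

All three checks are case sensitive as written. Uppercasing the input first, or passing re.IGNORECASE, would be one way to accept lowercase entries.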
4.17. Find Addresses with Post Office Boxes
, ,
, .
^(?:Post (?:Office )?|P[. ]?O\.? )?Box\b
: , ^ $
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
C#
Regex regexObj = new Regex(
    @"^(?:Post (?:Office )?|P[. ]?O\.? )?Box\b",
    RegexOptions.IgnoreCase | RegexOptions.Multiline
);
if (regexObj.IsMatch(subjectString)) {
    Console.WriteLine("The value does not appear to be a street address");
} else {
    Console.WriteLine("Good to go");
}
3.5.
, :
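^                  # Assert position at the beginning of a line.
(?:                # Group but don't capture...
  Post\            #   Match "Post ".
  (?:Office\ )?    #   Optionally match "Office ".
 |                 #  or...
  P[.\ ]?          #   Match "P" and an optional period or space character.
  O\.?\            #   Match "O", an optional period, and a space character.
)?                 # Make the group optional.
Box                # Match "Box".
\b                 # Assert position at a word boundary.
Regex options: Case insensitive, ^ and $ match at line breaks
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby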
, , :
Post Office Box
post box
P.O. box
P O Box
Po. box
PO Box
Box
,
,
, .
, , . ,
, , , .
4.18. , 341
.
4.14, 4.15 4.16.
4.18.
,
, ,
.
, , , .
, . ,
, .
. , ,
.
.
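^(.+?) ([^\s,]+)(,? (?:[JS]r\.?|III?|IV))?$
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
$2, $1$3
Replacement text flavors: .NET, Java, JavaScript, Perl, PHP
\2, \1\3
Replacement text flavors: Python, Ruby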
JavaScript
function formatName (name) {
    return name.replace(/^(.+?) ([^\s,]+)(,? (?:[JS]r\.?|III?|IV))?$/i,
                        "$2, $1$3");
}
3.15.
. , ,
, .
, :
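^               # Assert position at the beginning of the string.
(               # Capture the enclosed match to backreference 1...
  .+?           #   Match one or more characters, as few times as possible.
)               # End the capturing group.
\               # Match a literal space character.
(               # Capture the enclosed match to backreference 2...
  [^\s,]+       #   Match one or more characters that are not whitespace or commas.
)               # End the capturing group.
(               # Capture the enclosed match to backreference 3...
  ,?\           #   Match ", " or " ".
  (?:           #   Group but don't capture...
    [JS]r\.?    #     Match "Jr", "Jr.", "Sr", or "Sr.".
   |            #    or...
    III?        #     Match "II" or "III".
   |            #    or...
    IV          #     Match "IV".
  )             #   End the noncapturing group.
)?              # Make the group optional.
$               # Assert position at the end of the string.
Regex options: Case insensitive, free-spacing
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby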
:
4.18. , 343
, ,
( ).
.
, . ,
Sacha Baron Cohen Cohen, Sacha Baron,
Baron Cohen, Sacha.
, (, Charles de Gaulle (
) de Gaulle, Charles 15 Chicago
Manual of Style,
- (Merriam-Websters Biographical Dictionary)).
^ $
, , . , (, ), .
,
, .
.
.+?
, von de. , . ,
, , Mary Lou Norma Jeane, ,
. .
[^\s,]+ . , , ,
Latin.
, Jr. III, . , .
. +? ,
+ ?
(
) , ( ) .
, , , , .
, ,
.
. 4.2
.
4.2.
John F. Kennedy
Kennedy, John F.
Scarlett OHara
OHara, Scarlett
Pep Le Pew
Pew, Pep Le
J.R.R. Tolkien
Tolkien, J.R.R.
Catherine Zeta-Jones
Zeta-Jones, Catherine
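The name-swapping regex and the sample conversions listed above can be tried with a short Python sketch. It is an illustration only: the function name `format_name` is an assumption, and the replacement text uses Python's `\2, \1\3` syntax.

```python
import re

# The recipe's regex: first/middle names, last name, optional suffix.
NAME_RE = re.compile(r"^(.+?) ([^\s,]+)(,? (?:[JS]r\.?|III?|IV))?$", re.IGNORECASE)


def format_name(name: str) -> str:
    """Reformat 'First Middle Last Suffix' as 'Last, First Middle Suffix'."""
    # Names the regex cannot match (for example a single word) are returned unchanged.
    return NAME_RE.sub(r"\2, \1\3", name)


for sample in ["John F. Kennedy", "J.R.R. Tolkien", "Catherine Zeta-Jones"]:
    print(format_name(sample))
```

Because the first group is lazy, the last whitespace-delimited word before any suffix becomes the last name, which is why compound surnames such as "Baron Cohen" are split.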
,
, -
4.19.
345
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
$2,$1$3
: Python, Ruby
4.19.
,
, .
, , , ,
.
, . , -.
10 30 .
, ,
. . :
[ -]
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
3.14 , .
, ,
, . , , :
^(?:
(?<visa>4[0-9]{12}(?:[0-9]{3})?) |
(?<mastercard>5[1-5][0-9]{14}) |
(?<discover>6(?:011|5[0-9][0-9])[0-9]{12}) |
(?<amex>3[47][0-9]{13}) |
(?<diners>3(?:0[0-5]|[68][0-9])[0-9]{11}) |
(?<jcb>(?:2131|1800|35\d{3})\d{11})
)$
:
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
^(?:
(?P<visa>4[0-9]{12}(?:[0-9]{3})?) |
(?P<mastercard>5[1-5][0-9]{14}) |
(?P<discover>6(?:011|5[0-9][0-9])[0-9]{12}) |
(?P<amex>3[47][0-9]{13}) |
(?P<diners>3(?:0[0-5]|[68][0-9])[0-9]{11}) |
(?P<jcb>(?:2131|1800|35\d{3})\d{11})
)$
:
: PCRE, Python
Java, Perl 5.6, Perl 5.8 Ruby 1.8 . . 1 Visa,
2 MasterCard , 6 JCB:
^(?:
  (4[0-9]{12}(?:[0-9]{3})?) |          # Visa
  (5[1-5][0-9]{14}) |                  # MasterCard
  (6(?:011|5[0-9][0-9])[0-9]{12}) |    # Discover
  (3[47][0-9]{13}) |                   # AMEX
  (3(?:0[0-5]|[68][0-9])[0-9]{11}) |   # Diners Club
  ((?:2131|1800|35\d{3})\d{11})        # JCB
)$
Regex options: Free-spacing
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby
JavaScript . , :
^(?:(4[0-9]{12}(?:[0-9]{3})?)|(5[1-5][0-9]{14})|
(6(?:011|5[0-9][0-9])[0-9]{12})|(3[47][0-9]{13})|
(3(?:0[0-5]|[68][0-9])[0-9]{11})|((?:2131|1800|35\d{3})\d{11}))$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
:
^(?:
  4[0-9]{12}(?:[0-9]{3})? |          # Visa
  5[1-5][0-9]{14} |                  # MasterCard
  6(?:011|5[0-9][0-9])[0-9]{12} |    # Discover
  3[47][0-9]{13} |                   # AMEX
  3(?:0[0-5]|[68][0-9])[0-9]{11} |   # Diners Club
  (?:2131|1800|35\d{3})\d{11}        # JCB
)$
Regex options: Free-spacing
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby
JavaScript:
^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|6(?:011|5[0-9][0-9])[0-9]{12}|
3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|(?:2131|1800|35\d{3})\d{11})$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
3.6 , . , 3.9, , . , .
- JavaScript
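<html>
<head>
<title>Credit Card Test</title>
</head>
<body>
<h1>Credit Card Test</h1>
<form>
<p>Please enter your credit card number:</p>
<p><input type="text" size="20" name="cardnumber"
  onkeyup="validatecardnumber(this.value)"></p>
<p id="notice">(no card number entered)</p>
</form>
<script>
function validatecardnumber(cardnumber) {
  // Strip spaces and dashes
  cardnumber = cardnumber.replace(/[ -]/g, '');
  // See if the card is valid
  // The regex will capture the number in one of the capturing groups
  var match = /^(?:(4[0-9]{12}(?:[0-9]{3})?)|(5[1-5][0-9]{14})|(6(?:011|5[0-9][0-9])[0-9]{12})|(3[47][0-9]{13})|(3(?:0[0-5]|[68][0-9])[0-9]{11})|((?:2131|1800|35\d{3})\d{11}))$/.exec(cardnumber);
  if (match) {
    // List of card types, in the same order as the regex capturing groups
    var types = ['Visa', 'MasterCard', 'Discover', 'American Express',
                 'Diners Club', 'JCB'];
    // Find the capturing group that matched
    // Skip the zeroth element of the match array (the overall match)
    for (var i = 1; i < match.length; i++) {
      if (match[i]) {
        // Display the card type for that group
        document.getElementById('notice').innerHTML = types[i - 1];
        break;
      }
    }
  } else {
    document.getElementById('notice').innerHTML = '(invalid card number)';
  }
}
</script>
</body>
</html>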
, . .
,
, .
, , ,
, , ,
. , ,
, , .
[-] .
.
. [-] , , \D ,
, .
, ,
. , .
:
Visa
13 16 , 4.
MasterCard
16 , 51 55.
Discover
16 , 6011, 65.
American Express
15 , 34 37.
Diners Club
14 , 300 305, 36 38.
JCB
15 , 2131 1800, 16 , 35.
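The card formats listed above can be checked with the named-group variant of the regex. The Python sketch below is an illustration under stated assumptions: the function name `card_type` is introduced here, and the lowercase group names are reused as return values.

```python
import re

# Same alternatives as the named-capture solution, in Python's (?P<name>...) syntax.
CARD_RE = re.compile(r"""
    ^(?:
      (?P<visa>4[0-9]{12}(?:[0-9]{3})?) |
      (?P<mastercard>5[1-5][0-9]{14}) |
      (?P<discover>6(?:011|5[0-9][0-9])[0-9]{12}) |
      (?P<amex>3[47][0-9]{13}) |
      (?P<diners>3(?:0[0-5]|[68][0-9])[0-9]{11}) |
      (?P<jcb>(?:2131|1800|35[0-9]{3})[0-9]{11})
    )$
""", re.VERBOSE)


def card_type(number: str):
    """Return the name of the matching card brand, or None if no format matches."""
    stripped = re.sub(r"[ -]", "", number)   # strip spaces and hyphens first
    match = CARD_RE.match(stripped)
    # Exactly one named group takes part in a successful match.
    return match.lastgroup if match else None


print(card_type("4111 1111 1111 1111"))
```

A format match only says the number looks plausible; it does not verify the checksum or that the card exists.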
,
. JCB | .
|| |) ,
350
4.
.
,
Visa, MasterCard AMEX, :
^(?:
  4[0-9]{12}(?:[0-9]{3})? |  # Visa
  5[1-5][0-9]{14} |          # MasterCard
  3[47][0-9]{13}             # AMEX
)$
Regex options: Free-spacing
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby
Alternatively:
^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13})$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, \b .
-
- JavaScript
. 347 , . onkeyup,
validatecardnumber().
,
. .
, regexp.
exec() null (
). , regexp.exec() .
. 1 6 ,
.
, . ,
.
( undefined ,
). , . -
4.19.
351
, .
, . ,
. ,
.
, ,
luhn(cardnumber); else validatecardnumber().
, . .
.
JavaScript ,
:
function luhn(cardnumber) {
  // Build an array with the digits in the card number
  var getdigits = /\d/g;
  var digits = [];
  var match;
  while (match = getdigits.exec(cardnumber)) {
    digits.push(parseInt(match[0], 10));
  }
  // Run the Luhn algorithm on the array
  var sum = 0;
  var alt = false;
  for (var i = digits.length - 1; i >= 0; i--) {
    if (alt) {
      digits[i] *= 2;
      if (digits[i] > 9) {
        digits[i] -= 9;
      }
    }
    sum += digits[i];
    alt = !alt;
  }
  // Check the result
  if (sum % 10 == 0) {
    document.getElementById('notice').innerHTML += '; Luhn check passed';
  } else {
    document.getElementById('notice').innerHTML += '; Luhn check failed';
  }
}
. .
validatecardnumber()
,
.
\d , , .
/g. match[0] .
( ), parseInt(), , ,
sum , ,
.
. 10
, . .
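For comparison, the same Luhn check can be written in Python. This is a sketch that mirrors the JavaScript function above rather than code from the book; the name `luhn_valid` is an assumption, and it returns a boolean instead of updating the page.

```python
import re


def luhn_valid(cardnumber: str) -> bool:
    """Return True if the digits in cardnumber pass the Luhn checksum."""
    digits = [int(d) for d in re.findall(r"[0-9]", cardnumber)]
    total = 0
    # Walk from the rightmost digit to the left, doubling every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9   # same as adding the two digits of the product
        total += digit
    # The number is valid when the sum is a multiple of 10.
    return bool(digits) and total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))
```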
4.20.
- , .
:
, (
) , - ( ),
,
( ) . , . , . ,
4.20.
353
, , .
, .
,
, . , JavaScript,
, , .
,
.
, , . ,
, :
[-.]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
3.14 , . , -
, , .
.
27 , , :
^(
(AT)?U[0-9]{8} |                              # Austria
(BE)?0?[0-9]{9} |                             # Belgium
(BG)?[0-9]{9,10} |                            # Bulgaria
(CY)?[0-9]{8}L |                              # Cyprus
(CZ)?[0-9]{8,10} |                            # Czech Republic
(DE)?[0-9]{9} |                               # Germany
(DK)?[0-9]{8} |                               # Denmark
(EE)?[0-9]{9} |                               # Estonia
(EL|GR)?[0-9]{9} |                            # Greece
(ES)?[0-9A-Z][0-9]{7}[0-9A-Z] |               # Spain
(FI)?[0-9]{8} |                               # Finland
(FR)?[0-9A-Z]{2}[0-9]{9} |                    # France
(GB)?([0-9]{9}([0-9]{3})?|[A-Z]{2}[0-9]{3}) | # United Kingdom
(HU)?[0-9]{8} |                               # Hungary
(IE)?[0-9]S[0-9]{5}L |                        # Ireland
(IT)?[0-9]{11} |                              # Italy
(LT)?([0-9]{9}|[0-9]{12}) |                   # Lithuania
(LU)?[0-9]{8} |                               # Luxembourg
(LV)?[0-9]{11} |                              # Latvia
(MT)?[0-9]{8} |                               # Malta
(NL)?[0-9]{9}B[0-9]{2} |                      # Netherlands
(PL)?[0-9]{10} |                              # Poland
(PT)?[0-9]{9} |                               # Portugal
(RO)?[0-9]{2,10} |                            # Romania
(SE)?[0-9]{12} |                              # Sweden
(SI)?[0-9]{8} |                               # Slovenia
(SK)?[0-9]{10}                                # Slovakia
)$
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby
, . , -
, . , JavaScript . :
^((AT)?U[0-9]{8}|(BE)?0?[0-9]{9}|(BG)?[0-9]{9,10}|(CY)?[0-9]{8}L|
(CZ)?[0-9]{8,10}|(DE)?[0-9]{9}|(DK)?[0-9]{8}|(EE)?[0-9]{9}|
(EL|GR)?[0-9]{9}|(ES)?[0-9A-Z][0-9]{7}[0-9A-Z]|(FI)?[0-9]{8}|
(FR)?[0-9A-Z]{2}[0-9]{9}|(GB)?([0-9]{9}([0-9]{3})?|[A-Z]{2}[0-9]{3})|
(HU)?[0-9]{8}|(IE)?[0-9]S[0-9]{5}L|(IT)?[0-9]{11}|
(LT)?([0-9]{9}|[0-9]{12})|(LU)?[0-9]{8}|(LV)?[0-9]{11}|(MT)?[0-9]{8}|
(NL)?[0-9]{9}B[0-9]{2}|(PL)?[0-9]{10}|(PT)?[0-9]{9}|(RO)?[0-9]{2,10}|
(SE)?[0-9]{12}|(SI)?[0-9]{8}|(SK)?[0-9]{10})$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
3.6 .
4.20.
355
, , . ,
DE 123.456.789.
, 27 , . ,
.
[-.] , . , .
. , [-.]
[^A-Z0-9] .
, ,
. ,
,
, , . JavaScript , .
27 , .
:
U99999999
999999999 0999999999
999999999 9999999999
99999999L
99999999, 999999999 9999999999
999999999
99999999
999999999
999999999
X9999999X
99999999
XX999999999
99999999
9S99999L
99999999999
999999999 99999999999
99999999
99999999999
99999999
999999999B99
999999999
999999999
99999999999
99999999
999999999
, . , .
. , ,
. ,
.
, ,
. , | ,
. , , || . || , , ,
,
.
27 . ,
.
.
, \b .
,
27 , , .
, 27 .
, , ,
:
^(AT)?U[0-9]{8}$
^(BE)?0?[0-9]{9}$
^(BG)?[0-9]{9,10}$
^(CY)?[0-9]{8}L$
^(CZ)?[0-9]{8,10}$
^(DE)?[0-9]{9}$
^(DK)?[0-9]{8}$
^(EE)?[0-9]{9}$
^(EL|GR)?[0-9]{9}$
^(ES)?[0-9A-Z][0-9]{7}[0-9A-Z]$
^(FI)?[0-9]{8}$
^(FR)?[0-9A-Z]{2}[0-9]{9}$
^(GB)?([0-9]{9}([0-9]{3})?|[A-Z]{2}[0-9]{3})$
^(HU)?[0-9]{8}$
^(IE)?[0-9]S[0-9]{5}L$
^(IT)?[0-9]{11}$
^(LT)?([0-9]{9}|[0-9]{12})$
^(LU)?[0-9]{8}$
^(LV)?[0-9]{11}$
^(MT)?[0-9]{8}$
^(NL)?[0-9]{9}B[0-9]{2}$
^(PL)?[0-9]{10}$
^(PT)?[0-9]{9}$
^(RO)?[0-9]{2,10}$
^(SE)?[0-9]{12}$
^(SI)?[0-9]{8}$
^(SK)?[0-9]{10}$
3.6.
, .
, , . , . ,
3.9. ,
, . -
360
4.
,
.
. EL , GR
.
.
, . , . , , .
,
- , http://ec.europa.eu/taxation_
customs/vies/vieshome.do.
, ,
2.3, 2.5 2.8.
,
, . , ,
, . ,
,
.
,
.
, .
,
.
5.1.
cat,
.
, .
, ,
hellcat, application Catwoman.
:
\bcat\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 3.7. , 3.14.
, , , cat , . , , cat
, , ,
.
,
, .
2.6.
JavaScript, PCRE Ruby ,
ASCII.
,
^|[^A-Za-z0-9_] [A-Za-z0-9_]
[A-Za-z0-9_] [^A-Za-z0-9_]|$ . Python, UNICODE U. \b , , Latin. , JavaScript, PCRE Ruby
\bber\b darber, dar ber.
.
, , r. ,
, , .
363
5.1.
. 5.1.
364
5. ,
, cat
dog JavaScript. , cat . \b \w :
// 8-bit-wide letter characters
var L = 'A-Za-z\xAA\xB5\xBA\xC0-\xD6\xD8-\xF6\xF8-\xFF';
var pattern = '([^{L}]|^)cat([^{L}]|$)'.replace(/{L}/g, L);
var regex = new RegExp(pattern, 'gi');
// Replace cat with dog, and put back any
// additional matched characters
subject = subject.replace(regex, '$1dog$2');
,
JavaScript \xHH ( HH ). , L, ,
. , \xHH, (, \\xHH) . .
.
5.2, 5.3 5.4.
5.2.
,
.
:
\b(?:one|two|three)\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
5.3.
JavaScript
var subject = 'One times two plus one equals three.';
var regex = /\b(?:one|two|three)\b/gi;
subject.match(regex);
// returns an array with four matches: ['One','two','one','three']
//
//
//
//
//
,
.
,
,
.
: , (
| ). , .
\bone\b|\btwo\b|\bthree\b . .
, , -
366
5. ,
,
, . , . ,
,
, awe|awesome awesome, awe
.
,
, .
two three
, ,
\b(?:one|t(?:wo|hree))\b . 5.3
, , .
JavaScript
JavaScript .
match(),
JavaScript, . match() , /g
(global ). , , null, .
(match_words()), , ,
. , , , .
, . (/i)
.
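The approach discussed above (building the alternation from a word list and escaping regex metacharacters in each word) can be sketched in Python. The function name `match_words` echoes the one mentioned in the discussion, but this body is an assumption written for illustration.

```python
import re


def match_words(subject, words):
    """Return every whole-word, case-insensitive occurrence of any word in `words`."""
    # Escape each word so characters such as "." or "+" are matched literally.
    alternation = "|".join(re.escape(word) for word in words)
    regex = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return regex.findall(subject)


print(match_words("One times two plus one equals three.", ["one", "two", "three"]))
```

Since `findall` returns a list in all cases, the empty-result case needs no special handling, unlike JavaScript's `match()`, which returns null.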
.
5.1, 5.3 5.4.
5.3.
367
5.3.
color colour .
, phobia.
,
. .
Color colour
\bcolou?r\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, phobia
\b\w*phobia\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
regular expression
\breg(?:ular expressions?|ex(?:ps?|e[sn])?)\b
368
5. ,
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
( \b ) .
, .
Color Colour
color colour,
colorblind. ? , u. , ? ,
, .
, ( u ), , .
?
. , , , ,
. , , , ,
.
369
5.3.
, ,
, .
, . ,
. ,
, , . , ,
. , , .
:
, <[cat]{3}> cat, act,
ttt .
, <[^cat]>,
, c, a t.
, . <[a|b|c]> abc|. , ,
, . , .
, , 2.3.
, phobia
, ,
. , , arachnophobia hexakosioihexekontahexaphobia, * ,
phobia. ,
phobia ,
* + .
370
5. ,
,
n . ( )
\bSte(?:ve|ven|phen)\b .
Ste , , \b(?:Steve|Steven|Stephen)\b \bSteve\b|\
bSteven\b|\bStephen\b .
, , ,
, Ste. , S.
,
, ,
. ,
( Ste), . \bSte(?:ven?|phen)\
b , ,
, .
2.13.
regular expression
, ,
regular expression.
,
.
, ,
,
JavaScript.
, :
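\b               # Assert position at a word boundary.
reg              # Match "reg".
(?:              # Group but don't capture...
  ular\          #   Match "ular ".
  expressions?   #   Match "expression" or "expressions".
 |               #  or...
  ex             #   Match "ex".
  (?:            #   Group but don't capture...
    ps?          #     Match "p" or "ps".
   |             #    or...
    e            #     Match "e".
    [sn]         #     Match one character from the set "sn".
  )              #   End the noncapturing group.
  ?              #     Make the group optional.
)                # End the noncapturing group.
\b               # Assert position at a word boundary.
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby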
:
regular expressions
regular expression
regexps
regexp
regexes
regexen
regex
.
5.1, 5.2 5.4.
5.4. ,
, cat. Catwoman ,
cat, , cat .
, :
\b(?!cat\b)\w+
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
372
5. ,
,
( [^...] ) , , [^cat] , ,
cat. [^cat] , , c, a t. , \b[^cat]+\b cat, cup, c.
\b[^c][^a][^t]\w* , , c, a
t. ,
, , , ,
.
, , , , :
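\b       # Assert position at a word boundary.
(?!      # Assert that the regex below cannot be matched starting here...
  cat    #   Match "cat".
  \b     #   Assert position at a word boundary.
)        # End the negative lookahead.
\w+      # Match one or more word characters.
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby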
, (?!...) . cat, ,
,
,
.
, . + \w+ ,
.
categorically match any word except cat, :
categorically, match, any, word except.
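The example above is easy to reproduce. The Python sketch below applies the negative-lookahead regex to the same sentence and, for contrast, the variation that rejects any word containing "cat"; the variable names are assumptions.

```python
import re

subject = "categorically match any word except cat"

# Any word except exactly "cat": the lookahead rejects only "cat" followed by a boundary.
not_cat = re.findall(r"\b(?!cat\b)\w+", subject, re.IGNORECASE)

# Any word that does not contain "cat" anywhere: the lookahead is tested before each character.
no_cat_inside = re.findall(r"\b(?:(?!cat)\w)+\b", subject, re.IGNORECASE)

print(not_cat)
print(no_cat_inside)
```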
5.5. ,
373
,
, , cat, ,
cat, :
\b(?:(?!cat)\w)+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
.
, , ; ,
, ,
. , , .
cat, .
, .
cat.
.
2.16, ( , ).
5.1, 5.5, 5.6 5.11.
5.5. ,
,
cat,
, ,
.
374
5. ,
:
\b\w+\b(?!\W+cat\b)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
3.7 3.14 ,
.
,
( \b ) ( \w ). 2.6.
(?!...) , . , , , , .
, ,
, .
, .
,
, , ,
. ( ) 2.16.
: \
W+ , , cat , ,
, cat, , ,
cat.
,
cat, , cat.
cat,
375
5.6. ,
5.4,
\b(?!cat\b)\w+\b(?!\W+cat\b) .
, cat ( cat , ), :
\b\w+\b(?=\W+cat\b)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
.
2.16, ( , ).
5.4 5.6.
5.6. ,
,
cat, , .
.
, , , . 2.16.
, (? !...) . , ,
, , .
376
5. ,
-.
. 11, .
, cat
(?<!\bcat\W+)\b\w+
:
: .NET
(?<!\bcat\W{1,9})\b\w+
:
: .NET, Java, PCRE
(?<!\bcat)(?:\W+|^)(\w+)
:
: .NET, Java, PCRE, Perl, Python, Ruby 1.9
JavaScript Ruby 1.8 ,
. , , , ,
,
JavaScript:
var subject = 'My cat is furry.',
    main_regex = /\b\w+/g,
    lookbehind = /\bcat\W+$/i,
    lookbehind_type = false, // negative lookbehind
    matches = [],
    match,
    left_context;
while (match = main_regex.exec(subject)) {
  left_context = subject.substring(0, match.index);
  if (lookbehind_type == lookbehind.test(left_context)) {
    matches.push(match[0]);
  } else {
    main_regex.lastIndex = match.index + 1;
  }
}
// matches: ['My','cat','furry']
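Python's `re` module supports only fixed-width lookbehind, so of the three regexes above only the last one applies there. The sketch below runs it on the same subject string; the function name `words_not_after_cat` is an assumption.

```python
import re

# Fixed-width lookbehind version: the word itself is returned in capturing group 1.
NOT_AFTER_CAT = re.compile(r"(?<!\bcat)(?:\W+|^)(\w+)", re.IGNORECASE)


def words_not_after_cat(subject: str):
    """Return every word that is not immediately preceded by the word 'cat'."""
    return [match.group(1) for match in NOT_AFTER_CAT.finditer(subject)]


print(words_not_after_cat("My cat is furry."))
```

Because the separator is consumed by `(?:\W+|^)`, the overall match includes leading non-word characters, which is why the word is read from group 1.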
5.6. ,
377
,
(? !\bcat\W+) . + , , , .NET. ,
, ,
.
378
5. ,
JavaScript ,
JavaScript , , , .
()
.
(? !\
bcat\W+)\b\w+ : \bcat\W+ , ( \b\w+ ). $ ,
. lookbehind ( /m), $
$(?!\s) , . lookbehind_type ,
: true , false
.
main_
regex() exec() (
3.11). , , , (left_context) lookbehind
. lookbehind , , . lookbehind_type, ,
.
. , matches. , , ( main_regex.lastIndex), main_regex, exec .
! !
lastIndex, , /g (global ). lastIndex . , , -
5.7.
379
, .
, .
, . . -
, .
,
cat ( cat , ), :
(?<=\bcat\W+)\b\w+
:
: .NET
(?<=\bcat\W{1,9})\b\w+
:
: .NET, Java, PCRE
(?<=\bcat)(?:\W+|^)(\w+)
:
: .NET, Java, PCRE, Perl, Python, Ruby 1.9
.
2.16, ( , ).
5.4 5.5.
5.7.
NEAR , . , , : ,
, NOT OR, -
380
5. ,
,
, word1,
word2, , ,
.
, :
\b(?:word1\W+(?:\w+\W+){0,5}?word2|word2\W+(?:\w+\W+){0,5}?word1)\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
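\b(?:
  word1                 # first term
  \W+ (?:\w+\W+){0,5}?  # up to five words
  word2                 # second term
 |                      #   or, the same pattern in reverse:
  word2                 # second term
  \W+ (?:\w+\W+){0,5}?  # up to five words
  word1                 # first term
)\b
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby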
. . JavaScript , . 3.5 3.7 , .
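A small Python helper can build the proximity regex above for any two words and any maximum distance. This is a sketch under stated assumptions: the function name `near_regex` and the parameter `max_between` are not from the book.

```python
import re


def near_regex(word1: str, word2: str, max_between: int = 5):
    """Compile a regex that finds word1 and word2 with at most max_between words between them."""
    w1, w2 = re.escape(word1), re.escape(word2)
    # At least one separator, then up to max_between intervening words (lazy).
    gap = r"\W+(?:\w+\W+){0,%d}?" % max_between
    pattern = r"\b(?:" + w1 + gap + w2 + "|" + w2 + gap + w1 + r")\b"
    return re.compile(pattern, re.IGNORECASE)


regex = near_regex("word1", "word2")
print(bool(regex.search("word1 a b c word2")))
```

The lazy quantifier keeps each match as short as possible, so the nearest occurrence of the second word is the one that ends the match.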
, , . word1 word2,
381
5.7.
.
word1 word2, .
{0,5}?.
, . word1 word2 word2, word1 word2,
() .
, 0 5
. , {1,15}? , , 15 .
, ,
( \w \W , ),
, (, ).
.
, ,
.
. , .
, ,
, . , .
, word1 word2
. ,
; , word2:
\b(?:word1|(word2))\W+(?:\w+\W+){0,5}?(?(1)word1|word2)\b
:
: .NET, PCRE, Perl, Python
382
5. ,
,
, , :
\b(?:(?<w1>word1)|(?<w2>word2))\W+(?:\w+\W+){0,5}?(?(w2)(?&w1)|(?&w2))\b
:
: PCRE 7, Perl 5.10
word1 word2
, (?<name>...) . (?&name) ,
,
. , . , \k<name>
(.NET, PCRE 7, Perl 5.10) (?P=name) (PCRE 4 , Perl 5.10,
Python), , .
, (?&name) , , .
, , . , , , , . , , , .
,
.
, ,
. ,
. , ,
?
(. 5.2). n!, 1 n (n ). 24 . 10 , . , ,
, .
. , , ( ,
383
5.7.
), ,
. , ,
:
\b(?:(?>(word1)|(word2)|(word3)|(?(1)|(?(2)|(?(3)|(?!))))\w+)\b\W*?){3,8}
(?(1)(?(2)(?(3)|(?!))|(?!))|(?!))
:
: .NET, PCRE, Perl
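Two values:
[ 12, 21 ]
= 2 possible arrangements
Three values:
[ 123, 132, 213, 231, 312, 321 ]
= 6 possible arrangements
Four values:
[ 1234, 1243, 1324, 1342, 1423, 1432,
  2134, 2143, 2314, 2341, 2413, 2431,
  3124, 3142, 3214, 3241, 3412, 3421,
  4123, 4132, 4213, 4231, 4312, 4321 ]
= 24 possible arrangements
Factorials:
2! = 2 × 1 = 2
3! = 3 × 2 × 1 = 6
4! = 4 × 3 × 2 × 1 = 24
5! = 5 × 4 × 3 × 2 × 1 = 120
...
10! = 10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1 = 3628800
Figure 5.2. Possible arrangements of a set of values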
, ( 2.14) Python:
\b(?:(?:(word1)|(word2)|(word3)|(?(1)|(?(2)|(?(3)|(?!))))\w+)\b\W*?){3,8}
(?(1)(?(2)(?(3)|(?!))|(?!))|(?!))
:
: .NET, PCRE, Perl, Python
{3,8} , , -
384
5. ,
. , (?!) , , ,
. .
, \w+ , , . , , ,
.
, , ,
,
, , .
. ,
. , .
, , , ,
Java Ruby (
).
, , . ,
.
\b(?:(?>word1()|word2()|word3()|(?>\1|\2|\3)\w+)\b\W*?){3,8}\1\2\3
:
: .NET, Java, PCRE, Perl, Ruby
\b(?:(?:word1()|word2()|word3()|(?:\1|\2|\3)\w+)\b\W*?){3,8}\1\2\3
:
: .NET, Java, PCRE, Perl, Python, Ruby
. , , ,
:
385
5.7.
\b(?:(?>word1()|word2()|word3()|word4()|
(?>\1|\2|\3|\4)\w+)\b\W*?){4,9}\1\2\3\4
:
: .NET, Java, PCRE, Perl, Ruby
\b(?:(?:word1()|word2()|word3()|word4()|
(?:\1|\2|\3|\4)\w+)\b\W*?){4,9}\1\2\3\4
:
: .NET, Java, PCRE, Perl, Python, Ruby
. , \1 , , , , ,
. , ,
, .
(? \1|\2|\3)
\w+ , . , , .
Python ,
, Python ,
. , . ,
, .
JavaScript
. JavaScript
, Python,
,
, . ,
, . JavaScript , -
386
5. ,
, , . : , ,
, ,
, .
JavaScript , , , ((a)|(b))+ . , , , , . , (?:(a)|(b))+ ab,
a. JavaScript,
, . (?:(a)|(b))+ ab,
1 , , JavaScript
undefined, , RegExp.
prototype.exec().
JavaScript ,
, .
, , , .
,
, , .
\A(?=.*?\bword1\b)(?=.*?\bword2\b).*\Z
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
387
5.8.
^(?=[\s\S]*?\bword1\b)(?=[\s\S]*?\bword2\b)[\s\S]*$
: ( ^ $ )
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
.
. JavaScript
, JavaScript \A \Z .
, 3.6. word1
word2 , . , , . , \A(?=.*?\bword1\b)(?=.*?\bword2\b)(?=.*?\
bword3\b).*\Z .
.
5.5 5.6.
5.8.
,
. , The the.
, ,
.
, , :
\b([A-Z]+)\s+\1\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
, 1. -
388
5. ,
, ,
(, HTML),
. 3.15 , ,
.
,
, , 3.7, , .
, grep, ,
1, , , , .
, ,
.
, , . , . (\w)\1 \w{2} . , .
2.10.
. , ,
, A
Z a z ( ). , , Letter ( \p{L} ), ( . 72).
\s+ , , ,
, .
, , ( ), \s+ [^\S\r\n] .
,
. PCRE 7 Perl 5.10 \h , -
5.9.
389
, ,
, .
, ,
, this thistle.
, ,
. ,
, that that
had had. , , (
oink oink ha ha) . .
.
2.10, .
5.9, , .
5.9.
,
- . , .
( uniq
UNIX Get-Unique PowerShell Windows), , . , , , .
, , 1 . ( , 1
. . .
390
5. ,
,
) , .
1:
, ,
, ,
.
.
, :
^(.*)(?:(?:\r?\n|\r)\1)+$
: ^ $ ( )
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
$1
: Python, Ruby
,
, ( ). ,
. 3.15 ,
.
2:
, , , , :
^([^\r\n]*)(?:\r?\n|\r)(?=.*^\1$)
5.9.
391
: , ^ $
: .NET, Java, PCRE, Perl, Python, Ruby
JavaScript,
:
^(.*)(?:\r?\n|\r)(?=[\s\S]*^\1$)
: ^ $ ( )
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
( , .)
:
3:
, .
, :
^([^\r\n]*)$(.*?)(?:(?:\r?\n|\r)\1$)+
: ^ $ ( )
: .NET, Java, PCRE, Perl, Python, Ruby
JavaScript, -
, .
^(.*)$([\s\S]*?)(?:(?:\r?\n|\r)\1$)+
: ^ $ ( )
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
$1$2
: Python, Ruby
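Option 1 (sort the lines, then collapse runs of identical adjacent lines) can be tried in Python as follows. This is a sketch rather than the book's code; the function name `remove_adjacent_duplicates` is an assumption.

```python
import re

# Option 1: one line followed by one or more exact copies of itself.
DUPLICATE_RUN = re.compile(r"^(.*)(?:(?:\r?\n|\r)\1)+$", re.MULTILINE)


def remove_adjacent_duplicates(text: str) -> str:
    """Collapse each run of identical adjacent lines into a single line."""
    return DUPLICATE_RUN.sub(r"\1", text)


lines = ["value1", "value2", "value2", "value3", "value3", "value1", "value2"]
sorted_text = "\n".join(sorted(lines))          # sorting makes all duplicates adjacent
print(remove_adjacent_duplicates(sorted_text))
```

Without the sorting step only adjacent repeats are removed, so the later value1 and value2 lines in the unsorted input would survive.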
392
5. ,
1 2,
.
, , .
.
1:
, , . , . .
-, ( ^ ) .
,
^ $ ( 3.4 , ). .*
( ), 1. , , , .
, , . , ,
,
.
, ( -
393
5.9.
). 1, .
2:
1,
,
. -,
, JavaScript, [^\r\n] (
, ) . , ,
, . -, , , .
,
(
), . ,
.
3:
,
( , ,
, ),
3 2. ,
(
2), , , ,
.
1, ( )
2. 1 2 ,
, - .
. -,
,
, , , , . -,
, ,
394
5. ,
,
.
.
,
,
. , , :
value1
value2
value2
value3
value3
value1
value2
.
. 5.1.
5.1.
value1
value1
value1
value1
value2
value2
value2
value2
value2
value2
value3
value3
value3
value3
value2
value3
value3
value1
value2
value2
/
.
2.10, .
5.8, , .
395
5.10. ,
5.10. ,
,
ninja .
^.*\bninja\b.*$
: , ^ $
( )
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
.
, ninja, \bninja\b .
2.6, ,
, ,
ninja,
.
, .* .
. ,
. ninja, , , .
,
, ,
. , $ , . .
, , ,
. ,
, , , -
396
5. ,
,
- .
, ,
(
^ $ , ), . , , ^ $ , , , . 3.4
, . JavaScript Ruby ,
, , JavaScript , , Ruby
.
, ,
:
^.*\b(one|two|three)\b.*$
: , ^ $
( )
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
: one, two three. ,
, . -, , -,
1 , . ,
, .
, , , ,
, . , ^.*?\b(one|two|three)\b.*$ , 1 , .
,
, :
^(?=.*?\bone\b)(?=.*?\btwo\b)(?=.*?\bthree\b).+$
: , ^ $
( )
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
5.11. ,
397
,
, . .+ ,
, .
.
5.11, , , .
5.11. ,
, ninja.
^(?:(?!\bninja\b).)*$
: , ^ $
( )
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
, ( 2.16). , .
\
bninja\b . ^ $ , , .
, , ,
. , ^ $
, , , .
398
5. ,
,
, ninja.
. , , , . 3.21, .
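The two line-oriented regexes (lines that contain a word, and lines that do not) can be compared side by side in Python. The sketch below is an illustration; the function names and the sample text are assumptions.

```python
import re

FLAGS = re.MULTILINE | re.IGNORECASE


def lines_with_word(text: str):
    """Return the complete lines that contain the whole word 'ninja'."""
    return re.findall(r"^.*\bninja\b.*$", text, FLAGS)


def lines_without_word(text: str):
    """Return the complete lines that do not contain the whole word 'ninja'."""
    # The negative lookahead is tested before every character of the line.
    return re.findall(r"^(?:(?!\bninja\b).)*$", text, FLAGS)


sample = "A ninja appears\nNothing here\nninjas do not count"
print(lines_with_word(sample))
print(lines_without_word(sample))
```

The third line counts as "without" because the word boundaries require ninja to stand alone as a word.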
.
5.10, , , .
5.12.
,
, , :
^\s+
: ( ^ $
)
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
\s+$
: ( ^ $
)
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, . , 3.14.
,
.
399
5.12.
, . : , ( ^
$ , ), ,
( \s ), ,
( + ).
C#, VB.NET
String.Trim([chars])
Java
string.trim()
PHP
trim($string)
Python, Ruby
string.strip()
JavaScript Perl , :
Perl:
sub trim {
my $string = shift;
$string =~ s/^\s+//;
$string =~ s/\s+$//;
return $string;
}
JavaScript:
function trim (string) {
  return string.replace(/^\s+/, '').replace(/\s+$/, '');
}
// Alternatively, use this to make trim
// a method of all strings:
String.prototype.trim = function () {
  return this.replace(/^\s+/, '').replace(/\s+$/, '');
};
, Perl JavaScript, \s ,
, , , ,
.
, . ( ) , , .
, .
JavaScript, JavaScript , , , [\s\S] .
.
string.replace(/^\s+|\s+$/g, '');
, , . ( 2.8) /g (global ),
,
(
). , , .
string.replace(/^\s*([\s\S]*?)\s*$/, '$1');
( ) 1. 1 .
,
, .
,
401
5.12.
, [\s\S] . , ( \s*$ ).
, - ,
.
string.replace(/^\s*([\s\S]*\S)?\s*$/, '$1');
, ,
, .
, \S . , -
,
, ? .
, .
[\s\S]* ,
.
, ,
\S
,
.
, , , . , .
string.replace(/^\s*(\S*(?:\s+\S+)*)\s*$/, '$1');
,
. , , . , ,
, , ,
, .
, , ,
.
402
5. ,
, , , , . ,
,
,
.
.
5.13.
5.13.
,
. , .
,
.
3.14 ,
.
\s+
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
[ \t]+
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
. -
403
5.14.
, HTML
( ),
- .
( , , ) . + ,
( \s ), , , , . +
{2,} ,
.
, , ,
, . , .
, , , .
.
.
5.12.
5.14.
,
. , , , .
,
-
404
5. ,
, . , ,
JavaScript, ,
( . 5.3). ,
,
, .
. 5.3 ,
.
5.3. ,
C#, VB.NET
Regex.Escape(str)
Java
Pattern.quote(str)
Perl
quotemeta(str)
PHP
preg_quote(str, [delimiter])
Python
re.escape(str)
Ruby
Regexp.escape(str)
JavaScript
, .
, ,
, ( ).
,
.
3.15 , , . , :
[[\]{}()*+?.\\|^$\-,&#\s]
5.14.
405
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
. , , . 2.19.
\$&
: .NET, JavaScript
\\$&
: Perl
\\$0
: Java, PHP
\\\0
: PHP, Ruby
\\\&
: Ruby
\\\g<0>
: Python
JavaScript
, , RegExp.escape() JavaScript:
RegExp.escape = function (str) {
  return str.replace(/[[\]{}()*+?.\\|^$\-,&#\s]/g, "\\$&");
};
// Test it...
var str = "Hello.World?";
var escaped_str = RegExp.escape(str);
alert(escaped_str == "Hello\\.World\\?"); // -> true
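Python already provides `re.escape()`, as the table above notes, but the same search-and-replace technique can be written by hand to see how it works. In this sketch the function name `escape_regex` is an assumption, and the opening bracket inside the character class is escaped to avoid Python's nested-set warning.

```python
import re

# The recipe's character class of metacharacters, with "[" escaped for Python.
METACHARS = re.compile(r"[\[\]{}()*+?.\\|^$\-,&#\s]")


def escape_regex(text: str) -> str:
    """Put a backslash in front of every regex metacharacter in text."""
    # In the replacement, "\\" is a literal backslash and "\g<0>" is the whole match.
    return METACHARS.sub(r"\\\g<0>", text)


escaped = escape_regex("Hello.World?")
print(escaped)
```

The escaped string can then be embedded in a larger pattern and will match the original text literally.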
406
5. ,
. , . , :
[] {} ()
, <[> <]>, . , <{> <}>, ,
, . , <(> <)>, , .
* + ?
,
,
, .
(
Perl 5.10 PCRE 7).
. \ |
,
, .
^ $
, . ,
.
, , .
.
. , , . ,
,
, .
5.14.
407
,
, <{1,5}>. ,
, ( ) ,
, .
&
,
Java , , . , ,
.
#
(
<\s>) , . .
, ( $& ,
\& , $0 , \0 \g 0 ) . Perl
$& ,
.
$& -
Perl, ,
. $& $1.
2.1
. 50, , \Q...\E . Java, PCRE Perl, .
, \E , . , ,
.
408
5. ,
.
2.1, ,
. , , , , ,
,
, .
. 56 , , , 5 6. , ,
\d ( 2.3). , .
, 56 , ,
, :-) , \p{P}{3} .
, ,
,
, , :
1 100?. . , ,
. ,
, ,
.
6.1.
,
.
:
\b[0-9]+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, :
\A[0-9]+\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
^[0-9]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
:
(?<=^|\s)[0-9]+(?=$|\s)
:
: .NET, Java, PCRE, Perl, Python, Ruby 1.9
. :
(^|\s)([0-9]+)(?=$|\s)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, :
[+-]?\b[0-9]+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, ,
:
\A[+-]?[0-9]+\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
^[+-]?[0-9]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
.
, :
([+-] *)?\b[0-9]+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
. ( 2.3) ( 2.12): [0-9]+ .
[0-9] \d.
.NET Perl \d
, [0-9] 10 ASCII. -
, , ASCII, \d [0-9] .
, , ASCII, ,
, , \d [0-9] . , , ,
, ,
ASCII. ,
, ,
.
,
. , A4 , .
, , .
, , , . -
412
6.
\A \Z , . , JavaScript . JavaScript
^ $ ,
/m,
. Ruby , .
, ( 2.6). , , . ,
4 4 A4. 4\b
, 4 - . \b4 \b4\b A4, \b A 4.
, .
,
, ,
,
. +4
+4B, \b\+4\b \+4\b . +4, ,
. \b\+4\b +4
3+4, 3 , + .
\+4\b . \b \+\b4\b .
\b + 4 . \b , . \+?\b4\b 4 A4, \+?4\b
.
.
$123,456.78.
\b[0-9]+\b , 123, 456 78. ,
, .
, , .
413
6.2.
, ,
. (?=$|\s)
( ). (? =^|\s) . \s
, ,
. 2.16.
.
2.3 2.12.
6.2.
,
.
x :
\b[0-9A-F]+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
\b[0-9A-Fa-f]+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, :
\A[0-9A-F]+\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
^[0-9A-F]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
x 0x:
\b0x[0-9A-F]+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
x &H:
&H[0-9A-F]+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
x , H:
\b[0-9A-F]+H\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 8- :
\b[0-9A-F]{2}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 16- :
\b[0-9A-F]{4}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 32-
:
\b[0-9A-F]{8}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 64- :
\b[0-9A-F]{16}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
(
):
\b(?:[0-9A-F]{2})+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
. ,
, , A F. ,
. , ,
.
.
[0-9a-f]
, [0-9A-F] .
, [0-9a-fA-F] . , , 3.4.
. .
, , .
, A-F a-f .
(?:[0-9A-F]{2})+ . [0-9A-F]{2}
. (?:[0-9A-F]{2})+ .
( 2.9) ,
+ -
416
6.
, , ,
. , ,
. , 10 9 11 F 11.
( 2.6).
. , , &H, . , .
, , .
, , , . \A \Z ,
. , JavaScript .
JavaScript ^ $ ,
/m,
. Ruby ,
.
.
2.3 2.12.
6.3.
,
.
6.3.
417
:
\b[01]+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, :
\A[01]+\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
^[01]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, B:
\b[01]+B\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 8- :
\b[01]{8}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 16- :
\b[01]{16}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
( , 8):
\b(?:[01]{8})+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
. ,
0 1. -
418
6.
, :
[01] .
.
2.3 2.12.
6.4.
, , ,
.
\b0*([1-9][0-9]*|0)\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
$1
Perl
while ($subject =~ m/\b0*([1-9][0-9]*|0)\b/g) {
push(@list, $1);
}
PHP
$result = preg_replace('/\b0*([1-9][0-9]*|0)\b/', '$1', $subject);
. 0* , . [1-9][0-9]*
, , -
6.5.
419
.
, . , , ,
6.1.
,
,
, 3.11.
( ) , 3.9. ,
Perl.
. .
( ) , . , PHP.
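Both uses of the regex described above (collecting the numbers without leading zeros, and replacing each number with its stripped form) look like this in Python. The sketch is illustrative; the sample string and variable names are assumptions.

```python
import re

LEADING_ZEROS = re.compile(r"\b0*([1-9][0-9]*|0)\b")

subject = "007 0 0100 12 000"

# Collect the numbers: findall returns the text captured by group 1.
numbers = LEADING_ZEROS.findall(subject)

# Search and replace: keep only group 1, dropping the zeros matched by 0*.
stripped = LEADING_ZEROS.sub(r"\1", subject)

print(numbers)
print(stripped)
```

The second alternative, a lone 0, keeps a number made only of zeros from being deleted entirely.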
.
3.15 6.1.
6.5.
, . , , .
1 12 ( )
^(1[0-2]|[1-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
1 24 ():
^(2[0-4]|1[0-9]|[1-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
1 31 ( ):
^(3[01]|[12][0-9]|[1-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
1 53 ( ):
^(5[0-3]|[1-4][0-9]|[1-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
0 59 ( ):
^[1-5]?[0-9]$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
0 100 ():
^(100|[1-9]?[0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
1 100:
^(100|[1-9][0-9]?)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
32 126 ( ASCII):
^(12[0-6]|1[01][0-9]|[4-9][0-9]|3[2-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
0 127 ( ):
^(12[0-7]|1[01][0-9]|[1-9]?[0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
128 127 ( ):
^(12[0-7]|1[01][0-9]|[1-9]?[0-9]|-(12[0-8]|1[01][0-9]|[1-9]?[0-9]))$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
0 255 ( ):
^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
1 366 ( ):
^(36[0-6]|3[0-5][0-9]|[12][0-9]{2}|[1-9][0-9]?)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
1900 2099 ():
^(19|20)[0-9]{2}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
0 32767 ( ):
^(3276[0-7]|327[0-5][0-9]|32[0-6][0-9]{2}|3[01][0-9]{3}|[12][0-9]{4}|
[1-9][0-9]{1,3}|[0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
32768 32767 ( ):
^(3276[0-7]|327[0-5][0-9]|32[0-6][0-9]{2}|3[01][0-9]{3}|[12][0-9]{4}|
[1-9][0-9]{1,3}|[0-9]|-(3276[0-8]|327[0-5][0-9]|32[0-6][0-9]{2}|
3[01][0-9]{3}|[12][0-9]{4}|[1-9][0-9]{1,3}|[0-9]))$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
0 65535 ( ):
^(6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|6[0-4][0-9]{3}|[1-5][0-9]{4}|
[1-9][0-9]{1,3}|[0-9])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
. .
.
422
6.
, 0 255, . [0-255] . , , ,
, 0 255. , [0125] , , 0, 1, 2 5.
, , , ,
, . 6.1 ,
. . , , , .
.
,
,
. ( 2.3) ( 2.8).
, [0-5] . , 0 9
ASCII .
[0-5] , ,
[j-o] [\x09-\x0E]
.
,
. .
, 12 24, . ,
1 12, .
, ,
, . 40 59 . 44 55
.
423
6.5.
,
, 40 59.
, .
,
.
[45][0-9]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
40 59 . . 4 5. [45]
. 10 .
[0-9] .
44 55 , . 4 5.
4, 4 9,
44 49. 5, 0 5, 50 55.
:
4[4-9]|5[0-5]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
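Alternation has the lowest precedence of all regex operators, so the alternatives 4[4-9] and 5[0-5] need no grouping of their own. The regex behaves as if it were written (4[4-9])|(5[0-5]).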
, . 34 65
.
3 6. 3, -
424
6.
4 9. 4 5,
. 6, 0 5:
3[4-9]|[45][0-9]|6[0-5]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, , , ,
.
1 12 . 1 9, , 10 12, .
, :
1[0-2]|[1-9]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
. , . ,
. ,
12, 1[0-2]|[1-9] 12, [1-9]|1[0-2] 1 . [1-9] .
1 ,
1[0-2] , .
85 117 ,
. 85 99 ,
100 117 . , . , : 8, 5 9.
9, . , : 1. 0, . 1, 0 7. : 85 89, 90 99, 100 109 110 117. ,
425
6.5.
POSIX- . . , ,
, ,
POSIX. , [1-9]|1[0-2]
1 12.
. . ,
^([1-9]|1[0-2])$ ^(1[0-2]|[1-9])$ , , , 12 12,
POSIX- . ,
, .
,
. 20 1.
, , :
8[5-9]|9[0-9]|10[0-9]|11[0-7]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, ,
. ,
, , .
,
. , , 0 65535,
:
6553[0-5]|655[0-2][0-9]|65[0-4][0-9][0-9]|6[0-4][0-9][0-9][0-9]|
[1-5][0-9][0-9][0-9][0-9]|[1-9][0-9][0-9][0-9]|[1-9][0-9][0-9]|
[1-9][0-9]|[0-9]
426
6.
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
,
. ,
(, , 6),
. , . ,
.
, .
, . 2.12.
6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|6[0-4][0-9]{3}|[1-5][0-9]{4}|
[1-9][0-9]{3}|[1-9][0-9]{2}|[1-9][0-9]|[0-9]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
[1-9][0-9]{3}|[1-9][0-9]{2}|[1-9][0-9]
. . [1-9][0-9]{1,3} .
6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|6[0-4][0-9]{3}|[1-5][0-9]{4}|
[1-9][0-9]{1,3}|[0-9]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
.
, 6 :
6(?:553[0-5]|55[0-2][0-9]|5[0-4][0-9]{2}|[0-4][0-9]{3})|[1-5][0-9]{4}|
[1-9][0-9]{1,3}|[0-9]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
, 6 . -
6.6.
427
, .
.
.
2.8, 4.12 6.1.
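Range regexes like these are easy to get subtly wrong, so it can help to test them by brute force. The Python sketch below is an assumption-laden helper, not part of the book: `matches_exactly` and `probe_up_to` are names introduced here, and only numbers without leading zeros are probed.

```python
import re


def matches_exactly(pattern: str, low: int, high: int, probe_up_to: int = 1000) -> bool:
    """Check that pattern accepts the integers low..high and no other integer below probe_up_to."""
    regex = re.compile(pattern)
    # fullmatch anchors the pattern at both ends, like wrapping it in ^(...)$.
    accepted = {n for n in range(probe_up_to) if regex.fullmatch(str(n))}
    return accepted == set(range(low, high + 1))


print(matches_exactly(r"25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9]", 0, 255))
print(matches_exactly(r"[0-255]", 0, 255))   # the naive character class fails the check
```

For wider ranges such as 0 to 65535, `probe_up_to` would have to be raised above the upper bound for the check to be meaningful.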
6.6.
, . , , .
1 C ( 1 12: ):
^[1-9a-c]$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
1 18 ( 1 24: ):
^(1[0-8]|[1-9a-f])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
1 1F ( 1 31: ):
^(1[0-9a-f]|[1-9a-f])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
1 35 ( 1 53: ):
^(3[0-5]|[12][0-9a-f]|[1-9a-f])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
0 3B ( 0 59: ):
^(3[0-9a-b]|[12]?[0-9a-f])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
1 64 ( 1 100):
^(6[0-4]|[1-5][0-9a-f]|[1-9a-f])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
20 7E ( 32 126: ASCII):
^(7[0-9a-e]|[2-6][0-9a-f])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
0 7F ( 0 127: 7- ):
^[1-7]?[0-9a-f]$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
0 FF ( 0 255: 8- ):
^[1-9a-f]?[0-9a-f]$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
1 16E ( 1 366: ):
^(16[0-9a-e]|1[0-5][0-9a-f]|[1-9a-f][0-9a-f]?)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
76C 833 ( 1900 2099: ):
^(83[0-3]|8[0-2][0-9a-f]|7[7-9a-f][0-9a-f]|76[c-f])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
0 7FFF: ( 0 32767: 15- ):
^([1-7][0-9a-f]{3}|[1-9a-f][0-9a-f]{1,2}|[0-9a-f])$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
. , ,
.
.
ASCII , ,
[0-F] , .
,
, ASCII
.
: [0-9A-F] .
.
. [0-9A-F] , [0-9a-f] . [0-9A-Fa-f] .
. .
, 3.4.
.
2.8 6.2.
6.7.
, , , , -
430
6.
. , , , 3.12.
: , , :
^[-+][0-9]+\.[0-9]+[eE][-+]?[0-9]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
: , , :
^[-+][0-9]+\.[0-9]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, , :
^[-+]?[0-9]+\.[0-9]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
: , , :
^[-+]?[0-9]*\.[0-9]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, . , .
, . .
^[-+]?([0-9]+(\.[0-9]+)?|\.[0-9]+)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, . , .
, . .
^[-+]?([0-9]+(\.[0-9]*)?|\.[0-9]+)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, . , .
, . .
^[-+]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][-+]?[0-9]+)?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
:
[-+]?(\b[0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][-+]?[0-9]+\b)?
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
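The most permissive anchored regex above (optional sign, optional integer or fraction, optional exponent) can be exercised in Python as follows. This is a sketch; the function name `is_float` and the sample inputs are assumptions.

```python
import re

FLOAT_RE = re.compile(r"^[-+]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][-+]?[0-9]+)?$")


def is_float(text: str) -> bool:
    """Return True if text looks like a floating-point number per the regex above."""
    return FLOAT_RE.match(text) is not None


for sample in ["1", "1.", ".5", "-1.5e10", "+.5E-3", ".", "e5", "1e", ""]:
    print(repr(sample), is_float(sample))
```

The alternation guarantees at least one digit, which is what keeps a lone dot and the empty string from matching.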
( 2.5), ,
,
. ,
, 6.1.
, ,
:
. ( 2.3) ,
e. + ? ( 2.12) .
.
, ,
. + *
.
, , .
, , , -
432
6.
. [-+]?[0-9]*\.?[0-9]* ,
, ,
.
123abc456, {$&} , {123}{}a{}b{}c{456}{}. 123 456, .
, , ,
, .
.
, ,
, . , , 123. , , , . , , .
, , , ( 2.8)
( 2.9). [0-9]+(\.[0-9]+)?
. \.[0-9]+ .
[0-9]+(\.[0-9]+)?|\.[0-9]+
. ,
, .
.
, , .
[0-9]+(\.[0-9]+)?|\.[0-9]+ , . , [0-9]+(\.[0-9]*)?|\.[0-9]+ .
- ? , . , . + ( ) *
( ).
6.8.
433
, .
.
, . , , . , , . , , .
.
2.3, 2.8, 2.9 2.12.
6.8.
, , .
:
^[0-9]{1,3}(,[0-9]{3})*\.[0-9]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
.
, .
^[0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
.
, .
^([0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?|\.[0-9]+)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
:
\b[0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?\b|\.[0-9]+\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, ,
, .
,
[0-9]+ [0-9]{1,3}(,[0-9]{3})* .
1 3 ,
, .
[0-9]{0,3}(,[0-9]{3})* ,
,
, , ,123. , . , , , . .
, , .
, ,
, .
.
2.3, 2.9 2.12.
6.9.
,
IV, XIII MVIII.
:
^[MDCLXVI]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, :
^(?=[MDCLXVI])M*(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, :
^(?=[MDCLXVI])M*(C[MD]|D?C*)(X[CL]|L?X*)(I[XV]|V?I*)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
^(?=[MDCLXVI])M*D?C{0,4}L?X{0,4}V?I{0,4}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
M, D, C, L, X, V I,
1000, 500, 100, 50, 10, 5 1 .
, ,
, .
( )
. . , 4
IV, IIII.
, .
. 1999
MCMXCIX, M 1000, CM 900, XC 90 IX 9.
: M , M* .
10 , .
C[MD] CM CD,
900 400. D?C{0,3} DCCC, DCC,
DC, D, CCC, CC, C , 800, 700, 600, 500,
300, 200, 100 . 10 .
X[CL]|L?X{0,3} ,
I[XV]|V?I{0,3} .
, .
, .
, . . , , ,
. .
(?=[MDCLXVI]) . ,
2.16,
. ,
.
.
, IIII, IV.
, , - .
4 IIII, IV.
.
( 2.5), ,
, .
,
^ $ \b .
Perl
, , . [MDLV]|C[MD]?|X[CL]?|I[XV]? ,
:
sub roman2decimal {
    my $roman = shift;
    if ($roman =~
        m/^(?=[MDCLXVI])
          (M*)                # 1000
          (C[MD]|D?C{0,3})    # 100
          (X[CL]|L?X{0,3})    # 10
          (I[XV]|V?I{0,3})    # 1
          $/ix)
    {
        # Roman numeral found: add up the value of each token
        my %r2d = (I => 1, IV => 4, V => 5, IX => 9,
                   X => 10, XL => 40, L => 50, XC => 90,
                   C => 100, CD => 400, D => 500, CM => 900,
                   M => 1000);
        my $decimal = 0;
        while ($roman =~ m/[MDLV]|C[MD]?|X[CL]?|I[XV]?/ig) {
            $decimal += $r2d{uc($&)};
        }
        return $decimal;
    } else {
        # Not a Roman numeral
        return 0;
    }
}
See Also
Recipes 2.3, 2.8, 2.9, 2.12, 2.16, 3.9, and 3.11.
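For readers who do not use Perl, the same two-step idea (validate with the strict regex, then sum the values of the tokens) can be sketched in Python. This is an illustrative port, not code from the recipe; the names `ROMAN_RE`, `TOKEN_RE`, `VALUES` and `roman_to_decimal` are assumptions.

```python
import re

# Strict modern Roman numerals, matched case insensitively.
ROMAN_RE = re.compile(
    r"^(?=[MDCLXVI])M*(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$",
    re.IGNORECASE,
)
# One token is a single letter or a subtractive pair such as CM or IV.
TOKEN_RE = re.compile(r"[MDLV]|C[MD]?|X[CL]?|I[XV]?", re.IGNORECASE)
VALUES = {
    "I": 1, "IV": 4, "V": 5, "IX": 9,
    "X": 10, "XL": 40, "L": 50, "XC": 90,
    "C": 100, "CD": 400, "D": 500, "CM": 900,
    "M": 1000,
}

def roman_to_decimal(roman):
    """Return the value of a Roman numeral, or 0 if the input is not one."""
    if ROMAN_RE.match(roman) is None:
        return 0
    return sum(VALUES[token.upper()] for token in TOKEN_RE.findall(roman))
```

As in the Perl version, returning 0 for invalid input is safe only because no Roman numeral has the value zero.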
7. URLs, Paths, and Internet Addresses
, ,
, , ,
:
URL, URN
IP-
Windows
URL ,
, (World Wide
Web). , ,
.
7.1. Validating URLs
, URL, .
URL:
^(https?|ftp|file)://.+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
\A(https?|ftp|file)://.+\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
:
\A                         # Anchor
(https?|ftp)://            # Scheme
[a-z0-9-]+(\.[a-z0-9-]+)+  # Domain
([/?].*)?                  # Path and/or parameters
\Z                         # Anchor
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^(https?|ftp)://[a-z0-9-]+(\.[a-z0-9-]+)+
([/?].+)?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
. (http ftp),
(www ftp):
\A                             # Anchor
((https?|ftp)://|(www|ftp)\.)  # Scheme or subdomain
[a-z0-9-]+(\.[a-z0-9-]+)+      # Domain
([/?].*)?                      # Path and/or parameters
\Z                             # Anchor
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^((https?|ftp)://|(www|ftp)\.)[a-z0-9-]+(\.[a-z0-9-]+)+([/?].*)?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
, .
, :
\A                         # Anchor
(https?|ftp)://            # Scheme
[a-z0-9-]+(\.[a-z0-9-]+)+  # Domain
(/[\w-]+)*                 # Path
/[\w-]+\.(gif|png|jpg)     # File
\Z                         # Anchor
: ,
:
: .NET, Java, JavaScript, PCRE, Perl, Python
, URL - URL. ,
URL
.
URL , , .
URL , .
URL, -.
URL :
scheme://user:password@domain.name:80/path/file.ext?param=value&param2=value2#fragment
. URL
file: . URL http:
.
, URL , -: http, https, ftp file. ^ ( 2.5). ( 2.8) . https? http|https .
, http file, , . .+$ , , , .
, URL,
.
( 2.18) . , , ,
. JavaScript
.
URL
FTP , HTTP FTP , .
ASCII.
(IDN) .
, . ( 2.3), .
, . ( , , Perl .)
. .* , . ,
, [/?].* ,
? ( 2.12).
, ,
URL. URL
.
- URL, , , . , www.regexbuddy.com
http://www.regexbuddy.com. URL,
, www. ftp..
(https?|ftp)://|(www|ftp)\. .
, . https? ftp ,
:// . www ftp , . -
,
.
URL ,
, ASCII,
GIF, PNG JPEG.
, . , , \w ( 2.3).
? , . .
. , .
404, . , URL.
See Also
Recipes 2.3, 2.8, 2.9, and 2.12.
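A minimal Python sketch of the domain-requiring variant from this recipe follows. The function name `looks_like_url` is hypothetical, and the check is purely syntactic: it says nothing about whether the URL actually resolves.

```python
import re

# Scheme, a domain with at least one dot, then an optional path or query.
URL_RE = re.compile(
    r"\A(https?|ftp)://[a-z0-9-]+(\.[a-z0-9-]+)+([/?].*)?\Z",
    re.IGNORECASE,
)

def looks_like_url(text):
    """Return True if text is an http, https or ftp URL with a dotted domain."""
    return URL_RE.match(text) is not None
```

Because the domain part requires at least one dot, a host such as `localhost` is rejected by this particular variant.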
7.2. Finding URLs Within Full Text
URL . URL
, ,
.
URL :
\b(https?|ftp|file)://\S+
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL :
\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|$!:,.;]*
[A-Z0-9+&@#/%=~_|$]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL . , www ftp:
\b((https?|ftp|file)://|(www|ftp)\.)[-A-Z0-9+&@#/%?=~_|$!:,.;]*
[A-Z0-9+&@#/%=~_|$]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
Visit http://www.somesite.com/page, where you will find more information.
URL?
http://www.somesite.com/page, : URL. , (%20). . - .
, , URL, :
http://www.somesite.com/page, where you will find more information.
, , ,
URL, , URL . \S , , . , , S , \S , \s .
.
2.3.
.
, URL . , URL, , .
\S
.
, .
,
URL. (-
2.12), URL . ,
, URL ,
. , .
3.4.
URL, , . , ,
URL.
- URL, , ,
. , www.regexbuddy.com http://www.regexbuddy.com. URL, ,
www. ftp..
(https?|ftp)://|(www|ftp)\.
. , , , . https? ftp , :// . www ftp ,
. ,
, .
See Also
Recipes 2.3 and 2.6.
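The effect of the final character class, which keeps trailing punctuation out of the match, can be seen with a short Python sketch. The name `find_urls` is an assumption made for this example.

```python
import re

# URL that may contain punctuation, but may not end with it.
URL_RE = re.compile(
    r"\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|$!:,.;]*[A-Z0-9+&@#/%=~_|$]",
    re.IGNORECASE,
)

def find_urls(text):
    """Return every URL found in text, without trailing punctuation."""
    return [match.group(0) for match in URL_RE.finditer(text)]
```

Applied to the sample sentence from this recipe, the comma after `page` is left out of the match because the last character of the URL must come from the second, stricter character class.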
7.3. Finding Quoted URLs in Full Text
URL . URL
,
, URL. URL ,
,
URL.
\b(?:(?:https?|ftp|file)://|(www|ftp)\.)[-A-Z0-9+&@#/%?=~_|$!:,.;]*
[-A-Z0-9+&@#/%=~_|$]
|(?:(?:https?|ftp|file)://|(www|ftp)\.)[^\r\n]+
|(?:(?:https?|ftp|file)://|(www|ftp)\.)[^\r\n]+
: ,
, ,
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL
, URL.
, ,
.
,
, URL . , , URL
. URL , : https?|ftp|file www|ftp .
, , URL.
See Also
Recipes 2.8 and 2.9.
7.4. Finding URLs with Parentheses in Full Text
\b(?:(?:https?|ftp|file)://|www\.|ftp\.)
(?:\([-A-Z0-9+&@#/%=~_|$?!:,.]*\)|[-A-Z0-9+&@#/%=~_|$?!:,.])*
(?:\([-A-Z0-9+&@#/%=~_|$?!:,.]*\)|[A-Z0-9+&@#/%=~_|$])
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
\b(?:(?:https?|ftp|file)://|www\.|ftp\.)(?:\([-A-Z0-9+&@#/%=~_|$?!:,.]*\)
|[-A-Z0-9+&@#/%=~_|$?!:,.])*(?:\([-A-Z0-9+&@#/%=~_|$?!:,.]*\)|
[A-Z0-9+&@#/%=~_|$])
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL ,
.
, , .
- :
http://en.wikipedia.org/wiki/PC_Tools_(Central_Point_Software)
http://msdn.microsoft.com/en-us/library/aa752574(VS.85).aspx
, , , -
.
,
URL, . URL
Microsoft .
, ,
. , .
, ,
7.2. : , URL, , URL
, URL, (
). 7.2 ,
URL .
. :
[-A-Z0-9+&@#/%=~_|$?!:,.]
\([-A-Z0-9+&@#/%=~_|$?!:,.]*\)|[-A-Z0-9+&@#/%=~_|$?!:,.]
:
[A-Z0-9+&@#/%=~_|$]
\([-A-Z0-9+&@#/%=~_|$?!:,.]*\)|[A-Z0-9+&@#/%=~_|$]
( 2.8).
,
( 2.9).
\([-A-Z0-9+&@#
/%=~_|$?!:,.]* \) , .
, URL,
.
, URL
, .
, URL, ,
URL URL,
, , , .
, URL, .
URL. , . , .
, ,
(ab*c|d)* , a c ,
b d .
(ab*c|d*)* .
,
d , * d . d ,
. (d*)* dddd . , , .
,
2-1-1, 1-2-1 1-1-2.
, 2-2, 1-3 3-1. ,
.
, 2.15.
, , , - ,
URL, -
, .
See Also
Recipes 2.8 and 2.9.
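To see how the paired-parentheses alternative behaves on the two sample URLs of this recipe, the single-line regex can be tried in Python. This is only a sketch; `find_urls_with_parens` is a made-up name.

```python
import re

CHARS = r"[-A-Z0-9+&@#/%=~_|$?!:,.]"
# Either a balanced (...) pair or a single URL character, repeated;
# the URL must end with a (...) pair or a non-punctuation character.
URL_RE = re.compile(
    r"\b(?:(?:https?|ftp|file)://|www\.|ftp\.)"
    r"(?:\(" + CHARS + r"*\)|" + CHARS + r")*"
    r"(?:\(" + CHARS + r"*\)|[A-Z0-9+&@#/%=~_|$])",
    re.IGNORECASE,
)

def find_urls_with_parens(text):
    """Return URLs in text, keeping parentheses only when they are paired."""
    return [match.group(0) for match in URL_RE.finditer(text)]
```

A closing parenthesis that belongs to the surrounding sentence is not absorbed, because a parenthesis can only be matched as part of a complete pair.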
7.5. Turn URLs into Links
URL ,
7.2 7.4. :
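<a href="$&">$&</a>
Replacement text flavors: .NET, JavaScript, Perl
<a href="$0">$0</a>
Replacement text flavors: .NET, Java, PHP
<a href="\0">\0</a>
Replacement text flavors: PHP, Ruby
<a href="\&">\&</a>
Replacement text flavor: Ruby
<a href="\g<0>">\g<0></a>
Replacement text flavor: Python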
, 3.15.
. , URL,
ahref=URL URL /a , URL URL. , . ,
, . 2.20.
See Also
Recipes 2.21, 3.15, 7.2, and 7.4.
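In Python, the replacement can be sketched with `re.sub` and the `\g<0>` token for the overall match. The regex is the punctuation-aware one from Recipe 7.2; the function name `linkify` is an assumption, and the sketch does not HTML-escape the URL, which real code producing HTML would likely need to do.

```python
import re

URL_RE = re.compile(
    r"\b(?:https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|$!:,.;]*[A-Z0-9+&@#/%=~_|$]",
    re.IGNORECASE,
)

def linkify(text):
    """Wrap every URL found in text in an HTML anchor element."""
    # \g<0> inserts the whole regex match into the replacement text.
    return URL_RE.sub(r'<a href="\g<0>">\g<0></a>', text)
```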
7.6. Validating URNs
, :
\Aurn:
# Namespace Identifier
[a-z0-9][a-z0-9-]{0,31}:
# Namespace Specific String
[a-z0-9()+,\-.:=@;$_!*%/?#]+
\Z
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^urn:[a-z0-9][a-z0-9-]{0,31}:[a-z0-9()+,\-.:=@;$_!*%/?#]+$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
URN :
\burn:
# Namespace Identifier
[a-z0-9][a-z0-9-]{0,31}:
# Namespace Specific String
[a-z0-9()+,\-.:=@;$_!*%/?#]+
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
\burn:[a-z0-9][a-z0-9-]{0,31}:[a-z0-9()+,\-.:=@;$_!*%/?#]+
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URN , ,
URN (), URN:
\burn:
# Namespace Identifier
[a-z0-9][a-z0-9-]{0,31}:
# Namespace Specific String
[a-z0-9()+,\-.:=@;$_!*%/?#]*[a-z0-9+=@$/]
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
\burn:[a-z0-9][a-z0-9-]{0,31}:[a-z0-9()+,\-.:=@;$_!*%/?#]*[a-z0-9+=@$/]
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URN .
urn:, .
(Namespace
Identifier, NID). 1 32 . . ,
. ( 2.3): , 0 31 , . URN ,
.
URN c
(Namespace Specific String, NSS). .
. , , ( 2.12).
,
URN, ,
. ^ $
, Ruby, \A \Z , JavaScript. 2.5.
,
URN. URN, , , , , , , URN. , , RFC
2141. URN , NSS,
,
, URN.
, ( ) ( )
, . , , , NSS , , ,
, .
See Also
Recipes 2.3 and 2.12.
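The single-line URN regex of this recipe translates directly to Python. In this sketch the name `is_urn` is hypothetical; the character class is copied as printed here.

```python
import re

# "urn:", a namespace identifier of 1 to 32 characters that does not start
# with a hyphen, a colon, then a non-empty namespace specific string.
URN_RE = re.compile(
    r"^urn:[a-z0-9][a-z0-9-]{0,31}:[a-z0-9()+,\-.:=@;$_!*%/?#]+$",
    re.IGNORECASE,
)

def is_urn(text):
    """Return True if text has the overall shape of a URN."""
    return URN_RE.match(text) is not None
```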
7.7. Validating Generic URLs
\A
(# Scheme
 [a-z][a-z0-9+\-.]*:
 (# Authority and path
  //
  ([a-z0-9\-._~%!$&()*+,;=]+@)?              # User
  ([a-z0-9\-._~%]+                           # Named host
  |\[[a-f0-9:.]+\]                           # IPv6 host
  |\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])  # IPvFuture host
  (:[0-9]+)?                                 # Port
  (/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?          # Path
 |# Path without authority
  (/?[a-z0-9\-._~%!$&()*+,;=:@]+(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?)?
 )
|# Relative URL (no scheme or authority)
 (# Relative path
  [a-z0-9\-._~%!$&()*+,;=@]+(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?
 |# Absolute path
  (/[a-z0-9\-._~%!$&()*+,;=:@]+)+/?
 )
)
# Query
(\?[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
# Fragment
(\#[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
\Z
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
\A
(#
(?<scheme>[a-z][a-z0-9+\-.]*):
(#
//
(?<user>[a-z0-9\-._~%!$&()*+,;=]+@)?
(?<host>[a-z0-9\-._~%]+
|
\[[a-f0-9:.]+\]
|
\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])
#
#
#
#
#
#
#
#
IP- IPv6
IP-
IPv...
(?<port>:[0-9]+)?
(?<path>(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?)
|#
(?<path>/?[a-z0-9\-._~%!$&()*+,;=:@]+
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?)?
)
|# URL ( )
(?<path>
#
[a-z0-9\-._~%!$&()*+,;=@]+(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?
|#
(/[a-z0-9\-._~%!$&()*+,;=:@]+)+/?
)
)
#
(?<query>\?[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
#
(?<fragment>\#[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
\Z
: ,
: .NET
\A
(#
(?<scheme>[a-z][a-z0-9+\-.]*):
(#
//
(?<user>[a-z0-9\-._~%!$&()*+,;=]+@)?
(?<host>[a-z0-9\-._~%]+
|
\[[a-f0-9:.]+\]
|
\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])
(?<port>:[0-9]+)?
(?<hostpath>(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?)
#
#
#
#
#
#
#
#
IP- IPv6
IP-
IPv...
|#
(?<schemepath>/?[a-z0-9\-._~%!$&()*+,;=:@]+
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?)?
)
|# URL ( )
(?<relpath>
#
[a-z0-9\-._~%!$&()*+,;=@]+(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?
|#
(/[a-z0-9\-._~%!$&()*+,;=:@]+)+/?
)
)
#
(?<query>\?[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
#
(?<fragment>\#[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
\Z
: ,
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\A
(#
(?P<scheme>[a-z][a-z0-9+\-.]*):
(#
//
(?P<user>[a-z0-9\-._~%!$&()*+,;=]+@)?
(?P<host>[a-z0-9\-._~%]+
|
\[[a-f0-9:.]+\]
|
\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])
#
#
#
#
#
#
#
#
IP- IPv6
IP-
IPv...
(?P<port>:[0-9]+)?
(?P<hostpath>(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?)
|#
(?P<schemepath>/?[a-z0-9\-._~%!$&()*+,;=:@]+
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?)?
)
|# URL ( )
(?P<relpath>
#
[a-z0-9\-._~%!$&()*+,;=@]+(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?
|#
(/[a-z0-9\-._~%!$&()*+,;=:@]+)+/?
)
)
#
(?P<query>\?[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
#
(?P<fragment>\#[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
\Z
: ,
: PCRE 4 , Perl 5.10, Python
^([a-z][a-z0-9+\-.]*:(\/\/([a-z0-9\-._~%!$&()*+,;=]+@)?([a-z0-9\-._~%]+|
\[[a-f0-9:.]+\]|\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])(:[0-9]+)?
(\/[a-z0-9\-._~%!$&()*+,;=:@]+)*\/?|(\/?[a-z0-9\-._~%!$&()*+,;=:@]+
(\/[a-z0-9\-._~%!$&()*+,;=:@]+)*\/?)?)|([a-z0-9\-._~%!$&()*+,;=@]+
(\/[a-z0-9\-._~%!$&()*+,;=:@]+)*\/?|(\/[a-z0-9\-._~%!$&()*+,;=:@]+)
+\/?))
(\?[a-z0-9\-._~%!$&()*+,;=:@\/?]*)?(#[a-z0-9\-._~%!$&()*+,;=:@\/?]*)?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
,
URL, , ,
URL. ,
, URL , URL.
URL.
URL ,
, , , URL, URL .
URL,
. , , .
RFC 3986 , URL.
URL, URL URL ,
. RFC 3986 , , , .
. URL , . , URL.
RFC 3986 URL, . , - - URL , RFC 3986 , %20.
URL ,
http: ftp:. . , .
: [a-z][a-z0-9+\-.]* .
IPv4,
IPv6, , IP-, . IPv6
\[[a-f0-9:.]+\] , ,
, \[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\] . IP- , , IPv6.
,
URL. 7.17 , IPv6 .
, , , . :[0-9]+ ,
.
, , . , ,
. , . . .
C (/[a-z0-9\-._~%!$&()*+,;=:@]+)*/? .
URL , , .
, . , , : /?
[a-z0-9\-._~%!$&()*+,;=:@]+(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/? .
URL , , , .
, . URL , .
,
. ,
URL, . [a-z0-9\-._~%!$&()*+,;=@]+(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/? .
, , . , , , .
(/[a-z0-9\-._~%!$&()*+,;=:@]+)+/? .
, URL, , , , , .
.
. , . . ,
URL,
\?[a-z0-9\-._~%!$&()*+,;=:@/?]* . .
. , .
URL ,
.
URL. \#[a-z0-9\-._~%!$&()*+,;=:@/?]* .
URL, . 2.11 ,
, . .NET
,
, .
,
URL , / .
, path, ,
URL / .
, ,
. , , . , ,
. .
See Also
Recipes 2.3, 2.8, 2.9, and 2.12.
7.8. Extracting the Scheme from a URL
URL
^([a-z][a-z0-9+\-.]*):
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL
\A
([a-z][a-z0-9+\-.]*):
(#
//
([a-z0-9\-._~%!$&()*+,;=]+@)?
([a-z0-9\-._~%]+
|\[[a-f0-9:.]+\]
|\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])
#
#
#
#
#
#
#
IP- IPv6
IP-
IPv...
(:[0-9]+)?
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?
|#
(/?[a-z0-9\-._~%!$&()*+,;=:@]+(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?)?
)
#
(\?[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
#
(\#[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
\Z
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^([a-z][a-z0-9+\-.]*):(//([a-z0-9\-._~%!$&()*+,;=]+@)?([a-z0-9\-._~%]+|
\[[a-f0-9:.]+\]|\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])(:[0-9]+)?
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?|(/?[a-z0-9\-._~%!$&()*+,;=:@]+
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?)?)(\?[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
(#[a-z0-9\-._~%!$&()*+,;=:@/?]*)?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL . ,
, URL . URL . URL, -
See Also
Recipes 2.9, 3.9, and 7.7.
7.9. Extracting the User from a URL
You want to extract the user from a string that holds a URL.
For example, you want to extract jan from ftp://jan@www.regexcookbook.com.
URL
^[a-z0-9+\-.]+://([a-z0-9\-._~%!$&()*+,;=]+)@
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL
\A
[a-z][a-z0-9+\-.]*://
([a-z0-9\-._~%!$&()*+,;=]+)@
([a-z0-9\-._~%]+
|\[[a-f0-9:.]+\]
|\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])
(:[0-9]+)?
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?
(\?[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
(\#[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
\Z
#
#
#
#
#
#
#
#
#
IP- IPv6
IP- IPv...
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^[a-z][a-z0-9+\-.]*://([a-z0-9\-._~%!$&()*+,;=]+)@([a-z0-9\-._~%]+|
\[[a-f0-9:.]+\]|\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])(:[0-9]+)?
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?(\?[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
(#[a-z0-9\-._~%!$&()*+,;=:@/?]*)?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
URL , ,
URL. , URL,
, URL .
@. @ , , URL,
@ URL.
, .
,
, , URL. [a-z0-9+\-.]+ ://.
, .
@, , . [a-z0-9\-._~%!$&()*+,;=] , .
,
URL .
URL, .
.
, -
URL ( ) . 2.9. 3.9 , , , .
,
URL, 7.7.
, URL, . , , URL, , .
. ,
, 7.8
URL,
,
, . , , URL.
, URL, , , 7.7. 7.7
( ) .
@.
@,
.
See Also
Recipes 2.9, 3.9, and 7.7.
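Retrieving the first capturing group of the simple extraction regex might look like this in Python. The function name `url_user` is an assumption for illustration; the regex is the one given for URLs already known to be valid.

```python
import re

# Scheme, "://", then the user name captured in group 1, followed by "@".
USER_RE = re.compile(
    r"^[a-z0-9+\-.]+://([a-z0-9\-._~%!$&()*+,;=]+)@",
    re.IGNORECASE,
)

def url_user(url):
    """Return the user part of a URL, or None if the URL has no user."""
    match = USER_RE.match(url)
    return match.group(1) if match else None
```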
7.10. Extracting the Host from a URL
URL
\A
[a-z][a-z0-9+\-.]*://
([a-z0-9\-._~%!$&()*+,;=]+@)?
([a-z0-9\-._~%]+
|\[[a-z0-9\-._~%!$&()*+,;=:]+\])
#
#
#
#
IP- IPv4
IP- IPv6+
: ,
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^[a-z][a-z0-9+\-.]*://([a-z0-9\-._~%!$&()*+,;=]+@)?([a-z0-9\-._~%]+|
\[[a-z0-9\-._~%!$&()*+,;=:]+\])
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL
\A
[a-z][a-z0-9+\-.]*://
([a-z0-9\-._~%!$&()*+,;=]+@)?
([a-z0-9\-._~%]+
|\[[a-f0-9:.]+\]
|\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])
(:[0-9]+)?
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?
(\?[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
(\#[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
\Z
#
#
#
#
#
#
#
#
#
IP- IPv6
IP- IPv...
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^[a-z][a-z0-9+\-.]*://([a-z0-9\-._~%!$&()*+,;=]+@)?([a-z0-9\-._~%]+|
\[[a-f0-9:.]+\]|\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])(:[0-9]+)?
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?(\?[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
(#[a-z0-9\-._~%!$&()*+,;=:@/?]*)?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
URL , , URL.
\A ^ , . [a-z][a-z0-9+\-.]*:// , ([a-z0-9\-._~%!$&()*+,;=]+@)?
. .
,
URL . ,
. , - URL . IPv6
. 2.9. 3.9 , ,
, .
,
URL,
7.7. , URL, . . , 7.9.
, , 7.7.
, , . , , URL.
,
URL, , ,
7.7. 7.7
( ) .
See Also
Recipes 2.9, 3.9, and 7.7.
7.11. Extracting the Port from a URL
You want to extract the port number from a string that holds a URL. For example, you want to extract 80 from http://www.regexcookbook.com:80/.
URL
\A
[a-z][a-z0-9+\-.]*://
([a-z0-9\-._~%!$&()*+,;=]+@)?
([a-z0-9\-._~%]+
|\[[a-z0-9\-._~%!$&()*+,;=:]+\])
:(?<port>[0-9]+)
#
#
#
#
#
IPv4
IPv6+
: ,
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
^[a-z][a-z0-9+\-.]*://([a-z0-9\-._~%!$&()*+,;=]+@)?
([a-z0-9\-._~%]+|\[[a-z0-9\-._~%!$&()*+,;=:]+\]):([0-9]+)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL
\A
[a-z][a-z0-9+\-.]*://
([a-z0-9\-._~%!$&()*+,;=]+@)?
([a-z0-9\-._~%]+
|\[[a-f0-9:.]+\]
|\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]+\])
:([0-9]+)
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?
(\?[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
(\#[a-z0-9\-._~%!$&()*+,;=:@/?]*)?
\Z
#
#
#
#
#
#
#
#
#
IPv6
IP- IPv...
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^[a-z][a-z0-9+\-.]*:\/\/([a-z0-9\-._~%!$&()*+,;=]+@)?
([a-z0-9\-._~%]+|\[[a-f0-9:.]+\]|\[v[a-f0-9][a-z0-9\-._~%!$&()*+,;=:]
+\]):([0-9]+)(\/[a-z0-9\-._~%!$&()*+,;=:@]+)*\/?
(\?[a-z0-9\-._~%!$&()*+,;=:@\/?]*)?(#[a-z0-9\-._~%!$&()*+,;=:@\/?]*)?$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
URL , ,
URL. \A ^ , . [a-z][a-z0-9+\-.]*:// , ([a-z0-9\-._~%!$&()*+,;=]+@)? . ([a-z0-9\-._~%]+|\[[a-z0-9\-._~%!$&()*+,;=:]+\]) .
,
.
,
[0-9]+ .
,
URL . , ,
. , -
URL .
. 2.9. 3.9 , ,
, .
,
URL,
7.7. , URL, . . , 7.10.
, ,
, , , .
3.
,
URL, , , 7.7. 7.7 ( ) .
See Also
Recipes 2.9, 3.9, and 7.7.
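A small Python sketch of the extraction regex for URLs already known to be valid follows. The port ends up in the third capturing group; converting it to an integer and the name `url_port` are choices made for this example only.

```python
import re

# Scheme, optional user, host (name or bracketed IP address), then ":port".
PORT_RE = re.compile(
    r"^[a-z][a-z0-9+\-.]*://([a-z0-9\-._~%!$&()*+,;=]+@)?"
    r"([a-z0-9\-._~%]+|\[[a-z0-9\-._~%!$&()*+,;=:]+\]):([0-9]+)",
    re.IGNORECASE,
)

def url_port(url):
    """Return the port number of a URL as an int, or None if none is given."""
    match = PORT_RE.match(url)
    return int(match.group(3)) if match else None
```

The regex finds no match at all when the URL does not specify a port, so the absence of a match is how the "no port" case is detected.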
7.12. Extracting the Path from a URL
You want to extract the path from a string that holds a URL. For example, you want to extract
/index.html from http://www.regexcookbook.com/index.html or from /index.html#fragment.
URL, , URL.
URL, , :
\A
# ,
([a-z][a-z0-9+\-.]*:(//[^/?#]+)?)?
#
([a-z0-9\-._~%!$&()*+,;=:@/]*)
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^([a-z][a-z0-9+\-.]*:(//[^/?#]+)?)?([a-z0-9\-._~%!$&()*+,;=:@/]*)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL, , URL.
URL, :
\A
# ,
([a-z][a-z0-9+\-.]*:(//[^/?#]+)?)?
#
(/?[a-z0-9\-._~%!$&()*+,;=@]+(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?|/)
# , URL
([#?]|\Z)
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^([a-z][a-z0-9+\-.]*:(//[^/?#]+)?)?(/?[a-z0-9\-._~%!$&()*+,;=@]+
(/[a-z0-9\-._~%!$&()*+,;=:@]+)*/?|/)([#?]|$)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL, , URL. URL,
, :
\A
# ,
(?>([a-z][a-z0-9+\-.]*:(//[^/?#]+)?)?)
#
([a-z0-9\-._~%!$&()*+,;=:@/]+)
: ,
: .NET, Java, PCRE, Perl, Ruby
, , , ,
URL. 7.7 , / URL, ,
URL .
\A
^, . [a-z]
[a-z0-9+\-.
]*: , //[^/?#]+ . ,
, , URL , . ( ), ( ) ( ). ,
, ( 2.3).
, , :
(//[^/?#]+)? . URL.
, .
, ,
,
.
, URL, [a-z0-9\-._~%!$&()*+,;=:@/]* , . , .
,
. , ,
URL. - , .
7.7
URL / . ,
, .
, . URL http://www.
regexcookbook.com, , . . , .
( ), .
, .
, . (, , 2.13.) , , , , : , (//[^/?#]+)? .
[a-z0-9\-._~%!$&()*+,;=:@/]+ //www.regexcookbook.
com, , , -
.
, , . , ,
http .
. , ,
URL,
. , URL .
, . ( 2.15) , , JavaScript Python. ,
. , , ,
, ,
, . ,
, .
. ,
,
null JavaScript.
,
URL, 7.7. .NET ,
.NET path ,
URL. , ,
: hostpath, schemepath
relpath. -,
, , . , .
,
7.7.
6, 7 8. ,
See Also
Recipes 2.9, 3.9, and 7.7.
7.13. Extracting the Query from a URL
^[^?#]+\?([^#]+)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL ,
,
URL. URL . , URL.
^[^?#]+\? .
, , . ^ ( 2.5),
^ ( 2.3).
URL () , . , , URL
URL, ,
\? ^[^?#]+\? .
URL, . URL .
, , [^#]+
.
, .
URL, . URL, , URL,
[^#]+ , , . , URL (
) . 2.9. 3.9 ,
, , .
,
URL,
7.7. 7.7
, URL,
12.
See Also
Recipes 2.9, 3.9, and 7.7.
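The query regex of this recipe can be exercised with a few lines of Python. The helper name `url_query` is hypothetical; the query is returned without the leading question mark, as captured by group 1.

```python
import re

# Everything up to the first "?" that is not preceded by "#",
# then the query up to (but not including) the fragment.
QUERY_RE = re.compile(r"^[^?#]+\?([^#]+)")

def url_query(url):
    """Return the query part of a URL, or None if there is no query."""
    match = QUERY_RE.match(url)
    return match.group(1) if match else None
```

A question mark that appears only after the `#` belongs to the fragment, so such a URL yields no match.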
7.14. Extracting the Fragment from a URL
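You want to extract the fragment from a string that holds a URL.
For example, you want to extract top from http://www.regexcookbook.com#top or from /index.html#top.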
#(.+)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
URL ,
, URL. URL
. URL,
, URL.
, .
#.+ . , .
URL, . , , URL. , #,
.
,
URL,
7.7. 7.7
, URL,
13.
See Also
Recipes 2.9, 3.9, and 7.7.
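Since the fragment regex is not anchored at the start, a search (rather than a match at the beginning) is needed in Python. This sketch uses the made-up name `url_fragment`.

```python
import re

# The first "#" in the URL starts the fragment; group 1 is what follows it.
FRAGMENT_RE = re.compile(r"#(.+)")

def url_fragment(url):
    """Return the fragment of a URL without the "#", or None if absent."""
    match = FRAGMENT_RE.search(url)
    return match.group(1) if match else None
```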
7.15. Validating Domain Names
,
, .
, :
^([a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
\A([a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,}\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
:
\b([a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
63 :
\b((?=[a-z0-9-]{1,63}\.)[a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,63}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
punycode:
\b((xn--)?[a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
63 ,
punycode:
\b((?=[a-z0-9-]{1,63}\.)(xn--)?[a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,63}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
domain.tld subdomain.domain.tld,
.
(top-level domain, tld) .
: [a-z]{2,} .
, .
.
[a-z0-9]+(-[a-z0-9]+)* . ,
, ,
. , ( 2.3), ,
.
, \. . URL
, , :
([a-z0-9]+(-[a-z0-9]+)*\.)+.
,
, , .
, , ,
. ^ $
Ruby \A \Z JavaScript. 2.5.
,
\b ( 2.6).
,
63 .
, , , [a-z0-9]+(-[a-z0-9]+)* ,
,
63 .
[-a-z0-9]{1,63} , 63 , \b([a-z0-9]{1,63}\.)+[a-z]{2,63} , . .
,
. , , 2.16.
[a-z0-9]+(-[a-z0-9]+)*\. , ,
, [-a-z0-9]{1,63}\. , ,
63 . (?=[-a-z0-9]{1,63}\.)
[a-z0-9]+(-[a-z0-9]+)*\. .
(?=[-a-z0-9]{1,63}\.) ,
1 63 , . .
63 - 63 , . ,
63 .
. ,
[a-z0-9]+(-[a-z0-9]+)*\. , . , -
63 , ,
.
(Internationalized Domain Names, IDN) . , . , .es
,
.
, punycode.
punycode, , , ,
, . , , punycode, xn--. ,
(xn--)? , .
See Also
Recipes 2.3, 2.12, and 2.16.
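The length-limiting lookahead can be tried out with an anchored variant of the last regex in this recipe. In this Python sketch the `\b` word boundaries are swapped for `\A` and `\Z` so the whole string is validated; the name `is_domain` is an assumption.

```python
import re

# Each label: lookahead limits it to 63 characters, then an optional
# punycode prefix, then letters and digits with single interior hyphens.
DOMAIN_RE = re.compile(
    r"\A((?=[a-z0-9-]{1,63}\.)(xn--)?[a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,63}\Z",
    re.IGNORECASE,
)

def is_domain(text):
    """Return True if text looks like a fully qualified domain name."""
    return DOMAIN_RE.match(text) is not None
```

The lookahead only checks the length of the label up to the dot; the rest of the group then checks where hyphens may appear.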
7.16. Matching IPv4 Addresses
, IPv4 255.255.255.255.
32- .
, IP-:
^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, IP-:
^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, IP-
:
\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, IP-
:
\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, IP-:
^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, IP-:
^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Perl
if ($subject =~ m/^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})/)
{
$ip = $1 << 24 | $2 << 16 | $3 << 8 | $4;
}
IP- 4 255.255.255.255,
0 255. IP- .
.
, .
IP- [0-9]
{1,3} . 0 999,
0 255. ,
, IP-,
IP- .
IP- 25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]? .
0 255,
10 99
0 9. 25[0-5]
250 255, 2[0-4][0-9]
200 249 [01]?[0-9][0-9]? 0
199, . 6.5.
, IP, , .
, , 2.5. IP-
, ,
\b
( 2.6).
(?:number\.){3}
number . IP-
( 2.9), ( 2.12). ,
IP-. IP-.
.
IP- , .
. , , , .
, IP-.
32- . , , Perl $1, $2, $3 $4. 3.9. Perl , , (<<). String.toInteger() , .
See Also
Recipes 2.3, 2.8, 2.9, and 2.12.
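The Perl snippet in this recipe combines the four captured numbers into a 32-bit integer. A comparable Python sketch, using the accurate regex so that each number is already known to be in the range 0 to 255, might look as follows. The names `OCTET`, `IPV4_RE` and `ipv4_to_int` are made up for this example.

```python
import re

# One number between 0 and 255, captured in its own group.
OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
# Four such numbers separated by literal dots.
IPV4_RE = re.compile(r"^" + r"\.".join([OCTET] * 4) + r"$")

def ipv4_to_int(text):
    """Return the 32-bit integer value of a dotted IPv4 address, or None."""
    match = IPV4_RE.match(text)
    if match is None:
        return None
    a, b, c, d = (int(group) for group in match.groups())
    return a << 24 | b << 16 | c << 8 | d
```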
7.17. Matching IPv6 Addresses
,
IPv6 , / .
IPv6 ,
16-
, (: 1762:0:0:0:0:B03:1:AF18).
.
, IPv6 :
^(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
\A(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
IPv6 :
(?<![:.\w])(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}(?![:.\w])
:
: .NET, Java, PCRE, Perl, Python, Ruby 1.9
JavaScript Ruby 1.8
. , IPv6
.
:
\b(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
IPv6 , 16- , .
, . . . IPv4 IPv6
IPv4 IPv6. IPv6 : 1762:0:0:0:0:B03:127.32.67.15.
, IPv6 :
^(?:[A-F0-9]{1,4}:){6}(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
IPv6 :
(?<![:.\w])(?:[A-F0-9]{1,4}:){6}
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?![:.\w])
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
JavaScript Ruby 1.8
. , IPv6
.
:
\b(?:[A-F0-9]{1,4}:){6}(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
IPv6
.
, IPv6 :
\A                                                     # Start of string
(?:[A-F0-9]{1,4}:){6}                                  # 6 words
(?:[A-F0-9]{1,4}:[A-F0-9]{1,4}                         # 2 words
|  (?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}   # or 4 bytes
   (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
)\Z                                                    # End of string
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^(?:[A-F0-9]{1,4}:){6}(?:[A-F0-9]{1,4}:[A-F0-9]{1,4}|
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
IPv6 :
(?<![:.\w])                                            # Anchor address
(?:[A-F0-9]{1,4}:){6}                                  # 6 words
(?:[A-F0-9]{1,4}:[A-F0-9]{1,4}                         # 2 words
|  (?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}   # or 4 bytes
   (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
)(?![:.\w])
: ,
: .NET, Java, PCRE, Perl, Python, Ruby 1.9
JavaScript Ruby 1.8
. , IPv6
.
:
\b                                                     # Word boundary
(?:[A-F0-9]{1,4}:){6}                                  # 6 words
(?:[A-F0-9]{1,4}:[A-F0-9]{1,4}                         # 2 words
|  (?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}   # or 4 bytes
   (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
)\b
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
\b(?:[A-F0-9]{1,4}:){6}(?:[A-F0-9]{1,4}:[A-F0-9]{1,4}|
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))\b
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
IPv6 .
,
,
, , ,
. , .
, , .
IP-,
. IP- ,
, .
, 1762::B03:1:AF18 1762:0:0:0:0:B03:1:AF18 . , IPv6. , IPv6
:
\A(?:
#
(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}
# , 7
|(?=(?:[A-F0-9]{0,4}:){0,7}[A-F0-9]{0,4}
\Z) #
# 1
(([0-9A-F]{1,4}:){1,7}|:)((:[0-9A-F]{1,4}){1,7}|:)
)\Z
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^(?:(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}|
(?=(?:[A-F0-9]{0,4}:){0,7}[A-F0-9]{0,4}$)(([0-9A-F]{1,4}:){1,7}|:)
((:[0-9A-F]{1,4}){1,7}|:))$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
IPv6 :
(?<![:.\w])(?:
#
(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}
# , 7
|(?=(?:[A-F0-9]{0,4}:){0,7}[A-F0-9]{0,4}
(?![:.\w])) #
# 1
(([0-9A-F]{1,4}:){1,7}|:)((:[0-9A-F]{1,4}){1,7}|:)
)(?![:.\w])
: ,
: .NET, Java, PCRE, Perl, Python, Ruby 1.9
JavaScript Ruby 1.8
, , IPv6
. ,
,
:
(?:
#
(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}
# , 7
|(?=(?:[A-F0-9]{0,4}:){0,7}[A-F0-9]{0,4}
(?![:.\w])) #
# 1
(([0-9A-F]{1,4}:){1,7}|:)((:[0-9A-F]{1,4}){1,7}|:)
)(?![:.\w])
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
(?:(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}|(?=(?:[A-F0-9]{0,4}:){0,7}
[A-F0-9]{0,4}(?![:.\w]))(([0-9A-F]{1,4}:){1,7}|:)((:[0-9A-F]{1,4})
{1,7}|:))(?![:.\w])
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
IPv6 . , , , ,
, . ,
.
, ,
, . , , -
.
IP-, , .
, 1762::B03:127.32.67.15 1762:0:0:0:0:B03:127.32.67.15.
IPv6 ,
, .
, IPv6 :
\A
(?:
#
(?:[A-F0-9]{1,4}:){6}
# , 6
|(?=(?:[A-F0-9]{0,4}:){0,6}
(?:[0-9]{1,3}\.){3}[0-9]{1,3} # 4
\Z) #
# 1
(([0-9A-F]{1,4}:){0,5}|:)((:[0-9A-F]{1,4}){1,5}:|:)
)
# 255.255.255.
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
# 255
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\Z
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^(?:(?:[A-F0-9]{1,4}:){6}|(?=(?:[A-F0-9]{0,4}:){0,6}(?:[0-9]{1,3}\.)
{3}[0-9]{1,3}$)(([0-9A-F]{1,4}:){0,5}|:)((:[0-9A-F]{1,4}){1,5}:|:))
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4]
[0-9]|[01]?[0-9][0-9]?)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
IPv6
:
(?<![:.\w])
(?:
#
(?:[A-F0-9]{1,4}:){6}
# , 6
|(?=(?:[A-F0-9]{0,4}:){0,6}
(?:[0-9]{1,3}\.){3}[0-9]{1,3} # 4
(?![:.\w])) #
# 1
(([0-9A-F]{1,4}:){0,5}|:)((:[0-9A-F]{1,4}){1,5}:|:)
)
# 255.255.255.
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
# 255
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
(?![:.\w])
: ,
: .NET, Java, PCRE, Perl, Python, Ruby 1.9
JavaScript Ruby 1.8
, , IPv6
. ,
,
:
(?:
#
(?:[A-F0-9]{1,4}:){6}
# , 6
|(?=(?:[A-F0-9]{0,4}:){0,6}
(?:[0-9]{1,3}\.){3}[0-9]{1,3} # 4
(?![:.\w])) #
# 1
(([0-9A-F]{1,4}:){0,5}|:)((:[0-9A-F]{1,4}){1,5}:|:)
)
# 255.255.255.
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
# 255
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
(?![:.\w])
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
(?:(?:[A-F0-9]{1,4}:){6}|(?=(?:[A-F0-9]{0,4}:){0,6}(?:[0-9]{1,3}\.){3}
[0-9]{1,3}(?![:.\w]))(([0-9A-F]{1,4}:){0,5}|:)((:[0-9A-F]{1,4}){1,5}:|:))
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?![:.\w])
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
IPv6 , : , .
, IPv6:
\A(?:
#
(?:
#
(?:[A-F0-9]{1,4}:){6}
# , 6
|(?=(?:[A-F0-9]{0,4}:){0,6}
(?:[0-9]{1,3}\.){3}[0-9]{1,3} # 4
\Z) #
# 1
(([0-9A-F]{1,4}:){0,5}|:)((:[0-9A-F]{1,4}){1,5}:|:)
)
# 255.255.255.
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
# 255
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
|#
(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}
|# , 7
(?=(?:[A-F0-9]{0,4}:){0,7}[A-F0-9]{0,4}
\Z) #
# 1
(([0-9A-F]{1,4}:){1,7}|:)((:[0-9A-F]{1,4}){1,7}|:)
)\Z
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^(?:(?:(?:[A-F0-9]{1,4}:){6}|(?=(?:[A-F0-9]{0,4}:){0,6}(?:[0-9]{1,3}\.){3}
[0-9]{1,3}$)(([0-9A-F]{1,4}:){0,5}|:)((:[0-9A-F]{1,4}){1,5}:|:))
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)|(?:[A-F0-9]{1,4}:){7}
[A-F0-9]{1,4}|(?=(?:[A-F0-9]{0,4}:){0,7}[A-F0-9]{0,4}$)
(([0-9A-F]{1,4}:){1,7}|:)((:[0-9A-F]{1,4}){1,7}|:))$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
IPv6 , :
(?<![:.\w])(?:
#
(?:
#
(?:[A-F0-9]{1,4}:){6}
# , 6
|(?=(?:[A-F0-9]{0,4}:){0,6}
(?:[0-9]{1,3}\.){3}[0-9]{1,3} # 4
(?![:.\w])) #
# 1
(([0-9A-F]{1,4}:){0,5}|:)((:[0-9A-F]{1,4}){1,5}:|:)
)
# 255.255.255.
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
# 255
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
|#
(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}
|# , 7
(?=(?:[A-F0-9]{0,4}:){0,7}[A-F0-9]{0,4}
(?![:.\w])) #
# 1
(([0-9A-F]{1,4}:){1,7}|:)((:[0-9A-F]{1,4}){1,7}|:)
)(?![:.\w])
: ,
: .NET, Java, PCRE, Perl, Python, Ruby 1.9
JavaScript Ruby 1.8 , , IPv6 .
,
, :
(?:
#
(?:
#
(?:[A-F0-9]{1,4}:){6}
# , 6
|(?=(?:[A-F0-9]{0,4}:){0,6}
(?:[0-9]{1,3}\.){3}[0-9]{1,3} # 4
(?![:.\w])) #
# 1
(([0-9A-F]{1,4}:){0,5}|:)((:[0-9A-F]{1,4}){1,5}:|:)
)
# 255.255.255.
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
# 255
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
|#
(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}
|# , 7
(?=(?:[A-F0-9]{0,4}:){0,7}[A-F0-9]{0,4}
(?![:.\w])) #
# 1
(([0-9A-F]{1,4}:){1,7}|:)((:[0-9A-F]{1,4}){1,7}|:)
)(?![:.\w])
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
(?:(?:(?:[A-F0-9]{1,4}:){6}|(?=(?:[A-F0-9]{0,4}:){0,6}(?:[0-9]{1,3}\.){3}
[0-9]{1,3}(?![:.\w]))(([0-9A-F]{1,4}:){0,5}|:)((:[0-9A-F]{1,4}){1,5}:|:))
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|
[01]?[0-9][0-9]?)|(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}|
(?=(?:[A-F0-9]{0,4}:){0,7}[A-F0-9]{0,4}(?![:.\w]))
(([0-9A-F]{1,4}:){1,7}|:)((:[0-9A-F]{1,4}){1,7}|:))(?![:.\w])
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
- IPv6
, IPv4.
. : .
, .
, , .
.
, IPv6,
IP-
.
IP- , 2.5. JavaScript ^
$ , Ruby \A
\Z . . Ruby ^ $ ,
, . Ruby
,
.
IPv6
(?<![:.\w]) (?![:.\w]) ,
(, ), . , . 2.16. ,
,
() . . , ,
, ,
. 2.6.
IPv6
. , . [A-F0-9]{1,4} 1 4 , 16- .
( 2.3) . .
3.4.
(?:[A-F0-9]{1,4}:){7} , .
. , 2.9, .
,
. .
, .
IPv6 . (?:[A-F0-9]
{1,4}:){6} , ,
.
IPv4. 7.16.
, , .
32
IPv6. 16- ,
4 , IPv4.
, , . 32 . 2.8, ( ) . ,
.
, , .
IPv4.
,
. ,
. , 1:0:0:0:0:6:0:0,
1::6:0:0 1:0:0:0:0:6::
IPv6. , . ,
,
.
. IPv6 ,
,
. :
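(
  ([0-9A-F]{1,4}:){1,7}   # 1 to 7 words to the left
| :                       # or a double colon at the start
)
(
  (:[0-9A-F]{1,4}){1,7}   # 1 to 7 words to the right
| :                       # or a double colon at the end
)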
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
IPv6 , , .
,
, JavaScript,
.
JavaScript , , ,
, .
.
1 7 , , , .
1 7 ,
, , . ,
1 7 , 1 7 , 1 7 .
. 1 7
, , ,
, c
7. IPv6 8 . , , , ,
7.
.
1 7 .
, 7 -
.
, . , -,
aaaaxbbb. 1 8 0 7 a, x 0 7 b.
. , . ,
.
, .
\A(?:a{7}x
| a{6}xb?
| a{5}xb{0,2}
| a{4}xb{0,3}
| a{3}xb{0,4}
| a{2}xb{0,5}
| axb{0,6}
| xb{0,7}
)\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
a . b,
a x. , .
, , IPv6 . , , 2.16. , .
\A
(?=[abx]{1,8}\Z)
a{0,7}xb{0,7}
\Z
:
: .NET, Java, PCRE, Perl, Python, Ruby
\A . . 1 8 a , b / x , , . \Z .
,
.
\A \Z -. aaaaxbbb , . , ,
- , ,
, ,
. ,
.
, . , a{0,7} ,
. ,
, ,
.
, a{0,7}xb{0,7}
15 , ,
8-, 8 . ,
a{0,7}xb{0,7} , . a*xb*
, a{0,7}xb{0,7} .
\Z
.
, . ,
- axba,
1 8
.
, , .
, ,
, .
, IPv6 ,
IPv4, .
,
, , .
.
()
,
IPv4 . ,
7.16, IPv4,
,
.
IPv4 ,
, , IPv6 . ,
IPv4. IPv4 , . IPv4, , , .
,
. IPv6 :
, .
()
. , , IPv6.
, . IPv6
, , . IPv6 . () .
,
,
.
IPv4.
:
^(6words|compressed6words)ip4$
:
^(8words|compressed8words)$
:
^((6words|compressed6words)ip4|8words|compressed8words)$
:
^((6words|compressed6words)ip4|(8words|compressed8words))$
See Also
Recipes 2.16 and 7.16.
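The regex for standard or compressed notation combines a lookahead that caps the number of colon-separated parts with two groups that allow at most one double colon. A minimal Python sketch, with the hypothetical name `is_ipv6`, lets this be checked against a few addresses:

```python
import re

# Either eight full words, or a compressed address. In the compressed
# branch the lookahead allows at most 7 colons in total, and the two
# groups that follow allow exactly one double colon.
IPV6_RE = re.compile(
    r"^(?:(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}"
    r"|(?=(?:[A-F0-9]{0,4}:){0,7}[A-F0-9]{0,4}$)"
    r"(([0-9A-F]{1,4}:){1,7}|:)((:[0-9A-F]{1,4}){1,7}|:))$",
    re.IGNORECASE,
)

def is_ipv6(text):
    """Return True for IPv6 addresses in standard or compressed notation."""
    return IPV6_RE.match(text) is not None
```

Mixed notation (with a trailing dotted IPv4 part) is not covered by this particular regex.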
7.18. Validating Windows Paths
,
Microsoft
Windows.
\A
[a-z]:\\                    # Drive
(?:[^\\/:*?<>|\r\n]+\\)*    # Folder
[^\\/:*?<>|\r\n]*           # File
\Z
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^[a-z]:\\(?:[^\\/:*?<>|\r\n]+\\)*[^\\/:*?<>|\r\n]*$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
UNC
\A
(?:[a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\  # Drive
(?:[^\\/:*?<>|\r\n]+\\)*                     # Folder
[^\\/:*?<>|\r\n]*                            # File
\Z
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^(?:[a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\(?:[^\\/:*?<>|\r\n]+\\)*
[^\\/:*?<>|\r\n]*$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
, UNC
\A
(?:(?:[a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\|  # Drive
   \\?[^\\/:*?<>|\r\n]+\\?)                      # Relative path
(?:[^\\/:*?<>|\r\n]+\\)*                         # Folder
[^\\/:*?<>|\r\n]*                                # File
\Z
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^(?:(?:[a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\|\\?[^\\/:*?<>|\r\n]+\\?)
(?:[^\\/:*?<>|\r\n]+\\)*[^\\/:*?<>|\r\n]*$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
,
, .
, . [a-z]:\\ . , , , .
Windows ,
: \/:*? |. . ,
,
[^\\/:*? |\r\n]+ . , . \r \n . 2.3. ( 2.12) ,
.
.
(?:[^\\/:*? |\r\n]+\\)* , , ,
( 2.9) ,
( 2.12).
[^\\/:*?
|\r\n]* . , -
.
, ,
* + .
UNC
,
,
(Universal Naming Convention, UNC).
UNC \\server\share\folder\file.
,
, , UNC. [a-z]: , , ,
.
(?:[a-z]:|\\\\[a-z0-9_.$]+\\
[a-z0-9_.$]+) . ( 2.8).
[a-z]:
\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+ .
.
, . 2.9, (?: . .
. UNC , .
, UNC
(, .., ) . ,
, . ,
.
\\?[^\\/:*? |\r\n]+\\? . ,
. \\? , , . [^\\/:*? |\r\n]+ .
, \\? , , , -
.
, \\? ,
. , , , , , , .
, ,
, . ,
, , .
, ,
, . , , . , .
, .
, , , . .
See Also
Recipes 2.3, 2.8, 2.9, and 2.12.
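The drive-letter-and-UNC regex can be tried in Python as sketched below. One deliberate deviation, which is an assumption of this example: the double quote is added to the negated character classes, since Windows does not allow it in file or folder names either. The name `is_windows_path` is hypothetical.

```python
import re

# Characters that may appear in a folder or file name (one or more, or zero
# or more); the double quote is added here as an assumption of this sketch.
NAME = r'[^\\/:*?"<>|\r\n]'
# Drive letter or \\server\share, a backslash, any folders, optional file.
PATH_RE = re.compile(
    r"^(?:[a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\"
    r"(?:" + NAME + r"+\\)*" + NAME + r"*$",
    re.IGNORECASE,
)

def is_windows_path(text):
    """Return True for absolute drive-letter or UNC paths."""
    return PATH_RE.match(text) is not None
```

Relative paths are intentionally rejected by this variant; the third regex of the recipe is the one that admits them.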
7.19. Split Windows Paths into Their Parts
, , Microsoft
Windows. ,
Windows, ,
.
\A
(?<drive>[a-z]:)\\
(?<folder>(?:[^\\/:*?<>|\r\n]+\\)*)
(?<file>[^\\/:*?<>|\r\n]*)
\Z
: ,
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\A
(?P<drive>[a-z]:)\\
(?P<folder>(?:[^\\/:*?<>|\r\n]+\\)*)
(?P<file>[^\\/:*?<>|\r\n]*)
\Z
: ,
: PCRE 4 , Perl 5.10, Python
\A
([a-z]:)\\
((?:[^\\/:*?<>|\r\n]+\\)*)
([^\\/:*?<>|\r\n]*)
\Z
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
^([a-z]:)\\((?:[^\\/:*?<>|\r\n]+\\)*)([^\\/:*?<>|\r\n]*)$
:
: .NET, Java, JavaScript, PCRE, Perl, Python
UNC
\A
(?<drive>[a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\
(?<folder>(?:[^\\/:*?<>|\r\n]+\\)*)
(?<file>[^\\/:*?<>|\r\n]*)
\Z
: ,
: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\A
(?P<drive>[a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\
(?P<folder>(?:[^\\/:*?<>|\r\n]+\\)*)
(?P<file>[^\\/:*?<>|\r\n]*)
\Z
: ,
: PCRE 4 , Perl 5.10, Python
, UNC
.
.
\A
(?<drive>[a-z]:\\|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+\\|\\?)
(?<folder>(?:[^\\/:*?"<>|\r\n]+\\)*)
(?<file>[^\\/:*?"<>|\r\n]*)
\Z
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\A
(?P<drive>[a-z]:\\|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+\\|\\?)
(?P<folder>(?:[^\\/:*?"<>|\r\n]+\\)*)
(?P<file>[^\\/:*?"<>|\r\n]*)
\Z
Regex options: Free-spacing, case insensitive
Regex flavors: PCRE 4 and later, Perl 5.10, Python
\A
([a-z]:\\|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+\\|\\?)
((?:[^\\/:*?"<>|\r\n]+\\)*)
([^\\/:*?"<>|\r\n]*)
\Z
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby
^([a-z]:\\|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+\\|\\?)
((?:[^\\/:*?"<>|\r\n]+\\)*)([^\\/:*?"<>|\r\n]*)$
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python
. ,
, .
, ,
,
. ,
: drive (), folder () file ().
( 2.11), .
: 1, 2 3. 3.9 , , , .
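The named groups drive, folder and file can be read back directly from the match object. The following is a minimal sketch in Python, assuming the drive-letter-and-UNC variant of the regular expression with the double quote included among the invalid file name characters; the function name split_windows_path is a hypothetical helper chosen for this illustration.

```python
import re

# Python spelling (?P<name>...) of the named-capture regex for paths that
# start with a drive letter or a UNC \\server\share prefix.
WINDOWS_PATH = re.compile(
    r'''\A
    (?P<drive>[a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\
    (?P<folder>(?:[^\\/:*?"<>|\r\n]+\\)*)
    (?P<file>[^\\/:*?"<>|\r\n]*)
    \Z''',
    re.IGNORECASE | re.VERBOSE,
)


def split_windows_path(path):
    """Return (drive, folder, file) for an absolute path, or None."""
    match = WINDOWS_PATH.match(path)
    if match is None:
        return None
    return match.group("drive", "folder", "file")


if __name__ == "__main__":
    print(split_windows_path("c:\\folder\\sub\\file.ext"))
    print(split_windows_path("\\\\server\\share\\file.txt"))
    print(split_windows_path("file.txt"))  # relative path: not matched
```

A relative path returns None here because this variant requires a drive or UNC prefix; the variants with an optional prefix would have to be substituted to accept it.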
UNC
, UNC,
.
, UNC
, . , . . , , .
, UNC
, .
, .
, . ,
, , .
Windows. , , ,
, . , , , .
: .
, , (
), , ( ):
\A
(?:
  (?<drive>[a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\
  (?<folder>(?:[^\\/:*?"<>|\r\n]+\\)*)
  (?<file>[^\\/:*?"<>|\r\n]*)
| (?<relativefolder>\\?(?:[^\\/:*?"<>|\r\n]+\\)+)
  (?<file2>[^\\/:*?"<>|\r\n]*)
| (?<relativefile>[^\\/:*?"<>|\r\n]+)
)
\Z
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\A
(?:
  (?P<drive>[a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\
  (?P<folder>(?:[^\\/:*?"<>|\r\n]+\\)*)
  (?P<file>[^\\/:*?"<>|\r\n]*)
| (?P<relativefolder>\\?(?:[^\\/:*?"<>|\r\n]+\\)+)
  (?P<file2>[^\\/:*?"<>|\r\n]*)
| (?P<relativefile>[^\\/:*?"<>|\r\n]+)
)
\Z
Regex options: Free-spacing, case insensitive
Regex flavors: PCRE 4 and later, Perl 5.10, Python
\A
(?:
  ([a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\
  ((?:[^\\/:*?"<>|\r\n]+\\)*)
  ([^\\/:*?"<>|\r\n]*)
| (\\?(?:[^\\/:*?"<>|\r\n]+\\)+)
  ([^\\/:*?"<>|\r\n]*)
| ([^\\/:*?"<>|\r\n]+)
)
\Z
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby
^(?:([a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\((?:[^\\/:*?"<>|\r\n]+\\)*)
([^\\/:*?"<>|\r\n]*)|(\\?(?:[^\\/:*?"<>|\r\n]+\\)+)([^\\/:*?"<>|\r\n]*)|
([^\\/:*?"<>|\r\n]+))$
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python
- ,
, , .
,
, .
.NET
. .NET ,
. .NET,
folder () file (),
, folder file :
\A
(?:
  (?<drive>[a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\
  (?<folder>(?:[^\\/:*?"<>|\r\n]+\\)*)
  (?<file>[^\\/:*?"<>|\r\n]*)
| (?<folder>\\?(?:[^\\/:*?"<>|\r\n]+\\)+)
  (?<file>[^\\/:*?"<>|\r\n]*)
| (?<file>[^\\/:*?"<>|\r\n]+)
)
\Z
Regex options: Free-spacing, case insensitive
Regex flavors: .NET
.
2.9, 2.11, 3.9 7.18.
7.20.
Windows
, ()
Windows .
, . , c c:\folder\file.ext.
^([a-z]):
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
, ,
, .
, UNC.
Windows
, , . , ,
, , .
^ ( 2.5).
, Ruby , ,
Windows . [a-z] ( 2.3).
( ),
, . , , .
.
2.9, .
3.9, , , , .
7.19, , Windows.
7.21.
UNC
, ()
Windows .
UNC,
, . , server share \\server\share\folder\file.ext.
^\\\\([a-z0-9_.$]+)\\([a-z0-9_.$]+)
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
, , ,
UNC.
.
UNC .
Windows , , UNC. ,
, ,
^ ( 2.5).
, Ruby , ,
Windows . \\\\
. ,
, . [a-z0-9_.$]+ . , , .
, , , . \\server\share.
.
2.9, .
3.9, , , , .
7.19, , Windows.
7.22. Windows
, ()
Windows . . , \folder\subfolder\ c:\folder\subfolder\file.ext
\\server\share\folder\subfolder\file.ext.
^([a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)?((?:\\|^)
(?:[^\\/:*?"<>|\r\n]+\\)+)
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, Windows, ,
UNC, . .
, ^([a-z]:|\\\\[a-z0-9_.$]+\\[a-z09_.$]+)? , , . . ,
7.20,
, UNC, 7.21. 2.8.
, , .
, .
(?:[^\\/:*?|\r\n]+\\)+. .
,
,
. . , , . , , , .
, .
. ,
, , .
,
.
.
, ,
, e\, , \\server\share\. , , , (\\|^) , \\? .
, \\server\shar , e\ ,
2.13. , . :
^([a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)?
(\\?(?:[^\\/:*?"<>|\r\n]+\\)+)
, , , , , , . \\server\share \\server\
share
, , , .
, [a-z09_.$]+ , , .
+ , .
, , .
, , : e\.
, (?:[^\\/:*? |\r\n]+\\)+ , . , .
(\\|^) \\? , . ,
, .
,
, . , ,
, (\\|^) . , . , , , ,
, . (\\|^)
,
, (?:[^\\/:*? |\r\n]+\\)+
,
,
.
7.18 7.19,
.
, ,
, , .
, ,
7.19.
, , .
.
.
, , .
.
2.9, .
3.9, , , , .
7.19, , Windows.
7.23. Windows
, ()
Windows .
. ,
file.ext c:\folder\file.ext.
[^\\/:*?"<>|\r\n]+$
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, , , , .
.
, , , /
.
$ ( 2.5).
, Ruby , ,
Windows . [^\\/:*? |\r\n]+ ( 2.3) , . , , ,
.
, ,
,
. , ,
.
.
3.7, , , , .
7.19, , Windows.
7.24.
Windows
, ()
Windows . , . , .ext c:\folder\file.ext.
\.[^.\\/:*?"<>|\r\n]+$
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, 7.23.
, . 7.23 . 7.23
, .
.
\. ,
.
, Version 2.0.txt, . ,
. .
, . ,
. $ , .txt, .0.
, , .
, ,
, .
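The file name and extension regular expressions from Recipes 7.23 and 7.24 can be tried side by side. This is a sketch in Python, assuming the double quote belongs to the set of invalid characters; file_name and extension are hypothetical helper names, not part of any library.

```python
import re

# Last run of valid file name characters before the end of the string
FILE_NAME = re.compile(r'[^\\/:*?"<>|\r\n]+$')
# Last dot followed only by valid characters other than a dot
EXTENSION = re.compile(r'\.[^.\\/:*?"<>|\r\n]+$')


def file_name(path):
    """Return the file name at the end of the path, or None."""
    match = FILE_NAME.search(path)
    return match.group(0) if match else None


def extension(path):
    """Return the extension including its dot, or None."""
    match = EXTENSION.search(path)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(file_name("c:\\folder\\file.ext"))
    print(extension("c:\\folder\\Version 2.0.txt"))
```

Because the dot is excluded from the character class in EXTENSION, a name such as Version 2.0.txt yields .txt rather than .0.txt, and a dot that appears only in a folder name is not reported as an extension.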
.
7.19, , Windows.
7.25.
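[\\/:*?"<>|]+
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Replacement: leave the replacement text blank.
Replacement text flavors: .NET, Java, JavaScript, PHP, Perl, Python, Ruby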
Windows \/:*? |.
, , - .
[\\/:*? |] . , . .
+ . , . , , ,
,
,
.
, .
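Stripping the invalid characters is a plain search-and-replace with an empty replacement. A minimal sketch in Python follows; strip_invalid is a hypothetical helper name used only for this illustration.

```python
import re

# One or more characters that Windows does not allow in file names
INVALID_CHARS = re.compile(r'[\\/:*?"<>|]+')


def strip_invalid(name):
    """Remove every run of invalid file name characters."""
    return INVALID_CHARS.sub("", name)


if __name__ == "__main__":
    print(strip_invalid("re:port*2009?.txt"))
```

The + quantifier does not change the result; it only lets one replacement remove a whole run of adjacent invalid characters.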
.
3.14, ,
.
,
: HTML, XHTML, XML, CSV INI.
, , ,
, ,
. , .
,
.
, , , . , ,
,
, ,
(, ). , .
, , . , , ,
.
(Hypertext Markup Language, HTML)
HTML ,
- . HTML-, ,
. HTML, -, - HTML.
HTML:
( , ), , . HTML 4.01, 1999 , .
HTML . , . ( , , , )
( , , ).
(, <html>), (, </html>). , . ,
. ,
(, <div><div></div></div> , <div><span></div></span> ).
( <p>, ) . ,
. ( <br>, ), . .
HTML AZ. . .
<script> <style> ,
.
</style> </script>,
, .
,
, .
-. , <a>
(anchor ) Click me!:
<a href="http://www.regexcookbook.com"
   title = 'Regex Cookbook'>Click me!</a>
,
. . ,
, ( ). , A-Z, a-z, 0-9,
, , ( , <^[-.0-9:A-Z_a-z]+$>). ( selected checked, ) , , . ,
, . ,
(, selected=selected).
AZ. . , .
HTML 4 252
( ).
&#nnnn;
&#xhhhh;, nnnn 0-9, hhhh 0-9 A-F ( ). &; ( ,
HTML) , ,
, (< >), (") (&).
( ,
0xA0), ,
HTML ,
. , ,
, . (&) .
HTML :
<!-- -->
<!-- ,
-->
, , HTML.
, HTML
, . , , , HTML . , , ,
OReilly (Chuck
Musciano) (Bill Kennedy) HTML & XHTML: The
Definitive Guide1, .
HTML
XHTML XML ( ), , .
, HTML XHTML. ,
6- . . . .: -, 2008.
, HTML, ,
HTML, :
XHTML XML,
<?xml version="1.0" encoding="UTF-8"?>.
.
/>.
XML ,
HTML, .
, . .
.
XML XML,
<?xml version=1.0 encoding=UTF-8?>, , . , <?xml-stylesheet type=text/xsl href=transform.xslt?> , transform.xslt XSL.
DOCTYPE , . :
<!DOCTYPE example [
    <!ENTITY copy "&#169;">
    <!ENTITY copyright-notice "Copyright &copy; 2008, O'Reilly Media">
]>
CDATA .
<![CDATA[ ]]>.
. ,
/>.
XML ( , , )
. AZ, az,
(:) (_), 09, (-) (.).
8.4.
. .
, , XML , , XML. (
HTML, )
.
XML
HTML XHTML, , . , XML, XML, XHTML HTML.
CSV, , . , :
aaa,b b,"""c"" cc"
1,,"333, three,
still more threes"
. 8.1 CSV .
8.1. CSV
aaa
b b
c cc
( )
, CSV, , , CSV.
csv
,
. ( ),
.
(INI)
INI
. , , , .
INI, .
INI -, , .
, .
,
, . ,
.
.
, .
. .
INI ( ),
(user post) (name, title content):
; last modified 2008-12-25
[user]
name=J. Random Hacker
[post]
title = Regular Expressions Rock!
content = Let me count the ways...
8.1. XML
HTML, XHTML
XML, , , , - .
, ,
. , , ,
. ,
, ,
, .
, , , .
, , , (X)HTML ( ) , , .
,
,
.
, ,
, . -
<, >:
<[^>]*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
>
;
.
,
(X)HTML. ,
> :
<(?:[^>"']|"[^"]*"|'[^']*')*>
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, :
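<
(?: [^>"']   # Non-quoted character, or...
  | "[^"]*"  # Double-quoted attribute value, or...
  | '[^']*'  # Single-quoted attribute value
)*
>
Regex options: Free-spacing
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby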
,
. , JavaScript,
, JavaScript .
(X)HTML ( )
> , (X)HTML. , , , , DOCTYPE < .
, ,
,
. , , . -
1 , :
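</?([A-Za-z][^\s>/]*)(?:[^>"']|"[^"]*"|'[^']*')*>
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
The same regular expression in free-spacing mode:
<
/?                   # Permit closing tags
([A-Za-z][^\s>/]*)   # Capture the tag name to backreference 1
(?: [^>"']           # Non-quoted character, or...
  | "[^"]*"          # Double-quoted attribute value, or...
  | '[^']*'          # Single-quoted attribute value
)*
>
Regex options: Free-spacing
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby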
, JavaScript, .
(X)HTML ()
, ,
, (X)HTML, . , . , , (X)
HTML, ,
(, , , ). HTML XHTML,
. 1 2 ( ), , :
<(?:([A-Z][-:A-Z0-9]*)(?:\s+[A-Z][-:A-Z0-9]*(?:\s*=\s*(?:"[^"]*"|
'[^']*'|[-.:\w]+))?)*\s*/?|/([A-Z][-:A-Z0-9]*)\s*)>
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
:
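<
(?:                       # Branch for opening tags...
  ([A-Z][-:A-Z0-9]*)      #   Capture the opening tag name to backreference 1
  (?:                     #   Permit zero or more attributes...
    \s+                   #     ...separated by whitespace
    [A-Z][-:A-Z0-9]*      #     Attribute name
    (?:                   #     Optional attribute value...
      \s*=\s*             #       Attribute name-value delimiter
      (?: "[^"]*"         #       Double-quoted attribute value
        | '[^']*'         #       Single-quoted attribute value
        | [-.:\w]+        #       Unquoted attribute value (HTML)
      )
    )?                    #     Permit attributes without a value (HTML)
  )*
  \s*                     #   Permit trailing whitespace
  /?                      #   Permit self-closed tags (XHTML)
|                         # Branch for closing tags...
  /
  ([A-Z][-:A-Z0-9]*)      #   Capture the closing tag name to backreference 2
  \s*                     #   Permit trailing whitespace
)
>
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby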
XML ()
XML , , .
HTML , :
<(?:([_:A-Z][-.:\w]*)(?:\s+[_:A-Z][-.:\w]*\s*=\s*(?:"[^"]*"|'[^']*'))*\s*
/?|/([_:A-Z][-.:\w]*)\s*)>
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
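<
(?:                       # Branch for opening tags...
  ([_:A-Z][-.:\w]*)       #   Capture the opening tag name to backreference 1
  (?:                     #   Permit zero or more attributes...
    \s+                   #     ...separated by whitespace
    [_:A-Z][-.:\w]*       #     Attribute name
    \s*=\s*               #     Attribute name-value delimiter
    (?: "[^"]*"           #     Double-quoted attribute value
      | '[^']*'           #     Single-quoted attribute value
    )
  )*
  \s*                     #   Permit trailing whitespace
  /?                      #   Permit self-closed tags
|                         # Branch for closing tags...
  /
  ([_:A-Z][-.:\w]*)       #   Capture the closing tag name to backreference 2
  \s*                     #   Permit trailing whitespace
)
>
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby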
,
(X)HTML, 1 2, . ,
XML, , (X)HTML,
, HTML (
). ,
.
, XML- ,
, , . -
XML (X)HTML, .
, , ,
. , , HTML, , (Document Object
Model, DOM). SAX XPath. ,
.
, , .
, XML .
. HTML XHTML,
, .
, HTML
XHTML, <br />
XHTML.
,
, , .
XML (X)HTML.
, ,
,
,
. , , , .
[^>] , .
, ( .*? ) ( JavaScript [\s\S]*? ). ( <.*> ) ,
< > ,
.
. :
<div>
</div>
<div class=box>
<div id=pandoras-box class=box />
<!-- comment -->
<!DOCTYPE html>
<< < w00t! >
<>
, .
,
<input> <input type=button value=>>> <input
type=button onclick=alert(2 > 1)>.
>, . , CDATA XML, DOCTYPE, <script> ,
>.
- , , , ,
, .
>
,
, , , , . ,
XML- ,
, .
, >, . , <input> : <input type=button value=>>> <input type=button
onclick=alert(2 > 1)>.
[^>], ,
( ), ( ). , , . , . -
, , , ( , ).
( [^]* [^]* ). , >,
, .
,
,
.
, ,
.
(!)
,
,
* + ( [^>] ). , , .
.
, .
<, >, , , 2.15.
( ) ,
<, , . !
, (
JavaScript Python , ), , . ,
. ,
( ), .
:
<(?>(?:(?>[^>"']+)|"[^"]*"|'[^']*')*)>
Regex options: None
Regex flavors: .NET, Java, PCRE, Perl, Ruby
:
<(?:[^>"']++|"[^"]*"|'[^']*')*+>
Regex options: None
Regex flavors: Java, PCRE, Perl 5.10, Ruby 1.9
(X)HTML ( )
,
- (X)HTML .
, , , . , HTML, , ,
, ,
.
, ,
(<) AZ az,
/ ( ). <
, , DOCTYPE, XML, CDATA
. -,
, , <textarea> . (X)HTML XML . 534 ,
. , .
< . /? , , . ([A-Za-z][^\s>/]*) ,
1. (, ),
( , ). .
, [A-Za-z] , .
, [^\s>/] ,
, . ( \s , ), > ( ) / (
> XHTML).
( )
. ,
. ,
, , DOM , , , .
, ,
: (?:[^>]|[^]*|[^]*)* . , , .
, ,
() :
<([A-Za-z][^\s>/]*)(?:[^>"'/]|"[^"]*"|'[^']*')*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
(/), , .
<([A-Za-z][^\s>/]*)(?:[^>"']|"[^"]*"|'[^']*')*/>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
.
<([A-Za-z][^\s>/]*)(?:[^>"']|"[^"]*"|'[^']*')*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
. /? , < .
</([A-Za-z][^\s>/]*)(?:[^>"']|"[^"]*"|'[^']*')*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
. ,
,
. , , , , .
(!) . 528 ,
. , ,
[^\s>/] , , , , .
,
. , // , :
</?([A-Za-z](?>[^\s>/]*))(?>(?:(?>[^>"']+)|"[^"]*"|'[^']*')*)>
:
: .NET, Java, PCRE, Perl, Ruby
</?([A-Za-z][^\s>/]*+)(?:[^>"']++|"[^"]*"|'[^']*')*+>
:
: Java, PCRE, Perl 5.10, Ruby 1.9
(X)HTML ()
, , HTML XHTML,
,
. :
AZ az,
AZ, az, 09,
( :
^[-:A-Za-z0-9]+$ ).
, .
, ( )
(/).
( , ), 1 2, . ,
.
. 1:
<([A-Z][-:A-Z0-9]*)(?:\s+[A-Z][-:A-Z0-9]*(?:\s*=\s*
(?:"[^"]*"|'[^']*'|[-.:\w]+))?)*\s*/?>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
/? , > ,
, . , .
(
/ ), .
</([A-Z][-:A-Z0-9]*)\s*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
.
, ,
. , , * , + ? ( ). , , .
XML ()
XML , .
XML,
, XML, .
(X)HTML (), HTML, XML:
(, ). , .
XML ( -
) , ,
.
,
[_:A-Z][-.:\w]* 8.4. , XML.
(X)HTML,
1 2, , / .
,
.
. 1:
<([_:A-Z][-.:\w]*)(?:\s+[_:A-Z][-.:\w]*\s*=\s*
(?:"[^"]*"|'[^']*'))*\s*/?>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
/? , > ,
, . , .
, .
</([_:A-Z][-.:\w]*)\s*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
, CDATA
DOCTYPE.
(X)HTML XML
XML- ,
, ,
. (X)HTML
XML, , , . ,
, (X)HTML XML.
, , ( ), CDATA XML .
, ,
.
3.18 ,
.
: . , ,
. , (X)HTML XML.
.
(X)HTML.
, <script>,
<style>, <textarea> <xmp>1 ( ):
<!--.*?--\s*>|<(script|style|textarea|xmp)\b(?:[^>"']|"[^"]*"|
'[^']*')*?(?:/>|>.*?</\1\s*>)
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
, ,
:
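# Comment
<!-- .*? --\s*>
|
# Special element and its content
<( script | style | textarea | xmp )\b
    (?: [^>"']    # Non-quoted character
      | "[^"]*"   # Double-quoted attribute value
      | '[^']*'   # Single-quoted attribute value
    )*?
    (?: # Singleton tag
        />
      | # Otherwise, include the element's content and matching closing tag
        > .*? </\1\s*>
    )
Regex options: Free-spacing, case insensitive, dot matches line breaks
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby
1
<xmp> , ,
<pre>. <pre> ,
( HTML), . <xmp>
HTML 3.2 HTML 4.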
, , JavaScript, JavaScript .
, [\
s\S] , JavaScript:
<!--[\s\S]*?--\s*>|<(script|style|textarea|xmp)\b(?:[^>"']|"[^"]*"|
'[^']*')*?(?:/>|>[\s\S]*?</\1\s*>)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
<script>, <style>, <textarea> <xmp>,
() , .
, . ()
1, , ( ,
).
XML. , CDATA DOCTYPE. , | :
<!--.*?--\s*>|<!\[CDATA\[.*?]]>|<!DOCTYPE\s(?:[^<>"']|"[^"]*"|
'[^']*'|<!(?:[^>"']|"[^"]*"|'[^']*')*>)*>
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
:
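# Comment
<!-- .*? --\s*>
|
# CDATA section
<!\[CDATA\[ .*? ]]>
|
# Document type declaration
<!DOCTYPE\s
    (?: [^<>"']   # Non-special character
      | "[^"]*"   # Double-quoted value
      | '[^']*'   # Single-quoted value
      | <!(?:[^>"']|"[^"]*"|'[^']*')*>   # Markup declaration
    )*
>
Regex options: Free-spacing, case insensitive, dot matches line breaks
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby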
, JavaScript ( ):
<!--[\s\S]*?--\s*>|<!\[CDATA\[[\s\S]*?]]>|<!DOCTYPE\s(?:[^<>"']|"[^"]*"|
'[^']*'|<!(?:[^>"']|"[^"]*"|'[^']*')*>)*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
HTML 4
HTML,
HTML, .
91 HTML 4.
HTML, <blink>, <bgsound>, <embed> <nobr>. , XHTML 1.1 (
XHTML 1.0 ), , HTML 5:
</?(a|abbr|acronym|address|applet|area|b|base|basefont|bdo|big|blockquote|
body|br|button|caption|center|cite|code|col|colgroup|dd|del|dfn|dir|div|
dl|dt|em|fieldset|font|form|frame|frameset|h1|h2|h3|h4|h5|h6|head|hr|html|
i|iframe|img|input|ins|isindex|kbd|label|legend|li|link|map|menu|meta|
noframes|noscript|object|ol|optgroup|option|p|param|pre|q|s|samp|script|
select|small|span|strike|strong|style|sub|sup|table|tbody|td|textarea|
tfoot|th|thead|title|tr|tt|u|ul|var)\b(?:[^>"']|"[^"]*"|'[^']*')*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, , | . , ,
.
, . , ,
<0.
, 0, , , 91 0.
( ,
), ,
, 91 19.
, :
</?(a(?:bbr|cronym|ddress|pplet|rea)?|b(?:ase(?:font)?|do|ig|lockquote|
ody|r|utton)?|c(?:aption|enter|ite|o(?:de|l(?:group)?))|d(?:[dlt]|el|fn|
i[rv])|em|f(?:ieldset|o(?:nt|rm)|rame(?:set)?)|h(?:[1-6r]|ead|tml)|
i(?:frame|mg|n(?:put|s)|sindex)?|kbd|l(?:abel|egend|i(?:nk)?)|m(?:ap|
e(?:nu|ta))|no(?:frames|script)|o(?:bject|l|p(?:tgroup|tion))|p(?:aram|
re)?|q|s(?:amp|cript|elect|mall|pan|t(?:rike|rong|yle)|u[bp])?|t(?:able|
body|[dhrt]|extarea|foot|head|itle)|ul?|var)\b(?:[^>"']|"[^"]*"|'[^']*')*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
:
<
/?                 # Permit closing tags
(                  # Capture the tag name to backreference 1
  a(?:bbr|cronym|ddress|pplet|rea)?|
  b(?:ase(?:font)?|do|ig|lockquote|ody|r|utton)?|
  c(?:aption|enter|ite|o(?:de|l(?:group)?))|
  d(?:[dlt]|el|fn|i[rv])|
  em|
  f(?:ieldset|o(?:nt|rm)|rame(?:set)?)|
  h(?:[1-6r]|ead|tml)|
  i(?:frame|mg|n(?:put|s)|sindex)?|
  kbd|
  l(?:abel|egend|i(?:nk)?)|
  m(?:ap|e(?:nu|ta))|
  no(?:frames|script)|
  o(?:bject|l|p(?:tgroup|tion))|
  p(?:aram|re)?|
  q|
  s(?:amp|cript|elect|mall|pan|t(?:rike|rong|yle)|u[bp])?|
  t(?:able|body|[dhrt]|extarea|foot|head|itle)|
  ul?|
  var
) \b               # Assert position at a word boundary
(?: [^>"']         # Any character except >, ", or '
  | "[^"]*"        # Double-quoted attribute value
  | '[^']*'        # Single-quoted attribute value
)*
>
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby
, :
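<
/?                            # Permit closing tags
(                             # Capture the tag name to backreference 1
  a (?:  bbr                  # <abbr>
      |  cronym               # <acronym>
      |  ddress               # <address>
      |  pplet                # <applet>
      |  rea                  # <area>
    )?|                       # (or just <a>)
  b (?:  ase (?:font)?        # <base>, <basefont>
      |  do                   # <bdo>
      |  ig                   # <big>
      |  lockquote            # <blockquote>
      |  ody                  # <body>
      |  r                    # <br>
      |  utton                # <button>
    )?|                       # (or just <b>)
  c (?:  aption               # <caption>
      |  enter                # <center>
      |  ite                  # <cite>
      |  o (?:de|l(?:group)?) # <code>, <col>, <colgroup>
    ) |
  d (?:  [dlt]                # <dd>, <dl>, <dt>
      |  el                   # <del>
      |  fn                   # <dfn>
      |  i[rv]                # <dir>, <div>
    ) |
  em |                        # <em>
  f (?:  ieldset              # <fieldset>
      |  o (?:nt|rm)          # <font>, <form>
      |  rame (?:set)?        # <frame>, <frameset>
    ) |
  h (?:  [1-6r]               # <h1>, <h2>, <h3>, <h4>, <h5>, <h6>, <hr>
      |  ead                  # <head>
      |  tml                  # <html>
    ) |
  i (?:  frame                # <iframe>
      |  mg                   # <img>
      |  n (?:put|s)          # <input>, <ins>
      |  sindex               # <isindex>
    )?|                       # (or just <i>)
  kbd |                       # <kbd>
  l (?:  abel                 # <label>
      |  egend                # <legend>
      |  i (?:nk)?            # <li>, <link>
    ) |
  m (?:  ap                   # <map>
      |  e (?:nu|ta)          # <menu>, <meta>
    ) |
  no (?: frames               # <noframes>
      |  script               # <noscript>
    ) |
  o (?:  bject                # <object>
      |  l                    # <ol>
      |  p (?:tgroup|tion)    # <optgroup>, <option>
    ) |
  p (?:  aram                 # <param>
      |  re                   # <pre>
    )?|                       # (or just <p>)
  q |                         # <q>
  s (?:  amp                  # <samp>
      |  cript                # <script>
      |  elect                # <select>
      |  mall                 # <small>
      |  pan                  # <span>
      |  t (?:rike|rong|yle)  # <strike>, <strong>, <style>
      |  u[bp]                # <sub>, <sup>
    )?|                       # (or just <s>)
  t (?:  able                 # <table>
      |  body                 # <tbody>
      |  [dhrt]               # <td>, <th>, <tr>, <tt>
      |  extarea              # <textarea>
      |  foot                 # <tfoot>
      |  head                 # <thead>
      |  itle                 # <title>
    ) |
  ul? |                       # <u>, <ul>
  var                         # <var>
) \b                          # Assert position at a word boundary
(?: [^>"']                    # Any character except >, ", or '
  | "[^"]*"                   # Double-quoted attribute value
  | '[^']*'                   # Single-quoted attribute value
)*
>
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby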
XHTML, , , XHTML 1.0 , 14 : <applet>, <basefont>, <center>, <dir>, <font>, <frame>,
<frameset>, <iframe>, <isindex>, <menu>, <noframes>, <s>, <strike> <u>.
XHTML 1.1 XHTML 1.0
( ): <rb>, <rbc>, <rp>, <rt>, <rtc> <ruby>.
, XHTML 1.0 1.1, .
.
, ; 8.2 , .
8.4, ,
XML .
<b> <strong>,
.
<b> , :
<(/?)b\b((?:[^>"']|"[^"]*"|'[^']*')*)>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
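<                  # Tag opening
(/?)               # Capture the optional leading slash to backreference 1
b \b               # Tag name, followed by a word boundary
(                  # Capture any attributes, etc. to backreference 2
  (?: [^>"']       # Any character except >, ", or '
    | "[^"]*"      # Double-quoted attribute value
    | '[^']*'      # Single-quoted attribute value
  )*
)
>                  # Tag closing
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby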
, :
<$1strong$2>
Replacement text flavors: .NET, Java, JavaScript, Perl, PHP
<\1strong\2>
Replacement text flavors: Python, Ruby
, ,
2 :
<$1strong>
Replacement text flavors: .NET, Java, JavaScript, Perl, PHP
<\1strong>
Replacement text flavors: Python, Ruby
, , 3.15.
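As an illustration of the replacement with backreferences, here is a short Python sketch. It assumes the quote-aware form of the regular expression and the \1 and \2 replacement syntax that Python uses; b_to_strong is a hypothetical helper name.

```python
import re

# Group 1 holds the optional slash, group 2 any attributes of the tag
B_TAG = re.compile(
    r'''<(/?)b\b((?:[^>"']|"[^"]*"|'[^']*')*)>''',
    re.IGNORECASE,
)


def b_to_strong(markup):
    """Replace <b> and </b> tags with <strong>, keeping attributes."""
    return B_TAG.sub(r'<\1strong\2>', markup)


if __name__ == "__main__":
    print(b_to_strong('<b class="x">bold</b> <br> <body>'))
```

The word boundary after b is what keeps tags such as <br> and <body> untouched.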
XML- .
. <b> <strong> , .
<
. -
, , /? ,
. ( , )
- .
, <b>. . ,
B.
( \b ), , ,
. <b>,
<br>, <body>, <blockquote> , b.
( \s ), ,
. .
XML XHTML ,
XML
, , . ,
-, <b-sharp>. ,
(?=[\s/>]) . ,
.
((?:[^>]|[^]*|[^]*)*) , , ,
. , ,
( ) .
, . ,
[^>] , , >, . ,
, , ,
, .
, .
. ( | ).
<(/?)([bi]|em|big)\b((?:[^>"']|"[^"]*"|'[^']*')*)>
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
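<                     # Tag opening
(/?)                  # Capture the optional leading slash to backreference 1
([bi]|em|big) \b      # Capture the tag name to backreference 2
(                     # Capture any attributes, etc. to backreference 3
  (?: [^>"']          # Any character except >, ", or '
    | "[^"]*"         # Double-quoted attribute value
    | '[^']*'         # Single-quoted attribute value
  )*
)
>                     # Tag closing
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby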
<b> <i>
[bi] ,
(|), <em> <big>. , ,
.
, .
, , , , 3. , <strong>
, ,
.
, :
<$1strong$3>
Replacement text flavors: .NET, Java, JavaScript, Perl, PHP
<\1strong\3>
Replacement text flavors: Python, Ruby
,
3 :
<$1strong>
Replacement text flavors: .NET, Java, JavaScript, Perl, PHP
<\1strong>
Replacement text flavors: Python, Ruby
.
8.1, , XML- ,
.
8.3,
, , , .
8.3. XML- ,
<em> <strong>
,
<em> <strong>.
,
<em> <strong>, <em> <strong>,
.
( , 3.14),
.
1: ,
<em> <strong>
</?(?!(?:em|strong)\b)[a-z](?:[^>"']|"[^"]*"|'[^']*')*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
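< /?                  # Permit closing tags
(?!                   # Negative lookahead
  (?: em | strong )   # List of tags to avoid matching
  \b                  # Word boundary avoids partial word matches
)
[a-z]                 # Tag name initial character must be a-z
(?: [^>"']            # Any character except >, ", or '
  | "[^"]*"           # Double-quoted attribute value
  | '[^']*'           # Single-quoted attribute value
)*
>
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby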
2: , <em>
<strong>, ,
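</?(?!(?:em|strong)\s*>)[a-z](?:[^>"']|"[^"]*"|'[^']*')*>
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby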
:
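< /?                  # Permit closing tags
(?!                   # Negative lookahead
  (?: em | strong )   # List of tags to avoid matching
  \s* >               # Only avoid tags if they contain no attributes
)
[a-z]                 # Tag name initial character must be a-z
(?: [^>"']            # Any character except >, ", or '
  | "[^"]*"           # Double-quoted attribute value
  | '[^']*'           # Single-quoted attribute value
)*
>
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby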
, XML- . ,
, (X)HTML ( ) 8.1.
, 1.
. 1 <em> <strong>
, , . 2 ,
1, <em>
<strong>, . . 8.2 , .
8.2.
<i>
</i>
<em>
</em>
, (
), 2 <em> <strong>
.
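The first solution can be exercised with a plain substitution. The following Python sketch assumes the quote-aware form of the regular expression; strip_other_tags is a hypothetical helper name.

```python
import re

# Any opening or closing tag whose name is not exactly em or strong
OTHER_TAGS = re.compile(
    r'''</?(?!(?:em|strong)\b)[a-z](?:[^>"']|"[^"]*"|'[^']*')*>''',
    re.IGNORECASE,
)


def strip_other_tags(markup):
    """Remove every tag except <em> and <strong> (text is kept)."""
    return OTHER_TAGS.sub("", markup)


if __name__ == "__main__":
    html = '<p>My <em>very</em> <b>first</b> <strong class="x">page</strong></p>'
    print(strip_other_tags(html))
```

The word boundary inside the lookahead matters: without it, a tag such as <embed> would be protected merely because its name starts with em. Note that this variant keeps <strong class="x"> with its attribute; the second solution is the one that removes tags carrying attributes.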
( ) , , ,
, . ,
,
HTML.
HTML
(XSS),
<, > &
(<, > &)
, , , ( ). style ,
CSS . , <, > & ,
<(/?)em>
<$1em> ( Python Ruby, <\1em> ).
: , <a>, <em> <strong>, . <a>, , href title,
, , <em> <strong>,
- . .
, ,
(<a>, <em> <strong>).
href title, <a>.
.
, , , :
</?(?!(?:em|strong|a(?:\s+(?:href|title)\s*=\s*(?:"[^"]*"|'[^']*'))*)\s*>)
[a-z](?:[^>"']|"[^"]*"|'[^']*')*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
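< /?                        # Permit closing tags
(?!                         # Negative lookahead
  (?: em                    # Do not match <em>...
    | strong                # or <strong>...
    | a                     # or <a>...
      (?:                   # Only avoid matching <a> tags that stick to...
        \s+                 #   href and/or title attributes
        (?:href|title)
        \s*=\s*
        (?:"[^"]*"|'[^']*') #   Quoted attribute value
      )*
  )
  \s* >                     # Only avoid matching these tags when they are
)                           #   limited to the attributes permitted above
[a-z]                       # Tag name initial character must be a-z
(?: [^>"']                  # Any character except >, ", or '
  | "[^"]*"                 # Double-quoted attribute value
  | '[^']*'                 # Single-quoted attribute value
)*
>
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby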
, . , , , , 3.11 3.16, , ,
( ,
- ).
.
8.1, , XML- ,
.
8.2, , .
8.4. XML
, XML ( ). XML ,
, ,
, .
, , , , , , .
, . XML.
, , , , XML,
.
:
thing
_thing_2_
:-
fantastic4:the.thing
,
, Latin, ,
.
, 09.
, :
thing!
.thing.with.a.dot.in.front
-thingamajig
2nd_thing
XML , , . XML 1.0, ( ), XML 1.1
1.0, . XML 1.1 , 1.0,
. , . , -
, .
, XML 1.0,
XML 1.0. XML 1.1, XML 1.0. W3C 2008 , XML 1.1.
,
( ^...$ \A...\Z ), . ,
XML,
. 2.5.
XML 1.0 ()
\A[:_\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nl}][:_\-.\p{L}\p{M}\p{Nd}\p{Nl}]*\Z
:
: .NET, Java, PCRE, Perl, Ruby 1.9
PCRE UTF-8,
( \p{...} ).
PHP UTF-8 /u.
XML 1.1 ()
,
.
, . FF ( 255) <\u>
\x{...} .
\A[:_A-Za-z\xC0-\xD6\xD8-\xF6\xF8-\u02FF\u0370-\u037D\u037F-\u1FFF\u200C
\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD]
[:_\-.A-Za-z0-9\xB7\xC0-\xD6\xD8-\xF6\xF8-\u036F\u0370-\u037D\u037F-\u1FFF
\u200C\u200D\u203F\u2040\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-
\uFDCF\uFDF0-\uFFFD]*\Z
:
: .NET, Java, Python, Ruby 1.9
^[:_A-Za-z\xC0-\xD6\xD8-\xF6\xF8-\u02FF\u0370-\u037D\u037F-\u1FFF\u200C
\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD]
[:_\-.A-Za-z0-9\xB7\xC0-\xD6\xD8-\xF6\xF8-\u036F\u0370-\u037D\u037F-\u1FFF
\u200C\u200D\u203F\u2040\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-
\uFDCF\uFDF0-\uFFFD]*$
: ( ^ $
)
: .NET, Java, JavaScript, Python
\A[:_A-Za-z\xC0-\xD6\xD8-\xF6\xF8-\x{2FF}\x{370}-\x{37D}\x{37F}-\x{1FFF}
\x{200C}\x{200D}\x{2070}-\x{218F}\x{2C00}-\x{2FEF}\x{3001}-\x{D7FF}
\x{F900}-\x{FDCF}\x{FDF0}-\x{FFFD}][:_\-.A-Za-z0-9\xB7\xC0-\xD6\xD8-\xF6
\xF8-\x{36F}\x{370}-\x{37D}\x{37F}-\x{1FFF}\x{200C}\x{200D}\x{203F}
\x{2040}\x{2070}-\x{218F}\x{2C00}-\x{2FEF}\x{3001}-\x{D7FF}\x{F900}-
\x{FDCF}\x{FDF0}-\x{FFFD}]*\Z
:
: PCRE, Perl
PCRE UTF-8,
\x{...}
FF. PHP
UTF-8 /u.
XML, ,
, , .
, .
, , .
XML 1.0
, ,
XML 1.0
. (:), (_) :
(Ll)
(Lu)
(Lt)
(Lo)
(Nl)
(-), (.) :
(M), : ,
(Mn), , (Mc), (Me)
(Lm)
(Nd)
. :
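\A                                    # Start of string
[:_\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nl}]    # Initial name character
[:_\-.\p{L}\p{M}\p{Nd}\p{Nl}]*        # Subsequent name characters (optional)
\Z                                    # End of string
Regex options: Free-spacing
Regex flavors: .NET, Java, PCRE, Perl, Ruby 1.9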
, PCRE
UTF-8. PHP UTF-8 /u.
,
(Ll, Lu, Lt, Lo Lm) \p{L} .
, . . -,
XML 1.0 ( ) . -,
XML 1.0 2.0,
1996 .
,
XML 1.0.
, , 2.0,
, . ,
XML 1.0 (http://
www.w3.org/TR/2006/REC-xml-20060816), 2.3 Common Syntactic
Constructs B Character Classes.
.
Perl PCRE (Ll), (Lu) (Lt) , (L&). , \p{...} , .
, \pL\pM \p{L}\p{M} :
\A[:_\p{L&}\p{Lo}\p{Nl}][:_\-.\pL\pM\p{Nd}\p{Nl}]*\Z
:
: PCRE, Perl
.NET , Lm
L ,
:
\A[:_\p{L}\p{Nl}-[\p{Lm}]][:_\-.\p{L}\p{M}\p{Nd}\p{Nl}]*\Z
:
: .NET
Java, , PCRE Perl, .
, (
) Lm L:
\A[:_\pL\p{Nl}&&[^\p{Lm}]][:_\-.\pL\pM\p{Nd}\p{Nl}]*\Z
:
: Java
JavaScript, Python Ruby 1.8 . Ruby 1.9 ,
, .
XML 1.1
XML 1.0 2.0. , , (, , ). XML
, XML 1.1
XML 1.0. , ,
, , , 2.0, , .
, , ,
, , .
XML 1.1
, XML 1.0 .
(, 8.1), , XML,
, -
, .
, . ,
(
), .
, ,
.
,
-,
:
[^\d\s"'/<=>][^\s"'/<=>]*
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, .
, .
, +
:
(?!\d)[^\s"'/<=>]+
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
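The simplified pattern is deliberately loose, which a few sample names make visible. The sketch below is in Python and assumes the quote characters belong to the excluded set; looks_like_name is a hypothetical helper, and fullmatch stands in for the anchors that a validation regex would otherwise need.

```python
import re

# Loose approximation: no leading digit, then no whitespace, quotes,
# or the markup characters / < = >
SIMPLE_NAME = re.compile(r'''(?!\d)[^\s"'/<=>]+''')


def looks_like_name(text):
    """Return True if the whole string passes the loose name check."""
    return SIMPLE_NAME.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["thing", "fantastic4:the.thing", "2nd_thing", "thing!"]:
        print(candidate, looks_like_name(candidate))
```

This check rejects names with a leading digit or embedded whitespace, but it accepts strings such as thing! that the exact XML 1.0 and 1.1 patterns reject, so it suits locating names inside markup rather than strict validation.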
.
(John Cowan),
XML 1.1, , XML 1.1
, : http://recycledknowledge.blogspot.
com/2008/02/which-characters-are-excluded-in-xml.html.
Background to Changes in XML 1.0, 5th Edition
http://www.w3.org/XML/2008/02/xml10_5th_edition_background.html,
,
XML 1.1, XML 1.0, .
8.5. HTML
<p> <br>
, , ,
HTML -. , ,
<p>...</p>. , <br>.
, .
.
1: HTML
HTML, HTML &, < > (. 8.3).
-.
Table 8.3. HTML special character substitutions
Search for    Replace with
&             &amp;
<             &lt;
>             &gt;
(&),
, .
2: <br>
:
\r\n?|\n
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
\R
:
: PCRE 7, Perl 5.10
:
<br>
3: <br> </p><p>
:
<br>\s*<br>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
</p><p>
4: <p>...</p>
.
JavaScript
, JavaScript
html_from_plaintext. , , ,
HTML:
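function html_from_plaintext (subject) {
    // Step 1 (plain text searches)
    subject = subject.replace(/&/g, "&amp;").
                      replace(/</g, "&lt;").
                      replace(/>/g, "&gt;");
    // Step 2
    subject = subject.replace(/\r\n?|\n/g, "<br>");
    // Step 3
    subject = subject.replace(/<br>\s*<br>/g, "</p><p>");
    // Step 4
    subject = "<p>" + subject + "</p>";
    return subject;
}

html_from_plaintext("Test.")             // -> "<p>Test.</p>"
html_from_plaintext("Test.\n")           // -> "<p>Test.<br></p>"
html_from_plaintext("Test.\n\n")         // -> "<p>Test.</p><p></p>"
html_from_plaintext("Test1.\nTest2.")    // -> "<p>Test1.<br>Test2.</p>"
html_from_plaintext("Test1.\n\nTest2.")  // -> "<p>Test1.</p><p>Test2.</p>"
html_from_plaintext("< AT&T >")          // -> "<p>&lt; AT&amp;T &gt;</p>"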
. , JavaScript, /g, , replace , . \n, ,
JavaScript ( 0x0A
ASCII) .
1: HTML
,
( . 8.3, ,
). JavaScript
, ,
.
2: <br>
Windows/MS-DOS (CRLF), UNIX/Linux/OS X (LF) Mac OS (CR) \r\n?|\n .
Perl 5.10 PCRE 7 \R ( R) , .
<br> ,
</p><p> .
HTML , .
XHTML,
<br> <br/> .
, , , .
3: <br> </p><p>
, ,
,
</p>, <p>. ( ), . 2 ( <br>),
. . HTML.
XHTML
<br/>, <br>,
, , <br/>\s*<br/> .
4.10, \R
Perl PCRE , , ,
\R .
8.6.
XML-
.
, , , :
id.
<div> id.
id, my-id.
, my-class
class ( ).
id ( )
, ,
( ) :
<[^>]+\sid\b[^>]*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
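<
[^>]+       # Tag name, attributes, etc.
\s id \b    # The target attribute name, as a whole word
[^>]*       # The remainder of the tag, including the id attribute's value
>
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby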
id ( )
, , >, , id
:
<(?:[^>"']|"[^"]*"|'[^']*')+?\sid\s*=\s*("[^"]*"|'[^']*')
(?:[^>"']|"[^"]*"|'[^']*')*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
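<
(?: [^>"']               # Tag and attribute names, etc.
  | "[^"]*"              # ...and double-quoted attribute values
  | '[^']*'              # ...and single-quoted attribute values
)+?
\s id                    # The target attribute name, as a whole word
\s* = \s*                # Attribute name-value delimiter
( "[^"]*" | '[^']*' )    # Capture the attribute value to backreference 1
(?: [^>"']               # Any remaining characters
  | "[^"]*"              # ...and double-quoted attribute values
  | '[^']*'              # ...and single-quoted attribute values
)*
>
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby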
id 1. .
,
\s*=\
s*([^]*|[^]*) \b . , id,
.
<div> id
, . < div\s . \s ( ) , div. ,
, ,
, , ,
(id). +?\sid *?\
bid , , id
( ) :
<div\s(?:[^>"']|"[^"]*"|'[^']*')*?\bid\s*=\s*("[^"]*"|'[^']*')
(?:[^>"']|"[^"]*"|'[^']*')*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
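<div \s                  # Tag name and the following whitespace character
(?: [^>"']               # Tag and attribute names, etc.
  | "[^"]*"              # ...and double-quoted attribute values
  | '[^']*'              # ...and single-quoted attribute values
)*?
\b id                    # The target attribute name, as a whole word
\s* = \s*                # Attribute name-value delimiter
( "[^"]*" | '[^']*' )    # Capture the attribute value to backreference 1
(?: [^>"']               # Any remaining characters
  | "[^"]*"              # ...and double-quoted attribute values
  | '[^']*'              # ...and single-quoted attribute values
)*
>
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby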
id, my-id
,
id ( ) . 561
id, . , ([^]*|[^]*) (?:my-id|my-id) :
<(?:[^>"']|"[^"]*"|'[^']*')+?\sid\s*=\s*(?:"my-id"|'my-id')
(?:[^>"']|"[^"]*"|'[^']*')*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
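<
(?: [^>"']               # Tag and attribute names, etc.
  | "[^"]*"              # ...and double-quoted attribute values
  | '[^']*'              # ...and single-quoted attribute values
)+?
\s id                    # The target attribute name, as a whole word
\s* = \s*                # Attribute name-value delimiter
(?: "my-id"              # The target attribute value in double quotes
  | 'my-id' )            # ...or in single quotes
(?: [^>"']               # Any remaining characters
  | "[^"]*"              # ...and double-quoted attribute values
  | '[^']*'              # ...and single-quoted attribute values
)*
>
Regex options: Free-spacing, case insensitive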
(?:my-id|my-id) ,
my-id ( ), ([])my-id\1 . , .
, my-class class
, , , , . ,
. ,
class (
) , , my-class.
:
<(?:[^>"']|"[^"]*"|'[^']*')+>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
8.1 XML- . ,
,
, .
, ,
3.13, class , :
^(?:[^>"']|"[^"]*"|'[^']*')+?\sclass\s*=\s*("[^"]*"|'[^']*')
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
class 1. , class, ^(?:[^>]|[^]*|[^]*)+? ,
, class .
, class. , ,
, .
. ,
( )
, .
, , 1 ,
:
(?:^|\s)my-class(?:\s|$)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
, myclass , . , ,
.
\bmy-class\b
not-my-class.
, , .
, ,
, . ,
, XPath,
SAX DOM.
, , , . ,
, , .
.
8.7, , .
8.7. cellspacing
<table>,
(X)HTML cellspacing=0
, cellspacing.
XML-
, .
, .
1:
<table>, cellspacing,
, :
<table\b(?![^>]*?\scellspacing\b)([^>]*)>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
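<table \b                # Match "<table", followed by a word boundary
(?!                      # Assert that the regex below cannot be matched here
  [^>]                   #   Match any character except ">"...
  *?                     #     zero or more times, as few as possible (lazy)
  \s cellspacing \b      #   Match "cellspacing", as a complete word
)
(                        # Capture the regex below to backreference 1
  [^>]                   #   Match any character except ">"...
  *                      #     zero or more times, as many as possible (greedy)
)
>                        # Match a literal ">"
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby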
2:
[^>]
(?:[^>]|[^]*|[^]*) . , -,
, >, -,
, cellspacing .
:
<table\b(?!(?:[^>"']|"[^"]*"|'[^']*')*?\scellspacing\b)
((?:[^>"']|"[^"]*"|'[^']*')*)>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
:
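<table \b                # Match "<table", followed by a word boundary
(?!                      # Assert that the regex below cannot be matched here
  (?: [^>"']             #   Match any character except >, ", or '
    | "[^"]*"            #   Or, a double-quoted attribute value
    | '[^']*'            #   Or, a single-quoted attribute value
  )*?                    #   Zero or more times, as few as possible (lazy)
  \s cellspacing \b      #   Match "cellspacing", as a complete word
)
(                        # Capture the regex below to backreference 1
  (?: [^>"']             #   Match any character except >, ", or '
    | "[^"]*"            #   Or, a double-quoted attribute value
    | '[^']*'            #   Or, a single-quoted attribute value
  )*                     #   Zero or more times, as many as possible (greedy)
)
>                        # Match a literal ">"
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby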
, , ,
<table> (
) 1. , , cellspacing. :
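<table cellspacing="0"$1>
Replacement text flavors: .NET, Java, JavaScript, Perl, PHP
<table cellspacing="0"\1>
Replacement text flavors: Python, Ruby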
3.15 , , .
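To see the negative lookahead at work, the quote-aware variant can be applied to a few sample tags. This is a sketch in Python using the \1 replacement syntax; add_cellspacing is a hypothetical helper name.

```python
import re

# Match <table ...> only when no cellspacing attribute appears outside
# of quoted attribute values; group 1 keeps the existing attributes.
TABLE_WITHOUT_CELLSPACING = re.compile(
    r'''<table\b(?!(?:[^>"']|"[^"]*"|'[^']*')*?\scellspacing\b)'''
    r'''((?:[^>"']|"[^"]*"|'[^']*')*)>''',
    re.IGNORECASE,
)


def add_cellspacing(markup):
    """Add cellspacing="0" to <table> tags that lack the attribute."""
    return TABLE_WITHOUT_CELLSPACING.sub(r'<table cellspacing="0"\1>', markup)


if __name__ == "__main__":
    print(add_cellspacing('<table border="1">'))
    print(add_cellspacing('<table cellspacing="2">'))
```

Because quoted values are consumed as single units inside the lookahead, the word cellspacing appearing within another attribute's value does not prevent the attribute from being added.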
, ,
. , ,
.
, (?![^>]*?\scellspacing\b) ,
. , , , , cellspacing - . cellspacing
, ,
.
,
[^>]*? , , , ,
( >). ( \scellspacing\b ) cellspacing, .
( \s ),
.
, cellspacing ,
.
, : ([^>]*) .
, . , ,
. ,
.
, >
.
2, , , , ,
[^>] (?:[^>]|[^]*|[^]*) .
.
.
8.6, , .
8.8. XML-
(X)HTML XML. , -,
,
, .
. , :
<!--.*?-->
:
: .NET, Java, PCRE, Perl, Python, Ruby
. , - JavaScript
, , . JavaScript:
<!--[\s\S]*?-->
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
( ). 3.14 ,
.
<!-- --> .
( , ), . .*?
[\s\S]*? .
, .
JavaScript [\s\
S] . .
\s , \S .
.
*?
,
. , -->, ,
,
, -->. ( , , 2.13.) , XML- . ,
() -->.
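The effect of the lazy quantifier is easiest to see next to its greedy counterpart. A minimal Python sketch follows, with re.DOTALL standing in for the "dot matches line breaks" option; the helper names are hypothetical.

```python
import re

# Lazy: each comment ends at the first --> that follows its <!--
COMMENT = re.compile(r'<!--.*?-->', re.DOTALL)
# Greedy, shown only for contrast: runs on to the last --> in the text
GREEDY_COMMENT = re.compile(r'<!--.*-->', re.DOTALL)


def remove_comments(markup):
    """Remove each comment separately."""
    return COMMENT.sub("", markup)


def remove_comments_greedy(markup):
    """Incorrect variant that also removes text between comments."""
    return GREEDY_COMMENT.sub("", markup)


if __name__ == "__main__":
    sample = "a<!-- one -->b<!-- two\nlines -->c"
    print(remove_comments(sample))
    print(remove_comments_greedy(sample))
```

With two comments in the subject, the greedy version also deletes the b between them, which is exactly the failure the lazy quantifier avoids.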
- , HTML <script> <style> .
, , . ,
(X)HTML
JavaScript CSS. ,
<textarea>, CDATA .
, . , ,
(X)HTML XML, . ,
, .
, , , , , , , , .
, .
, .
(X)HTML XML . 534 XML- .
. , 3.18 , , , , ( ):
<(script|style|textarea|xmp)\b(?:[^>"']|"[^"]*"|'[^']*')*?
(?:/>|>.*?</\1\s*>)|<[a-z](?:[^>"']|"[^"]*"|'[^']*')*>|<!\[CDATA\[.*?]]>
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
,
:
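# Special element: tag and its content
<( script | style | textarea | xmp )\b
    (?: [^>"']    # Match any characters except >, ", or '
      | "[^"]*"   # ...or a double-quoted attribute value
      | '[^']*'   # ...or a single-quoted attribute value
    )*?
    (?: # Singleton tag
        />
      | # Otherwise, include the element's content and matching closing tag
        > .*? </\1\s*>
    )
|
# Standard element: tag only
<[a-z]            # Tag name initial character
    (?: [^>"']    # Match any characters except >, ", or '
      | "[^"]*"   # ...or a double-quoted attribute value
      | '[^']*'   # ...or a single-quoted attribute value
    )*
>
|
# CDATA section
<!\[CDATA\[ .*? ]]>
Regex options: Free-spacing, case insensitive, dot matches line breaks
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby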
JavaScript, :
:
<(script|style|textarea|xmp)\b(?:[^>"']|"[^"]*"|'[^']*')*?
(?:/>|>[\s\S]*?</\1\s*>)|<[a-z](?:[^>"']|"[^"]*"|'[^']*')*>|<!\[CDATA\[
[\s\S]*?]]>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
XML-
, (X)HTML XML,
, !-- -->. :
, .
, <!-- com--ment --> , , .
,
. , <!-- comment --->
, <!---> .
573
8.8. XML-
<!--[^-]*(?:-[^-]+)*--\s*>
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
, <!---->. , , . , , . , ( 2.13).
, , ,
[^-] ,
, ( , <!--(?:-?[^-]+)*-\s*> ). , 2.15.
, .
,
, . , (?:-?[^-]+)* : , , , .
. , , , * . --> ( , , ), ,
. , . ,
. , (?:-[^-]+)*
, , + ,
.
574
8.
,
, . , JavaScript
Python:
<!--(?>-?[^-]+)*--\s*>
:
: .NET, Java, PCRE, Perl, Ruby
( ) 2.14.
C-
,
. C
/* */, // .
:
/\*[\s\S]*?\*/|//.*
: (
)
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
.
8.9, , , XML- .
8.9. XML-
, , ,
. , -
. 41, ,
, . ,
, , ,
.
grep, ,
1.
,
.
.
, :
TODO.
:
<!--.*?-->
:
: .NET, Java, PCRE, Perl, Python, Ruby
JavaScript ,
, , :
<!--[\s\S]*?-->
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
,
, , TODO . , TODO, :
\bTODO\b
PowerGREP,
1, , .
576
8.
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
3.13 ,
( 2.16) ,
. , ,
TODO -->, . , ,
, <!-- -->:
\bTODO\b(?=(?:(?!<!--).)*?-->)
: ,
: .NET, Java, PCRE, Perl, Python, Ruby
JavaScript , [\s\S] :
\bTODO\b(?=(?:(?!<!--)[\s\S])*?-->)
:
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
3.13 , . . , , ,
\bTODO\b .
, , *? ,
, . 2.13, --> ( ),
.
.
, . , ,
:
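\b TODO \b        # Match the characters "TODO", as a complete word
(?=               # Followed by:
  (?:             #   Group but don't capture...
    (?! <!-- )    #     Not followed by: <!--
    .             #     Match any single character
  )*?             #   Repeat zero or more times, as few as possible
  -->             #   Match the characters "-->"
)
Regex options: Free-spacing, dot matches line breaks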
, , : . -
, .
,
TODO, -->. \bTODO
\b(?=.*?-->) ( ),
<!--TODO-->.
.*? ,
,
TODO -->, - . *?
, , , , -->.
\bTODO(?=.*?-->)\b , \b , - .
, , ( . 111).
, .
, , ,
, ,
TODO .
, , \bTODO\b(?=.*?-->) , , TODO
<!-- separate comment -->? TODO, -->, ,
. , , , ,
!--, . ,
[^<!-] , <, ! -, <!--.
,
. (?!<!--).
, , . , (?:(?!<!--).) , *? , .
, , , : \bTODO\
b(?=(?:(?!<!--).)*?-->) . JavaScript, , : \
bTODO\b(?=(?:(?!<!--)[\s\S])*?-->) .
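The lookahead-based approach can be checked against text where TODO occurs both inside and outside comments. The Python sketch below uses re.DOTALL for the "dot matches line breaks" option; count_todos_in_comments is a hypothetical helper name.

```python
import re

# TODO counts only if --> comes before any new <!-- does
TODO_IN_COMMENT = re.compile(
    r'\bTODO\b(?=(?:(?!<!--).)*?-->)',
    re.DOTALL,
)


def count_todos_in_comments(markup):
    """Count occurrences of TODO that sit inside an XML-style comment."""
    return len(TODO_IN_COMMENT.findall(markup))


if __name__ == "__main__":
    sample = "TODO outside <!-- TODO inside --> text <!-- other -->"
    print(count_todos_in_comments(sample))
```

The first TODO in the sample is rejected because the scan towards --> runs into <!-- first; as the text notes, this relies on comment delimiters being properly paired.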
, TODO -->
<!-- , : <!-- --> .
, :
, ,
,
.
,
( ).
: , TODO ,
, .NET.
.NET
:
(?<=<!--(?:(?!-->).)*?)\bTODO\b(?=(?:(?!<!--).)*?-->)
: ,
: .NET
, .NET,
, , , . , <!--, , , -->.
, TODO. .
.
8.8, ,
XML- .
8.10. ,
CSV
, CSV,
. , , , .
CSV -, . , ( )
. , , ,
, 2,
1.
CSV,
, ,
(Comma-Separated Values, CSV).
(,|\r?\n|^)([^",\r\n]+|"(?:[^"]|"")*")?
: ( ^ $
)
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
::
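( , | \r?\n | ^ )    # Capture the leading field delimiter to backreference 1
(                    # Capture a single field to backreference 2:
  [^",\r\n]+         #   Unquoted field
|                    #  Or:
  " (?:[^"]|"")* "   #   Quoted field (may contain escaped double quotes)
)?                   # The group is optional because fields may be empty
Regex options: Free-spacing ("^ and $ match at line breaks" must not be set)
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby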
3.11, CSV 1 .
.
.
, (
, ). CSV
2 , -
. , , , 1.
JavaScript
-,
Replace () . ( Input ()) , -
( Output ()).
CSV, , . ,
.html -:
<html>
<head>
<title>Change CSV delimiters from commas to tabs</title>
</head>
<body>
<p>Input:</p>
<textarea id="input" rows="5" cols="75"></textarea>
<p><input type="button" value="Replace" onclick="commas_to_tabs()"></p>
<p>Output:</p>
<textarea id="output" rows="5" cols="75"></textarea>
<script>
function commas_to_tabs () {
    var input = document.getElementById("input"),
        output = document.getElementById("output"),
        regex = /(,|\r?\n|^)([^",\r\n]+|"(?:[^"]|"")*")?/g,
        result = "",
        match;
    while (match = regex.exec(input.value)) {
        // Check the value of backreference 1
        if (match[1] == ",") {
            // Add a tab (in place of the matched comma) and backreference 2
            // to the result. If backreference 2 is undefined (because the
            // optional second group did not participate in the match),
            // use an empty string instead.
            result += "\t" + (match[2] || "");
        } else {
            // Add the entire match to the result
            result += match[0];
        }
        // Step past zero-length matches to avoid an infinite loop
        if (match.index == regex.lastIndex) regex.lastIndex++;
    }
    output.value = result;
}
</script>
</body>
</html>
, , CSV ( , ) .
.
, (,|\r?\n|^) ,
, .
, CSV. , . , ,
^ . ,
, , (, ).
, , , . , , . ,
, , 1 - . , , CSV , , ?
, ,
, -
, .NET (
. 113).
.NET , , .
, ,
2. CSV, . ,
.
, 2 , | . ,
[^,\r\n]+ , ,
( + ), , . , , .
2, (?:[^]|)* ,
, . ,
, , , () , .
* , , .
,
CSV. , , , .
.
8.11, ,
CSV, .
8.11. CSV
CSV.
8.10 CSV. , ( ) .
( )
CSV , .
,
CSV . , ,
.
CSV,
, ,
(Comma-Separated Values, CSV)
. 519.
(,|\r?\n|^)([^",\r\n]+|"(?:[^"]|"")*")?
: ( ^ $
)
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
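( , | \r?\n | ^ )    # Capture the leading field delimiter to backreference 1
(                    # Capture a single field to backreference 2:
  [^",\r\n]+         #   Unquoted field
|                    #  Or:
  " (?:[^"]|"")* "   #   Quoted field (may contain escaped double quotes)
)?                   # The group is optional because fields may be empty
Regex options: Free-spacing ("^ and $ match at line breaks" must not be set)
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby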
JavaScript
-,
Extract Column 3 ( 3) . Input ()
, , , ,
(
) Output (). ,
.html -:
<html>
<head>
<title>Extract the third column from a CSV string</title>
</head>
<body>
<p>Input:</p>
<textarea id="input" rows="5" cols="75"></textarea>
<p><input type="button" value="Extract Column 3"
onclick="display_csv_column(2)"></p>
<p>Output:</p>
<textarea id="output" rows="5" cols="75"></textarea>
<script>
function display_csv_column (index) {
    var input = document.getElementById("input"),
        output = document.getElementById("output"),
        column_fields = get_csv_column(input.value, index);
    if (column_fields.length > 0) {
        // Show each record on its own line, separated by a line feed (\n)
        output.value = column_fields.join("\n");
    } else {
        output.value = "[No data found to extract]";
    }
}
// Return an array of the CSV fields found at the given zero-based
// column index
function get_csv_column (csv, index) {
    var regex = /(,|\r?\n|^)([^",\r\n]+|"(?:[^"]|"")*")?/g,
        result = [],
        column_index = 0,
        match;
    while (match = regex.exec(csv)) {
        // Check the value of backreference 1. If it is a comma,
        // increment column_index. Otherwise, reset it to zero.
        if (match[1] == ",") {
            column_index++;
        } else {
            column_index = 0;
        }
        if (column_index == index) {
            // Add the field (backreference 2) to the end of the result array
            result.push(match[2]);
        }
        // Step past zero-length matches to avoid an infinite loop
        if (match.index == regex.lastIndex) regex.lastIndex++;
    }
    return result;
}
</script>
</body>
</html>
8.10, , .
JavaScript,
CSV .
get_csv_column()
. 1. ,
, ,
column_index, . 1
( ), ,
, column_index .
, column_index
, . ,
, 2 (, ). , get_csv_column() ,
(, ).
,
(\n).
, , , . get_csv_column() ,
, (index).
CSV ,
. , (
).
,
.
,
, .
CSV
1 1
^([^",\r\n]+|"(?:[^"]|"")*")?(?:,(?:[^",\r\n]+|"(?:[^"]|"")*")?)*
: ^ $
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
CSV
2 1
^(?:[^",\r\n]+|"(?:[^"]|"")*")?,([^",\r\n]+|"(?:[^"]|"")*")?
(?:,(?:[^",\r\n]+|"(?:[^"]|"")*")?)*
: ^ $
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
CSV
3 1
^(?:[^",\r\n]+|"(?:[^"]|"")*")?(?:,(?:[^",\r\n]+|"(?:[^"]|"")*")?){1},
([^",\r\n]+|"(?:[^"]|"")*")?(?:,(?:[^",\r\n]+|"(?:[^"]|"")*")?)*
: ^ $
: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
{1} ,
3. , {2} , 4, {3} 5 . 3, {1}
, .
( 1).
1 .
$1
Replacement text flavors: .NET, Java, JavaScript, Perl, PHP
\1
Replacement text flavors: Python, Ruby
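8.12. Match INI Section Headers
You want to match all section headers in an INI file.
This one is easy. INI section headers appear at the beginning of a line and consist of a name placed within square brackets (for example, [Section1]). Those rules translate directly into a regex:
^\[[^\]\r\n]+]
Regex options: ^ and $ match at line breaks
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby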
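There are not many parts to this regex, so it is easy to break down:
The leading ^ matches the position at the beginning of a line, because the "^ and $ match at line breaks" option is enabled.
\[ matches a literal [ character. It is escaped with a backslash so that [ does not start a character class.
[^\]\r\n] is a negated character class that matches any character except ], a carriage return (\r), or a line feed (\n). The + quantifier that follows lets the class match one or more characters, which leads to...
The trailing ] matches a literal ] character and ends the section header. It does not need a backslash because it does not occur inside a character class.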
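If you only want to find one specific section header, it is even easier. The following regex matches the header of a section named Section1:
^\[Section1]
Regex options: ^ and $ match at line breaks
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Here the only difference from a plain-text search for [Section1] is that the match must occur at the beginning of a line. That prevents matching section headers that are commented out (preceded by a semicolon) and text that looks like a header but is actually part of a parameter value (for example, Item1=[Value1]).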
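As a quick check of both regexes in this recipe, here is a minimal JavaScript sketch. The sample INI text and the variable names are assumptions made for illustration; only the two patterns come from the recipe.

```javascript
// Sketch only: sample data and names are illustrative.
const ini = [
  '[Section1]',
  'Item1=[Value1]',   // looks like a header, but is part of a value
  '; [Commented]',    // commented-out header
  '[Section2]',
  'Item2=Value2'
].join('\n');

// m makes ^ match at the start of every line; g finds all matches.
const headerRegex = /^\[[^\]\r\n]+]/gm;
const headers = ini.match(headerRegex) || [];
console.log(headers);

// Variation: look for one specific section header.
const hasSection1 = /^\[Section1]/m.test(ini);
console.log(hasSection1);
```

With this sample, headers contains `[Section1]` and `[Section2]` only, and hasSection1 is true. The bracketed value and the commented-out header are skipped because neither starts at the beginning of a line.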
See Also
Recipe 8.13 describes how to match INI section blocks.
Recipe 8.14 does the same for INI name-value pairs.
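8.13. Match INI Section Blocks
You need to match each complete INI section block (a section header together with all of the section's name-value pairs), in order to split an INI file or process each block separately.
Recipe 8.12 showed how to match an INI section header. To match a whole section, start with the same pattern and keep matching until you reach the end of the string or a [ character at the beginning of a line (which marks the start of a new section):
^\[[^\]\r\n]+](?:\r?\n(?:[^[\r\n].*)?)*
Regex options: ^ and $ match at line breaks ("dot matches line breaks" must not be set)
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Or in free-spacing mode: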
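^ \[ [^\]\r\n]+ ]  # Match a section header
(?:                # Followed by the rest of the section...
  \r?\n            #   Match a line break sequence
  (?:              #   After each line starts, match...
    [^[\r\n]       #     Any character except [ or a line break character
    .*             #     Match the rest of the line
  )?               #   The group is optional, to allow empty lines
)*                 # Continue until the end of the section
Regex options: ^ and $ match at line breaks, free-spacing ("dot matches line breaks" must not be set)
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby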
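This regular expression starts by matching an INI section header with ^\[[^\]\r\n]+] and then continues matching one line at a time, as long as the lines do not start with [. Consider the following subject text: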
[Section1]
Item1=Value1
Item2=[Value2]
; [SectionA]
; The SectionA header has been commented out
ItemA=ValueA ; ItemA is not commented out, and is part of Section1
[Section2]
Item3=Value3
Item4 = Value4
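Given this string, the regex finds two matches. The first runs from the beginning of the string up to the [Section2] header (including any empty line before it). The second runs from the start of the Section2 header to the end of the string.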
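The behavior can be reproduced with a minimal JavaScript sketch. The array-based sample (with an empty line before the second header) and the variable names are assumptions made for illustration; the pattern is the single-line regex from this recipe.

```javascript
// Sketch only: sample data and names are illustrative.
const ini = [
  '[Section1]',
  'Item1=Value1',
  'Item2=[Value2]',
  '; [SectionA]',
  'ItemA=ValueA',
  '',
  '[Section2]',
  'Item3=Value3',
  'Item4 = Value4'
].join('\n');

// No s flag, so the dot stops at line breaks; m makes ^ match at each line.
const blockRegex = /^\[[^\]\r\n]+](?:\r?\n(?:[^[\r\n].*)?)*/gm;
const blocks = ini.match(blockRegex) || [];

// The first line of each block is its section header.
const blockHeaders = blocks.map(block => block.split(/\r?\n/)[0]);
console.log(blocks.length, blockHeaders);
```

Here blocks holds two strings. The first ends with the empty line that precedes `[Section2]`, and the commented-out `; [SectionA]` line stays inside the first block because it does not start with [.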
See Also
Recipe 8.12 describes how to match INI section headers.
Recipe 8.14 does the same for INI name-value pairs.
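8.14. Match INI Name-Value Pairs
You want to match INI name-value pairs (for example, Item1=Value1) and split each match into two parts with capturing groups. Backreference 1 should contain the parameter name (Item1), and backreference 2 should contain the value (Value1).
Here is a regular expression that does the job (shown a second time in free-spacing mode):
^([^=;\r\n]+)=([^;\r\n]*)
Regex options: ^ and $ match at line breaks
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby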
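^               # Start of a line
( [^=;\r\n]+ )  # Capture the name to backreference 1
=               # Name-value delimiter
( [^;\r\n]* )   # Capture the value to backreference 2
Regex options: ^ and $ match at line breaks, free-spacing
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby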
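Like the other INI recipes in this chapter, this one uses fairly simple regex ingredients.
The pattern starts with ^ to match the position at the start of a line (make sure the "^ and $ match at line breaks" option is enabled). This matters because, without the guarantee that a match starts at the beginning of a line, the regex could match part of a commented-out line.
Next comes a capturing group that contains the negated character class [^=;\r\n], followed by the one-or-more quantifier +, which matches the parameter name and stores it as backreference 1. The negated class matches any character except these four: equals sign, semicolon, carriage return (\r), and line feed (\n). Carriage returns and line feeds end an INI parameter, a semicolon starts a comment, and an equals sign separates a parameter's name from its value.
After the name, the regex matches a literal equals sign (the name-value delimiter) and then the parameter value. The value is matched by a second capturing group that resembles the one for the name but has two fewer restrictions. First, it allows equals signs as part of the value (the character class negates one character fewer). Second, it uses the * quantifier, which removes the need to match at least one character (unlike names, values may be empty).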
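The two capturing groups can be read back in code as follows. This is a minimal JavaScript sketch; the helper name parsePairs and the sample text are assumptions made for illustration, and only the pattern comes from the recipe.

```javascript
// Sketch only: parsePairs is a hypothetical helper, not part of the recipe.
const pairRegex = /^([^=;\r\n]+)=([^;\r\n]*)/gm;

function parsePairs(text) {
  const pairs = [];
  // matchAll requires the g flag and yields one match object per pair.
  for (const match of text.matchAll(pairRegex)) {
    pairs.push({ name: match[1], value: match[2] });
  }
  return pairs;
}

const sample = [
  'Item1=Value1',
  '; Item2=hidden',          // commented-out line, not matched
  'Item3=',                  // empty value, allowed by the * quantifier
  'Item4 = Value4 ; note'    // value stops before the trailing comment
].join('\n');

const pairs = parsePairs(sample);
console.log(pairs);
```

The sample yields three pairs. Note that the regex does not trim whitespace: for the last line the name is `Item4 ` and the value is ` Value4 `, so a caller that wants clean strings would still need to call trim() on both.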
See Also
Recipe 8.12 describes how to match INI section headers.
Recipe 8.13 covers INI section blocks.