author | Cronet Mainline Eng <cronet-mainline-eng+copybara@google.com> | 2024-02-02 09:37:13 +0000 |
---|---|---|
committer | Chidera Olibie <colibie@google.com> | 2024-02-02 09:53:24 +0000 |
commit | 5cfdd35118d5a23349255971e97737e32895ec0f (patch) | |
tree | f6b803e3a8bbddaf4814d1a43930799c3d7f4d8e /third_party/re2/src/doc/syntax.txt | |
parent | abce8a39488511c10b95ac52d1a3fdd2e886da83 (diff) | |
Cronet 121.0.6167.71: import third_party/re2
Bug: b/322154153
FolderOrigin-RevId: /tmp/copybara-origin/src
Change-Id: Ic5f3b7c7578bf4e12b03944d863325abfc88853a
Diffstat (limited to 'third_party/re2/src/doc/syntax.txt')
-rw-r--r-- | third_party/re2/src/doc/syntax.txt | 463 |
1 files changed, 463 insertions, 0 deletions
diff --git a/third_party/re2/src/doc/syntax.txt b/third_party/re2/src/doc/syntax.txt
new file mode 100644
index 000000000..6070efd96
--- /dev/null
+++ b/third_party/re2/src/doc/syntax.txt
@@ -0,0 +1,463 @@
+RE2 regular expression syntax reference
+-------------------------------------
+
+Single characters:
+. any character, possibly including newline (s=true)
+[xyz] character class
+[^xyz] negated character class
+\d Perl character class
+\D negated Perl character class
+[[:alpha:]] ASCII character class
+[[:^alpha:]] negated ASCII character class
+\pN Unicode character class (one-letter name)
+\p{Greek} Unicode character class
+\PN negated Unicode character class (one-letter name)
+\P{Greek} negated Unicode character class
+
+Composites:
+xy «x» followed by «y»
+x|y «x» or «y» (prefer «x»)
+
+Repetitions:
+x* zero or more «x», prefer more
+x+ one or more «x», prefer more
+x? zero or one «x», prefer one
+x{n,m} «n» or «n»+1 or ... or «m» «x», prefer more
+x{n,} «n» or more «x», prefer more
+x{n} exactly «n» «x»
+x*? zero or more «x», prefer fewer
+x+? one or more «x», prefer fewer
+x?? zero or one «x», prefer zero
+x{n,m}? «n» or «n»+1 or ... or «m» «x», prefer fewer
+x{n,}? «n» or more «x», prefer fewer
+x{n}? exactly «n» «x»
+x{} (== x*) NOT SUPPORTED vim
+x{-} (== x*?) NOT SUPPORTED vim
+x{-n} (== x{n}?) NOT SUPPORTED vim
+x= (== x?) NOT SUPPORTED vim
+
+Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}»
+reject forms that create a minimum or maximum repetition count above 1000.
+Unlimited repetitions are not subject to this restriction.
+
+Possessive repetitions:
+x*+ zero or more «x», possessive NOT SUPPORTED
+x++ one or more «x», possessive NOT SUPPORTED
+x?+ zero or one «x», possessive NOT SUPPORTED
+x{n,m}+ «n» or ... or «m» «x», possessive NOT SUPPORTED
+x{n,}+ «n» or more «x», possessive NOT SUPPORTED
+x{n}+ exactly «n» «x», possessive NOT SUPPORTED
+
+Grouping:
+(re) numbered capturing group (submatch)
+(?P<name>re) named & numbered capturing group (submatch)
+(?<name>re) named & numbered capturing group (submatch)
+(?'name're) named & numbered capturing group (submatch) NOT SUPPORTED
+(?:re) non-capturing group
+(?flags) set flags within current group; non-capturing
+(?flags:re) set flags during re; non-capturing
+(?#text) comment NOT SUPPORTED
+(?|x|y|z) branch numbering reset NOT SUPPORTED
+(?>re) possessive match of «re» NOT SUPPORTED
+re@> possessive match of «re» NOT SUPPORTED vim
+%(re) non-capturing group NOT SUPPORTED vim
+
+Flags:
+i case-insensitive (default false)
+m multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
+s let «.» match «\n» (default false)
+U ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)
+Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).
+
+Empty strings:
+^ at beginning of text or line («m»=true)
+$ at end of text (like «\z» not «\Z») or line («m»=true)
+\A at beginning of text
+\b at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
+\B not at ASCII word boundary
+\G at beginning of subtext being searched NOT SUPPORTED pcre
+\G at end of last match NOT SUPPORTED perl
+\Z at end of text, or before newline at end of text NOT SUPPORTED
+\z at end of text
+(?=re) before text matching «re» NOT SUPPORTED
+(?!re) before text not matching «re» NOT SUPPORTED
+(?<=re) after text matching «re» NOT SUPPORTED
+(?<!re) after text not matching «re» NOT SUPPORTED
+re& before text matching «re» NOT SUPPORTED vim
+re@= before text matching «re» NOT SUPPORTED vim
+re@! before text not matching «re» NOT SUPPORTED vim
+re@<= after text matching «re» NOT SUPPORTED vim
+re@<! after text not matching «re» NOT SUPPORTED vim
+\zs sets start of match (= \K) NOT SUPPORTED vim
+\ze sets end of match NOT SUPPORTED vim
+\%^ beginning of file NOT SUPPORTED vim
+\%$ end of file NOT SUPPORTED vim
+\%V on screen NOT SUPPORTED vim
+\%# cursor position NOT SUPPORTED vim
+\%'m mark «m» position NOT SUPPORTED vim
+\%23l in line 23 NOT SUPPORTED vim
+\%23c in column 23 NOT SUPPORTED vim
+\%23v in virtual column 23 NOT SUPPORTED vim
+
+Escape sequences:
+\a bell (== \007)
+\f form feed (== \014)
+\t horizontal tab (== \011)
+\n newline (== \012)
+\r carriage return (== \015)
+\v vertical tab character (== \013)
+\* literal «*», for any punctuation character «*»
+\123 octal character code (up to three digits)
+\x7F hex character code (exactly two digits)
+\x{10FFFF} hex character code
+\C match a single byte even in UTF-8 mode
+\Q...\E literal text «...» even if «...» has punctuation
+
+\1 backreference NOT SUPPORTED
+\b backspace NOT SUPPORTED (use «\010»)
+\cK control char ^K NOT SUPPORTED (use «\001» etc)
+\e escape NOT SUPPORTED (use «\033»)
+\g1 backreference NOT SUPPORTED
+\g{1} backreference NOT SUPPORTED
+\g{+1} backreference NOT SUPPORTED
+\g{-1} backreference NOT SUPPORTED
+\g{name} named backreference NOT SUPPORTED
+\g<name> subroutine call NOT SUPPORTED
+\g'name' subroutine call NOT SUPPORTED
+\k<name> named backreference NOT SUPPORTED
+\k'name' named backreference NOT SUPPORTED
+\lX lowercase «X» NOT SUPPORTED
+\ux uppercase «x» NOT SUPPORTED
+\L...\E lowercase text «...» NOT SUPPORTED
+\K reset beginning of «$0» NOT SUPPORTED
+\N{name} named Unicode character NOT SUPPORTED
+\R line break NOT SUPPORTED
+\U...\E upper case text «...» NOT SUPPORTED
+\X extended Unicode sequence NOT SUPPORTED
+
+\%d123 decimal character 123 NOT SUPPORTED vim
+\%xFF hex character FF NOT SUPPORTED vim
+\%o123 octal character 123 NOT SUPPORTED vim
+\%u1234 Unicode character 0x1234 NOT SUPPORTED vim
+\%U12345678 Unicode character 0x12345678 NOT SUPPORTED vim
+
+Character class elements:
+x single character
+A-Z character range (inclusive)
+\d Perl character class
+[:foo:] ASCII character class «foo»
+\p{Foo} Unicode character class «Foo»
+\pF Unicode character class «F» (one-letter name)
+
+Named character classes as character class elements:
+[\d] digits (== \d)
+[^\d] not digits (== \D)
+[\D] not digits (== \D)
+[^\D] not not digits (== \d)
+[[:name:]] named ASCII class inside character class (== [:name:])
+[^[:name:]] named ASCII class inside negated character class (== [:^name:])
+[\p{Name}] named Unicode property inside character class (== \p{Name})
+[^\p{Name}] named Unicode property inside negated character class (== \P{Name})
+
+Perl character classes (all ASCII-only):
+\d digits (== [0-9])
+\D not digits (== [^0-9])
+\s whitespace (== [\t\n\f\r ])
+\S not whitespace (== [^\t\n\f\r ])
+\w word characters (== [0-9A-Za-z_])
+\W not word characters (== [^0-9A-Za-z_])
+
+\h horizontal space NOT SUPPORTED
+\H not horizontal space NOT SUPPORTED
+\v vertical space NOT SUPPORTED
+\V not vertical space NOT SUPPORTED
+
+ASCII character classes:
+[[:alnum:]] alphanumeric (== [0-9A-Za-z])
+[[:alpha:]] alphabetic (== [A-Za-z])
+[[:ascii:]] ASCII (== [\x00-\x7F])
+[[:blank:]] blank (== [\t ])
+[[:cntrl:]] control (== [\x00-\x1F\x7F])
+[[:digit:]] digits (== [0-9])
+[[:graph:]] graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
+[[:lower:]] lower case (== [a-z])
+[[:print:]] printable (== [ -~] == [ [:graph:]])
+[[:punct:]] punctuation (== [!-/:-@[-`{-~])
+[[:space:]] whitespace (== [\t\n\v\f\r ])
+[[:upper:]] upper case (== [A-Z])
+[[:word:]] word characters (== [0-9A-Za-z_])
+[[:xdigit:]] hex digit (== [0-9A-Fa-f])
+
+Unicode character class names--general category:
+C other
+Cc control
+Cf format
+Cn unassigned code points NOT SUPPORTED
+Co private use
+Cs surrogate
+L letter
+LC cased letter NOT SUPPORTED
+L& cased letter NOT SUPPORTED
+Ll lowercase letter
+Lm modifier letter
+Lo other letter
+Lt titlecase letter
+Lu uppercase letter
+M mark
+Mc spacing mark
+Me enclosing mark
+Mn non-spacing mark
+N number
+Nd decimal number
+Nl letter number
+No other number
+P punctuation
+Pc connector punctuation
+Pd dash punctuation
+Pe close punctuation
+Pf final punctuation
+Pi initial punctuation
+Po other punctuation
+Ps open punctuation
+S symbol
+Sc currency symbol
+Sk modifier symbol
+Sm math symbol
+So other symbol
+Z separator
+Zl line separator
+Zp paragraph separator
+Zs space separator
+
+Unicode character class names--scripts:
+Adlam
+Ahom
+Anatolian_Hieroglyphs
+Arabic
+Armenian
+Avestan
+Balinese
+Bamum
+Bassa_Vah
+Batak
+Bengali
+Bhaiksuki
+Bopomofo
+Brahmi
+Braille
+Buginese
+Buhid
+Canadian_Aboriginal
+Carian
+Caucasian_Albanian
+Chakma
+Cham
+Cherokee
+Chorasmian
+Common
+Coptic
+Cuneiform
+Cypriot
+Cypro_Minoan
+Cyrillic
+Deseret
+Devanagari
+Dives_Akuru
+Dogra
+Duployan
+Egyptian_Hieroglyphs
+Elbasan
+Elymaic
+Ethiopic
+Georgian
+Glagolitic
+Gothic
+Grantha
+Greek
+Gujarati
+Gunjala_Gondi
+Gurmukhi
+Han
+Hangul
+Hanifi_Rohingya
+Hanunoo
+Hatran
+Hebrew
+Hiragana
+Imperial_Aramaic
+Inherited
+Inscriptional_Pahlavi
+Inscriptional_Parthian
+Javanese
+Kaithi
+Kannada
+Katakana
+Kawi
+Kayah_Li
+Kharoshthi
+Khitan_Small_Script
+Khmer
+Khojki
+Khudawadi
+Lao
+Latin
+Lepcha
+Limbu
+Linear_A
+Linear_B
+Lisu
+Lycian
+Lydian
+Mahajani
+Makasar
+Malayalam
+Mandaic
+Manichaean
+Marchen
+Masaram_Gondi
+Medefaidrin
+Meetei_Mayek
+Mende_Kikakui
+Meroitic_Cursive
+Meroitic_Hieroglyphs
+Miao
+Modi
+Mongolian
+Mro
+Multani
+Myanmar
+Nabataean
+Nag_Mundari
+Nandinagari
+New_Tai_Lue
+Newa
+Nko
+Nushu
+Nyiakeng_Puachue_Hmong
+Ogham
+Ol_Chiki
+Old_Hungarian
+Old_Italic
+Old_North_Arabian
+Old_Permic
+Old_Persian
+Old_Sogdian
+Old_South_Arabian
+Old_Turkic
+Old_Uyghur
+Oriya
+Osage
+Osmanya
+Pahawh_Hmong
+Palmyrene
+Pau_Cin_Hau
+Phags_Pa
+Phoenician
+Psalter_Pahlavi
+Rejang
+Runic
+Samaritan
+Saurashtra
+Sharada
+Shavian
+Siddham
+SignWriting
+Sinhala
+Sogdian
+Sora_Sompeng
+Soyombo
+Sundanese
+Syloti_Nagri
+Syriac
+Tagalog
+Tagbanwa
+Tai_Le
+Tai_Tham
+Tai_Viet
+Takri
+Tamil
+Tangsa
+Tangut
+Telugu
+Thaana
+Thai
+Tibetan
+Tifinagh
+Tirhuta
+Toto
+Ugaritic
+Vai
+Vithkuqi
+Wancho
+Warang_Citi
+Yezidi
+Yi
+Zanabazar_Square
+
+Vim character classes:
+\i identifier character NOT SUPPORTED vim
+\I «\i» except digits NOT SUPPORTED vim
+\k keyword character NOT SUPPORTED vim
+\K «\k» except digits NOT SUPPORTED vim
+\f file name character NOT SUPPORTED vim
+\F «\f» except digits NOT SUPPORTED vim
+\p printable character NOT SUPPORTED vim
+\P «\p» except digits NOT SUPPORTED vim
+\s whitespace character (== [ \t]) NOT SUPPORTED vim
+\S non-white space character (== [^ \t]) NOT SUPPORTED vim
+\d digits (== [0-9]) vim
+\D not «\d» vim
+\x hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
+\X not «\x» NOT SUPPORTED vim
+\o octal digits (== [0-7]) NOT SUPPORTED vim
+\O not «\o» NOT SUPPORTED vim
+\w word character vim
+\W not «\w» vim
+\h head of word character NOT SUPPORTED vim
+\H not «\h» NOT SUPPORTED vim
+\a alphabetic NOT SUPPORTED vim
+\A not «\a» NOT SUPPORTED vim
+\l lowercase NOT SUPPORTED vim
+\L not lowercase NOT SUPPORTED vim
+\u uppercase NOT SUPPORTED vim
+\U not uppercase NOT SUPPORTED vim
+\_x «\x» plus newline, for any «x» NOT SUPPORTED vim
+
+Vim flags:
+\c ignore case NOT SUPPORTED vim
+\C match case NOT SUPPORTED vim
+\m magic NOT SUPPORTED vim
+\M nomagic NOT SUPPORTED vim
+\v verymagic NOT SUPPORTED vim
+\V verynomagic NOT SUPPORTED vim
+\Z ignore differences in Unicode combining characters NOT SUPPORTED vim
+
+Magic:
+(?{code}) arbitrary Perl code NOT SUPPORTED perl
+(??{code}) postponed arbitrary Perl code NOT SUPPORTED perl
+(?n) recursive call to regexp capturing group «n» NOT SUPPORTED
+(?+n) recursive call to relative group «+n» NOT SUPPORTED
+(?-n) recursive call to relative group «-n» NOT SUPPORTED
+(?C) PCRE callout NOT SUPPORTED pcre
+(?R) recursive call to entire regexp (== (?0)) NOT SUPPORTED
+(?&name) recursive call to named group NOT SUPPORTED
+(?P=name) named backreference NOT SUPPORTED
+(?P>name) recursive call to named group NOT SUPPORTED
+(?(cond)true|false) conditional branch NOT SUPPORTED
+(?(cond)true) conditional branch NOT SUPPORTED
+(*ACCEPT) make regexps more like Prolog NOT SUPPORTED
+(*COMMIT) NOT SUPPORTED
+(*F) NOT SUPPORTED
+(*FAIL) NOT SUPPORTED
+(*MARK) NOT SUPPORTED
+(*PRUNE) NOT SUPPORTED
+(*SKIP) NOT SUPPORTED
+(*THEN) NOT SUPPORTED
+(*ANY) set newline convention NOT SUPPORTED
+(*ANYCRLF) NOT SUPPORTED
+(*CR) NOT SUPPORTED
+(*CRLF) NOT SUPPORTED
+(*LF) NOT SUPPORTED
+(*BSR_ANYCRLF) set \R convention NOT SUPPORTED pcre
+(*BSR_UNICODE) NOT SUPPORTED pcre
+
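
Reviewer note: a few rows of the imported reference can be spot-checked without building RE2 itself. The following is a minimal sketch in Go, on the assumption that Go's standard `regexp` package is an acceptable stand-in; its documentation states that it accepts the RE2 syntax except for `\C`. The helper names `compiles` and `firstMatch` are hypothetical and exist only for this illustration. The C++ library under third_party/re2 is expected to agree on these cases, but that was not verified here.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
// (Hypothetical helper; Go's regexp is assumed as a stand-in for RE2.)
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

// firstMatch returns the leftmost match of pattern in text,
// or "" if the pattern is rejected or nothing matches.
func firstMatch(pattern, text string) string {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return ""
	}
	return re.FindString(text)
}

func main() {
	// Repetitions: "prefer more" versus "prefer fewer".
	fmt.Println(firstMatch(`a+`, "aaa"))  // aaa
	fmt.Println(firstMatch(`a+?`, "aaa")) // a

	// Flag U swaps the meaning of x+ and x+?.
	fmt.Println(firstMatch(`(?U)a+`, "aaa")) // a

	// Composites: x|y prefers x.
	fmt.Println(firstMatch(`a|ab`, "ab")) // a

	// Implementation restriction: repetition counts above 1000 are rejected.
	fmt.Println(compiles(`a{1000}`)) // true
	fmt.Println(compiles(`a{1001}`)) // false

	// Rows marked NOT SUPPORTED: backreferences and lookahead.
	fmt.Println(compiles(`(a)\1`))  // false
	fmt.Println(compiles(`a(?=b)`)) // false

	// Flag s lets "." match a newline (default false).
	fmt.Println(firstMatch(`a.b`, "a\nb") == "")         // true
	fmt.Println(firstMatch(`(?s)a.b`, "a\nb") == "a\nb") // true

	// Unicode script class.
	fmt.Println(firstMatch(`\p{Greek}+`, "abc αβγ")) // αβγ
}
```

Each check maps to one row or paragraph of syntax.txt, so a future re-import that changes the documented behavior can be compared against the same cases.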