author: Cronet Mainline Eng <cronet-mainline-eng+copybara@google.com> 2024-02-02 09:37:13 +0000
committer: Chidera Olibie <colibie@google.com> 2024-02-02 09:53:24 +0000
commit: 5cfdd35118d5a23349255971e97737e32895ec0f
tree: f6b803e3a8bbddaf4814d1a43930799c3d7f4d8e /third_party/re2/src/doc/syntax.txt
parent: abce8a39488511c10b95ac52d1a3fdd2e886da83
Cronet 121.0.6167.71: import third_party/re2

Bug: b/322154153
FolderOrigin-RevId: /tmp/copybara-origin/src
Change-Id: Ic5f3b7c7578bf4e12b03944d863325abfc88853a
Diffstat (limited to 'third_party/re2/src/doc/syntax.txt')
-rw-r--r-- third_party/re2/src/doc/syntax.txt | 463
1 files changed, 463 insertions, 0 deletions
diff --git a/third_party/re2/src/doc/syntax.txt b/third_party/re2/src/doc/syntax.txt
new file mode 100644
index 000000000..6070efd96
--- /dev/null
+++ b/third_party/re2/src/doc/syntax.txt
@@ -0,0 +1,463 @@
+RE2 regular expression syntax reference
+---------------------------------------
+
+Single characters:
+. any character, possibly including newline (s=true)
+[xyz] character class
+[^xyz] negated character class
+\d Perl character class
+\D negated Perl character class
+[[:alpha:]] ASCII character class
+[[:^alpha:]] negated ASCII character class
+\pN Unicode character class (one-letter name)
+\p{Greek} Unicode character class
+\PN negated Unicode character class (one-letter name)
+\P{Greek} negated Unicode character class
+
+Composites:
+xy «x» followed by «y»
+x|y «x» or «y» (prefer «x»)
+
+Repetitions:
+x* zero or more «x», prefer more
+x+ one or more «x», prefer more
+x? zero or one «x», prefer one
+x{n,m} «n» or «n»+1 or ... or «m» «x», prefer more
+x{n,} «n» or more «x», prefer more
+x{n} exactly «n» «x»
+x*? zero or more «x», prefer fewer
+x+? one or more «x», prefer fewer
+x?? zero or one «x», prefer zero
+x{n,m}? «n» or «n»+1 or ... or «m» «x», prefer fewer
+x{n,}? «n» or more «x», prefer fewer
+x{n}? exactly «n» «x»
+x{} (== x*) NOT SUPPORTED vim
+x{-} (== x*?) NOT SUPPORTED vim
+x{-n} (== x{n}?) NOT SUPPORTED vim
+x= (== x?) NOT SUPPORTED vim
+
+Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}»
+reject forms that create a minimum or maximum repetition count above 1000.
+Unlimited repetitions are not subject to this restriction.
+
+Possessive repetitions:
+x*+ zero or more «x», possessive NOT SUPPORTED
+x++ one or more «x», possessive NOT SUPPORTED
+x?+ zero or one «x», possessive NOT SUPPORTED
+x{n,m}+ «n» or ... or «m» «x», possessive NOT SUPPORTED
+x{n,}+ «n» or more «x», possessive NOT SUPPORTED
+x{n}+ exactly «n» «x», possessive NOT SUPPORTED
+
+Grouping:
+(re) numbered capturing group (submatch)
+(?P<name>re) named & numbered capturing group (submatch)
+(?<name>re) named & numbered capturing group (submatch)
+(?'name're) named & numbered capturing group (submatch) NOT SUPPORTED
+(?:re) non-capturing group
+(?flags) set flags within current group; non-capturing
+(?flags:re) set flags during re; non-capturing
+(?#text) comment NOT SUPPORTED
+(?|x|y|z) branch numbering reset NOT SUPPORTED
+(?>re) possessive match of «re» NOT SUPPORTED
+re@> possessive match of «re» NOT SUPPORTED vim
+%(re) non-capturing group NOT SUPPORTED vim
+
+Flags:
+i case-insensitive (default false)
+m multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
+s let «.» match «\n» (default false)
+U ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)
+Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).
+
+Empty strings:
+^ at beginning of text or line («m»=true)
+$ at end of text (like «\z» not «\Z») or line («m»=true)
+\A at beginning of text
+\b at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
+\B not at ASCII word boundary
+\G at beginning of subtext being searched NOT SUPPORTED pcre
+\G at end of last match NOT SUPPORTED perl
+\Z at end of text, or before newline at end of text NOT SUPPORTED
+\z at end of text
+(?=re) before text matching «re» NOT SUPPORTED
+(?!re) before text not matching «re» NOT SUPPORTED
+(?<=re) after text matching «re» NOT SUPPORTED
+(?<!re) after text not matching «re» NOT SUPPORTED
+re& before text matching «re» NOT SUPPORTED vim
+re@= before text matching «re» NOT SUPPORTED vim
+re@! before text not matching «re» NOT SUPPORTED vim
+re@<= after text matching «re» NOT SUPPORTED vim
+re@<! after text not matching «re» NOT SUPPORTED vim
+\zs sets start of match (= \K) NOT SUPPORTED vim
+\ze sets end of match NOT SUPPORTED vim
+\%^ beginning of file NOT SUPPORTED vim
+\%$ end of file NOT SUPPORTED vim
+\%V on screen NOT SUPPORTED vim
+\%# cursor position NOT SUPPORTED vim
+\%'m mark «m» position NOT SUPPORTED vim
+\%23l in line 23 NOT SUPPORTED vim
+\%23c in column 23 NOT SUPPORTED vim
+\%23v in virtual column 23 NOT SUPPORTED vim
+
+Escape sequences:
+\a bell (== \007)
+\f form feed (== \014)
+\t horizontal tab (== \011)
+\n newline (== \012)
+\r carriage return (== \015)
+\v vertical tab character (== \013)
+\* literal «*», for any punctuation character «*»
+\123 octal character code (up to three digits)
+\x7F hex character code (exactly two digits)
+\x{10FFFF} hex character code
+\C match a single byte even in UTF-8 mode
+\Q...\E literal text «...» even if «...» has punctuation
+
+\1 backreference NOT SUPPORTED
+\b backspace NOT SUPPORTED (use «\010»)
+\cK control char ^K NOT SUPPORTED (use «\013» etc)
+\e escape NOT SUPPORTED (use «\033»)
+\g1 backreference NOT SUPPORTED
+\g{1} backreference NOT SUPPORTED
+\g{+1} backreference NOT SUPPORTED
+\g{-1} backreference NOT SUPPORTED
+\g{name} named backreference NOT SUPPORTED
+\g<name> subroutine call NOT SUPPORTED
+\g'name' subroutine call NOT SUPPORTED
+\k<name> named backreference NOT SUPPORTED
+\k'name' named backreference NOT SUPPORTED
+\lX lowercase «X» NOT SUPPORTED
+\ux uppercase «x» NOT SUPPORTED
+\L...\E lowercase text «...» NOT SUPPORTED
+\K reset beginning of «$0» NOT SUPPORTED
+\N{name} named Unicode character NOT SUPPORTED
+\R line break NOT SUPPORTED
+\U...\E upper case text «...» NOT SUPPORTED
+\X extended Unicode sequence NOT SUPPORTED
+
+\%d123 decimal character 123 NOT SUPPORTED vim
+\%xFF hex character FF NOT SUPPORTED vim
+\%o123 octal character 123 NOT SUPPORTED vim
+\%u1234 Unicode character 0x1234 NOT SUPPORTED vim
+\%U12345678 Unicode character 0x12345678 NOT SUPPORTED vim
+
+Character class elements:
+x single character
+A-Z character range (inclusive)
+\d Perl character class
+[:foo:] ASCII character class «foo»
+\p{Foo} Unicode character class «Foo»
+\pF Unicode character class «F» (one-letter name)
+
+Named character classes as character class elements:
+[\d] digits (== \d)
+[^\d] not digits (== \D)
+[\D] not digits (== \D)
+[^\D] not not digits (== \d)
+[[:name:]] named ASCII class inside character class (== [:name:])
+[^[:name:]] named ASCII class inside negated character class (== [:^name:])
+[\p{Name}] named Unicode property inside character class (== \p{Name})
+[^\p{Name}] named Unicode property inside negated character class (== \P{Name})
+
+Perl character classes (all ASCII-only):
+\d digits (== [0-9])
+\D not digits (== [^0-9])
+\s whitespace (== [\t\n\f\r ])
+\S not whitespace (== [^\t\n\f\r ])
+\w word characters (== [0-9A-Za-z_])
+\W not word characters (== [^0-9A-Za-z_])
+
+\h horizontal space NOT SUPPORTED
+\H not horizontal space NOT SUPPORTED
+\v vertical space NOT SUPPORTED
+\V not vertical space NOT SUPPORTED
+
+ASCII character classes:
+[[:alnum:]] alphanumeric (== [0-9A-Za-z])
+[[:alpha:]] alphabetic (== [A-Za-z])
+[[:ascii:]] ASCII (== [\x00-\x7F])
+[[:blank:]] blank (== [\t ])
+[[:cntrl:]] control (== [\x00-\x1F\x7F])
+[[:digit:]] digits (== [0-9])
+[[:graph:]] graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
+[[:lower:]] lower case (== [a-z])
+[[:print:]] printable (== [ -~] == [ [:graph:]])
+[[:punct:]] punctuation (== [!-/:-@[-`{-~])
+[[:space:]] whitespace (== [\t\n\v\f\r ])
+[[:upper:]] upper case (== [A-Z])
+[[:word:]] word characters (== [0-9A-Za-z_])
+[[:xdigit:]] hex digit (== [0-9A-Fa-f])
+
+Unicode character class names--general category:
+C other
+Cc control
+Cf format
+Cn unassigned code points NOT SUPPORTED
+Co private use
+Cs surrogate
+L letter
+LC cased letter NOT SUPPORTED
+L& cased letter NOT SUPPORTED
+Ll lowercase letter
+Lm modifier letter
+Lo other letter
+Lt titlecase letter
+Lu uppercase letter
+M mark
+Mc spacing mark
+Me enclosing mark
+Mn non-spacing mark
+N number
+Nd decimal number
+Nl letter number
+No other number
+P punctuation
+Pc connector punctuation
+Pd dash punctuation
+Pe close punctuation
+Pf final punctuation
+Pi initial punctuation
+Po other punctuation
+Ps open punctuation
+S symbol
+Sc currency symbol
+Sk modifier symbol
+Sm math symbol
+So other symbol
+Z separator
+Zl line separator
+Zp paragraph separator
+Zs space separator
+
+Unicode character class names--scripts:
+Adlam
+Ahom
+Anatolian_Hieroglyphs
+Arabic
+Armenian
+Avestan
+Balinese
+Bamum
+Bassa_Vah
+Batak
+Bengali
+Bhaiksuki
+Bopomofo
+Brahmi
+Braille
+Buginese
+Buhid
+Canadian_Aboriginal
+Carian
+Caucasian_Albanian
+Chakma
+Cham
+Cherokee
+Chorasmian
+Common
+Coptic
+Cuneiform
+Cypriot
+Cypro_Minoan
+Cyrillic
+Deseret
+Devanagari
+Dives_Akuru
+Dogra
+Duployan
+Egyptian_Hieroglyphs
+Elbasan
+Elymaic
+Ethiopic
+Georgian
+Glagolitic
+Gothic
+Grantha
+Greek
+Gujarati
+Gunjala_Gondi
+Gurmukhi
+Han
+Hangul
+Hanifi_Rohingya
+Hanunoo
+Hatran
+Hebrew
+Hiragana
+Imperial_Aramaic
+Inherited
+Inscriptional_Pahlavi
+Inscriptional_Parthian
+Javanese
+Kaithi
+Kannada
+Katakana
+Kawi
+Kayah_Li
+Kharoshthi
+Khitan_Small_Script
+Khmer
+Khojki
+Khudawadi
+Lao
+Latin
+Lepcha
+Limbu
+Linear_A
+Linear_B
+Lisu
+Lycian
+Lydian
+Mahajani
+Makasar
+Malayalam
+Mandaic
+Manichaean
+Marchen
+Masaram_Gondi
+Medefaidrin
+Meetei_Mayek
+Mende_Kikakui
+Meroitic_Cursive
+Meroitic_Hieroglyphs
+Miao
+Modi
+Mongolian
+Mro
+Multani
+Myanmar
+Nabataean
+Nag_Mundari
+Nandinagari
+New_Tai_Lue
+Newa
+Nko
+Nushu
+Nyiakeng_Puachue_Hmong
+Ogham
+Ol_Chiki
+Old_Hungarian
+Old_Italic
+Old_North_Arabian
+Old_Permic
+Old_Persian
+Old_Sogdian
+Old_South_Arabian
+Old_Turkic
+Old_Uyghur
+Oriya
+Osage
+Osmanya
+Pahawh_Hmong
+Palmyrene
+Pau_Cin_Hau
+Phags_Pa
+Phoenician
+Psalter_Pahlavi
+Rejang
+Runic
+Samaritan
+Saurashtra
+Sharada
+Shavian
+Siddham
+SignWriting
+Sinhala
+Sogdian
+Sora_Sompeng
+Soyombo
+Sundanese
+Syloti_Nagri
+Syriac
+Tagalog
+Tagbanwa
+Tai_Le
+Tai_Tham
+Tai_Viet
+Takri
+Tamil
+Tangsa
+Tangut
+Telugu
+Thaana
+Thai
+Tibetan
+Tifinagh
+Tirhuta
+Toto
+Ugaritic
+Vai
+Vithkuqi
+Wancho
+Warang_Citi
+Yezidi
+Yi
+Zanabazar_Square
+
+Vim character classes:
+\i identifier character NOT SUPPORTED vim
+\I «\i» except digits NOT SUPPORTED vim
+\k keyword character NOT SUPPORTED vim
+\K «\k» except digits NOT SUPPORTED vim
+\f file name character NOT SUPPORTED vim
+\F «\f» except digits NOT SUPPORTED vim
+\p printable character NOT SUPPORTED vim
+\P «\p» except digits NOT SUPPORTED vim
+\s whitespace character (== [ \t]) NOT SUPPORTED vim
+\S non-white space character (== [^ \t]) NOT SUPPORTED vim
+\d digits (== [0-9]) vim
+\D not «\d» vim
+\x hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
+\X not «\x» NOT SUPPORTED vim
+\o octal digits (== [0-7]) NOT SUPPORTED vim
+\O not «\o» NOT SUPPORTED vim
+\w word character vim
+\W not «\w» vim
+\h head of word character NOT SUPPORTED vim
+\H not «\h» NOT SUPPORTED vim
+\a alphabetic NOT SUPPORTED vim
+\A not «\a» NOT SUPPORTED vim
+\l lowercase NOT SUPPORTED vim
+\L not lowercase NOT SUPPORTED vim
+\u uppercase NOT SUPPORTED vim
+\U not uppercase NOT SUPPORTED vim
+\_x «\x» plus newline, for any «x» NOT SUPPORTED vim
+
+Vim flags:
+\c ignore case NOT SUPPORTED vim
+\C match case NOT SUPPORTED vim
+\m magic NOT SUPPORTED vim
+\M nomagic NOT SUPPORTED vim
+\v verymagic NOT SUPPORTED vim
+\V verynomagic NOT SUPPORTED vim
+\Z ignore differences in Unicode combining characters NOT SUPPORTED vim
+
+Magic:
+(?{code}) arbitrary Perl code NOT SUPPORTED perl
+(??{code}) postponed arbitrary Perl code NOT SUPPORTED perl
+(?n) recursive call to regexp capturing group «n» NOT SUPPORTED
+(?+n) recursive call to relative group «+n» NOT SUPPORTED
+(?-n) recursive call to relative group «-n» NOT SUPPORTED
+(?C) PCRE callout NOT SUPPORTED pcre
+(?R) recursive call to entire regexp (== (?0)) NOT SUPPORTED
+(?&name) recursive call to named group NOT SUPPORTED
+(?P=name) named backreference NOT SUPPORTED
+(?P>name) recursive call to named group NOT SUPPORTED
+(?(cond)true|false) conditional branch NOT SUPPORTED
+(?(cond)true) conditional branch NOT SUPPORTED
+(*ACCEPT) make regexps more like Prolog NOT SUPPORTED
+(*COMMIT) NOT SUPPORTED
+(*F) NOT SUPPORTED
+(*FAIL) NOT SUPPORTED
+(*MARK) NOT SUPPORTED
+(*PRUNE) NOT SUPPORTED
+(*SKIP) NOT SUPPORTED
+(*THEN) NOT SUPPORTED
+(*ANY) set newline convention NOT SUPPORTED
+(*ANYCRLF) NOT SUPPORTED
+(*CR) NOT SUPPORTED
+(*CRLF) NOT SUPPORTED
+(*LF) NOT SUPPORTED
+(*BSR_ANYCRLF) set \R convention NOT SUPPORTED pcre
+(*BSR_UNICODE) NOT SUPPORTED pcre
+
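
A few of the rules in the reference above can be checked with a short program. The sketch below is not part of the imported file. It assumes Go's standard `regexp` package, whose documentation states that it accepts the RE2 syntax except for `\C`; behaviour of the C++ library in this tree is expected to agree for these cases but is not exercised here. The helper names `firstMatch`, `matches`, and `compiles` are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" when
// nothing matches. It panics if the pattern does not compile.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// matches reports whether pattern matches anywhere in text.
// It panics if the pattern does not compile.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// compiles reports whether the parser accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Repetitions: x+ prefers more, x+? prefers fewer.
	fmt.Printf("%q %q\n", firstMatch(`a+`, "aaa"), firstMatch(`a+?`, "aaa")) // "aaa" "a"

	// Composites: x|y prefers x when both alternatives could match.
	fmt.Printf("%q\n", firstMatch(`a|ab`, "ab")) // "a"

	// Flag s: "." matches "\n" only when s is set.
	fmt.Println(matches(`a.b`, "a\nb"), matches(`(?s)a.b`, "a\nb")) // false true

	// Flag m: "^" and "$" also match at line boundaries when m is set.
	fmt.Println(matches(`^b$`, "a\nb\nc"), matches(`(?m)^b$`, "a\nb\nc")) // false true

	// Flag i: case-insensitive matching.
	fmt.Println(matches(`(?i)re2`, "RE2")) // true

	// Perl classes are ASCII-only: \d rejects U+0663 (Arabic-Indic digit
	// three), while the Unicode general category Nd accepts it.
	fmt.Println(matches(`\d`, "\u0663"), matches(`\p{Nd}`, "\u0663")) // false true

	// Unicode script classes.
	fmt.Printf("%q\n", firstMatch(`\p{Greek}+`, "abc αβγ")) // "αβγ"

	// Implementation restriction: repetition counts above 1000 are rejected.
	fmt.Println(compiles(`a{1000}`), compiles(`a{1001}`)) // true false

	// Constructs listed as NOT SUPPORTED fail at compile time.
	fmt.Println(compiles(`(a)\1`), compiles(`(?=a)`)) // false false
}
```

Checking `compiles` before use is one way to surface the unsupported constructs (backreferences, lookaround, counts above 1000) as ordinary errors rather than as unexpected match results.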