Diffstat (limited to 'dist2/doc/pcre2compat.3')
-rw-r--r-- dist2/doc/pcre2compat.3 57
1 files changed, 29 insertions, 28 deletions
diff --git a/dist2/doc/pcre2compat.3 b/dist2/doc/pcre2compat.3
index 39ccc2ea..6e448f6c 100644
--- a/dist2/doc/pcre2compat.3
+++ b/dist2/doc/pcre2compat.3
@@ -1,4 +1,4 @@
-.TH PCRE2COMPAT 3 "12 February 2019" "PCRE2 10.33"
+.TH PCRE2COMPAT 3 "28 July 2018" "PCRE2 10.32"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH "DIFFERENCES BETWEEN PCRE2 AND PERL"
@@ -23,9 +23,10 @@ character is not "a" three times (in principle; PCRE2 optimizes this to run the
assertion just once). Perl allows some repeat quantifiers on other assertions,
for example, \eb* (but not \eb{3}), but these do not seem to have any use.
.P
-3. Capture groups that occur inside negative lookaround assertions are counted,
-but their entries in the offsets vector are set only when a negative assertion
-is a condition that has a matching branch (that is, the condition is false).
+3. Capturing subpatterns that occur inside negative lookaround assertions are
+counted, but their entries in the offsets vector are set only when a negative
+assertion is a condition that has a matching branch (that is, the condition is
+false).
.P
4. The following Perl escape sequences are not supported: \eF, \el, \eL, \eu,
\eU, and \eN when followed by a character name. \eN on its own, matching a
@@ -33,9 +34,8 @@ non-newline character, and \eN{U+dd..}, matching a Unicode code point, are
supported. The escapes that modify the case of following letters are
implemented by Perl's general string-handling and are not part of its pattern
matching engine. If any of these are encountered by PCRE2, an error is
-generated by default. However, if either of the PCRE2_ALT_BSUX or
-PCRE2_EXTRA_ALT_BSUX options is set, \eU and \eu are interpreted as ECMAScript
-interprets them.
+generated by default. However, if the PCRE2_ALT_BSUX option is set, \eU and \eu
+are interpreted as ECMAScript interprets them.
.P
5. The Perl escape sequences \ep, \eP, and \eX are supported only if PCRE2 is
built with Unicode support (the default). The properties that can be tested
@@ -79,13 +79,13 @@ documentation for details.
to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
into subroutine calls is now supported, as in Perl.
.P
-9. If any of the backtracking control verbs are used in a group that is called
-as a subroutine (whether or not recursively), their effect is confined to that
-group; it does not extend to the surrounding pattern. This is not always the
-case in Perl. In particular, if (*THEN) is present in a group that is called as
-a subroutine, its action is limited to that group, even if the group does not
-contain any | characters. Note that such groups are processed as anchored
-at the point where they are tested.
+9. If any of the backtracking control verbs are used in a subpattern that is
+called as a subroutine (whether or not recursively), their effect is confined
+to that subpattern; it does not extend to the surrounding pattern. This is not
+always the case in Perl. In particular, if (*THEN) is present in a group that
+is called as a subroutine, its action is limited to that group, even if the
+group does not contain any | characters. Note that such subpatterns are
+processed as anchored at the point where they are tested.
.P
10. If a pattern contains more than one backtracking control verb, the first
one that is backtracked onto acts. For example, in the pattern
@@ -101,20 +101,21 @@ strings when part of a pattern is repeated. For example, matching "aba" against
the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE2 it is set to
"b".
.P
-13. PCRE2's handling of duplicate capture group numbers and names is not as
-general as Perl's. This is a consequence of the fact the PCRE2 works internally
-just with numbers, using an external table to translate between numbers and
-names. In particular, a pattern such as (?|(?<a>A)|(?<b>B), where the two
-capture groups have the same number but different names, is not supported, and
-causes an error at compile time. If it were allowed, it would not be possible
-to distinguish which group matched, because both names map to capture group
-number 1. To avoid this confusing situation, an error is given at compile time.
+13. PCRE2's handling of duplicate subpattern numbers and duplicate subpattern
+names is not as general as Perl's. This is a consequence of the fact the PCRE2
+works internally just with numbers, using an external table to translate
+between numbers and names. In particular, a pattern such as (?|(?<a>A)|(?<b>B),
+where the two capturing parentheses have the same number but different names,
+is not supported, and causes an error at compile time. If it were allowed, it
+would not be possible to distinguish which parentheses matched, because both
+names map to capturing subpattern number 1. To avoid this confusing situation,
+an error is given at compile time.
.P
14. Perl used to recognize comments in some places that PCRE2 does not, for
-example, between the ( and ? at the start of a group. If the /x modifier is
-set, Perl allowed white space between ( and ? though the latest Perls give an
-error (for a while it was just deprecated). There may still be some cases where
-Perl behaves differently.
+example, between the ( and ? at the start of a subpattern. If the /x modifier
+is set, Perl allowed white space between ( and ? though the latest Perls give
+an error (for a while it was just deprecated). There may still be some cases
+where Perl behaves differently.
.P
15. Perl, when in warning mode, gives warnings for character classes such as
[A-\ed] or [a-[:digit:]]. It then treats the hyphens as literals. PCRE2 has no
@@ -199,6 +200,6 @@ Cambridge, England.
.rs
.sp
.nf
-Last updated: 12 February 2019
-Copyright (c) 1997-2019 University of Cambridge.
+Last updated: 28 July 2018
+Copyright (c) 1997-2018 University of Cambridge.
.fi