add PIRegularExpression

2025-08-11 14:23:29 +03:00
parent 91955d44fa
commit 654c0847b2
481 changed files with 434858 additions and 0 deletions

BIN
3rd/pcre2/testdata/grepbinary vendored Normal file

Binary file not shown.

3
3rd/pcre2/testdata/grepfilelist vendored Normal file

@@ -0,0 +1,3 @@
testdata/grepinputv
testdata/grepinputx

643
3rd/pcre2/testdata/grepinput vendored Normal file

@@ -0,0 +1,643 @@
This is a file of miscellaneous text that is used as test data for checking
that the pcregrep command is working correctly. The file must be more than
24KiB long so that it needs more than a single read() call to process it. New
features should be added at the end, because some of the tests involve the
output of line numbers, and we don't want these to change.
PATTERN at the start of a line.
In the middle of a line, PATTERN appears.
This pattern is in lower case.
Here follows a whole lot of stuff that makes the file over 24KiB long.
-------------------------------------------------------------------------------
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick
brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
-------------------------------------------------------------------------------
aaaaa0
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
bbbbbb
cccccccccccccccccccccccccccccccccccccccccc
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
eeeee
aaaaa2
ffffffffff
This is a line before the binary zero.
This line contains a binary zero here >< for testing.
This is a line after the binary zero.
ABOVE the elephant
ABOVE
ABOVE theatre
AB.VE
AB.VE the turtle
010203040506
match 1:
a
match 2:
b
match 3:
c
match 4:
d
match 5:
e
Rhubarb
Custard Tart
zxc
cvb
bnm
asd
qwe
ert
tyu
uio
ggg
asd
dfg
ghj
jkl
abx
def
ghi
xyz
PUT NEW DATA ABOVE THIS LINE.
=============================
Check up on PATTERN near the end.
This is the last line of this file.

15
3rd/pcre2/testdata/grepinput3 vendored Normal file

@@ -0,0 +1,15 @@
triple: t1_txt s1_tag s_txt p_tag p_txt o_tag o_txt
triple: t2_txt s1_tag s_txt p_tag p_txt o_tag
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
triple: t3_txt s2_tag s_txt p_tag p_txt o_tag o_txt
triple: t4_txt s1_tag s_txt p_tag p_txt o_tag o_txt
triple: t5_txt s1_tag s_txt p_tag p_txt o_tag
o_txt
triple: t6_txt s2_tag s_txt p_tag p_txt o_tag o_txt
triple: t7_txt s1_tag s_txt p_tag p_txt o_tag o_txt

17
3rd/pcre2/testdata/grepinput8 vendored Normal file

@@ -0,0 +1,17 @@
X one
X two X three X four
X five
X six
X seven…X eightX nineX ten
Before 111
Before 222Before 333…Match
After 111
After 222After 333
And so on and so on
And so on and so on
ſ
ſſſſſ
ÁabcÁ Kk
A

1
3rd/pcre2/testdata/grepinputBad8 vendored Normal file

@@ -0,0 +1 @@
Aက<EFBFBD>CD Z

@@ -0,0 +1 @@
abc<EFBFBD>

BIN
3rd/pcre2/testdata/grepinputC.bz2 vendored Normal file

Binary file not shown.

BIN
3rd/pcre2/testdata/grepinputC.gz vendored Normal file

Binary file not shown.

17
3rd/pcre2/testdata/grepinputM vendored Normal file

@@ -0,0 +1,17 @@
Data file for multiline tests of multiple matches.
start end in between start
end and following
Other stuff
start end in between start
end and following start
end other stuff
start end in between start
end
** These two lines must be last.
start end in between start
end

2
3rd/pcre2/testdata/grepinputUN vendored Normal file

@@ -0,0 +1,2 @@
abcሴdef
xyz

10
3rd/pcre2/testdata/grepinputv vendored Normal file

@@ -0,0 +1,10 @@
The quick brown
fox jumps
over the lazy dog.
This time it jumps and jumps and jumps.
This line contains \E and (regex) *meta* [characters].
The word is cat in this line
The caterpillar sat on the mat
The snowcat is not an animal
A buried feline in the syndicate
trailing spaces

43
3rd/pcre2/testdata/grepinputx vendored Normal file

@@ -0,0 +1,43 @@
This is a second file of input for the pcre2grep tests.
Here is the pattern again.
Pattern
That time it was on a line by itself.
To pat or not to pat, that is the question.
complete pair
of lines
That was a complete pair
of lines all by themselves.
complete pair
of lines
And there they were again, to check line numbers.
one
two
three
four
five
six
seven
eight
nine
ten
eleven
twelve
thirteen
fourteen
fifteen
sixteen
seventeen
eighteen
nineteen
twenty
This line contains pattern not on a line by itself.
This is the last line of this file.

7
3rd/pcre2/testdata/greplist vendored Normal file

@@ -0,0 +1,7 @@
This is a file of patterns for testing the -f option. Don't include any blank
lines because they will match everything! This is no longer true, so have one.
pattern
line by itself
End of the list of patterns.

43
3rd/pcre2/testdata/grepnot.bz2 vendored Normal file

@@ -0,0 +1,43 @@
This is a second file of input for the pcre2grep tests.
Here is the pattern again.
Pattern
That time it was on a line by itself.
To pat or not to pat, that is the question.
complete pair
of lines
That was a complete pair
of lines all by themselves.
complete pair
of lines
And there they were again, to check line numbers.
one
two
three
four
five
six
seven
eight
nine
ten
eleven
twelve
thirteen
fourteen
fifteen
sixteen
seventeen
eighteen
nineteen
twenty
This line contains pattern not on a line by itself.
This is the last line of this file.

1332
3rd/pcre2/testdata/grepoutput vendored Normal file

@@ -0,0 +1,1332 @@
---------------------------- Test 1 ------------------------------
PATTERN at the start of a line.
In the middle of a line, PATTERN appears.
Check up on PATTERN near the end.
RC=0
---------------------------- Test 2 ------------------------------
PATTERN at the start of a line.
RC=0
---------------------------- Test 3 ------------------------------
7:PATTERN at the start of a line.
8:In the middle of a line, PATTERN appears.
10:This pattern is in lower case.
642:Check up on PATTERN near the end.
RC=0
---------------------------- Test 4 ------------------------------
4
RC=0
---------------------------- Test 5 ------------------------------
./testdata/grepinput:7:PATTERN at the start of a line.
./testdata/grepinput:8:In the middle of a line, PATTERN appears.
./testdata/grepinput:10:This pattern is in lower case.
./testdata/grepinput:642:Check up on PATTERN near the end.
./testdata/grepinputx:3:Here is the pattern again.
./testdata/grepinputx:5:Pattern
./testdata/grepinputx:42:This line contains pattern not on a line by itself.
RC=0
---------------------------- Test 6 ------------------------------
7:PATTERN at the start of a line.
8:In the middle of a line, PATTERN appears.
10:This pattern is in lower case.
642:Check up on PATTERN near the end.
3:Here is the pattern again.
5:Pattern
42:This line contains pattern not on a line by itself.
RC=0
---------------------------- Test 7 ------------------------------
./testdata/grepinput
./testdata/grepinputx
RC=0
---------------------------- Test 8 ------------------------------
./testdata/grepinput
RC=0
---------------------------- Test 9 ------------------------------
RC=0
---------------------------- Test 10 -----------------------------
RC=1
---------------------------- Test 11 -----------------------------
1:This is a second file of input for the pcre2grep tests.
2:
4:
5:Pattern
6:That time it was on a line by itself.
7:
8:To pat or not to pat, that is the question.
9:
10:complete pair
11:of lines
12:
13:That was a complete pair
14:of lines all by themselves.
15:
16:complete pair
17:of lines
18:
19:And there they were again, to check line numbers.
20:
21:one
22:two
23:three
24:four
25:five
26:six
27:seven
28:eight
29:nine
30:ten
31:eleven
32:twelve
33:thirteen
34:fourteen
35:fifteen
36:sixteen
37:seventeen
38:eighteen
39:nineteen
40:twenty
41:
43:This is the last line of this file.
RC=0
---------------------------- Test 12 -----------------------------
Pattern
RC=0
---------------------------- Test 13 -----------------------------
Here is the pattern again.
That time it was on a line by itself.
seventeen
This line contains pattern not on a line by itself.
RC=0
---------------------------- Test 14 -----------------------------
./testdata/grepinputx:To pat or not to pat, that is the question.
RC=0
---------------------------- Test 15 -----------------------------
pcre2grep: Error in command-line regex at offset 4: quantifier does not follow a repeatable item
RC=2
---------------------------- Test 16 -----------------------------
pcre2grep: Failed to open ./testdata/nonexistfile: No such file or directory
RC=2
---------------------------- Test 17 -----------------------------
features should be added at the end, because some of the tests involve the
output of line numbers, and we don't want these to change.
RC=0
---------------------------- Test 18 -----------------------------
4:features should be added at the end, because some of the tests involve the
output of line numbers, and we don't want these to change.
583:brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
-------------------------------------------------------------------------------
RC=0
---------------------------- Test 19 -----------------------------
Pattern
RC=0
---------------------------- Test 20 -----------------------------
10:complete pair
of lines
16:complete pair
of lines
RC=0
---------------------------- Test 21 -----------------------------
24:four
25-five
26-six
27-seven
--
34:fourteen
35-fifteen
36-sixteen
37-seventeen
RC=0
---------------------------- Test 22 -----------------------------
21-one
22-two
23-three
24:four
--
31-eleven
32-twelve
33-thirteen
34:fourteen
RC=0
---------------------------- Test 23 -----------------------------
one
two
three
four
five
six
seven
--
eleven
twelve
thirteen
fourteen
fifteen
sixteen
seventeen
RC=0
---------------------------- Test 24 -----------------------------
four
five
six
seven
eight
nine
ten
eleven
twelve
thirteen
fourteen
fifteen
sixteen
seventeen
eighteen
nineteen
twenty
This line contains pattern not on a line by itself.
This is the last line of this file.
RC=0
---------------------------- Test 25 -----------------------------
15-
16-complete pair
17-of lines
18-
19-And there they were again, to check line numbers.
20-
21-one
22-two
23-three
24:four
25-five
26-six
27-seven
28-eight
29-nine
30-ten
31-eleven
32-twelve
33-thirteen
34:fourteen
RC=0
---------------------------- Test 26 -----------------------------
complete pair
of lines
And there they were again, to check line numbers.
one
two
three
four
five
six
seven
eight
nine
ten
eleven
twelve
thirteen
fourteen
fifteen
sixteen
seventeen
eighteen
nineteen
twenty
This line contains pattern not on a line by itself.
This is the last line of this file.
RC=0
---------------------------- Test 27 -----------------------------
four
five
six
seven
eight
nine
ten
eleven
twelve
thirteen
fourteen
fifteen
sixteen
seventeen
eighteen
nineteen
twenty
This line contains pattern not on a line by itself.
This is the last line of this file.
RC=0
---------------------------- Test 28 -----------------------------
14-of lines all by themselves.
15-
16-complete pair
17-of lines
18-
19-And there they were again, to check line numbers.
20-
21-one
22-two
23-three
24:four
25-five
26-six
27-seven
28-eight
29-nine
30-ten
31-eleven
32-twelve
33-thirteen
34:fourteen
RC=0
---------------------------- Test 29 -----------------------------
of lines all by themselves.
complete pair
of lines
And there they were again, to check line numbers.
one
two
three
four
five
six
seven
eight
nine
ten
eleven
twelve
thirteen
fourteen
fifteen
sixteen
seventeen
eighteen
nineteen
twenty
This line contains pattern not on a line by itself.
This is the last line of this file.
RC=0
---------------------------- Test 30 -----------------------------
./testdata/grepinput-4-features should be added at the end, because some of the tests involve the
./testdata/grepinput-5-output of line numbers, and we don't want these to change.
./testdata/grepinput-6-
./testdata/grepinput:7:PATTERN at the start of a line.
./testdata/grepinput:8:In the middle of a line, PATTERN appears.
./testdata/grepinput-9-
./testdata/grepinput:10:This pattern is in lower case.
--
./testdata/grepinput-639-PUT NEW DATA ABOVE THIS LINE.
./testdata/grepinput-640-=============================
./testdata/grepinput-641-
./testdata/grepinput:642:Check up on PATTERN near the end.
--
./testdata/grepinputx-1-This is a second file of input for the pcre2grep tests.
./testdata/grepinputx-2-
./testdata/grepinputx:3:Here is the pattern again.
./testdata/grepinputx-4-
./testdata/grepinputx:5:Pattern
--
./testdata/grepinputx-39-nineteen
./testdata/grepinputx-40-twenty
./testdata/grepinputx-41-
./testdata/grepinputx:42:This line contains pattern not on a line by itself.
RC=0
---------------------------- Test 31 -----------------------------
./testdata/grepinput:7:PATTERN at the start of a line.
./testdata/grepinput:8:In the middle of a line, PATTERN appears.
./testdata/grepinput-9-
./testdata/grepinput:10:This pattern is in lower case.
./testdata/grepinput-11-
./testdata/grepinput-12-Here follows a whole lot of stuff that makes the file over 24KiB long.
./testdata/grepinput-13-
--
./testdata/grepinput:642:Check up on PATTERN near the end.
./testdata/grepinput-643-This is the last line of this file.
--
./testdata/grepinputx:3:Here is the pattern again.
./testdata/grepinputx-4-
./testdata/grepinputx:5:Pattern
./testdata/grepinputx-6-That time it was on a line by itself.
./testdata/grepinputx-7-
./testdata/grepinputx-8-To pat or not to pat, that is the question.
--
./testdata/grepinputx:42:This line contains pattern not on a line by itself.
./testdata/grepinputx-43-This is the last line of this file.
RC=0
---------------------------- Test 32 -----------------------------
./testdata/grepinputx
RC=0
---------------------------- Test 33 -----------------------------
pcre2grep: Failed to open ./testdata/grepnonexist: No such file or directory
RC=2
---------------------------- Test 34 -----------------------------
RC=2
---------------------------- Test 35 -----------------------------
./testdata/grepinput8
./testdata/grepinputx
RC=0
---------------------------- Test 36 -----------------------------
./testdata/grepinput3
./testdata/grepinputx
RC=0
---------------------------- Test 37 -----------------------------
24KiB long so that it needs more than a single read() call to process it. New
aaaaa0
aaaaa2
010203040506
RC=0
======== STDERR ========
pcre2grep: pcre2_match() gave error -47 while matching this text:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
pcre2grep: pcre2_match() gave error -47 while matching this text:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
pcre2grep: Error -46, -47, -53 or -63 means that a resource limit was exceeded.
pcre2grep: Check your regex for nested unlimited loops.
---------------------------- Test 38 ------------------------------
This line contains a binary zero here >< for testing.
RC=0
---------------------------- Test 39 ------------------------------
This is a line before the binary zero.
This line contains a binary zero here >< for testing.
RC=0
---------------------------- Test 40 ------------------------------
This line contains a binary zero here >< for testing.
This is a line after the binary zero.
RC=0
---------------------------- Test 41 ------------------------------
before the binary zero
after the binary zero
RC=0
---------------------------- Test 42 ------------------------------
./testdata/grepinput:595:before the binary zero
./testdata/grepinput:597:after the binary zero
RC=0
---------------------------- Test 43 ------------------------------
595:before
595:zero
596:zero
597:after
597:zero
RC=0
---------------------------- Test 44 ------------------------------
595:before
595:zero
596:zero
597:after
597:zero
RC=0
---------------------------- Test 45 ------------------------------
10:pattern
595:binary
596:binary
597:binary
RC=0
---------------------------- Test 46 ------------------------------
pcre2grep: Error in 1st command-line regex at offset 8: unmatched closing parenthesis
RC=2
pcre2grep: Error in 2nd command-line regex at offset 9: missing closing parenthesis
RC=2
pcre2grep: Error in 3rd command-line regex at offset 9: missing terminating ] for character class
RC=2
pcre2grep: Error in 4th command-line regex at offset 9: missing terminating ] for character class
RC=2
---------------------------- Test 47 ------------------------------
AB.VE
RC=0
---------------------------- Test 48 ------------------------------
ABOVE the elephant
AB.VE
AB.VE the turtle
RC=0
---------------------------- Test 49 ------------------------------
ABOVE the elephant
AB.VE
AB.VE the turtle
PUT NEW DATA ABOVE THIS LINE.
RC=0
---------------------------- Test 50 ------------------------------
RC=1
---------------------------- Test 51 ------------------------------
over the lazy dog.
This time it jumps and jumps and jumps.
This line contains \E and (regex) *meta* [characters].
The word is cat in this line
The caterpillar sat on the mat
The snowcat is not an animal
A buried feline in the syndicate
trailing spaces
RC=0
---------------------------- Test 52 ------------------------------
fox jumps
This time it jumps and jumps and jumps.
RC=0
---------------------------- Test 53 ------------------------------
36976,6
36994,4
37028,4
37070,5
37087,4
RC=0
---------------------------- Test 54 ------------------------------
595:15,6
595:33,4
596:28,4
597:15,5
597:32,4
RC=0
---------------------------- Test 55 -----------------------------
Here is the pattern again.
That time it was on a line by itself.
This line contains pattern not on a line by itself.
RC=0
---------------------------- Test 56 -----------------------------
./testdata/grepinput:456
./testdata/grepinput3:0
./testdata/grepinput8:0
./testdata/grepinputBad8:0
./testdata/grepinputBad8_Trail:0
./testdata/grepinputM:0
./testdata/grepinputUN:0
./testdata/grepinputv:1
./testdata/grepinputx:0
RC=0
---------------------------- Test 57 -----------------------------
./testdata/grepinput:456
./testdata/grepinputv:1
RC=0
---------------------------- Test 58 -----------------------------
PATTERN at the start of a line.
In the middle of a line, PATTERN appears.
Check up on PATTERN near the end.
RC=0
---------------------------- Test 59 -----------------------------
PATTERN at the start of a line.
In the middle of a line, PATTERN appears.
Check up on PATTERN near the end.
RC=0
---------------------------- Test 60 -----------------------------
PATTERN at the start of a line.
In the middle of a line, PATTERN appears.
Check up on PATTERN near the end.
RC=0
---------------------------- Test 61 -----------------------------
PATTERN at the start of a line.
In the middle of a line, PATTERN appears.
Check up on PATTERN near the end.
RC=0
---------------------------- Test 62 -----------------------------
pcre2grep: pcre2_match() gave error -47 while matching text that starts:
This is a file of miscellaneous text that is used as test data for checking
that the pcregrep command is working correctly. The file must be more than
24KiB long so that it needs more than a single re
pcre2grep: Error -46, -47, -53 or -63 means that a resource limit was exceeded.
pcre2grep: Check your regex for nested unlimited loops.
RC=1
---------------------------- Test 63 -----------------------------
pcre2grep: pcre2_match() gave error -53 while matching text that starts:
This is a file of miscellaneous text that is used as test data for checking
that the pcregrep command is working correctly. The file must be more than
24KiB long so that it needs more than a single re
pcre2grep: Error -46, -47, -53 or -63 means that a resource limit was exceeded.
pcre2grep: Check your regex for nested unlimited loops.
RC=1
---------------------------- Test 64 ------------------------------
appears
RC=0
---------------------------- Test 65 ------------------------------
pear
RC=0
---------------------------- Test 66 ------------------------------
RC=0
---------------------------- Test 67 ------------------------------
RC=0
---------------------------- Test 68 ------------------------------
pear
RC=0
---------------------------- Test 69 -----------------------------
1:This is a second file of input for the pcre2grep tests.
2:
4:
5:Pattern
6:That time it was on a line by itself.
7:
8:To pat or not to pat, that is the question.
9:
10:complete pair
11:of lines
12:
13:That was a complete pair
14:of lines all by themselves.
15:
16:complete pair
17:of lines
18:
19:And there they were again, to check line numbers.
20:
21:one
22:two
23:three
24:four
25:five
26:six
27:seven
28:eight
29:nine
30:ten
31:eleven
32:twelve
33:thirteen
34:fourteen
35:fifteen
36:sixteen
37:seventeen
38:eighteen
39:nineteen
40:twenty
41:
43:This is the last line of this file.
RC=0
---------------------------- Test 70 -----------------------------
triple: t1_txt s1_tag s_txt p_tag p_txt o_tag o_txt
triple: t3_txt s2_tag s_txt p_tag p_txt o_tag o_txt
triple: t4_txt s1_tag s_txt p_tag p_txt o_tag o_txt
triple: t6_txt s2_tag s_txt p_tag p_txt o_tag o_txt
RC=0
1:triple: t1_txt s1_tag s_txt p_tag p_txt o_tag o_txt
6:triple: t3_txt s2_tag s_txt p_tag p_txt o_tag o_txt
8:triple: t4_txt s1_tag s_txt p_tag p_txt o_tag o_txt
13:triple: t6_txt s2_tag s_txt p_tag p_txt o_tag o_txt
RC=0
triple: t1_txt s1_tag s_txt p_tag p_txt o_tag o_txt
triple: t3_txt s2_tag s_txt p_tag p_txt o_tag o_txt
triple: t4_txt s1_tag s_txt p_tag p_txt o_tag o_txt
triple: t6_txt s2_tag s_txt p_tag p_txt o_tag o_txt
RC=0
1:triple: t1_txt s1_tag s_txt p_tag p_txt o_tag o_txt
6:triple: t3_txt s2_tag s_txt p_tag p_txt o_tag o_txt
8:triple: t4_txt s1_tag s_txt p_tag p_txt o_tag o_txt
13:triple: t6_txt s2_tag s_txt p_tag p_txt o_tag o_txt
RC=0
---------------------------- Test 71 -----------------------------
01
RC=0
---------------------------- Test 72 -----------------------------
010203040506
RC=0
---------------------------- Test 73 -----------------------------
01
RC=0
---------------------------- Test 74 -----------------------------
01
02
RC=0
---------------------------- Test 75 -----------------------------
010203040506
RC=0
---------------------------- Test 76 -----------------------------
01
02
RC=0
---------------------------- Test 77 -----------------------------
01
03
RC=0
---------------------------- Test 78 -----------------------------
010203040506
RC=0
---------------------------- Test 79 -----------------------------
01
03
RC=0
---------------------------- Test 80 -----------------------------
01
RC=0
---------------------------- Test 81 -----------------------------
010203040506
RC=0
---------------------------- Test 82 -----------------------------
01
RC=0
---------------------------- Test 83 -----------------------------
pcre2grep: line 4 of file ./testdata/grepinput3 is too long for the internal buffer
pcre2grep: the maximum buffer size is 100
pcre2grep: use the --max-buffer-size option to change it
RC=2
---------------------------- Test 84 -----------------------------
testdata/grepinputv:fox jumps
testdata/grepinputx:complete pair
testdata/grepinputx:That was a complete pair
testdata/grepinputx:complete pair
testdata/grepinput3:triple: t7_txt s1_tag s_txt p_tag p_txt o_tag o_txt
RC=0
---------------------------- Test 85 -----------------------------
./testdata/grepinput3:Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
RC=0
---------------------------- Test 86 -----------------------------
Binary file ./testdata/grepbinary matches
RC=0
---------------------------- Test 87 -----------------------------
RC=1
---------------------------- Test 88 -----------------------------
Binary file ./testdata/grepbinary matches
RC=0
---------------------------- Test 89 -----------------------------
RC=1
---------------------------- Test 90 -----------------------------
RC=1
---------------------------- Test 91 -----------------------------
The quick brown fx jumps over the lazy dog.
RC=0
---------------------------- Test 92 -----------------------------
The quick brown fx jumps over the lazy dog.
RC=0
---------------------------- Test 93 -----------------------------
The quick brown fx jumps over the lazy dog.
RC=0
---------------------------- Test 94 -----------------------------
./testdata/grepinput8
./testdata/grepinputx
RC=0
---------------------------- Test 95 -----------------------------
testdata/grepinputx:complete pair
testdata/grepinputx:That was a complete pair
testdata/grepinputx:complete pair
RC=0
---------------------------- Test 96 -----------------------------
./testdata/grepinput3
./testdata/grepinput8
./testdata/grepinputBad8
./testdata/grepinputBad8_Trail
./testdata/grepinputx
RC=0
---------------------------- Test 97 -----------------------------
./testdata/grepinput3
./testdata/grepinputx
RC=0
---------------------------- Test 98 -----------------------------
./testdata/grepinputx
RC=0
---------------------------- Test 99 -----------------------------
./testdata/grepinput3
./testdata/grepinputx
RC=0
---------------------------- Test 100 ------------------------------
./testdata/grepinput:zerothe.
./testdata/grepinput:zeroa
./testdata/grepinput:zerothe.
RC=0
---------------------------- Test 101 ------------------------------
./testdata/grepinput:.|zero|the|.
./testdata/grepinput:zero|a
./testdata/grepinput:.|zero|the|.
RC=0
---------------------------- Test 102 -----------------------------
2:
5:
7:
9:
12:
14:
RC=0
---------------------------- Test 103 -----------------------------
RC=0
---------------------------- Test 104 -----------------------------
2:
5:
7:
9:
12:
14:
RC=0
---------------------------- Test 105 -----------------------------
triple: t1_txt s1_tag s_txt p_tag p_txt o_tag o_txt
triple: t2_txt s1_tag s_txt p_tag p_txt o_tag
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
triple: t3_txt s2_tag s_txt p_tag p_txt o_tag o_txt
triple: t4_txt s1_tag s_txt p_tag p_txt o_tag o_txt
triple: t5_txt s1_tag s_txt p_tag p_txt o_tag
o_txt
triple: t6_txt s2_tag s_txt p_tag p_txt o_tag o_txt
triple: t7_txt s1_tag s_txt p_tag p_txt o_tag o_txt
RC=0
---------------------------- Test 106 -----------------------------
a
RC=0
---------------------------- Test 107 -----------------------------
1:0,1
2:0,1
2:1,1
2:2,1
2:3,1
2:4,1
RC=0
---------------------------- Test 108 ------------------------------
RC=0
---------------------------- Test 109 -----------------------------
RC=0
---------------------------- Test 110 -----------------------------
match 1:
a
/1/a
match 2:
b
/2/b
match 3:
c
/3/c
match 4:
d
/4/d
match 5:
e
/5/e
RC=0
---------------------------- Test 111 -----------------------------
607:0,12
609:0,12
611:0,12
613:0,12
615:0,12
RC=0
---------------------------- Test 112 -----------------------------
37172,12
37184,12
37196,12
37208,12
37220,12
RC=0
---------------------------- Test 113 -----------------------------
480
RC=0
---------------------------- Test 114 -----------------------------
testdata/grepinput:469
testdata/grepinput3:0
testdata/grepinput8:0
testdata/grepinputBad8:0
testdata/grepinputBad8_Trail:0
testdata/grepinputM:2
testdata/grepinputUN:0
testdata/grepinputv:3
testdata/grepinputx:6
TOTAL:480
RC=0
---------------------------- Test 115 -----------------------------
testdata/grepinput:469
testdata/grepinputM:2
testdata/grepinputv:3
testdata/grepinputx:6
TOTAL:480
RC=0
---------------------------- Test 116 -----------------------------
478
RC=0
---------------------------- Test 117 -----------------------------
469
0
0
0
0
2
0
3
6
480
RC=0
---------------------------- Test 118 -----------------------------
testdata/grepinput3
testdata/grepinput8
testdata/grepinputBad8
testdata/grepinputBad8_Trail
testdata/grepinputUN
RC=0
---------------------------- Test 119 -----------------------------
123
456
789
---
abc
def
xyz
---
RC=0
---------------------------- Test 120 ------------------------------
./testdata/grepinput:the binary zero.:zerothe.
./testdata/grepinput:a binary zero:zeroa
./testdata/grepinput:the binary zero.:zerothe.
RC=0
./testdata/grepinput:the binary zero.:zerothe.
./testdata/grepinput:a binary zero:zeroa
./testdata/grepinput:the binary zero.:zerothe.
RC=0
the binary zero.:
RC=0
pcre2grep: Error in output text at offset 2: decimal number expected
RC=2
pcre2grep: Error in output text at offset 3: no character after $
RC=2
pcre2grep: Error in output text at offset 8: too many hex digits
RC=2
pcre2grep: Error in output text at offset 5: missing closing brace
RC=2
pcre2grep: Error in output text at offset 7: code point greater than 0xff is invalid
RC=2
---------------------------- Test 121 -----------------------------
This line contains \E and (regex) *meta* [characters].
RC=0
---------------------------- Test 122 -----------------------------
over the lazy dog.
The word is cat in this line
RC=0
---------------------------- Test 123 -----------------------------
over the lazy dog.
The word is cat in this line
RC=0
---------------------------- Test 124 -----------------------------
3:start end in between start
end and following
7:start end in between start
end and following start
end other stuff
11:start end in between start
end
16:start end in between start
end
RC=0
3:start end in between start
end and following
5-Other stuff
6-
7:start end in between start
end and following start
end other stuff
10-
11:start end in between start
end
14-
15-** These two lines must be last.
16:start end in between start
end
RC=0
3:start end in between start
end and following
7:start end in between start
end and following start
end other stuff
11:start end in between start
end
16:start end in between start
end
RC=0
3:start end in between start
end and following
5-Other stuff
6-
7:start end in between start
end and following start
end other stuff
10-
11:start end in between start
end
14-
15-** These two lines must be last.
16:start end in between start
end
RC=0
---------------------------- Test 125 -----------------------------
abcd
RC=0
abcd
RC=0
abcd
RC=0
abcd
RC=0
abcd
RC=0
---------------------------- Test 126 -----------------------------
ABCXYZ
RC=0
pcre2grep: Error in regex in line 2 of testtemp1grep at offset 4: unmatched closing parenthesis
RC=2
---------------------------- Test 127 -----------------------------
pattern
RC=0
---------------------------- Test 128 -----------------------------
pcre2grep: Requested group 1 cannot be captured.
pcre2grep: Use --om-capture to increase the size of the capture vector.
RC=2
---------------------------- Test 129 -----------------------------
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox
RC=0
---------------------------- Test 130 -----------------------------
fox
fox
fox
fox
RC=0
---------------------------- Test 131 -----------------------------
2
RC=0
---------------------------- Test 132 -----------------------------
match 1:
a
match 2:
b
---
a
RC=0
---------------------------- Test 133 -----------------------------
match 1:
a
match 2:
b
---
match 2:
b
match 3:
c
RC=0
---------------------------- Test 134 -----------------------------
(standard input):2:=AB3CD5=
RC=0
---------------------------- Test 135 -----------------------------
./testdata/grepinputv@The word is cat in this line
RC=0
./testdata/grepinputv@./testdata/grepinputv@RC=0
./testdata/grepinputv@This line contains \E and (regex) *meta* [characters].
./testdata/grepinputv@The word is cat in this line
./testdata/grepinputv@The caterpillar sat on the mat
RC=0
testdata/grepinputM3:start end in between start
end and following
testdata/grepinputM7:start end in between start
end and following start
end other stuff
testdata/grepinputM11:start end in between start
end
testdata/grepinputM16:start end in between start
end
RC=0
---------------------------- Test 136 -----------------------------
pcre2grep: Malformed number "1MK" after -m
Usage: pcre2grep [-AaBCcDdEeFfHhIilLMmNnOoPqrstuUVvwxZ] [long options] [pattern] [files]
Type "pcre2grep --help" for more information and the long options.
RC=2
pcre2grep: Malformed number "1MK" after --max-count
Usage: pcre2grep [-AaBCcDdEeFfHhIilLMmNnOoPqrstuUVvwxZ] [long options] [pattern] [files]
Type "pcre2grep --help" for more information and the long options.
RC=2
---------------------------- Test 137 -----------------------------
Last line
has no newline
RC=0
---------------------------- Test 138 -----------------------------
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: pcre2_match() gave error -63 while matching this text:
AbC
pcre2grep: Too many errors - abandoned.
pcre2grep: Error -46, -47, -53 or -63 means that a resource limit was exceeded.
pcre2grep: Check your regex for nested unlimited loops.
RC=2
---------------------------- Test 139 -----------------------------
fox jumps
RC=0
---------------------------- Test 140 -----------------------------
The quick brown
fox jumps
RC=0
---------------------------- Test 141 -----------------------------
(standard input):This is a line from stdin.
RC=0
---------------------------- Test 142 -----------------------------
pcre2grep: Failed to open /does/not/exist: No such file or directory
RC=2
---------------------------- Test 143 -----------------------------
fox jumps
The word is cat in this line
The caterpillar sat on the mat
The snowcat is not an animal
A buried feline in the syndicate
RC=0
---------------------------- Test 144 -----------------------------
pcre2grep: Failed to open /non/exist: No such file or directory
RC=2
---------------------------- Test 145 -----------------------------
The quick brown
fox jumps
over the lazy dog.
This time it jumps and jumps and jumps.
This line contains \E and (regex) *meta* [characters].
The word is cat in this line
The caterpillar sat on the mat
The snowcat is not an animal
A buried feline in the syndicate
trailing spaces
RC=0
---------------------------- Test 146 -----------------------------
(standard input):A123B
RC=0
A123B
fox jumps
RC=0
Usage: pcre2grep [-AaBCcDdEeFfHhIilLMmNnOoPqrstuUVvwxZ] [long options] [pattern] [files]
Type "pcre2grep --help" for more information and the long options.
RC=2
---------------------------- Test 147 -----------------------------
pcre2grep: Failed to open -nonfile: No such file or directory
RC=2
---------------------------- Test 148 -----------------------------
pcre2grep: Unknown option --nonexist
Usage: pcre2grep [-AaBCcDdEeFfHhIilLMmNnOoPqrstuUVvwxZ] [long options] [pattern] [files]
Type "pcre2grep --help" for more information and the long options.
RC=2
pcre2grep: Unknown option letter '-' in "-n-n-bad"
Usage: pcre2grep [-AaBCcDdEeFfHhIilLMmNnOoPqrstuUVvwxZ] [long options] [pattern] [files]
Type "pcre2grep --help" for more information and the long options.
RC=2
pcre2grep: Data missing after --context
Usage: pcre2grep [-AaBCcDdEeFfHhIilLMmNnOoPqrstuUVvwxZ] [long options] [pattern] [files]
Type "pcre2grep --help" for more information and the long options.
RC=2
pcre2grep: Cannot mix --only-matching, --output, --file-offsets and/or --line-offsets
Usage: pcre2grep [-AaBCcDdEeFfHhIilLMmNnOoPqrstuUVvwxZ] [long options] [pattern] [files]
Type "pcre2grep --help" for more information and the long options.
RC=2
pcre2grep: Unknown colour setting "badvalue"
RC=2
pcre2grep: Invalid newline specifier "badvalue"
RC=2
pcre2grep: Invalid value "badvalue" for -d
RC=2
pcre2grep: Invalid value "badvalue" for -D
RC=2
pcre2grep: --buffer-size must be greater than zero
RC=2
pcre2grep: Error in --exclude regex at offset 7: missing closing parenthesis
RC=2
pcre2grep: Failed to open /non/exist: No such file or directory
RC=2
pcre2grep: Failed to open /non/exist: No such file or directory
RC=2
pcre2grep: Failed to open /non/exist: No such file or directory
RC=2
---------------------------- Test 149 -----------------------------
Binary file ./testdata/grepbinary matches
RC=0
pcre2grep: unknown value "wrong" for binary-files
Usage: pcre2grep [-AaBCcDdEeFfHhIilLMmNnOoPqrstuUVvwxZ] [long options] [pattern] [files]
Type "pcre2grep --help" for more information and the long options.
RC=2
---------------------------- Test 150 -----------------------------
pcre2grep: Failed to set locale locale.bad (obtained from LC_CTYPE)
RC=2
---------------------------- Test 151 -----------------------------
The quick brown
The word is cat in this line
The caterpillar sat on the mat
The snowcat is not an animal
RC=0
---------------------------- Test 152 -----------------------------
24:four
25-five
26-six
27-seven
++
34:fourteen
35-fifteen
36-sixteen
37-seventeen
RC=0
---------------------------- Test 153 -----------------------------
24:four
25-five
26-six
27-seven
34:fourteen
35-fifteen
36-sixteen
37-seventeen
RC=0
---------------------------- Test 154 -----------------------------
RC=1
---------------------------- Test 155 -----------------------------
RC=1
---------------------------- Test 156 -----------------------------
The quick brown
fox jumps
over the lazy dog.
This time it jumps and jumps and jumps.
This line contains \E and (regex) *meta* [characters].
The word is cat in this line
The caterpillar sat on the mat
The snowcat is not an animal
A buried feline in the syndicate
trailing spaces
RC=0
---------------------------- Test 157 -----------------------------
RC=0
---------------------------- Test 158 -----------------------------
trailing spaces
RC=0
---------------------------- Test 159 -----------------------------
trailing spaces
RC=0
---------------------------- Test 160 -----------------------------
622-bnm
623-asd
624-qwe
625:ert
626-tyu
627-uio
628-ggg
629-asd
630-dfg
631-ghj
632:jkl
633-abx
634-def
635-ghi
RC=0
621-cvb
622-bnm
623-asd
624-qwe
625:ert
626-tyu
627-uio
628-ggg
629-asd
630:dfg
631-ghj

47
3rd/pcre2/testdata/grepoutput8 vendored Normal file

@@ -0,0 +1,47 @@
---------------------------- Test U1 ------------------------------
1:X one
2:X two 3:X three 4:X four
5:X five
6:X six
7:X seven…8:X eight9:X nine10:X ten
RC=0
---------------------------- Test U2 ------------------------------
12-Before 111
13-Before 22214-Before 333…15:Match
16-After 111
17-After 22218-After 333
RC=0
---------------------------- Test U3 ------------------------------
21:0,2
22:0,2
22:2,2
22:4,2
22:6,2
22:8,2
RC=0
---------------------------- Test U4 ------------------------------
pcre2grep: pcre2_match() gave error -22 while matching this text:
Aက<EFBFBD>CD Z
UTF-8 error: isolated byte with 0x80 bit set at offset 4
RC=1
---------------------------- Test U5 ------------------------------
CD Z
RC=0
---------------------------- Test U6 -----------------------------
=ǓǤ=
RC=0
---------------------------- Test U7 ------------------------------
ÁabcÁ Kk
RC=0
---------------------------- Test U8 ------------------------------
ÁabcÁ Kk
RC=0
---------------------------- Test U9 ------------------------------
A
A1
RC=0
---------------------------- Test U10 ------------------------------
A1

74
3rd/pcre2/testdata/grepoutputC vendored Normal file

@@ -0,0 +1,74 @@
--- Test 1 ---
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
The quick brown
Arg1: [T] [his] [s] Arg2: |T| () () (0)
This time it jumps and jumps and jumps.
Arg1: [T] [his] [s] Arg2: |T| () () (0)
This line contains \E and (regex) *meta* [characters].
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
The word is cat in this line
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
The caterpillar sat on the mat
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
The snowcat is not an animal
RC=0
--- Test 2 ---
Arg1: [qu] [qu]
The quick brown
Arg1: [ t] [ t]
This time it jumps and jumps and jumps.
Arg1: [ l] [ l]
This line contains \E and (regex) *meta* [characters].
Arg1: [wo] [wo]
The word is cat in this line
Arg1: [ca] [ca]
The caterpillar sat on the mat
Arg1: [sn] [sn]
The snowcat is not an animal
RC=0
--- Test 3 ---
0:T
The quick brown
0:T
This time it jumps and jumps and jumps.
0:T
This line contains \E and (regex) *meta* [characters].
0:T
The word is cat in this line
0:T
The caterpillar sat on the mat
0:T
The snowcat is not an animal
RC=0
--- Test 4 ---
0:T
The quick brown
0:T
This time it jumps and jumps and jumps.
0:T
This line contains \E and (regex) *meta* [characters].
0:T
The word is cat in this line
0:T
The caterpillar sat on the mat
0:T
The snowcat is not an animal
RC=0
--- Test 5 ---
T
T
T
T
T
T
RC=1
--- Test 6 ---
0:T:AA
The quick brown
RC=0

50
3rd/pcre2/testdata/grepoutputCN vendored Normal file

@@ -0,0 +1,50 @@
--- Test 1 ---
The quick brown
This time it jumps and jumps and jumps.
This line contains \E and (regex) *meta* [characters].
The word is cat in this line
The caterpillar sat on the mat
The snowcat is not an animal
RC=0
--- Test 2 ---
The quick brown
This time it jumps and jumps and jumps.
This line contains \E and (regex) *meta* [characters].
The word is cat in this line
The caterpillar sat on the mat
The snowcat is not an animal
RC=0
--- Test 3 ---
0:T
The quick brown
0:T
This time it jumps and jumps and jumps.
0:T
This line contains \E and (regex) *meta* [characters].
0:T
The word is cat in this line
0:T
The caterpillar sat on the mat
0:T
The snowcat is not an animal
RC=0
--- Test 4 ---
The quick brown
This time it jumps and jumps and jumps.
This line contains \E and (regex) *meta* [characters].
The word is cat in this line
The caterpillar sat on the mat
The snowcat is not an animal
RC=0
--- Test 5 ---
T
T
T
T
T
T
RC=1
--- Test 6 ---
0:T:AA
The quick brown
RC=0

22
3rd/pcre2/testdata/grepoutputCNU vendored Normal file

@@ -0,0 +1,22 @@
--- Test 1 ---
0:¦
The quick brown
0:¦
This time it jumps and jumps and jumps.
0:¦
This line contains \E and (regex) *meta* [characters].
0:¦
The word is cat in this line
0:¦
The caterpillar sat on the mat
0:¦
The snowcat is not an animal
RC=0
--- Test 2 ---
The quick brown
This time it jumps and jumps and jumps.
This line contains \E and (regex) *meta* [characters].
The word is cat in this line
The caterpillar sat on the mat
The snowcat is not an animal
RC=0

34
3rd/pcre2/testdata/grepoutputCU vendored Normal file

@@ -0,0 +1,34 @@
--- Test 1 ---
0:¦
The quick brown
0:¦
This time it jumps and jumps and jumps.
0:¦
This line contains \E and (regex) *meta* [characters].
0:¦
The word is cat in this line
0:¦
The caterpillar sat on the mat
0:¦
The snowcat is not an animal
RC=0
--- Test 2 ---
0:¦
The quick brown
0:¦
This time it jumps and jumps and jumps.
0:¦
This line contains \E and (regex) *meta* [characters].
0:¦
The word is cat in this line
0:¦
The caterpillar sat on the mat
0:¦
The snowcat is not an animal
RC=0

6
3rd/pcre2/testdata/grepoutputCbz2 vendored Normal file

@@ -0,0 +1,6 @@
one
two
RC=0
one
two
RC=0

3
3rd/pcre2/testdata/grepoutputCgz vendored Normal file

@@ -0,0 +1,3 @@
one
two
RC=0

42
3rd/pcre2/testdata/grepoutputN vendored Normal file

@@ -0,0 +1,42 @@
---------------------------- Test N1 ------------------------------
1:abc
2:def
RC=0
1-abc
2:def
RC=0
---------------------------- Test N2 ------------------------------
1:abc
def
2:ghi
jkl
RC=0
1-abc
def
2:ghi
jkl
RC=0
---------------------------- Test N3 ------------------------------
2:def
3:
ghi
jkl
RC=0
---------------------------- Test N4 ------------------------------
2:ghi
jkl
RC=0
---------------------------- Test N5 ------------------------------
1:abc
2:def
3:ghi
4:jkl
RC=0
1-abc
2:def
RC=0
---------------------------- Test N6 ------------------------------
1:abc
2:def
3:ghi
4:jkl

4
3rd/pcre2/testdata/grepoutputUN vendored Normal file

@@ -0,0 +1,4 @@
---------------------------- Test UN2 ------------------------------
1:abc<62>
RC=0

2
3rd/pcre2/testdata/greppatN4 vendored Normal file

@@ -0,0 +1,2 @@
xxx
jkl

BIN
3rd/pcre2/testdata/testbtables vendored Normal file

Binary file not shown.

7076
3rd/pcre2/testdata/testinput1 vendored Normal file

File diff suppressed because it is too large

714
3rd/pcre2/testdata/testinput10 vendored Normal file

@@ -0,0 +1,714 @@
# This set of tests is for UTF-8 support and Unicode property support, with
# relevance only for the 8-bit library.
#newline_default lf any anycrlf
# The next 5 patterns have UTF-8 errors
/[<5B>]/utf
/<2F>/utf
/<2F><><EFBFBD>xxx/utf
<><C382><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>/utf
<><C382><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>/match_invalid_utf
# Now test subjects
/badutf/utf
\= Expect UTF-8 errors
X\xdf
XX\xef
XXX\xef\x80
X\xf7
XX\xf7\x80
XXX\xf7\x80\x80
\xfb
\xfb\x80
\xfb\x80\x80
\xfb\x80\x80\x80
\xfd
\xfd\x80
\xfd\x80\x80
\xfd\x80\x80\x80
\xfd\x80\x80\x80\x80
\xdf\x7f
\xef\x7f\x80
\xef\x80\x7f
\xf7\x7f\x80\x80
\xf7\x80\x7f\x80
\xf7\x80\x80\x7f
\xfb\x7f\x80\x80\x80
\xfb\x80\x7f\x80\x80
\xfb\x80\x80\x7f\x80
\xfb\x80\x80\x80\x7f
\xfd\x7f\x80\x80\x80\x80
\xfd\x80\x7f\x80\x80\x80
\xfd\x80\x80\x7f\x80\x80
\xfd\x80\x80\x80\x7f\x80
\xfd\x80\x80\x80\x80\x7f
\xed\xa0\x80
\xc0\x8f
\xe0\x80\x8f
\xf0\x80\x80\x8f
\xf8\x80\x80\x80\x8f
\xfc\x80\x80\x80\x80\x8f
\x80
\xfe
\xff
/badutf/utf
\= Expect UTF-8 errors
XX\xfb\x80\x80\x80\x80
XX\xfd\x80\x80\x80\x80\x80
XX\xf7\xbf\xbf\xbf
/shortutf/utf
\= Expect UTF-8 errors
XX\xdf\=ph
XX\xef\=ph
XX\xef\x80\=ph
\xf7\=ph
\xf7\x80\=ph
\xf7\x80\x80\=ph
\xfb\=ph
\xfb\x80\=ph
\xfb\x80\x80\=ph
\xfb\x80\x80\x80\=ph
\xfd\=ph
\xfd\x80\=ph
\xfd\x80\x80\=ph
\xfd\x80\x80\x80\=ph
\xfd\x80\x80\x80\x80\=ph
/anything/utf
\= Expect UTF-8 errors
X\xc0\x80
XX\xc1\x8f
XXX\xe0\x9f\x80
\xf0\x8f\x80\x80
\xf8\x87\x80\x80\x80
\xfc\x83\x80\x80\x80\x80
\xfe\x80\x80\x80\x80\x80
\xff\x80\x80\x80\x80\x80
\xf8\x88\x80\x80\x80
\xf9\x87\x80\x80\x80
\xfc\x84\x80\x80\x80\x80
\xfd\x83\x80\x80\x80\x80
\= Expect no match
\xc3\x8f
\xe0\xaf\x80
\xe1\x80\x80
\xf0\x9f\x80\x80
\xf1\x8f\x80\x80
\xf8\x88\x80\x80\x80\=no_utf_check
\xf9\x87\x80\x80\x80\=no_utf_check
\xfc\x84\x80\x80\x80\x80\=no_utf_check
\xfd\x83\x80\x80\x80\x80\=no_utf_check
# Similar tests with offsets
/badutf/utf
\= Expect UTF-8 errors
X\xdfabcd
X\xdfabcd\=offset=1
\= Expect no match
X\xdfabcd\=offset=2
/(?<=x)badutf/utf
\= Expect UTF-8 errors
X\xdfabcd
X\xdfabcd\=offset=1
X\xdfabcd\=offset=2
X\xdfabcd\xdf\=offset=3
\= Expect no match
X\xdfabcd\=offset=3
/(?<=xx)badutf/utf
\= Expect UTF-8 errors
X\xdfabcd
X\xdfabcd\=offset=1
X\xdfabcd\=offset=2
X\xdfabcd\=offset=3
/(?<=xxxx)badutf/utf
\= Expect UTF-8 errors
X\xdfabcd
X\xdfabcd\=offset=1
X\xdfabcd\=offset=2
X\xdfabcd\=offset=3
X\xdfabc\xdf\=offset=6
X\xdfabc\xdf\=offset=7
\= Expect no match
X\xdfabcd\=offset=6
/\x{100}/IB,utf
/\x{1000}/IB,utf
/\x{10000}/IB,utf
/\x{100000}/IB,utf
/\x{10ffff}/IB,utf
/[\x{ff}]/IB,utf
/[\x{100}]/IB,utf
/\x80/IB,utf
/\xff/IB,utf
/\x{D55c}\x{ad6d}\x{C5B4}/IB,utf
\x{D55c}\x{ad6d}\x{C5B4}
/\x{65e5}\x{672c}\x{8a9e}/IB,utf
\x{65e5}\x{672c}\x{8a9e}
/\x{80}/IB,utf
/\x{084}/IB,utf
/\x{104}/IB,utf
/\x{861}/IB,utf
/\x{212ab}/IB,utf
/[^ab\xC0-\xF0]/IB,utf
\x{f1}
\x{bf}
\x{100}
\x{1000}
\= Expect no match
\x{c0}
\x{f0}
/(\x{100}+|x)/IB,utf
/(\x{100}*a|x)/IB,utf
/(\x{100}{0,2}a|x)/IB,utf
/(\x{100}{1,2}a|x)/IB,utf
/\x{100}/IB,utf
/a\x{100}\x{101}*/IB,utf
/a\x{100}\x{101}+/IB,utf
/[^\x{c4}]/IB
/[\x{100}]/IB,utf
\x{100}
Z\x{100}
\x{100}Z
/[\xff]/IB,utf
>\x{ff}<
/[^\xff]/IB,utf
/\x{100}abc(xyz(?1))/IB,utf
/\777/I,utf
\x{1ff}
\777
/\x{100}+\x{200}/IB,utf
/\x{100}+X/IB,utf
/^[\QĀ\E-\QŐ\E/B,utf
# This tests the stricter UTF-8 check according to RFC 3629.
/X/utf
\= Expect UTF-8 errors
\x{d800}
\x{da00}
\x{dfff}
\x{110000}
\x{2000000}
\x{7fffffff}
\= Expect no match
\x{d800}\=no_utf_check
\x{da00}\=no_utf_check
\x{dfff}\=no_utf_check
\x{110000}\=no_utf_check
\x{2000000}\=no_utf_check
\x{7fffffff}\=no_utf_check
/(*UTF8)\x{1234}/
abcd\x{1234}pqr
/(*CRLF)(*UTF)(*BSR_UNICODE)a\Rb/I
/\h/I,utf
ABC\x{09}
ABC\x{20}
ABC\x{a0}
ABC\x{1680}
ABC\x{180e}
ABC\x{2000}
ABC\x{202f}
ABC\x{205f}
ABC\x{3000}
/\v/I,utf
ABC\x{0a}
ABC\x{0b}
ABC\x{0c}
ABC\x{0d}
ABC\x{85}
ABC\x{2028}
/\h*A/I,utf
CDBABC
/\v+A/I,utf
/\s?xxx\s/I,utf
/\sxxx\s/I,utf,tables=2
AB\x{85}xxx\x{a0}XYZ
AB\x{a0}xxx\x{85}XYZ
/\S \S/I,utf,tables=2
\x{a2} \x{84}
A Z
/a+/utf
a\x{123}aa\=offset=1
a\x{123}aa\=offset=3
a\x{123}aa\=offset=4
\= Expect bad offset value
a\x{123}aa\=offset=6
\= Expect bad UTF-8 offset
a\x{123}aa\=offset=2
\= Expect no match
a\x{123}aa\=offset=5
/\x{1234}+/Ii,utf
/\x{1234}+?/Ii,utf
/\x{1234}++/Ii,utf
/\x{1234}{2}/Ii,utf
/[^\x{c4}]/IB,utf
/X+\x{200}/IB,utf
/\R/I,utf
/\777/IB,utf
/\w+\x{C4}/B,utf
a\x{C4}\x{C4}
/\w+\x{C4}/B,utf,tables=2
a\x{C4}\x{C4}
/\W+\x{C4}/B,utf
!\x{C4}
/\W+\x{C4}/B,utf,tables=2
!\x{C4}
/\W+\x{A1}/B,utf
!\x{A1}
/\W+\x{A1}/B,utf,tables=2
!\x{A1}
/X\s+\x{A0}/B,utf
X\x20\x{A0}\x{A0}
/X\s+\x{A0}/B,utf,tables=2
X\x20\x{A0}\x{A0}
/\S+\x{A0}/B,utf
X\x{A0}\x{A0}
/\S+\x{A0}/B,utf,tables=2
X\x{A0}\x{A0}
/\x{a0}+\s!/B,utf
\x{a0}\x20!
/\x{a0}+\s!/B,utf,tables=2
\x{a0}\x20!
/A/utf
\x{ff000041}
\x{7f000041}
/(*UTF8)abc/never_utf
/abc/utf,never_utf
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/IBi,utf
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/IB,utf
/AB\x{1fb0}/IB,utf
/AB\x{1fb0}/IBi,utf
/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/Ii,utf
\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
\x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
/[ⱥ]/Bi,utf
/[^ⱥ]/Bi,utf
/\h/I
/\v/I
/\R/I
/[[:blank:]]/B,ucp
/\x{212a}+/Ii,utf
KKkk\x{212a}
/s+/Ii,utf
SSss\x{17f}
/\x{100}*A/IB,utf
A
/\x{100}*\d(?R)/IB,utf
/[Z\x{100}]/IB,utf
Z\x{100}
\x{100}
\x{100}Z
/[z-\x{100}]/IB,utf
/[z\Qa-d]Ā\E]/IB,utf
\x{100}
Ā
/[ab\x{100}]abc(xyz(?1))/IB,utf
/\x{100}*\s/IB,utf
/\x{100}*\d/IB,utf
/\x{100}*\w/IB,utf
/\x{100}*\D/IB,utf
/\x{100}*\S/IB,utf
/\x{100}*\W/IB,utf
/[\x{105}-\x{109}]/IBi,utf
\x{104}
\x{105}
\x{109}
\= Expect no match
\x{100}
\x{10a}
/[z-\x{100}]/IBi,utf
Z
z
\x{39c}
\x{178}
|
\x{80}
\x{ff}
\x{100}
\x{101}
\= Expect no match
\x{102}
Y
y
/[z-\x{100}]/IBi,utf
/\x{3a3}B/IBi,utf
/abc/utf,replace=<3D>
abc
/(?<=(a)(?-1))x/I,utf
a\x80zx\=offset=3
/[\W\p{Any}]/B
abc
123
/[\W\pL]/B
abc
\= Expect no match
123
/(*:*++++++++++++''''''''''''''''''''+''+++'+++x+++++++++++++++++++++++++++++++++++(++++++++++++++++++++:++++++%++:''''''''''''''''''''''''+++++++++++++++++++++++++++++++++++++++++++++++++++++-++++++++k+++++++''''+++'+++++++++++++++++++++++''''++++++++++++':ƿ)/utf
/[\s[:^ascii:]]/B,ucp
# A special extra option allows excaped surrogate code points in 8-bit mode,
# but subjects containing them must not be UTF-checked.
/\x{d800}/I,utf,allow_surrogate_escapes
\x{d800}\=no_utf_check
/\udfff\o{157401}/utf,alt_bsux,allow_surrogate_escapes
\x{dfff}\x{df01}\=no_utf_check
# This has different starting code units in 8-bit mode.
/^[^ab]/IB,utf
c
\x{ff}
\x{100}
\= Expect no match
aaa
# Offsets are different in 8-bit mode.
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
123abcáyzabcdef789abcሴqr
# Check name length with non-ASCII characters
/(?'ABáC678901234567890123456789012012345678901234567890123456789AB012345678901234567890123456789AB012345678901234567890123456789AB'...)/utf
/(?'ABáC6789012345678901234567890123012345678901234567890123456789AB012345678901234567890123456789AB012345678901234567890123456789AB'...)/utf
/(?'ABZC6789012345678901234567890123012345678901234567890123456789AB012345678901234567890123456789AB012345678901234567890123456789AB'...)/utf
/(?(n/utf
/(?(á/utf
# Invalid UTF-8 tests
/.../g,match_invalid_utf
abcd\x80wxzy\x80pqrs
abcd\x{80}wxzy\x80pqrs
/abc/match_invalid_utf
ab\x80ab\=ph
\= Expect no match
ab\x80cdef\=ph
/.a/match_invalid_utf
ab\=ph
ab\=ps
b\xf0\x91\x88b\=ph
b\xf0\x91\x88b\=ps
b\xf0\x91\x88\xb4a
\= Expect no match
b\x80\=ph
b\x80\=ps
b\xf0\x91\x88\=ph
b\xf0\x91\x88\=ps
/.a$/match_invalid_utf
ab\=ph
ab\=ps
\= Expect no match
b\xf0\x91\x98\=ph
b\xf0\x91\x98\=ps
/ab$/match_invalid_utf
ab\x80cdeab
\= Expect no match
ab\x80cde
/.../g,match_invalid_utf
abcd\x{80}wxzy\x80pqrs
/(?<=x)../g,match_invalid_utf
abcd\x{80}wxzy\x80pqrs
abcd\x{80}wxzy\x80xpqrs
/X$/match_invalid_utf
\= Expect no match
X\xc4
/(?<=..)X/match_invalid_utf,aftertext
AB\x80AQXYZ
AB\x80AQXYZ\=offset=5
AB\x80\x80AXYZXC\=offset=5
\= Expect no match
AB\x80XYZ
AB\x80XYZ\=offset=3
AB\xfeXYZ
AB\xffXYZ\=offset=3
AB\x80AXYZ
AB\x80AXYZ\=offset=4
AB\x80\x80AXYZ\=offset=5
/.../match_invalid_utf
AB\xc4CCC
\= Expect no match
A\x{d800}B
A\x{110000}B
A\xc4B
/\bX/match_invalid_utf
A\x80X
/\BX/match_invalid_utf
\= Expect no match
A\x80X
/(?<=...)X/match_invalid_utf
AAA\x80BBBXYZ
\= Expect no match
AAA\x80BXYZ
AAA\x80BBXYZ
# -------------------------------------
/(*UTF)(?=\x{123})/I
/[\x{c1}\x{e1}]X[\x{145}\x{146}]/I,utf
/[󿾟,]/BI,utf
/[\x{fff4}-\x{ffff8}]/I,utf
/[\x{fff4}-\x{afff8}\x{10ffff}]/I,utf
/[\xff\x{ffff}]/I,utf
/[\xff\x{ff}]/I,utf
abc\x{ff}def
/[\xff\x{ff}]/I
abc\x{ff}def
/[Ss]/I
/[Ss]/I,utf
/(?:\x{ff}|\x{3000})/I,utf
/x/utf
abxyz
\x80\=startchar
abc\x80\=startchar
abc\x80\=startchar,offset=3
/\x{c1}+\x{e1}/iIB,ucp
\x{c1}\x{c1}\x{c1}
\x{e1}\x{e1}\x{e1}
/a|\x{c1}/iI,ucp
\x{e1}xxx
/a|\x{c1}/iI,utf
\x{e1}xxx
/\x{c1}|\x{e1}/iI,ucp
/X(\x{e1})Y/ucp,replace=>\U$1<,substitute_extended
X\x{e1}Y
/X(\x{e1})Y/i,ucp,replace=>\L$1<,substitute_extended
X\x{c1}Y
# Without UTF or UCP characters > 127 have only one case in the default locale.
/X(\x{e1})Y/replace=>\U$1<,substitute_extended
X\x{e1}Y
/A/utf,match_invalid_utf,caseless
\xe5A
/\bch\b/utf,match_invalid_utf
qchq\=ph
qchq\=ps
/line1\nbreak/firstline,utf,match_invalid_utf
line1\nbreak
line0\nline1\nbreak
/A\z/utf,match_invalid_utf
A\x80\x42\n
/ab$/match_invalid_utf
\= Expect no match
ab\x80cde
/ab\z/match_invalid_utf
\= Expect no match
ab\x80cde
/ab\Z/match_invalid_utf
\= Expect no match
ab\x80cde
/(..)(*scs:(1)ab\z)/match_invalid_utf
ab\x80cde
/(..)(*scs:(1)ab\Z)/match_invalid_utf
ab\x80cde
/(..)(*scs:(1)ab$)/match_invalid_utf
ab\x80cde
/(.) \1/i,ucp
i I
/(.) \1/i,ucp,turkish_casing
/[\x60-\x7f]/i,ucp,turkish_casing
i
\= Expect no match
I
/[\x60-\xc0]/i,ucp,turkish_casing
i
\= Expect no match
I
/[\x80-\xc0]/i,ucp,turkish_casing
\= Expect no match
i
I
# python_octal
/\400/
/abc/substitute_extended
abc\=replace=\400
/\400/python_octal
/abc/substitute_extended,python_octal
abc\=replace=\400
/\400/utf
/abc/utf,substitute_extended
abc\=replace=\400
/\400/utf,python_octal
/abc/utf,substitute_extended,python_octal
abc\=replace=\400
/[\x00-\x2f\x11-\xff]+/B
abcd
/[\x00-\x2f\x11-\xff]{4,}/B,utf
abcd
# End of testinput10

504
3rd/pcre2/testdata/testinput11 vendored Normal file

@@ -0,0 +1,504 @@
# This set of tests is for the 16-bit and 32-bit libraries' basic (non-UTF)
# features that are not compatible with the 8-bit library, or which give
# different output in 16-bit or 32-bit mode. The output for the two widths is
# different, so they have separate output files.
#forbid_utf
#newline_default LF ANY ANYCRLF
/[^\x{c4}]/IB
/\x{100}/I
/ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # optional leading comment
(?: (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # initial word
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) )* # further okay, if led by a period
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
# address
| # or
(?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # one word, optionally followed by....
(?:
[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] | # atom and space parts, or...
\(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) | # comments, or...
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
# quoted strings
)*
< (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # leading <
(?: @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* , (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
)* # further okay, if led by comma
: # closing colon
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* )? # optional route
(?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # initial word
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) )* # further okay, if led by a period
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
# address spec
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* > # trailing >
# name and address
) (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # optional trailing comment
/Ix
/[\h]/B
>\x09<
/[\h]+/B
>\x09\x20\xa0<
/[\v]/B
/[^\h]/B
/\h+/I
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
\x{3001}\x{2fff}\x{200a}\xa0\x{2000}
/[\h\x{dc00}]+/IB
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
\x{3001}\x{2fff}\x{200a}\xa0\x{2000}
/\H+/I
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
\x{2000}\x{200a}\x{1fff}\x{200b}
\x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
\xa0\x{3000}\x9f\xa1\x{2fff}\x{3001}
/[\H\x{d800}]+/
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
\x{2000}\x{200a}\x{1fff}\x{200b}
\x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
\xa0\x{3000}\x9f\xa1\x{2fff}\x{3001}
/\v+/I
\x{2027}\x{2030}\x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
/[\v\x{dc00}]+/IB
\x{2027}\x{2030}\x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
/\V+/I
\x{2028}\x{2029}\x{2027}\x{2030}
\x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
/[\V\x{d800}]+/
\x{2028}\x{2029}\x{2027}\x{2030}
\x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
/\R+/I,bsr=unicode
\x{2027}\x{2030}\x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
/\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}/I
\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}
/[^\x{80}][^\x{ff}][^\x{100}][^\x{1000}][^\x{ffff}]/B
/[^\x{80}][^\x{ff}][^\x{100}][^\x{1000}][^\x{ffff}]/Bi
/[^\x{100}]*[^\x{1000}]+[^\x{ffff}]??[^\x{8000}]{4,}[^\x{7fff}]{2,9}?[^\x{100}]{5,6}+/B
/[^\x{100}]*[^\x{1000}]+[^\x{ffff}]??[^\x{8000}]{4,}[^\x{7fff}]{2,9}?[^\x{100}]{5,6}+/Bi
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark
XX
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark
XX
/\u0100/B,alt_bsux,allow_empty_class,match_unset_backref
/[\u0100-\u0200]/B,alt_bsux,allow_empty_class,match_unset_backref
/\ud800/B,alt_bsux,allow_empty_class,match_unset_backref
/^\x{ffff}+/i
\x{ffff}
/^\x{ffff}?/i
\x{ffff}
/^\x{ffff}*/i
\x{ffff}
/^\x{ffff}{3}/i
\x{ffff}\x{ffff}\x{ffff}
/^\x{ffff}{0,3}/i
\x{ffff}
/[^\x00-a]{12,}[^b-\xff]*/B
/[^\s]*\s* [^\W]+\W+ [^\d]*?\d0 [^\d\w]{4,6}?\w*A/B
/a*[b-\x{200}]?a#a*[b-\x{200}]?b#[a-f]*[g-\x{200}]*#[g-\x{200}]*[a-c]*#[g-\x{200}]*[a-h]*/B
/^[\x{1234}\x{4321}]{2,4}?/
\x{1234}\x{1234}\x{1234}
# Check maximum non-UTF character size for the 16-bit library.
/\x{ffff}/
A\x{ffff}B
/\x{10000}/
/\o{20000}/
# Check maximum character size for the 32-bit library. These will all give
# errors in the 16-bit library.
/\x{110000}/
/\x{7fffffff}/
/\x{80000000}/
/\x{ffffffff}/
/\x{100000000}/
/\o{17777777777}/
/\o{20000000000}/
/\o{37777777777}/
/\o{40000000000}/
/\x{7fffffff}\x{7fffffff}/I
/\x{80000000}\x{80000000}/I
/\x{ffffffff}\x{ffffffff}/I
# Non-UTF characters
/.{2,3}/
\x{400000}\x{400001}\x{400002}\x{400003}
/\x{400000}\x{800000}/IBi
# Check character ranges
/[\H]/IB
/[\V]/IB
/(*THEN:\[A]{65501})/expand
# We can use pcre2test's utf8_input modifier to create wide pattern characters,
# even though this test is run when UTF is not supported.
/a\x{d800}b/utf8_input
a<><61><EFBFBD>b
a\x{d800}b
a\o{154000}b
\= Expect warning unless 32bit
a\N{U+d800}b
/a\x{ffff}b/utf8_input
a￿b
a\x{ffff}b
a\o{177777}b
a\N{U+ffff}b
/ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/utf8_input
ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z
ab\x{7fffffff}z
ab\o{17777777777}z
ab\N{U+7fffffff}z
/ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/utf8_input
ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z
ab\x{ffffffff}z
/ab<61>Az/utf8_input
ab<61>Az
ab\x{80000041}z
\= Expect no match
abAz
aAz
ab\377Az
ab\xff\N{U+0041}z
ab\N{U+ff}\N{U+41}z
/ab\x{80000041}z/
ab\x{80000041}z
/(?i:A{1,}\6666666666)/
A\x{1b6}6666666
/abc/substitute_extended,replace=>\777<
abc
/abc/substitute_extended,replace=>\o{012345}<
abc
# Character range merging tests
/[\x{100}-\x{200}\H\x{8000}-\x{9000}]/B
/[\x{100}-\x{200}\V\x{8000}-\x{9000}]/B
/[\x00-\x{6000}\x{3000}-\x{ffff}]#[\x00-\x{6000}\x{3000}-\x{ffff}]{5,7}?/B
/[\x00-\x{6000}\x{3000}-\x{ffffffff}]#[\x00-\x{6000}\x{3000}-\x{ffffffff}]{5,7}?/B
/[\x00-\x2f\x11-\xff]*?!/B
abcd!e
/i/turkish_casing
# Character list tests
/([\x{100}-\x{7fff}\x{9000}\x{9002}\x{9004}\x{9006}\x{9008}\x{10000}-\x{7fffffff}]{3,8}?).#/B
\x{9001}\x{9007}\x{8000}\x{ffff}\x{9002}\x{7fff}\x{10000}\x{7fffffff}\x{500000}\x{9006}#
/([\x{3000}\x{3001}\x{3003}\x{3004}\x{3006}\x{3007}\x{8000}-\x{ffff}\x{100001}\x{100002}\x{100004}\x{100005}\x{100007}\x{100008}\x{10000a}\x{10000b}\x{80000000}-\x{ffffffff}]{5,}).#/B
\x{2fff}\x{3002}\x{7fff}\x{100000}\x{7fffffff}\x{3000}\x{3007}\x{8000}\x{ffff}\x{100001}\x{10000b}\x{80000000}\x{ffffffff}\x{3000}#
/([^\x{4000}\x{4002}\x{4004}\x{4005}\x{4007}\x{4009}\x{400a}\x{f000}\x{f002}\x{f004}\x{f005}\x{f007}\x{f009}\x{f00a}\x{100000}\x{100002}\x{100004}\x{100005}\x{100007}\x{100009}\x{10000a}\x{a0000000}\x{a0000002}\x{a0000004}\x{a0000005}\x{a0000007}\x{a0000009}\x{a000000a}]+).#/B
\x{4000}\x{4002}\x{4004}\x{4005}\x{4007}\x{4009}\x{400a}\x{3fff}\x{4001}\x{4003}\x{4006}\x{4008}\x{400b}\x{100}#
\x{f000}\x{f002}\x{f004}\x{f005}\x{f007}\x{f009}\x{f00a}\x{efff}\x{f001}\x{f003}\x{f006}\x{f008}\x{f00b}\x{100}#
\x{100000}\x{100002}\x{100004}\x{100005}\x{100007}\x{100009}\x{10000a}\x{fffff}\x{100001}\x{100003}\x{100006}\x{100008}\x{10000b}\x{100}#
\x{a0000000}\x{a0000002}\x{a0000004}\x{a0000005}\x{a0000007}\x{a0000009}\x{a000000a}\x{9fffffff}\x{a0000001}\x{a0000003}\x{a0000006}\x{a0000008}\x{a000000b}\x{100}#
# --------------
# EXTENDED CHARACTER CLASSES (UTS#18)
# META_BIGVALUE tests
/\x{80000000}/B
\x{80000000}
\= Expect no match
\x{7fffffff}
\x{80000001}
/[\x{80000000}-\x{8000000f}\x{8fffffff}]/B
\x{80000002}
\x{8fffffff}
\= Expect no match
\x{7fffffff}
\x{90000000}
/\x{80000000}/B,alt_extended_class
\x{80000000}
\= Expect no match
\x{7fffffff}
\x{80000001}
/[\x{80000000}-\x{8000000f}\x{8fffffff}]/B,alt_extended_class
\x{80000002}
\x{8fffffff}
\= Expect no match
\x{7fffffff}
\x{90000000}
/[\x{80000000}-\x{8000000f}--\x{80000002}]/B,alt_extended_class
\x{80000001}
\x{80000003}
\= Expect no match
\x{80000002}
/[[\x{80000000}-\x{8000000f}]--[\x{80000002}]]/B,alt_extended_class
\x{80000001}
\x{80000003}
\= Expect no match
\x{80000002}
# --------------
# EXTENDED CHARACTER CLASSES (Perl)
# META_BIGVALUE tests
/(?[[\x{80000000}-\x{8000000f}]+\x{8fffffff}])/B
\x{80000002}
\x{8fffffff}
\= Expect no match
\x{7fffffff}
\x{90000000}
/(?[[\x{80000000}-\x{8000000f}]-\x{80000002}])/B
\x{80000001}
\x{80000003}
\= Expect no match
\x{80000002}
/(?[[\x{80000000}-\x{8000000f}]-\x{80000002}])/B
\x{80000001}
\x{80000003}
\= Expect no match
\x{80000002}
# --------------
# End of testinput11

715
3rd/pcre2/testdata/testinput12 vendored Normal file

@@ -0,0 +1,715 @@
# This set of tests is for UTF-16 and UTF-32 support, including Unicode
# properties. It is relevant only to the 16-bit and 32-bit libraries. The
# output is different for each library, so there are separate output files.
/<2F><><EFBFBD>xxx/IB,utf,no_utf_check
/abc/utf
<20>]
# Check maximum character size
/\x{ffff}/IB,utf
/\x{10000}/IB,utf
/\x{100}/IB,utf
/\x{1000}/IB,utf
/\x{10000}/IB,utf
/\x{100000}/IB,utf
/\x{10ffff}/IB,utf
/[\x{ff}]/IB,utf
/[\x{100}]/IB,utf
/\x80/IB,utf
/\xff/IB,utf
/\x{D55c}\x{ad6d}\x{C5B4}/IB,utf
\x{D55c}\x{ad6d}\x{C5B4}
/\x{65e5}\x{672c}\x{8a9e}/IB,utf
\x{65e5}\x{672c}\x{8a9e}
/\x{80}/IB,utf
/\x{084}/IB,utf
/\x{104}/IB,utf
/\x{861}/IB,utf
/\x{212ab}/IB,utf
/[^ab\xC0-\xF0]/IB,utf
\x{f1}
\x{bf}
\x{100}
\x{1000}
\= Expect no match
\x{c0}
\x{f0}
/(\x{100}+|x)/IB,utf
/(\x{100}*a|x)/IB,utf
/(\x{100}{0,2}a|x)/IB,utf
/(\x{100}{1,2}a|x)/IB,utf
/\x{100}/IB,utf
/a\x{100}\x{101}*/IB,utf
/a\x{100}\x{101}+/IB,utf
/[^\x{c4}]/IB
/[\x{100}]/IB,utf
\x{100}
Z\x{100}
\x{100}Z
/[\xff]/IB,utf
>\x{ff}<
/[^\xff]/IB,utf
/\x{100}abc(xyz(?1))/IB,utf
/\777/I,utf
\x{1ff}
\777
/\x{100}+\x{200}/IB,utf
/\x{100}+X/IB,utf
/^[\QĀ\E-\QŐ\E/B,utf
/X/utf
XX\x{d800}\=no_utf_check
XX\x{da00}\=no_utf_check
XX\x{dc00}\=no_utf_check
XX\x{de00}\=no_utf_check
XX\x{dfff}\=no_utf_check
\= Expect UTF error
XX\x{d800}
XX\x{da00}
XX\x{dc00}
XX\x{de00}
XX\x{dfff}
XX\x{110000}
XX\x{d800}\x{1234}
\= Expect no match
XX\x{d800}\=offset=3
/(?<=.)X/utf
XX\x{d800}\=offset=3
/(*UTF16)\x{11234}/
abcd\x{11234}pqr
/(*UTF)\x{11234}/I
abcd\x{11234}pqr
/(*UTF-32)\x{11234}/
abcd\x{11234}pqr
/(*UTF-32)\x{112}/
abcd\x{11234}pqr
/(*CRLF)(*UTF16)(*BSR_UNICODE)a\Rb/I
/(*CRLF)(*UTF32)(*BSR_UNICODE)a\Rb/I
/\h/I,utf
ABC\x{09}
ABC\x{20}
ABC\x{a0}
ABC\x{1680}
ABC\x{180e}
ABC\x{2000}
ABC\x{202f}
ABC\x{205f}
ABC\x{3000}
/\v/I,utf
ABC\x{0a}
ABC\x{0b}
ABC\x{0c}
ABC\x{0d}
ABC\x{85}
ABC\x{2028}
/\h*A/I,utf
CDBABC
\x{2000}ABC
/\R*A/I,bsr=unicode,utf
CDBABC
\x{2028}A
/\v+A/I,utf
/\s?xxx\s/I,utf
/\sxxx\s/I,utf,tables=2
AB\x{85}xxx\x{a0}XYZ
AB\x{a0}xxx\x{85}XYZ
/\S \S/I,utf,tables=2
\x{a2} \x{84}
A Z
/a+/utf
a\x{123}aa\=offset=1
a\x{123}aa\=offset=2
a\x{123}aa\=offset=3
\= Expect no match
a\x{123}aa\=offset=4
\= Expect bad offset error
a\x{123}aa\=offset=5
a\x{123}aa\=offset=6
/\x{1234}+/Ii,utf
/\x{1234}+?/Ii,utf
/\x{1234}++/Ii,utf
/\x{1234}{2}/Ii,utf
/[^\x{c4}]/IB,utf
/X+\x{200}/IB,utf
/\R/I,utf
# Check bad offset
/a/utf
\= Expect bad UTF-16 offset, or no match in 32-bit
\x{10000}\=offset=1
\x{10000}ab\=offset=1
\= Expect 16-bit match, 32-bit no match
\x{10000}ab\=offset=2
\= Expect no match
\x{10000}ab\=offset=3
\= Expect no match in 16-bit, bad offset in 32-bit
\x{10000}ab\=offset=4
\= Expect bad offset
\x{10000}ab\=offset=5
/<2F><><EFBFBD>/utf
/\w+\x{C4}/B,utf
a\x{C4}\x{C4}
/\w+\x{C4}/B,utf,tables=2
a\x{C4}\x{C4}
/\W+\x{C4}/B,utf
!\x{C4}
/\W+\x{C4}/B,utf,tables=2
!\x{C4}
/\W+\x{A1}/B,utf
!\x{A1}
/\W+\x{A1}/B,utf,tables=2
!\x{A1}
/X\s+\x{A0}/B,utf
X\x20\x{A0}\x{A0}
/X\s+\x{A0}/B,utf,tables=2
X\x20\x{A0}\x{A0}
/\S+\x{A0}/B,utf
X\x{A0}\x{A0}
/\S+\x{A0}/B,utf,tables=2
X\x{A0}\x{A0}
/\x{a0}+\s!/B,utf
\x{a0}\x20!
/\x{a0}+\s!/B,utf,tables=2
\x{a0}\x20!
/(*UTF)abc/never_utf
/abc/utf,never_utf
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/IBi,utf
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/IB,utf
/AB\x{1fb0}/IB,utf
/AB\x{1fb0}/IBi,utf
/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/Ii,utf
\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
\x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
/[ⱥ]/Bi,utf
/[^ⱥ]/Bi,utf
/[[:blank:]]/B,ucp
/\x{212a}+/Ii,utf
KKkk\x{212a}
/s+/Ii,utf
SSss\x{17f}
# Non-UTF characters should give errors in both 16-bit and 32-bit modes.
/\x{110000}/utf
/\o{4200000}/utf
/\x{100}*A/IB,utf
A
/\x{100}*\d(?R)/IB,utf
/[Z\x{100}]/IB,utf
Z\x{100}
\x{100}
\x{100}Z
/[z-\x{100}]/IB,utf
/[z\Qa-d]Ā\E]/IB,utf
\x{100}
Ā
/[ab\x{100}]abc(xyz(?1))/IB,utf
/\x{100}*\s/IB,utf
/\x{100}*\d/IB,utf
/\x{100}*\w/IB,utf
/\x{100}*\D/IB,utf
/\x{100}*\S/IB,utf
/\x{100}*\W/IB,utf
/[\x{105}-\x{109}]/IBi,utf
\x{104}
\x{105}
\x{109}
\= Expect no match
\x{100}
\x{10a}
/[z-\x{100}]/IBi,utf
Z
z
\x{39c}
\x{178}
|
\x{80}
\x{ff}
\x{100}
\x{101}
\= Expect no match
\x{102}
Y
y
/[z-\x{100}]/IBi,utf
/\x{3a3}B/IBi,utf
/./utf
\x{110000}
/(*UTF)ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/B
/ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/utf
/[\W\p{Any}]/B
abc
123
/[\W\pL]/B
abc
\x{100}
\x{308}
\= Expect no match
123
/[\s[:^ascii:]]/B,ucp
/\pP/ucp
\x{7fffffff}
# A special extra option allows escaped surrogate code points in 32-bit mode,
# but subjects containing them must not be UTF-checked. These patterns give
# errors in 16-bit mode.
/\x{d800}/I,utf,allow_surrogate_escapes
\x{d800}\=no_utf_check
/\udfff\o{157401}/utf,alt_bsux,allow_surrogate_escapes
\x{dfff}\x{df01}\=no_utf_check
# This has different starting code units in 8-bit mode.
/^[^ab]/IB,utf
c
\x{ff}
\x{100}
\= Expect no match
aaa
# Offsets are different in 8-bit mode.
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
123abcáyzabcdef789abcሴqr
# A few script run tests in non-UTF mode (but they need Unicode support)
/^(*script_run:.{4})/
\x{3041}\x{30a1}\x{3007}\x{3007} Hiragana Katakana Han Han
\x{30a1}\x{3041}\x{3007}\x{3007} Katakana Hiragana Han Han
\x{1100}\x{2e80}\x{2e80}\x{1101} Hangul Han Han Hangul
/^(*sr:.*)/utf,allow_surrogate_escapes
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
\x{d800}\x{dfff} Surrogates (Unknown) \=no_utf_check
/(?(n/utf
/(?(á/utf
# Invalid UTF-16/32 tests.
/.../g,match_invalid_utf
abcd\x{df00}wxzy\x{df00}pqrs
abcd\x{80}wxzy\x{df00}pqrs
/abc/match_invalid_utf
ab\x{df00}ab\=ph
\= Expect no match
ab\x{df00}cdef\=ph
/.a/match_invalid_utf
ab\=ph
ab\=ps
\= Expect no match
b\x{df00}\=ph
b\x{df00}\=ps
/.a$/match_invalid_utf
ab\=ph
ab\=ps
\= Expect no match
b\x{df00}\=ph
b\x{df00}\=ps
/ab$/match_invalid_utf
ab\x{df00}cdeab
\= Expect no match
ab\x{df00}cde
/.../g,match_invalid_utf
abcd\x{80}wxzy\x{df00}pqrs
/(?<=x)../g,match_invalid_utf
abcd\x{80}wxzy\x{df00}pqrs
abcd\x{80}wxzy\x{df00}xpqrs
/X$/match_invalid_utf
\= Expect no match
X\x{df00}
/(?<=..)X/match_invalid_utf,aftertext
AB\x{df00}AQXYZ
AB\x{df00}AQXYZ\=offset=5
AB\x{df00}\x{df00}AXYZXC\=offset=5
\= Expect no match
AB\x{df00}XYZ
AB\x{df00}XYZ\=offset=3
AB\x{df00}AXYZ
AB\x{df00}AXYZ\=offset=4
AB\x{df00}\x{df00}AXYZ\=offset=5
/.../match_invalid_utf
\= Expect no match
A\x{d800}B
A\x{110000}B
/aa/utf,ucp,match_invalid_utf,global
aa\x{d800}aa
/aa/utf,ucp,match_invalid_utf,global
\x{d800}aa
/A\z/utf,match_invalid_utf
A\x{df00}\n
/ab$/match_invalid_utf
\= Expect no match
ab\x{df00}cde
/ab\z/match_invalid_utf
\= Expect no match
ab\x{df00}cde
/ab\Z/match_invalid_utf
\= Expect no match
ab\x{df00}cde
/(..)(*scs:(1)ab\z)/match_invalid_utf
ab\x{df00}cde
/(..)(*scs:(1)ab\Z)/match_invalid_utf
ab\x{df00}cde
/(..)(*scs:(1)ab$)/match_invalid_utf
ab\x{df00}cde
# ----------------------------------------------------
/(*UTF)(?=\x{123})/I
/[\x{c1}\x{e1}]X[\x{145}\x{146}]/I,utf
/[\xff\x{ffff}]/I,utf
/[\xff\x{ff}]/I,utf
/[\xff\x{ff}]/I
/[Ss]/I
/[Ss]/I,utf
/(?:\x{ff}|\x{3000})/I,utf
# ----------------------------------------------------
# UCP and casing tests
/\x{120}/iI
/\x{c1}/iI,ucp
/[\x{120}\x{121}]/iB,ucp
/[ab\x{120}]+/iB,ucp
aABb\x{121}\x{120}
/\x{c1}/i,no_start_optimize
\= Expect no match
\x{e1}
/\x{120}\x{c1}/i,ucp,no_start_optimize
\x{121}\x{e1}
/\x{120}\x{c1}/i,ucp
\x{121}\x{e1}
/[^\x{120}]/i,no_start_optimize
\x{121}
/[^\x{120}]/i,ucp,no_start_optimize
\= Expect no match
\x{121}
/[^\x{120}]/i
\x{121}
/[^\x{120}]/i,ucp
\= Expect no match
\x{121}
/\x{120}{2}/i,ucp
\x{121}\x{121}
/[^\x{120}]{2}/i,ucp
\= Expect no match
\x{121}\x{121}
/\x{c1}+\x{e1}/iB,ucp
\x{c1}\x{c1}\x{c1}
/\x{c1}+\x{e1}/iIB,ucp
\x{c1}\x{c1}\x{c1}
\x{e1}\x{e1}\x{e1}
/a|\x{c1}/iI,ucp
\x{e1}xxx
/\x{c1}|\x{e1}/iI,ucp
/X(\x{e1})Y/ucp,replace=>\U$1<,substitute_extended
X\x{e1}Y
/X(\x{121})Y/ucp,replace=>\U$1<,substitute_extended
X\x{121}Y
/s/i,ucp
\x{17f}
/s/i,utf
\x{17f}
/[^s]/i,ucp
\= Expect no match
\x{17f}
/[^s]/i,utf
\= Expect no match
\x{17f}
/(.) \1/i,ucp
i I
/(.) \1/i,ucp,turkish_casing
\= Expect no match
i I
/(.) \1/i,ucp
i I
\x{212a} k
\= Expect no match
i \x{0130}
\x{0131} I
/(.) \1/i,ucp,turkish_casing
\x{212a} k
i \x{0130}
\x{0131} I
\= Expect no match
i I
/(.) (?r:\1)/i,ucp,turkish_casing
i I
\= Expect no match
i \x{0130}
\x{0131} I
\x{212a} k
/[a-z][^i]I/ucp,turkish_casing
bII
b\x{0130}I
b\x{0131}I
\= Expect no match
biI
/[a-z][^i]I/i,ucp,turkish_casing
b\x{0131}I
bII
\= Expect no match
biI
b\x{0130}I
/[a-z](?r:[^i])I/i,ucp,turkish_casing
b\x{0131}I
b\x{0130}I
\= Expect no match
bII
biI
/b(?r:[\x{00FF}-\x{FFEE}])/i,ucp,turkish_casing
b\x{0130}
b\x{0131}
B\x{212a}
\= Expect no match
bi
bI
bk
/[\x60-\x7f]/i,ucp,turkish_casing
i
\= Expect no match
I
/[\x60-\xc0]/i,ucp,turkish_casing
i
\= Expect no match
I
/[\x80-\xc0]/i,ucp,turkish_casing
\= Expect no match
i
I
# ----------------------------------------------------
/b[\x{00FF}-\x{FFEE}]/ir
b\x{0130}
b\x{0131}
B\x{212a}
\= Expect no match
bi
bI
bk
# Quantifier after a literal that has the value of META_ACCEPT (not UTF). This
# fails in 16-bit mode, but is OK for 32-bit.
/\x{802a0000}*/
\x{802a0000}\x{802a0000}
# UTF matching without UTF, check invalid UTF characters
/\X++/
a\x{110000}\x{ffffffff}
# This used to loop in 32-bit mode; it will fail in 16-bit mode.
/[\x{ffffffff}]/caseless,ucp
\x{ffffffff}xyz
# These are 32-bit tests for handling 0xffffffff when in UCP caseless mode. They
# will give errors in 16-bit mode.
/k*\x{ffffffff}/caseless,ucp
\x{ffffffff}
/k+\x{ffffffff}/caseless,ucp,no_start_optimize
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}
/k{2}\x{ffffffff}/caseless,ucp,no_start_optimize
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k\x{ffffffff}/caseless,ucp,no_start_optimize
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k{2,}?Z/caseless,ucp,no_start_optimize,no_auto_possess
\= Expect no match
Kk\x{ffffffff}\x{ffffffff}\x{ffffffff}Z
/[sk](?r:[sk])[sk]/Bi,ucp
SKS
sks
\x{212a}S\x{17f}
\x{17f}K\x{212a}
\= Expect no match
s\x{212a}s
K\x{17f}K
# ---------------------------------------------------------
# End of testinput12

22
3rd/pcre2/testdata/testinput13 vendored Normal file

@@ -0,0 +1,22 @@
# These DFA tests are for the handling of characters greater than 255 in
# 16-bit or 32-bit, non-UTF mode.
#forbid_utf
#subject dfa
/^\x{ffff}+/i
\x{ffff}
/^\x{ffff}?/i
\x{ffff}
/^\x{ffff}*/i
\x{ffff}
/^\x{ffff}{3}/i
\x{ffff}\x{ffff}\x{ffff}
/^\x{ffff}{0,3}/i
\x{ffff}
# End of testinput13

108
3rd/pcre2/testdata/testinput14 vendored Normal file

@@ -0,0 +1,108 @@
# These test special UTF and UCP features of DFA matching. The output is
# different for the different widths.
#subject dfa
# ----------------------------------------------------
# These are a selection of the more comprehensive tests that are run for
# non-DFA matching.
/X/utf
XX\x{d800}
XX\x{d800}\=offset=3
XX\x{d800}\=no_utf_check
XX\x{da00}
XX\x{da00}\=no_utf_check
XX\x{dc00}
XX\x{dc00}\=no_utf_check
XX\x{de00}
XX\x{de00}\=no_utf_check
XX\x{dfff}
XX\x{dfff}\=no_utf_check
XX\x{110000}
XX\x{d800}\x{1234}
/badutf/utf
X\xdf
XX\xef
XXX\xef\x80
X\xf7
XX\xf7\x80
XXX\xf7\x80\x80
/shortutf/utf
XX\xdf\=ph
XX\xef\=ph
XX\xef\x80\=ph
\xf7\=ph
\xf7\x80\=ph
# ----------------------------------------------------
# UCP and casing tests - except for the first two, these will all fail in 8-bit
# mode because they are testing UCP without UTF and use characters > 255.
/\x{c1}/i,no_start_optimize
\= Expect no match
\x{e1}
/\x{c1}+\x{e1}/iB,ucp
\x{c1}\x{c1}\x{c1}
\x{e1}\x{e1}\x{e1}
/\x{120}\x{c1}/i,ucp,no_start_optimize
\x{121}\x{e1}
/\x{120}\x{c1}/i,ucp
\x{121}\x{e1}
/[^\x{120}]/i,no_start_optimize
\x{121}
/[^\x{120}]/i,ucp,no_start_optimize
\= Expect no match
\x{121}
/[^\x{120}]/i
\x{121}
/[^\x{120}]/i,ucp
\= Expect no match
\x{121}
/\x{120}{2}/i,ucp
\x{121}\x{121}
/[^\x{120}]{2}/i,ucp
\= Expect no match
\x{121}\x{121}
# ----------------------------------------------------
# ----------------------------------------------------
# Tests for handling 0xffffffff in caseless UCP mode. They only apply to 32-bit
# mode; for the other widths they will fail.
/k*\x{ffffffff}/caseless,ucp
\x{ffffffff}
/k+\x{ffffffff}/caseless,ucp,no_start_optimize
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}
/k{2}\x{ffffffff}/caseless,ucp,no_start_optimize
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k\x{ffffffff}/caseless,ucp,no_start_optimize
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k{2,}?Z/caseless,ucp,no_start_optimize,no_auto_possess
\= Expect no match
Kk\x{ffffffff}\x{ffffffff}\x{ffffffff}Z
# ----------------------------------------------------
# End of testinput14

253
3rd/pcre2/testdata/testinput15 vendored Normal file

@@ -0,0 +1,253 @@
# These are:
#
# (1) Tests of the match-limiting features. The results are different for
# interpretive or JIT matching, so this test should not be run with JIT. The
# same tests are run using JIT in test 17.
# (2) Other tests that must not be run with JIT.
# These tests are first so that they don't inherit a large enough heap frame
# vector from a previous test.
/(*LIMIT_HEAP=21)\[(a)]{60}/expand
\[a]{60}
"(*LIMIT_HEAP=21)()((?))()()()()()()()()()()()()()()()()()()()()()()()(())()()()()()()()()()()()()()()()()()()()()()(())()()()()()()()()()()()()()"
xx
# -----------------------------------------------------------------------
/(a+)*zz/I
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzbbbbbb\=find_limits_noheap
aaaaaaaaaaaaaz\=find_limits_noheap
!((?:\s|//.*\\n|/[*](?:\\n|.)*?[*]/)*)!I
/* this is a C style comment */\=find_limits_noheap
/^(?>a)++/
aa\=find_limits_noheap
aaaaaaaaa\=find_limits_noheap
/(a)(?1)++/
aa\=find_limits_noheap
aaaaaaaaa\=find_limits_noheap
/a(?:.)*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
/a(?:.(*THEN))*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
/a(?:.(*THEN:ABC))*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
/^(?>a+)(?>b+)(?>c+)(?>d+)(?>e+)/
aabbccddee\=find_limits_noheap
/^(?>(a+))(?>(b+))(?>(c+))(?>(d+))(?>(e+))/
aabbccddee\=find_limits_noheap
/^(?>(a+))(?>b+)(?>(c+))(?>d+)(?>(e+))/
aabbccddee\=find_limits_noheap
/(*LIMIT_MATCH=12bc)abc/
/(*LIMIT_MATCH=4294967290)abc/
/(*LIMIT_DEPTH=4294967280)abc/I
/(a+)*zz/
\= Expect no match
aaaaaaaaaaaaaz
\= Expect limit exceeded
aaaaaaaaaaaaaz\=match_limit=3000
/(a+)*zz/
\= Expect limit exceeded
aaaaaaaaaaaaaz\=depth_limit=10
/(*LIMIT_MATCH=3000)(a+)*zz/I
\= Expect limit exceeded
aaaaaaaaaaaaaz
\= Expect limit exceeded
aaaaaaaaaaaaaz\=match_limit=60000
/(*LIMIT_MATCH=60000)(*LIMIT_MATCH=3000)(a+)*zz/I
\= Expect limit exceeded
aaaaaaaaaaaaaz
/(*LIMIT_MATCH=60000)(a+)*zz/I
\= Expect no match
aaaaaaaaaaaaaz
\= Expect limit exceeded
aaaaaaaaaaaaaz\=match_limit=3000
/(*LIMIT_DEPTH=10)(a+)*zz/I
\= Expect limit exceeded
aaaaaaaaaaaaaz
\= Expect limit exceeded
aaaaaaaaaaaaaz\=depth_limit=1000
/(*LIMIT_DEPTH=10)(*LIMIT_DEPTH=1000)(a+)*zz/I
\= Expect no match
aaaaaaaaaaaaaz
/(*LIMIT_DEPTH=1000)(a+)*zz/I
\= Expect no match
aaaaaaaaaaaaaz
\= Expect limit exceeded
aaaaaaaaaaaaaz\=depth_limit=10
# These three have infinitely nested recursions.
/((?2))((?1))/
abc
/((?(R2)a+|(?1)b))()/
aaaabcde
/(?(R)a*(?1)|((?R))b)/
aaaabcde
# The allusedtext modifier does not work with JIT, which does not maintain
# the leftchar/rightchar data.
/abc(?=xyz)/allusedtext
abcxyzpqr
abcxyzpqr\=aftertext
/(?<=pqr)abc(?=xyz)/allusedtext
xyzpqrabcxyzpqr
xyzpqrabcxyzpqr\=aftertext
/a\b/
a.\=allusedtext
a\=allusedtext
/abc\Kxyz/
abcxyz\=allusedtext
/abc(?=xyz(*ACCEPT))/
abcxyz\=allusedtext
/abc(?=abcde)(?=ab)/allusedtext
abcabcdefg
#subject allusedtext
/(?<=abc)123/
xyzabc123pqr
xyzabc12\=ps
xyzabc12\=ph
/\babc\b/
+++abc+++
+++ab\=ps
+++ab\=ph
/(?<=abc)def/
abc\=ph
/(?<=123)(*MARK:xx)abc/mark
xxxx123a\=ph
xxxx123a\=ps
/(?<=(?<=a)b)c.*/I
abc\=ph
\= Expect no match
xbc\=ph
/(?<=ab)c.*/I
abc\=ph
\= Expect no match
xbc\=ph
/abc(?<=bc)def/
xxxabcd\=ph
/(?<=ab)cdef/
xxabcd\=ph
/(?<=(?<=(?<=a)b)c)./I
123abcXYZ
/(?<=ab(cd(?<=...)))./I
abcdX
/(?<=ab((?<=...)cd))./I
ZabcdX
/(?<=((?<=(?<=ab).))(?1)(?1))./I
abxZ
#subject
# -------------------------------------------------------------------
# These tests provoke recursion loops, which give a different error message
# when JIT is used.
/(?R)/I
abcd
/(a|(?R))/I
abcd
defg
/(ab|(bc|(de|(?R))))/I
abcd
fghi
/(ab|(bc|(de|(?1))))/I
abcd
fghi
/x(ab|(bc|(de|(?1)x)x)x)/I
xab123
xfghi
/(?!\w)(?R)/
abcd
=abc
/(?=\w)(?R)/
=abc
abcd
/(?<!\w)(?R)/
abcd
/(?<=\w)(?R)/
abcd
/(a+|(?R)b)/
aaa
bbb
/[^\xff]((?1))/BI
abcd
# These tests don't behave the same with JIT
/\w+(?C1)/BI,no_auto_possess
abc\=callout_fail=1
/(*NO_AUTO_POSSESS)\w+(?C1)/BI
abc\=callout_fail=1
# This test breaks the JIT stack limit
/(|]+){2,2452}/
(|]+){2,2452}
/b(?<!ax)(?!cx)/allusedtext
abc
abcz
# This test triggers the recursion limit in the interpreter, but completes in
# JIT. It's in testinput2 with disable_recurse_loop_check to get it to work
# in the interpreter.
/(a(?1)z||(?1)++)$/
abcd
# End of testinput15

9
3rd/pcre2/testdata/testinput16 vendored Normal file

@@ -0,0 +1,9 @@
# This test is run only when JIT support is not available. It checks that an
# attempt to use it has the expected behaviour. It also tests things that
# are different without JIT.
/abc/I,jit,jitverify
/a*/I
# End of testinput16

316
3rd/pcre2/testdata/testinput17 vendored Normal file

File diff suppressed because one or more lines are too long

147
3rd/pcre2/testdata/testinput18 vendored Normal file

@@ -0,0 +1,147 @@
# This set of tests is run only with the 8-bit library. It tests the POSIX
# interface, which is supported only with the 8-bit library. This test should
# not be run with JIT (which is not available for the POSIX interface).
#forbid_utf
#pattern posix
# Test some invalid options
/abc/auto_callout
/abc/
abc\=find_limits
/abc/
abc\=partial_hard
/a(())bc/parens_nest_limit=1
/abc/allow_surrogate_escapes,max_pattern_length=2
# Real tests
/abc/
abc
/^abc|def/
abcdef
abcdef\=notbol
/.*((abc)$|(def))/
defabc
defabc\=noteol
/the quick brown fox/
the quick brown fox
\= Expect no match
The Quick Brown Fox
/the quick brown fox/i
the quick brown fox
The Quick Brown Fox
/(*LF)abc.def/
\= Expect no match
abc\ndef
/(*LF)abc$/
abc
abc\n
/(abc)\2/
/(abc\1)/
\= Expect no match
abc
/a*(b+)(z)(z)/
aaaabbbbzzzz
aaaabbbbzzzz\=ovector=0
aaaabbbbzzzz\=ovector=1
aaaabbbbzzzz\=ovector=2
/(*ANY)ab.cd/
ab-cd
ab=cd
\= Expect no match
ab\ncd
/ab.cd/s
ab-cd
ab=cd
ab\ncd
/a(b)c/posix_nosub
abc
/a(?P<name>b)c/posix_nosub
abc
/(a)\1/posix_nosub
zaay
/a?|b?/
abc
\= Expect no match
ddd\=notempty
/\w+A/
CDAAAAB
/\w+A/ungreedy
CDAAAAB
/\Biss\B/I,aftertext
Mississippi
/abc/\
"(?(?C)"
"(?(?C))"
/abcd/substitute_extended
/\[A]{1000000}**/expand,regerror_buffsize=31
/\[A]{1000000}**/expand,regerror_buffsize=32
//posix_nosub
\=offset=70000
/^d(e)$/posix
acdef\=posix_startend=2:4
acde\=posix_startend=2
\= Expect no match
acdef
acdef\=posix_startend=2
/^a\x{00}b$/posix
a\x{00}b\=posix_startend=0:3
/"A" 00 "B"/hex
A\x{00}B\=posix_startend=0:3
/ABC/use_length
ABC
/a\b(c/literal,posix
a\\b(c
/a\b(c/literal,posix,dotall
/((a)(b)?(c))/posix
123ace
123ace\=posix_startend=2:6
//posix
\= Expect errors
\=null_subject
abc\=null_subject
/(*LIMIT_HEAP=0)xx/posix
\= Expect error
xxxx
# End of testdata/testinput18

25
3rd/pcre2/testdata/testinput19 vendored Normal file

@@ -0,0 +1,25 @@
# This set of tests is run only with the 8-bit library. It tests the POSIX
# interface with UTF/UCP support, which is supported only with the 8-bit
# library. This test should not be run with JIT (which is not available for the
# POSIX interface).
#pattern posix
/a\x{1234}b/utf
a\x{1234}b
/\w/
\= Expect no match
+++\x{c2}
/\w/ucp
+++\x{c2}
/"^AB" 00 "\x{1234}$"/hex,utf
AB\x{00}\x{1234}\=posix_startend=0:6
/\w/utf
\= Expect UTF error
A\xabB
# End of testdata/testinput19

7771
3rd/pcre2/testdata/testinput2 vendored Normal file

File diff suppressed because it is too large

108
3rd/pcre2/testdata/testinput20 vendored Normal file

@@ -0,0 +1,108 @@
# This set of tests exercises the serialization/deserialization and code copy
# functions in the library. It does not use UTF or JIT.
#forbid_utf
# Compile several patterns, push them onto the stack, and then write them
# all to a file.
#pattern push
/(?<NAME>(?&NAME_PAT))\s+(?<ADDR>(?&ADDRESS_PAT))
(?(DEFINE)
(?<NAME_PAT>[a-z]+)
(?<ADDRESS_PAT>\d+)
)/x
/^(?:((.)(?1)\2|)|((.)(?3)\4|.))$/i
#save testsaved1
# Do it again for some more patterns.
/(*MARK:A)(*SKIP:B)(C|X)/mark
/(?:(?<n>foo)|(?<n>bar))\k<n>/dupnames
#save testsaved2
#pattern -push
# Reload the patterns, then pop them one by one and check them.
#load testsaved1
#load testsaved2
#pop info
foofoo
barbar
#pop mark
C
\= Expect no match
D
#pop
AmanaplanacanalPanama
#pop info
metcalfe 33
# Check for an error when different tables are used.
/abc/push,tables=1
/xyz/push,tables=2
#save testsaved1
#pop
xyz
#pop
abc
#pop should give an error
pqr
/abcd/pushcopy
abcd
#pop
abcd
#pop should give an error
/abcd/push
#popcopy
abcd
#pop
abcd
/abcd/push
#save testsaved1
#pop should give an error
#load testsaved1
#popcopy
abcd
#pop
abcd
#pop should give an error
/abcd/pushtablescopy
abcd
#popcopy
abcd
#pop
abcd
# Must only specify one of these
//push,pushcopy
//push,pushtablescopy
//pushcopy,pushtablescopy
# End of testinput20

18
3rd/pcre2/testdata/testinput21 vendored Normal file

@@ -0,0 +1,18 @@
# These are tests of \C that do not involve UTF. They are not run when \C is
# disabled by compiling with --enable-never-backslash-C.
/\C+\D \C+\d \C+\S \C+\s \C+\W \C+\w \C+. \C+\R \C+\H \C+\h \C+\V \C+\v \C+\Z \C+\z \C+$/Bx
/\D+\C \d+\C \S+\C \s+\C \W+\C \w+\C .+\C \R+\C \H+\C \h+\C \V+\C \v+\C a+\C \n+\C \C+\C/Bx
/ab\Cde/never_backslash_c
/ab\Cde/info
abXde
/(?<=ab\Cde)X/
abZdeX
/[\C]/
# End of testinput21

107
3rd/pcre2/testdata/testinput22 vendored Normal file

@@ -0,0 +1,107 @@
# Tests of \C when Unicode support is available. Note that \C is not supported
# for DFA matching in UTF mode, so this test is not run with -dfa. The output
# of this test is different in 8-, 16-, and 32-bit modes. Some tests may match
# in some widths and not in others.
/ab\Cde/utf,info
abXde
# This should produce an error diagnostic (\C in UTF lookbehind) in 8-bit and
# 16-bit modes, but not in 32-bit mode.
/(?<=ab\Cde)X/utf
ab!deXYZ
# Autopossessification tests
/\C+\X \X+\C/Bx
/\C+\X \X+\C/Bx,utf
/\C\X*TӅ;
{0,6}\v+
F
/utf
\= Expect no match
Ӆ\x0a
/\C(\W?ſ)'?{{/utf
\= Expect no match
\\C(\\W?ſ)'?{{
/X(\C{3})/utf
X\x{1234}
X\x{11234}Y
X\x{11234}YZ
/X(\C{4})/utf
X\x{1234}YZ
X\x{11234}YZ
X\x{11234}YZW
/X\C*/utf
XYZabcdce
/X\C*?/utf
XYZabcde
/X\C{3,5}/utf
Xabcdefg
X\x{1234}
X\x{1234}YZ
X\x{1234}\x{512}
X\x{1234}\x{512}YZ
X\x{11234}Y
X\x{11234}YZ
X\x{11234}\x{512}
X\x{11234}\x{512}YZ
X\x{11234}\x{512}\x{11234}Z
/X\C{3,5}?/utf
Xabcdefg
X\x{1234}
X\x{1234}YZ
X\x{1234}\x{512}
X\x{11234}Y
X\x{11234}YZ
X\x{11234}\x{512}YZ
X\x{11234}
/a\Cb/utf
aXb
a\nb
a\x{100}b
/a\C\Cb/utf
a\x{100}b
a\x{12257}b
a\x{12257}\x{11234}b
/ab\Cde/utf
abXde
# This one is here not because it's different to Perl, but because the way
# the captured single code unit is displayed. (In Perl it becomes a character,
# and you can't tell the difference.)
/X(\C)(.*)/utf
X\x{1234}
X\nabc
# This one is here because Perl gives out a grumbly error message (quite
# correctly, but that messes up comparisons).
/a\Cb/utf
\= Expect no match in 8-bit mode
a\x{100}b
/^ab\C/utf,no_start_optimize
\= Expect no match - tests \C at end of subject
ab
/\C[^\v]+\x80/utf
[AΏBŀC]
/\C[^\d]+\x80/utf
[AΏBŀC]

9
3rd/pcre2/testdata/testinput23 vendored Normal file

@@ -0,0 +1,9 @@
# This test is run when PCRE2 has been built with --enable-never-backslash-C,
# which disables the use of \C. All we can do is check that it gives the
# correct error message.
/a\Cb/
/a[\C]b/
# End of testinput23

396
3rd/pcre2/testdata/testinput24 vendored Normal file

@@ -0,0 +1,396 @@
# This file tests the auxiliary pattern conversion features of the PCRE2
# library, in non-UTF mode.
#forbid_utf
#newline_default lf any anycrlf
# -------- Tests of glob conversion --------
# Set the glob separator explicitly so that different OS defaults are not a
# problem. Then test various errors.
#pattern convert=glob,convert_glob_escape=\,convert_glob_separator=/
/abc/posix
# Separator must be / \ or .
/a*b/convert_glob_separator=%
# Can't have separator in a class
"[ab/cd]"
"[,-/]"
/[ab/
# Length check
/abc/convert_length=11
/abc/convert_length=12
# Now some actual tests
/a?b[]xy]*c/
azb]1234c
# Tests from the gitwildmatch list, with some additions
/foo/
foo
/= Expect no match
bar
//
\
/???/
foo
\= Expect no match
foobar
/*/
foo
\
/f*/
foo
f
/*f/
oof
\= Expect no match
foo
/*foo*/
foo
food
aprilfool
/*ob*a*r*/
foobar
/*ab/
aaaaaaabababab
/foo\*/
foo*
/foo\*bar/
\= Expect no match
foobar
/f\\oo/
f\\oo
/*[al]?/
ball
/[ten]/
\= Expect no match
ten
/t[a-g]n/
ten
/a[]]b/
a]b
/a[]a-]b/
/a[]-]b/
a-b
a]b
\= Expect no match
aab
/a[]a-z]b/
aab
/]/
]
/t[!a-g]n/
ton
\= Expect no match
ten
'[[:alpha:]][[:digit:]][[:upper:]]'
a1B
'[[:digit:][:upper:][:space:]]'
A
1
\ \=
\= Expect no match
a
.
'[a-c[:digit:]x-z]'
5
b
y
\= Expect no match
q
# End of gitwildmatch tests
/*.j?g/
pic01.jpg
.jpg
pic02.jxg
\= Expect no match
pic03.j/g
/A[+-0]B/
A+B
A.B
A0B
\= Expect no match
A/B
/*x?z/
abc.xyz
\= Expect no match
.xyz
/?x?z/
axyz
\= Expect no match
.xyz
"[,-0]x?z"
,xyz
\= Expect no match
/xyz
.xyz
".x*"
.xabc
/a[--0]z/
a-z
a.z
a0z
\= Expect no match
a/z
a1z
/<[a-c-d]>/
<a>
<b>
<c>
<d>
<->
/a[[:digit:].]z/
a1z
a.z
\= Expect no match
a:z
/a[[:digit].]z/
a[.]z
a:.]z
ad.]z
/<[[:a[:digit:]b]>/
<[>
<:>
<a>
<9>
<b>
\= Expect no match
<d>
/a*b/convert_glob_separator=\
/a*b/convert_glob_separator=.
/a*b/convert_glob_separator=/
# Non control character checking
/A\B\\C\D/
/\\{}\?\*+\[\]()|.^$/
/*a*\/*b*/
/?a?\/?b?/
/[a\\b\c][]][-][\]\-]/
/[^a\\b\c][!]][!-][^\]\-]/
/[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:word:][:xdigit:]]/
"[/-/]"
/[-----]/
/[------]/
/[!------]/
/[[:alpha:]-a]/
/[[:alpha:]][[:punct:]][[:ascii:]]/
/[a-[:alpha:]]/
/[[:alpha:/
/[[:alpha:]/
/[[:alphaa:]]/
/[[:xdigi:]]/
/[[:xdigit::]]/
/****/
/**\/abc/
abc
x/abc
xabc
/abc\/**/
/abc\/**\/abc/
/**\/*a*b*g*n*t/
abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt
/**\/*a*\/**/
xx/xx/xx/xax/xx/xb
/**\/*a*/
xx/xx/xx/xax
xx/xx/xx/xax/xx
/**\/*a*\/**\/*b*/
xx/xx/xx/xax/xx/xb
xx/xx/xx/xax/xx/x
"**a"convert=glob
a
c/b/a
c/b/aaa
"a**/b"convert=glob
a/b
ab
"a/**b"convert=glob
a/b
ab
#pattern convert=glob:glob_no_starstar
/***/
/**a**/
#pattern convert=unset
#pattern convert=glob:glob_no_wild_separator
/*/
/*a*/
/**a**/
/a*b/
/*a*b*/
/??a??/
#pattern convert=unset
#pattern convert=glob,convert_glob_escape=0
/a\b\cd/
/**\/a/
/a`*b/convert_glob_escape=`
/a`*b/convert_glob_escape=0
/a`*b/convert_glob_escape=x
# -------- Tests of extended POSIX conversion --------
#pattern convert=unset:posix_extended
/<[[:a[:digit:]b]>/
<[>
<:>
<a>
<9>
<b>
\= Expect no match
<d>
/a+\1b\\c|d[ab\c]/
/<[]bc]>/
<]>
<b>
<c>
/<[^]bc]>/
<.>
\= Expect no match
<]>
<b>
/(a)\1b/
a1b
\= Expect no match
aab
/(ab)c)d]/
Xabc)d]Y
/a***b/
# -------- Tests of basic POSIX conversion --------
#pattern convert=unset:posix_basic
/a*b+c\+[def](ab)\(cd\)/
/\(a\)\1b/
aab
\= Expect no match
a1b
/how.to how\.to/
how\nto how.to
\= Expect no match
how\x{0}to how.to
/^how to \^how to/
/^*abc/
/*abc/
X*abcY
/**abc/
XabcY
X*abcY
X**abcY
/*ab\(*cd\)/
/^b\(c^d\)\(^e^f\)/
/a***b/
# End of testinput24

22
3rd/pcre2/testdata/testinput25 vendored Normal file

@@ -0,0 +1,22 @@
# This file tests the auxiliary pattern conversion features of the PCRE2
# library, in UTF mode.
#newline_default lf any anycrlf
# -------- Tests of glob conversion --------
# Set the glob separator explicitly so that different OS defaults are not a
# problem. Then test various errors.
#pattern convert=glob,convert_glob_escape=\,convert_glob_separator=/
# The fact that this one works in 13 bytes in the 8-bit library shows that the
# output is in UTF-8, though pcre2test shows the character as an escape.
/'>' c4 a3 '<'/hex,utf,convert_length=13
# This expansion creates a string that is too long for the input buffer.
/\[()]{65535}()/expand
# End of testinput25

2754
3rd/pcre2/testdata/testinput26 vendored Normal file

@@ -0,0 +1,2754 @@
# These tests were generated by maint/GenerateTest.py using PCRE2's UCP
# data, do not edit unless that data has changed and they are reflecting
# a previous version.
# Unicode Script Extension tests for version 15.0.0
#perltest
# Base script check
/^\p{sc=Latin}/utf
A
/^\p{Script=Latn}/utf
\x{1df2a}
# Script extension check
/^\p{Latin}/utf
\x{363}
/^\p{scx=Latn}/utf
\x{a92e}
# Script extension only character
/^\p{Latin}/utf
\x{363}
/^\p{sc=Latin}/utf
\x{363}
# Character not in script
/^\p{Latin}/utf
\x{1df2b}
# Base script check
/^\p{sc=Greek}/utf
\x{370}
/^\p{Script=Grek}/utf
\x{1d245}
# Script extension check
/^\p{Greek}/utf
\x{342}
/^\p{Script_Extensions=Grek}/utf
\x{1dc1}
# Script extension only character
/^\p{Greek}/utf
\x{342}
/^\p{sc=Greek}/utf
\x{342}
# Character not in script
/^\p{Greek}/utf
\x{1d246}
# Base script check
/^\p{sc=Cyrillic}/utf
\x{400}
/^\p{Script=Cyrl}/utf
\x{1e08f}
# Script extension check
/^\p{Cyrillic}/utf
\x{483}
/^\p{scx=Cyrl}/utf
\x{a66f}
# Script extension only character
/^\p{Cyrillic}/utf
\x{2e43}
/^\p{sc=Cyrillic}/utf
\x{2e43}
# Character not in script
/^\p{Cyrillic}/utf
\x{1e090}
# Base script check
/^\p{sc=Arabic}/utf
\x{600}
/^\p{Script=Arab}/utf
\x{1eef1}
# Script extension check
/^\p{Arabic}/utf
\x{60c}
/^\p{Script_Extensions=Arab}/utf
\x{102fb}
# Script extension only character
/^\p{Arabic}/utf
\x{102e0}
/^\p{sc=Arabic}/utf
\x{102e0}
# Character not in script
/^\p{Arabic}/utf
\x{1eef2}
# Base script check
/^\p{sc=Syriac}/utf
\x{700}
/^\p{Script=Syrc}/utf
\x{86a}
# Script extension check
/^\p{Syriac}/utf
\x{60c}
/^\p{scx=Syrc}/utf
\x{1dfa}
# Script extension only character
/^\p{Syriac}/utf
\x{1dfa}
/^\p{sc=Syriac}/utf
\x{1dfa}
# Character not in script
/^\p{Syriac}/utf
\x{1dfb}
# Base script check
/^\p{sc=Thaana}/utf
\x{780}
/^\p{Script=Thaa}/utf
\x{7b1}
# Script extension check
/^\p{Thaana}/utf
\x{60c}
/^\p{Script_Extensions=Thaa}/utf
\x{fdfd}
# Script extension only character
/^\p{Thaana}/utf
\x{fdf2}
/^\p{sc=Thaana}/utf
\x{fdf2}
# Character not in script
/^\p{Thaana}/utf
\x{fdfe}
# Base script check
/^\p{sc=Devanagari}/utf
\x{900}
/^\p{Script=Deva}/utf
\x{11b09}
# Script extension check
/^\p{Devanagari}/utf
\x{951}
/^\p{scx=Deva}/utf
\x{a8f3}
# Script extension only character
/^\p{Devanagari}/utf
\x{1cd1}
/^\p{sc=Devanagari}/utf
\x{1cd1}
# Character not in script
/^\p{Devanagari}/utf
\x{11b0a}
# Base script check
/^\p{sc=Bengali}/utf
\x{980}
/^\p{Script=Beng}/utf
\x{9fe}
# Script extension check
/^\p{Bengali}/utf
\x{951}
/^\p{Script_Extensions=Beng}/utf
\x{a8f1}
# Script extension only character
/^\p{Bengali}/utf
\x{1cf7}
/^\p{sc=Bengali}/utf
\x{1cf7}
# Character not in script
/^\p{Bengali}/utf
\x{a8f2}
# Base script check
/^\p{sc=Gurmukhi}/utf
\x{a01}
/^\p{Script=Guru}/utf
\x{a76}
# Script extension check
/^\p{Gurmukhi}/utf
\x{951}
/^\p{scx=Guru}/utf
\x{a839}
# Script extension only character
/^\p{Gurmukhi}/utf
\x{a836}
/^\p{sc=Gurmukhi}/utf
\x{a836}
# Character not in script
/^\p{Gurmukhi}/utf
\x{a83a}
# Base script check
/^\p{sc=Gujarati}/utf
\x{a81}
/^\p{Script=Gujr}/utf
\x{aff}
# Script extension check
/^\p{Gujarati}/utf
\x{951}
/^\p{Script_Extensions=Gujr}/utf
\x{a839}
# Script extension only character
/^\p{Gujarati}/utf
\x{a836}
/^\p{sc=Gujarati}/utf
\x{a836}
# Character not in script
/^\p{Gujarati}/utf
\x{a83a}
# Base script check
/^\p{sc=Oriya}/utf
\x{b01}
/^\p{Script=Orya}/utf
\x{b77}
# Script extension check
/^\p{Oriya}/utf
\x{951}
/^\p{scx=Orya}/utf
\x{1cf2}
# Script extension only character
/^\p{Oriya}/utf
\x{1cda}
/^\p{sc=Oriya}/utf
\x{1cda}
# Character not in script
/^\p{Oriya}/utf
\x{1cf3}
# Base script check
/^\p{sc=Tamil}/utf
\x{b82}
/^\p{Script=Taml}/utf
\x{11fff}
# Script extension check
/^\p{Tamil}/utf
\x{951}
/^\p{Script_Extensions=Taml}/utf
\x{11fd3}
# Script extension only character
/^\p{Tamil}/utf
\x{a8f3}
/^\p{sc=Tamil}/utf
\x{a8f3}
# Character not in script
/^\p{Tamil}/utf
\x{12000}
# Base script check
/^\p{sc=Telugu}/utf
\x{c00}
/^\p{Script=Telu}/utf
\x{c7f}
# Script extension check
/^\p{Telugu}/utf
\x{951}
/^\p{scx=Telu}/utf
\x{1cf2}
# Script extension only character
/^\p{Telugu}/utf
\x{1cda}
/^\p{sc=Telugu}/utf
\x{1cda}
# Character not in script
/^\p{Telugu}/utf
\x{1cf3}
# Base script check
/^\p{sc=Kannada}/utf
\x{c80}
/^\p{Script=Knda}/utf
\x{cf3}
# Script extension check
/^\p{Kannada}/utf
\x{951}
/^\p{Script_Extensions=Knda}/utf
\x{a835}
# Script extension only character
/^\p{Kannada}/utf
\x{1cf4}
/^\p{sc=Kannada}/utf
\x{1cf4}
# Character not in script
/^\p{Kannada}/utf
\x{a836}
# Base script check
/^\p{sc=Malayalam}/utf
\x{d00}
/^\p{Script=Mlym}/utf
\x{d7f}
# Script extension check
/^\p{Malayalam}/utf
\x{951}
/^\p{scx=Mlym}/utf
\x{a832}
# Script extension only character
/^\p{Malayalam}/utf
\x{1cda}
/^\p{sc=Malayalam}/utf
\x{1cda}
# Character not in script
/^\p{Malayalam}/utf
\x{a833}
# Base script check
/^\p{sc=Sinhala}/utf
\x{d81}
/^\p{Script=Sinh}/utf
\x{111f4}
# Script extension check
/^\p{Sinhala}/utf
\x{964}
/^\p{Script_Extensions=Sinh}/utf
\x{965}
# Script extension only character
/^\p{Sinhala}/utf
\x{964}
/^\p{sc=Sinhala}/utf
\x{964}
# Character not in script
/^\p{Sinhala}/utf
\x{111f5}
# Base script check
/^\p{sc=Myanmar}/utf
\x{1000}
/^\p{Script=Mymr}/utf
\x{aa7f}
# Script extension check
/^\p{Myanmar}/utf
\x{1040}
/^\p{scx=Mymr}/utf
\x{a92e}
# Script extension only character
/^\p{Myanmar}/utf
\x{a92e}
/^\p{sc=Myanmar}/utf
\x{a92e}
# Character not in script
/^\p{Myanmar}/utf
\x{aa80}
# Base script check
/^\p{sc=Georgian}/utf
\x{10a0}
/^\p{Script=Geor}/utf
\x{2d2d}
# Script extension check
/^\p{Georgian}/utf
\x{10fb}
/^\p{Script_Extensions=Geor}/utf
\x{10fb}
# Script extension only character
/^\p{Georgian}/utf
\x{10fb}
/^\p{sc=Georgian}/utf
\x{10fb}
# Character not in script
/^\p{Georgian}/utf
\x{2d2e}
# Base script check
/^\p{sc=Hangul}/utf
\x{1100}
/^\p{Script=Hang}/utf
\x{ffdc}
# Script extension check
/^\p{Hangul}/utf
\x{3001}
/^\p{scx=Hang}/utf
\x{ff65}
# Script extension only character
/^\p{Hangul}/utf
\x{3003}
/^\p{sc=Hangul}/utf
\x{3003}
# Character not in script
/^\p{Hangul}/utf
\x{ffdd}
# Base script check
/^\p{sc=Mongolian}/utf
\x{1800}
/^\p{Script=Mong}/utf
\x{1166c}
# Script extension check
/^\p{Mongolian}/utf
\x{1802}
/^\p{Script_Extensions=Mong}/utf
\x{202f}
# Script extension only character
/^\p{Mongolian}/utf
\x{202f}
/^\p{sc=Mongolian}/utf
\x{202f}
# Character not in script
/^\p{Mongolian}/utf
\x{1166d}
# Base script check
/^\p{sc=Hiragana}/utf
\x{3041}
/^\p{Script=Hira}/utf
\x{1f200}
# Script extension check
/^\p{Hiragana}/utf
\x{3001}
/^\p{scx=Hira}/utf
\x{ff9f}
# Script extension only character
/^\p{Hiragana}/utf
\x{3031}
/^\p{sc=Hiragana}/utf
\x{3031}
# Character not in script
/^\p{Hiragana}/utf
\x{1f201}
# Base script check
/^\p{sc=Katakana}/utf
\x{30a1}
/^\p{Script=Kana}/utf
\x{1b167}
# Script extension check
/^\p{Katakana}/utf
\x{3001}
/^\p{Script_Extensions=Kana}/utf
\x{ff9f}
# Script extension only character
/^\p{Katakana}/utf
\x{3031}
/^\p{sc=Katakana}/utf
\x{3031}
# Character not in script
/^\p{Katakana}/utf
\x{1b168}
# Base script check
/^\p{sc=Bopomofo}/utf
\x{2ea}
/^\p{Script=Bopo}/utf
\x{31bf}
# Script extension check
/^\p{Bopomofo}/utf
\x{3001}
/^\p{scx=Bopo}/utf
\x{ff65}
# Script extension only character
/^\p{Bopomofo}/utf
\x{302a}
/^\p{sc=Bopomofo}/utf
\x{302a}
# Character not in script
/^\p{Bopomofo}/utf
\x{ff66}
# Base script check
/^\p{sc=Han}/utf
\x{2e80}
/^\p{Script=Hani}/utf
\x{323af}
# Script extension check
/^\p{Han}/utf
\x{3001}
/^\p{Script_Extensions=Hani}/utf
\x{1f251}
# Script extension only character
/^\p{Han}/utf
\x{3006}
/^\p{sc=Han}/utf
\x{3006}
# Character not in script
/^\p{Han}/utf
\x{323b0}
# Base script check
/^\p{sc=Yi}/utf
\x{a000}
/^\p{Script=Yiii}/utf
\x{a4c6}
# Script extension check
/^\p{Yi}/utf
\x{3001}
/^\p{scx=Yiii}/utf
\x{ff65}
# Script extension only character
/^\p{Yi}/utf
\x{3001}
/^\p{sc=Yi}/utf
\x{3001}
# Character not in script
/^\p{Yi}/utf
\x{ff66}
# Base script check
/^\p{sc=Tagalog}/utf
\x{1700}
/^\p{Script=Tglg}/utf
\x{171f}
# Script extension check
/^\p{Tagalog}/utf
\x{1735}
/^\p{Script_Extensions=Tglg}/utf
\x{1736}
# Script extension only character
/^\p{Tagalog}/utf
\x{1735}
/^\p{sc=Tagalog}/utf
\x{1735}
# Character not in script
/^\p{Tagalog}/utf
\x{1737}
# Base script check
/^\p{sc=Hanunoo}/utf
\x{1720}
/^\p{Script=Hano}/utf
\x{1734}
# Script extension check
/^\p{Hanunoo}/utf
\x{1735}
/^\p{scx=Hano}/utf
\x{1736}
# Script extension only character
/^\p{Hanunoo}/utf
\x{1735}
/^\p{sc=Hanunoo}/utf
\x{1735}
# Character not in script
/^\p{Hanunoo}/utf
\x{1737}
# Base script check
/^\p{sc=Buhid}/utf
\x{1740}
/^\p{Script=Buhd}/utf
\x{1753}
# Script extension check
/^\p{Buhid}/utf
\x{1735}
/^\p{Script_Extensions=Buhd}/utf
\x{1736}
# Script extension only character
/^\p{Buhid}/utf
\x{1735}
/^\p{sc=Buhid}/utf
\x{1735}
# Character not in script
/^\p{Buhid}/utf
\x{1754}
# Base script check
/^\p{sc=Tagbanwa}/utf
\x{1760}
/^\p{Script=Tagb}/utf
\x{1773}
# Script extension check
/^\p{Tagbanwa}/utf
\x{1735}
/^\p{scx=Tagb}/utf
\x{1736}
# Script extension only character
/^\p{Tagbanwa}/utf
\x{1735}
/^\p{sc=Tagbanwa}/utf
\x{1735}
# Character not in script
/^\p{Tagbanwa}/utf
\x{1774}
# Base script check
/^\p{sc=Limbu}/utf
\x{1900}
/^\p{Script=Limb}/utf
\x{194f}
# Script extension check
/^\p{Limbu}/utf
\x{965}
/^\p{Script_Extensions=Limb}/utf
\x{965}
# Script extension only character
/^\p{Limbu}/utf
\x{965}
/^\p{sc=Limbu}/utf
\x{965}
# Character not in script
/^\p{Limbu}/utf
\x{1950}
# Base script check
/^\p{sc=Tai_Le}/utf
\x{1950}
/^\p{Script=Tale}/utf
\x{1974}
# Script extension check
/^\p{Tai_Le}/utf
\x{1040}
/^\p{scx=Tale}/utf
\x{1049}
# Script extension only character
/^\p{Tai_Le}/utf
\x{1040}
/^\p{sc=Tai_Le}/utf
\x{1040}
# Character not in script
/^\p{Tai_Le}/utf
\x{1975}
# Base script check
/^\p{sc=Linear_B}/utf
\x{10000}
/^\p{Script=Linb}/utf
\x{100fa}
# Script extension check
/^\p{Linear_B}/utf
\x{10100}
/^\p{Script_Extensions=Linb}/utf
\x{1013f}
# Script extension only character
/^\p{Linear_B}/utf
\x{10102}
/^\p{sc=Linear_B}/utf
\x{10102}
# Character not in script
/^\p{Linear_B}/utf
\x{10140}
# Base script check
/^\p{sc=Cypriot}/utf
\x{10800}
/^\p{Script=Cprt}/utf
\x{1083f}
# Script extension check
/^\p{Cypriot}/utf
\x{10100}
/^\p{scx=Cprt}/utf
\x{1013f}
# Script extension only character
/^\p{Cypriot}/utf
\x{10102}
/^\p{sc=Cypriot}/utf
\x{10102}
# Character not in script
/^\p{Cypriot}/utf
\x{10840}
# Base script check
/^\p{sc=Buginese}/utf
\x{1a00}
/^\p{Script=Bugi}/utf
\x{1a1f}
# Script extension check
/^\p{Buginese}/utf
\x{a9cf}
/^\p{Script_Extensions=Bugi}/utf
\x{a9cf}
# Script extension only character
/^\p{Buginese}/utf
\x{a9cf}
/^\p{sc=Buginese}/utf
\x{a9cf}
# Character not in script
/^\p{Buginese}/utf
\x{a9d0}
# Base script check
/^\p{sc=Coptic}/utf
\x{3e2}
/^\p{Script=Copt}/utf
\x{2cff}
# Script extension check
/^\p{Coptic}/utf
\x{102e0}
/^\p{scx=Copt}/utf
\x{102fb}
# Script extension only character
/^\p{Coptic}/utf
\x{102e0}
/^\p{sc=Coptic}/utf
\x{102e0}
# Character not in script
/^\p{Coptic}/utf
\x{102fc}
# Base script check
/^\p{sc=Glagolitic}/utf
\x{2c00}
/^\p{Script=Glag}/utf
\x{1e02a}
# Script extension check
/^\p{Glagolitic}/utf
\x{484}
/^\p{Script_Extensions=Glag}/utf
\x{a66f}
# Script extension only character
/^\p{Glagolitic}/utf
\x{484}
/^\p{sc=Glagolitic}/utf
\x{484}
# Character not in script
/^\p{Glagolitic}/utf
\x{1e02b}
# Base script check
/^\p{sc=Syloti_Nagri}/utf
\x{a800}
/^\p{Script=Sylo}/utf
\x{a82c}
# Script extension check
/^\p{Syloti_Nagri}/utf
\x{964}
/^\p{scx=Sylo}/utf
\x{9ef}
# Script extension only character
/^\p{Syloti_Nagri}/utf
\x{9e6}
/^\p{sc=Syloti_Nagri}/utf
\x{9e6}
# Character not in script
/^\p{Syloti_Nagri}/utf
\x{a82d}
# Base script check
/^\p{sc=Phags_Pa}/utf
\x{a840}
/^\p{Script=Phag}/utf
\x{a877}
# Script extension check
/^\p{Phags_Pa}/utf
\x{1802}
/^\p{Script_Extensions=Phag}/utf
\x{1805}
# Script extension only character
/^\p{Phags_Pa}/utf
\x{1802}
/^\p{sc=Phags_Pa}/utf
\x{1802}
# Character not in script
/^\p{Phags_Pa}/utf
\x{a878}
# Base script check
/^\p{sc=Nko}/utf
\x{7c0}
/^\p{Script=Nkoo}/utf
\x{7ff}
# Script extension check
/^\p{Nko}/utf
\x{60c}
/^\p{scx=Nkoo}/utf
\x{fd3f}
# Script extension only character
/^\p{Nko}/utf
\x{fd3e}
/^\p{sc=Nko}/utf
\x{fd3e}
# Character not in script
/^\p{Nko}/utf
\x{fd40}
# Base script check
/^\p{sc=Kayah_Li}/utf
\x{a900}
/^\p{Script=Kali}/utf
\x{a92f}
# Script extension check
/^\p{Kayah_Li}/utf
\x{a92e}
/^\p{Script_Extensions=Kali}/utf
\x{a92e}
# Script extension only character
/^\p{Kayah_Li}/utf
\x{a92e}
/^\p{sc=Kayah_Li}/utf
\x{a92e}
# Character not in script
/^\p{Kayah_Li}/utf
\x{a930}
# Base script check
/^\p{sc=Javanese}/utf
\x{a980}
/^\p{Script=Java}/utf
\x{a9df}
# Script extension check
/^\p{Javanese}/utf
\x{a9cf}
/^\p{scx=Java}/utf
\x{a9cf}
# Script extension only character
/^\p{Javanese}/utf
\x{a9cf}
/^\p{sc=Javanese}/utf
\x{a9cf}
# Character not in script
/^\p{Javanese}/utf
\x{a9e0}
# Base script check
/^\p{sc=Kaithi}/utf
\x{11080}
/^\p{Script=Kthi}/utf
\x{110cd}
# Script extension check
/^\p{Kaithi}/utf
\x{966}
/^\p{Script_Extensions=Kthi}/utf
\x{a839}
# Script extension only character
/^\p{Kaithi}/utf
\x{966}
/^\p{sc=Kaithi}/utf
\x{966}
# Character not in script
/^\p{Kaithi}/utf
\x{110ce}
# Base script check
/^\p{sc=Mandaic}/utf
\x{840}
/^\p{Script=Mand}/utf
\x{85e}
# Script extension check
/^\p{Mandaic}/utf
\x{640}
/^\p{scx=Mand}/utf
\x{640}
# Script extension only character
/^\p{Mandaic}/utf
\x{640}
/^\p{sc=Mandaic}/utf
\x{640}
# Character not in script
/^\p{Mandaic}/utf
\x{85f}
# Base script check
/^\p{sc=Chakma}/utf
\x{11100}
/^\p{Script=Cakm}/utf
\x{11147}
# Script extension check
/^\p{Chakma}/utf
\x{9e6}
/^\p{Script_Extensions=Cakm}/utf
\x{1049}
# Script extension only character
/^\p{Chakma}/utf
\x{9e6}
/^\p{sc=Chakma}/utf
\x{9e6}
# Character not in script
/^\p{Chakma}/utf
\x{11148}
# Base script check
/^\p{sc=Sharada}/utf
\x{11180}
/^\p{Script=Shrd}/utf
\x{111df}
# Script extension check
/^\p{Sharada}/utf
\x{951}
/^\p{scx=Shrd}/utf
\x{1ce0}
# Script extension only character
/^\p{Sharada}/utf
\x{1cd7}
/^\p{sc=Sharada}/utf
\x{1cd7}
# Character not in script
/^\p{Sharada}/utf
\x{111e0}
# Base script check
/^\p{sc=Takri}/utf
\x{11680}
/^\p{Script=Takr}/utf
\x{116c9}
# Script extension check
/^\p{Takri}/utf
\x{964}
/^\p{Script_Extensions=Takr}/utf
\x{a839}
# Script extension only character
/^\p{Takri}/utf
\x{a836}
/^\p{sc=Takri}/utf
\x{a836}
# Character not in script
/^\p{Takri}/utf
\x{116ca}
# Base script check
/^\p{sc=Duployan}/utf
\x{1bc00}
/^\p{Script=Dupl}/utf
\x{1bc9f}
# Script extension check
/^\p{Duployan}/utf
\x{1bca0}
/^\p{scx=Dupl}/utf
\x{1bca3}
# Script extension only character
/^\p{Duployan}/utf
\x{1bca0}
/^\p{sc=Duployan}/utf
\x{1bca0}
# Character not in script
/^\p{Duployan}/utf
\x{1bca4}
# Base script check
/^\p{sc=Grantha}/utf
\x{11300}
/^\p{Script=Gran}/utf
\x{11374}
# Script extension check
/^\p{Grantha}/utf
\x{951}
/^\p{Script_Extensions=Gran}/utf
\x{11fd3}
# Script extension only character
/^\p{Grantha}/utf
\x{1cd3}
/^\p{sc=Grantha}/utf
\x{1cd3}
# Character not in script
/^\p{Grantha}/utf
\x{11fd4}
# Base script check
/^\p{sc=Khojki}/utf
\x{11200}
/^\p{Script=Khoj}/utf
\x{11241}
# Script extension check
/^\p{Khojki}/utf
\x{ae6}
/^\p{scx=Khoj}/utf
\x{a839}
# Script extension only character
/^\p{Khojki}/utf
\x{ae6}
/^\p{sc=Khojki}/utf
\x{ae6}
# Character not in script
/^\p{Khojki}/utf
\x{11242}
# Base script check
/^\p{sc=Linear_A}/utf
\x{10600}
/^\p{Script=Lina}/utf
\x{10767}
# Script extension check
/^\p{Linear_A}/utf
\x{10107}
/^\p{Script_Extensions=Lina}/utf
\x{10133}
# Script extension only character
/^\p{Linear_A}/utf
\x{10107}
/^\p{sc=Linear_A}/utf
\x{10107}
# Character not in script
/^\p{Linear_A}/utf
\x{10768}
# Base script check
/^\p{sc=Mahajani}/utf
\x{11150}
/^\p{Script=Mahj}/utf
\x{11176}
# Script extension check
/^\p{Mahajani}/utf
\x{964}
/^\p{scx=Mahj}/utf
\x{a839}
# Script extension only character
/^\p{Mahajani}/utf
\x{966}
/^\p{sc=Mahajani}/utf
\x{966}
# Character not in script
/^\p{Mahajani}/utf
\x{11177}
# Base script check
/^\p{sc=Manichaean}/utf
\x{10ac0}
/^\p{Script=Mani}/utf
\x{10af6}
# Script extension check
/^\p{Manichaean}/utf
\x{640}
/^\p{Script_Extensions=Mani}/utf
\x{10af2}
# Script extension only character
/^\p{Manichaean}/utf
\x{640}
/^\p{sc=Manichaean}/utf
\x{640}
# Character not in script
/^\p{Manichaean}/utf
\x{10af7}
# Base script check
/^\p{sc=Modi}/utf
\x{11600}
/^\p{Script=Modi}/utf
\x{11659}
# Script extension check
/^\p{Modi}/utf
\x{a830}
/^\p{scx=Modi}/utf
\x{a839}
# Script extension only character
/^\p{Modi}/utf
\x{a836}
/^\p{sc=Modi}/utf
\x{a836}
# Character not in script
/^\p{Modi}/utf
\x{1165a}
# Base script check
/^\p{sc=Old_Permic}/utf
\x{10350}
/^\p{Script=Perm}/utf
\x{1037a}
# Script extension check
/^\p{Old_Permic}/utf
\x{483}
/^\p{Script_Extensions=Perm}/utf
\x{483}
# Script extension only character
/^\p{Old_Permic}/utf
\x{483}
/^\p{sc=Old_Permic}/utf
\x{483}
# Character not in script
/^\p{Old_Permic}/utf
\x{1037b}
# Base script check
/^\p{sc=Psalter_Pahlavi}/utf
\x{10b80}
/^\p{Script=Phlp}/utf
\x{10baf}
# Script extension check
/^\p{Psalter_Pahlavi}/utf
\x{640}
/^\p{scx=Phlp}/utf
\x{640}
# Script extension only character
/^\p{Psalter_Pahlavi}/utf
\x{640}
/^\p{sc=Psalter_Pahlavi}/utf
\x{640}
# Character not in script
/^\p{Psalter_Pahlavi}/utf
\x{10bb0}
# Base script check
/^\p{sc=Khudawadi}/utf
\x{112b0}
/^\p{Script=Sind}/utf
\x{112f9}
# Script extension check
/^\p{Khudawadi}/utf
\x{964}
/^\p{Script_Extensions=Sind}/utf
\x{a839}
# Script extension only character
/^\p{Khudawadi}/utf
\x{a836}
/^\p{sc=Khudawadi}/utf
\x{a836}
# Character not in script
/^\p{Khudawadi}/utf
\x{112fa}
# Base script check
/^\p{sc=Tirhuta}/utf
\x{11480}
/^\p{Script=Tirh}/utf
\x{114d9}
# Script extension check
/^\p{Tirhuta}/utf
\x{951}
/^\p{scx=Tirh}/utf
\x{a839}
# Script extension only character
/^\p{Tirhuta}/utf
\x{1cf2}
/^\p{sc=Tirhuta}/utf
\x{1cf2}
# Character not in script
/^\p{Tirhuta}/utf
\x{114da}
# Base script check
/^\p{sc=Multani}/utf
\x{11280}
/^\p{Script=Mult}/utf
\x{112a9}
# Script extension check
/^\p{Multani}/utf
\x{a66}
/^\p{Script_Extensions=Mult}/utf
\x{a6f}
# Script extension only character
/^\p{Multani}/utf
\x{a66}
/^\p{sc=Multani}/utf
\x{a66}
# Character not in script
/^\p{Multani}/utf
\x{112aa}
# Base script check
/^\p{sc=Adlam}/utf
\x{1e900}
/^\p{Script=Adlm}/utf
\x{1e95f}
# Script extension check
/^\p{Adlam}/utf
\x{61f}
/^\p{scx=Adlm}/utf
\x{640}
# Script extension only character
/^\p{Adlam}/utf
\x{61f}
/^\p{sc=Adlam}/utf
\x{61f}
# Character not in script
/^\p{Adlam}/utf
\x{1e960}
# Base script check
/^\p{sc=Masaram_Gondi}/utf
\x{11d00}
/^\p{Script=Gonm}/utf
\x{11d59}
# Script extension check
/^\p{Masaram_Gondi}/utf
\x{964}
/^\p{Script_Extensions=Gonm}/utf
\x{965}
# Script extension only character
/^\p{Masaram_Gondi}/utf
\x{964}
/^\p{sc=Masaram_Gondi}/utf
\x{964}
# Character not in script
/^\p{Masaram_Gondi}/utf
\x{11d5a}
# Base script check
/^\p{sc=Dogra}/utf
\x{11800}
/^\p{Script=Dogr}/utf
\x{1183b}
# Script extension check
/^\p{Dogra}/utf
\x{964}
/^\p{scx=Dogr}/utf
\x{a839}
# Script extension only character
/^\p{Dogra}/utf
\x{966}
/^\p{sc=Dogra}/utf
\x{966}
# Character not in script
/^\p{Dogra}/utf
\x{1183c}
# Base script check
/^\p{sc=Gunjala_Gondi}/utf
\x{11d60}
/^\p{Script=Gong}/utf
\x{11da9}
# Script extension check
/^\p{Gunjala_Gondi}/utf
\x{964}
/^\p{Script_Extensions=Gong}/utf
\x{965}
# Script extension only character
/^\p{Gunjala_Gondi}/utf
\x{964}
/^\p{sc=Gunjala_Gondi}/utf
\x{964}
# Character not in script
/^\p{Gunjala_Gondi}/utf
\x{11daa}
# Base script check
/^\p{sc=Hanifi_Rohingya}/utf
\x{10d00}
/^\p{Script=Rohg}/utf
\x{10d39}
# Script extension check
/^\p{Hanifi_Rohingya}/utf
\x{60c}
/^\p{scx=Rohg}/utf
\x{6d4}
# Script extension only character
/^\p{Hanifi_Rohingya}/utf
\x{6d4}
/^\p{sc=Hanifi_Rohingya}/utf
\x{6d4}
# Character not in script
/^\p{Hanifi_Rohingya}/utf
\x{10d3a}
# Base script check
/^\p{sc=Sogdian}/utf
\x{10f30}
/^\p{Script=Sogd}/utf
\x{10f59}
# Script extension check
/^\p{Sogdian}/utf
\x{640}
/^\p{Script_Extensions=Sogd}/utf
\x{640}
# Script extension only character
/^\p{Sogdian}/utf
\x{640}
/^\p{sc=Sogdian}/utf
\x{640}
# Character not in script
/^\p{Sogdian}/utf
\x{10f5a}
# Base script check
/^\p{sc=Nandinagari}/utf
\x{119a0}
/^\p{Script=Nand}/utf
\x{119e4}
# Script extension check
/^\p{Nandinagari}/utf
\x{964}
/^\p{scx=Nand}/utf
\x{a835}
# Script extension only character
/^\p{Nandinagari}/utf
\x{1cfa}
/^\p{sc=Nandinagari}/utf
\x{1cfa}
# Character not in script
/^\p{Nandinagari}/utf
\x{119e5}
# Base script check
/^\p{sc=Yezidi}/utf
\x{10e80}
/^\p{Script=Yezi}/utf
\x{10eb1}
# Script extension check
/^\p{Yezidi}/utf
\x{60c}
/^\p{Script_Extensions=Yezi}/utf
\x{669}
# Script extension only character
/^\p{Yezidi}/utf
\x{660}
/^\p{sc=Yezidi}/utf
\x{660}
# Character not in script
/^\p{Yezidi}/utf
\x{10eb2}
# Base script check
/^\p{sc=Cypro_Minoan}/utf
\x{12f90}
/^\p{Script=Cpmn}/utf
\x{12ff2}
# Script extension check
/^\p{Cypro_Minoan}/utf
\x{10100}
/^\p{scx=Cpmn}/utf
\x{10101}
# Script extension only character
/^\p{Cypro_Minoan}/utf
\x{10100}
/^\p{sc=Cypro_Minoan}/utf
\x{10100}
# Character not in script
/^\p{Cypro_Minoan}/utf
\x{12ff3}
# Base script check
/^\p{sc=Old_Uyghur}/utf
\x{10f70}
/^\p{Script=Ougr}/utf
\x{10f89}
# Script extension check
/^\p{Old_Uyghur}/utf
\x{640}
/^\p{Script_Extensions=Ougr}/utf
\x{10af2}
# Script extension only character
/^\p{Old_Uyghur}/utf
\x{10af2}
/^\p{sc=Old_Uyghur}/utf
\x{10af2}
# Character not in script
/^\p{Old_Uyghur}/utf
\x{10f8a}
# Base script check
/^\p{sc=Common}/utf
\x{00}
/^\p{Script=Zyyy}/utf
\x{e007f}
# Character not in script
/^\p{Common}/utf
\x{e0080}
# Base script check
/^\p{sc=Armenian}/utf
\x{531}
/^\p{Script=Armn}/utf
\x{fb17}
# Character not in script
/^\p{Armenian}/utf
\x{fb18}
# Base script check
/^\p{sc=Hebrew}/utf
\x{591}
/^\p{Script=Hebr}/utf
\x{fb4f}
# Character not in script
/^\p{Hebrew}/utf
\x{fb50}
# Base script check
/^\p{sc=Thai}/utf
\x{e01}
/^\p{Script=Thai}/utf
\x{e5b}
# Character not in script
/^\p{Thai}/utf
\x{e5c}
# Base script check
/^\p{sc=Lao}/utf
\x{e81}
/^\p{Script=Laoo}/utf
\x{edf}
# Character not in script
/^\p{Lao}/utf
\x{ee0}
# Base script check
/^\p{sc=Tibetan}/utf
\x{f00}
/^\p{Script=Tibt}/utf
\x{fda}
# Character not in script
/^\p{Tibetan}/utf
\x{fdb}
# Base script check
/^\p{sc=Ethiopic}/utf
\x{1200}
/^\p{Script=Ethi}/utf
\x{1e7fe}
# Character not in script
/^\p{Ethiopic}/utf
\x{1e7ff}
# Base script check
/^\p{sc=Cherokee}/utf
\x{13a0}
/^\p{Script=Cher}/utf
\x{abbf}
# Character not in script
/^\p{Cherokee}/utf
\x{abc0}
# Base script check
/^\p{sc=Canadian_Aboriginal}/utf
\x{1400}
/^\p{Script=Cans}/utf
\x{11abf}
# Character not in script
/^\p{Canadian_Aboriginal}/utf
\x{11ac0}
# Base script check
/^\p{sc=Ogham}/utf
\x{1680}
/^\p{Script=Ogam}/utf
\x{169c}
# Character not in script
/^\p{Ogham}/utf
\x{169d}
# Base script check
/^\p{sc=Runic}/utf
\x{16a0}
/^\p{Script=Runr}/utf
\x{16f8}
# Character not in script
/^\p{Runic}/utf
\x{16f9}
# Base script check
/^\p{sc=Khmer}/utf
\x{1780}
/^\p{Script=Khmr}/utf
\x{19ff}
# Character not in script
/^\p{Khmer}/utf
\x{1a00}
# Base script check
/^\p{sc=Old_Italic}/utf
\x{10300}
/^\p{Script=Ital}/utf
\x{1032f}
# Character not in script
/^\p{Old_Italic}/utf
\x{10330}
# Base script check
/^\p{sc=Gothic}/utf
\x{10330}
/^\p{Script=Goth}/utf
\x{1034a}
# Character not in script
/^\p{Gothic}/utf
\x{1034b}
# Base script check
/^\p{sc=Deseret}/utf
\x{10400}
/^\p{Script=Dsrt}/utf
\x{1044f}
# Character not in script
/^\p{Deseret}/utf
\x{10450}
# Base script check
/^\p{sc=Inherited}/utf
\x{300}
/^\p{Script=Zinh}/utf
\x{e01ef}
# Character not in script
/^\p{Inherited}/utf
\x{e01f0}
# Base script check
/^\p{sc=Ugaritic}/utf
\x{10380}
/^\p{Script=Ugar}/utf
\x{1039f}
# Character not in script
/^\p{Ugaritic}/utf
\x{103a0}
# Base script check
/^\p{sc=Shavian}/utf
\x{10450}
/^\p{Script=Shaw}/utf
\x{1047f}
# Character not in script
/^\p{Shavian}/utf
\x{10480}
# Base script check
/^\p{sc=Osmanya}/utf
\x{10480}
/^\p{Script=Osma}/utf
\x{104a9}
# Character not in script
/^\p{Osmanya}/utf
\x{104aa}
# Base script check
/^\p{sc=Braille}/utf
\x{2800}
/^\p{Script=Brai}/utf
\x{28ff}
# Character not in script
/^\p{Braille}/utf
\x{2900}
# Base script check
/^\p{sc=New_Tai_Lue}/utf
\x{1980}
/^\p{Script=Talu}/utf
\x{19df}
# Character not in script
/^\p{New_Tai_Lue}/utf
\x{19e0}
# Base script check
/^\p{sc=Tifinagh}/utf
\x{2d30}
/^\p{Script=Tfng}/utf
\x{2d7f}
# Character not in script
/^\p{Tifinagh}/utf
\x{2d80}
# Base script check
/^\p{sc=Old_Persian}/utf
\x{103a0}
/^\p{Script=Xpeo}/utf
\x{103d5}
# Character not in script
/^\p{Old_Persian}/utf
\x{103d6}
# Base script check
/^\p{sc=Kharoshthi}/utf
\x{10a00}
/^\p{Script=Khar}/utf
\x{10a58}
# Character not in script
/^\p{Kharoshthi}/utf
\x{10a59}
# Base script check
/^\p{sc=Balinese}/utf
\x{1b00}
/^\p{Script=Bali}/utf
\x{1b7e}
# Character not in script
/^\p{Balinese}/utf
\x{1b8f}
# Base script check
/^\p{sc=Cuneiform}/utf
\x{12000}
/^\p{Script=Xsux}/utf
\x{12543}
# Character not in script
/^\p{Cuneiform}/utf
\x{12544}
# Base script check
/^\p{sc=Phoenician}/utf
\x{10900}
/^\p{Script=Phnx}/utf
\x{1091f}
# Character not in script
/^\p{Phoenician}/utf
\x{10920}
# Base script check
/^\p{sc=Sundanese}/utf
\x{1b80}
/^\p{Script=Sund}/utf
\x{1cc7}
# Character not in script
/^\p{Sundanese}/utf
\x{1cc8}
# Base script check
/^\p{sc=Lepcha}/utf
\x{1c00}
/^\p{Script=Lepc}/utf
\x{1c4f}
# Character not in script
/^\p{Lepcha}/utf
\x{1c50}
# Base script check
/^\p{sc=Ol_Chiki}/utf
\x{1c50}
/^\p{Script=Olck}/utf
\x{1c7f}
# Character not in script
/^\p{Ol_Chiki}/utf
\x{1c80}
# Base script check
/^\p{sc=Vai}/utf
\x{a500}
/^\p{Script=Vaii}/utf
\x{a62b}
# Character not in script
/^\p{Vai}/utf
\x{a62c}
# Base script check
/^\p{sc=Saurashtra}/utf
\x{a880}
/^\p{Script=Saur}/utf
\x{a8d9}
# Character not in script
/^\p{Saurashtra}/utf
\x{a8da}
# Base script check
/^\p{sc=Rejang}/utf
\x{a930}
/^\p{Script=Rjng}/utf
\x{a95f}
# Character not in script
/^\p{Rejang}/utf
\x{a960}
# Base script check
/^\p{sc=Lycian}/utf
\x{10280}
/^\p{Script=Lyci}/utf
\x{1029c}
# Character not in script
/^\p{Lycian}/utf
\x{1029d}
# Base script check
/^\p{sc=Carian}/utf
\x{102a0}
/^\p{Script=Cari}/utf
\x{102d0}
# Character not in script
/^\p{Carian}/utf
\x{102d1}
# Base script check
/^\p{sc=Lydian}/utf
\x{10920}
/^\p{Script=Lydi}/utf
\x{1093f}
# Character not in script
/^\p{Lydian}/utf
\x{10940}
# Base script check
/^\p{sc=Cham}/utf
\x{aa00}
/^\p{Script=Cham}/utf
\x{aa5f}
# Character not in script
/^\p{Cham}/utf
\x{aa60}
# Base script check
/^\p{sc=Tai_Tham}/utf
\x{1a20}
/^\p{Script=Lana}/utf
\x{1aad}
# Character not in script
/^\p{Tai_Tham}/utf
\x{1aae}
# Base script check
/^\p{sc=Tai_Viet}/utf
\x{aa80}
/^\p{Script=Tavt}/utf
\x{aadf}
# Character not in script
/^\p{Tai_Viet}/utf
\x{aae0}
# Base script check
/^\p{sc=Avestan}/utf
\x{10b00}
/^\p{Script=Avst}/utf
\x{10b3f}
# Character not in script
/^\p{Avestan}/utf
\x{10b40}
# Base script check
/^\p{sc=Egyptian_Hieroglyphs}/utf
\x{13000}
/^\p{Script=Egyp}/utf
\x{13455}
# Character not in script
/^\p{Egyptian_Hieroglyphs}/utf
\x{13456}
# Base script check
/^\p{sc=Samaritan}/utf
\x{800}
/^\p{Script=Samr}/utf
\x{83e}
# Character not in script
/^\p{Samaritan}/utf
\x{83f}
# Base script check
/^\p{sc=Lisu}/utf
\x{a4d0}
/^\p{Script=Lisu}/utf
\x{11fb0}
# Character not in script
/^\p{Lisu}/utf
\x{11fb1}
# Base script check
/^\p{sc=Bamum}/utf
\x{a6a0}
/^\p{Script=Bamu}/utf
\x{16a38}
# Character not in script
/^\p{Bamum}/utf
\x{16a39}
# Base script check
/^\p{sc=Meetei_Mayek}/utf
\x{aae0}
/^\p{Script=Mtei}/utf
\x{abf9}
# Character not in script
/^\p{Meetei_Mayek}/utf
\x{abfa}
# Base script check
/^\p{sc=Imperial_Aramaic}/utf
\x{10840}
/^\p{Script=Armi}/utf
\x{1085f}
# Character not in script
/^\p{Imperial_Aramaic}/utf
\x{10860}
# Base script check
/^\p{sc=Old_South_Arabian}/utf
\x{10a60}
/^\p{Script=Sarb}/utf
\x{10a7f}
# Character not in script
/^\p{Old_South_Arabian}/utf
\x{10a80}
# Base script check
/^\p{sc=Inscriptional_Parthian}/utf
\x{10b40}
/^\p{Script=Prti}/utf
\x{10b5f}
# Character not in script
/^\p{Inscriptional_Parthian}/utf
\x{10b60}
# Base script check
/^\p{sc=Inscriptional_Pahlavi}/utf
\x{10b60}
/^\p{Script=Phli}/utf
\x{10b7f}
# Character not in script
/^\p{Inscriptional_Pahlavi}/utf
\x{10b80}
# Base script check
/^\p{sc=Old_Turkic}/utf
\x{10c00}
/^\p{Script=Orkh}/utf
\x{10c48}
# Character not in script
/^\p{Old_Turkic}/utf
\x{10c49}
# Base script check
/^\p{sc=Batak}/utf
\x{1bc0}
/^\p{Script=Batk}/utf
\x{1bff}
# Character not in script
/^\p{Batak}/utf
\x{1c00}
# Base script check
/^\p{sc=Brahmi}/utf
\x{11000}
/^\p{Script=Brah}/utf
\x{1107f}
# Character not in script
/^\p{Brahmi}/utf
\x{11080}
# Base script check
/^\p{sc=Meroitic_Cursive}/utf
\x{109a0}
/^\p{Script=Merc}/utf
\x{109ff}
# Character not in script
/^\p{Meroitic_Cursive}/utf
\x{10a00}
# Base script check
/^\p{sc=Meroitic_Hieroglyphs}/utf
\x{10980}
/^\p{Script=Mero}/utf
\x{1099f}
# Character not in script
/^\p{Meroitic_Hieroglyphs}/utf
\x{109a0}
# Base script check
/^\p{sc=Miao}/utf
\x{16f00}
/^\p{Script=Plrd}/utf
\x{16f9f}
# Character not in script
/^\p{Miao}/utf
\x{16fa0}
# Base script check
/^\p{sc=Sora_Sompeng}/utf
\x{110d0}
/^\p{Script=Sora}/utf
\x{110f9}
# Character not in script
/^\p{Sora_Sompeng}/utf
\x{110fa}
# Base script check
/^\p{sc=Caucasian_Albanian}/utf
\x{10530}
/^\p{Script=Aghb}/utf
\x{1056f}
# Character not in script
/^\p{Caucasian_Albanian}/utf
\x{10570}
# Base script check
/^\p{sc=Bassa_Vah}/utf
\x{16ad0}
/^\p{Script=Bass}/utf
\x{16af5}
# Character not in script
/^\p{Bassa_Vah}/utf
\x{16af6}
# Base script check
/^\p{sc=Elbasan}/utf
\x{10500}
/^\p{Script=Elba}/utf
\x{10527}
# Character not in script
/^\p{Elbasan}/utf
\x{10528}
# Base script check
/^\p{sc=Pahawh_Hmong}/utf
\x{16b00}
/^\p{Script=Hmng}/utf
\x{16b8f}
# Character not in script
/^\p{Pahawh_Hmong}/utf
\x{16b90}
# Base script check
/^\p{sc=Mende_Kikakui}/utf
\x{1e800}
/^\p{Script=Mend}/utf
\x{1e8d6}
# Character not in script
/^\p{Mende_Kikakui}/utf
\x{1e8d7}
# Base script check
/^\p{sc=Mro}/utf
\x{16a40}
/^\p{Script=Mroo}/utf
\x{16a6f}
# Character not in script
/^\p{Mro}/utf
\x{16a70}
# Base script check
/^\p{sc=Old_North_Arabian}/utf
\x{10a80}
/^\p{Script=Narb}/utf
\x{10a9f}
# Character not in script
/^\p{Old_North_Arabian}/utf
\x{10aa0}
# Base script check
/^\p{sc=Nabataean}/utf
\x{10880}
/^\p{Script=Nbat}/utf
\x{108af}
# Character not in script
/^\p{Nabataean}/utf
\x{108b0}
# Base script check
/^\p{sc=Palmyrene}/utf
\x{10860}
/^\p{Script=Palm}/utf
\x{1087f}
# Character not in script
/^\p{Palmyrene}/utf
\x{10880}
# Base script check
/^\p{sc=Pau_Cin_Hau}/utf
\x{11ac0}
/^\p{Script=Pauc}/utf
\x{11af8}
# Character not in script
/^\p{Pau_Cin_Hau}/utf
\x{11af9}
# Base script check
/^\p{sc=Siddham}/utf
\x{11580}
/^\p{Script=Sidd}/utf
\x{115dd}
# Character not in script
/^\p{Siddham}/utf
\x{115de}
# Base script check
/^\p{sc=Warang_Citi}/utf
\x{118a0}
/^\p{Script=Wara}/utf
\x{118ff}
# Character not in script
/^\p{Warang_Citi}/utf
\x{11900}
# Base script check
/^\p{sc=Ahom}/utf
\x{11700}
/^\p{Script=Ahom}/utf
\x{11746}
# Character not in script
/^\p{Ahom}/utf
\x{11747}
# Base script check
/^\p{sc=Anatolian_Hieroglyphs}/utf
\x{14400}
/^\p{Script=Hluw}/utf
\x{14646}
# Character not in script
/^\p{Anatolian_Hieroglyphs}/utf
\x{14647}
# Base script check
/^\p{sc=Hatran}/utf
\x{108e0}
/^\p{Script=Hatr}/utf
\x{108ff}
# Character not in script
/^\p{Hatran}/utf
\x{10900}
# Base script check
/^\p{sc=Old_Hungarian}/utf
\x{10c80}
/^\p{Script=Hung}/utf
\x{10cff}
# Character not in script
/^\p{Old_Hungarian}/utf
\x{10d00}
# Base script check
/^\p{sc=SignWriting}/utf
\x{1d800}
/^\p{Script=Sgnw}/utf
\x{1daaf}
# Character not in script
/^\p{SignWriting}/utf
\x{1dab0}
# Base script check
/^\p{sc=Bhaiksuki}/utf
\x{11c00}
/^\p{Script=Bhks}/utf
\x{11c6c}
# Character not in script
/^\p{Bhaiksuki}/utf
\x{11c6d}
# Base script check
/^\p{sc=Marchen}/utf
\x{11c70}
/^\p{Script=Marc}/utf
\x{11cb6}
# Character not in script
/^\p{Marchen}/utf
\x{11cb7}
# Base script check
/^\p{sc=Newa}/utf
\x{11400}
/^\p{Script=Newa}/utf
\x{11461}
# Character not in script
/^\p{Newa}/utf
\x{11462}
# Base script check
/^\p{sc=Osage}/utf
\x{104b0}
/^\p{Script=Osge}/utf
\x{104fb}
# Character not in script
/^\p{Osage}/utf
\x{104fc}
# Base script check
/^\p{sc=Tangut}/utf
\x{16fe0}
/^\p{Script=Tang}/utf
\x{18d08}
# Character not in script
/^\p{Tangut}/utf
\x{18d09}
# Base script check
/^\p{sc=Nushu}/utf
\x{16fe1}
/^\p{Script=Nshu}/utf
\x{1b2fb}
# Character not in script
/^\p{Nushu}/utf
\x{1b2fc}
# Base script check
/^\p{sc=Soyombo}/utf
\x{11a50}
/^\p{Script=Soyo}/utf
\x{11aa2}
# Character not in script
/^\p{Soyombo}/utf
\x{11aa3}
# Base script check
/^\p{sc=Zanabazar_Square}/utf
\x{11a00}
/^\p{Script=Zanb}/utf
\x{11a47}
# Character not in script
/^\p{Zanabazar_Square}/utf
\x{11a48}
# Base script check
/^\p{sc=Makasar}/utf
\x{11ee0}
/^\p{Script=Maka}/utf
\x{11ef8}
# Character not in script
/^\p{Makasar}/utf
\x{11ef9}
# Base script check
/^\p{sc=Medefaidrin}/utf
\x{16e40}
/^\p{Script=Medf}/utf
\x{16e9a}
# Character not in script
/^\p{Medefaidrin}/utf
\x{16e9b}
# Base script check
/^\p{sc=Old_Sogdian}/utf
\x{10f00}
/^\p{Script=Sogo}/utf
\x{10f27}
# Character not in script
/^\p{Old_Sogdian}/utf
\x{10f28}
# Base script check
/^\p{sc=Elymaic}/utf
\x{10fe0}
/^\p{Script=Elym}/utf
\x{10ff6}
# Character not in script
/^\p{Elymaic}/utf
\x{10ff7}
# Base script check
/^\p{sc=Nyiakeng_Puachue_Hmong}/utf
\x{1e100}
/^\p{Script=Hmnp}/utf
\x{1e14f}
# Character not in script
/^\p{Nyiakeng_Puachue_Hmong}/utf
\x{1e150}
# Base script check
/^\p{sc=Wancho}/utf
\x{1e2c0}
/^\p{Script=Wcho}/utf
\x{1e2ff}
# Character not in script
/^\p{Wancho}/utf
\x{1e300}
# Base script check
/^\p{sc=Chorasmian}/utf
\x{10fb0}
/^\p{Script=Chrs}/utf
\x{10fcb}
# Character not in script
/^\p{Chorasmian}/utf
\x{10fcc}
# Base script check
/^\p{sc=Dives_Akuru}/utf
\x{11900}
/^\p{Script=Diak}/utf
\x{11959}
# Character not in script
/^\p{Dives_Akuru}/utf
\x{1195a}
# Base script check
/^\p{sc=Khitan_Small_Script}/utf
\x{16fe4}
/^\p{Script=Kits}/utf
\x{18cd5}
# Character not in script
/^\p{Khitan_Small_Script}/utf
\x{18cd6}
# Base script check
/^\p{sc=Tangsa}/utf
\x{16a70}
/^\p{Script=Tnsa}/utf
\x{16ac9}
# Character not in script
/^\p{Tangsa}/utf
\x{16aca}
# Base script check
/^\p{sc=Toto}/utf
\x{1e290}
/^\p{Script=Toto}/utf
\x{1e2ae}
# Character not in script
/^\p{Toto}/utf
\x{1e2af}
# Base script check
/^\p{sc=Vithkuqi}/utf
\x{10570}
/^\p{Script=Vith}/utf
\x{105bc}
# Character not in script
/^\p{Vithkuqi}/utf
\x{105bd}
# Base script check
/^\p{sc=Kawi}/utf
\x{11f00}
/^\p{Script=Kawi}/utf
\x{11f59}
# Character not in script
/^\p{Kawi}/utf
\x{11f6a}
# Base script check
/^\p{sc=Nag_Mundari}/utf
\x{1e4d0}
/^\p{Script=Nagm}/utf
\x{1e4f9}
# Character not in script
/^\p{Nag_Mundari}/utf
\x{1e4fa}
# End of testinput26

3251
3rd/pcre2/testdata/testinput27 vendored Normal file

@@ -0,0 +1,3251 @@
# These tests were generated by maint/GenerateTest.py using PCRE2's UCP
# data, do not edit unless that data has changed and they are reflecting
# a previous version.
# Unicode Script Extension tests for version 16.0.0
#perltest
# Base script check
/^\p{sc=Latin}/utf
A
/^\p{Script=Latn}/utf
\x{1df2a}
# Script extension check
/^\p{Latin}/utf
\x{b7}
/^\p{scx=Latn}/utf
\x{a92e}
# Script extension only character
/^\p{Latin}/utf
\x{b7}
/^\p{sc=Latin}/utf
\x{b7}
# Character not in script
/^\p{Latin}/utf
\x{1df2b}
# Base script check
/^\p{sc=Greek}/utf
\x{370}
/^\p{Script=Grek}/utf
\x{1d245}
# Script extension check
/^\p{Greek}/utf
\x{b7}
/^\p{Script_Extensions=Grek}/utf
\x{205d}
# Script extension only character
/^\p{Greek}/utf
\x{b7}
/^\p{sc=Greek}/utf
\x{b7}
# Character not in script
/^\p{Greek}/utf
\x{1d246}
# Base script check
/^\p{sc=Cyrillic}/utf
\x{400}
/^\p{Script=Cyrl}/utf
\x{1e08f}
# Script extension check
/^\p{Cyrillic}/utf
\x{2bc}
/^\p{scx=Cyrl}/utf
\x{a66f}
# Script extension only character
/^\p{Cyrillic}/utf
\x{2bc}
/^\p{sc=Cyrillic}/utf
\x{2bc}
# Character not in script
/^\p{Cyrillic}/utf
\x{1e090}
# Base script check
/^\p{sc=Armenian}/utf
\x{531}
/^\p{Script=Armn}/utf
\x{fb17}
# Script extension check
/^\p{Armenian}/utf
\x{308}
/^\p{Script_Extensions=Armn}/utf
\x{589}
# Script extension only character
/^\p{Armenian}/utf
\x{308}
/^\p{sc=Armenian}/utf
\x{308}
# Character not in script
/^\p{Armenian}/utf
\x{fb18}
# Base script check
/^\p{sc=Hebrew}/utf
\x{591}
/^\p{Script=Hebr}/utf
\x{fb4f}
# Script extension check
/^\p{Hebrew}/utf
\x{307}
/^\p{scx=Hebr}/utf
\x{308}
# Script extension only character
/^\p{Hebrew}/utf
\x{307}
/^\p{sc=Hebrew}/utf
\x{307}
# Character not in script
/^\p{Hebrew}/utf
\x{fb50}
# Base script check
/^\p{sc=Arabic}/utf
\x{600}
/^\p{Script=Arab}/utf
\x{1eef1}
# Script extension check
/^\p{Arabic}/utf
\x{60c}
/^\p{Script_Extensions=Arab}/utf
\x{102fb}
# Script extension only character
/^\p{Arabic}/utf
\x{60c}
/^\p{sc=Arabic}/utf
\x{60c}
# Character not in script
/^\p{Arabic}/utf
\x{1eef2}
# Base script check
/^\p{sc=Syriac}/utf
\x{700}
/^\p{Script=Syrc}/utf
\x{86a}
# Script extension check
/^\p{Syriac}/utf
\x{303}
/^\p{scx=Syrc}/utf
\x{1dfa}
# Script extension only character
/^\p{Syriac}/utf
\x{303}
/^\p{sc=Syriac}/utf
\x{303}
# Character not in script
/^\p{Syriac}/utf
\x{1dfb}
# Base script check
/^\p{sc=Thaana}/utf
\x{780}
/^\p{Script=Thaa}/utf
\x{7b1}
# Script extension check
/^\p{Thaana}/utf
\x{60c}
/^\p{Script_Extensions=Thaa}/utf
\x{fdfd}
# Script extension only character
/^\p{Thaana}/utf
\x{60c}
/^\p{sc=Thaana}/utf
\x{60c}
# Character not in script
/^\p{Thaana}/utf
\x{fdfe}
# Base script check
/^\p{sc=Devanagari}/utf
\x{900}
/^\p{Script=Deva}/utf
\x{11b09}
# Script extension check
/^\p{Devanagari}/utf
\x{2bc}
/^\p{scx=Deva}/utf
\x{a8f3}
# Script extension only character
/^\p{Devanagari}/utf
\x{2bc}
/^\p{sc=Devanagari}/utf
\x{2bc}
# Character not in script
/^\p{Devanagari}/utf
\x{11b0a}
# Base script check
/^\p{sc=Bengali}/utf
\x{980}
/^\p{Script=Beng}/utf
\x{9fe}
# Script extension check
/^\p{Bengali}/utf
\x{2bc}
/^\p{Script_Extensions=Beng}/utf
\x{a8f1}
# Script extension only character
/^\p{Bengali}/utf
\x{2bc}
/^\p{sc=Bengali}/utf
\x{2bc}
# Character not in script
/^\p{Bengali}/utf
\x{a8f2}
# Base script check
/^\p{sc=Gurmukhi}/utf
\x{a01}
/^\p{Script=Guru}/utf
\x{a76}
# Script extension check
/^\p{Gurmukhi}/utf
\x{951}
/^\p{scx=Guru}/utf
\x{a839}
# Script extension only character
/^\p{Gurmukhi}/utf
\x{951}
/^\p{sc=Gurmukhi}/utf
\x{951}
# Character not in script
/^\p{Gurmukhi}/utf
\x{a83a}
# Base script check
/^\p{sc=Gujarati}/utf
\x{a81}
/^\p{Script=Gujr}/utf
\x{aff}
# Script extension check
/^\p{Gujarati}/utf
\x{951}
/^\p{Script_Extensions=Gujr}/utf
\x{a839}
# Script extension only character
/^\p{Gujarati}/utf
\x{951}
/^\p{sc=Gujarati}/utf
\x{951}
# Character not in script
/^\p{Gujarati}/utf
\x{a83a}
# Base script check
/^\p{sc=Oriya}/utf
\x{b01}
/^\p{Script=Orya}/utf
\x{b77}
# Script extension check
/^\p{Oriya}/utf
\x{951}
/^\p{scx=Orya}/utf
\x{1cf2}
# Script extension only character
/^\p{Oriya}/utf
\x{951}
/^\p{sc=Oriya}/utf
\x{951}
# Character not in script
/^\p{Oriya}/utf
\x{1cf3}
# Base script check
/^\p{sc=Tamil}/utf
\x{b82}
/^\p{Script=Taml}/utf
\x{11fff}
# Script extension check
/^\p{Tamil}/utf
\x{951}
/^\p{Script_Extensions=Taml}/utf
\x{11fd3}
# Script extension only character
/^\p{Tamil}/utf
\x{951}
/^\p{sc=Tamil}/utf
\x{951}
# Character not in script
/^\p{Tamil}/utf
\x{12000}
# Base script check
/^\p{sc=Telugu}/utf
\x{c00}
/^\p{Script=Telu}/utf
\x{c7f}
# Script extension check
/^\p{Telugu}/utf
\x{951}
/^\p{scx=Telu}/utf
\x{1cf2}
# Script extension only character
/^\p{Telugu}/utf
\x{951}
/^\p{sc=Telugu}/utf
\x{951}
# Character not in script
/^\p{Telugu}/utf
\x{1cf3}
# Base script check
/^\p{sc=Kannada}/utf
\x{c80}
/^\p{Script=Knda}/utf
\x{cf3}
# Script extension check
/^\p{Kannada}/utf
\x{951}
/^\p{Script_Extensions=Knda}/utf
\x{a835}
# Script extension only character
/^\p{Kannada}/utf
\x{951}
/^\p{sc=Kannada}/utf
\x{951}
# Character not in script
/^\p{Kannada}/utf
\x{a836}
# Base script check
/^\p{sc=Malayalam}/utf
\x{d00}
/^\p{Script=Mlym}/utf
\x{d7f}
# Script extension check
/^\p{Malayalam}/utf
\x{951}
/^\p{scx=Mlym}/utf
\x{a832}
# Script extension only character
/^\p{Malayalam}/utf
\x{951}
/^\p{sc=Malayalam}/utf
\x{951}
# Character not in script
/^\p{Malayalam}/utf
\x{a833}
# Base script check
/^\p{sc=Sinhala}/utf
\x{d81}
/^\p{Script=Sinh}/utf
\x{111f4}
# Script extension check
/^\p{Sinhala}/utf
\x{964}
/^\p{Script_Extensions=Sinh}/utf
\x{1cf2}
# Script extension only character
/^\p{Sinhala}/utf
\x{964}
/^\p{sc=Sinhala}/utf
\x{964}
# Character not in script
/^\p{Sinhala}/utf
\x{111f5}
# Base script check
/^\p{sc=Thai}/utf
\x{e01}
/^\p{Script=Thai}/utf
\x{e5b}
# Script extension check
/^\p{Thai}/utf
\x{2bc}
/^\p{scx=Thai}/utf
\x{331}
# Script extension only character
/^\p{Thai}/utf
\x{2bc}
/^\p{sc=Thai}/utf
\x{2bc}
# Character not in script
/^\p{Thai}/utf
\x{e5c}
# Base script check
/^\p{sc=Tibetan}/utf
\x{f00}
/^\p{Script=Tibt}/utf
\x{fda}
# Script extension check
/^\p{Tibetan}/utf
\x{3008}
/^\p{Script_Extensions=Tibt}/utf
\x{300b}
# Script extension only character
/^\p{Tibetan}/utf
\x{3008}
/^\p{sc=Tibetan}/utf
\x{3008}
# Character not in script
/^\p{Tibetan}/utf
\x{300c}
# Base script check
/^\p{sc=Myanmar}/utf
\x{1000}
/^\p{Script=Mymr}/utf
\x{116e3}
# Script extension check
/^\p{Myanmar}/utf
\x{1040}
/^\p{scx=Mymr}/utf
\x{a92e}
# Script extension only character
/^\p{Myanmar}/utf
\x{a92e}
/^\p{sc=Myanmar}/utf
\x{a92e}
# Character not in script
/^\p{Myanmar}/utf
\x{116e4}
# Base script check
/^\p{sc=Georgian}/utf
\x{10a0}
/^\p{Script=Geor}/utf
\x{2d2d}
# Script extension check
/^\p{Georgian}/utf
\x{b7}
/^\p{Script_Extensions=Geor}/utf
\x{2e31}
# Script extension only character
/^\p{Georgian}/utf
\x{b7}
/^\p{sc=Georgian}/utf
\x{b7}
# Character not in script
/^\p{Georgian}/utf
\x{2e32}
# Base script check
/^\p{sc=Hangul}/utf
\x{1100}
/^\p{Script=Hang}/utf
\x{ffdc}
# Script extension check
/^\p{Hangul}/utf
\x{3001}
/^\p{scx=Hang}/utf
\x{ff65}
# Script extension only character
/^\p{Hangul}/utf
\x{3001}
/^\p{sc=Hangul}/utf
\x{3001}
# Character not in script
/^\p{Hangul}/utf
\x{ffdd}
# Base script check
/^\p{sc=Ethiopic}/utf
\x{1200}
/^\p{Script=Ethi}/utf
\x{1e7fe}
# Script extension check
/^\p{Ethiopic}/utf
\x{30e}
/^\p{Script_Extensions=Ethi}/utf
\x{30e}
# Script extension only character
/^\p{Ethiopic}/utf
\x{30e}
/^\p{sc=Ethiopic}/utf
\x{30e}
# Character not in script
/^\p{Ethiopic}/utf
\x{1e7ff}
# Base script check
/^\p{sc=Cherokee}/utf
\x{13a0}
/^\p{Script=Cher}/utf
\x{abbf}
# Script extension check
/^\p{Cherokee}/utf
\x{300}
/^\p{scx=Cher}/utf
\x{331}
# Script extension only character
/^\p{Cherokee}/utf
\x{300}
/^\p{sc=Cherokee}/utf
\x{300}
# Character not in script
/^\p{Cherokee}/utf
\x{abc0}
# Base script check
/^\p{sc=Runic}/utf
\x{16a0}
/^\p{Script=Runr}/utf
\x{16f8}
# Script extension check
/^\p{Runic}/utf
\x{16eb}
/^\p{Script_Extensions=Runr}/utf
\x{16ed}
# Script extension only character
/^\p{Runic}/utf
\x{16eb}
/^\p{sc=Runic}/utf
\x{16eb}
# Character not in script
/^\p{Runic}/utf
\x{16f9}
# Base script check
/^\p{sc=Mongolian}/utf
\x{1800}
/^\p{Script=Mong}/utf
\x{1166c}
# Script extension check
/^\p{Mongolian}/utf
\x{1802}
/^\p{scx=Mong}/utf
\x{300b}
# Script extension only character
/^\p{Mongolian}/utf
\x{1802}
/^\p{sc=Mongolian}/utf
\x{1802}
# Character not in script
/^\p{Mongolian}/utf
\x{1166d}
# Base script check
/^\p{sc=Hiragana}/utf
\x{3041}
/^\p{Script=Hira}/utf
\x{1f200}
# Script extension check
/^\p{Hiragana}/utf
\x{3001}
/^\p{Script_Extensions=Hira}/utf
\x{ff9f}
# Script extension only character
/^\p{Hiragana}/utf
\x{3001}
/^\p{sc=Hiragana}/utf
\x{3001}
# Character not in script
/^\p{Hiragana}/utf
\x{1f201}
# Base script check
/^\p{sc=Katakana}/utf
\x{30a1}
/^\p{Script=Kana}/utf
\x{1b167}
# Script extension check
/^\p{Katakana}/utf
\x{305}
/^\p{scx=Kana}/utf
\x{ff9f}
# Script extension only character
/^\p{Katakana}/utf
\x{305}
/^\p{sc=Katakana}/utf
\x{305}
# Character not in script
/^\p{Katakana}/utf
\x{1b168}
# Base script check
/^\p{sc=Bopomofo}/utf
\x{2ea}
/^\p{Script=Bopo}/utf
\x{31bf}
# Script extension check
/^\p{Bopomofo}/utf
\x{2c7}
/^\p{Script_Extensions=Bopo}/utf
\x{ff65}
# Script extension only character
/^\p{Bopomofo}/utf
\x{2c7}
/^\p{sc=Bopomofo}/utf
\x{2c7}
# Character not in script
/^\p{Bopomofo}/utf
\x{ff66}
# Base script check
/^\p{sc=Han}/utf
\x{2e80}
/^\p{Script=Hani}/utf
\x{323af}
# Script extension check
/^\p{Han}/utf
\x{b7}
/^\p{scx=Hani}/utf
\x{1f251}
# Script extension only character
/^\p{Han}/utf
\x{b7}
/^\p{sc=Han}/utf
\x{b7}
# Character not in script
/^\p{Han}/utf
\x{323b0}
# Base script check
/^\p{sc=Yi}/utf
\x{a000}
/^\p{Script=Yiii}/utf
\x{a4c6}
# Script extension check
/^\p{Yi}/utf
\x{3001}
/^\p{Script_Extensions=Yiii}/utf
\x{ff65}
# Script extension only character
/^\p{Yi}/utf
\x{3001}
/^\p{sc=Yi}/utf
\x{3001}
# Character not in script
/^\p{Yi}/utf
\x{ff66}
# Base script check
/^\p{sc=Gothic}/utf
\x{10330}
/^\p{Script=Goth}/utf
\x{1034a}
# Script extension check
/^\p{Gothic}/utf
\x{b7}
/^\p{scx=Goth}/utf
\x{331}
# Script extension only character
/^\p{Gothic}/utf
\x{b7}
/^\p{sc=Gothic}/utf
\x{b7}
# Character not in script
/^\p{Gothic}/utf
\x{1034b}
# Base script check
/^\p{sc=Tagalog}/utf
\x{1700}
/^\p{Script=Tglg}/utf
\x{171f}
# Script extension check
/^\p{Tagalog}/utf
\x{1735}
/^\p{Script_Extensions=Tglg}/utf
\x{1736}
# Script extension only character
/^\p{Tagalog}/utf
\x{1735}
/^\p{sc=Tagalog}/utf
\x{1735}
# Character not in script
/^\p{Tagalog}/utf
\x{1737}
# Base script check
/^\p{sc=Hanunoo}/utf
\x{1720}
/^\p{Script=Hano}/utf
\x{1734}
# Script extension check
/^\p{Hanunoo}/utf
\x{1735}
/^\p{scx=Hano}/utf
\x{1736}
# Script extension only character
/^\p{Hanunoo}/utf
\x{1735}
/^\p{sc=Hanunoo}/utf
\x{1735}
# Character not in script
/^\p{Hanunoo}/utf
\x{1737}
# Base script check
/^\p{sc=Buhid}/utf
\x{1740}
/^\p{Script=Buhd}/utf
\x{1753}
# Script extension check
/^\p{Buhid}/utf
\x{1735}
/^\p{Script_Extensions=Buhd}/utf
\x{1736}
# Script extension only character
/^\p{Buhid}/utf
\x{1735}
/^\p{sc=Buhid}/utf
\x{1735}
# Character not in script
/^\p{Buhid}/utf
\x{1754}
# Base script check
/^\p{sc=Tagbanwa}/utf
\x{1760}
/^\p{Script=Tagb}/utf
\x{1773}
# Script extension check
/^\p{Tagbanwa}/utf
\x{1735}
/^\p{scx=Tagb}/utf
\x{1736}
# Script extension only character
/^\p{Tagbanwa}/utf
\x{1735}
/^\p{sc=Tagbanwa}/utf
\x{1735}
# Character not in script
/^\p{Tagbanwa}/utf
\x{1774}
# Base script check
/^\p{sc=Limbu}/utf
\x{1900}
/^\p{Script=Limb}/utf
\x{194f}
# Script extension check
/^\p{Limbu}/utf
\x{965}
/^\p{Script_Extensions=Limb}/utf
\x{965}
# Script extension only character
/^\p{Limbu}/utf
\x{965}
/^\p{sc=Limbu}/utf
\x{965}
# Character not in script
/^\p{Limbu}/utf
\x{1950}
# Base script check
/^\p{sc=Tai_Le}/utf
\x{1950}
/^\p{Script=Tale}/utf
\x{1974}
# Script extension check
/^\p{Tai_Le}/utf
\x{300}
/^\p{scx=Tale}/utf
\x{1049}
# Script extension only character
/^\p{Tai_Le}/utf
\x{300}
/^\p{sc=Tai_Le}/utf
\x{300}
# Character not in script
/^\p{Tai_Le}/utf
\x{1975}
# Base script check
/^\p{sc=Linear_B}/utf
\x{10000}
/^\p{Script=Linb}/utf
\x{100fa}
# Script extension check
/^\p{Linear_B}/utf
\x{10100}
/^\p{Script_Extensions=Linb}/utf
\x{1013f}
# Script extension only character
/^\p{Linear_B}/utf
\x{10100}
/^\p{sc=Linear_B}/utf
\x{10100}
# Character not in script
/^\p{Linear_B}/utf
\x{10140}
# Base script check
/^\p{sc=Shavian}/utf
\x{10450}
/^\p{Script=Shaw}/utf
\x{1047f}
# Script extension check
/^\p{Shavian}/utf
\x{b7}
/^\p{scx=Shaw}/utf
\x{b7}
# Script extension only character
/^\p{Shavian}/utf
\x{b7}
/^\p{sc=Shavian}/utf
\x{b7}
# Character not in script
/^\p{Shavian}/utf
\x{10480}
# Base script check
/^\p{sc=Cypriot}/utf
\x{10800}
/^\p{Script=Cprt}/utf
\x{1083f}
# Script extension check
/^\p{Cypriot}/utf
\x{10100}
/^\p{Script_Extensions=Cprt}/utf
\x{1013f}
# Script extension only character
/^\p{Cypriot}/utf
\x{10100}
/^\p{sc=Cypriot}/utf
\x{10100}
# Character not in script
/^\p{Cypriot}/utf
\x{10840}
# Base script check
/^\p{sc=Buginese}/utf
\x{1a00}
/^\p{Script=Bugi}/utf
\x{1a1f}
# Script extension check
/^\p{Buginese}/utf
\x{a9cf}
/^\p{scx=Bugi}/utf
\x{a9cf}
# Script extension only character
/^\p{Buginese}/utf
\x{a9cf}
/^\p{sc=Buginese}/utf
\x{a9cf}
# Character not in script
/^\p{Buginese}/utf
\x{a9d0}
# Base script check
/^\p{sc=Coptic}/utf
\x{3e2}
/^\p{Script=Copt}/utf
\x{2cff}
# Script extension check
/^\p{Coptic}/utf
\x{b7}
/^\p{Script_Extensions=Copt}/utf
\x{102fb}
# Script extension only character
/^\p{Coptic}/utf
\x{b7}
/^\p{sc=Coptic}/utf
\x{b7}
# Character not in script
/^\p{Coptic}/utf
\x{102fc}
# Base script check
/^\p{sc=Glagolitic}/utf
\x{2c00}
/^\p{Script=Glag}/utf
\x{1e02a}
# Script extension check
/^\p{Glagolitic}/utf
\x{b7}
/^\p{scx=Glag}/utf
\x{a66f}
# Script extension only character
/^\p{Glagolitic}/utf
\x{b7}
/^\p{sc=Glagolitic}/utf
\x{b7}
# Character not in script
/^\p{Glagolitic}/utf
\x{1e02b}
# Base script check
/^\p{sc=Tifinagh}/utf
\x{2d30}
/^\p{Script=Tfng}/utf
\x{2d7f}
# Script extension check
/^\p{Tifinagh}/utf
\x{302}
/^\p{Script_Extensions=Tfng}/utf
\x{309}
# Script extension only character
/^\p{Tifinagh}/utf
\x{302}
/^\p{sc=Tifinagh}/utf
\x{302}
# Character not in script
/^\p{Tifinagh}/utf
\x{2d80}
# Base script check
/^\p{sc=Syloti_Nagri}/utf
\x{a800}
/^\p{Script=Sylo}/utf
\x{a82c}
# Script extension check
/^\p{Syloti_Nagri}/utf
\x{964}
/^\p{scx=Sylo}/utf
\x{9ef}
# Script extension only character
/^\p{Syloti_Nagri}/utf
\x{964}
/^\p{sc=Syloti_Nagri}/utf
\x{964}
# Character not in script
/^\p{Syloti_Nagri}/utf
\x{a82d}
# Base script check
/^\p{sc=Phags_Pa}/utf
\x{a840}
/^\p{Script=Phag}/utf
\x{a877}
# Script extension check
/^\p{Phags_Pa}/utf
\x{1802}
/^\p{Script_Extensions=Phag}/utf
\x{3002}
# Script extension only character
/^\p{Phags_Pa}/utf
\x{1802}
/^\p{sc=Phags_Pa}/utf
\x{1802}
# Character not in script
/^\p{Phags_Pa}/utf
\x{a878}
# Base script check
/^\p{sc=Nko}/utf
\x{7c0}
/^\p{Script=Nkoo}/utf
\x{7ff}
# Script extension check
/^\p{Nko}/utf
\x{60c}
/^\p{scx=Nkoo}/utf
\x{fd3f}
# Script extension only character
/^\p{Nko}/utf
\x{60c}
/^\p{sc=Nko}/utf
\x{60c}
# Character not in script
/^\p{Nko}/utf
\x{fd40}
# Base script check
/^\p{sc=Kayah_Li}/utf
\x{a900}
/^\p{Script=Kali}/utf
\x{a92f}
# Script extension check
/^\p{Kayah_Li}/utf
\x{a92e}
/^\p{Script_Extensions=Kali}/utf
\x{a92e}
# Script extension only character
/^\p{Kayah_Li}/utf
\x{a92e}
/^\p{sc=Kayah_Li}/utf
\x{a92e}
# Character not in script
/^\p{Kayah_Li}/utf
\x{a930}
# Base script check
/^\p{sc=Lycian}/utf
\x{10280}
/^\p{Script=Lyci}/utf
\x{1029c}
# Script extension check
/^\p{Lycian}/utf
\x{205a}
/^\p{scx=Lyci}/utf
\x{205a}
# Script extension only character
/^\p{Lycian}/utf
\x{205a}
/^\p{sc=Lycian}/utf
\x{205a}
# Character not in script
/^\p{Lycian}/utf
\x{1029d}
# Base script check
/^\p{sc=Carian}/utf
\x{102a0}
/^\p{Script=Cari}/utf
\x{102d0}
# Script extension check
/^\p{Carian}/utf
\x{b7}
/^\p{Script_Extensions=Cari}/utf
\x{2e31}
# Script extension only character
/^\p{Carian}/utf
\x{b7}
/^\p{sc=Carian}/utf
\x{b7}
# Character not in script
/^\p{Carian}/utf
\x{102d1}
# Base script check
/^\p{sc=Lydian}/utf
\x{10920}
/^\p{Script=Lydi}/utf
\x{1093f}
# Script extension check
/^\p{Lydian}/utf
\x{b7}
/^\p{scx=Lydi}/utf
\x{2e31}
# Script extension only character
/^\p{Lydian}/utf
\x{b7}
/^\p{sc=Lydian}/utf
\x{b7}
# Character not in script
/^\p{Lydian}/utf
\x{10940}
# Base script check
/^\p{sc=Avestan}/utf
\x{10b00}
/^\p{Script=Avst}/utf
\x{10b3f}
# Script extension check
/^\p{Avestan}/utf
\x{b7}
/^\p{Script_Extensions=Avst}/utf
\x{2e31}
# Script extension only character
/^\p{Avestan}/utf
\x{b7}
/^\p{sc=Avestan}/utf
\x{b7}
# Character not in script
/^\p{Avestan}/utf
\x{10b40}
# Base script check
/^\p{sc=Samaritan}/utf
\x{800}
/^\p{Script=Samr}/utf
\x{83e}
# Script extension check
/^\p{Samaritan}/utf
\x{2e31}
/^\p{scx=Samr}/utf
\x{2e31}
# Script extension only character
/^\p{Samaritan}/utf
\x{2e31}
/^\p{sc=Samaritan}/utf
\x{2e31}
# Character not in script
/^\p{Samaritan}/utf
\x{2e32}
# Base script check
/^\p{sc=Lisu}/utf
\x{a4d0}
/^\p{Script=Lisu}/utf
\x{11fb0}
# Script extension check
/^\p{Lisu}/utf
\x{2bc}
/^\p{Script_Extensions=Lisu}/utf
\x{300b}
# Script extension only character
/^\p{Lisu}/utf
\x{2bc}
/^\p{sc=Lisu}/utf
\x{2bc}
# Character not in script
/^\p{Lisu}/utf
\x{11fb1}
# Base script check
/^\p{sc=Javanese}/utf
\x{a980}
/^\p{Script=Java}/utf
\x{a9df}
# Script extension check
/^\p{Javanese}/utf
\x{a9cf}
/^\p{scx=Java}/utf
\x{a9cf}
# Script extension only character
/^\p{Javanese}/utf
\x{a9cf}
/^\p{sc=Javanese}/utf
\x{a9cf}
# Character not in script
/^\p{Javanese}/utf
\x{a9e0}
# Base script check
/^\p{sc=Old_Turkic}/utf
\x{10c00}
/^\p{Script=Orkh}/utf
\x{10c48}
# Script extension check
/^\p{Old_Turkic}/utf
\x{205a}
/^\p{Script_Extensions=Orkh}/utf
\x{2e30}
# Script extension only character
/^\p{Old_Turkic}/utf
\x{205a}
/^\p{sc=Old_Turkic}/utf
\x{205a}
# Character not in script
/^\p{Old_Turkic}/utf
\x{10c49}
# Base script check
/^\p{sc=Kaithi}/utf
\x{11080}
/^\p{Script=Kthi}/utf
\x{110cd}
# Script extension check
/^\p{Kaithi}/utf
\x{966}
/^\p{scx=Kthi}/utf
\x{a839}
# Script extension only character
/^\p{Kaithi}/utf
\x{966}
/^\p{sc=Kaithi}/utf
\x{966}
# Character not in script
/^\p{Kaithi}/utf
\x{110ce}
# Base script check
/^\p{sc=Mandaic}/utf
\x{840}
/^\p{Script=Mand}/utf
\x{85e}
# Script extension check
/^\p{Mandaic}/utf
\x{640}
/^\p{Script_Extensions=Mand}/utf
\x{640}
# Script extension only character
/^\p{Mandaic}/utf
\x{640}
/^\p{sc=Mandaic}/utf
\x{640}
# Character not in script
/^\p{Mandaic}/utf
\x{85f}
# Base script check
/^\p{sc=Chakma}/utf
\x{11100}
/^\p{Script=Cakm}/utf
\x{11147}
# Script extension check
/^\p{Chakma}/utf
\x{9e6}
/^\p{scx=Cakm}/utf
\x{1049}
# Script extension only character
/^\p{Chakma}/utf
\x{9e6}
/^\p{sc=Chakma}/utf
\x{9e6}
# Character not in script
/^\p{Chakma}/utf
\x{11148}
# Base script check
/^\p{sc=Meroitic_Hieroglyphs}/utf
\x{10980}
/^\p{Script=Mero}/utf
\x{1099f}
# Script extension check
/^\p{Meroitic_Hieroglyphs}/utf
\x{205d}
/^\p{Script_Extensions=Mero}/utf
\x{205d}
# Script extension only character
/^\p{Meroitic_Hieroglyphs}/utf
\x{205d}
/^\p{sc=Meroitic_Hieroglyphs}/utf
\x{205d}
# Character not in script
/^\p{Meroitic_Hieroglyphs}/utf
\x{109a0}
# Base script check
/^\p{sc=Sharada}/utf
\x{11180}
/^\p{Script=Shrd}/utf
\x{111df}
# Script extension check
/^\p{Sharada}/utf
\x{951}
/^\p{scx=Shrd}/utf
\x{a838}
# Script extension only character
/^\p{Sharada}/utf
\x{951}
/^\p{sc=Sharada}/utf
\x{951}
# Character not in script
/^\p{Sharada}/utf
\x{111e0}
# Base script check
/^\p{sc=Takri}/utf
\x{11680}
/^\p{Script=Takr}/utf
\x{116c9}
# Script extension check
/^\p{Takri}/utf
\x{964}
/^\p{Script_Extensions=Takr}/utf
\x{a839}
# Script extension only character
/^\p{Takri}/utf
\x{964}
/^\p{sc=Takri}/utf
\x{964}
# Character not in script
/^\p{Takri}/utf
\x{116ca}
# Base script check
/^\p{sc=Caucasian_Albanian}/utf
\x{10530}
/^\p{Script=Aghb}/utf
\x{1056f}
# Script extension check
/^\p{Caucasian_Albanian}/utf
\x{304}
/^\p{scx=Aghb}/utf
\x{35e}
# Script extension only character
/^\p{Caucasian_Albanian}/utf
\x{304}
/^\p{sc=Caucasian_Albanian}/utf
\x{304}
# Character not in script
/^\p{Caucasian_Albanian}/utf
\x{10570}
# Base script check
/^\p{sc=Duployan}/utf
\x{1bc00}
/^\p{Script=Dupl}/utf
\x{1bc9f}
# Script extension check
/^\p{Duployan}/utf
\x{b7}
/^\p{Script_Extensions=Dupl}/utf
\x{1bca3}
# Script extension only character
/^\p{Duployan}/utf
\x{b7}
/^\p{sc=Duployan}/utf
\x{b7}
# Character not in script
/^\p{Duployan}/utf
\x{1bca4}
# Base script check
/^\p{sc=Elbasan}/utf
\x{10500}
/^\p{Script=Elba}/utf
\x{10527}
# Script extension check
/^\p{Elbasan}/utf
\x{b7}
/^\p{scx=Elba}/utf
\x{305}
# Script extension only character
/^\p{Elbasan}/utf
\x{b7}
/^\p{sc=Elbasan}/utf
\x{b7}
# Character not in script
/^\p{Elbasan}/utf
\x{10528}
# Base script check
/^\p{sc=Grantha}/utf
\x{11300}
/^\p{Script=Gran}/utf
\x{11374}
# Script extension check
/^\p{Grantha}/utf
\x{951}
/^\p{Script_Extensions=Gran}/utf
\x{11fd3}
# Script extension only character
/^\p{Grantha}/utf
\x{951}
/^\p{sc=Grantha}/utf
\x{951}
# Character not in script
/^\p{Grantha}/utf
\x{11fd4}
# Base script check
/^\p{sc=Khojki}/utf
\x{11200}
/^\p{Script=Khoj}/utf
\x{11241}
# Script extension check
/^\p{Khojki}/utf
\x{ae6}
/^\p{scx=Khoj}/utf
\x{a839}
# Script extension only character
/^\p{Khojki}/utf
\x{ae6}
/^\p{sc=Khojki}/utf
\x{ae6}
# Character not in script
/^\p{Khojki}/utf
\x{11242}
# Base script check
/^\p{sc=Linear_A}/utf
\x{10600}
/^\p{Script=Lina}/utf
\x{10767}
# Script extension check
/^\p{Linear_A}/utf
\x{10107}
/^\p{Script_Extensions=Lina}/utf
\x{10133}
# Script extension only character
/^\p{Linear_A}/utf
\x{10107}
/^\p{sc=Linear_A}/utf
\x{10107}
# Character not in script
/^\p{Linear_A}/utf
\x{10768}
# Base script check
/^\p{sc=Mahajani}/utf
\x{11150}
/^\p{Script=Mahj}/utf
\x{11176}
# Script extension check
/^\p{Mahajani}/utf
\x{b7}
/^\p{scx=Mahj}/utf
\x{a839}
# Script extension only character
/^\p{Mahajani}/utf
\x{b7}
/^\p{sc=Mahajani}/utf
\x{b7}
# Character not in script
/^\p{Mahajani}/utf
\x{11177}
# Base script check
/^\p{sc=Manichaean}/utf
\x{10ac0}
/^\p{Script=Mani}/utf
\x{10af6}
# Script extension check
/^\p{Manichaean}/utf
\x{640}
/^\p{Script_Extensions=Mani}/utf
\x{10af2}
# Script extension only character
/^\p{Manichaean}/utf
\x{640}
/^\p{sc=Manichaean}/utf
\x{640}
# Character not in script
/^\p{Manichaean}/utf
\x{10af7}
# Base script check
/^\p{sc=Modi}/utf
\x{11600}
/^\p{Script=Modi}/utf
\x{11659}
# Script extension check
/^\p{Modi}/utf
\x{a830}
/^\p{scx=Modi}/utf
\x{a839}
# Script extension only character
/^\p{Modi}/utf
\x{a830}
/^\p{sc=Modi}/utf
\x{a830}
# Character not in script
/^\p{Modi}/utf
\x{1165a}
# Base script check
/^\p{sc=Old_Permic}/utf
\x{10350}
/^\p{Script=Perm}/utf
\x{1037a}
# Script extension check
/^\p{Old_Permic}/utf
\x{b7}
/^\p{Script_Extensions=Perm}/utf
\x{483}
# Script extension only character
/^\p{Old_Permic}/utf
\x{b7}
/^\p{sc=Old_Permic}/utf
\x{b7}
# Character not in script
/^\p{Old_Permic}/utf
\x{1037b}
# Base script check
/^\p{sc=Psalter_Pahlavi}/utf
\x{10b80}
/^\p{Script=Phlp}/utf
\x{10baf}
# Script extension check
/^\p{Psalter_Pahlavi}/utf
\x{640}
/^\p{scx=Phlp}/utf
\x{640}
# Script extension only character
/^\p{Psalter_Pahlavi}/utf
\x{640}
/^\p{sc=Psalter_Pahlavi}/utf
\x{640}
# Character not in script
/^\p{Psalter_Pahlavi}/utf
\x{10bb0}
# Base script check
/^\p{sc=Khudawadi}/utf
\x{112b0}
/^\p{Script=Sind}/utf
\x{112f9}
# Script extension check
/^\p{Khudawadi}/utf
\x{964}
/^\p{Script_Extensions=Sind}/utf
\x{a839}
# Script extension only character
/^\p{Khudawadi}/utf
\x{964}
/^\p{sc=Khudawadi}/utf
\x{964}
# Character not in script
/^\p{Khudawadi}/utf
\x{112fa}
# Base script check
/^\p{sc=Tirhuta}/utf
\x{11480}
/^\p{Script=Tirh}/utf
\x{114d9}
# Script extension check
/^\p{Tirhuta}/utf
\x{951}
/^\p{scx=Tirh}/utf
\x{a839}
# Script extension only character
/^\p{Tirhuta}/utf
\x{951}
/^\p{sc=Tirhuta}/utf
\x{951}
# Character not in script
/^\p{Tirhuta}/utf
\x{114da}
# Base script check
/^\p{sc=Multani}/utf
\x{11280}
/^\p{Script=Mult}/utf
\x{112a9}
# Script extension check
/^\p{Multani}/utf
\x{a66}
/^\p{Script_Extensions=Mult}/utf
\x{a6f}
# Script extension only character
/^\p{Multani}/utf
\x{a66}
/^\p{sc=Multani}/utf
\x{a66}
# Character not in script
/^\p{Multani}/utf
\x{112aa}
# Base script check
/^\p{sc=Old_Hungarian}/utf
\x{10c80}
/^\p{Script=Hung}/utf
\x{10cff}
# Script extension check
/^\p{Old_Hungarian}/utf
\x{205a}
/^\p{scx=Hung}/utf
\x{2e41}
# Script extension only character
/^\p{Old_Hungarian}/utf
\x{205a}
/^\p{sc=Old_Hungarian}/utf
\x{205a}
# Character not in script
/^\p{Old_Hungarian}/utf
\x{10d00}
# Base script check
/^\p{sc=Adlam}/utf
\x{1e900}
/^\p{Script=Adlm}/utf
\x{1e95f}
# Script extension check
/^\p{Adlam}/utf
\x{61f}
/^\p{Script_Extensions=Adlm}/utf
\x{2e41}
# Script extension only character
/^\p{Adlam}/utf
\x{61f}
/^\p{sc=Adlam}/utf
\x{61f}
# Character not in script
/^\p{Adlam}/utf
\x{1e960}
# Base script check
/^\p{sc=Osage}/utf
\x{104b0}
/^\p{Script=Osge}/utf
\x{104fb}
# Script extension check
/^\p{Osage}/utf
\x{301}
/^\p{scx=Osge}/utf
\x{358}
# Script extension only character
/^\p{Osage}/utf
\x{301}
/^\p{sc=Osage}/utf
\x{301}
# Character not in script
/^\p{Osage}/utf
\x{104fc}
# Base script check
/^\p{sc=Tangut}/utf
\x{16fe0}
/^\p{Script=Tang}/utf
\x{18d08}
# Script extension check
/^\p{Tangut}/utf
\x{2ff0}
/^\p{Script_Extensions=Tang}/utf
\x{31ef}
# Script extension only character
/^\p{Tangut}/utf
\x{2ff0}
/^\p{sc=Tangut}/utf
\x{2ff0}
# Character not in script
/^\p{Tangut}/utf
\x{18d09}
# Base script check
/^\p{sc=Masaram_Gondi}/utf
\x{11d00}
/^\p{Script=Gonm}/utf
\x{11d59}
# Script extension check
/^\p{Masaram_Gondi}/utf
\x{964}
/^\p{scx=Gonm}/utf
\x{965}
# Script extension only character
/^\p{Masaram_Gondi}/utf
\x{964}
/^\p{sc=Masaram_Gondi}/utf
\x{964}
# Character not in script
/^\p{Masaram_Gondi}/utf
\x{11d5a}
# Base script check
/^\p{sc=Dogra}/utf
\x{11800}
/^\p{Script=Dogr}/utf
\x{1183b}
# Script extension check
/^\p{Dogra}/utf
\x{964}
/^\p{Script_Extensions=Dogr}/utf
\x{a839}
# Script extension only character
/^\p{Dogra}/utf
\x{964}
/^\p{sc=Dogra}/utf
\x{964}
# Character not in script
/^\p{Dogra}/utf
\x{1183c}
# Base script check
/^\p{sc=Gunjala_Gondi}/utf
\x{11d60}
/^\p{Script=Gong}/utf
\x{11da9}
# Script extension check
/^\p{Gunjala_Gondi}/utf
\x{b7}
/^\p{scx=Gong}/utf
\x{965}
# Script extension only character
/^\p{Gunjala_Gondi}/utf
\x{b7}
/^\p{sc=Gunjala_Gondi}/utf
\x{b7}
# Character not in script
/^\p{Gunjala_Gondi}/utf
\x{11daa}
# Base script check
/^\p{sc=Hanifi_Rohingya}/utf
\x{10d00}
/^\p{Script=Rohg}/utf
\x{10d39}
# Script extension check
/^\p{Hanifi_Rohingya}/utf
\x{60c}
/^\p{Script_Extensions=Rohg}/utf
\x{6d4}
# Script extension only character
/^\p{Hanifi_Rohingya}/utf
\x{60c}
/^\p{sc=Hanifi_Rohingya}/utf
\x{60c}
# Character not in script
/^\p{Hanifi_Rohingya}/utf
\x{10d3a}
# Base script check
/^\p{sc=Sogdian}/utf
\x{10f30}
/^\p{Script=Sogd}/utf
\x{10f59}
# Script extension check
/^\p{Sogdian}/utf
\x{640}
/^\p{scx=Sogd}/utf
\x{640}
# Script extension only character
/^\p{Sogdian}/utf
\x{640}
/^\p{sc=Sogdian}/utf
\x{640}
# Character not in script
/^\p{Sogdian}/utf
\x{10f5a}
# Base script check
/^\p{sc=Nandinagari}/utf
\x{119a0}
/^\p{Script=Nand}/utf
\x{119e4}
# Script extension check
/^\p{Nandinagari}/utf
\x{964}
/^\p{Script_Extensions=Nand}/utf
\x{a835}
# Script extension only character
/^\p{Nandinagari}/utf
\x{964}
/^\p{sc=Nandinagari}/utf
\x{964}
# Character not in script
/^\p{Nandinagari}/utf
\x{119e5}
# Base script check
/^\p{sc=Yezidi}/utf
\x{10e80}
/^\p{Script=Yezi}/utf
\x{10eb1}
# Script extension check
/^\p{Yezidi}/utf
\x{60c}
/^\p{scx=Yezi}/utf
\x{669}
# Script extension only character
/^\p{Yezidi}/utf
\x{60c}
/^\p{sc=Yezidi}/utf
\x{60c}
# Character not in script
/^\p{Yezidi}/utf
\x{10eb2}
# Base script check
/^\p{sc=Cypro_Minoan}/utf
\x{12f90}
/^\p{Script=Cpmn}/utf
\x{12ff2}
# Script extension check
/^\p{Cypro_Minoan}/utf
\x{10100}
/^\p{Script_Extensions=Cpmn}/utf
\x{10101}
# Script extension only character
/^\p{Cypro_Minoan}/utf
\x{10100}
/^\p{sc=Cypro_Minoan}/utf
\x{10100}
# Character not in script
/^\p{Cypro_Minoan}/utf
\x{12ff3}
# Base script check
/^\p{sc=Old_Uyghur}/utf
\x{10f70}
/^\p{Script=Ougr}/utf
\x{10f89}
# Script extension check
/^\p{Old_Uyghur}/utf
\x{640}
/^\p{scx=Ougr}/utf
\x{10af2}
# Script extension only character
/^\p{Old_Uyghur}/utf
\x{640}
/^\p{sc=Old_Uyghur}/utf
\x{640}
# Character not in script
/^\p{Old_Uyghur}/utf
\x{10f8a}
# Base script check
/^\p{sc=Toto}/utf
\x{1e290}
/^\p{Script=Toto}/utf
\x{1e2ae}
# Script extension check
/^\p{Toto}/utf
\x{2bc}
/^\p{Script_Extensions=Toto}/utf
\x{2bc}
# Script extension only character
/^\p{Toto}/utf
\x{2bc}
/^\p{sc=Toto}/utf
\x{2bc}
# Character not in script
/^\p{Toto}/utf
\x{1e2af}
# Base script check
/^\p{sc=Garay}/utf
\x{10d40}
/^\p{Script=Gara}/utf
\x{10d8f}
# Script extension check
/^\p{Garay}/utf
\x{60c}
/^\p{scx=Gara}/utf
\x{61f}
# Script extension only character
/^\p{Garay}/utf
\x{60c}
/^\p{sc=Garay}/utf
\x{60c}
# Character not in script
/^\p{Garay}/utf
\x{10d90}
# Base script check
/^\p{sc=Gurung_Khema}/utf
\x{16100}
/^\p{Script=Gukh}/utf
\x{16139}
# Script extension check
/^\p{Gurung_Khema}/utf
\x{965}
/^\p{Script_Extensions=Gukh}/utf
\x{965}
# Script extension only character
/^\p{Gurung_Khema}/utf
\x{965}
/^\p{sc=Gurung_Khema}/utf
\x{965}
# Character not in script
/^\p{Gurung_Khema}/utf
\x{1613a}
# Base script check
/^\p{sc=Ol_Onal}/utf
\x{1e5d0}
/^\p{Script=Onao}/utf
\x{1e5ff}
# Script extension check
/^\p{Ol_Onal}/utf
\x{964}
/^\p{scx=Onao}/utf
\x{965}
# Script extension only character
/^\p{Ol_Onal}/utf
\x{964}
/^\p{sc=Ol_Onal}/utf
\x{964}
# Character not in script
/^\p{Ol_Onal}/utf
\x{1e600}
# Base script check
/^\p{sc=Sunuwar}/utf
\x{11bc0}
/^\p{Script=Sunu}/utf
\x{11bf9}
# Script extension check
/^\p{Sunuwar}/utf
\x{300}
/^\p{Script_Extensions=Sunu}/utf
\x{331}
# Script extension only character
/^\p{Sunuwar}/utf
\x{300}
/^\p{sc=Sunuwar}/utf
\x{300}
# Character not in script
/^\p{Sunuwar}/utf
\x{11bfa}
# Base script check
/^\p{sc=Todhri}/utf
\x{105c0}
/^\p{Script=Todr}/utf
\x{105f3}
# Script extension check
/^\p{Todhri}/utf
\x{301}
/^\p{scx=Todr}/utf
\x{35e}
# Script extension only character
/^\p{Todhri}/utf
\x{301}
/^\p{sc=Todhri}/utf
\x{301}
# Character not in script
/^\p{Todhri}/utf
\x{105f4}
# Base script check
/^\p{sc=Tulu_Tigalari}/utf
\x{11380}
/^\p{Script=Tutg}/utf
\x{113e2}
# Script extension check
/^\p{Tulu_Tigalari}/utf
\x{ce6}
/^\p{Script_Extensions=Tutg}/utf
\x{a8f1}
# Script extension only character
/^\p{Tulu_Tigalari}/utf
\x{ce6}
/^\p{sc=Tulu_Tigalari}/utf
\x{ce6}
# Character not in script
/^\p{Tulu_Tigalari}/utf
\x{113e3}
# Base script check
/^\p{sc=Common}/utf
\x{00}
/^\p{Script=Zyyy}/utf
\x{e007f}
# Character not in script
/^\p{Common}/utf
\x{e0080}
# Base script check
/^\p{sc=Lao}/utf
\x{e81}
/^\p{Script=Laoo}/utf
\x{edf}
# Character not in script
/^\p{Lao}/utf
\x{ee0}
# Base script check
/^\p{sc=Canadian_Aboriginal}/utf
\x{1400}
/^\p{Script=Cans}/utf
\x{11abf}
# Character not in script
/^\p{Canadian_Aboriginal}/utf
\x{11ac0}
# Base script check
/^\p{sc=Ogham}/utf
\x{1680}
/^\p{Script=Ogam}/utf
\x{169c}
# Character not in script
/^\p{Ogham}/utf
\x{169d}
# Base script check
/^\p{sc=Khmer}/utf
\x{1780}
/^\p{Script=Khmr}/utf
\x{19ff}
# Character not in script
/^\p{Khmer}/utf
\x{1a00}
# Base script check
/^\p{sc=Old_Italic}/utf
\x{10300}
/^\p{Script=Ital}/utf
\x{1032f}
# Character not in script
/^\p{Old_Italic}/utf
\x{10330}
# Base script check
/^\p{sc=Deseret}/utf
\x{10400}
/^\p{Script=Dsrt}/utf
\x{1044f}
# Character not in script
/^\p{Deseret}/utf
\x{10450}
# Base script check
/^\p{sc=Inherited}/utf
\x{300}
/^\p{Script=Zinh}/utf
\x{e01ef}
# Character not in script
/^\p{Inherited}/utf
\x{e01f0}
# Base script check
/^\p{sc=Ugaritic}/utf
\x{10380}
/^\p{Script=Ugar}/utf
\x{1039f}
# Character not in script
/^\p{Ugaritic}/utf
\x{103a0}
# Base script check
/^\p{sc=Osmanya}/utf
\x{10480}
/^\p{Script=Osma}/utf
\x{104a9}
# Character not in script
/^\p{Osmanya}/utf
\x{104aa}
# Base script check
/^\p{sc=Braille}/utf
\x{2800}
/^\p{Script=Brai}/utf
\x{28ff}
# Character not in script
/^\p{Braille}/utf
\x{2900}
# Base script check
/^\p{sc=New_Tai_Lue}/utf
\x{1980}
/^\p{Script=Talu}/utf
\x{19df}
# Character not in script
/^\p{New_Tai_Lue}/utf
\x{19e0}
# Base script check
/^\p{sc=Old_Persian}/utf
\x{103a0}
/^\p{Script=Xpeo}/utf
\x{103d5}
# Character not in script
/^\p{Old_Persian}/utf
\x{103d6}
# Base script check
/^\p{sc=Kharoshthi}/utf
\x{10a00}
/^\p{Script=Khar}/utf
\x{10a58}
# Character not in script
/^\p{Kharoshthi}/utf
\x{10a59}
# Base script check
/^\p{sc=Balinese}/utf
\x{1b00}
/^\p{Script=Bali}/utf
\x{1b7f}
# Character not in script
/^\p{Balinese}/utf
\x{1b80}
# Base script check
/^\p{sc=Cuneiform}/utf
\x{12000}
/^\p{Script=Xsux}/utf
\x{12543}
# Character not in script
/^\p{Cuneiform}/utf
\x{12544}
# Base script check
/^\p{sc=Phoenician}/utf
\x{10900}
/^\p{Script=Phnx}/utf
\x{1091f}
# Character not in script
/^\p{Phoenician}/utf
\x{10920}
# Base script check
/^\p{sc=Sundanese}/utf
\x{1b80}
/^\p{Script=Sund}/utf
\x{1cc7}
# Character not in script
/^\p{Sundanese}/utf
\x{1cc8}
# Base script check
/^\p{sc=Lepcha}/utf
\x{1c00}
/^\p{Script=Lepc}/utf
\x{1c4f}
# Character not in script
/^\p{Lepcha}/utf
\x{1c50}
# Base script check
/^\p{sc=Ol_Chiki}/utf
\x{1c50}
/^\p{Script=Olck}/utf
\x{1c7f}
# Character not in script
/^\p{Ol_Chiki}/utf
\x{1c80}
# Base script check
/^\p{sc=Vai}/utf
\x{a500}
/^\p{Script=Vaii}/utf
\x{a62b}
# Character not in script
/^\p{Vai}/utf
\x{a62c}
# Base script check
/^\p{sc=Saurashtra}/utf
\x{a880}
/^\p{Script=Saur}/utf
\x{a8d9}
# Character not in script
/^\p{Saurashtra}/utf
\x{a8da}
# Base script check
/^\p{sc=Rejang}/utf
\x{a930}
/^\p{Script=Rjng}/utf
\x{a95f}
# Character not in script
/^\p{Rejang}/utf
\x{a960}
# Base script check
/^\p{sc=Cham}/utf
\x{aa00}
/^\p{Script=Cham}/utf
\x{aa5f}
# Character not in script
/^\p{Cham}/utf
\x{aa60}
# Base script check
/^\p{sc=Tai_Tham}/utf
\x{1a20}
/^\p{Script=Lana}/utf
\x{1aad}
# Character not in script
/^\p{Tai_Tham}/utf
\x{1aae}
# Base script check
/^\p{sc=Tai_Viet}/utf
\x{aa80}
/^\p{Script=Tavt}/utf
\x{aadf}
# Character not in script
/^\p{Tai_Viet}/utf
\x{aae0}
# Base script check
/^\p{sc=Egyptian_Hieroglyphs}/utf
\x{13000}
/^\p{Script=Egyp}/utf
\x{143fa}
# Character not in script
/^\p{Egyptian_Hieroglyphs}/utf
\x{143fb}
# Base script check
/^\p{sc=Bamum}/utf
\x{a6a0}
/^\p{Script=Bamu}/utf
\x{16a38}
# Character not in script
/^\p{Bamum}/utf
\x{16a39}
# Base script check
/^\p{sc=Meetei_Mayek}/utf
\x{aae0}
/^\p{Script=Mtei}/utf
\x{abf9}
# Character not in script
/^\p{Meetei_Mayek}/utf
\x{abfa}
# Base script check
/^\p{sc=Imperial_Aramaic}/utf
\x{10840}
/^\p{Script=Armi}/utf
\x{1085f}
# Character not in script
/^\p{Imperial_Aramaic}/utf
\x{10860}
# Base script check
/^\p{sc=Old_South_Arabian}/utf
\x{10a60}
/^\p{Script=Sarb}/utf
\x{10a7f}
# Character not in script
/^\p{Old_South_Arabian}/utf
\x{10a80}
# Base script check
/^\p{sc=Inscriptional_Parthian}/utf
\x{10b40}
/^\p{Script=Prti}/utf
\x{10b5f}
# Character not in script
/^\p{Inscriptional_Parthian}/utf
\x{10b60}
# Base script check
/^\p{sc=Inscriptional_Pahlavi}/utf
\x{10b60}
/^\p{Script=Phli}/utf
\x{10b7f}
# Character not in script
/^\p{Inscriptional_Pahlavi}/utf
\x{10b80}
# Base script check
/^\p{sc=Batak}/utf
\x{1bc0}
/^\p{Script=Batk}/utf
\x{1bff}
# Character not in script
/^\p{Batak}/utf
\x{1c00}
# Base script check
/^\p{sc=Brahmi}/utf
\x{11000}
/^\p{Script=Brah}/utf
\x{1107f}
# Character not in script
/^\p{Brahmi}/utf
\x{11080}
# Base script check
/^\p{sc=Meroitic_Cursive}/utf
\x{109a0}
/^\p{Script=Merc}/utf
\x{109ff}
# Character not in script
/^\p{Meroitic_Cursive}/utf
\x{10a00}
# Base script check
/^\p{sc=Miao}/utf
\x{16f00}
/^\p{Script=Plrd}/utf
\x{16f9f}
# Character not in script
/^\p{Miao}/utf
\x{16fa0}
# Base script check
/^\p{sc=Sora_Sompeng}/utf
\x{110d0}
/^\p{Script=Sora}/utf
\x{110f9}
# Character not in script
/^\p{Sora_Sompeng}/utf
\x{110fa}
# Base script check
/^\p{sc=Bassa_Vah}/utf
\x{16ad0}
/^\p{Script=Bass}/utf
\x{16af5}
# Character not in script
/^\p{Bassa_Vah}/utf
\x{16af6}
# Base script check
/^\p{sc=Pahawh_Hmong}/utf
\x{16b00}
/^\p{Script=Hmng}/utf
\x{16b8f}
# Character not in script
/^\p{Pahawh_Hmong}/utf
\x{16b90}
# Base script check
/^\p{sc=Mende_Kikakui}/utf
\x{1e800}
/^\p{Script=Mend}/utf
\x{1e8d6}
# Character not in script
/^\p{Mende_Kikakui}/utf
\x{1e8d7}
# Base script check
/^\p{sc=Mro}/utf
\x{16a40}
/^\p{Script=Mroo}/utf
\x{16a6f}
# Character not in script
/^\p{Mro}/utf
\x{16a70}
# Base script check
/^\p{sc=Old_North_Arabian}/utf
\x{10a80}
/^\p{Script=Narb}/utf
\x{10a9f}
# Character not in script
/^\p{Old_North_Arabian}/utf
\x{10aa0}
# Base script check
/^\p{sc=Nabataean}/utf
\x{10880}
/^\p{Script=Nbat}/utf
\x{108af}
# Character not in script
/^\p{Nabataean}/utf
\x{108b0}
# Base script check
/^\p{sc=Palmyrene}/utf
\x{10860}
/^\p{Script=Palm}/utf
\x{1087f}
# Character not in script
/^\p{Palmyrene}/utf
\x{10880}
# Base script check
/^\p{sc=Pau_Cin_Hau}/utf
\x{11ac0}
/^\p{Script=Pauc}/utf
\x{11af8}
# Character not in script
/^\p{Pau_Cin_Hau}/utf
\x{11af9}
# Base script check
/^\p{sc=Siddham}/utf
\x{11580}
/^\p{Script=Sidd}/utf
\x{115dd}
# Character not in script
/^\p{Siddham}/utf
\x{115de}
# Base script check
/^\p{sc=Warang_Citi}/utf
\x{118a0}
/^\p{Script=Wara}/utf
\x{118ff}
# Character not in script
/^\p{Warang_Citi}/utf
\x{11900}
# Base script check
/^\p{sc=Ahom}/utf
\x{11700}
/^\p{Script=Ahom}/utf
\x{11746}
# Character not in script
/^\p{Ahom}/utf
\x{11747}
# Base script check
/^\p{sc=Anatolian_Hieroglyphs}/utf
\x{14400}
/^\p{Script=Hluw}/utf
\x{14646}
# Character not in script
/^\p{Anatolian_Hieroglyphs}/utf
\x{14647}
# Base script check
/^\p{sc=Hatran}/utf
\x{108e0}
/^\p{Script=Hatr}/utf
\x{108ff}
# Character not in script
/^\p{Hatran}/utf
\x{10900}
# Base script check
/^\p{sc=SignWriting}/utf
\x{1d800}
/^\p{Script=Sgnw}/utf
\x{1daaf}
# Character not in script
/^\p{SignWriting}/utf
\x{1dab0}
# Base script check
/^\p{sc=Bhaiksuki}/utf
\x{11c00}
/^\p{Script=Bhks}/utf
\x{11c6c}
# Character not in script
/^\p{Bhaiksuki}/utf
\x{11c6d}
# Base script check
/^\p{sc=Marchen}/utf
\x{11c70}
/^\p{Script=Marc}/utf
\x{11cb6}
# Character not in script
/^\p{Marchen}/utf
\x{11cb7}
# Base script check
/^\p{sc=Newa}/utf
\x{11400}
/^\p{Script=Newa}/utf
\x{11461}
# Character not in script
/^\p{Newa}/utf
\x{11462}
# Base script check
/^\p{sc=Nushu}/utf
\x{16fe1}
/^\p{Script=Nshu}/utf
\x{1b2fb}
# Character not in script
/^\p{Nushu}/utf
\x{1b2fc}
# Base script check
/^\p{sc=Soyombo}/utf
\x{11a50}
/^\p{Script=Soyo}/utf
\x{11aa2}
# Character not in script
/^\p{Soyombo}/utf
\x{11aa3}
# Base script check
/^\p{sc=Zanabazar_Square}/utf
\x{11a00}
/^\p{Script=Zanb}/utf
\x{11a47}
# Character not in script
/^\p{Zanabazar_Square}/utf
\x{11a48}
# Base script check
/^\p{sc=Makasar}/utf
\x{11ee0}
/^\p{Script=Maka}/utf
\x{11ef8}
# Character not in script
/^\p{Makasar}/utf
\x{11ef9}
# Base script check
/^\p{sc=Medefaidrin}/utf
\x{16e40}
/^\p{Script=Medf}/utf
\x{16e9a}
# Character not in script
/^\p{Medefaidrin}/utf
\x{16e9b}
# Base script check
/^\p{sc=Old_Sogdian}/utf
\x{10f00}
/^\p{Script=Sogo}/utf
\x{10f27}
# Character not in script
/^\p{Old_Sogdian}/utf
\x{10f28}
# Base script check
/^\p{sc=Elymaic}/utf
\x{10fe0}
/^\p{Script=Elym}/utf
\x{10ff6}
# Character not in script
/^\p{Elymaic}/utf
\x{10ff7}
# Base script check
/^\p{sc=Nyiakeng_Puachue_Hmong}/utf
\x{1e100}
/^\p{Script=Hmnp}/utf
\x{1e14f}
# Character not in script
/^\p{Nyiakeng_Puachue_Hmong}/utf
\x{1e150}
# Base script check
/^\p{sc=Wancho}/utf
\x{1e2c0}
/^\p{Script=Wcho}/utf
\x{1e2ff}
# Character not in script
/^\p{Wancho}/utf
\x{1e300}
# Base script check
/^\p{sc=Chorasmian}/utf
\x{10fb0}
/^\p{Script=Chrs}/utf
\x{10fcb}
# Character not in script
/^\p{Chorasmian}/utf
\x{10fcc}
# Base script check
/^\p{sc=Dives_Akuru}/utf
\x{11900}
/^\p{Script=Diak}/utf
\x{11959}
# Character not in script
/^\p{Dives_Akuru}/utf
\x{1195a}
# Base script check
/^\p{sc=Khitan_Small_Script}/utf
\x{16fe4}
/^\p{Script=Kits}/utf
\x{18cff}
# Character not in script
/^\p{Khitan_Small_Script}/utf
\x{18d00}
# Base script check
/^\p{sc=Tangsa}/utf
\x{16a70}
/^\p{Script=Tnsa}/utf
\x{16ac9}
# Character not in script
/^\p{Tangsa}/utf
\x{16aca}
# Base script check
/^\p{sc=Vithkuqi}/utf
\x{10570}
/^\p{Script=Vith}/utf
\x{105bc}
# Character not in script
/^\p{Vithkuqi}/utf
\x{105bd}
# Base script check
/^\p{sc=Kawi}/utf
\x{11f00}
/^\p{Script=Kawi}/utf
\x{11f5a}
# Character not in script
/^\p{Kawi}/utf
\x{11f5b}
# Base script check
/^\p{sc=Nag_Mundari}/utf
\x{1e4d0}
/^\p{Script=Nagm}/utf
\x{1e4f9}
# Character not in script
/^\p{Nag_Mundari}/utf
\x{1e4fa}
# Base script check
/^\p{sc=Kirat_Rai}/utf
\x{16d40}
/^\p{Script=Krai}/utf
\x{16d79}
# Character not in script
/^\p{Kirat_Rai}/utf
\x{16d7a}
# End of test

113
3rd/pcre2/testdata/testinput3 vendored Normal file
View File

@@ -0,0 +1,113 @@
# This set of tests checks locale-specific features, using the "fr_FR" locale.
# It is almost Perl-compatible. When run via RunTest, the locale is edited to
# be whichever of "fr_FR", "french", or "fr" is found to exist. There is
# a different version of this file called wintestinput3 for use on Windows,
# where the locale is called "french" and the tests are run using
# RunTest.bat.
#forbid_utf
/^[\w]+/
\= Expect no match
<20>cole
/^[\w]+/locale=fr_FR
<20>cole
/^[\W]+/
<20>cole
/^[\W]+/locale=fr_FR
\= Expect no match
<20>cole
/[\b]/
\b
\= Expect no match
a
/[\b]/locale=fr_FR
\b
\= Expect no match
a
/^\w+/
\= Expect no match
<20>cole
/^\w+/locale=fr_FR
<20>cole
/(.+)\b(.+)/
<20>cole
/(.+)\b(.+)/locale=fr_FR
\= Expect no match
<20>cole
/<2F>cole/i
<20>cole
\= Expect no match
<20>cole
/<2F>cole/i,locale=fr_FR
<20>cole
<20>cole
/\w/I
/\w/I,locale=fr_FR
# All remaining tests are in the fr_FR locale, so set the default.
#pattern locale=fr_FR
/^[\xc8-\xc9]/i
<20>cole
<20>cole
/^[\xc8-\xc9]/
<20>cole
\= Expect no match
<20>cole
/\xb5/i
<20>
\= Expect no match
\x9c
/<2F>/i
\xff
\= Expect no match
y
/(.)\1/i
\xfe\xde
/\W+/
>>>\xaa<<<
>>>\xba<<<
/[\W]+/
>>>\xaa<<<
>>>\xba<<<
/[^[:alpha:]]+/
>>>\xaa<<<
>>>\xba<<<
/\w+/
>>>\xaa<<<
>>>\xba<<<
/[\w]+/
>>>\xaa<<<
>>>\xba<<<
/[[:alpha:]]+/
>>>\xaa<<<
>>>\xba<<<
/[[:alpha:]][[:lower:]][[:upper:]]/IB
# End of testinput3

3119
3rd/pcre2/testdata/testinput4 vendored Normal file
View File

@@ -0,0 +1,3119 @@
# This set of tests is for UTF support, including Unicode properties. The
# Unicode tests are all compatible with all versions of Perl >= 5.10, but
# some of the property tests may differ because of different versions of
# Unicode in use by PCRE2 and Perl.
# WARNING: Use only / as the pattern delimiter. Although pcre2test supports
# a number of delimiters, all those other than / give problems with the
# perltest.sh script.
#newline_default lf anycrlf any
#perltest
/a.b/utf
acb
a\x7fb
a\x{100}b
\= Expect no match
a\nb
/a(.{3})b/utf
a\x{4000}xyb
a\x{4000}\x7fyb
a\x{4000}\x{100}yb
\= Expect no match
a\x{4000}b
ac\ncb
/a(.*?)(.)/
a\xc0\x88b
/a(.*?)(.)/utf
a\x{100}b
/a(.*)(.)/
a\xc0\x88b
/a(.*)(.)/utf
a\x{100}b
/a(.)(.)/
a\xc0\x92bcd
/a(.)(.)/utf
a\x{240}bcd
/a(.?)(.)/
a\xc0\x92bcd
/a(.?)(.)/utf
a\x{240}bcd
/a(.??)(.)/
a\xc0\x92bcd
/a(.??)(.)/utf
a\x{240}bcd
/a(.{3})b/utf
a\x{1234}xyb
a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
\= Expect no match
a\x{1234}b
ac\ncb
/a(.{3,})b/utf
a\x{1234}xyb
a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
a\x{1234}\x{4321}\x{3412}\x{3421}b
\= Expect no match
a\x{1234}b
/a(.{3,}?)b/utf
a\x{1234}xyb
a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
a\x{1234}\x{4321}\x{3412}\x{3421}b
\= Expect no match
a\x{1234}b
/a(.{3,5})b/utf
a\x{1234}xyb
a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
a\x{1234}\x{4321}\x{3412}\x{3421}b
axbxxbcdefghijb
axxxxxbcdefghijb
\= Expect no match
a\x{1234}b
axxxxxxbcdefghijb
/a(.{3,5}?)b/utf
a\x{1234}xyb
a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
a\x{1234}\x{4321}\x{3412}\x{3421}b
axbxxbcdefghijb
axxxxxbcdefghijb
\= Expect no match
a\x{1234}b
axxxxxxbcdefghijb
/^[a\x{c0}]/utf
\= Expect no match
\x{100}
/(?<=aXb)cd/utf
aXbcd
/(?<=a\x{100}b)cd/utf
a\x{100}bcd
/(?<=a\x{100000}b)cd/utf
a\x{100000}bcd
/(?:\x{100}){3}b/utf
\x{100}\x{100}\x{100}b
\= Expect no match
\x{100}\x{100}b
/\x{ab}/utf
\x{ab}
\xc2\xab
\= Expect no match
\x00{ab}
/(?<=(.))X/utf
WXYZ
\x{256}XYZ
\= Expect no match
XYZ
/[^a]+/g,utf
bcd
\x{100}aY\x{256}Z
/^[^a]{2}/utf
\x{100}bc
/^[^a]{2,}/utf
\x{100}bcAa
/^[^a]{2,}?/utf
\x{100}bca
/[^a]+/gi,utf
bcd
\x{100}aY\x{256}Z
/^[^a]{2}/i,utf
\x{100}bc
/^[^a]{2,}/i,utf
\x{100}bcAa
/^[^a]{2,}?/i,utf
\x{100}bca
/\x{100}{0,0}/utf
abcd
/\x{100}?/utf
abcd
\x{100}\x{100}
/\x{100}{0,3}/utf
\x{100}\x{100}
\x{100}\x{100}\x{100}\x{100}
/\x{100}*/utf
abce
\x{100}\x{100}\x{100}\x{100}
/\x{100}{1,1}/utf
abcd\x{100}\x{100}\x{100}\x{100}
/\x{100}{1,3}/utf
abcd\x{100}\x{100}\x{100}\x{100}
/\x{100}+/utf
abcd\x{100}\x{100}\x{100}\x{100}
/\x{100}{3}/utf
abcd\x{100}\x{100}\x{100}XX
/\x{100}{3,5}/utf
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
/\x{100}{3,}/utf
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
/(?<=a\x{100}{2}b)X/utf,aftertext
Xyyya\x{100}\x{100}bXzzz
/\D*/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
/\D*/utf
\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
/\D/utf
1X2
1\x{100}2
/>\S/utf
> >X Y
> >\x{100} Y
/\d/utf
\x{100}3
/\s/utf
\x{100} X
/\D+/utf
12abcd34
\= Expect no match
1234
/\D{2,3}/utf
12abcd34
12ab34
\= Expect no match
1234
12a34
/\D{2,3}?/utf
12abcd34
12ab34
\= Expect no match
1234
12a34
/\d+/utf
12abcd34
/\d{2,3}/utf
12abcd34
1234abcd
\= Expect no match
1.4
/\d{2,3}?/utf
12abcd34
1234abcd
\= Expect no match
1.4
/\S+/utf
12abcd34
\= Expect no match
\ \
/\S{2,3}/utf
12abcd34
1234abcd
\= Expect no match
\ \
/\S{2,3}?/utf
12abcd34
1234abcd
\= Expect no match
\ \
/>\s+</utf,aftertext
12> <34
/>\s{2,3}</utf,aftertext
ab> <cd
ab> <ce
\= Expect no match
ab> <cd
/>\s{2,3}?</utf,aftertext
ab> <cd
ab> <ce
\= Expect no match
ab> <cd
/\w+/utf
12 34
\= Expect no match
+++=*!
/\w{2,3}/utf
ab cd
abcd ce
\= Expect no match
a.b.c
/\w{2,3}?/utf
ab cd
abcd ce
\= Expect no match
a.b.c
/\W+/utf
12====34
\= Expect no match
abcd
/\W{2,3}/utf
ab====cd
ab==cd
\= Expect no match
a.b.c
/\W{2,3}?/utf
ab====cd
ab==cd
\= Expect no match
a.b.c
/[\x{100}]/utf
\x{100}
Z\x{100}
\x{100}Z
/[Z\x{100}]/utf
Z\x{100}
\x{100}
\x{100}Z
/[\x{100}\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
/[\x{100}-\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
ab\x{111}cd
/[z-\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
ab\x{111}cd
abzcd
ab|cd
/[Q\x{100}\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
Q?
/[Q\x{100}-\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
ab\x{111}cd
Q?
/[Qz-\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
ab\x{111}cd
abzcd
ab|cd
Q?
/[\x{100}\x{200}]{1,3}/utf
ab\x{100}cd
ab\x{200}cd
ab\x{200}\x{100}\x{200}\x{100}cd
/[\x{100}\x{200}]{1,3}?/utf
ab\x{100}cd
ab\x{200}cd
ab\x{200}\x{100}\x{200}\x{100}cd
/[Q\x{100}\x{200}]{1,3}/utf
ab\x{100}cd
ab\x{200}cd
ab\x{200}\x{100}\x{200}\x{100}cd
/[Q\x{100}\x{200}]{1,3}?/utf
ab\x{100}cd
ab\x{200}cd
ab\x{200}\x{100}\x{200}\x{100}cd
/(?<=[\x{100}\x{200}])X/utf
abc\x{200}X
abc\x{100}X
\= Expect no match
X
/(?<=[Q\x{100}\x{200}])X/utf
abc\x{200}X
abc\x{100}X
abQX
\= Expect no match
X
/(?<=[\x{100}\x{200}]{3})X/utf
abc\x{100}\x{200}\x{100}X
\= Expect no match
abc\x{200}X
X
/[^\x{100}\x{200}]X/utf
AX
\x{150}X
\x{500}X
\= Expect no match
\x{100}X
\x{200}X
/[^Q\x{100}\x{200}]X/utf
AX
\x{150}X
\x{500}X
\= Expect no match
\x{100}X
\x{200}X
QX
/[^\x{100}-\x{200}]X/utf
AX
\x{500}X
\= Expect no match
\x{100}X
\x{150}X
\x{200}X
/[z-\x{100}]/i,utf
z
Z
\x{100}
\= Expect no match
\x{102}
y
/[\xFF]/
>\xff<
/[\xff]/utf
>\x{ff}<
/[^\xFF]/
XYZ
/[^\xff]/utf
XYZ
\x{123}
/^[ac]*b/utf
\= Expect no match
xb
/^[ac\x{100}]*b/utf
\= Expect no match
xb
/^[^x]*b/i,utf
\= Expect no match
xb
/^[^x]*b/utf
\= Expect no match
xb
/^\d*b/utf
\= Expect no match
xb
/(|a)/g,utf
catac
a\x{256}a
/^\x{85}$/i,utf
\x{85}
/^ሴ/utf
/^\ሴ/utf
/(?s)(.{1,5})/utf
abcdefg
ab
/a*\x{100}*\w/utf
a
/\S\S/g,utf
A\x{a3}BC
/\S{2}/g,utf
A\x{a3}BC
/\W\W/g,utf
+\x{a3}==
/\W{2}/g,utf
+\x{a3}==
/\S/g,utf
\x{442}\x{435}\x{441}\x{442}
/[\S]/g,utf
\x{442}\x{435}\x{441}\x{442}
/\D/g,utf
\x{442}\x{435}\x{441}\x{442}
/[\D]/g,utf
\x{442}\x{435}\x{441}\x{442}
/\W/g,utf
\x{2442}\x{2435}\x{2441}\x{2442}
/[\W]/g,utf
\x{2442}\x{2435}\x{2441}\x{2442}
/[\S\s]*/utf
abc\n\r\x{442}\x{435}\x{441}\x{442}xyz
/[\x{41f}\S]/g,utf
\x{442}\x{435}\x{441}\x{442}
/.[^\S]./g,utf
abc def\x{442}\x{443}xyz\npqr
/.[^\S\n]./g,utf
abc def\x{442}\x{443}xyz\npqr
/[[:^alnum:]]/g,utf
+\x{2442}
/[[:^alpha:]]/g,utf
+\x{2442}
/[[:^ascii:]]/g,utf
A\x{442}
/[[:^blank:]]/g,utf
A\x{442}
/[[:^cntrl:]]/g,utf
A\x{442}
/[[:^digit:]]/g,utf
A\x{442}
/[[:^graph:]]/g,utf
\x19\x{e01ff}
/[[:^lower:]]/g,utf
A\x{422}
/[[:^print:]]/g,utf
\x{19}\x{e01ff}
/[[:^punct:]]/g,utf
A\x{442}
/[[:^space:]]/g,utf
A\x{442}
/[[:^upper:]]/g,utf
a\x{442}
/[[:^word:]]/g,utf
+\x{2442}
/[[:^xdigit:]]/g,utf
M\x{442}
/[^ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞĀĂĄĆĈĊČĎĐĒĔĖĘĚĜĞĠĢĤĦĨĪĬĮİIJĴĶĹĻĽĿŁŃŅŇŊŌŎŐŒŔŖŘŚŜŞŠŢŤŦŨŪŬŮŰŲŴŶŸŹŻŽƁƂƄƆƇƉƊƋƎƏƐƑƓƔƖƗƘƜƝƟƠƢƤƦƧƩƬƮƯƱƲƳƵƷƸƼDŽLJNJǍǏǑǓǕǗǙǛǞǠǢǤǦǨǪǬǮDZǴǶǷǸǺǼǾȀȂȄȆȈȊȌȎȐȒȔȖȘȚȜȞȠȢȤȦȨȪȬȮȰȲȺȻȽȾɁΆΈΉΊΌΎΏΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩΪΫϒϓϔϘϚϜϞϠϢϤϦϨϪϬϮϴϷϹϺϽϾϿЀЁЂЃЄЅІЇЈЉЊЋЌЍЎЏАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯѠѢѤѦѨѪѬѮѰѲѴѶѸѺѼѾҀҊҌҎҐҒҔҖҘҚҜҞҠҢҤҦҨҪҬҮҰҲҴҶҸҺҼҾӀӁӃӅӇӉӋӍӐӒӔӖӘӚӜӞӠӢӤӦӨӪӬӮӰӲӴӶӸԀԂԄԆԈԊԌԎԱԲԳԴԵԶԷԸԹԺԻԼԽԾԿՀՁՂՃՄՅՆՇՈՉՊՋՌՍՎՏՐՑՒՓՔՕՖႠႡႢႣႤႥႦႧႨႩႪႫႬႭႮႯႰႱႲႳႴႵႶႷႸႹႺႻႼႽႾႿჀჁჂჃჄჅḀḂḄḆḈḊḌḎḐḒḔḖḘḚḜḞḠḢḤḦḨḪḬḮḰḲḴḶḸḺḼḾṀṂṄṆṈṊṌṎṐṒṔṖṘṚṜṞṠṢṤṦṨṪṬṮṰṲṴṶṸṺṼṾẀẂẄẆẈẊẌẎẐẒẔẠẢẤẦẨẪẬẮẰẲẴẶẸẺẼẾỀỂỄỆỈỊỌỎỐỒỔỖỘỚỜỞỠỢỤỦỨỪỬỮỰỲỴỶỸἈἉἊἋἌἍἎἏἘἙἚἛἜἝἨἩἪἫἬἭἮἯἸἹἺἻἼἽἾἿὈὉὊὋὌὍὙὛὝὟὨὩὪὫὬὭὮὯᾸᾹᾺΆῈΈῊΉῘῙῚΊῨῩῪΎῬῸΌῺΏabcdefghijklmnopqrstuvwxyzªµºßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿāăąćĉċčďđēĕėęěĝğġģĥħĩīĭįıijĵķĸĺļľŀłńņňʼnŋōŏőœŕŗřśŝşšţťŧũūŭůűųŵŷźżžſƀƃƅƈƌƍƒƕƙƚƛƞơƣƥƨƪƫƭưƴƶƹƺƽƾƿdžljnjǎǐǒǔǖǘǚǜǝǟǡǣǥǧǩǫǭǯǰdzǵǹǻǽǿȁȃȅȇȉȋȍȏȑȓȕȗșțȝȟȡȣȥȧȩȫȭȯȱȳȴȵȶȷȸȹȼȿɀɐɑɒɓɔɕɖɗɘəɚɛɜɝɞɟɠɡɢɣɤɥɦɧɨɩɪɫɬɭɮɯɰɱɲɳɴɵɶɷɸɹɺɻɼɽɾɿʀʁʂʃʄʅʆʇʈʉʊʋʌʍʎʏʐʑʒʓʔʕʖʗʘʙʚʛʜʝʞʟʠʡʢʣʤʥʦʧʨʩʪʫʬʭʮʯΐάέήίΰαβγδεζηθικλμνξοπρςστυφχψωϊϋόύώϐϑϕϖϗϙϛϝϟϡϣϥϧϩϫϭϯϰϱϲϳϵϸϻϼабвгдежзийклмнопрстуфхцчшщъыьэюяѐёђѓєѕіїјљњћќѝўџѡѣѥѧѩѫѭѯѱѳѵѷѹѻѽѿҁҋҍҏґғҕҗҙқҝҟҡңҥҧҩҫҭүұҳҵҷҹһҽҿӂӄӆӈӊӌӎӑӓӕӗәӛӝӟӡӣӥӧөӫӭӯӱӳӵӷӹԁԃԅԇԉԋԍԏաբգդեզէըթժիլխծկհձղճմյնշոչպջռսվտրցւփքօֆևᴀᴁᴂᴃᴅᴆᴇᴈᴉᴊᴋᴌᴍᴎᴒᴓᴔᴕᴖᴗᴘᴙᴚᴛᴝᴞᴟᴣᴤᴥᴧᴨᴩᴪᴫᵢᵣᵤᵥᵦᵧᵨᵩᵪᵫᵬᵭᵮᵯᵰᵱᵲᵳᵴᵵᵶᵷᵹᵺᵻᵼᵽᵾᵿᶀᶁᶂᶄᶅᶆᶇᶈᶉᶊᶋᶍᶎᶏᶐᶑᶒᶓᶔᶕᶖᶗᶘᶙᶚḁḃḅḇḉḋḍḏḑḓḕḗḙḛḝḟḡḣḥḧḩḫḭḯḱḳḵḷḹḻḽḿṁṃṅṇṉṋṍṏṑṓṕṗṙṛṝṟṡṣṥṧṩṫṭṯṱṳṵṷṹṻṽṿẁẃẅẇẉẋẍẏẑẓẕẖẗẘẙẚẛạảấầẩẫậắằẳẵặẹẻẽếềểễệỉịọỏốồổỗộớờởỡợụủứừửữựỳỵỷỹἀἁἂἃἄἅἆἇἐἑἒἓἔἕἠἡἢἣἤἥἦἧἰἱἲἳἴἵἶἷὀὁὂὃὄὅὐὑὒὓὔὕὖὗὠὡὢὣὤὥὦὧὰάὲέὴήὶίὸόὺύὼώᾀᾁᾂᾃᾄᾅᾆᾇᾐᾑᾒᾓᾔᾕᾖᾗᾠᾡᾢᾣᾤᾥᾦᾧᾰᾱᾲᾳᾴᾶᾷῂῃῄῆῇῐῑῒΐῖῗῠῡῢΰῤῥῦῧῲῳῴῶῷⲁⲃⲇⲉⲋⲍⲏⲑⲓⲕⲗⲙⲛⲝⲧⲩⲫⲭⲯⲱⲳⲵⲷⲹⲻⲽⲿⳁⳃⳅⳇⳉⳋⳍⳏⳑⳓⳕⳗⳙⳛⳝⳟⳡⳣⳤⴀⴁⴂⴃⴄⴅⴆⴇⴈⴉⴊⴋⴌⴍⴎⴏⴐⴑⴒⴓⴔⴕⴖⴗⴘⴙⴚⴛⴜⴝⴞⴟⴠⴡⴢⴣⴤⴥfffiflffifflſtstﬓﬔﬕﬖﬗ\d_^]/utf
/^[^d]*?$/
abc
/^[^d]*?$/utf
abc
/^[^d]*?$/i
abc
/^[^d]*?$/i,utf
abc
/(?i)[\xc3\xa9\xc3\xbd]|[\xc3\xa9\xc3\xbdA]/utf
/^[a\x{c0}]b/utf
\x{c0}b
/^([a\x{c0}]*?)aa/utf
a\x{c0}aaaa/
/^([a\x{c0}]*?)aa/utf
a\x{c0}aaaa/
a\x{c0}a\x{c0}aaa/
/^([a\x{c0}]*)aa/utf
a\x{c0}aaaa/
a\x{c0}a\x{c0}aaa/
/^([a\x{c0}]*)a\x{c0}/utf
a\x{c0}aaaa/
a\x{c0}a\x{c0}aaa/
/A*/g,utf
AAB\x{123}BAA
/(abc)\1/i,utf
\= Expect no match
abc
/(abc)\1/utf
\= Expect no match
abc
/a(*:a\x{1234}b)/utf,mark
abc
/a(*:a£b)/utf,mark
abc
# Noncharacters
/./utf
\x{fffe}
\x{ffff}
\x{1fffe}
\x{1ffff}
\x{2fffe}
\x{2ffff}
\x{3fffe}
\x{3ffff}
\x{4fffe}
\x{4ffff}
\x{5fffe}
\x{5ffff}
\x{6fffe}
\x{6ffff}
\x{7fffe}
\x{7ffff}
\x{8fffe}
\x{8ffff}
\x{9fffe}
\x{9ffff}
\x{afffe}
\x{affff}
\x{bfffe}
\x{bffff}
\x{cfffe}
\x{cffff}
\x{dfffe}
\x{dffff}
\x{efffe}
\x{effff}
\x{ffffe}
\x{fffff}
\x{10fffe}
\x{10ffff}
\x{fdd0}
\x{fdd1}
\x{fdd2}
\x{fdd3}
\x{fdd4}
\x{fdd5}
\x{fdd6}
\x{fdd7}
\x{fdd8}
\x{fdd9}
\x{fdda}
\x{fddb}
\x{fddc}
\x{fddd}
\x{fdde}
\x{fddf}
\x{fde0}
\x{fde1}
\x{fde2}
\x{fde3}
\x{fde4}
\x{fde5}
\x{fde6}
\x{fde7}
\x{fde8}
\x{fde9}
\x{fdea}
\x{fdeb}
\x{fdec}
\x{fded}
\x{fdee}
\x{fdef}
/^\d*\w{4}/utf
1234
\= Expect no match
123
/^[^b]*\w{4}/utf
aaaa
\= Expect no match
aaa
/^[^b]*\w{4}/i,utf
aaaa
\= Expect no match
aaa
/^\x{100}*.{4}/utf
\x{100}\x{100}\x{100}\x{100}
\= Expect no match
\x{100}\x{100}\x{100}
/^\x{100}*.{4}/i,utf
\x{100}\x{100}\x{100}\x{100}
\= Expect no match
\x{100}\x{100}\x{100}
/^a+[a\x{200}]/utf
aa
/^.\B.\B./utf
\x{10123}\x{10124}\x{10125}
/^#[^\x{ffff}]#[^\x{ffff}]#[^\x{ffff}]#/utf
#\x{10000}#\x{100}#\x{10ffff}#
# Unicode property support tests
/^\pC\pL\pM\pN\pP\pS\pZ</utf
\x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
\np\x{300}9!\$ <
\= Expect no match
ap\x{300}9!\$ <
/^\PC/utf
X
\= Expect no match
\x7f
/^\PL/utf
9
\= Expect no match
\x{c0}
/^\PM/utf
X
\= Expect no match
\x{30f}
/^\PN/utf
X
\= Expect no match
\x{660}
/^\PP/utf
X
\= Expect no match
\x{66c}
/^\PS/utf
X
\= Expect no match
\x{f01}
/^\PZ/utf
X
\= Expect no match
\x{1680}
/^\p{Cc}/utf
\x{017}
\x{09f}
\= Expect no match
\x{0600}
/^\p{Cf}/utf
\x{601}
\= Expect no match
\x{09f}
/^\p{Cn}/utf
\x{e0000}
\= Expect no match
\x{09f}
/^\p{Co}/utf
\x{f8ff}
\= Expect no match
\x{09f}
/^\p{Ll}/utf
a
\= Expect no match
Z
\x{e000}
/^\p{Lm}/utf
\x{2b0}
\= Expect no match
a
/^\p{Lo}/utf
\x{1bb}
\x{3400}
\x{3401}
\x{4d00}
\x{4db4}
\x{4db5}
\x{4db6}
\= Expect no match
a
\x{2b0}
/^\p{Lt}/utf
\x{1c5}
\= Expect no match
a
\x{2b0}
/^\p{Lu}/utf
A
\= Expect no match
\x{2b0}
/^\p{Mc}/utf
\x{903}
\= Expect no match
X
\x{300}
/^\p{Me}/utf
\x{488}
\= Expect no match
X
\x{903}
\x{300}
/^\p{Mn}/utf
\x{300}
\= Expect no match
X
\x{903}
/^\p{Nd}+/utf
0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}\x{66a}
\x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}\x{6fa}
\x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}\x{970}
\= Expect no match
X
/^\p{Nl}/utf
\x{16ee}
\= Expect no match
X
\x{966}
/^\p{No}/utf
\x{b2}
\x{b3}
\= Expect no match
X
\x{16ee}
/^\p{Pc}/utf
\x5f
\x{203f}
\= Expect no match
X
-
\x{58a}
/^\p{Pd}/utf
-
\x{58a}
\= Expect no match
X
\x{203f}
/^\p{Pe}/utf
)
]
}
\x{f3b}
\= Expect no match
X
\x{203f}
(
[
{
\x{f3c}
/^\p{Pf}/utf
\x{bb}
\x{2019}
\= Expect no match
X
\x{203f}
/^\p{Pi}/utf
\x{ab}
\x{2018}
\= Expect no match
X
\x{203f}
/^\p{Po}/utf
!
\x{37e}
\= Expect no match
X
\x{203f}
/^\p{Ps}/utf
(
[
{
\x{f3c}
\= Expect no match
X
)
]
}
\x{f3b}
/^\p{Sk}/utf
\x{2c2}
\= Expect no match
X
\x{9f2}
/^\p{Sm}+/utf
+<|~\x{ac}\x{2044}
\= Expect no match
X
\x{9f2}
/^\p{So}/utf
\x{a6}
\x{482}
\= Expect no match
X
\x{9f2}
/^\p{Zl}/utf
\x{2028}
\= Expect no match
X
\x{2029}
/^\p{Zp}/utf
\x{2029}
\= Expect no match
X
\x{2028}
/\p{Nd}+(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}+?(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}{2,}(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}{2,}?(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}*(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}*?(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}{2}(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}{2,3}(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}{2,3}?(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}?(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}??(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}*+(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}*+(...)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}*+(....)/utf
\= Expect no match
\x{660}\x{661}\x{662}ABC
/^\pN{3,}+(.)/utf
\x{7c0}8\x{662}\x{966}\x{95c}
\x{7c0}8\x{662}\x{95c}
\= Expect no match
\x{7c0}8\x{662}\x{966}
\x{7c0}8\x{95c}
/(?<=A\p{Nd})XYZ/utf
A2XYZ
123A5XYZPQR
ABA\x{660}XYZpqr
\= Expect no match
AXYZ
XYZ
/(?<!\pL)XYZ/utf
1XYZ
AB=XYZ..
XYZ
\= Expect no match
WXYZ
/[\P{Nd}]+/utf
abcd
\= Expect no match
1234
/\D+/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
/\P{Nd}+/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
/[\D]+/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
/[\P{Nd}]+/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
/[\D\P{Nd}]+/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
/\pL/utf
a
A
/\pL/i,utf
a
A
/\p{Lu}/utf
A
aZ
\= Expect no match
abc
/\p{Ll}/utf
a
Az
\= Expect no match
ABC
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
A\x{391}\x{10427}\x{ff3a}\x{1fb0}
\= Expect no match
a\x{391}\x{10427}\x{ff3a}\x{1fb0}
A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
A\x{391}\x{10427}\x{ff5a}\x{1fb0}
A\x{391}\x{10427}\x{ff3a}\x{1fb8}
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
A\x{391}\x{10427}\x{ff3a}\x{1fb0}
a\x{391}\x{10427}\x{ff3a}\x{1fb0}
A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
A\x{391}\x{10427}\x{ff5a}\x{1fb0}
A\x{391}\x{10427}\x{ff3a}\x{1fb8}
/\x{391}+/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
/\x{391}{3,5}(.)/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
/\x{391}{3,5}?(.)/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
/[\x{391}\x{ff3a}]/i,utf
\x{391}
\x{ff3a}
\x{3b1}
\x{ff5a}
/^(\X*)C/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
/^(\X*?)C/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
/^(\X*)(.)/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
/^(\X*?)(.)/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
/^\X(.)/utf
\= Expect no match
A\x{300}\x{301}\x{302}
/^\X{2,3}(.)/utf
A\x{300}\x{301}B\x{300}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}
A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
/^\X{2,3}?(.)/utf
A\x{300}\x{301}B\x{300}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}
A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
/^\X{3,}+/utf
A\x{300}B\x{301}U\x{303}\x{0301}
A\x{300}B\x{301}U\x{303}\x{0301}X
\= Expect no match
A\x{300}
A\x{300}B\x{301}
A\x{300}U\x{303}\x{0301}
/^\X/utf
A
A\x{300}BC
A\x{300}\x{301}\x{302}BC
\x{300}
/^\p{Han}+/utf
\x{2e81}\x{3007}\x{2f804}\x{31a0}
\= Expect no match
\x{2e7f}
/^[\p{Arabic}]/utf
\x{06e9}
\x{060b}
\= Expect no match
X\x{06e9}
/^\P{Katakana}+/utf
\x{3105}
\= Expect no match
\x{30ff}
/^[\P{Yi}]/utf
\x{2f800}
\= Expect no match
\x{a014}
\x{a4c6}
/^\p{Any}X/utf
AXYZ
\x{1234}XYZ
\= Expect no match
X
/^\P{Any}X/utf
\= Expect no match
AX
/^\p{Any}?X/utf
XYZ
AXYZ
\x{1234}XYZ
\= Expect no match
ABXYZ
/^\P{Any}?X/utf
XYZ
\= Expect no match
AXYZ
\x{1234}XYZ
ABXYZ
/^\p{Any}+X/utf
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
\= Expect no match
XYZ
/^\P{Any}+X/utf
\= Expect no match
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
XYZ
/^\p{Any}*X/utf
XYZ
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
/^\P{Any}*X/utf
XYZ
\= Expect no match
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
/^[\p{Any}]X/utf
AXYZ
\x{1234}XYZ
\= Expect no match
X
/^[\P{Any}]X/utf
\= Expect no match
AX
/^[\p{Any}]?X/utf
XYZ
AXYZ
\x{1234}XYZ
\= Expect no match
ABXYZ
/^[\P{Any}]?X/utf
XYZ
\= Expect no match
AXYZ
\x{1234}XYZ
ABXYZ
/^[\p{Any}]+X/utf
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
\= Expect no match
XYZ
/^[\P{Any}]+X/utf
\= Expect no match
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
XYZ
/^[\p{Any}]*X/utf
XYZ
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
/^[\P{Any}]*X/utf
XYZ
\= Expect no match
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
/^\p{Any}{3,5}?/utf
abcdefgh
\x{1234}\n\r\x{3456}xyz
/^\p{Any}{3,5}/utf
abcdefgh
\x{1234}\n\r\x{3456}xyz
/^\P{Any}{3,5}?/utf
\= Expect no match
abcdefgh
\x{1234}\n\r\x{3456}xyz
/^\p{L&}X/utf
AXY
aXY
\x{1c5}XY
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^[\p{L&}]X/utf
AXY
aXY
\x{1c5}XY
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^\p{L&}+X/utf
AXY
aXY
AbcdeXyz
\x{1c5}AbXY
abcDEXypqreXlmn
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^[\p{L&}]+X/utf
AXY
aXY
AbcdeXyz
\x{1c5}AbXY
abcDEXypqreXlmn
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^\p{L&}+?X/utf
AXY
aXY
AbcdeXyz
\x{1c5}AbXY
abcDEXypqreXlmn
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^[\p{L&}]+?X/utf
AXY
aXY
AbcdeXyz
\x{1c5}AbXY
abcDEXypqreXlmn
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^\P{L&}X/utf
!XY
\x{1bb}XY
\x{2b0}XY
\= Expect no match
\x{1c5}XY
AXY
/^[\P{L&}]X/utf
!XY
\x{1bb}XY
\x{2b0}XY
\= Expect no match
\x{1c5}XY
AXY
/^(\p{Z}[^\p{C}\p{Z}]+)*$/
\xa0!
/^[\pL](abc)(?1)/
AabcabcYZ
/([\pL]=(abc))*X/
L=abcX
/^\p{Balinese}\p{Cuneiform}\p{Nko}\p{Phags_Pa}\p{Phoenician}/utf
\x{1b00}\x{12000}\x{7c0}\x{a840}\x{10900}
# Check property support in non-UTF mode
/\p{L}{4}/
123abcdefg
123abc\xc4\xc5zz
/\X{1,3}\d/
\= Expect no match
\x8aBCD
/\X?\d/
\= Expect no match
\x8aBCD
/\P{L}?\d/
\= Expect no match
\x8aBCD
/[\PPP\x8a]{1,}\x80/
A\x80
/^[\p{Arabic}]/utf
\x{604}
\x{60e}
\x{656}
\x{657}
\x{658}
\x{659}
\x{65a}
\x{65b}
\x{65c}
\x{65d}
\x{65e}
\x{65f}
\x{66a}
\x{6e9}
\x{6ef}
\x{6fa}
/^\p{Cyrillic}/utf
\x{1d2b}
/^\p{Common}/utf
\x{2116}
\x{1D183}
/^\p{Inherited}/utf
\x{200c}
\= Expect no match
\x{64a}
\x{656}
/^\p{Shavian}/utf
\x{10450}
\x{1047f}
/^\p{Deseret}/utf
\x{10400}
\x{1044f}
/^\p{Osmanya}/utf
\x{10480}
\x{1049d}
\x{104a0}
\x{104a9}
\= Expect no match
\x{1049e}
\x{1049f}
\x{104aa}
/\p{katakana}/utf
\x{30a1}
\x{3001}
/\p{scx:katakana}/utf
\x{30a1}
\x{3001}
/\p{script extensions:katakana}/utf
\x{30a1}
\x{3001}
/\p{sc:katakana}/utf
\x{30a1}
\= Expect no match
\x{3001}
/\p{script:katakana}/utf
\x{30a1}
\= Expect no match
\x{3001}
/\p{sc:katakana}{3,}/utf
\x{30a1}\x{30fa}\x{32d0}\x{1b122}\x{ff66}\x{3001}ABC
/\p{sc:katakana}{3,}?/utf
\x{30a1}\x{30fa}\x{32d0}\x{1b122}\x{ff66}\x{3001}ABC
/\p{Carian}\p{Cham}\p{Kayah_Li}\p{Lepcha}\p{Lycian}\p{Lydian}\p{Ol_Chiki}\p{Rejang}\p{Saurashtra}\p{Sundanese}\p{Vai}/utf
\x{102A4}\x{AA52}\x{A91D}\x{1C46}\x{10283}\x{1092E}\x{1C6B}\x{A93B}\x{A8BF}\x{1BA0}\x{A50A}====
/\x{a77d}\x{1d79}/i,utf
\x{a77d}\x{1d79}
\x{1d79}\x{a77d}
/\x{a77d}\x{1d79}/utf
\x{a77d}\x{1d79}
\= Expect no match
\x{1d79}\x{a77d}
/(A)\1/i,utf
AA
Aa
aa
aA
/(\x{10a})\1/i,utf
\x{10a}\x{10a}
\x{10a}\x{10b}
\x{10b}\x{10b}
\x{10b}\x{10a}
# The next two tests are for property support in non-UTF mode
/(?:\p{Lu}|\x20)+/
\x41\x20\x50\xC2\x54\xC9\x20\x54\x4F\x44\x41\x59
/[\p{Lu}\x20]+/
\x41\x20\x50\xC2\x54\xC9\x20\x54\x4F\x44\x41\x59
/\p{Avestan}\p{Bamum}\p{Egyptian_Hieroglyphs}\p{Imperial_Aramaic}\p{Inscriptional_Pahlavi}\p{Inscriptional_Parthian}\p{Javanese}\p{Kaithi}\p{Lisu}\p{Meetei_Mayek}\p{Old_South_Arabian}\p{Old_Turkic}\p{Samaritan}\p{Tai_Tham}\p{Tai_Viet}/utf
\x{10b00}\x{a6ef}\x{13007}\x{10857}\x{10b78}\x{10b58}\x{a980}\x{110c1}\x{a4ff}\x{abc0}\x{10a7d}\x{10c48}\x{0800}\x{1aad}\x{aac0}
/^\w+/utf,ucp
Az_\x{aa}\x{c0}\x{1c5}\x{2b0}\x{3b6}\x{1d7c9}\x{2fa1d}1\x{660}\x{bef}\x{16ee}
/^[[:xdigit:]]*/utf,ucp
1a\x{660}\x{bef}\x{16ee}
/^\d+/utf,ucp
1\x{660}\x{bef}\x{16ee}
/^[[:digit:]]+/utf,ucp
1\x{660}\x{bef}\x{16ee}
/^>\s+/utf,ucp
>\x{20}\x{a0}\x{1680}\x{2028}\x{2029}\x{202f}\x{9}\x{b}
/^>\pZ+/utf,ucp
>\x{20}\x{a0}\x{1680}\x{2028}\x{2029}\x{202f}\x{9}\x{b}
/^>[[:space:]]*/utf,ucp
>\x{20}\x{a0}\x{1680}\x{2028}\x{2029}\x{202f}\x{9}\x{b}
/^>[[:blank:]]*/utf,ucp
>\x{20}\x{a0}\x{1680}\x{2000}\x{202f}\x{9}\x{b}\x{2028}
/^[[:alpha:]]*/utf,ucp
Az\x{aa}\x{c0}\x{1c5}\x{2b0}\x{3b6}\x{1d7c9}\x{2fa1d}
/^[[:alnum:]]*/utf,ucp
Az\x{aa}\x{c0}\x{1c5}\x{2b0}\x{3b6}\x{1d7c9}\x{2fa1d}1\x{660}\x{bef}\x{16ee}
/^[[:cntrl:]]*/utf,ucp
\x{0}\x{09}\x{1f}\x{7f}\x{9f}
/^[[:graph:]]*/utf,ucp
A\x{a1}\x{a0}
/^[[:print:]]*/utf,ucp
A z\x{a0}\x{a1}
/^[[:punct:]]*/utf,ucp
.+\x{a1}\x{a0}
/\p{Zs}*?\R/
\= Expect no match
a\xFCb
/\p{Zs}*\R/
\= Expect no match
a\xFCb
/ⱥ/i,utf
Ⱥx
Ⱥ
/[ⱥ]/i,utf
Ⱥx
Ⱥ
/Ⱥ/i,utf
Ⱥ
# These are tests for extended grapheme clusters
/^\X/utf,aftertext
G\x{34e}\x{34e}X
\x{34e}\x{34e}X
\x04X
\x{1100}X
\x{1100}\x{34e}X
\x{1b04}\x{1b04}X
*These match up to the roman letters
\x{1111}\x{1111}L,L
\x{1111}\x{1111}\x{1169}L,L,V
\x{1111}\x{ae4c}L, LV
\x{1111}\x{ad89}L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
*These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
\x{ae4c}\x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
\x{1169}\x{1111}V, L
\x{1169}\x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
\x{11fe}\x{1169}T, V
\x{11fe}\x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
*Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
*Test CR, LF, and control
\x0d\x{0711}CR, extend
\x0d\x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
\x0a\x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
\x09\x{1b04}Control, spacingmark
*Test Extended Pictographic after bug fix
\x{261d}\x{261d}B Extended_Pictographic Extended_Pictographic
\x{261D}\x{1F3FB}\x{261d}B Extended_Pictographic Extend E-P
\x{261D}\x{1F3FB}\x{200d}\x{261d}B Extended_Pictographic Extend ZWJ E-P
\x{1f3f3}\x{fe0f}\x{200d}\x{1f308}\x{1f3f4}\x{200d}\x{2620}\x{fe0f}\x{1f3f3}\x{fe0f}\x{200d}\x{1f308}\x{1f3f4}\x{200d}\x{2620}\x{fe0f}
A\x{200d}\x{1f308}B
A\x{200d}B A ZWJ
\x{261D}\x{1F3FB}B Extended_Pictographic Extend
\x{1F1E6}\x{1F1E7}B RegionalIndicator RegionalIndicator
*There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}?X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/\X*Z/utf,no_start_optimize
\= Expect no match
A\x{300}
/\X*(.)/utf,no_start_optimize
A\x{1111}\x{ae4c}\x{1169}
# --------------------------------------------
/\x{1e9e}+/i,utf
\x{1e9e}\x{00df}
/[z\x{1e9e}]+/i,utf
\x{1e9e}\x{00df}
/\x{00df}+/i,utf
\x{1e9e}\x{00df}
/[z\x{00df}]+/i,utf
\x{1e9e}\x{00df}
/\x{1f88}+/i,utf
\x{1f88}\x{1f80}
/[z\x{1f88}]+/i,utf
\x{1f88}\x{1f80}
# Check a reference with more than one other case
/^(\x{00b5})\1{2}$/i,utf
\x{00b5}\x{039c}\x{03bc}
# Characters with more than one other case; test in classes
/[z\x{00b5}]+/i,utf
\x{00b5}\x{039c}\x{03bc}
/[z\x{039c}]+/i,utf
\x{00b5}\x{039c}\x{03bc}
/[z\x{03bc}]+/i,utf
\x{00b5}\x{039c}\x{03bc}
/[z\x{00c5}]+/i,utf
\x{00c5}\x{00e5}\x{212b}
/[z\x{00e5}]+/i,utf
\x{00c5}\x{00e5}\x{212b}
/[z\x{212b}]+/i,utf
\x{00c5}\x{00e5}\x{212b}
/[z\x{01c4}]+/i,utf
\x{01c4}\x{01c5}\x{01c6}
/[z\x{01c5}]+/i,utf
\x{01c4}\x{01c5}\x{01c6}
/[z\x{01c6}]+/i,utf
\x{01c4}\x{01c5}\x{01c6}
/[z\x{01c7}]+/i,utf
\x{01c7}\x{01c8}\x{01c9}
/[z\x{01c8}]+/i,utf
\x{01c7}\x{01c8}\x{01c9}
/[z\x{01c9}]+/i,utf
\x{01c7}\x{01c8}\x{01c9}
/[z\x{01ca}]+/i,utf
\x{01ca}\x{01cb}\x{01cc}
/[z\x{01cb}]+/i,utf
\x{01ca}\x{01cb}\x{01cc}
/[z\x{01cc}]+/i,utf
\x{01ca}\x{01cb}\x{01cc}
/[z\x{01f1}]+/i,utf
\x{01f1}\x{01f2}\x{01f3}
/[z\x{01f2}]+/i,utf
\x{01f1}\x{01f2}\x{01f3}
/[z\x{01f3}]+/i,utf
\x{01f1}\x{01f2}\x{01f3}
/[z\x{0345}]+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{0399}]+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{03b9}]+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{1fbe}]+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{0392}]+/i,utf
\x{0392}\x{03b2}\x{03d0}
/[z\x{03b2}]+/i,utf
\x{0392}\x{03b2}\x{03d0}
/[z\x{03d0}]+/i,utf
\x{0392}\x{03b2}\x{03d0}
/[z\x{0395}]+/i,utf
\x{0395}\x{03b5}\x{03f5}
/[z\x{03b5}]+/i,utf
\x{0395}\x{03b5}\x{03f5}
/[z\x{03f5}]+/i,utf
\x{0395}\x{03b5}\x{03f5}
/[z\x{0398}]+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{03b8}]+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{03d1}]+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{03f4}]+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{039a}]+/i,utf
\x{039a}\x{03ba}\x{03f0}
/[z\x{03ba}]+/i,utf
\x{039a}\x{03ba}\x{03f0}
/[z\x{03f0}]+/i,utf
\x{039a}\x{03ba}\x{03f0}
/[z\x{03a0}]+/i,utf
\x{03a0}\x{03c0}\x{03d6}
/[z\x{03c0}]+/i,utf
\x{03a0}\x{03c0}\x{03d6}
/[z\x{03d6}]+/i,utf
\x{03a0}\x{03c0}\x{03d6}
/[z\x{03a1}]+/i,utf
\x{03a1}\x{03c1}\x{03f1}
/[z\x{03c1}]+/i,utf
\x{03a1}\x{03c1}\x{03f1}
/[z\x{03f1}]+/i,utf
\x{03a1}\x{03c1}\x{03f1}
/[z\x{03a3}]+/i,utf
\x{03A3}\x{03C2}\x{03C3}
/[z\x{03c2}]+/i,utf
\x{03A3}\x{03C2}\x{03C3}
/[z\x{03c3}]+/i,utf
\x{03A3}\x{03C2}\x{03C3}
/[z\x{03a6}]+/i,utf
\x{03a6}\x{03c6}\x{03d5}
/[z\x{03c6}]+/i,utf
\x{03a6}\x{03c6}\x{03d5}
/[z\x{03d5}]+/i,utf
\x{03a6}\x{03c6}\x{03d5}
/[z\x{03c9}]+/i,utf
\x{03c9}\x{03a9}\x{2126}
/[z\x{03a9}]+/i,utf
\x{03c9}\x{03a9}\x{2126}
/[z\x{2126}]+/i,utf
\x{03c9}\x{03a9}\x{2126}
/[z\x{1e60}]+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e61}]+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e9b}]+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
# Perl 5.12.4 gets these wrong, but 5.15.3 is OK
/[z\x{004b}]+/i,utf
\x{004b}\x{006b}\x{212a}
/[z\x{006b}]+/i,utf
\x{004b}\x{006b}\x{212a}
/[z\x{212a}]+/i,utf
\x{004b}\x{006b}\x{212a}
/[z\x{0053}]+/i,utf
\x{0053}\x{0073}\x{017f}
/[z\x{0073}]+/i,utf
\x{0053}\x{0073}\x{017f}
/[z\x{017f}]+/i,utf
\x{0053}\x{0073}\x{017f}
/^[a-z\x{500}-\x{1000}]{3,}[a-h]|x/utf
ab\x{600}ijklmh
ab\x{600}hijklm
\= Expect no match
ab\x{600}ijklm
/^[a-z\x{500}-\x{1000}]{4,7}[a-h]|x/utf
ab\x{600}\x{700}ijkh
ab\x{600}\x{700}hijkl
\= Expect no match
ab\x{600}\x{700}ijklh
ab\x{600}h\x{700}ijklmh
/([a-z\x{1000}\x{2000}]{1,2}?u)+$/utf
\x{1000}uu\x{2000}u
\x{1001}uuuu
\x{2001}uuuuu
uuuu\x{1fff}#u#\x{2000}\x{1000}u\x{2000}u
\= Expect no match
abuabuabuabu!
uuuuuuuuuuuu#
# --------------------------------------
/(ΣΆΜΟΣ) \1/i,utf
ΣΆΜΟΣ ΣΆΜΟΣ
ΣΆΜΟΣ σάμος
σάμος σάμος
σάμος σάμοσ
σάμος ΣΆΜΟΣ
/(σάμος) \1/i,utf
ΣΆΜΟΣ ΣΆΜΟΣ
ΣΆΜΟΣ σάμος
σάμος σάμος
σάμος σάμοσ
σάμος ΣΆΜΟΣ
/(ΣΆΜΟΣ) \1*/i,utf
ΣΆΜΟΣ\x20
ΣΆΜΟΣ ΣΆΜΟΣσάμοςσάμος
# Perl matches these
/\x{00b5}+/i,utf
\x{00b5}\x{039c}\x{03bc}
/\x{039c}+/i,utf
\x{00b5}\x{039c}\x{03bc}
/\x{03bc}+/i,utf
\x{00b5}\x{039c}\x{03bc}
/\x{00c5}+/i,utf
\x{00c5}\x{00e5}\x{212b}
/\x{00e5}+/i,utf
\x{00c5}\x{00e5}\x{212b}
/\x{212b}+/i,utf
\x{00c5}\x{00e5}\x{212b}
/\x{01c4}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
/\x{01c5}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
/\x{01c6}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
/\x{01c7}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
/\x{01c8}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
/\x{01c9}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
/\x{01ca}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
/\x{01cb}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
/\x{01cc}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
/\x{01f1}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
/\x{01f2}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
/\x{01f3}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
/\x{0345}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0399}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{03b9}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{1fbe}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0392}+/i,utf
\x{0392}\x{03b2}\x{03d0}
/\x{03b2}+/i,utf
\x{0392}\x{03b2}\x{03d0}
/\x{03d0}+/i,utf
\x{0392}\x{03b2}\x{03d0}
/\x{0395}+/i,utf
\x{0395}\x{03b5}\x{03f5}
/\x{03b5}+/i,utf
\x{0395}\x{03b5}\x{03f5}
/\x{03f5}+/i,utf
\x{0395}\x{03b5}\x{03f5}
/\x{0398}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03b8}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03d1}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03f4}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{039a}+/i,utf
\x{039a}\x{03ba}\x{03f0}
/\x{03ba}+/i,utf
\x{039a}\x{03ba}\x{03f0}
/\x{03f0}+/i,utf
\x{039a}\x{03ba}\x{03f0}
/\x{03a0}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
/\x{03c0}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
/\x{03d6}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
/\x{03a1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
/\x{03c1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
/\x{03f1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
/\x{03a3}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
/\x{03c2}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
/\x{03c3}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
/\x{03a6}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
/\x{03c6}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
/\x{03d5}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
/\x{03c9}+/i,utf
\x{03c9}\x{03a9}\x{2126}
/\x{03a9}+/i,utf
\x{03c9}\x{03a9}\x{2126}
/\x{2126}+/i,utf
\x{03c9}\x{03a9}\x{2126}
/\x{1e60}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e61}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9b}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9e}+/i,utf
\x{1e9e}\x{00df}
/\x{00df}+/i,utf
\x{1e9e}\x{00df}
/\x{1f88}+/i,utf
\x{1f88}\x{1f80}
/\x{1f80}+/i,utf
\x{1f88}\x{1f80}
# Perl 5.12.4 gets these wrong, but 5.15.3 is OK
/\x{004b}+/i,utf
\x{004b}\x{006b}\x{212a}
/\x{006b}+/i,utf
\x{004b}\x{006b}\x{212a}
/\x{212a}+/i,utf
\x{004b}\x{006b}\x{212a}
/\x{0053}+/i,utf
\x{0053}\x{0073}\x{017f}
/\x{0073}+/i,utf
\x{0053}\x{0073}\x{017f}
/\x{017f}+/i,utf
\x{0053}\x{0073}\x{017f}
/^\p{Any}*\d{4}/utf
1234
\= Expect no match
123
/^\X*\w{4}/utf
1234
\= Expect no match
123
/^A\s+Z/utf,ucp
A\x{2005}Z
A\x{85}\x{2005}Z
/^A[\s]+Z/utf,ucp
A\x{2005}Z
A\x{85}\x{2005}Z
/^[[:graph:]]+$/utf,ucp
Letter:ABC
Mark:\x{300}\x{1d172}\x{1d17b}
Number:9\x{660}
Punctuation:\x{66a},;
Symbol:\x{6de}<>\x{fffc}
Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
\x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
\x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
\x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
\x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
\x{feff}
\x{fff9}\x{fffa}\x{fffb}
\x{110bd}
\x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
\x{e0001}
\x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
\= Expect no match
\x{09}
\x{0a}
\x{1D}
\x{20}
\x{85}
\x{a0}
\x{1680}
\x{2028}
\x{2029}
\x{202f}
\x{2065}
\x{3000}
\x{e0002}
\x{e001f}
\x{e0080}
/^[[:print:]]+$/utf,ucp
Space: \x{a0}
\x{1680}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}
\x{2006}\x{2007}\x{2008}\x{2009}\x{200a}
\x{202f}\x{205f}
\x{3000}
Letter:ABC
Mark:\x{300}\x{1d172}\x{1d17b}
Number:9\x{660}
Punctuation:\x{66a},;
Symbol:\x{6de}<>\x{fffc}
Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
\x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
\x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
\x{202f}
\x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
\x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
\x{feff}
\x{fff9}\x{fffa}\x{fffb}
\x{110bd}
\x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
\x{e0001}
\x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
\= Expect no match
\x{09}
\x{1D}
\x{85}
\x{2028}
\x{2029}
\x{2065}
\x{e0002}
\x{e001f}
\x{e0080}
/^[[:punct:]]+$/utf,ucp
\$+<=>^`|~
!\"#%&'()*,-./:;?@[\\]_{}
\x{a1}\x{a7}
\x{37e}
\= Expect no match
abcde
/^[[:^graph:]]+$/utf,ucp
\x{09}\x{0a}\x{1D}\x{20}\x{85}\x{a0}\x{1680}
\x{2028}\x{2029}\x{202f}\x{2065}
\x{3000}\x{e0002}\x{e001f}\x{e0080}
\= Expect no match
Letter:ABC
Mark:\x{300}\x{1d172}\x{1d17b}
Number:9\x{660}
Punctuation:\x{66a},;
Symbol:\x{6de}<>\x{fffc}
Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
\x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
\x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
\x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
\x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
\x{feff}
\x{fff9}\x{fffa}\x{fffb}
\x{110bd}
\x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
\x{e0001}
\x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
/^[[:^print:]]+$/utf,ucp
\x{09}\x{1D}\x{85}\x{2028}\x{2029}\x{2065}
\x{e0002}\x{e001f}\x{e0080}
\= Expect no match
Space: \x{a0}
\x{1680}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}
\x{2006}\x{2007}\x{2008}\x{2009}\x{200a}
\x{202f}\x{205f}
\x{3000}
Letter:ABC
Mark:\x{300}\x{1d172}\x{1d17b}
Number:9\x{660}
Punctuation:\x{66a},;
Symbol:\x{6de}<>\x{fffc}
Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
\x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
\x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
\x{202f}
\x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
\x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
\x{feff}
\x{fff9}\x{fffa}\x{fffb}
\x{110bd}
\x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
\x{e0001}
\x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
/^[[:^punct:]]+$/utf,ucp
abcde
\= Expect no match
\$+<=>^`|~
!\"#%&'()*,-./:;?@[\\]_{}
\x{a1}\x{a7}
\x{37e}
/[RST]+/i,utf,ucp
Ss\x{17f}
/[R-T]+/i,utf,ucp
Ss\x{17f}
/[q-u]+/i,utf,ucp
Ss\x{17f}
/^s?c/im,utf
scat
# The next four tests are for repeated caseless back references when the
# code unit length of the matched text is different to that of the original
# group in the UTF-8 case.
/^(\x{23a})\1*(.)/i,utf
\x{23a}\x{23a}\x{23a}\x{23a}
\x{23a}\x{2c65}\x{2c65}\x{2c65}
\x{23a}\x{23a}\x{2c65}\x{23a}
/^(\x{23a})\1*(..)/i,utf
\x{23a}\x{2c65}\x{2c65}\x{2c65}
\x{23a}\x{23a}\x{2c65}\x{23a}
/^(\x{23a})\1*(...)/i,utf
\x{23a}\x{2c65}\x{2c65}\x{2c65}
\x{23a}\x{23a}\x{2c65}\x{23a}
/^(\x{23a})\1*(....)/i,utf
\= Expect no match
\x{23a}\x{2c65}\x{2c65}\x{2c65}
\x{23a}\x{23a}\x{2c65}\x{23a}
/[A-`]/i,utf
abcdefghijklmno
/[\S\V\H]/utf
/[^\p{Any}]*+x/utf
x
/[[:punct:]]/utf,ucp
\x{b4}
/[[:^ascii:]]/utf,ucp
\x{100}
\x{200}
\x{300}
\x{37e}
\= Expect no match
aa
99
/[[:^ascii:]\w]/utf,ucp
aa
99
gg
\x{100}
\x{200}
\x{300}
\x{37e}
/[\w[:^ascii:]]/utf,ucp
aa
99
gg
\x{100}
\x{200}
\x{300}
\x{37e}
/[^[:ascii:]\W]/utf,ucp
\x{100}
\x{200}
\= Expect no match
aa
99
gg
\x{37e}
/[^[:^ascii:]\d]/utf,ucp
a
~
\a
\x{7f}
\= Expect no match
0
\x{389}
\x{20ac}
/(?=.*b)\pL/
11bb
/(?(?=.*b)(?=.*b)\pL|.*c)/
11bb
/^\x{123}+?$/utf,no_auto_possess
\x{123}\x{123}\x{123}
/^\x{123}+?$/i,utf,no_auto_possess
\x{123}\x{122}\x{123}
\= Expect no match
\x{123}\x{124}\x{123}
/\N{U+1234}/utf
\x{1234}
/[\N{U+1234}]/utf
\x{1234}
/(\x{1234}) \1/utf
\N{U+1234} \o{11064}
# Test the full list of Unicode "Pattern White Space" characters that are to
# be ignored by /x. The pattern lines below may show up oddly in text editors
# or when listed to the screen. Note that characters such as U+2002, which are
# matched as space by \h and \v are *not* "Pattern White Space".
/A…B/x,utf
AB
/AB/x,utf
A\x{2002}B
\= Expect no match
AB
# -------
/[^\x{100}-\x{ffff}]*[\x80-\xff]/utf
\x{99}\x{99}\x{99}
/[^\x{100}-\x{ffff}ABC]*[\x80-\xff]/utf
\x{99}\x{99}\x{99}
/[^\x{100}-\x{ffff}]*[\x80-\xff]/i,utf
\x{99}\x{99}\x{99}
# Script run tests
/^(*script_run:.{4})/utf
abcd Latin x4
\x{2e80}\x{2fa1d}\x{3041}\x{30a1} Han Han Hiragana Katakana
\x{3041}\x{30a1}\x{3007}\x{3007} Hiragana Katakana Han Han
\x{30a1}\x{3041}\x{3007}\x{3007} Katakana Hiragana Han Han
\x{1100}\x{2e80}\x{2e80}\x{1101} Hangul Han Han Hangul
\x{2e80}\x{3105}\x{2e80}\x{3105} Han Bopomofo Han Bopomofo
\x{02ea}\x{2e80}\x{2e80}\x{3105} Bopomofo-Sk Han Han Bopomofo
\x{3105}\x{2e80}\x{2e80}\x{3105} Bopomofo Han Han Bopomofo
\x{0300}cd! Inherited Latin Latin Common
\x{0391}12\x{03a9} Greek Common-digits Greek
\x{0400}12\x{fe2f} Cyrillic Common-digits Cyrillic
\x{0531}12\x{fb17} Armenian Common-digits Armenian
\x{0591}12\x{fb4f} Hebrew Common-digits Hebrew
\x{0600}12\x{1eef1} Arabic Common-digits Arabic
\x{0600}\x{0660}\x{0669}\x{1eef1} Arabic Arabic-digits Arabic
\x{0700}12\x{086a} Syriac Common-digits Syriac
\x{1200}12\x{ab2e} Ethiopic Common-digits Ethiopic
\x{1680}12\x{169c} Ogham Common-digits Ogham
\x{3041}12\x{3041} Hiragana Common-digits Hiragana
\x{0980}\x{09e6}\x{09e7}\x{0993} Bengali Bengali-digits Bengali
!cde Common Latin Latin Latin
A..B Latin Common Common Latin
0abc Ascii-digit Latin Latin Latin
1\x{0700}\x{0700}\x{0700} Ascii-digit Syriac x 3
\x{1A80}\x{1A80}\x{1a40}\x{1a41} Tai Tham Hora digits, letters
\= Expect no match
a\x{370}bcd Latin Greek Latin Latin
\x{1100}\x{02ea}\x{02ea}\x{02ea} Hangul Bopomofo x3
\x{02ea}\x{02ea}\x{02ea}\x{1100} Bopomofo x3 Hangul
\x{1100}\x{2e80}\x{3041}\x{1101} Hangul Han Hiragana Hangul
\x{0391}\x{09e6}\x{09e7}\x{03a9} Greek Bengali digits Greek
\x{0600}7\x{0669}\x{1eef1} Arabic ascii-digit Arabic-digit Arabic
\x{0600}\x{0669}7\x{1eef1} Arabic Arabic-digit ascii-digit Arabic
A5\x{ff19}B Latin Common-ascii/notascii-digits Latin
\x{0300}cd\x{0391} Inherited Latin Latin Greek
!cd\x{0391} Common Latin Latin Greek
\x{1A80}\x{1A90}\x{1a40}\x{1a41} Tai Tham Hora digit, Tham digit, letters
A\x{1d7ce}\x{1d7ff}B Common fancy-common-2-sets-digits Common
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
/^(*sr:.{4}|..)/utf
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
/^(*atomic_script_run:.{4}|..)/utf
\= Expect no match
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
/^(*asr:.*)/utf
\= Expect no match
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
/^(?>(*sr:.*))/utf
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
/^(*sr:.*)/utf
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
\x{10fffd}\x{10fffd}\x{10fffd} Private use (Unknown)
/^(*sr:\x{2e80}*)/utf
\x{2e80}\x{2e80}\x{3105} Han Han Bopomofo
/^(*sr:\x{2e80}*)\x{2e80}/utf
\x{2e80}\x{2e80}\x{3105} Han Han Bopomofo
/^(*sr:.*)Test/utf
Test script run on an empty string
/^(*sr:(.{2})){2}/utf
\x{0600}7\x{0669}\x{1eef1} Arabic ascii-digit Arabic-digit Arabic
\x{1A80}\x{1A80}\x{1a40}\x{1a41} Tai Tham Hora digits, letters
\x{1A80}\x{1a40}\x{1A90}\x{1a41} Tai Tham Hora digit, letter, Tham digit, letter
\= Expect no match
\x{1100}\x{2e80}\x{3041}\x{1101} Hangul Han Hiragana Hangul
/^(*sr:\S*)/utf
\x{1cf4}\x{20f0}\x{900}\x{11305} [Dev,Gran,Kan] [Dev,Gran,Lat] Dev Gran
\x{1cf4}\x{20f0}\x{11305}\x{900} [Dev,Gran,Kan] [Dev,Gran,Lat] Gran Dev
\x{1cf4}\x{20f0}\x{900}ABC [Dev,Gran,Kan] [Dev,Gran,Lat] Dev Lat
\x{1cf4}\x{20f0}ABC [Dev,Gran,Kan] [Dev,Gran,Lat] Lat
\x{20f0}ABC [Dev,Gran,Lat] Lat
XYZ\x{20f0}ABC Lat [Dev,Gran,Lat] Lat
\x{a36}\x{a33}\x{900} [Dev,...] [Dev,...] Dev
\x{3001}\x{2e80}\x{3041}\x{30a1} [Bopo, Han, etc] Han Hira Kata
\x{3001}\x{30a1}\x{2e80}\x{3041} [Bopo, Han, etc] Kata Han Hira
\x{3001}\x{3105}\x{2e80}\x{1101} [Bopo, Han, etc] Bopomofo Han Hangul
\x{3105}\x{3001}\x{2e80}\x{1101} Bopomofo [Bopo, Han, etc] Han Hangul
\x{3031}\x{3041}\x{30a1}\x{2e80} [Hira Kata] Hira Kata Han
\x{060c}\x{06d4}\x{0600}\x{10d00}\x{0700} [Arab Rohg Syrc Thaa] [Arab Rohg] Arab Rohg Syrc
\x{060c}\x{06d4}\x{0700}\x{0600}\x{10d00} [Arab Rohg Syrc Thaa] [Arab Rohg] Syrc Arab Rohg
\x{2e80}\x{3041}\x{3001}\x{3031}\x{2e80} Han Hira [Bopo, Han, etc] [Hira Kata] Han
/(?<!)(*sr:)/
/(?<!X(*sr:B)C)/
/(?<=abc(?=X(*sr:BCY)Z)XBCYZ)./
abcXBCYZ!
/(?<=abc(?=X(*sr:BXY)CCC)XBXYCCC)./
abcXBXYCCC!
/^(*sr:\S*)/utf
\x{10d00}\x{10d00}\x{06d4} Rohingya Rohingya Arabic-full-stop
\x{06d4}\x{10d00}\x{10d00} Arabic-full-stop Rohingya Rohingya
\x{10d00}\x{10d00}\x{0363} Rohingya Rohingya Inherited-extend-Latin
\x{0363}\x{10d00}\x{10d00} Inherited-extend-Latin Rohingya Rohingya
AB\x{0363} Latin Latin Inherited-extend-Latin
\x{0363}AB Inherited-extend-Latin Latin Latin
AB\x{1cf7} Latin Latin Common-extended-Beng
\x{1cf7}AB Common-extend-Beng Latin Latin
\x{1cf7}\x{0993} Common-extend-Beng Bengali
A\x{1abe}BC Test enclosing mark
\x{0370}\x{1abe}\x{0371} Which can occur with any script (Greek here)
\x{3001}\x{adf9}\x{3001} [.. Hangul ..] Hangul [.. Hangul ..]
\x{3400}\x{3001}XXX Han [Han etc.]
\x{3400}\x{1cd5} Han [Bengali Devanagari]
\x{ac01}\x{3400} Hangul [.. Hangul ..]
\x{ac01}\x{1cd5} Hangul [Bengali Devanagari]
\x{102e0}\x{06d4}\x{1ee4d} [Arabic Coptic] [Arab Rohingya] Arabic
\x{102e0}\x{06d4}\x{2cc9} [Arabic Coptic] [Arab Rohingya] Coptic
\x{102e0}\x{06d4}\x{10d30} [Arabic Coptic] [Arab Rohingya] Rohingya
# Test loop breaking for empty string match
/^(*sr:A|)*BCD/utf
AABCD
ABCD
BCD
# The use of (*ACCEPT) breaks script run checking
/^(*sr:.*(*ACCEPT)ZZ)/utf
\x{1100}\x{2e80}\x{3041}\x{1101} Hangul Han Hiragana Hangul
# -------
# Test group names containing non-ASCII letters and digits
/(?'ABáC'...)\g{ABáC}/utf
abcabcdefg
/(?'XʰABC'...)/utf
xyzpq
/(?'XאABC'...)/utf
12345
/(?'XᾈABC'...)/utf
%^&*(...
/(?'𐨐ABC'...)/utf
abcde
/^(?'אABC'...)(?&אABC)(?P=אABC)/utf
123123123456
/^(?'אABC'...)(?&אABC)/utf
123123123456
/\X*/
\xF3aaa\xE4\xEA\xEB\xFEa
/Я/i,utf
\x{42f}
\x{44f}
/(?=Я)/i,utf
\x{42f}
\x{44f}
# -----------------------------------------------------------------------------
# Tests for bidi control and bidi class properties.
/\p{ bidi_control }/utf
-->\x{202c}<--
/\p{bidicontrol}+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/\p{bidic}+?/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/\p{bidi_control}++/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/[\p{bidi_c}]/utf
-->\x{202c}<--
/[\p{bidicontrol}]+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/[\p{bidicontrol}]+?/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/[\p{bidicontrol}]++/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/[\p{bidicontrol}<>]+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/\P{bidicontrol}+/g,utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/\p{^bidicontrol}+/g,utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/\p{bidi class = al}/utf
-->\x{061D}<--
/\p{bc = al}+/utf
-->\x{061D}\x{061e}\x{061f}<--
/\p{bidi_class : AL}+?/utf
-->\x{061D}\x{061e}\x{061f}<--
/\p{Bidi_Class : AL}++/utf
-->\x{061D}\x{061e}\x{061f}<--
/\p{b_c = aN}+/utf
-->\x{061D}\x{0602}\x{0604}\x{061f}<--
/\p{bidi class = B}+/utf
-->\x{0a}\x{0d}\x{01c}\x{01e}\x{085}\x{2029}<--
/\p{bidi class:BN}+/utf
-->\x{0}\x{08}\x{200c}\x{fffe}\x{dfffe}\x{10ffff}<--
/\p{bidiclass:cs}+/utf
-->,.\x{060c}\x{ff1a}<--
/\p{bidiclass:En}+/utf
-->09\x{b2}\x{2074}\x{1fbf9}<--
/\p{bidiclass:es}+/utf
==>+-\x{207a}\x{ff0d}<==
/\p{bidiclass:et}+/utf
-->#\{24}%\x{a2}\x{A838}\x{1e2ff}<--
/\p{bidiclass:FSI}+/utf
-->\x{2068}<--
/\p{bidi class:L}+/utf
-->ABC<--
/\P{bidi class:L}+/utf
-->ABC<--
/\p{bidi class:LRE}+\p{bidiclass=lri}*\p{bidiclass:lro}/utf
-->\x{202a}\x{2066}\x{202d}<--
/\p{bidi class:NSM}+/utf
-->\x{9bc}\x{a71}\x{e31}<--
/\p{bidi class:ON}+/utf
-->\x{21}'()*;@\x{384}\x{2039}<=-
/\p{bidiclass:pdf}\p{bidiclass:pdi}/utf
-->\x{202c}\x{2069}<--
/\p{bidi class:R}+/utf
-->\x{590}\x{5c6}\x{200f}\x{10805}<--
/\p{bidi class:RLE}+\p{bidi class:RLI}*\p{bidi class:RLO}+/utf
-->\x{202b}\x{2067}\x{202e}<--
/\p{bidi class:S}+\p{bidiclass:WS}+/utf
-->\x{9}\x{b}\x{1f} \x{c} \x{2000} \x{3000}<--
# -----------------------------------------------------------------------------
/[\p{taml}\p{sc:ugar}]+/utf
\x{0b82}\x{10380}
/^[\p{sc:Arabic}]/utf
\= Expect no match
\x{650}
\x{651}
\x{652}
\x{653}
\x{654}
\x{655}
# -----------------------------------------------------------------------------
# Tests for newly-added Boolean Properties
/\p{ahex}\p{asciihexdigit}/utf
>4F<
/\p{alpha}\p{alphabetic}/g,utf
>AB<>\x{148}\x{1234}
/\p{ascii}\p{ascii}/g,utf
>AB<>\x{148}\x{1234}
/\p{Bidi_C}\p{bidicontrol}/g,utf
>\x{202d}\x{2069}<
/\p{Bidi_M}\p{bidimirrored}/g,utf
>\x{202d}\x{2069}<>\x{298b}\x{bb}<
/\p{cased}\p{cased}/g,utf
>AN<>\x{149}\x{120}<
/\p{caseignorable}\p{ci}/g,utf
>AN<>\x{60}\x{859}<
/\p{changeswhencasefolded}\p{cwcf}/g,utf
>AN<>\x{149}\x{120}<
/\p{changeswhencasemapped}\p{cwcm}/g,utf
>AN<>\x{149}\x{120}<
/\p{changeswhenlowercased}\p{cwl}/g,utf
>AN<>\x{149}\x{120}<>yz<
/\p{changeswhenuppercased}\p{cwu}/g,utf
>AN<>\x{149}\x{120}<>yz<
/\p{changeswhentitlecased}\p{cwt}/g,utf
>AN<>\x{149}\x{120}<>yz<
/\p{dash}\p{dash}/g,utf
>\x{2d}\x{1400}<>yz<
/\p{defaultignorablecodepoint}\p{di}/g,utf
>AN<>\x{ad}\x{e0fff}<>yz<
/\p{deprecated}\p{dep}/g,utf
>AN<>\x{149}\x{e0001}<>yz<
/\p{diacritic}\p{dia}/g,utf
>AN<>\x{f84}\x{5e}<>yz<
/\p{emojicomponent}\p{ecomp}/g,utf
>AN<>\x{200d}\x{e007f}<>yz<
/\p{emojimodifier}\p{emod}/g,utf
>AN<>\x{1f3fb}\x{1f3ff}<>yz<
/\p{emojipresentation}\p{epres}/g,utf
>AN<>\x{2653}\x{1f6d2}<>yz<
/\p{extender}\p{ext}/g,utf
>AN<>\x{1e944}\x{b7}<>yz<
/\p{extendedpictographic}\p{extpict}/g,utf
>AN<>\x{26cf}\x{ae}<>yz<
/\p{graphemebase}\p{grbase}/g,utf
>AN<>\x{10f}\x{60}<>yz<
/\p{graphemeextend}\p{grext}/g,utf
>AN<>\x{300}\x{b44}<>yz<
/\p{hexdigit}\p{hex}/g,utf
>AF23<>\x{ff46}\x{ff10}<>yz<
/\p{idcontinue}\p{idc}/g,utf
>AF23<>\x{146}\x{7a}<>yz<
/\p{ideographic}\p{ideo}/g,utf
>AF23<>\x{30000}\x{3006}<>yz<
/\p{idstart}\p{ids}/g,utf
>AF23<>\x{146}\x{7a}<>yz<
/\p{idsbinaryoperator}\p{idsb}/g,utf
>AF23<>\x{2ff0}\x{2ffb}<>yz<\x{2ff2}\x{2ff1}
/\p{idstrinaryoperator}\p{idst}/g,utf
>AF23<>\x{2ff2}\x{2ff3}<>yz<
/\p{Join Control}\p{joinc}/g,utf
>AF23<>\x{200c}\x{200d}<>yz<
/\p{logical_order_exception}\p{loe}/g,utf
>AF23<>\x{e40}\x{aabc}<>yz<
/\p{Lowercase}\p{lower}/g,utf
>AF23<>\x{146}\x{7a}<>yz<
/\p{math}\p{math}/g,utf
>AF23<>\x{2215}\x{2b}<>yz<
/\p{Non Character Code Point}\p{nchar}/g,utf
>AF23<>\x{10ffff}\x{fdd0}<>yz<
/\p{patternsyntax}\p{patsyn}/g,utf
>AF23<>\x{21cd}\x{21}<>yz<
/\p{patternwhitespace}\p{patws}/g,utf
>AF23<>\x{2029}\x{85}<>yz<
/\p{prependedconcatenationmark}\p{pcm}/g,utf
>AF23<>\x{600}\x{110cd}<>yz<
/\p{quotationmark}\p{qmark}/g,utf
>AF23<>\x{ff63}\x{22}<>yz<
/\p{radical}\p{radical}/g,utf
>AF23<>\x{2fd5}\x{2e80}<>yz<
/\p{regionalindicator}\p{ri}/g,utf
>AF23<>\x{1f1e6}\x{1f1ff}<>yz<
/=\p{whitespace}\p{space}\p{wspace}=/g,utf
>AF23<=\x{d}\x{1680}\x{3000}=>yz<
/\p{sentenceterminal}\p{sterm}/g,utf
>AF23<>\x{1da88}\x{2e}<>yz<
/\p{terminalpunctuation}\p{term}/g,utf
>AF23<>\x{1da88}\x{2e}<>yz<
/\p{unified ideograph}\p{uideo}/g,utf
>AF23<>\x{30000}\x{3400}<>yz<
/\p{UPPERcase}\p{upper}/g,utf
>AF23<>\x{146}\x{7a}<>yz<
/\p{variationselector}\p{vs}/g,utf
>AF23<>\x{180b}\x{e01ef}<>yz<
/\p{xidcontinue}\p{xidc}/g,utf
>AF23<>\x{146}\x{30}<>yz<
# -----------------------------------------------------------------------------
# Variable-length lookbehinds.
/(?<=áb?c).../g,utf
ábcdèfgácxyz
/(?<=PQR|áb?c).../g,utf
ábcdèfgácxyzPQR123
/(?<=áb?c|PQR).../g,utf
ábcdèfgácxyzPQR123
/(?<=PQ|áb?c).../g,utf
ábcdèfgácxyzPQR123
/(?<=áb?c|PQ).../g,utf
ábcdèfgácxyzPQR123
/(?<=á(b?c|d?è?è)f)X./g,utf
ácfX1zzzáèfX2zzzádèèfX3zzzX4zzz
/(?<!á(b?c|d?è?è)f)X./g,utf
ácfX1zzzáèfX2zzzádèèfX3zzzX4zzz
/(?(?<=áb?c)d|è)/utf
ábcdèfg
ácdèfg
áxdèfg
/(?<=\d{2,3}|áBC)./utf
áBCD
/(?<=á(b?c){3}d)X/utf
ZXácbccdXYZ
/(?<=á(b?c){0}d)X/utf
ZXádXYZ
/(?<=á?(b?c){0}d)X./utf
ZXádXYZ
# --------------------------------------------------------------------------
/\N{ U+1234 }/utf
\x{1234}
/\o{ 1234 }/utf
x\o{1234}y
/\x{ 1234 }/utf
x\x{1234}y
/\p{ L }/
23AB56
/\w+/utf,ucp
--cafe\x{300}_au\x{203f}lait!
/[\w]+/utf,ucp
--cafe\x{300}_au\x{203f}lait!
/[[:word:]]+/utf,ucp
--cafe\x{300}_au\x{203f}lait!
/[[:xdigit:]]+/utf,ucp
--123ef\x{ff10}\x{ff19}\x{ff21}\x{ff26}\x{ff1a}
/\b.+?\b/utf,ucp
--cafe\x{300}_au\x{203f}lait!
/caf\B.+?\B/utf,ucp
--cafe\x{300}_au\x{203f}lait!
# --------------------------------------------------------------------------
# Case-independent matching property tests added after changing PCRE2 to be
# compatible with Perl. All three cases (upper, lower, title) conflate.
/\p{Lu}\p{Ll}\P{Lu}\P{Ll}/utf
>AbbD<
>Abb\x{01c5}<
\= Expect no match
>aBBd<
>aB!!<
/\p{Lu}\p{Ll}\P{Lu}\P{Ll}/i,utf
>aB!!<
>\x{01c5}B!!<
\= Expect no match
>AbbD<
>aBBd<
>Abb\x{01c5}<
/[.\p{Lu}][.\p{Ll}][.\P{Lu}][.\P{Ll}]/i,utf
>aB!!<
\= Expect no match
>AbbD<
>aBBd<
>Abb\x{01c5}<
/[\p{Lt}\x{36b}][\P{Lt}\x{10a0}]/i,utf
>A!<
>\x{3c9}\x{58d}<
>\x{413}\x{940}<
\= Expect no match
\x{3c9}\x{3c9}
\x{58d}\x{58d}
\x{413}\x{413}
\x{940}\x{940}
/^\p{Lt}+/i,utf
\x{1c5}AB
# --------------------------------------------------------------------------
/\p{ ^ L u }/
AbCd
# hex
/c3 b1/hex,utf
\N{U+00F1}
/[^\P{Lu}1]/i,utf,ucp
a
A
\x{3a3}
\x{3c3}
\= Expect no match
1
2
/[^\P{Lu}1]/utf,ucp
A
\x{3a3}
\= Expect no match
1
2
a
\x{3c3}
/[\P{Lu}1]/i,utf,ucp
1
2
\= Expect no match
a
A
\x{3a3}
\x{3c3}
/[\P{Lu}1]/utf,ucp
1
2
a
\x{3c3}
\= Expect no match
A
\x{3a3}
# --------------
# EXTENDED CHARACTER CLASSES (Perl)
/(?[\p{L} - \p{Lu}])/
a
\= Expect no match
A
1
/(?[\p{L} & \p{Lu}])/
A
\= Expect no match
a
1
/(?[[\p{Lu}z] ^ [\p{Ll}G]])/
A
p
\= Expect no match
G
z
1
/(?[\p{Ll} | \p{Nd}])/
a
1
\= Expect no match
A
/(?[\p{Ll} + [\p{Nd}]])/
a
1
\= Expect no match
A
/(?[ ![\p{Nd}z] ])/
_
Z
\= Expect no match
1
z
/(?[ \P{Nd} + [2] ])/
_
Z
2
\= Expect no match
1
3
/(?[ ![\P{Nd}] ])/
1
2
\= Expect no match
_
z
# caseless tests
/(?[ \p{Lu} ^ \p{Ll} ])/
a
A
\= Expect no match
_
1
/(?[ [\p{Lu}1] ^ \p{Ll} ])/i
1
\= Expect no match
a
A
_
/(?[ [\p{Lu}1] & [\p{Ll}1] ])/
1
\= Expect no match
a
A
_
2
/(?[ [\p{Lu}1] & [\p{Ll}1] ])/i
a
A
1
\= Expect no match
_
2
/(?[ \p{Lu} + \p{Ll} & [a-z] ])/utf
\x{0411}
a
A
\= Expect no match
\x{0431}
/(?[ (\p{Lu} + \p{Ll}) & [a-z] ])/utf
a
\= Expect no match
\x{0411}
\x{0431}
A
/(?[ [a-z] & \p{Lu} + \p{Ll} ])/utf
a
\x{0431}
\= Expect no match
\x{0411}
A
/(?[ [a-z] & (\p{Lu} + \p{Ll}) ])/utf
a
\= Expect no match
\x{0431}
\x{0411}
A
# --------------
# End of testinput4

3603
3rd/pcre2/testdata/testinput5 vendored Normal file

@@ -0,0 +1,3603 @@
# This set of tests checks the API, internals, and non-Perl stuff for UTF
# support, including Unicode properties. However, tests that give different
# results in 8-bit, 16-bit, and 32-bit modes are excluded (see tests 10 and
# 12).
#newline_default lf any anycrlf
# PCRE2 and Perl disagree about the characteristics of certain Unicode
# characters. For example, 061C was considered by Perl to be Arabic, though
# it was not listed as such in the Unicode Scripts.txt file for Unicode 8.
# However, it *is* in that file for Unicode 10, but when I came to re-check,
# Perl had changed in the meantime, with 5.026 not recognizing it as Arabic.
# 2066-2069 are graphic and printable according to Perl, though they are
# actually "isolate" control characters. That is why the following tests are
# here rather than in test 4.
/^[\p{Arabic}]/utf
\x{061c}
/^[[:graph:]]+$/utf,ucp
\= Expect no match
\x{61c}
\x{2066}
\x{2067}
\x{2068}
\x{2069}
/^[[:print:]]+$/utf,ucp
\= Expect no match
\x{61c}
\x{2066}
\x{2067}
\x{2068}
\x{2069}
/^[[:^graph:]]+$/utf,ucp
\x{09}\x{0a}\x{1D}\x{20}\x{85}\x{a0}\x{61c}\x{1680}
\x{2028}\x{2029}\x{202f}\x{2065}\x{2066}\x{2067}\x{2068}\x{2069}
/^[[:^print:]]+$/utf,ucp
\x{09}\x{1D}\x{85}\x{61c}\x{2028}\x{2029}\x{2065}\x{2066}\x{2067}
\x{2068}\x{2069}
# Perl does not consider U+180e to be a space character. It is true that it
# does not appear in the Unicode PropList.txt file as such, but in many other
# sources it is listed as a space, and has been treated as such in PCRE for
# a long time.
/^>[[:blank:]]*/utf,ucp
>\x{20}\x{a0}\x{1680}\x{180e}\x{2000}\x{202f}\x{9}\x{b}\x{2028}
/^A\s+Z/utf,ucp
A\x{85}\x{180e}\x{2005}Z
/^A[\s]+Z/utf,ucp
A\x{2005}Z
A\x{85}\x{2005}Z
/^[[:graph:]]+$/utf,ucp
\= Expect no match
\x{180e}
/^[[:print:]]+$/utf,ucp
\x{180e}
/^[[:^graph:]]+$/utf,ucp
\x{09}\x{0a}\x{1D}\x{20}\x{85}\x{a0}\x{61c}\x{1680}\x{180e}
/^[[:^print:]]+$/utf,ucp
\= Expect no match
\x{180e}
# End of U+180E tests.
# ---------------------------------------------------------------------
# Use no_start_optimize because the first code unit is different in 8-bit from
# the wider modes.
/\65535/IB,utf,no_start_optimize
/\65536/IB,utf,no_start_optimize
/\x{110000}/IB,utf
/\o{4200000}/IB,utf
/\x{ffffffff}/utf
/\o{37777777777}/utf
/\x{100000000}/utf
/\o{77777777777}/utf
/\x{d800}/utf
/\o{154000}/utf
/\x{dfff}/utf
/\o{157777}/utf
/\x{d7ff}/utf
/\o{153777}/utf
/\x{e000}/utf
/\o{170000}/utf
/^\x{100}a\x{1234}/utf
\x{100}a\x{1234}bcd
/\x{0041}\x{2262}\x{0391}\x{002e}/IB,utf
\x{0041}\x{2262}\x{0391}\x{002e}
/.{3,5}X/IB,utf
\x{212ab}\x{212ab}\x{212ab}\x{861}X
/.{3,5}?/IB,utf
\x{212ab}\x{212ab}\x{212ab}\x{861}
/^[ab]/IB,utf
bar
\= Expect no match
c
\x{ff}
\x{100}
/\x{100}*(\d+|"(?1)")/utf
1234
"1234"
\x{100}1234
"\x{100}1234"
\x{100}\x{100}12ab
\x{100}\x{100}"12"
\= Expect no match
\x{100}\x{100}abcd
/\x{100}*/IB,utf
/a\x{100}*/IB,utf
/ab\x{100}*/IB,utf
/[\x{200}-\x{100}]/utf
/[Ā-Ą]/utf
\x{100}
\x{104}
\= Expect no match
\x{105}
\x{ff}
/[\xFF]/IB
>\xff<
/[^\xFF]/IB
/[Ä-Ü]/utf
Ö # Matches without Study
\x{d6}
/[Ä-Ü]/utf
Ö <-- Same with Study
\x{d6}
/[\x{c4}-\x{dc}]/utf
Ö # Matches without Study
\x{d6}
/[\x{c4}-\x{dc}]/utf
Ö <-- Same with Study
\x{d6}
/[^\x{100}]abc(xyz(?1))/IB,utf
/(\x{100}(b(?2)c))?/IB,utf
/(\x{100}(b(?2)c)){0,2}/IB,utf
/(\x{100}(b(?1)c))?/IB,utf
/(\x{100}(b(?1)c)){0,2}/IB,utf
/\W/utf
A.B
A\x{100}B
/\w/utf
\x{100}X
# Use no_start_optimize because the first code unit is different in 8-bit from
# the wider modes.
/^\ሴ/IB,utf,no_start_optimize
/()()()()()()()()()()
()()()()()()()()()()
()()()()()()()()()()
()()()()()()()()()()
A (x) (?41) B/x,utf
AxxB
/^[\x{100}\E-\Q\E\x{150}]/B,utf
/^[\QĀ\E-\QŐ\E]/B,utf
/^abc./gmx,newline=any,utf
abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x{0085}abc7 \x{2028}abc8 \x{2029}abc9 JUNK
/abc.$/gmx,newline=any,utf
abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x{0085} abc7\x{2028} abc8\x{2029} abc9
/^a\Rb/bsr=unicode,utf
a\nb
a\rb
a\r\nb
a\x0bb
a\x0cb
a\x{85}b
a\x{2028}b
a\x{2029}b
\= Expect no match
a\n\rb
/^a\R*b/bsr=unicode,utf
ab
a\nb
a\rb
a\r\nb
a\x0bb
a\x0c\x{2028}\x{2029}b
a\x{85}b
a\n\rb
a\n\r\x{85}\x0cb
/^a\R+b/bsr=unicode,utf
a\nb
a\rb
a\r\nb
a\x0bb
a\x0c\x{2028}\x{2029}b
a\x{85}b
a\n\rb
a\n\r\x{85}\x0cb
\= Expect no match
ab
/^a\R{1,3}b/bsr=unicode,utf
a\nb
a\n\rb
a\n\r\x{85}b
a\r\n\r\nb
a\r\n\r\n\r\nb
a\n\r\n\rb
a\n\n\r\nb
\= Expect no match
a\n\n\n\rb
a\r
/\H\h\V\v/utf
X X\x0a
X\x09X\x0b
\= Expect no match
\x{a0} X\x0a
/\H*\h+\V?\v{3,4}/utf
\x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
\x09\x20\x{a0}\x0a\x0b\x0c\x0d\x0a
\x09\x20\x{a0}\x0a\x0b\x0c
\= Expect no match
\x09\x20\x{a0}\x0a\x0b
/\H\h\V\v/utf
\x{3001}\x{3000}\x{2030}\x{2028}
X\x{180e}X\x{85}
\= Expect no match
\x{2009} X\x0a
/\H*\h+\V?\v{3,4}/utf
\x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x0c\x0d\x0a
\x09\x{205f}\x{a0}\x0a\x{2029}\x0c\x{2028}\x0a
\x09\x20\x{202f}\x0a\x0b\x0c
\= Expect no match
\x09\x{200a}\x{a0}\x{2028}\x0b
/[\h]/B,utf
>\x{1680}
/[\h]{3,}/B,utf
>\x{1680}\x{180e}\x{2000}\x{2003}\x{200a}\x{202f}\x{205f}\x{3000}<
/[\v]/B,utf
/[\H]/B,utf
/[\V]/B,utf
/.*$/newline=any,utf
\x{1ec5}
/a\Rb/I,bsr=anycrlf,utf
a\rb
a\nb
a\r\nb
\= Expect no match
a\x{85}b
a\x0bb
/a\Rb/I,bsr=unicode,utf
a\rb
a\nb
a\r\nb
a\x{85}b
a\x0bb
/a\R?b/I,bsr=anycrlf,utf
a\rb
a\nb
a\r\nb
\= Expect no match
a\x{85}b
a\x0bb
/a\R?b/I,bsr=unicode,utf
a\rb
a\nb
a\r\nb
a\x{85}b
a\x0bb
/.*a.*=.b.*/utf,newline=any
QQQ\x{2029}ABCaXYZ=!bPQR
\= Expect no match
a\x{2029}b
\x61\xe2\x80\xa9\x62
/[[:a\x{100}b:]]/utf
/[\p{InvalidOrBadProperty}]/
/a[^]b/utf,allow_empty_class,match_unset_backref
a\x{1234}b
a\nb
\= Expect no match
ab
/a[^]+b/utf,allow_empty_class,match_unset_backref
aXb
a\nX\nX\x{1234}b
\= Expect no match
ab
/(\x{de})\1/
\x{de}\x{de}
/X/newline=any,utf,firstline
A\x{1ec5}ABCXYZ
/Xa{2,4}b/utf
X\=ps
Xa\=ps
Xaa\=ps
Xaaa\=ps
Xaaaa\=ps
/Xa{2,4}?b/utf
X\=ps
Xa\=ps
Xaa\=ps
Xaaa\=ps
Xaaaa\=ps
/Xa{2,4}+b/utf
X\=ps
Xa\=ps
Xaa\=ps
Xaaa\=ps
Xaaaa\=ps
/X\x{123}{2,4}b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/X\x{123}{2,4}?b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/X\x{123}{2,4}+b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/X\x{123}{2,4}b/utf
\= Expect no match
Xx\=ps
X\x{123}x\=ps
X\x{123}\x{123}x\=ps
X\x{123}\x{123}\x{123}x\=ps
X\x{123}\x{123}\x{123}\x{123}x\=ps
/X\x{123}{2,4}?b/utf
\= Expect no match
Xx\=ps
X\x{123}x\=ps
X\x{123}\x{123}x\=ps
X\x{123}\x{123}\x{123}x\=ps
X\x{123}\x{123}\x{123}\x{123}x\=ps
/X\x{123}{2,4}+b/utf
\= Expect no match
Xx\=ps
X\x{123}x\=ps
X\x{123}\x{123}x\=ps
X\x{123}\x{123}\x{123}x\=ps
X\x{123}\x{123}\x{123}\x{123}x\=ps
/X\d{2,4}b/utf
X\=ps
X3\=ps
X33\=ps
X333\=ps
X3333\=ps
/X\d{2,4}?b/utf
X\=ps
X3\=ps
X33\=ps
X333\=ps
X3333\=ps
/X\d{2,4}+b/utf
X\=ps
X3\=ps
X33\=ps
X333\=ps
X3333\=ps
/X\D{2,4}b/utf
X\=ps
Xa\=ps
Xaa\=ps
Xaaa\=ps
Xaaaa\=ps
/X\D{2,4}?b/utf
X\=ps
Xa\=ps
Xaa\=ps
Xaaa\=ps
Xaaaa\=ps
/X\D{2,4}+b/utf
X\=ps
Xa\=ps
Xaa\=ps
Xaaa\=ps
Xaaaa\=ps
/X\D{2,4}b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/X\D{2,4}?b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/X\D{2,4}+b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/X[abc]{2,4}b/utf
X\=ps
Xa\=ps
Xaa\=ps
Xaaa\=ps
Xaaaa\=ps
/X[abc]{2,4}?b/utf
X\=ps
Xa\=ps
Xaa\=ps
Xaaa\=ps
Xaaaa\=ps
/X[abc]{2,4}+b/utf
X\=ps
Xa\=ps
Xaa\=ps
Xaaa\=ps
Xaaaa\=ps
/X[abc\x{123}]{2,4}b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/X[abc\x{123}]{2,4}?b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/X[abc\x{123}]{2,4}+b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/X[^a]{2,4}b/utf
X\=ps
Xz\=ps
Xzz\=ps
Xzzz\=ps
Xzzzz\=ps
/X[^a]{2,4}?b/utf
X\=ps
Xz\=ps
Xzz\=ps
Xzzz\=ps
Xzzzz\=ps
/X[^a]{2,4}+b/utf
X\=ps
Xz\=ps
Xzz\=ps
Xzzz\=ps
Xzzzz\=ps
/X[^a]{2,4}b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/X[^a]{2,4}?b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/X[^a]{2,4}+b/utf
X\=ps
X\x{123}\=ps
X\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\=ps
X\x{123}\x{123}\x{123}\x{123}\=ps
/(Y)X\1{2,4}b/utf
YX\=ps
YXY\=ps
YXYY\=ps
YXYYY\=ps
YXYYYY\=ps
/(Y)X\1{2,4}?b/utf
YX\=ps
YXY\=ps
YXYY\=ps
YXYYY\=ps
YXYYYY\=ps
/(Y)X\1{2,4}+b/utf
YX\=ps
YXY\=ps
YXYY\=ps
YXYYY\=ps
YXYYYY\=ps
/(\x{123})X\1{2,4}b/utf
\x{123}X\=ps
\x{123}X\x{123}\=ps
\x{123}X\x{123}\x{123}\=ps
\x{123}X\x{123}\x{123}\x{123}\=ps
\x{123}X\x{123}\x{123}\x{123}\x{123}\=ps
/(\x{123})X\1{2,4}?b/utf
\x{123}X\=ps
\x{123}X\x{123}\=ps
\x{123}X\x{123}\x{123}\=ps
\x{123}X\x{123}\x{123}\x{123}\=ps
\x{123}X\x{123}\x{123}\x{123}\x{123}\=ps
/(\x{123})X\1{2,4}+b/utf
\x{123}X\=ps
\x{123}X\x{123}\=ps
\x{123}X\x{123}\x{123}\=ps
\x{123}X\x{123}\x{123}\x{123}\=ps
\x{123}X\x{123}\x{123}\x{123}\x{123}\=ps
/\bthe cat\b/utf
the cat\=ps
the cat\=ph
/abcd*/utf
xxxxabcd\=ps
xxxxabcd\=ph
/abcd*/i,utf
xxxxabcd\=ps
xxxxabcd\=ph
XXXXABCD\=ps
XXXXABCD\=ph
/abc\d*/utf
xxxxabc1\=ps
xxxxabc1\=ph
/(a)bc\1*/utf
xxxxabca\=ps
xxxxabca\=ph
/abc[de]*/utf
xxxxabcde\=ps
xxxxabcde\=ph
/X\W{3}X/utf
X\=ps
/\sxxx\s/utf,tables=2
AB\x{85}xxx\x{a0}XYZ
AB\x{a0}xxx\x{85}XYZ
/\S \S/utf,tables=2
\x{a2} \x{84}
'A#хц'Bx,newline=any,utf
'A#хц
PQ'Bx,newline=any,utf
/a+#хaa
z#XX?/Bx,newline=any,utf
/a+#хaa
z#х?/Bx,newline=any,utf
/\g{A}xxx#bXX(?'A'123)
(?'A'456)/Bx,newline=any,utf
/\g{A}xxx#bх(?'A'123)
(?'A'456)/Bx,newline=any,utf
/^\cģ/utf
/(\R*)(.)/s,utf
\r\n
\r\r\n\n\r
\r\r\n\n\r\n
/(\R)*(.)/s,utf
\r\n
\r\r\n\n\r
\r\r\n\n\r\n
/[^\x{1234}]+/Ii,utf
/[^\x{1234}]+?/Ii,utf
/[^\x{1234}]++/Ii,utf
/[^\x{1234}]{2}/Ii,utf
/f.*/
for\=ph
/f.*/s
for\=ph
/f.*/utf
for\=ph
/f.*/s,utf
for\=ph
/\x{d7ff}\x{e000}/utf
/\x{d800}/utf
/\x{dfff}/utf
/\h+/utf
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
\x{3001}\x{2fff}\x{200a}\x{a0}\x{2000}
/[\h\x{e000}]+/B,utf
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
\x{3001}\x{2fff}\x{200a}\x{a0}\x{2000}
/\H+/utf
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
\x{2000}\x{200a}\x{1fff}\x{200b}
\x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
\x{a0}\x{3000}\x{9f}\x{a1}\x{2fff}\x{3001}
/[\H\x{d7ff}]+/B,utf
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
\x{2000}\x{200a}\x{1fff}\x{200b}
\x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
\x{a0}\x{3000}\x{9f}\x{a1}\x{2fff}\x{3001}
/\v+/utf
\x{2027}\x{2030}\x{2028}\x{2029}
\x09\x0e\x{84}\x{86}\x{85}\x0a\x0b\x0c\x0d
/[\v\x{e000}]+/B,utf
\x{2027}\x{2030}\x{2028}\x{2029}
\x09\x0e\x{84}\x{86}\x{85}\x0a\x0b\x0c\x0d
/\V+/utf
\x{2028}\x{2029}\x{2027}\x{2030}
\x{85}\x0a\x0b\x0c\x0d\x09\x0e\x{84}\x{86}
/[\V\x{d7ff}]+/B,utf
\x{2028}\x{2029}\x{2027}\x{2030}
\x{85}\x0a\x0b\x0c\x0d\x09\x0e\x{84}\x{86}
/\R+/bsr=unicode,utf
\x{2027}\x{2030}\x{2028}\x{2029}
\x09\x0e\x{84}\x{86}\x{85}\x0a\x0b\x0c\x0d
/(..)\1/utf
ab\=ps
aba\=ps
abab\=ps
/(..)\1/i,utf
ab\=ps
abA\=ps
aBAb\=ps
/(..)\1{2,}/utf
ab\=ps
aba\=ps
abab\=ps
ababa\=ps
ababab\=ps
ababab\=ph
abababa\=ps
abababa\=ph
/(..)\1{2,}/i,utf
ab\=ps
aBa\=ps
aBAb\=ps
AbaBA\=ps
abABAb\=ps
aBAbaB\=ph
abABabA\=ps
abaBABa\=ph
/(..)\1{2,}?x/i,utf
ab\=ps
abA\=ps
aBAb\=ps
abaBA\=ps
abAbaB\=ps
abaBabA\=ps
abAbABaBx\=ps
/./utf,newline=crlf
\r\=ps
\r\=ph
/.{2,3}/utf,newline=crlf
\r\=ps
\r\=ph
\r\r\=ps
\r\r\=ph
\r\r\r\=ps
\r\r\r\=ph
/.{2,3}?/utf,newline=crlf
\r\=ps
\r\=ph
\r\r\=ps
\r\r\=ph
\r\r\r\=ps
\r\r\r\=ph
/[^\x{100}][^\x{1234}][^\x{ffff}][^\x{10000}][^\x{10ffff}]/B,utf
/[^\x{100}][^\x{1234}][^\x{ffff}][^\x{10000}][^\x{10ffff}]/Bi,utf
/[^\x{100}]*[^\x{10000}]+[^\x{10ffff}]??[^\x{8000}]{4,}[^\x{7fff}]{2,9}?[^\x{fffff}]{5,6}+/B,utf
/[^\x{100}]*[^\x{10000}]+[^\x{10ffff}]??[^\x{8000}]{4,}[^\x{7fff}]{2,9}?[^\x{fffff}]{5,6}+/Bi,utf
/(?<=\x{1234}\x{1234})\bxy/I,utf
/(?<!^)ETA/utf
\= Expect no match
ETA
/\u0100/B,utf,alt_bsux,allow_empty_class,match_unset_backref
/[\u0100-\u0200]/B,utf,alt_bsux,allow_empty_class,match_unset_backref
/\ud800/utf,alt_bsux,allow_empty_class,match_unset_backref
/^\u{0000000000010ffff}/utf,extra_alt_bsux
\x{10ffff}
/\u{ 1bb1}/utf,extra_alt_bsux
u{ 1bb1}
\= Expect no match
\x{1bb1}
/\u/utf,alt_bsux
\\u
/^a+[a\x{200}]/B,utf
aa
/[b-d\x{200}-\x{250}]*[ae-h]?#[\x{200}-\x{250}]{0,8}[\x00-\xff]*#[\x{200}-\x{250}]+[a-z]/B,utf
/[\p{L}]/IB
/[\p{^L}]/IB
/[\P{L}]/IB
/[\P{^L}]/IB
/[abc\p{L}\x{0660}]/IB,utf
/[\p{Nd}]/IB,utf
1234
/[\p{Nd}+-]+/IB,utf
1234
12-34
12+\x{661}-34
\= Expect no match
abcd
/(?:[\PPa*]*){8,}/
/[\P{Any}]/B
/[^\P{Any}\P{Any}]/B
/[\P{Any}\E]/B
/\p{Any}#\P{Any}![\p{Any}]:[\P{Any}]@[\p{Any}a-z]%[\P{Any}c]/B,utf
/[\P{Any}\P{Any}\P{Any}]![\p{Any}\p{Any}\p{Any}]:[^\P{Any}\P{Any}]@[^\p{Any}\p{Any}]%[^\p{Any}\P{Any}]/B,utf
/(\P{Yi}+\277)/
/(\P{Yi}+\277)?/
/(?<=\P{Yi}{3}A)X/
/\p{Yi}+(\P{Yi}+)(?1)/
/(\P{Yi}{2}\277)?/
/[\P{Yi}A]/
/[\P{Yi}\P{Yi}\P{Yi}A]/
/[^\P{Yi}A]/
/[^\P{Yi}\P{Yi}\P{Yi}A]/
/(\P{Yi}*\277)*/
/(\P{Yi}*?\277)*/
/(\p{Yi}*+\277)*/
/(\P{Yi}?\277)*/
/(\P{Yi}??\277)*/
/(\p{Yi}?+\277)*/
/(\P{Yi}{0,3}\277)*/
/(\P{Yi}{0,3}?\277)*/
/(\p{Yi}{0,3}+\277)*/
/\p{Zl}{2,3}+/B,utf
\x{2028}\x{2028}\x{2028}
/\p{Zl}/B,utf
/\p{Lu}{3}+/B,utf
/\pL{2}+/B,utf
/\p{Cc}{2}+/B,utf
/^\p{Cf}/utf
\x{180e}
\x{061c}
\x{2066}
\x{2067}
\x{2068}
\x{2069}
/^\p{Cs}/utf
\x{dfff}\=no_utf_check
\= Expect no match
\x{09f}
/^\p{Mn}/utf
\x{1a1b}
/^\p{Pe}/utf
\x{2309}
\x{230b}
/^\p{Ps}/utf
\x{2308}
\x{230a}
/^\p{Sc}+/utf
$\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
\x{9f2}
\= Expect no match
X
\x{2c2}
/^\p{Zs}/utf
\ \
\x{a0}
\x{1680}
\x{2000}
\x{2001}
\= Expect no match
\x{2028}
\x{200d}
/[\x{c0}\x{391}]/i,utf
\x{c0}
\x{e0}
# The next two are special cases where the lengths of the different cases of
# the same character differ. The first went wrong with heap frame storage; the
# second was broken in all cases.
/^\x{023a}+?(\x{0130}+)/i,utf
\x{023a}\x{2c65}\x{0130}
/^\x{023a}+([^X])/i,utf
\x{023a}\x{2c65}X
/\x{c0}+\x{116}+/i,utf
\x{c0}\x{e0}\x{116}\x{117}
/[\x{c0}\x{116}]+/i,utf
\x{c0}\x{e0}\x{116}\x{117}
/(\x{de})\1/i,utf
\x{de}\x{de}
\x{de}\x{fe}
\x{fe}\x{fe}
\x{fe}\x{de}
/^\x{c0}$/i,utf
\x{c0}
\x{e0}
/^\x{e0}$/i,utf
\x{c0}
\x{e0}
# The next two should be Perl-compatible, but it fails to match \x{e0}. PCRE
# will match it only with UCP support, because without that it has no notion
# of case for anything other than the ASCII letters.
/((?i)[\x{c0}])/utf
\x{c0}
\x{e0}
/(?i:[\x{c0}])/utf
\x{c0}
\x{e0}
# These are PCRE's extra properties to help with Unicodizing \d etc.
/^\p{Xan}/utf
ABCD
1234
\x{6ca}
\x{a6c}
\x{10a7}
\= Expect no match
_ABC
/^\p{Xan}+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
\= Expect no match
_ABC
/^\p{Xan}+?/utf
\x{6ca}\x{a6c}\x{10a7}_
/^\p{Xan}*/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
/^\p{Xan}{2,9}/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
/^\p{Xan}{2,9}?/utf
\x{6ca}\x{a6c}\x{10a7}_
/^[\p{Xan}]/utf
ABCD1234_
1234abcd_
\x{6ca}
\x{a6c}
\x{10a7}
\= Expect no match
_ABC
/^[\p{Xan}]+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
\= Expect no match
_ABC
/^>\p{Xsp}/utf
>\x{1680}\x{2028}\x{0b}
>\x{a0}
\= Expect no match
\x{0b}
/^>\p{Xsp}+/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xsp}+?/utf
>\x{1680}\x{2028}\x{0b}
/^>\p{Xsp}*/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xsp}{2,9}/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xsp}{2,9}?/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>[\p{Xsp}]/utf
>\x{2028}\x{0b}
/^>[\p{Xsp}]+/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}/utf
>\x{1680}\x{2028}\x{0b}
>\x{a0}
\= Expect no match
\x{0b}
/^>\p{Xps}+/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}+?/utf
>\x{1680}\x{2028}\x{0b}
/^>\p{Xps}*/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}{2,9}/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}{2,9}?/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>[\p{Xps}]/utf
>\x{2028}\x{0b}
/^>[\p{Xps}]+/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^\p{Xwd}/utf
ABCD
1234
\x{6ca}
\x{a6c}
\x{10a7}
_ABC
\= Expect no match
[]
/^\p{Xwd}+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
/^\p{Xwd}+?/utf
\x{6ca}\x{a6c}\x{10a7}_
/^\p{Xwd}*/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
/^\p{Xwd}{2,9}/utf
A_B12\x{6ca}\x{a6c}\x{10a7}
/^\p{Xwd}{2,9}?/utf
\x{6ca}\x{a6c}\x{10a7}_
/^[\p{Xwd}]/utf
ABCD1234_
1234abcd_
\x{6ca}
\x{a6c}
\x{10a7}
_ABC
\= Expect no match
[]
/^[\p{Xwd}]+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
# A check not in UTF-8 mode
/^[\p{Xwd}]+/
ABCD1234_
# Some negative checks
/^[\P{Xwd}]+/utf
!.+\x{019}\x{482}AB
/^[\p{^Xwd}]+/utf
!.+\x{019}\x{589}AB
/[\D]/B,utf,ucp
1\x{3c8}2
/[\d]/B,utf,ucp
>\x{6f4}<
/[\S]/B,utf,ucp
\x{1680}\x{6f4}\x{1680}
/[\s]/B,utf,ucp
>\x{1680}<
/[\W]/B,utf,ucp
A\x{1735}B
/[\w]/B,utf,ucp
>\x{1723}<
/\D/B,utf,ucp
1\x{3c8}2
/\d/B,utf,ucp
>\x{6f4}<
/\S/B,utf,ucp
\x{1680}\x{6f4}\x{1680}
/\s/B,utf,ucp
>\x{1680}>
/\W/B,utf,ucp
A\x{1735}B
/\w/B,utf,ucp
>\x{1723}<
/[[:alpha:]]/B,ucp
/[[:lower:]]/B,ucp
/[[:upper:]]/B,ucp
/[[:alnum:]]/B,ucp
/[[:ascii:]]/B,ucp
/[[:cntrl:]]/B,ucp
/[[:digit:]]/B,ucp
/[[:digit:]]/B,ucp,ascii_digit
/[[:graph:]]/B,ucp
/[[:print:]]/B,ucp
/[[:punct:]]/B,ucp
/[[:space:]]/B,ucp
/[[:word:]]/B,ucp
/[[:xdigit:]]/B,ucp
/[[:xdigit:]]/B,ucp,ascii_digit
# Unicode properties for \b and \B
/\b...\B/utf,ucp
abc_
\x{37e}abc\x{376}
\x{37e}\x{376}\x{371}\x{393}\x{394}
!\x{c0}++\x{c1}\x{c2}
!\x{c0}+++++
# Without PCRE_UCP, non-ASCII always fail, even if < 256
/\b...\B/utf
abc_
\= Expect no match
\x{37e}abc\x{376}
\x{37e}\x{376}\x{371}\x{393}\x{394}
!\x{c0}++\x{c1}\x{c2}
!\x{c0}+++++
# With PCRE_UCP, non-UTF8 chars that are < 256 still check properties
/\b...\B/ucp
abc_
!\x{c0}++\x{c1}\x{c2}
!\x{c0}+++++
# Some of these are silly, but they check various combinations
/[[:^alpha:][:^cntrl:]]+/B,utf,ucp
123
abc
/[[:^cntrl:][:^alpha:]]+/B,utf,ucp
123
abc
/[[:alpha:]]+/B,utf,ucp
abc
/[[:^alpha:]\S]+/B,utf,ucp
123
abc
/[^\d]+/B,utf,ucp
abc123
abc\x{123}
\x{660}abc
/\p{Lu}+9\p{Lu}+B\p{Lu}+b/B
/\p{^Lu}+9\p{^Lu}+B\p{^Lu}+b/B
/\P{Lu}+9\P{Lu}+B\P{Lu}+b/B
/\p{Han}+X\p{Greek}+\x{370}/B,utf
/\p{Xan}+!\p{Xan}+A/B
/\p{Xsp}+!\p{Xsp}\t/B
/\p{Xps}+!\p{Xps}\t/B
/\p{Xwd}+!\p{Xwd}_/B
/A+\p{N}A+\dB+\p{N}*B+\d*/B,ucp
# These behaved oddly in Perl, so they are kept in this test
/(\x{23a}\x{23a}\x{23a})?\1/i,utf
\= Expect no match
\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
/(ȺȺȺ)?\1/i,utf
\= Expect no match
ȺȺȺⱥⱥ
/(\x{23a}\x{23a}\x{23a})?\1/i,utf
\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
/(ȺȺȺ)?\1/i,utf
ȺȺȺⱥⱥⱥ
/(\x{23a}\x{23a}\x{23a})\1/i,utf
\= Expect no match
\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
/(ȺȺȺ)\1/i,utf
\= Expect no match
ȺȺȺⱥⱥ
/(\x{23a}\x{23a}\x{23a})\1/i,utf
\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
/(ȺȺȺ)\1/i,utf
ȺȺȺⱥⱥⱥ
/(\x{2c65}\x{2c65})\1/i,utf
\x{2c65}\x{2c65}\x{23a}\x{23a}
/(ⱥⱥ)\1/i,utf
ⱥⱥȺȺ
/(\x{23a}\x{23a}\x{23a})\1Y/i,utf
X\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}YZ
/(\x{2c65}\x{2c65})\1Y/i,utf
X\x{2c65}\x{2c65}\x{23a}\x{23a}YZ
# These scripts weren't yet in Perl when I added Unicode 6.0.0 to PCRE
/^[\p{Batak}]/utf
\x{1bc0}
\x{1bff}
\= Expect no match
\x{1bf4}
/^[\p{Brahmi}]/utf
\x{11000}
\x{1106f}
\= Expect no match
\x{1104e}
/^[\p{Mandaic}]/utf
\x{840}
\x{85e}
\= Expect no match
\x{85c}
\x{85d}
/(\X*)(.)/s,utf
A\x{300}
/^S(\X*)e(\X*)$/utf
Stéréo
/^\X/utf
́réo
/^a\X41z/alt_bsux,allow_empty_class,match_unset_backref,dupnames
aX41z
\= Expect no match
aAz
/\X/
a\=ps
a\=ph
/\Xa/
aa\=ps
aa\=ph
/\X{2}/
aa\=ps
aa\=ph
/\X+a/
a\=ps
aa\=ps
aa\=ph
/\X+?a/
a\=ps
ab\=ps
aa\=ps
aa\=ph
aba\=ps
# These Unicode 6.1.0 scripts are not known to Perl.
/\p{Chakma}\d/utf,ucp
\x{11100}\x{1113c}
/\p{Takri}\d/utf,ucp
\x{11680}\x{116c0}
/^\X/utf
A\=ps
A\=ph
A\x{300}\x{301}\=ps
A\x{300}\x{301}\=ph
A\x{301}\=ps
A\x{301}\=ph
/^\X{2,3}/utf
A\=ps
A\=ph
AA\=ps
AA\=ph
A\x{300}\x{301}\=ps
A\x{300}\x{301}\=ph
A\x{300}\x{301}A\x{300}\x{301}\=ps
A\x{300}\x{301}A\x{300}\x{301}\=ph
/^\X{2}/utf
AA\=ps
AA\=ph
A\x{300}\x{301}A\x{300}\x{301}\=ps
A\x{300}\x{301}A\x{300}\x{301}\=ph
/^\X+/utf
AA\=ps
AA\=ph
/^\X+?Z/utf
AA\=ps
AA\=ph
/A\x{3a3}B/IBi,utf
/[\x{3a3}]/Bi,utf
/[^\x{3a3}]/Bi,utf
/[\x{3a3}]+/Bi,utf
/[^\x{3a3}]+/Bi,utf
/a*\x{3a3}/Bi,utf
/\x{3a3}+a/Bi,utf
/\x{3a3}*\x{3c2}/Bi,utf
/\x{3a3}{3}/i,utf,aftertext
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}{2,4}/i,utf,aftertext
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}{2,4}?/i,utf,aftertext
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}+./i,utf,aftertext
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}++./i,utf,aftertext
\= Expect no match
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}*\x{3c2}/Bi,utf
/[^\x{3a3}]*\x{3c2}/Bi,utf
/[^a]*\x{3c2}/Bi,utf
/ist/Bi,utf
\= Expect no match
ikt
/is+t/i,utf
iSs\x{17f}t
\= Expect no match
ikt
/is+?t/i,utf
\= Expect no match
ikt
/is?t/i,utf
\= Expect no match
ikt
/is{2}t/i,utf
\= Expect no match
iskt
# This property is a PCRE special
/^\p{Xuc}/utf
$abc
@abc
`abc
\x{1234}abc
\= Expect no match
abc
/^\p{Xuc}+/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\p{Xuc}+?/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\p{Xuc}+?\*/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\p{Xuc}++/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\p{Xuc}{3,5}/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\p{Xuc}{3,5}?/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^[\p{Xuc}]/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^[\p{Xuc}]+/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\P{Xuc}/utf
abc
\= Expect no match
$abc
@abc
`abc
\x{1234}abc
/^[\P{Xuc}]/utf
abc
\= Expect no match
$abc
@abc
`abc
\x{1234}abc
# Some auto-possessification tests
/\pN+\z/B
/\PN+\z/B
/\pN+/B
/\PN+/B
/\p{Any}+\p{Any} \p{Any}+\P{Any} \p{Any}+\p{L&} \p{Any}+\p{L} \p{Any}+\p{Lu} \p{Any}+\p{Han} \p{Any}+\p{Xan} \p{Any}+\p{Xsp} \p{Any}+\p{Xps} \p{Xwd}+\p{Any} \p{Any}+\p{Xuc}/Bx,ucp
/\p{L&}+\p{Any} \p{L&}+\p{L&} \P{L&}+\p{L&} \p{L&}+\p{L} \p{L&}+\p{Lu} \p{L&}+\p{Han} \p{L&}+\p{Xan} \p{L&}+\P{Xan} \p{L&}+\p{Xsp} \p{L&}+\p{Xps} \p{Xwd}+\p{L&} \p{L&}+\p{Xuc}/Bx,ucp
/\p{N}+\p{Any} \p{N}+\p{L&} \p{N}+\p{L} \p{N}+\P{L} \p{N}+\P{N} \p{N}+\p{Lu} \p{N}+\p{Han} \p{N}+\p{Xan} \p{N}+\p{Xsp} \p{N}+\p{Xps} \p{Xwd}+\p{N} \p{N}+\p{Xuc}/Bx,ucp
/\p{Lu}+\p{Any} \p{Lu}+\p{L&} \p{Lu}+\p{L} \p{Lu}+\p{Lu} \P{Lu}+\p{Lu} \p{Lu}+\p{Nd} \p{Lu}+\P{Nd} \p{Lu}+\p{Han} \p{Lu}+\p{Xan} \p{Lu}+\p{Xsp} \p{Lu}+\p{Xps} \p{Xwd}+\p{Lu} \p{Lu}+\p{Xuc}/Bx,ucp
/\p{Han}+\p{Lu} \p{Han}+\p{L&} \p{Han}+\p{L} \p{Han}+\p{Lu} \p{Han}+\p{Arabic} \p{Arabic}+\p{Arabic} \p{Han}+\p{Xan} \p{Han}+\p{Xsp} \p{Han}+\p{Xps} \p{Xwd}+\p{Han} \p{Han}+\p{Xuc}/Bx,ucp
/\p{Xan}+\p{Any} \p{Xan}+\p{L&} \P{Xan}+\p{L&} \p{Xan}+\p{L} \p{Xan}+\p{Lu} \p{Xan}+\p{Han} \p{Xan}+\p{Xan} \p{Xan}+\P{Xan} \p{Xan}+\p{Xsp} \p{Xan}+\p{Xps} \p{Xwd}+\p{Xan} \p{Xan}+\p{Xuc}/Bx,ucp
/\p{Xsp}+\p{Any} \p{Xsp}+\p{L&} \p{Xsp}+\p{L} \p{Xsp}+\p{Lu} \p{Xsp}+\p{Han} \p{Xsp}+\p{Xan} \p{Xsp}+\p{Xsp} \P{Xsp}+\p{Xsp} \p{Xsp}+\p{Xps} \p{Xwd}+\p{Xsp} \p{Xsp}+\p{Xuc}/Bx,ucp
/\p{Xwd}+\p{Any} \p{Xwd}+\p{L&} \p{Xwd}+\p{L} \p{Xwd}+\p{Lu} \p{Xwd}+\p{Han} \p{Xwd}+\p{Xan} \p{Xwd}+\p{Xsp} \p{Xwd}+\p{Xps} \p{Xwd}+\p{Xwd} \p{Xwd}+\P{Xwd} \p{Xwd}+\p{Xuc}/Bx,ucp
/\p{Xuc}+\p{Any} \p{Xuc}+\p{L&} \p{Xuc}+\p{L} \p{Xuc}+\p{Lu} \p{Xuc}+\p{Han} \p{Xuc}+\p{Xan} \p{Xuc}+\p{Xsp} \p{Xuc}+\p{Xps} \p{Xwd}+\p{Xuc} \p{Xuc}+\p{Xuc} \p{Xuc}+\P{Xuc}/Bx,ucp
/\p{N}+\p{Ll} \p{N}+\p{Nd} \p{N}+\P{Nd}/Bx,ucp
/\p{Xan}+\p{L} \p{Xan}+\p{N} \p{Xan}+\p{C} \p{Xan}+\P{L} \P{Xan}+\p{N} \p{Xan}+\P{C}/Bx,ucp
/\p{L}+\p{Xan} \p{N}+\p{Xan} \p{C}+\p{Xan} \P{L}+\p{Xan} \p{N}+\p{Xan} \P{C}+\p{Xan} \p{L}+\P{Xan}/Bx,ucp
/\p{Xan}+\p{Lu} \p{Xan}+\p{Nd} \p{Xan}+\p{Cc} \p{Xan}+\P{Ll} \P{Xan}+\p{No} \p{Xan}+\P{Cf}/Bx,ucp
/\p{Lu}+\p{Xan} \p{Nd}+\p{Xan} \p{Cs}+\p{Xan} \P{Lt}+\p{Xan} \p{Nl}+\p{Xan} \P{Cc}+\p{Xan} \p{Lt}+\P{Xan}/Bx,ucp
/\w+\p{P} \w+\p{Po} \w+\s \p{Xan}+\s \s+\p{Xan} \s+\w/Bx,ucp
/\w+\P{P} \W+\p{Po} \w+\S \P{Xan}+\s \s+\P{Xan} \s+\W/Bx,ucp
/\w+\p{Po} \w+\p{Pc} \W+\p{Po} \W+\p{Pc} \w+\P{Po} \w+\P{Pc}/Bx,ucp
/\p{Nl}+\p{Xan} \P{Nl}+\p{Xan} \p{Nl}+\P{Xan} \P{Nl}+\P{Xan}/Bx,ucp
/\p{Xan}+\p{Nl} \P{Xan}+\p{Nl} \p{Xan}+\P{Nl} \P{Xan}+\P{Nl}/Bx,ucp
/\p{Xan}+\p{Nd} \P{Xan}+\p{Nd} \p{Xan}+\P{Nd} \P{Xan}+\P{Nd}/Bx,ucp
# End auto-possessification tests
/\w+/B,utf,ucp,auto_callout
abcd
/[\p{N}]?+/B,no_auto_possess
/[\p{L}ab]{2,3}+/B,no_auto_possess
/\D+\X \d+\X \S+\X \s+\X \W+\X \w+\X \R+\X \H+\X \h+\X \V+\X \v+\X a+\X \n+\X .+\X/Bx
/.+\X/Bsx
/\X+$/Bmx
/\X+\D \X+\d \X+\S \X+\s \X+\W \X+\w \X+. \X+\R \X+\H \X+\h \X+\V \X+\v \X+\X \X+\Z \X+\z \X+$/Bx
/\d+\s{0,5}=\s*\S?=\w{0,4}\W*/B,utf,ucp
/[RST]+/Bi,utf,ucp
/[R-T]+/Bi,utf,ucp
/[Q-U]+/Bi,utf,ucp
/^s?c/Iim,utf
scat
/\X?abc/utf,no_start_optimize
\xff\x7f\x00\x00\x03\x00\x41\xcc\x80\x41\x{300}\x61\x62\x63\x00\=no_utf_check,offset=06
/\x{100}\x{200}\K\x{300}/utf,startchar
\x{100}\x{200}\x{300}
# Test UTF characters in a substitution
/ábc/utf,replace=XሴZ
123ábc123
/(?<=abc)(|def)/g,utf,replace=<$0>
123abcáyzabcdef789abcሴqr
/[A-`]/iB,utf
abcdefghijklmno
/(?<=\K\x{17f})/g,utf,aftertext,allow_lookaround_bsk
\x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
/(?<=\K\x{17f})/altglobal,utf,aftertext,allow_lookaround_bsk
\x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
"\xa\xf<(.\pZ*\P{Xwd}+^\xa8\3'3yq.::?(?J:()\xd1+!~:3'(8?:)':(?'d'(?'d'^u]!.+.+\\A\Ah(n+?9){7}+\K;(?'X'u'(?'c'(?'z'(?<y>\xb::\xf0'|\xd3(\xae?'w(z\x8?P>l)\x8?P>a)'\H\R\xd1+!!~:3'(?:h$N{26875}\W+?\\=D{2}\x89(?i:Uy0\N({2\xa(\v\x85*){y*\A(()\p{L}+?\P{^Xan}'+?\xff\+pS\?|).{;y*\A(()\p{L}+?\8}\d?1(|)(/1){7}.+[Lp{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(\xbf(R))\x8a\X*?\x8a\xb\xd1^9\3*+(\xc1,\k'R'\xb4)\xcc(z\z(?J)(?'X'\x1b(\xb\xd1^9\?'3*+P{^Xan}+?\xff\+(\xc1.]k+\xb'Pm'\xb4)\xcc4f\xa7'\xd1V(?i:U,{2,2})'(?'X'))?-%--\x95$9*\4'|\xd1(\x9c''%\x94$9)#(?'R')3\x7?('P\xed7'\xa8\xb1^u\xeaw\1\0\0\(|(?1){7}.+[\p{Me}].\s\xdcC*^\x14?(?(<y>))(?<!^)$C((;*?(R*?))+(?(R)\x8a\X*?\x8a\xb\xd1^9\3*+|(\xc1,\k'R'\xb4)\xcc! z)\z(?JJ)(?'X';(\xb\xd1^9\?'3*+(\xc1.]k+\xb'Pm'\xb4))':(?'d')(?'RD'(d')|)|$)'|(?<x>\g{d});\g{x}\x11\g{d}\x81\|$((?'X'\'X'(?'W''\x92()'9'\x83*))\xba*\!?^ <){)':;\xcc4'\xd1'(?'X'28))?-%--\x95$9*\4'|\xd1((''e\x94*$9:)*#(?'R')3)\x7?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+0!~:(?)'d'E:yD!\s(?'R'\x1e;\x10:U))|'\x9g!\xb0*){)\\x16:;()\x1e\x10\x87*:(?<y>)\xd1+!~:(?)'}'\d'E:yD!\s(?'R'\x1e;\x10:U))|'))|)g!\xb0*R+9{29+)#(?'P'})*?pS\{3,}\x85,{0,}l{*UTF)(\xe{7}){3722,{9,}d{2,?|))|{)\(A?&d}}{\xa,}2}){3,}7,l{)22}(,}l:7{2,4}}29\x19+)#?'P'})*v?))\x5"
/$(&.+[\p{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(?(R)){0,6}?|){12\x8a\X*?\x8a\x0b\xd1^9\3*+(\xc1,\k'P'\xb4)\xcc(z\z(?JJ)(?'X'8};(\x0b\xd1^9\?'3*+(\xc1.]k+\x0b'Pm'\xb4\xcc4'\xd1'(?'X'))?-%--\x95$9*\4'|\xd1(''%\x95*$9)#(?'R')3\x07?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+!~:(?)''(d'E:yD!\s(?'R'\x1e;\x10:U))|')g!\xb0*){29+))#(?'P'})*?/
"(*UTF)(*UCP)(.UTF).+X(\V+;\^(\D|)!999}(?(?C{7(?C')\H*\S*/^\x5\xa\\xd3\x85n?(;\D*(?m).[^mH+((*UCP)(*U:F)})(?!^)(?'"
/[\pS#moq]/
=
/(*:a\x{12345}b\t(d\)c)xxx/utf,alt_verbnames,mark
cxxxz
/abcd/utf,replace=x\x{824}y\o{3333}z(\Q12\$34$$\x34\E5$$),substitute_extended
abcd
/a(\x{e0}\x{101})(\x{c0}\x{102})/utf,replace=a\u$1\U$1\E$1\l$2\L$2\Eab\U\x{e0}\x{101}\L\x{d0}\x{160}\EDone,substitute_extended
a\x{e0}\x{101}\x{c0}\x{102}
/((?<digit>\d)|(?<letter>\p{L}))/g,substitute_extended,replace=<${digit:+digit; :not digit; }${letter:+letter:not a letter}>
ab12cde
/(*UCP)(*UTF)[[:>:]]X/B
/abc/utf,replace=xyz
abc\=zero_terminate
/a[[:punct:]b]/ucp,bincode
/a[[:punct:]b]/utf,ucp,bincode
/a[b[:punct:]]/utf,ucp,bincode
/[[:^ascii:]]/utf,ucp,bincode
/[[:^ascii:]\w]/utf,ucp,bincode
/[\w[:^ascii:]]/utf,ucp,bincode
/[^[:ascii:]\W]/utf,ucp,bincode
\x{de}
\x{200}
\= Expect no match
\x{589}
\x{37e}
/[[:^ascii:]a]/utf,ucp,bincode
/L(?#(|++<!(2)?/B,utf,no_auto_possess,auto_callout
/L(?#(|++<!(2)?/B,utf,ucp,auto_callout
/(*UTF)C\x09((?<!'(?x)!*H? #\xcc\x9a[^$]/
/[\D]/utf
\x{1d7cf}
/[\D\P{Nd}]/utf
\x{1d7cf}
/[^\D]/utf
a9b
\= Expect no match
\x{1d7cf}
/[^\D\P{Nd}]/utf
a9b
\= Expect no match
\x{1d7cf}
\x{10000}
# Hex uses pattern length, not zero-terminated. This tests for overrunning
# the given length of a pattern.
/'(*UTF)'/hex
/'#('/hex,extended,utf
/a(?<=A\XB)/utf
/../utf,auto_callout
\n\x{123}\x{123}\x{123}\x{123}
# This tests processing wide characters in extended mode.
/XȀ/x,utf
# These three test a bug fix that was not clearing up after a locale setting
# when the test or a subsequent one matched a wide character.
//locale=C
/[\P{Yi}]/utf
\x{2f000}
/[\P{Yi}]/utf,locale=C
\x{2f000}
/^(?<!(?=􃡜))/B,utf
# Horizontal and vertical space lists ignore caseless
/[\HH]/Bi,utf
/[^\HH]/Bi,utf
//g,utf
\=zero_terminate
/^(?1)\p{Nd}{3}(a)/
a123a
/\p{Nd}{0,3}[\pL](*:abc)(?C1)xxx/callout_info
# ---------------------------------------------------------------------------
# A bunch of tests that hit lines of code that others do not (at least when
# these were created).
/^[^a]{3,}?x/i,utf,no_start_optimize,no_auto_possess
\= Expect no match
bbb
cc
/^[ac]{3,}?x/i,utf,no_start_optimize,no_auto_possess
\= Expect no match
aaa\x{100}
/^X\X/no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\p{L&}+?/no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\p{L}+?/no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\p{Lu}+?/no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\p{Arabic}+?/no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\p{Xan}+?/ucp,no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\s+?/ucp,no_start_optimize,no_auto_possess
\= Expect no match
X
XX
/^X\S+?/ucp,no_start_optimize,no_auto_possess
XX
\= Expect no match
X
/^X\w+?/ucp,no_start_optimize,no_auto_possess
\= Expect no match
X
/^X[^\x{b5}]+?/i,utf,no_start_optimize,no_auto_possess
\= Expect no match
X
/^X[\x{b5}]+?/i,utf,no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\p{Xuc}+?/utf,no_start_optimize,no_auto_possess
\= Expect no match
X
/^X.+?Z/s,utf,no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\R+?/utf,no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\H+?/utf,no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\V+?/utf,no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\s+?/utf,no_start_optimize,no_auto_possess
\= Expect no match
X
XX
/^X\S+?/utf,no_start_optimize,no_auto_possess
\= Expect no match
X
/^X\p{Any}{1,3}?Z/s,no_start_optimize,no_auto_possess
XYYYZ
\= Expect no match
XY
XYY
XYYY
XYYYYZ
/^X\p{L&}{1,3}?Z/s,no_start_optimize,no_auto_possess
\= Expect no match
XY
XY!
/^X\p{L}{1,3}?Z/s,no_start_optimize,no_auto_possess
\= Expect no match
XY
XY!
/^X\p{Lu}{1,3}?Z/s,no_start_optimize,no_auto_possess
\= Expect no match
XY
XY!
/^X\P{Han}{1,3}?Z/s,utf,no_start_optimize,no_auto_possess
\= Expect no match
XY
XY!
XY\x{2f00}!
/^X\p{Xan}{1,3}?Z/s,no_start_optimize,no_auto_possess
\= Expect no match
XY
XY!
/^X\p{Xsp}{1,3}?Z/s,no_start_optimize,no_auto_possess
\= Expect no match
X\n
X\n!
X\n\n!
/^X\P{Xsp}{1,3}?Z/s,no_start_optimize,no_auto_possess
\= Expect no match
XYY\n
/^X\p{Xwd}{1,3}?Z/s,no_start_optimize,no_auto_possess
\= Expect no match
XY
XY!
XYY!
/^X\x{b5}+?Z/i,utf,no_start_optimize,no_auto_possess
\= Expect no match
X
X\x{b5}
X\x{b5}\x{b5}Y
/^X\p{Xuc}+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
X
X$
X@@Y
/(*CRLF)^X.+?Z/utf,no_start_optimize,no_auto_possess
\= Expect partial match
XYY\r\=ph
\= Expect no match
X
/^X.+?Z/s,utf,no_start_optimize,no_auto_possess
\= Expect no match
X
XYY
/^X\R+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
X\nX
X\n\rX
X\n\r\nX
X\n\n
X\n\x{0c}
/(*BSR_ANYCRLF)^X\R+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
X\nX
X\n\rX
X\n\r\nX
X\n\n
X\n\x{0c}
/^X\H+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
XY\t
XYY
/^X\h+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
X\t\t
X\tY
/^X\V+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
XY\n
XYY
/^X\v+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
X\n\n
X\nY
/^X\D+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
XY9
XYY
/^X\d+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
X99
X9Y
/^X\S+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
XY\n
XYY
/^X\s+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
X\n\n
X\nY
/^X\W+?Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
X.A
X++
/^X\p{L&}{1,3}Z/no_start_optimize,no_auto_possess
\= Expect no match
XY
XY!
/^X\p{L}{1,3}Z/no_start_optimize,no_auto_possess
\= Expect no match
XY
/^X\p{Xan}{1,3}Z/no_start_optimize,no_auto_possess
\= Expect no match
XY
/^X\P{Xsp}{1,3}Z/no_start_optimize,no_auto_possess
\= Expect no match
XYY
/^X\p{Xuc}+Z/utf,no_start_optimize,no_auto_possess
\= Expect no match
X$
# ----------------------------------------------------------------------
# These test the dangerous PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL option.
/\x{d800}/B,utf,bad_escape_is_literal
/\ud800/B,utf,alt_bsux,bad_escape_is_literal
# ----------------------------------------------------------------------
/Aሴ+B/literal,utf,no_utf_check
Aሴ+B
# These are here because I upgraded to Unicode 10.0.0 before Perl did, so it
# doesn't recognize all these scripts. In time these three tests can be moved
# to test 4.
/^(\p{Adlam}+)(\p{Bhaiksuki}+)(\p{Marchen}+)(\p{Newa}+)(\p{Osage}+)
(\p{Tangut}+)(\p{Masaram_Gondi}+)(\p{Nushu}+)(\p{Soyombo}+)
(\p{Zanabazar_Square}+)/x,utf
\x{1E900}\x{1E924}\x{1E953}\x{11C00}\x{11C2D}\x{11C3E}\x{11C70}\x{11C77}\x{11CAB}\x{11400}\x{1142F}\x{11455}\x{104B0}\x{104D8}\x{104FB}\x{16FE0}\x{18800}\x{18AF2}\x{11D00}\x{11D3A}\x{11D59}\x{16FE1}\x{1B170}\x{1B2FB}\x{11A50}\x{11A58}\x{11AA2}\x{11A00}\x{11A07}\x{11A47}
/^\x{1E900}\x{104B0}/i,utf
\x{1E900}\x{104B0}
\x{1E922}\x{104D8}
/^(?:(\X)(?C))+$/utf
\x{1E900}\x{1E924}\x{1E953}\x{11C00}\x{11C2D}\x{11C3E}\x{11C70}\x{11C77}\x{11CAB}\x{11400}\x{1142F}\x{11455}\x{104B0}\x{104D8}\x{104FB}\x{16FE0}\x{18800}\x{18AF2}\x{11D00}\x{11D3A}\x{11D59}\x{16FE1}\x{1B170}\x{1B2FB}\x{11A50}\x{11A58}\x{11AA2}\x{11A00}\x{11A07}\x{11A47}\=callout_capture,callout_no_where
# Similarly for Unicode 11.0.0
/^(\p{Dogra}+)(\p{Gunjala_Gondi}+)(\p{Hanifi_Rohingya}+)(\p{Makasar}+)
(\p{Medefaidrin}+)(\p{Old_Sogdian}+)(\p{Sogdian}+)/x,utf
\x{11800}\x{11da9}\x{10d27}\x{11ee0}\x{16e48}\x{10f27}\x{10f30}
# Regional indicators
/^(\X)(\X)/utf,aftertext
\x{1F1E6}\x{1F1E7}\x{1F1E7}B
\x{1F1E6}\x{1F1E7}\x{1F1E7}\x{1F1E6}B
# More differences from Perl
/^\p{Common}/utf
\x{60c}
\x{61f}
\x{964}
\x{965}
/^\p{Inherited}/utf
\x{64b}
\x{654}
\x{655}
\x{1D1AA}
/\N{U+}/
/\N{U+}/utf
/\N{U}/
# This tests the non-UTF Unicode NEL pattern whitespace character, only
# recognized by PCRE2 with /x when there is Unicode support.
/A
<EFBFBD>B/x
AB
# This tests Unicode Pattern White Space characters in verb names when they
# are being processed with PCRE2_EXTENDED. Note: there are UTF-8 characters
# with code points greater than 255 between A, B, and C in the pattern.
/(*: ABC)abc/x,utf,mark,alt_verbnames
abc
# Script run tests: auto-possessification
/^(*sr:.*)/B,utf
paypаl.com A classic example of why script run checks are a good thing
/^(*sr:.*(*ACCEPT))/utf
paypаl.com But *ACCEPT breaks things
/^(*sr:\x{2e80}*)/B,utf
/^(*sr:\x{2e80}*)\x{2e80}/B,utf
/(?<!)(*sr:)/B
/(?<=abc(?=X(*sr:BXY)CCC)XBXYCCC)./B
abcXBXYCCC!
# Some script run patterns are broken in Perl 5.28.0. These can be moved into
# test 4 when a mended version of Perl is released.
/^(*sr:.{4})/utf
\x{0980}12\x{0993} Bengali Common-digits Bengali
\x{0780}12\x{07b1} Thaana Common-digits Thaana
\x{0e01}12\x{0e5b} Thai Common-digits Thai
\x{1780}12\x{19ff} Khmer Common-digits Khmer
\x{0904}12\x{0939} Devanagari Common-digits Devanagari
A\x{ff10}\x{ff19}B Latin Common-notascii-digits Latin
A\x{1d7ce}\x{1d7cf}B Latin fancy-common-digits Latin
# These ones involve non-ASCII but nevertheless Common digits. As of October
# 2018 even blead Perl wasn't handling all of these - but is going to.
/^(*sr:.{4})/utf
A\x{ff10}\x{ff19}B Latin Common-notascii-digits Latin
\x{ff10}\x{ff19}.. Common-notascii-digits Common Common
A\x{ff10}BC Latin Common-notascii-digit Latin Latin
A\x{1d7ce}\x{1d7cf}B Latin fancy-common-digits Latin
\x{1d7ce}\x{1d7cf},, fancy-common-digits Common Common
A\x{1d7ce}BC Latin fancy-common-digit Latin Latin
# Some Unicode 12.1.0 new script characters
/\p{Elymaic}\p{Nandinagari}\p{Nyiakeng_Puachue_Hmong}\p{Wancho}/utf
\x{10fe5}\x{119AC}\x{1E10E}\x{1E2D1}
# Some Unicode 13.0.0 new script characters
/\p{Chorasmian}\p{Dives_Akuru}\p{Khitan_Small_Script}\p{Yezidi}/utf
\x{10FB0}\x{11900}\x{18B00}\x{10E80}
# -------
# Test reference and errors in non-ASCII characters in group names
/(?'𑠅ABC'...)/I,utf
abcde\=copy=𑠅ABC
# Bad ones
/(?'AB၌C'...)\g{AB၌C}/utf
/(?'٠ABC'...)/utf
/(?'²ABC'...)/utf
/(?'X²ABC'...)/utf
# -------
/\p{Any}*xyz/I
/(|<7C>)7/caseless,ucp
/(\xc1)\1/i,ucp
\xc1\xe1\=no_jit
/\p{L&}+\p{bidi_control}/B
/\p{bidi_control}+\p{L&}/B
/\p{han}/B
/\p{script:han}/B
/\p{sc:han}/B
/\p{script extensions:han}/B
/\p{scx:han}/B
# Test error - invalid script name
/\p{sc:L}/
# Some Boolean property tests that differ from Perl
/\p{emojimodifierbase}\p{ebase}/g,utf
>AN<>\x{261d}\x{1faf6}<>yz<
/\p{graphemelink}\p{grlink}/g,utf
>AN<>\x{11d97}\x{94d}<>yz<
/\p{soft dotted}\p{sd}/g,utf
>AF23<>\x{1df1a}\x{69}<>yz<
# ------------------------------------------------
/\p{\2b[:x<>igi:t:_/
# Tests for PCRE2_EXTRA_CASELESS_RESTRICT. Compare each test with and without
# the restriction.
/AskZ/i,utf,caseless_restrict
AskZ
aSKz
\= Expect no match
A\x{17f}kZ
As\x{212a}Z
/AskZ/i,utf
AskZ
aSKz
A\x{17f}kZ
As\x{212a}Z
/A\x{17f}\x{212a}Z/ir,utf
\= Expect no match
AskZ
/A\x{17f}\x{212a}Z/i,utf
AskZ
/[AskZ]+/i,utf,caseless_restrict
AskZ
aSKz
A\x{17f}kZ
As\x{212a}Z
/[AskZ]+/i,utf
AskZ
aSKz
A\x{17f}kZ
As\x{212a}Z
/[\x{17f}\x{212a}]+/ir,utf
\= Expect no match
AskZ
/[\x{17f}\x{212a}]+/i,utf
AskZ
/[^s]+/ir,utf
A\x{17f}Z
/[^s]+/i,utf
A\x{17f}Z
/[^k]+/ir,utf
A\x{212a}Z
/[^k]+/i,utf
A\x{212a}Z
/[^sk]+/ir,utf
A\x{17f}\x{212a}Z
/[^sk]+/i,utf
A\x{17f}\x{212a}Z
/[^\x{17f}]+/ir,utf
AsSZ
/[^\x{17f}]+/i,utf
AsSZ
/[Ss]+/irB,utf
Sss\x{17f}ss
/[Ss]+/iB,utf
Sss\x{17f}ss
/[S\x{17f}]/irB,utf
/[S\x{17f}]/iB,utf
/[\x{17f}s]/irB,utf
/[\x{17f}s]/iB,utf
/[\x{4b}\x{6b}]/irB,utf
/[\x{4b}\x{6b}]/iB,utf
/s(?r)s(?-r)s(?r:s)s/i,utf
\x{17f}S\x{17f}S\x{17f}
\= Expect no match
\x{17f}\x{17f}\x{17f}S\x{17f}
\x{17f}S\x{17f}\x{17f}\x{17f}
/k(?^i)k/ir,utf
K\x{212a}
\= Expect no match
\x{212a}\x{212a}
/[sk](?r:[sk])[sk]/Bi,utf
SKS
sks
\x{212a}S\x{17f}
\x{17f}K\x{212a}
\= Expect no match
s\x{212a}s
K\x{17f}K
/(.) \1/i,utf,caseless_restrict
s S
k K
\= Expect no match
s \x{17f}
k \x{212a}
/(.) (?r:\1)/i,utf
s S
k K
\= Expect no match
s \x{17f}
k \x{212a}
/(.) \1/i,utf
s S
k K
s \x{17f}
k \x{212a}
/(?:(?<A>ss)|(?<A>kk)) \k<A>/i,utf,dupnames,caseless_restrict
sS Ss
kK Kk
\= Expect no match
sS \x{17f}s
kK \x{212a}k
/(?:(?<A>ss)|(?<A>kk)) \k<A>/i,utf,dupnames
sS Ss
kK Kk
sS \x{17f}s
kK \x{212a}k
/(?:(?<A>s)|(?<A>k)) \k<A>{3,}!/i,utf,dupnames,caseless_restrict
s SsSs!
k KkKk!
\= Expect no match
s \x{17f}sSs\x{17f}!
k \x{212a}kKk\x{212a}!
/(?:(?<A>s)|(?<A>k)) \k<A>{3,}!/i,utf,dupnames
s SsSs!
k KkKk!
s \x{17f}sSs\x{17f}!
k \x{212a}kKk\x{212a}!
# End caseless restrict tests
# TESTS for PCRE2_EXTRA_TURKISH_CASING - again, tests with and without.
/i/i,utf
i
I
\= Expect no match
\x{0130}
\x{0131}
/i/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/I/i,utf
i
I
\= Expect no match
\x{0130}
\x{0131}
/I/i,utf,turkish_casing
I
\x{0131}
\= Expect no match
i
\x{0130}
/\x{0130}/i,utf
\x{0130}
\= Expect no match
i
I
\x{0131}
/\x{0130}/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/\x{0131}/i,utf
\x{0131}
\= Expect no match
i
I
\x{0130}
/\x{0131}/i,utf,turkish_casing
I
\x{0131}
\= Expect no match
i
\x{0130}
/[i]/i,utf
i
I
\= Expect no match
\x{0130}
\x{0131}
/[i]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/[^i]/i,utf
\x{0130}
\x{0131}
\= Expect no match
i
I
/[^i]/i,utf,turkish_casing
I
\x{0131}
\= Expect no match
i
\x{0130}
/[\x{0130}]/i,utf
\x{0130}
\= Expect no match
i
I
\x{0131}
/[\x{0130}]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/[\x{0120}-\x{0130}]/i,utf
\x{0130}
\= Expect no match
i
I
\x{0131}
/[\x{0120}-\x{0130}]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/[zi]/i,utf
i
I
\= Expect no match
\x{0130}
\x{0131}
/[zi]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/[z\x{0130}]/i,utf
\x{0130}
\= Expect no match
i
I
\x{0131}
/[z\x{0130}]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/[iI]/i,utf
i
I
\= Expect no match
\x{0130}
\x{0131}
/[iI]/i,utf,turkish_casing
i
I
\x{0130}
\x{0131}
/[i\x{0130}]/i,utf
i
I
\x{0130}
\= Expect no match
\x{0131}
/[i\x{0130}]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/(.) \1/i,utf
i I
\= Expect no match
i \x{0130}
\x{0131} I
/(*TURKISH_CASING)(.) \1/i,utf
i \x{0130}
\x{0131} I
\= Expect no match
i I
/(.) \1/i,utf,turkish_casing
i \x{0130}
\x{0131} I
\= Expect no match
i I
/i/i,utf,caseless_restrict,turkish_casing
/i/i,turkish_casing
/i/i,utf,caseless_restrict
i
/i/i,ucp,caseless_restrict
i
/b(?r:[\x{00FF}-\x{FFEE}])/i,utf,turkish_casing
b\x{0130}
b\x{0131}
\= Expect no match
bi
bI
bk
/[\x60-\x7f]/i,ucp
i
I
/[\x60-\xc0]/i,ucp
i
I
/[\x80-\xc0]/i,ucp
\= Expect no match
i
I
# End Turkish casing tests
# TESTS for PCRE2_EXTRA_ASCII_xxx - again, tests with and without.
# DIGITS
/\d+/i,utf
123\x{660}456
/\d+/i,utf,ucp
123\x{660}456
/\d+/i,utf,ucp,ascii_bsd
123\x{660}456
/[\d]+/i,utf
123\x{660}456
/[\d]+/i,utf,ucp
123\x{660}456
/[\d]+/i,utf,ucp,ascii_bsd
123\x{660}456
/\d(?aD)\d(?-aD)\d/utf,ucp
\x{660}9\x{660}
\= Expect no match
\x{660}\x{660}\x{660}
/\d(?-aD)\d(?aD)\d/utf,ucp,ascii_bsd
999
9\x{660}9
/\d(?a)\d(?-a)\d/utf,ucp
\x{660}9\x{660}
\= Expect no match
\x{660}\x{660}\x{660}
/\d(?-aD)\d(?aD)\d/utf,ucp,ascii_bsd
999
9\x{660}9
# SPACES
/>\s+</i,utf
> <
\= Expect no match
>\x{a0} <
/>\s+</i,utf,ucp
> <
>\x{a0} <
/>\s+</i,utf,ucp,ascii_bss
> <
\= Expect no match
>\x{a0} <
/>[\s]+</i,utf
> <
\= Expect no match
>\x{a0} <
/>[\s]+</i,utf,ucp
> <
>\x{a0} <
/>[\s]+</i,utf,ucp,ascii_bss
> <
\= Expect no match
>\x{a0} <
/>\s(?aS)\s(?-aS)\s</utf,ucp
>\x{a0} \x{a0}<
\= Expect no match
>\x{a0}\x{a0}\x{a0}<
/>\s(?a)\s(?-a)\s</utf,ucp
>\x{a0} \x{a0}<
\= Expect no match
>\x{a0}\x{a0}\x{a0}<
# WORDS
/\w+/i,utf
123\x{660}abc
/\w+/i,utf,ucp
123\x{660}abc
/\w+/i,utf,ucp,ascii_bsw
123\x{660}abc
/[\w]+/i,utf
123\x{660}abc
/[\w]+/i,utf,ucp
123\x{660}abc
/[\w]+/i,utf,ucp,ascii_bsw
123\x{660}abc
/\w(?aW)\w(?-aW)\w/utf,ucp
\x{660}A\x{c0}
\= Expect no match
\x{660}\x{c0}\x{c0}
/\w(?a)\w(?-a)\w/utf,ucp
\x{660}A\x{c0}
\= Expect no match
\x{660}\x{c0}\x{c0}
# WORD BOUNDARY
/\bABC\b/utf
\x{c0}ABC\x{d0}
/\bABC\b/utf,ucp
\= Expect no match
\x{c0}ABC\x{d0}
/\bABC\b/utf,ucp,ascii_bsw
\x{c0}ABC\x{d0}
/\bABC\b/utf,ucp,ascii_all
\x{c0}ABC\x{d0}
# POSIX
/^[[:digit:]]+$/utf,ucp
123456
123\x{660}456
/^[[:digit:]]+$/utf,ucp,ascii_digit
123456
\= Expect no match
123\x{660}456
/[[:digit:]]+/g,utf,ucp,ascii_digit
123\x{660}456
/(?-aT)[[:digit:]](?aT)[[:digit:]]/utf,ucp,ascii_digit
11
\x{ff11}1
\= Expect no match
1\x{ff11}
/(?-aT:[[:digit:]])[[:digit:]]/utf,ucp,ascii_digit
11
\x{ff11}1
\= Expect no match
1\x{ff11}
/(?-aT:[[:digit:]])[[:digit:]]/utf,never_ucp,ascii_digit
11
\= Expect no match
\x{ff11}1
1\x{ff11}
/[[:digit:]]+/utf,ucp,ascii_posix
123\x{660}456
/(?-aP)[[:digit:]](?aP)[[:digit:]]/utf,ucp,ascii_posix
11
\x{ff11}1
\= Expect no match
1\x{ff11}
/(?-aP:[[:digit:]])[[:digit:]]/utf,ucp,ascii_posix
11
\x{ff11}1
\= Expect no match
1\x{ff11}
/(?-a:[[:digit:]])[[:digit:]]/a,utf,ucp
11
\x{ff11}1
\= Expect no match
1\x{ff11}
/^[[:xdigit:]]+$/utf,ucp
f0
1A
d\x{ff10}
\x{ff26}8
\= Expect no match
8g\=no_jit
/^[[:xdigit:]]+$/utf,ucp,ascii_digit
f0
1A
\= Expect no match
d\x{ff10}
\x{ff26}8
8g
/>[[:space:]]+</utf,ucp
>\x{a0} \x{a0}<
>\x{a0}\x{a0}\x{a0}<
/>[[:space:]]+</utf,ucp,ascii_posix
\= Expect no match
>\x{a0} \x{a0}<
/(?aP)[[:alnum:]]+/i,ucp,utf
abcáxyz
abc\x{660}xyz
/(?aP)[[:alnum:]\d]+/i,ucp,utf
abc\x{660}xyz
/(*UCP)(*UTF)[[:alnum:]](?aP:[[:alnum:]])[[:alnum:]]/
\x{660}A\x{660}
\= Expect no match
\x{660}\x{660}\x{660}
# VARIOUS
/[\d\s\w]+/a,ucp,utf
9 A\x{660}À
9 AÀ\x{660}
# End PCRE2_EXTRA_ASCII_xxx tests
/(?<!(|l ))/utf
(?<!(|l ))
/\p{ Aሴ}/utf
/\p{BC: Aሴ}/utf
/\p{BC: A=}/utf
/abc/utf,substitute_extended,replace=>\777<
abc
/a(?<namED_1>b)c/utf,substitute_extended
abc\=replace=>${namED_1}<
/a(?<namedverylongbutperfectlylegalsoyoushouldnthaveaproblem_1>b)c/utf,substitute_extended
abc\=replace=>${namedverylongbutperfectlylegalsoyoushouldnthaveaproblem_1}<
/a(?<nämed>b)c/utf,substitute_extended
abc\=replace=>${nämed}<
/a(?<nämedverylongbutperfectlylegalsoyoushouldnthaveaproblem_٢>b)c/utf,substitute_extended
abc\=replace=>${nämedverylongbutperfectlylegalsoyoushouldnthaveaproblem_٢}<
# python_octal
/\400/utf
\o{400}
/\400/utf,python_octal
/abc/utf,substitute_extended
abc\=replace=\400
/abc/utf,substitute_extended,python_octal
abc\=replace=\400
# Character range merging tests
/[\x{1200}\s\x{1202}\d\x{1201}]+/B,utf,ucp
\x{11ff}\x{1200}\x{1201}\x{1202}\x{1203}
/[\x{2000}-\x{2500}\x{2100}-\x{2600}\d\x{1800}-\x{1fff}]+/B,utf,ucp
\x{17ff}\x{1800}\x{2600}\x{2601}
/[\x{10008}\x{10003}\x{10006}\x{10004}\x{10007}]+/B,utf
\x{10002}\x{10005}\x{10003}\x{10004}\x{10006}\x{10007}\x{10008}\x{10009}
/[\x{100}-\x{400}]+/Bi,utf
qS\x{ff}\x{100}\x{a7c5}\x{401}
\x{2c63}\x{2c64}\x{2c65}\x{2c66}\x{2c67}
\x{a7af}\x{a7b0}\x{a7b1}\x{a7b2}\x{a7b3}
/[\x{100}-\x{400}\p{Ll}\x{500}-\x{700}\p{OldHungarian}\x{701}\p{bidiLRI}]/B,utf
/[\pC\x{100}-\x{200}\h\pN]/B,utf
/[\pC\x{100}-\x{200}\v\pN]/B,utf
/[\pC\x{100}-\x{200}\H\pN]/B,utf
/[\pC\x{100}-\x{200}\V\pN]/B,utf
/[\x{16e49}-\x{16e4f}\x{20000}\x{16e40}-\x{16e48}\pN]/Bi,utf
/[\x80-\x{4000}\x90\x{400}-\x{f000}\xa0\x{4000}-\x{10ffff}]++/B,utf
\x{7f}\x{80}\x{100}\x{10fffe}\x{10ffff}\x00
/[\x80-\x{4000}\x90\x{400}-\x{f000}\xa0\pN\x{4000}-\x{10ffff}]++/B,utf
\x{7f}\x{80}\x{100}090\x{10fffe}\x{10ffff}\x00
/[\x00-\x{4000}\x{2000}-\x{10ffff}]++/B,utf
abcd
/[abc\p{Any}]{5,7}/B,utf
xyz
/[^\p{Any}\x34\p{Any}]*cat/B,utf
cat
/[\pN\xf0-\x{10ffff}]{5,8}/B,utf
ab0123456cd
/[\x00-\x{398}\x{39a}-\x{10ffff}]*#(?i)[\x00-\x{398}\x{39a}-\x{10ffff}]*?#/B,utf
abcd#efg#
# Freeing memory on error test
/[\x{100}-\x{400}][\x{100}-\x{300}][\x{100}-\x{200}]\8/i,utf
# Character list tests
/[\x{100}-\x{7fff}\x{d7b0}\x{d7b1}\x{d7b3}\x{d7b4}\x{d7b6}\x{d7b7}\x{d7b9}\x{d7ba}]{12}/B,utf
\x{8000}\x{d7af}\x{d7b2}\x{d7b5}\x{d7b8}\x{d7bb}\x{100}\x{800}\x{7000}\x{7fff}\x{d7b0}\x{d7b1}\x{d7b3}\x{d7b4}\x{d7b6}\x{d7b7}\x{d7b9}\x{d7ba}\x{100}
/([\x{6535}\x{6536}\x{6538}\x{6539}\x{653b}\x{653c}\x{653e}\x{653f}\x{6541}\x{6542}\x{8000}-\x{ffff}]#)+/B,utf
\x{6534}#\x{6537}#\x{653a}#\x{653d}#\x{6540}#\x{6543}#\x{7fff}#\x{6535}#\x{6536}#\x{6538}#\x{6539}#\x{653b}#\x{653c}#\x{653e}#\x{653f}#\x{6541}#\x{6542}#\x{8000}#\x{c246}#\x{ffff}
/[[:xdigit:]\x{400}-\x{600}]+/utf,ucp
!a0\x{400}\x{600}9\x{3ff}
/[^[:xdigit:]\x{400}-\x{600}]+/utf,ucp
\x{400}(\x{3ff}\x{601})\x{600}
/[[:xdigit:]\x{400}-\x{600}\x{700}]+/utf,ucp
!A0\x{700}9\x{601}
/[^[:xdigit:]\x{400}-\x{600}\x{700}]+/utf,ucp
\x{600}(\x{6ff}\x{701}\x{3ff}\x{601})\x{700}
/[[:xdigit:]\x{400}-\x{600}\x{700}-\x{800}\x{900}]+/utf,ucp
!f0\x{800}\x{600}9\x{601}
/[^[:xdigit:]\x{400}-\x{600}\x{700}-\x{800}\x{900}]+/utf,ucp
\x{700}[\x{3ff}\x{601}\x{6ff}\x{801}\x{8ff}\x{901}]\x{900}
/[[:xdigit:]\x{400}-\x{410}\x{500}\x{600}-\x{610}\x{700}\x{800}-\x{810}]+/utf,ucp
!F0\x{400}\x{410}\x{500}\x{600}\x{610}\x{700}\x{800}\x{810}9\x{7ff}
/[^[:xdigit:]\x{400}-\x{410}\x{500}\x{600}-\x{610}\x{700}\x{800}-\x{810}]+/utf,ucp
\x{800}<\x{3ff}\x{411}\x{4ff}\x{501}\x{5ff}\x{611}\x{6ff}\x{701}\x{7ff}\x{811}>\x{810}
# --------------
# EXTENDED CHARACTER CLASSES (UTS#18)
/[\p{Lu}[\p{Nd}]]/B,alt_extended_class
0
C
\= Expect no match
[
a
/[[\pL][\p{Nd}]]/B,alt_extended_class
0
a
\= Expect no match
[
]
/[[\p{Lu}]||[\p{Nd}]]/B,alt_extended_class
A
1
\= Expect no match
a
/[[^\pL][\p{Nd}]]/B,alt_extended_class
0
.
\= Expect no match
A
/[^[\pL][\p{Nd}]]/B,alt_extended_class
.
\= Expect no match
A
0
/[^[\pL]&&[\p{Nd}]]/B,alt_extended_class
A
0
/[[\p{Lu}\p{Ll}]||[\p{Nd}\p{Ll}]]/B,alt_extended_class
A
1
c
\= Expect no match
_
/[[\p{Lu}\p{Ll}]&&[\p{Nd}\p{Ll}]]/B,alt_extended_class
c
\= Expect no match
A
1
_
/[[\p{Lu}\p{Ll}]--[\p{Nd}\p{Ll}]]/B,alt_extended_class
A
\= Expect no match
1
c
_
/[[\p{Lu}\p{Ll}]~~[\p{Nd}\p{Ll}]]/B,alt_extended_class
A
1
\= Expect no match
c
_
/[\pL[]]]/B,alt_extended_class
A
]
\= Expect no match
[
/[\pL[^]]]/B,alt_extended_class
A
[
0
\= Expect no match
]
/[\pL[]]/B,alt_extended_class,allow_empty_class
A
\= Expect no match
]
[
/[\pL[^]]/B,alt_extended_class,allow_empty_class
A
0
[
]
/[\dAC-E[:space:]\p{Lu}&&[^z]]/B,alt_extended_class
0
A
C
D
E
\t
\= Expect no match
a
;
/[z||[^\dAC-E[:space:]\p{Lu}]]/B,alt_extended_class
z
;
\= Expect no match
0
A
C
D
E
B
F
\t
/[\p{Lu}\p{Nd}||cd]/B,alt_extended_class
A
0
c
\= Expect no match
e
/[[\p{Lu}]\p{Nd}||[c]d]/B,alt_extended_class
A
0
c
\= Expect no match
e
/[\p{Lu}[\p{Nd}]||c[d]]/B,alt_extended_class
A
0
c
\= Expect no match
e
/[\p{Lu}-]/B,alt_extended_class
A
-
\= Expect no match
a
/[-\p{Lu}]/B,alt_extended_class
A
-
\= Expect no match
a
/[\pL-]/B,alt_extended_class
A
-
\= Expect no match
0
/[-\pL]/B,alt_extended_class
A
-
\= Expect no match
0
/[\p{Lu}-]/B
A
-
\= Expect no match
a
/[-\p{Lu}]/B
A
-
\= Expect no match
a
/[\pL-]/B
A
-
\= Expect no match
0
/[-\pL]/B
A
-
\= Expect no match
0
/[\p{Lu}-z]/B,alt_extended_class
/[z-\p{Lu}]/B,alt_extended_class
/[\pL-z]/B,alt_extended_class
/[z-\pL]/B,alt_extended_class
/[\p{Lu}-&&-\pL]/B,alt_extended_class
-
A
\= Expect no match
a
/[-\p{Lu}&&\pL-]/B,alt_extended_class
-
A
\= Expect no match
a
/[[\p{Lu}]-&&-[\pL]]/B,alt_extended_class
-
A
\= Expect no match
a
/[-[\p{Lu}]&&[\pL]-]/B,alt_extended_class
-
A
\= Expect no match
a
/(?xx:[ ^ 5[ ^ \p{Nd}] ])/B,alt_extended_class
4
\= Expect no match
a
;
5
/(?xx:[ ^ \p{Nd}[ ^ 5] ])/B,alt_extended_class
\= Expect no match
a
;
4
5
/(?xx:[ ^ \p{Nd}[ ^ \p{Nd}] ])/B,alt_extended_class
\= Expect no match
a
;
4
5
/[ ^ \p{Ll}[ ^ \p{Nd}] ]/B,alt_extended_class
\x20
^
a
0
\= Expect no match
A
;
/[a-c--\p{Nd}]+/B,alt_extended_class
ac
a
\= Expect no match
0
/[a-c--\p{Nd}]{2,3}/B,alt_extended_class
ac
cac
\= Expect no match
a
00
/x[a-c--\p{Nd}]+y/B,alt_extended_class
xacy
xaay
xay
\= Expect no match
zacy
xacz
xy
x0y
/[\pL--\pL--\pL]/B,alt_extended_class
\= Expect no match
A
1
/[[\pL--\pL]--\pL]/B,alt_extended_class
\= Expect no match
A
1
/[\pL--[\pL--\pL]]/B,alt_extended_class
A
\= Expect no match
1
/[\pL--^\p{Nd}]/B,alt_extended_class
A
\= Expect no match
1
^
/([a-z--[\pL&&n]])\1/B,alt_extended_class
aa
zz
\= Expect no match
az
nn
/(x[a-z--[\pL&&n]]y)\1/B,alt_extended_class
xayxay
xzyxzy
\= Expect no match
xnyxny
/(?:_\1|([a-z--[\pL&&n]])){2}/B,alt_extended_class
a_a
z_z
\= Expect no match
a_z
n_n
/(?:_\1|([a-z--[\pL&&n]]))+/B,alt_extended_class
a_a
z_z
a_partial
\= Expect no match
n_n
/[\p{Nd}||[\pL--\p{Lu}]]/B,alt_extended_class
a
0
\= Expect no match
C
/[\P{Nd}||2]/B,alt_extended_class
_
Z
2
\= Expect no match
1
3
/[^[\P{Nd}]]/B,alt_extended_class
1
2
\= Expect no match
_
z
# caseless tests
/[\p{Lu}~~\p{Ll}]/B,alt_extended_class
a
A
\= Expect no match
_
1
/[[\p{Lu}1]~~\p{Ll}]/iB,alt_extended_class
1
\= Expect no match
a
A
_
/[[\p{Lu}1]&&[\p{Ll}1]]/B,alt_extended_class
1
\= Expect no match
a
A
_
2
/[[\p{Lu}1]&&[\p{Ll}1]]/iB,alt_extended_class
a
A
1
\= Expect no match
_
2
\
/[\p{Thai}&&\p{Nd}]/B,utf,alt_extended_class
\x{0e51}
\= Expect no match
0
a
\x{0e01}
/[\p{Thai}||\p{Nd}]/B,utf,alt_extended_class
\x{0e51}
\x{0e01}
0
\= Expect no match
a
/[\p{Thai}~~\p{Nd}]/B,utf,alt_extended_class
\x{0e01}
0
\= Expect no match
\x{0e51}
a
/[[\p{Thai}&&\p{Nd}]~~[^a]]/B,utf,alt_extended_class
\x{0e01}
b
0
\= Expect no match
a
\x{0e51}
/^[\p{Thai}&&\p{Nd}]?$/B,utf,alt_extended_class
\x{0e51}
\
\= Expect no match
a
/^[\p{Thai}&&\p{Nd}]??$/B,utf,alt_extended_class
\x{0e51}
\
\= Expect no match
a
/^[\p{Thai}&&\p{Nd}]?+$/B,utf,alt_extended_class
\x{0e51}
\
\= Expect no match
a
/^[\p{Thai}&&\p{Nd}]{3}$/B,utf,alt_extended_class
\x{0e51}\x{0e51}\x{0e51}
\= Expect no match
\x{0e51}
\
a
/^[\p{Thai}&&\p{Nd}]{3,}$/B,utf,alt_extended_class
\x{0e51}\x{0e51}\x{0e51}\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\= Expect no match
\x{0e51}
\
a
/^[\p{Thai}&&\p{Nd}]{3,}?$/B,utf,alt_extended_class
\x{0e51}\x{0e51}\x{0e51}\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\= Expect no match
\x{0e51}
\
a
/^[\p{Thai}&&\p{Nd}]{3,}+$/B,utf,alt_extended_class
\x{0e51}\x{0e51}\x{0e51}\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\= Expect no match
\x{0e51}
\
a
/^[\p{Thai}&&\p{Nd}]{,3}$/B,utf,alt_extended_class
\
\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\= Expect no match
\x{0e51}\x{0e51}\x{0e51}\x{0e51}
a
/^[\p{Thai}&&\p{Nd}]{,3}?$/B,utf,alt_extended_class
\
\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\= Expect no match
\x{0e51}\x{0e51}\x{0e51}\x{0e51}
a
/^[\p{Thai}&&\p{Nd}]{,3}+$/B,utf,alt_extended_class
\
\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\= Expect no match
\x{0e51}\x{0e51}\x{0e51}\x{0e51}
a
/^[\p{Thai}&&\p{Nd}]+\x{0e51}$/B,utf,alt_extended_class
\x{0e51}\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\= Expect no match
\x{0e51}
\
a
/^[\p{Thai}&&\p{Nd}]+?\x{0e51}$/B,utf,alt_extended_class
\x{0e51}\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\= Expect no match
\x{0e51}
\
a
/^[\p{Thai}&&\p{Nd}]++\x{0e51}$/B,utf,alt_extended_class
\= Expect no match
\x{0e51}
\x{0e51}\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\
a
/^[\p{Thai}&&\p{Nd}]*\x{0e51}$/B,utf,alt_extended_class
\x{0e51}
\x{0e51}\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\= Expect no match
\
a
/^[\p{Thai}&&\p{Nd}]*?\x{0e51}$/B,utf,alt_extended_class
\x{0e51}
\x{0e51}\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\= Expect no match
\
a
/^[\p{Thai}&&\p{Nd}]*+\x{0e51}$/B,utf,alt_extended_class
\= Expect no match
\x{0e51}
\x{0e51}\x{0e51}
\x{0e51}\x{0e51}\x{0e51}
\
a
/[^[^\p{Thai}]]/B,utf,alt_extended_class
\x{0e51}
\= Expect no match
a
/[^[^\p{L}]]/B,utf,alt_extended_class
\x{0e01}
a
\= Expect no match
0
/[\pL&&[^\x00-\xFF]]/B,utf,alt_extended_class
\x{21e}
\= Expect no match
a
/[\pL&&\x{100}-\x{1000}]{3,6}+/utf,alt_extended_class
\x{145}\x{18b}A\x{145}\x{18b}\x{1C2}\x{21a}\x{257}\x{2ae}\x{0145}\x{18b}
\x{145}A\x{145}\x{18b}\x{1C2}B
/[\pL&&\x{100}-\x{1000}]{3,6}\x{2A3}/utf,alt_extended_class
\x{145}\x{18b}\x{2a3}A\x{145}\x{18b}\x{1c2}\x{21a}\x{257}\x{2ae}\x{2a3}
\x{145}\x{2a3}A\x{145}\x{18b}\x{1c2}\x{2a3}
\x{2a3}A\x{145}\x{18b}\x{1c2}\x{2a3}\x{2a3}
\x{0145}\x{18b}\x{2a3}A\x{145}\x{18b}\x{1c2}\x{21a}\x{257}\x{2ae}\x{145}\x{2a3}
/[\pL&&\x{100}-\x{1000}]{3,6}?\x{2A3}/utf,alt_extended_class
\x{145}\x{18b}\x{2a3}A\x{145}\x{18b}\x{1c2}\x{21a}\x{257}\x{2ae}\x{2a3}
\x{145}\x{2a3}A\x{145}\x{18b}\x{1c2}\x{2a3}
\x{2a3}A\x{145}\x{18b}\x{1c2}\x{2a3}\x{2a3}
\x{0145}\x{18b}\x{2a3}A\x{145}\x{18b}\x{1c2}\x{21a}\x{257}\x{2ae}\x{145}\x{2a3}
/[\P{scx=Beng}\P{scx=Deva}\pM--[\x{2000}-\x{3000}]]+/utf,alt_extended_class
\x{964}\x{2000}\x{3000}A\x{951}\x{1fff}\x{3001}\x{965}
/[\p{Thai}~~[^]]/B,utf,alt_extended_class,allow_empty_class
\x{0d01}
a
\= Expect no match
\x{0e01}
/[[]~~[^]]/B,utf,alt_extended_class,allow_empty_class
\x{0d01}
a
/[[^]~~[]]/B,utf,alt_extended_class,allow_empty_class
\x{0d01}
a
/[[^]~~[^]]/B,utf,alt_extended_class,allow_empty_class
\= Expect no match
\x{0d01}
a
/[[^]||\pL]/B,utf,alt_extended_class,allow_empty_class
0
a
/[\pL||[^]]/B,utf,alt_extended_class,allow_empty_class
0
a
/[\pL~~[^]]/B,utf,alt_extended_class,allow_empty_class
0
\= Expect no match
a
/[[^]~~\pL]/B,utf,alt_extended_class,allow_empty_class
0
\= Expect no match
a
/([\p{Lu}&&\p{sc=Hung}]+?\x{10c81})+#/utf,alt_extended_class
\x{10c80}\x{10cb2}\x{10c81}\x{10c85}\x{10cb0}\x{10cf2}\x{10c81}#\x{10c80}\x{10cb2}\x{10c81}\x{10c85}\x{10cb0}\x{10c81}##
/[[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]
&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]&&[\pN--[\pC||\x{9F5}]]]/utf,alt_extended_class
# --------------
/^([\h\x{9000}\x{9002}\x{9004}][\v\x{9000}\x{9002}\x{9004}\x{9006}\x{9008}][\h\v\x{9000}],){4}$/B,utf
\x09\x0a\x0d,\x{1680}\x{2028}\x{1680},\x{180e}\x{2029}\x{180e},\x{9000}\x{9000}\x{9000},
/[z-\p{Lu}]/
/[z-\pL]/
/[\p{Lu}-z]/
/[\pL-z]/
/[a\x{e1}]/iB
a
A
\x{e1}
/[a\x{e1}]/iB,utf
a
A
\x{e1}
\x{c1}
/[a\x{e1}]/iB,ucp
a
A
\x{e1}
\x{c1}
/[a\x{e1}]/iB,ucp,utf
a
A
\x{e1}

5203
3rd/pcre2/testdata/testinput6 vendored Normal file

File diff suppressed because it is too large

2745
3rd/pcre2/testdata/testinput7 vendored Normal file

@@ -0,0 +1,2745 @@
# This set of tests checks UTF and Unicode property support with the DFA
# matching functionality of pcre2_dfa_match(). A default subject modifier is
# used to force DFA matching for all tests.
#subject dfa
#newline_default LF any anyCRLF
/\x{100}ab/utf
\x{100}ab
/a\x{100}*b/utf
ab
a\x{100}b
a\x{100}\x{100}b
/a\x{100}+b/utf
a\x{100}b
a\x{100}\x{100}b
\= Expect no match
ab
/\bX/utf
Xoanon
+Xoanon
\x{300}Xoanon
\= Expect no match
YXoanon
/\BX/utf
YXoanon
\= Expect no match
Xoanon
+Xoanon
\x{300}Xoanon
/X\b/utf
X+oanon
ZX\x{300}oanon
FAX
\= Expect no match
Xoanon
/X\B/utf
Xoanon
\= Expect no match
X+oanon
ZX\x{300}oanon
FAX
/[^a]/utf
abcd
a\x{100}
/^[abc\x{123}\x{400}-\x{402}]{2,3}\d/utf
ab99
\x{123}\x{123}45
\x{400}\x{401}\x{402}6
\= Expect no match
d99
\x{123}\x{122}4
\x{400}\x{403}6
\x{400}\x{401}\x{402}\x{402}6
/a.b/utf
acb
a\x7fb
a\x{100}b
\= Expect no match
a\nb
/a(.{3})b/utf
a\x{4000}xyb
a\x{4000}\x7fyb
a\x{4000}\x{100}yb
\= Expect no match
a\x{4000}b
ac\ncb
/a(.*?)(.)/
a\xc0\x88b
/a(.*?)(.)/utf
a\x{100}b
/a(.*)(.)/
a\xc0\x88b
/a(.*)(.)/utf
a\x{100}b
/a(.)(.)/
a\xc0\x92bcd
/a(.)(.)/utf
a\x{240}bcd
/a(.?)(.)/
a\xc0\x92bcd
/a(.?)(.)/utf
a\x{240}bcd
/a(.??)(.)/
a\xc0\x92bcd
/a(.??)(.)/utf
a\x{240}bcd
/a(.{3})b/utf
a\x{1234}xyb
a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
\= Expect no match
a\x{1234}b
ac\ncb
/a(.{3,})b/utf
a\x{1234}xyb
a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
a\x{1234}\x{4321}\x{3412}\x{3421}b
\= Expect no match
a\x{1234}b
/a(.{3,}?)b/utf
a\x{1234}xyb
a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
a\x{1234}\x{4321}\x{3412}\x{3421}b
\= Expect no match
a\x{1234}b
/a(.{3,5})b/utf
a\x{1234}xyb
a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
a\x{1234}\x{4321}\x{3412}\x{3421}b
axbxxbcdefghijb
axxxxxbcdefghijb
\= Expect no match
a\x{1234}b
axxxxxxbcdefghijb
/a(.{3,5}?)b/utf
a\x{1234}xyb
a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
a\x{1234}\x{4321}\x{3412}\x{3421}b
axbxxbcdefghijb
axxxxxbcdefghijb
\= Expect no match
a\x{1234}b
axxxxxxbcdefghijb
/^[a\x{c0}]/utf
\= Expect no match
\x{100}
/(?<=aXb)cd/utf
aXbcd
/(?<=a\x{100}b)cd/utf
a\x{100}bcd
/(?<=a\x{100000}b)cd/utf
a\x{100000}bcd
/(?:\x{100}){3}b/utf
\x{100}\x{100}\x{100}b
\= Expect no match
\x{100}\x{100}b
/\x{ab}/utf
\x{ab}
\xc2\xab
\= Expect no match
\x00{ab}
/(?<=(.))X/utf
WXYZ
\x{256}XYZ
\= Expect no match
XYZ
/[^a]+/g,utf
bcd
\x{100}aY\x{256}Z
/^[^a]{2}/utf
\x{100}bc
/^[^a]{2,}/utf
\x{100}bcAa
/^[^a]{2,}?/utf
\x{100}bca
/[^a]+/gi,utf
bcd
\x{100}aY\x{256}Z
/^[^a]{2}/i,utf
\x{100}bc
/^[^a]{2,}/i,utf
\x{100}bcAa
/^[^a]{2,}?/i,utf
\x{100}bca
/\x{100}{0,0}/utf
abcd
/\x{100}?/utf
abcd
\x{100}\x{100}
/\x{100}{0,3}/utf
\x{100}\x{100}
\x{100}\x{100}\x{100}\x{100}
/\x{100}*/utf
abce
\x{100}\x{100}\x{100}\x{100}
/\x{100}{1,1}/utf
abcd\x{100}\x{100}\x{100}\x{100}
/\x{100}{1,3}/utf
abcd\x{100}\x{100}\x{100}\x{100}
/\x{100}+/utf
abcd\x{100}\x{100}\x{100}\x{100}
/\x{100}{3}/utf
abcd\x{100}\x{100}\x{100}XX
/\x{100}{3,5}/utf
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
/\x{100}{3,}/utf,no_auto_possess
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
/(?<=a\x{100}{2}b)X/utf
Xyyya\x{100}\x{100}bXzzz
/\D*/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
/\D*/utf,no_auto_possess
\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
/\D/utf
1X2
1\x{100}2
/>\S/utf
> >X Y
> >\x{100} Y
/\d/utf
\x{100}3
/\s/utf
\x{100} X
/\D+/utf
12abcd34
\= Expect no match
1234
/\D{2,3}/utf
12abcd34
12ab34
\= Expect no match
1234
12a34
/\D{2,3}?/utf
12abcd34
12ab34
\= Expect no match
1234
12a34
/\d+/utf
12abcd34
/\d{2,3}/utf
12abcd34
1234abcd
\= Expect no match
1.4
/\d{2,3}?/utf
12abcd34
1234abcd
\= Expect no match
1.4
/\S+/utf
12abcd34
\= Expect no match
\ \
/\S{2,3}/utf
12abcd34
1234abcd
\= Expect no match
\ \
/\S{2,3}?/utf
12abcd34
1234abcd
\= Expect no match
\ \
/>\s+</utf
12> <34
/>\s{2,3}</utf
ab> <cd
ab> <ce
\= Expect no match
ab> <cd
/>\s{2,3}?</utf
ab> <cd
ab> <ce
\= Expect no match
ab> <cd
/\w+/utf
12 34
\= Expect no match
+++=*!
/\w{2,3}/utf
ab cd
abcd ce
\= Expect no match
a.b.c
/\w{2,3}?/utf
ab cd
abcd ce
\= Expect no match
a.b.c
/\W+/utf
12====34
\= Expect no match
abcd
/\W{2,3}/utf
ab====cd
ab==cd
\= Expect no match
a.b.c
/\W{2,3}?/utf
ab====cd
ab==cd
\= Expect no match
a.b.c
/[\x{100}]/utf
\x{100}
Z\x{100}
\x{100}Z
/[Z\x{100}]/utf
Z\x{100}
\x{100}
\x{100}Z
/[\x{100}\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
/[\x{100}-\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
ab\x{111}cd
/[z-\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
ab\x{111}cd
abzcd
ab|cd
/[Q\x{100}\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
Q?
/[Q\x{100}-\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
ab\x{111}cd
Q?
/[Qz-\x{200}]/utf
ab\x{100}cd
ab\x{200}cd
ab\x{111}cd
abzcd
ab|cd
Q?
/[\x{100}\x{200}]{1,3}/utf
ab\x{100}cd
ab\x{200}cd
ab\x{200}\x{100}\x{200}\x{100}cd
/[\x{100}\x{200}]{1,3}?/utf
ab\x{100}cd
ab\x{200}cd
ab\x{200}\x{100}\x{200}\x{100}cd
/[Q\x{100}\x{200}]{1,3}/utf
ab\x{100}cd
ab\x{200}cd
ab\x{200}\x{100}\x{200}\x{100}cd
/[Q\x{100}\x{200}]{1,3}?/utf
ab\x{100}cd
ab\x{200}cd
ab\x{200}\x{100}\x{200}\x{100}cd
/(?<=[\x{100}\x{200}])X/utf
abc\x{200}X
abc\x{100}X
\= Expect no match
X
/(?<=[Q\x{100}\x{200}])X/utf
abc\x{200}X
abc\x{100}X
abQX
\= Expect no match
X
/(?<=[\x{100}\x{200}]{3})X/utf
abc\x{100}\x{200}\x{100}X
\= Expect no match
abc\x{200}X
X
/[^\x{100}\x{200}]X/utf
AX
\x{150}X
\x{500}X
\= Expect no match
\x{100}X
\x{200}X
/[^Q\x{100}\x{200}]X/utf
AX
\x{150}X
\x{500}X
\= Expect no match
\x{100}X
\x{200}X
QX
/[^\x{100}-\x{200}]X/utf
AX
\x{500}X
\= Expect no match
\x{100}X
\x{150}X
\x{200}X
/[z-\x{100}]/i,utf
z
Z
\x{100}
\= Expect no match
\x{102}
y
/[\xFF]/
>\xff<
/[\xff]/utf
>\x{ff}<
/[^\xFF]/
XYZ
/[^\xff]/utf
XYZ
\x{123}
/^[ac]*b/utf
\= Expect no match
xb
/^[ac\x{100}]*b/utf
\= Expect no match
xb
/^[^x]*b/i,utf
\= Expect no match
xb
/^[^x]*b/utf
\= Expect no match
xb
/^\d*b/utf
\= Expect no match
xb
/(|a)/g,utf
catac
a\x{256}a
/^\x{85}$/i,utf
\x{85}
/^abc./gmx,newline=any,utf
abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x{0085}abc7 \x{2028}abc8 \x{2029}abc9 JUNK
/abc.$/gmx,newline=any,utf
abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x{0085} abc7\x{2028} abc8\x{2029} abc9
/^a\Rb/bsr=unicode,utf
a\nb
a\rb
a\r\nb
a\x0bb
a\x0cb
a\x{85}b
a\x{2028}b
a\x{2029}b
\= Expect no match
a\n\rb
/^a\R*b/bsr=unicode,utf
ab
a\nb
a\rb
a\r\nb
a\x0bb
a\x0c\x{2028}\x{2029}b
a\x{85}b
a\n\rb
a\n\r\x{85}\x0cb
/^a\R+b/bsr=unicode,utf
a\nb
a\rb
a\r\nb
a\x0bb
a\x0c\x{2028}\x{2029}b
a\x{85}b
a\n\rb
a\n\r\x{85}\x0cb
\= Expect no match
ab
/^a\R{1,3}b/bsr=unicode,utf
a\nb
a\n\rb
a\n\r\x{85}b
a\r\n\r\nb
a\r\n\r\n\r\nb
a\n\r\n\rb
a\n\n\r\nb
\= Expect no match
a\n\n\n\rb
a\r
/\h+\V?\v{3,4}/utf,no_auto_possess
\x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
/\V?\v{3,4}/utf,no_auto_possess
\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
/\h+\V?\v{3,4}/utf,no_auto_possess
>\x09\x20\x{a0}X\x0a\x0a\x0a<
/\V?\v{3,4}/utf,no_auto_possess
>\x09\x20\x{a0}X\x0a\x0a\x0a<
/\H\h\V\v/utf
X X\x0a
X\x09X\x0b
\= Expect no match
\x{a0} X\x0a
/\H*\h+\V?\v{3,4}/utf,no_auto_possess
\x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
\x09\x20\x{a0}\x0a\x0b\x0c\x0d\x0a
\x09\x20\x{a0}\x0a\x0b\x0c
\= Expect no match
\x09\x20\x{a0}\x0a\x0b
/\H\h\V\v/utf
\x{3001}\x{3000}\x{2030}\x{2028}
X\x{180e}X\x{85}
\= Expect no match
\x{2009} X\x0a
/\H*\h+\V?\v{3,4}/utf,no_auto_possess
\x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x0c\x0d\x0a
\x09\x{205f}\x{a0}\x0a\x{2029}\x0c\x{2028}\x0a
\x09\x20\x{202f}\x0a\x0b\x0c
\= Expect no match
\x09\x{200a}\x{a0}\x{2028}\x0b
/a\Rb/I,bsr=anycrlf,utf
a\rb
a\nb
a\r\nb
\= Expect no match
a\x{85}b
a\x0bb
/a\Rb/I,bsr=unicode,utf
a\rb
a\nb
a\r\nb
a\x{85}b
a\x0bb
/a\R?b/I,bsr=anycrlf,utf
a\rb
a\nb
a\r\nb
\= Expect no match
a\x{85}b
a\x0bb
/a\R?b/I,bsr=unicode,utf
a\rb
a\nb
a\r\nb
a\x{85}b
a\x0bb
/X/newline=any,utf,firstline
A\x{1ec5}ABCXYZ
/abcd*/utf
xxxxabcd\=ps
xxxxabcd\=ph
/abcd*/i,utf
xxxxabcd\=ps
xxxxabcd\=ph
XXXXABCD\=ps
XXXXABCD\=ph
/abc\d*/utf
xxxxabc1\=ps
xxxxabc1\=ph
/abc[de]*/utf
xxxxabcde\=ps
xxxxabcde\=ph
/\bthe cat\b/utf
the cat\=ps
the cat\=ph
/./newline=crlf,utf
\r\=ps
\r\=ph
/.{2,3}/newline=crlf,utf
\r\=ps
\r\=ph
\r\r\=ps
\r\r\=ph
\r\r\r\=ps
\r\r\r\=ph
/.{2,3}?/newline=crlf,utf
\r\=ps
\r\=ph
\r\r\=ps
\r\r\=ph
\r\r\r\=ps
\r\r\r\=ph
/[^\x{100}]/utf
\x{100}\x{101}X
/[^\x{100}]+/utf
\x{100}\x{101}X
/\pL\P{Nd}/utf
AB
\= Expect no match
A0
00
/\X./utf
AB
A\x{300}BC
A\x{300}\x{301}\x{302}BC
\= Expect no match
\x{300}
/\X\X/utf
ABC
A\x{300}B\x{300}\x{301}C
A\x{300}\x{301}\x{302}BC
\= Expect no match
\x{300}
/^\pL+/utf
abcd
a
/^\PL+/utf
1234
=
\= Expect no match
abcd
/^\X+/utf
abcdA\x{300}\x{301}\x{302}
A\x{300}\x{301}\x{302}
A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}
a
\x{300}\x{301}\x{302}
/\X?abc/utf
abc
A\x{300}abc
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
\x{300}abc
/^\X?abc/utf
abc
A\x{300}abc
\x{300}abc
\= Expect no match
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
/\X*abc/utf
abc
A\x{300}abc
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
\x{300}abc
/^\X*abc/utf
abc
A\x{300}abc
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
\x{300}abc
/^\pL?=./utf
A=b
=c
\= Expect no match
1=2
AAAA=b
/^\pL*=./utf
AAAA=b
=c
\= Expect no match
1=2
/^\X{2,3}X/utf
A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
\= Expect no match
X
A\x{300}\x{301}\x{302}X
A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
/^\pC\pL\pM\pN\pP\pS\pZ</utf
\x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
\np\x{300}9!\$ <
\= Expect no match
ap\x{300}9!\$ <
/^\PC/utf
X
\= Expect no match
\x7f
/^\PL/utf
9
\= Expect no match
\x{c0}
/^\PM/utf
X
\= Expect no match
\x{30f}
/^\PN/utf
X
\= Expect no match
\x{660}
/^\PP/utf
X
\= Expect no match
\x{66c}
/^\PS/utf
X
\= Expect no match
\x{f01}
/^\PZ/utf
X
\= Expect no match
\x{1680}
/^\p{Cc}/utf
\x{017}
\x{09f}
\= Expect no match
\x{0600}
/^\p{Cf}/utf
\x{601}
\x{180e}
\x{061c}
\x{2066}
\x{2067}
\x{2068}
\x{2069}
\= Expect no match
\x{09f}
/^\p{Cn}/utf
\= Expect no match
\x{09f}
/^\p{Co}/utf
\x{f8ff}
\= Expect no match
\x{09f}
/^\p{Cs}/utf
\x{dfff}\=no_utf_check
\= Expect no match
\x{09f}
/^\p{Ll}/utf
a
\= Expect no match
Z
\x{e000}
/^\p{Lm}/utf
\x{2b0}
\= Expect no match
a
/^\p{Lo}/utf
\x{1bb}
\= Expect no match
a
\x{2b0}
/^\p{Lt}/utf
\x{1c5}
\= Expect no match
a
\x{2b0}
/^\p{Lu}/utf
A
\= Expect no match
\x{2b0}
/^\p{Mc}/utf
\x{903}
\= Expect no match
X
\x{300}
/^\p{Me}/utf
\x{488}
\= Expect no match
X
\x{903}
\x{300}
/^\p{Mn}/utf
\x{300}
\x{1a1b}
\= Expect no match
X
\x{903}
/^\p{Nd}+/utf,no_auto_possess
0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}\x{66a}
\x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}\x{6fa}
\x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}\x{970}
\= Expect no match
X
/^\p{Nl}/utf
\x{16ee}
\= Expect no match
X
\x{966}
/^\p{No}/utf
\x{b2}
\x{b3}
\= Expect no match
X
\x{16ee}
/^\p{Pc}/utf
\x5f
\x{203f}
\= Expect no match
X
-
\x{58a}
/^\p{Pd}/utf
-
\x{58a}
\= Expect no match
X
\x{203f}
/^\p{Pe}/utf
)
]
}
\x{f3b}
\x{2309}
\x{230b}
\= Expect no match
X
\x{203f}
(
[
{
\x{f3c}
/^\p{Pf}/utf
\x{bb}
\x{2019}
\= Expect no match
X
\x{203f}
/^\p{Pi}/utf
\x{ab}
\x{2018}
\= Expect no match
X
\x{203f}
/^\p{Po}/utf
!
\x{37e}
\= Expect no match
X
\x{203f}
/^\p{Ps}/utf
(
[
{
\x{f3c}
\x{2308}
\x{230a}
\= Expect no match
X
)
]
}
\x{f3b}
/^\p{Sc}+/utf
$\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
\x{9f2}
\= Expect no match
X
\x{2c2}
/^\p{Sk}/utf
\x{2c2}
\= Expect no match
X
\x{9f2}
/^\p{Sm}+/utf
+<|~\x{ac}\x{2044}
\= Expect no match
X
\x{9f2}
/^\p{So}/utf
\x{a6}
\x{482}
\= Expect no match
X
\x{9f2}
/^\p{Zl}/utf
\x{2028}
\= Expect no match
X
\x{2029}
/^\p{Zp}/utf
\x{2029}
\= Expect no match
X
\x{2028}
/^\p{Zs}/utf
\ \
\x{a0}
\x{1680}
\x{2000}
\x{2001}
\= Expect no match
\x{2028}
\x{200d}
/\p{Nd}+(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}+?(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}{2,}(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}{2,}?(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}*(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}*?(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}{2}(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}{2,3}(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}{2,3}?(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}?(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}??(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}*+(..)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}*+(...)/utf
\x{660}\x{661}\x{662}ABC
/\p{Nd}*+(....)/utf
\= Expect no match
\x{660}\x{661}\x{662}ABC
/\p{^Lu}/i,utf
1234
\= Expect no match
ABC
/\P{Lu}/i,utf
1234
\= Expect no match
ABC
/(?<=A\p{Nd})XYZ/utf
A2XYZ
123A5XYZPQR
ABA\x{660}XYZpqr
\= Expect no match
AXYZ
XYZ
/(?<!\pL)XYZ/utf
1XYZ
AB=XYZ..
XYZ
\= Expect no match
WXYZ
/[\p{Nd}]/utf
1234
/[\p{Nd}+-]+/utf
1234
12-34
12+\x{661}-34
\= Expect no match
abcd
/[\P{Nd}]+/utf
abcd
\= Expect no match
1234
/\D+/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
/\P{Nd}+/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
/[\D]+/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
/[\P{Nd}]+/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
/[\D\P{Nd}]+/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
/\pL/utf
a
A
/\pL/i,utf
a
A
/^\x{c0}$/i,utf
\x{c0}
\x{e0}
/^\x{e0}$/i,utf
\x{c0}
\x{e0}
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
A\x{391}\x{10427}\x{ff3a}\x{1fb0}
\= Expect no match
a\x{391}\x{10427}\x{ff3a}\x{1fb0}
A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
A\x{391}\x{10427}\x{ff5a}\x{1fb0}
A\x{391}\x{10427}\x{ff3a}\x{1fb8}
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
A\x{391}\x{10427}\x{ff3a}\x{1fb0}
a\x{391}\x{10427}\x{ff3a}\x{1fb0}
A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
A\x{391}\x{10427}\x{ff5a}\x{1fb0}
A\x{391}\x{10427}\x{ff3a}\x{1fb8}
/\x{391}+/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
/\x{391}{3,5}(.)/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
/\x{391}{3,5}?(.)/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
/[\x{391}\x{ff3a}]/i,utf
\x{391}
\x{ff3a}
\x{3b1}
\x{ff5a}
/[\x{c0}\x{391}]/i,utf
\x{c0}
\x{e0}
/[\x{105}-\x{109}]/i,utf
\x{104}
\x{105}
\x{109}
\= Expect no match
\x{100}
\x{10a}
/[z-\x{100}]/i,utf
Z
z
\x{39c}
\x{178}
|
\x{80}
\x{ff}
\x{100}
\x{101}
\= Expect no match
\x{102}
Y
y
/[z-\x{100}]/i,utf
/^\X/utf
A
A\x{300}BC
A\x{300}\x{301}\x{302}BC
\x{300}
/^(\X*)C/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
/^(\X*?)C/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
/^(\X*)(.)/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
/^(\X*?)(.)/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
/^\X(.)/utf
\= Expect no match
A\x{300}\x{301}\x{302}
/^\X{2,3}(.)/utf
A\x{300}\x{301}B\x{300}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}
A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
/^\X{2,3}?(.)/utf
A\x{300}\x{301}B\x{300}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}
A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
/^\pN{2,3}X/
12X
123X
\= Expect no match
X
1X
1234X
/\x{100}/i,utf
\x{100}
\x{101}
/^\p{Han}+/utf
\x{2e81}\x{3007}\x{2f804}\x{31a0}
\= Expect no match
\x{2e7f}
/^\P{Katakana}+/utf
\x{3105}
\= Expect no match
\x{30ff}
/^[\p{Arabic}]/utf
\x{06e9}
\x{060b}
\= Expect no match
X\x{06e9}
/^[\P{Yi}]/utf
\x{2f800}
\= Expect no match
\x{a014}
\x{a4c6}
/^\p{Any}X/utf
AXYZ
\x{1234}XYZ
\= Expect no match
X
/^\P{Any}X/utf
\= Expect no match
AX
/^\p{Any}?X/utf
XYZ
AXYZ
\x{1234}XYZ
\= Expect no match
ABXYZ
/^\P{Any}?X/utf
XYZ
\= Expect no match
AXYZ
\x{1234}XYZ
ABXYZ
/^\p{Any}+X/utf
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
\= Expect no match
XYZ
/^\P{Any}+X/utf
\= Expect no match
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
XYZ
/^\p{Any}*X/utf
XYZ
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
/^\P{Any}*X/utf
XYZ
\= Expect no match
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
/^[\p{Any}]X/utf
AXYZ
\x{1234}XYZ
\= Expect no match
X
/^[\P{Any}]X/utf
\= Expect no match
AX
/^[\p{Any}]?X/utf
XYZ
AXYZ
\x{1234}XYZ
\= Expect no match
ABXYZ
/^[\P{Any}]?X/utf
XYZ
\= Expect no match
AXYZ
\x{1234}XYZ
ABXYZ
/^[\p{Any}]+X/utf
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
\= Expect no match
XYZ
/^[\P{Any}]+X/utf
\= Expect no match
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
XYZ
/^[\p{Any}]*X/utf
XYZ
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
/^[\P{Any}]*X/utf
XYZ
\= Expect no match
AXYZ
\x{1234}XYZ
A\x{1234}XYZ
/^\p{Any}{3,5}?/utf
abcdefgh
\x{1234}\n\r\x{3456}xyz
/^\p{Any}{3,5}/utf
abcdefgh
\x{1234}\n\r\x{3456}xyz
/^\P{Any}{3,5}?/utf
\= Expect no match
abcdefgh
\x{1234}\n\r\x{3456}xyz
/^\p{L&}X/utf
AXY
aXY
\x{1c5}XY
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^[\p{L&}]X/utf
AXY
aXY
\x{1c5}XY
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^\p{L&}+X/utf
AXY
aXY
AbcdeXyz
\x{1c5}AbXY
abcDEXypqreXlmn
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^[\p{L&}]+X/utf
AXY
aXY
AbcdeXyz
\x{1c5}AbXY
abcDEXypqreXlmn
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^\p{L&}+?X/utf
AXY
aXY
AbcdeXyz
\x{1c5}AbXY
abcDEXypqreXlmn
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^[\p{L&}]+?X/utf
AXY
aXY
AbcdeXyz
\x{1c5}AbXY
abcDEXypqreXlmn
\= Expect no match
\x{1bb}XY
\x{2b0}XY
!XY
/^\P{L&}X/utf
!XY
\x{1bb}XY
\x{2b0}XY
\= Expect no match
\x{1c5}XY
AXY
/^[\P{L&}]X/utf
!XY
\x{1bb}XY
\x{2b0}XY
\= Expect no match
\x{1c5}XY
AXY
/^\x{023a}+?(\x{0130}+)/i,utf
\x{023a}\x{2c65}\x{0130}
/^\x{023a}+([^X])/i,utf
\x{023a}\x{2c65}X
/\x{c0}+\x{116}+/i,utf
\x{c0}\x{e0}\x{116}\x{117}
/[\x{c0}\x{116}]+/i,utf
\x{c0}\x{e0}\x{116}\x{117}
# Check property support in non-UTF-8 mode
/\p{L}{4}/
123abcdefg
123abc\xc4\xc5zz
/\p{Carian}\p{Cham}\p{Kayah_Li}\p{Lepcha}\p{Lycian}\p{Lydian}\p{Ol_Chiki}\p{Rejang}\p{Saurashtra}\p{Sundanese}\p{Vai}/utf
\x{102A4}\x{AA52}\x{A91D}\x{1C46}\x{10283}\x{1092E}\x{1C6B}\x{A93B}\x{A8BF}\x{1BA0}\x{A50A}====
/\x{a77d}\x{1d79}/i,utf
\x{a77d}\x{1d79}
\x{1d79}\x{a77d}
/\x{a77d}\x{1d79}/utf
\x{a77d}\x{1d79}
\= Expect no match
\x{1d79}\x{a77d}
/^\p{Xan}/utf
ABCD
1234
\x{6ca}
\x{a6c}
\x{10a7}
\= Expect no match
_ABC
/^\p{Xan}+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
\= Expect no match
_ABC
/^\p{Xan}*/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
/^\p{Xan}{2,9}/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
/^[\p{Xan}]/utf
ABCD1234_
1234abcd_
\x{6ca}
\x{a6c}
\x{10a7}
\= Expect no match
_ABC
/^[\p{Xan}]+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
\= Expect no match
_ABC
/^>\p{Xsp}/utf
>\x{1680}\x{2028}\x{0b}
\= Expect no match
\x{0b}
/^>\p{Xsp}+/utf,no_auto_possess
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xsp}*/utf,no_auto_possess
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xsp}{2,9}/utf,no_auto_possess
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>[\p{Xsp}]/utf,no_auto_possess
>\x{2028}\x{0b}
/^>[\p{Xsp}]+/utf,no_auto_possess
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}/utf
>\x{1680}\x{2028}\x{0b}
>\x{a0}
\= Expect no match
\x{0b}
/^>\p{Xps}+/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}+?/utf
>\x{1680}\x{2028}\x{0b}
/^>\p{Xps}*/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}{2,9}/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}{2,9}?/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>[\p{Xps}]/utf
>\x{2028}\x{0b}
/^>[\p{Xps}]+/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^\p{Xwd}/utf
ABCD
1234
\x{6ca}
\x{a6c}
\x{10a7}
_ABC
\= Expect no match
[]
/^\p{Xwd}+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
/^\p{Xwd}*/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
/^\p{Xwd}{2,9}/utf
A_12\x{6ca}\x{a6c}\x{10a7}
/^[\p{Xwd}]/utf
ABCD1234_
1234abcd_
\x{6ca}
\x{a6c}
\x{10a7}
_ABC
\= Expect no match
[]
/^[\p{Xwd}]+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
# Unicode properties for \b and \B
/\b...\B/utf,ucp
abc_
\x{37e}abc\x{376}
\x{37e}\x{376}\x{371}\x{393}\x{394}
!\x{c0}++\x{c1}\x{c2}
!\x{c0}+++++
# Without PCRE2_UCP, non-ASCII always fail, even if < 256
/\b...\B/utf
abc_
\= Expect no match
\x{37e}abc\x{376}
\x{37e}\x{376}\x{371}\x{393}\x{394}
!\x{c0}++\x{c1}\x{c2}
!\x{c0}+++++
# With PCRE2_UCP, non-UTF8 chars that are < 256 still check properties
/\b...\B/ucp
abc_
!\x{c0}++\x{c1}\x{c2}
!\x{c0}+++++
# Caseless single negated characters > 127 need UCP support
/[^\x{100}]/i,utf
\x{100}\x{101}X
/[^\x{100}]+/i,utf
\x{100}\x{101}XX
/^\X/utf
A\=ps
A\=ph
A\x{300}\x{301}\=ps
A\x{300}\x{301}\=ph
A\x{301}\=ps
A\x{301}\=ph
/^\X{2,3}/utf
A\=ps
A\=ph
AA\=ps
AA\=ph
A\x{300}\x{301}\=ps
A\x{300}\x{301}\=ph
A\x{300}\x{301}A\x{300}\x{301}\=ps
A\x{300}\x{301}A\x{300}\x{301}\=ph
/^\X{2}/utf
AA\=ps
AA\=ph
A\x{300}\x{301}A\x{300}\x{301}\=ps
A\x{300}\x{301}A\x{300}\x{301}\=ph
/^\X+/utf
AA\=ps
AA\=ph
/^\X+?Z/utf
AA\=ps
AA\=ph
# These are tests for extended grapheme clusters
/^\X/utf,aftertext
G\x{34e}\x{34e}X
\x{34e}\x{34e}X
\x04X
\x{1100}X
\x{1100}\x{34e}X
\x{1b04}\x{1b04}X
\= These match up to the roman letters
\x{1111}\x{1111}L,L
\x{1111}\x{1111}\x{1169}L,L,V
\x{1111}\x{ae4c}L, LV
\x{1111}\x{ad89}L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
\= These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
\x{ae4c}\x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
\x{1169}\x{1111}V, L
\x{1169}\x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
\x{11fe}\x{1169}T, V
\x{11fe}\x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
\= Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
\= Test CR, LF, and control
\x0d\x{0711}CR, extend
\x0d\x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
\x0a\x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
\x09\x{1b04}Control, spacingmark
\= There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}?X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/\x{1e9e}+/i,utf
\x{1e9e}\x{00df}
/[z\x{1e9e}]+/i,utf
\x{1e9e}\x{00df}
/\x{00df}+/i,utf
\x{1e9e}\x{00df}
/[z\x{00df}]+/i,utf
\x{1e9e}\x{00df}
/\x{1f88}+/i,utf
\x{1f88}\x{1f80}
/[z\x{1f88}]+/i,utf
\x{1f88}\x{1f80}
# Perl matches these
/\x{00b5}+/i,utf
\x{00b5}\x{039c}\x{03bc}
/\x{039c}+/i,utf
\x{00b5}\x{039c}\x{03bc}
/\x{03bc}+/i,utf
\x{00b5}\x{039c}\x{03bc}
/\x{00c5}+/i,utf
\x{00c5}\x{00e5}\x{212b}
/\x{00e5}+/i,utf
\x{00c5}\x{00e5}\x{212b}
/\x{212b}+/i,utf
\x{00c5}\x{00e5}\x{212b}
/\x{01c4}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
/\x{01c5}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
/\x{01c6}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
/\x{01c7}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
/\x{01c8}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
/\x{01c9}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
/\x{01ca}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
/\x{01cb}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
/\x{01cc}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
/\x{01f1}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
/\x{01f2}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
/\x{01f3}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
/\x{0345}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0399}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{03b9}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{1fbe}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0392}+/i,utf
\x{0392}\x{03b2}\x{03d0}
/\x{03b2}+/i,utf
\x{0392}\x{03b2}\x{03d0}
/\x{03d0}+/i,utf
\x{0392}\x{03b2}\x{03d0}
/\x{0395}+/i,utf
\x{0395}\x{03b5}\x{03f5}
/\x{03b5}+/i,utf
\x{0395}\x{03b5}\x{03f5}
/\x{03f5}+/i,utf
\x{0395}\x{03b5}\x{03f5}
/\x{0398}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03b8}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03d1}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03f4}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{039a}+/i,utf
\x{039a}\x{03ba}\x{03f0}
/\x{03ba}+/i,utf
\x{039a}\x{03ba}\x{03f0}
/\x{03f0}+/i,utf
\x{039a}\x{03ba}\x{03f0}
/\x{03a0}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
/\x{03c0}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
/\x{03d6}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
/\x{03a1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
/\x{03c1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
/\x{03f1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
/\x{03a3}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
/\x{03c2}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
/\x{03c3}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
/\x{03a6}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
/\x{03c6}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
/\x{03d5}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
/\x{03c9}+/i,utf
\x{03c9}\x{03a9}\x{2126}
/\x{03a9}+/i,utf
\x{03c9}\x{03a9}\x{2126}
/\x{2126}+/i,utf
\x{03c9}\x{03a9}\x{2126}
/\x{1e60}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e61}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9b}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9e}+/i,utf
\x{1e9e}\x{00df}
/\x{00df}+/i,utf
\x{1e9e}\x{00df}
/\x{1f88}+/i,utf
\x{1f88}\x{1f80}
/\x{1f80}+/i,utf
\x{1f88}\x{1f80}
/\x{004b}+/i,utf
\x{004b}\x{006b}\x{212a}
/\x{006b}+/i,utf
\x{004b}\x{006b}\x{212a}
/\x{212a}+/i,utf
\x{004b}\x{006b}\x{212a}
/\x{0053}+/i,utf
\x{0053}\x{0073}\x{017f}
/\x{0073}+/i,utf
\x{0053}\x{0073}\x{017f}
/\x{017f}+/i,utf
\x{0053}\x{0073}\x{017f}
/ist/i,utf
\= Expect no match
ikt
/is+t/i,utf
iSs\x{17f}t
\= Expect no match
ikt
/is+?t/i,utf
\= Expect no match
ikt
/is?t/i,utf
\= Expect no match
ikt
/is{2}t/i,utf
\= Expect no match
iskt
/^\p{Xuc}/utf
$abc
@abc
`abc
\x{1234}abc
\= Expect no match
abc
/^\p{Xuc}+/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\p{Xuc}+?/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\p{Xuc}+?\*/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\p{Xuc}++/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\p{Xuc}{3,5}/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\p{Xuc}{3,5}?/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^[\p{Xuc}]/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^[\p{Xuc}]+/utf
$@`\x{a0}\x{1234}\x{e000}**
\= Expect no match
\x{9f}
/^\P{Xuc}/utf
abc
\= Expect no match
$abc
@abc
`abc
\x{1234}abc
/^[\P{Xuc}]/utf
abc
\= Expect no match
$abc
@abc
`abc
\x{1234}abc
/^A\s+Z/utf,ucp
A\x{2005}Z
A\x{85}\x{180e}\x{2005}Z
/^A[\s]+Z/utf,ucp
A\x{2005}Z
A\x{85}\x{180e}\x{2005}Z
/(?<=\x{100})\x{200}(?=\x{300})/utf,allusedtext
\x{100}\x{200}\x{300}
# -----------------------------------------------------------------------------
# Tests for bidi control and bidi class properties
/\p{ bidi_control }/utf
-->\x{202c}<--
/\p{bidicontrol}+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/\p{bidicontrol}+?/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/\p{bidicontrol}++/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/[\p{bidi_control}]/utf
-->\x{202c}<--
/[\p{bidicontrol}]+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/[\p{bidicontrol}]+?/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/[\p{bidicontrol}]++/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/[\p{bidicontrol}<>]+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/\P{bidicontrol}+/g,utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/\p{^bidicontrol}+/g,utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
/\p{bidi class = al}/utf
-->\x{061D}<--
/\p{bidi class = al}+/utf
-->\x{061D}\x{061e}\x{061f}<--
/\p{bidi_class : AL}+?/utf
-->\x{061D}\x{061e}\x{061f}<--
/\p{Bidi_Class : AL}++/utf
-->\x{061D}\x{061e}\x{061f}<--
/\p{bidi class = aN}+/utf
-->\x{061D}\x{0602}\x{0604}\x{061f}<--
/\p{bidi class = B}+/utf
-->\x{0a}\x{0d}\x{01c}\x{01e}\x{085}\x{2029}<--
/\p{bidi class:BN}+/utf
-->\x{0}\x{08}\x{200c}\x{fffe}\x{dfffe}\x{10ffff}<--
/\p{bidiclass:cs}+/utf
-->,.\x{060c}\x{ff1a}<--
/\p{bidiclass:En}+/utf
-->09\x{b2}\x{2074}\x{1fbf9}<--
/\p{bidiclass:es}+/utf
==>+-\x{207a}\x{ff0d}<==
/\p{bidiclass:et}+/utf
-->#\{24}%\x{a2}\x{A838}\x{1e2ff}<--
/\p{bidiclass:FSI}+/utf
-->\x{2068}<--
/\p{bidi class:L}+/utf
-->ABC<--
/\P{bidi class:L}+/utf
-->ABC<--
/\p{bidi class:LRE}+\p{bidiclass=lri}*\p{bidiclass:lro}/utf
-->\x{202a}\x{2066}\x{202d}<--
/\p{bidi class:NSM}+/utf
-->\x{9bc}\x{a71}\x{e31}<--
/\p{bidi class:ON}+/utf
-->\x{21}'()*;@\x{384}\x{2039}<=-
/\p{bidiclass:pdf}\p{bidiclass:pdi}/utf
-->\x{202c}\x{2069}<--
/\p{bidi class:R}+/utf
-->\x{590}\x{5c6}\x{200f}\x{10805}<--
/\p{bidi class:RLE}+\p{bidi class:RLI}*\p{bidi class:RLO}+/utf
-->\x{202b}\x{2067}\x{202e}<--
/\p{bidi class:S}+\p{bidiclass:WS}+/utf
-->\x{9}\x{b}\x{1f} \x{c} \x{2000} \x{3000}<--
# -----------------------------------------------------------------------------
/\p{katakana}/utf
\x{30a1}
\x{3001}
/\p{scx:katakana}/utf
\x{30a1}
\x{3001}
/\p{script extensions:katakana}/utf
\x{30a1}
\x{3001}
/\p{sc:katakana}/utf
\x{30a1}
\= Expect no match
\x{3001}
/\p{script:katakana}/utf
\x{30a1}
\= Expect no match
\x{3001}
/\p{sc:katakana}{3,}/utf
\x{30a1}\x{30fa}\x{32d0}\x{1b122}\x{ff66}\x{3001}ABC
/\p{sc:katakana}{3,}?/utf
\x{30a1}\x{30fa}\x{32d0}\x{1b122}\x{ff66}\x{3001}ABC
# Tests for PCRE2_EXTRA_CASELESS_RESTRICT. Compare each test with and without
# the restriction.
/AskZ/i,utf,caseless_restrict
AskZ
aSKz
\= Expect no match
A\x{17f}kZ
As\x{212a}Z
/AskZ/i,utf
AskZ
aSKz
A\x{17f}kZ
As\x{212a}Z
/A\x{17f}\x{212a}Z/ir,utf
\= Expect no match
AskZ
/A\x{17f}\x{212a}Z/i,utf
AskZ
/[AskZ]+/i,utf,caseless_restrict
AskZ
aSKz
A\x{17f}kZ
As\x{212a}Z
/[AskZ]+/i,utf
AskZ
aSKz
A\x{17f}kZ
As\x{212a}Z
/[\x{17f}\x{212a}]+/ir,utf
\= Expect no match
AskZ
/[\x{17f}\x{212a}]+/i,utf
AskZ
/[^s]+/ir,utf
A\x{17f}Z
/[^s]+/i,utf
A\x{17f}Z
/[^k]+/ir,utf
A\x{212a}Z
/[^k]+/i,utf
A\x{212a}Z
/[^sk]+/ir,utf
A\x{17f}\x{212a}Z
/[^sk]+/i,utf
A\x{17f}\x{212a}Z
/[^\x{17f}]+/ir,utf
AsSZ
/[^\x{17f}]+/i,utf
AsSZ
/[Ss]+/irB,utf
Sss\x{17f}ss
/[Ss]+/iB,utf
Sss\x{17f}ss
/[S\x{17f}]/irB,utf
/[S\x{17f}]/iB,utf
/[\x{17f}s]/irB,utf
/[\x{17f}s]/iB,utf
/[\x{4b}\x{6b}]/irB,utf
/[\x{4b}\x{6b}]/iB,utf
/s(?r)s(?-r)s(?r:s)s/i,utf
\x{17f}S\x{17f}S\x{17f}
\= Expect no match
\x{17f}\x{17f}\x{17f}S\x{17f}
\x{17f}S\x{17f}\x{17f}\x{17f}
/k(?^i)k/ir,utf
K\x{212a}
\= Expect no match
\x{212a}\x{212a}
# End caseless restrict tests
# TESTS for PCRE2_EXTRA_TURKISH_CASING - again, tests with and without.
/i/i,utf
i
I
\= Expect no match
\x{0130}
\x{0131}
/i/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/I/i,utf
i
I
\= Expect no match
\x{0130}
\x{0131}
/I/i,utf,turkish_casing
I
\x{0131}
\= Expect no match
i
\x{0130}
/\x{0130}/i,utf
\x{0130}
\= Expect no match
i
I
\x{0131}
/\x{0130}/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/\x{0131}/i,utf
\x{0131}
\= Expect no match
i
I
\x{0130}
/\x{0131}/i,utf,turkish_casing
I
\x{0131}
\= Expect no match
i
\x{0130}
/[i]/i,utf
i
I
\= Expect no match
\x{0130}
\x{0131}
/[i]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/[\x{0130}]/i,utf
\x{0130}
\= Expect no match
i
I
\x{0131}
/[\x{0130}]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/[\x{0120}-\x{0130}]/i,utf
\x{0130}
\= Expect no match
i
I
\x{0131}
/[\x{0120}-\x{0130}]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/[zi]/i,utf
i
I
\= Expect no match
\x{0130}
\x{0131}
/[zi]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/[z\x{0130}]/i,utf
\x{0130}
\= Expect no match
i
I
\x{0131}
/[z\x{0130}]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
/[iI]/i,utf
i
I
\= Expect no match
\x{0130}
\x{0131}
/[iI]/i,utf,turkish_casing
i
I
\x{0130}
\x{0131}
/[i\x{0130}]/i,utf
i
I
\x{0130}
\= Expect no match
\x{0131}
/[i\x{0130}]/i,utf,turkish_casing
i
\x{0130}
\= Expect no match
I
\x{0131}
# End Turkish casing tests
# TESTS for PCRE2_EXTRA_ASCII_xxx - again, tests with and without.
# DIGITS
/\d+/i,utf
123\x{660}456
/\d+/i,utf,ucp
123\x{660}456
/\d+/i,utf,ucp,ascii_bsd
123\x{660}456
/[\d]+/i,utf
123\x{660}456
/[\d]+/i,utf,ucp
123\x{660}456
/[\d]+/i,utf,ucp,ascii_bsd
123\x{660}456
/\d(?aD)\d(?-aD)\d/utf,ucp
\x{660}9\x{660}
\= Expect no match
\x{660}\x{660}\x{660}
/\d(?-aD)\d(?aD)\d/utf,ucp,ascii_bsd
999
9\x{660}9
/\d(?a)\d(?-a)\d/utf,ucp
\x{660}9\x{660}
\= Expect no match
\x{660}\x{660}\x{660}
/\d(?-aD)\d(?aD)\d/utf,ucp,ascii_bsd
999
9\x{660}9
# SPACES
/>\s+</i,utf
> <
\= Expect no match
>\x{a0} <
/>\s+</i,utf,ucp
> <
>\x{a0} <
/>\s+</i,utf,ucp,ascii_bss
> <
\= Expect no match
>\x{a0} <
/>[\s]+</i,utf
> <
\= Expect no match
>\x{a0} <
/>[\s]+</i,utf,ucp
> <
>\x{a0} <
/>[\s]+</i,utf,ucp,ascii_bss
> <
\= Expect no match
>\x{a0} <
/>\s(?aS)\s(?-aS)\s</utf,ucp
>\x{a0} \x{a0}<
\= Expect no match
>\x{a0}\x{a0}\x{a0}<
/>\s(?a)\s(?-a)\s</utf,ucp
>\x{a0} \x{a0}<
\= Expect no match
>\x{a0}\x{a0}\x{a0}<
# WORDS
/\w+/i,utf
123\x{660}abc
/\w+/i,utf,ucp
123\x{660}abc
/\w+/i,utf,ucp,ascii_bsw
123\x{660}abc
/[\w]+/i,utf
123\x{660}abc
/[\w]+/i,utf,ucp
123\x{660}abc
/[\w]+/i,utf,ucp,ascii_bsw
123\x{660}abc
/\w(?aW)\w(?-aW)\w/utf,ucp
\x{660}A\x{c0}
\= Expect no match
\x{660}\x{c0}\x{c0}
/\w(?a)\w(?-a)\w/utf,ucp
\x{660}A\x{c0}
\= Expect no match
\x{660}\x{c0}\x{c0}
# POSIX
/^[[:digit:]]+$/utf,ucp
123456
123\x{660}456
/^[[:digit:]]+$/utf,ucp,ascii_digit
123456
\= Expect no match
123\x{660}456
/[[:digit:]]+/g,utf,ucp,ascii_digit
123\x{660}456
/(?-aT)[[:digit:]](?aT)[[:digit:]]/utf,ucp,ascii_digit
11
\x{ff11}1
\= Expect no match
1\x{ff11}
/(?-aT:[[:digit:]])[[:digit:]]/utf,ucp,ascii_digit
11
\x{ff11}1
\= Expect no match
1\x{ff11}
/(?-aT:[[:digit:]])[[:digit:]]/utf,never_ucp,ascii_digit
11
\= Expect no match
\x{ff11}1
1\x{ff11}
/[[:digit:]]+/utf,ucp,ascii_posix
123\x{660}456
/(?-aP)[[:digit:]](?aP)[[:digit:]]/utf,ucp,ascii_posix
11
\x{ff11}1
\= Expect no match
1\x{ff11}
/(?-aP:[[:digit:]])[[:digit:]]/utf,ucp,ascii_posix
11
\x{ff11}1
\= Expect no match
1\x{ff11}
/(?-a:[[:digit:]])[[:digit:]]/a,utf,ucp
11
\x{ff11}1
\= Expect no match
1\x{ff11}
/>[[:space:]]+</utf,ucp
>\x{a0} \x{a0}<
>\x{a0}\x{a0}\x{a0}<
/>[[:space:]]+</utf,ucp,ascii_posix
\= Expect no match
>\x{a0} \x{a0}<
/(?aP)[[:alnum:]]+/i,ucp,utf
abcáxyz
abc\x{660}xyz
/(?aP)[[:alnum:]\d]+/i,ucp,utf
abc\x{660}xyz
/(*UCP)(*UTF)[[:alnum:]](?aP:[[:alnum:]])[[:alnum:]]/
\x{660}A\x{660}
\= Expect no match
\x{660}\x{660}\x{660}
# VARIOUS
/[\d\s\w]+/a,ucp,utf
9 A\x{660}À
9 AÀ\x{660}
# End PCRE2_EXTRA_ASCII_xxx tests
/\w+/utf,ucp
--cafe\x{300}_au\x{203f}lait!
/[\w]+/utf,ucp
--cafe\x{300}_au\x{203f}lait!
/\b.+?\b/utf,ucp
--cafe\x{300}_au\x{203f}lait!
/caf\B.+?\B/utf,ucp
--cafe\x{300}_au\x{203f}lait!
# --------------------------------------------------------------------------
# Case-independent matching property tests added after changing PCRE2 to be
# compatible with Perl. All three cases (upper, lower, title) conflate.
/\p{Lu}\p{Ll}\P{Lu}\P{Ll}/utf
>AbbD<
>Abb\x{01c5}<
\= Expect no match
>aBBd<
>aB!!<
/\p{Lu}\p{Ll}\P{Lu}\P{Ll}/i,utf
>aB!!<
\= Expect no match
>AbbD<
>aBBd<
>Abb\x{01c5}<
/[.\p{Lu}][.\p{Ll}][.\P{Lu}][.\P{Ll}]/i,utf
>aB!!<
\= Expect no match
>AbbD<
>aBBd<
>Abb\x{01c5}<
# --------------
# EXTENDED CHARACTER CLASSES
/[\p{Ll}[\p{Nd}]]C/alt_extended_class
aC
1C
\= Expect no match
[C
/[[\p{Ll}][\p{Nd}]]/alt_extended_class
a
1
\= Expect no match
[
]
/[[\p{Ll}]||[\p{Nd}]]/alt_extended_class
a
1
\= Expect no match
C
/[[^\p{Ll}][\p{Nd}]]/alt_extended_class
1
A
\= Expect no match
a
/[^[\p{Ll}][\p{Nd}]]/alt_extended_class
A
\= Expect no match
a
1
/[^[\p{Ll}]&&[\p{Nd}]]/alt_extended_class
a
1
A
/(?[[\p{Ll}]+[\p{Nd}]])/
a
1
\= Expect no match
[
]
# --------------
# EXTENDED CHARACTER CLASSES (Perl)
/(?[[\p{Ll}Z]&[\p{Lu}a]])/
a
Z
\= Expect no match
A
z
# --------------------------------------------------------------------------
# End of testinput7

189
3rd/pcre2/testdata/testinput8 vendored Normal file

@@ -0,0 +1,189 @@
# There are two sorts of patterns in this test. A number of them are
# representative patterns whose lengths and offsets are checked. This is just a
# doublecheck test to ensure the sizes don't go horribly wrong when something
# is changed. The operation of these patterns is checked in other tests.
#
# This file also contains tests whose output varies with code unit size and/or
# link size. Unicode support is required for these tests. There are separate
# output files for each code unit size and link size.
#pattern fullbincode,memory
/((?i)b)/
/(?s)(.*X|^B)/
/(?s:.*X|^B)/
/^[[:alnum:]]/
/#/Ix
/a#/Ix
/x?+/
/x++/
/x{1,3}+/
/(x)*+/
/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/
"8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
"\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
/(a(?1)b)/
/(a(?1)+b)/
/a(?P<name1>b|c)d(?P<longername2>e)/
/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/
/(?P<a>a)...(?P=a)bbb(?P>a)d/
/abc(?C255)de(?C)f/
/abcde/auto_callout
/\x{100}/utf
/\x{1000}/utf
/\x{10000}/utf
/\x{100000}/utf
/\x{10ffff}/utf
/\x{110000}/utf
/[\x{ff}]/utf
/[\x{100}]/utf
/\x80/utf
/\xff/utf
/\x{0041}\x{2262}\x{0391}\x{002e}/I,utf
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
/\x{65e5}\x{672c}\x{8a9e}/I,utf
/[\x{100}]/utf
/[Z\x{100}]/utf
/^[\x{100}\E-\Q\E\x{150}]/utf
/^[\QĀ\E-\QŐ\E]/utf
/^[\QĀ\E-\QŐ\E/utf
/[\p{L}]/
/[\p{^L}]/
/[\P{L}]/
/[\P{^L}]/
/[abc\p{L}\x{0660}]/utf
/[\p{Nd}]/utf
/[\p{Nd}+-]+/utf
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
/[\x{105}-\x{109}]/i,utf
/( ( (?(1)0|) )* )/x
/( (?(1)0|)* )/x
/[a]/
/[a]/utf
/[\xaa]/
/[\xaa]/utf
/[^a]/
/[^a]/utf
/[^\xaa]/
/[^\xaa]/utf
#pattern -memory
/[^\d]/utf,ucp
/[[:^alpha:][:^cntrl:]]+/utf,ucp
/[[:^cntrl:][:^alpha:]]+/utf,ucp
/[[:alpha:]]+/utf,ucp
/[[:^alpha:]\S]+/utf,ucp
/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/
/(((a\2)|(a*)\g<-1>))*a?/
/((?+1)(\1))/
"(?1)(?#?'){2}(a)"
/.((?2)(?R)|\1|$)()/
/.((?3)(?R)()(?2)|\1|$)()/
/(?1)()((((((\1++))\x85)+)|))/
# Check the absolute limit on nesting (?| etc. This varies with code unit
# width because the workspace is a different number of bytes. It will fail
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|

/parens_nest_limit=1000,-fullbincode
# Use "expand" to create some very long patterns with nested parentheses, in
# order to test workspace overflow. Again, this varies with code unit width,
# and even when it fails in two modes, the error offset differs. It also varies
# with link size - hence multiple tests with different values.
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
/(?(1)(?1)){8,}+()/debug
abcd
/(?(1)|a(?1)b){2,}+()/debug
abcde
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
fullbincode
#pattern -fullbincode
/\[()]{65535}/expand
# End of testinput8

284
3rd/pcre2/testdata/testinput9 vendored Normal file

@@ -0,0 +1,284 @@
# This set of tests is run only with the 8-bit library. They must not require
# UTF-8 or Unicode property support. */
#forbid_utf
#newline_default lf any anycrlf
/a\xc4\xa3b/
a\N{U+123}b
\= Expect no match # error message (too big char)
a\x{0123}b
a\o{00443}b
a\443b
/fd bf bf bf bf bf/I,hex
\= Expect warning
\N{U+7fffffff}
\= Expect no match # error message (too big char)
\x{7fffffff}
/\x{100}/I
/\o{400}/I
/ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # optional leading comment
(?: (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # initial word
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) )* # further okay, if led by a period
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
# address
| # or
(?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # one word, optionally followed by....
(?:
[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] | # atom and space parts, or...
\(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) | # comments, or...
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
# quoted strings
)*
< (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # leading <
(?: @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* , (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
)* # further okay, if led by comma
: # closing colon
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* )? # optional route
(?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # initial word
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) )* # further okay, if led by a period
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
# address spec
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* > # trailing >
# name and address
) (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # optional trailing comment
/Ix
/\h/I
/\H/I
/\v/I
/\V/I
/\R/I
/[\h]/B
>\x09<
/[\h]+/B
>\x09\x20\xa0<
/[\v]/B
/[\H]/B
/[^\h]/B
/[\V]/B
/[\x0a\V]/B
/\777/I
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark
XX
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark,alt_verbnames
XX
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark
XX
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark,alt_verbnames
XX
/\u0100/alt_bsux,allow_empty_class,match_unset_backref,dupnames
/[\u0100-\u0200]/alt_bsux,allow_empty_class,match_unset_backref,dupnames
/[^\x00-a]{12,}[^b-\xff]*/B
/[^\s]*\s* [^\W]+\W+ [^\d]*?\d0 [^\d\w]{4,6}?\w*A/B
/(*MARK:a\x{100}b)z/alt_verbnames
/(*:*++++++++++++''''''''''''''''''''+''+++'+++x+++++++++++++++++++++++++++++++++++(++++++++++++++++++++:++++++%++:''''''''''''''''''''''''+++++++++++++++++++++++++++++++++++++++++++++++++++++-++++++++k+++++++''''+++'+++++++++++++++++++++++''''++++++++++++':ƿ)/
/(?i:A{1,}\6666666666)/
A\x{1b6}6666666
# Should cause an error
/abc/substitute_extended,replace=>\777<
abc
# Should cause an error
/abc/substitute_extended,replace=>\o{012345}<
abc
/i/turkish_casing
# End of testinput9

137
3rd/pcre2/testdata/testinputEBC vendored Normal file

@@ -0,0 +1,137 @@
# This is a specialized test for checking, when PCRE2 is compiled with the
# EBCDIC option but in an ASCII environment, that newline, white space, and \c
# functionality is working. It catches cases where explicit values such as 0x0a
# have been used instead of names like CHAR_LF. Needless to say, it is not a
# genuine EBCDIC test! In patterns, alphabetic characters that follow a
# backslash must be in EBCDIC code. In data, NL, NEL, LF, ESC, and DEL must be
# in EBCDIC, but can of course be specified as escapes.
# Test default newline and variations
/^A/m
ABC
12\x15ABC
/^A/m,newline=any
12\x15ABC
12\x0dABC
12\x0d\x15ABC
12\x25ABC
/^A/m,newline=anycrlf
12\x15ABC
12\x0dABC
12\x0d\x15ABC
** Fail
12\x25ABC
# Test \h
/^A\<5C>/
A B
A\x41B
# Test \H
/^A\<5C>/
AB
A\x42B
** Fail
A B
A\x41B
# Test \R
/^A\<5C>/
A\x15B
A\x0dB
A\x25B
A\x0bB
A\x0cB
** Fail
A B
# Test \v
/^A\<5C>/
A\x15B
A\x0dB
A\x25B
A\x0bB
A\x0cB
** Fail
A B
# Test \V
/^A\<5C>/
A B
** Fail
A\x15B
A\x0dB
A\x25B
A\x0bB
A\x0cB
# For repeated items, use an atomic group so that the output is the same
# for DFA matching (otherwise it may show multiple matches).
# Test \h+
/^A(?>\<5C>+)/
A B
# Test \H+
/^A(?>\<5C>+)/
AB
** Fail
A B
# Test \R+
/^A(?>\<5C>+)/
A\x15B
A\x0dB
A\x25B
A\x0bB
A\x0cB
** Fail
A B
# Test \v+
/^A(?>\<5C>+)/
A\x15B
A\x0dB
A\x25B
A\x0bB
A\x0cB
** Fail
A B
# Test \V+
/^A(?>\<5C>+)/
A B
** Fail
A\x15B
A\x0dB
A\x25B
A\x0bB
A\x0cB
# Test \c functionality
/\<5C>@\<5C>A\<5C>b\<5C>C\<5C>d\<5C>E\<5C>f\<5C>G\<5C>h\<5C>I\<5C>J\<5C>K\<5C>l\<5C>m\<5C>N\<5C>O\<5C>p\<5C>q\<5C>r\<5C>S\<5C>T\<5C>u\<5C>V\<5C>W\<5C>X\<5C>y\<5C>Z/
\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f
/\<5C>[\<5C>\\<5C>]\<5C>^\<5C>_/
\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f
/\<5C>?/
A\xffB
/\<5C>&/
# End

35
3rd/pcre2/testdata/testinputheap vendored Normal file

@@ -0,0 +1,35 @@
#pattern framesize, memory
/abcd/
abcd\=memory
abcd\=find_limits
/(((((((((((((((((((((((((((((( (^abc|xyz){1,20}$ ))))))))))))))))))))))))))))))/x
abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcX\=memory
abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcX\=find_limits
/ab(cd)/
abcd\=memory
abcd\=memory,ovector=0
/\[(a)]{1000}/expand,framesize
\[a]{1000}\=ovector=1
# The heapframes_size option gets pcre2test to show the size of the heapframes
# vector that after pcre2_match() has run. Running a match with ovector=0
# causes the match data block to be freed, thus releasing that vector.
/\[(a)]{1000}/expand,framesize
\[a]{1000}\=ovector=1,heapframes_size
/a/heapframes_size,framesize
a\=ovector=0
/a|(b){200}/g,expand,heapframes_size
abacus z\[b]{200}z
a\=ovector=0
/(a)/replace=>$1<
cat\=heapframes_size
# End

11125
3rd/pcre2/testdata/testoutput1 vendored Normal file

File diff suppressed because it is too large Load Diff

2017
3rd/pcre2/testdata/testoutput10 vendored Normal file

@@ -0,0 +1,2017 @@
# This set of tests is for UTF-8 support and Unicode property support, with
# relevance only for the 8-bit library.
#newline_default lf any anycrlf
# The next 5 patterns have UTF-8 errors
/[<5B>]/utf
Failed: error -8 at offset 1: UTF-8 error: byte 2 top bits not 0x80
/<2F>/utf
Failed: error -3 at offset 0: UTF-8 error: 1 byte missing at end
/<2F><><EFBFBD>xxx/utf
Failed: error -8 at offset 0: UTF-8 error: byte 2 top bits not 0x80
<><C382><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>/utf
Failed: error -22 at offset 2: UTF-8 error: isolated byte with 0x80 bit set
<><C382><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>/match_invalid_utf
Failed: error -22 at offset 2: UTF-8 error: isolated byte with 0x80 bit set
# Now test subjects
/badutf/utf
\= Expect UTF-8 errors
X\xdf
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 1
XX\xef
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
XXX\xef\x80
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 3
X\xf7
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 1
XX\xf7\x80
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
XXX\xf7\x80\x80
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 3
\xfb
Failed: error -6: UTF-8 error: 4 bytes missing at end at offset 0
\xfb\x80
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 0
\xfb\x80\x80
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 0
\xfb\x80\x80\x80
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 0
\xfd
Failed: error -7: UTF-8 error: 5 bytes missing at end at offset 0
\xfd\x80
Failed: error -6: UTF-8 error: 4 bytes missing at end at offset 0
\xfd\x80\x80
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 0
\xfd\x80\x80\x80
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 0
\xfd\x80\x80\x80\x80
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 0
\xdf\x7f
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 0
\xef\x7f\x80
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 0
\xef\x80\x7f
Failed: error -9: UTF-8 error: byte 3 top bits not 0x80 at offset 0
\xf7\x7f\x80\x80
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 0
\xf7\x80\x7f\x80
Failed: error -9: UTF-8 error: byte 3 top bits not 0x80 at offset 0
\xf7\x80\x80\x7f
Failed: error -10: UTF-8 error: byte 4 top bits not 0x80 at offset 0
\xfb\x7f\x80\x80\x80
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 0
\xfb\x80\x7f\x80\x80
Failed: error -9: UTF-8 error: byte 3 top bits not 0x80 at offset 0
\xfb\x80\x80\x7f\x80
Failed: error -10: UTF-8 error: byte 4 top bits not 0x80 at offset 0
\xfb\x80\x80\x80\x7f
Failed: error -11: UTF-8 error: byte 5 top bits not 0x80 at offset 0
\xfd\x7f\x80\x80\x80\x80
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 0
\xfd\x80\x7f\x80\x80\x80
Failed: error -9: UTF-8 error: byte 3 top bits not 0x80 at offset 0
\xfd\x80\x80\x7f\x80\x80
Failed: error -10: UTF-8 error: byte 4 top bits not 0x80 at offset 0
\xfd\x80\x80\x80\x7f\x80
Failed: error -11: UTF-8 error: byte 5 top bits not 0x80 at offset 0
\xfd\x80\x80\x80\x80\x7f
Failed: error -12: UTF-8 error: byte 6 top bits not 0x80 at offset 0
\xed\xa0\x80
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 0
\xc0\x8f
Failed: error -17: UTF-8 error: overlong 2-byte sequence at offset 0
\xe0\x80\x8f
Failed: error -18: UTF-8 error: overlong 3-byte sequence at offset 0
\xf0\x80\x80\x8f
Failed: error -19: UTF-8 error: overlong 4-byte sequence at offset 0
\xf8\x80\x80\x80\x8f
Failed: error -20: UTF-8 error: overlong 5-byte sequence at offset 0
\xfc\x80\x80\x80\x80\x8f
Failed: error -21: UTF-8 error: overlong 6-byte sequence at offset 0
\x80
Failed: error -22: UTF-8 error: isolated byte with 0x80 bit set at offset 0
\xfe
Failed: error -23: UTF-8 error: illegal byte (0xfe or 0xff) at offset 0
\xff
Failed: error -23: UTF-8 error: illegal byte (0xfe or 0xff) at offset 0
/badutf/utf
\= Expect UTF-8 errors
XX\xfb\x80\x80\x80\x80
Failed: error -13: UTF-8 error: 5-byte character is not allowed (RFC 3629) at offset 2
XX\xfd\x80\x80\x80\x80\x80
Failed: error -14: UTF-8 error: 6-byte character is not allowed (RFC 3629) at offset 2
XX\xf7\xbf\xbf\xbf
Failed: error -15: UTF-8 error: code points greater than 0x10ffff are not defined at offset 2
/shortutf/utf
\= Expect UTF-8 errors
XX\xdf\=ph
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 2
XX\xef\=ph
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
XX\xef\x80\=ph
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 2
\xf7\=ph
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 0
\xf7\x80\=ph
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 0
\xf7\x80\x80\=ph
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 0
\xfb\=ph
Failed: error -6: UTF-8 error: 4 bytes missing at end at offset 0
\xfb\x80\=ph
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 0
\xfb\x80\x80\=ph
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 0
\xfb\x80\x80\x80\=ph
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 0
\xfd\=ph
Failed: error -7: UTF-8 error: 5 bytes missing at end at offset 0
\xfd\x80\=ph
Failed: error -6: UTF-8 error: 4 bytes missing at end at offset 0
\xfd\x80\x80\=ph
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 0
\xfd\x80\x80\x80\=ph
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 0
\xfd\x80\x80\x80\x80\=ph
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 0
/anything/utf
\= Expect UTF-8 errors
X\xc0\x80
Failed: error -17: UTF-8 error: overlong 2-byte sequence at offset 1
XX\xc1\x8f
Failed: error -17: UTF-8 error: overlong 2-byte sequence at offset 2
XXX\xe0\x9f\x80
Failed: error -18: UTF-8 error: overlong 3-byte sequence at offset 3
\xf0\x8f\x80\x80
Failed: error -19: UTF-8 error: overlong 4-byte sequence at offset 0
\xf8\x87\x80\x80\x80
Failed: error -20: UTF-8 error: overlong 5-byte sequence at offset 0
\xfc\x83\x80\x80\x80\x80
Failed: error -21: UTF-8 error: overlong 6-byte sequence at offset 0
\xfe\x80\x80\x80\x80\x80
Failed: error -23: UTF-8 error: illegal byte (0xfe or 0xff) at offset 0
\xff\x80\x80\x80\x80\x80
Failed: error -23: UTF-8 error: illegal byte (0xfe or 0xff) at offset 0
\xf8\x88\x80\x80\x80
Failed: error -13: UTF-8 error: 5-byte character is not allowed (RFC 3629) at offset 0
\xf9\x87\x80\x80\x80
Failed: error -13: UTF-8 error: 5-byte character is not allowed (RFC 3629) at offset 0
\xfc\x84\x80\x80\x80\x80
Failed: error -14: UTF-8 error: 6-byte character is not allowed (RFC 3629) at offset 0
\xfd\x83\x80\x80\x80\x80
Failed: error -14: UTF-8 error: 6-byte character is not allowed (RFC 3629) at offset 0
\= Expect no match
\xc3\x8f
No match
\xe0\xaf\x80
No match
\xe1\x80\x80
No match
\xf0\x9f\x80\x80
No match
\xf1\x8f\x80\x80
No match
\xf8\x88\x80\x80\x80\=no_utf_check
No match
\xf9\x87\x80\x80\x80\=no_utf_check
No match
\xfc\x84\x80\x80\x80\x80\=no_utf_check
No match
\xfd\x83\x80\x80\x80\x80\=no_utf_check
No match
# Similar tests with offsets
/badutf/utf
\= Expect UTF-8 errors
X\xdfabcd
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
X\xdfabcd\=offset=1
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
\= Expect no match
X\xdfabcd\=offset=2
No match
/(?<=x)badutf/utf
\= Expect UTF-8 errors
X\xdfabcd
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
X\xdfabcd\=offset=1
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
X\xdfabcd\=offset=2
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
X\xdfabcd\xdf\=offset=3
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 6
\= Expect no match
X\xdfabcd\=offset=3
No match
/(?<=xx)badutf/utf
\= Expect UTF-8 errors
X\xdfabcd
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
X\xdfabcd\=offset=1
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
X\xdfabcd\=offset=2
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
X\xdfabcd\=offset=3
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
/(?<=xxxx)badutf/utf
\= Expect UTF-8 errors
X\xdfabcd
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
X\xdfabcd\=offset=1
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
X\xdfabcd\=offset=2
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
X\xdfabcd\=offset=3
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
X\xdfabc\xdf\=offset=6
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 5
X\xdfabc\xdf\=offset=7
Failed: error -33: bad offset value
\= Expect no match
X\xdfabcd\=offset=6
No match
/\x{100}/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc4
Last code unit = \x80
Subject length lower bound = 1
/\x{1000}/IB,utf
------------------------------------------------------------------
Bra
\x{1000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xe1
Last code unit = \x80
Subject length lower bound = 1
/\x{10000}/IB,utf
------------------------------------------------------------------
Bra
\x{10000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xf0
Last code unit = \x80
Subject length lower bound = 1
/\x{100000}/IB,utf
------------------------------------------------------------------
Bra
\x{100000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xf4
Last code unit = \x80
Subject length lower bound = 1
/\x{10ffff}/IB,utf
------------------------------------------------------------------
Bra
\x{10ffff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xf4
Last code unit = \xbf
Subject length lower bound = 1
/[\x{ff}]/IB,utf
------------------------------------------------------------------
Bra
\x{ff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc3
Last code unit = \xbf
Subject length lower bound = 1
/[\x{100}]/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc4
Last code unit = \x80
Subject length lower bound = 1
/\x80/IB,utf
------------------------------------------------------------------
Bra
\x{80}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc2
Last code unit = \x80
Subject length lower bound = 1
/\xff/IB,utf
------------------------------------------------------------------
Bra
\x{ff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc3
Last code unit = \xbf
Subject length lower bound = 1
/\x{D55c}\x{ad6d}\x{C5B4}/IB,utf
------------------------------------------------------------------
Bra
\x{d55c}\x{ad6d}\x{c5b4}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xed
Last code unit = \xb4
Subject length lower bound = 3
\x{D55c}\x{ad6d}\x{C5B4}
0: \x{d55c}\x{ad6d}\x{c5b4}
/\x{65e5}\x{672c}\x{8a9e}/IB,utf
------------------------------------------------------------------
Bra
\x{65e5}\x{672c}\x{8a9e}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xe6
Last code unit = \x9e
Subject length lower bound = 3
\x{65e5}\x{672c}\x{8a9e}
0: \x{65e5}\x{672c}\x{8a9e}
/\x{80}/IB,utf
------------------------------------------------------------------
Bra
\x{80}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc2
Last code unit = \x80
Subject length lower bound = 1
/\x{084}/IB,utf
------------------------------------------------------------------
Bra
\x{84}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc2
Last code unit = \x84
Subject length lower bound = 1
/\x{104}/IB,utf
------------------------------------------------------------------
Bra
\x{104}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc4
Last code unit = \x84
Subject length lower bound = 1
/\x{861}/IB,utf
------------------------------------------------------------------
Bra
\x{861}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xe0
Last code unit = \xa1
Subject length lower bound = 1
/\x{212ab}/IB,utf
------------------------------------------------------------------
Bra
\x{212ab}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xf0
Last code unit = \xab
Subject length lower bound = 1
/[^ab\xC0-\xF0]/IB,utf
------------------------------------------------------------------
Bra
[^ab\xc0-\xf0]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
\xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0
\xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf
\xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee
\xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd
\xfe \xff
Subject length lower bound = 1
\x{f1}
0: \x{f1}
\x{bf}
0: \x{bf}
\x{100}
0: \x{100}
\x{1000}
0: \x{1000}
\= Expect no match
\x{c0}
No match
\x{f0}
No match
/(\x{100}+|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}++
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: x \xc4
Subject length lower bound = 1
/(\x{100}*a|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}*+
a
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: a x \xc4
Subject length lower bound = 1
/(\x{100}{0,2}a|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}{0,2}+
a
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: a x \xc4
Subject length lower bound = 1
/(\x{100}{1,2}a|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}
\x{100}{0,1}+
a
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: x \xc4
Subject length lower bound = 1
/\x{100}/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc4
Last code unit = \x80
Subject length lower bound = 1
/a\x{100}\x{101}*/IB,utf
------------------------------------------------------------------
Bra
a\x{100}
\x{101}*+
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'a'
Last code unit = \x80
Subject length lower bound = 2
/a\x{100}\x{101}+/IB,utf
------------------------------------------------------------------
Bra
a\x{100}
\x{101}++
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'a'
Last code unit = \x81
Subject length lower bound = 3
/[^\x{c4}]/IB
------------------------------------------------------------------
Bra
[^\x{c4}] (not)
Ket
End
------------------------------------------------------------------
Capture group count = 0
Subject length lower bound = 1
/[\x{100}]/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc4
Last code unit = \x80
Subject length lower bound = 1
\x{100}
0: \x{100}
Z\x{100}
0: \x{100}
\x{100}Z
0: \x{100}
/[\xff]/IB,utf
------------------------------------------------------------------
Bra
\x{ff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc3
Last code unit = \xbf
Subject length lower bound = 1
>\x{ff}<
0: \x{ff}
/[^\xff]/IB,utf
------------------------------------------------------------------
Bra
[^\x{ff}] (not)
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Subject length lower bound = 1
/\x{100}abc(xyz(?1))/IB,utf
------------------------------------------------------------------
Bra
\x{100}abc
CBra 1
xyz
Recurse
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
First code unit = \xc4
Last code unit = 'z'
Subject length lower bound = 7
/\777/I,utf
Capture group count = 0
Options: utf
First code unit = \xc7
Last code unit = \xbf
Subject length lower bound = 1
\x{1ff}
0: \x{1ff}
\777
0: \x{1ff}
/\x{100}+\x{200}/IB,utf
------------------------------------------------------------------
Bra
\x{100}++
\x{200}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc4
Last code unit = \x80
Subject length lower bound = 2
/\x{100}+X/IB,utf
------------------------------------------------------------------
Bra
\x{100}++
X
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc4
Last code unit = 'X'
Subject length lower bound = 2
/^[\QĀ\E-\QŐ\E/B,utf
Failed: error 106 at offset 15: missing terminating ] for character class
# This tests the stricter UTF-8 check according to RFC 3629.
/X/utf
\= Expect UTF-8 errors
\x{d800}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 0
\x{da00}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 0
\x{dfff}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 0
\x{110000}
Failed: error -15: UTF-8 error: code points greater than 0x10ffff are not defined at offset 0
\x{2000000}
Failed: error -13: UTF-8 error: 5-byte character is not allowed (RFC 3629) at offset 0
\x{7fffffff}
Failed: error -14: UTF-8 error: 6-byte character is not allowed (RFC 3629) at offset 0
\= Expect no match
\x{d800}\=no_utf_check
No match
\x{da00}\=no_utf_check
No match
\x{dfff}\=no_utf_check
No match
\x{110000}\=no_utf_check
No match
\x{2000000}\=no_utf_check
No match
\x{7fffffff}\=no_utf_check
No match
/(*UTF8)\x{1234}/
abcd\x{1234}pqr
0: \x{1234}
/(*CRLF)(*UTF)(*BSR_UNICODE)a\Rb/I
Capture group count = 0
Compile options: <none>
Overall options: utf
\R matches any Unicode newline
Forced newline is CRLF
First code unit = 'a'
Last code unit = 'b'
Subject length lower bound = 3
/\h/I,utf
Capture group count = 0
Options: utf
Starting code units: \x09 \x20 \xc2 \xe1 \xe2 \xe3
Subject length lower bound = 1
ABC\x{09}
0: \x{09}
ABC\x{20}
0:
ABC\x{a0}
0: \x{a0}
ABC\x{1680}
0: \x{1680}
ABC\x{180e}
0: \x{180e}
ABC\x{2000}
0: \x{2000}
ABC\x{202f}
0: \x{202f}
ABC\x{205f}
0: \x{205f}
ABC\x{3000}
0: \x{3000}
/\v/I,utf
Capture group count = 0
Options: utf
Starting code units: \x0a \x0b \x0c \x0d \xc2 \xe2
Subject length lower bound = 1
ABC\x{0a}
0: \x{0a}
ABC\x{0b}
0: \x{0b}
ABC\x{0c}
0: \x{0c}
ABC\x{0d}
0: \x{0d}
ABC\x{85}
0: \x{85}
ABC\x{2028}
0: \x{2028}
/\h*A/I,utf
Capture group count = 0
Options: utf
Starting code units: \x09 \x20 A \xc2 \xe1 \xe2 \xe3
Last code unit = 'A'
Subject length lower bound = 1
CDBABC
0: A
/\v+A/I,utf
Capture group count = 0
Options: utf
Starting code units: \x0a \x0b \x0c \x0d \xc2 \xe2
Last code unit = 'A'
Subject length lower bound = 2
/\s?xxx\s/I,utf
Capture group count = 0
Options: utf
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 x
Last code unit = 'x'
Subject length lower bound = 4
/\sxxx\s/I,utf,tables=2
Capture group count = 0
Options: utf
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xc2
Last code unit = 'x'
Subject length lower bound = 5
AB\x{85}xxx\x{a0}XYZ
0: \x{85}xxx\x{a0}
AB\x{a0}xxx\x{85}XYZ
0: \x{a0}xxx\x{85}
/\S \S/I,utf,tables=2
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
\x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C
D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h
i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3 \xc4
\xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3
\xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 \xe2
\xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 \xf1
\xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
Last code unit = ' '
Subject length lower bound = 3
\x{a2} \x{84}
0: \x{a2} \x{84}
A Z
0: A Z
/a+/utf
a\x{123}aa\=offset=1
0: aa
a\x{123}aa\=offset=3
0: aa
a\x{123}aa\=offset=4
0: a
\= Expect bad offset value
a\x{123}aa\=offset=6
Failed: error -33: bad offset value
\= Expect bad UTF-8 offset
a\x{123}aa\=offset=2
Error -36 (bad UTF-8 offset)
\= Expect no match
a\x{123}aa\=offset=5
No match
/\x{1234}+/Ii,utf
Capture group count = 0
Options: caseless utf
Starting code units: \xe1
Subject length lower bound = 1
/\x{1234}+?/Ii,utf
Capture group count = 0
Options: caseless utf
Starting code units: \xe1
Subject length lower bound = 1
/\x{1234}++/Ii,utf
Capture group count = 0
Options: caseless utf
Starting code units: \xe1
Subject length lower bound = 1
/\x{1234}{2}/Ii,utf
Capture group count = 0
Options: caseless utf
Starting code units: \xe1
Subject length lower bound = 2
/[^\x{c4}]/IB,utf
------------------------------------------------------------------
Bra
[^\x{c4}] (not)
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Subject length lower bound = 1
/X+\x{200}/IB,utf
------------------------------------------------------------------
Bra
X++
\x{200}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'X'
Last code unit = \x80
Subject length lower bound = 2
/\R/I,utf
Capture group count = 0
Options: utf
Starting code units: \x0a \x0b \x0c \x0d \xc2 \xe2
Subject length lower bound = 1
/\777/IB,utf
------------------------------------------------------------------
Bra
\x{1ff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xc7
Last code unit = \xbf
Subject length lower bound = 1
/\w+\x{C4}/B,utf
------------------------------------------------------------------
Bra
\w++
\x{c4}
Ket
End
------------------------------------------------------------------
a\x{C4}\x{C4}
0: a\x{c4}
/\w+\x{C4}/B,utf,tables=2
------------------------------------------------------------------
Bra
\w+
\x{c4}
Ket
End
------------------------------------------------------------------
a\x{C4}\x{C4}
0: a\x{c4}\x{c4}
/\W+\x{C4}/B,utf
------------------------------------------------------------------
Bra
\W+
\x{c4}
Ket
End
------------------------------------------------------------------
!\x{C4}
0: !\x{c4}
/\W+\x{C4}/B,utf,tables=2
------------------------------------------------------------------
Bra
\W++
\x{c4}
Ket
End
------------------------------------------------------------------
!\x{C4}
0: !\x{c4}
/\W+\x{A1}/B,utf
------------------------------------------------------------------
Bra
\W+
\x{a1}
Ket
End
------------------------------------------------------------------
!\x{A1}
0: !\x{a1}
/\W+\x{A1}/B,utf,tables=2
------------------------------------------------------------------
Bra
\W+
\x{a1}
Ket
End
------------------------------------------------------------------
!\x{A1}
0: !\x{a1}
/X\s+\x{A0}/B,utf
------------------------------------------------------------------
Bra
X
\s++
\x{a0}
Ket
End
------------------------------------------------------------------
X\x20\x{A0}\x{A0}
0: X \x{a0}
/X\s+\x{A0}/B,utf,tables=2
------------------------------------------------------------------
Bra
X
\s+
\x{a0}
Ket
End
------------------------------------------------------------------
X\x20\x{A0}\x{A0}
0: X \x{a0}\x{a0}
/\S+\x{A0}/B,utf
------------------------------------------------------------------
Bra
\S+
\x{a0}
Ket
End
------------------------------------------------------------------
X\x{A0}\x{A0}
0: X\x{a0}\x{a0}
/\S+\x{A0}/B,utf,tables=2
------------------------------------------------------------------
Bra
\S++
\x{a0}
Ket
End
------------------------------------------------------------------
X\x{A0}\x{A0}
0: X\x{a0}
/\x{a0}+\s!/B,utf
------------------------------------------------------------------
Bra
\x{a0}++
\s
!
Ket
End
------------------------------------------------------------------
\x{a0}\x20!
0: \x{a0} !
/\x{a0}+\s!/B,utf,tables=2
------------------------------------------------------------------
Bra
\x{a0}+
\s
!
Ket
End
------------------------------------------------------------------
\x{a0}\x20!
0: \x{a0} !
/A/utf
\x{ff000041}
** Character \N{U+ff000041} is greater than 0x7fffffff and therefore cannot be encoded as UTF-8
\x{7f000041}
Failed: error -14: UTF-8 error: 6-byte character is not allowed (RFC 3629) at offset 0
/(*UTF8)abc/never_utf
Failed: error 174 at offset 7: using UTF is disabled by the application
/abc/utf,never_utf
Failed: error 174 at offset 0: using UTF is disabled by the application
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/IBi,utf
------------------------------------------------------------------
Bra
/i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
First code unit = 'A' (caseless)
Subject length lower bound = 5
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/IB,utf
------------------------------------------------------------------
Bra
A\x{391}\x{10427}\x{ff3a}\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = \xb0
Subject length lower bound = 5
/AB\x{1fb0}/IB,utf
------------------------------------------------------------------
Bra
AB\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = \xb0
Subject length lower bound = 3
/AB\x{1fb0}/IBi,utf
------------------------------------------------------------------
Bra
/i AB\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
First code unit = 'A' (caseless)
Last code unit = 'B' (caseless)
Subject length lower bound = 3
/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/Ii,utf
Capture group count = 0
Options: caseless utf
Starting code units: \xd0 \xd1
Subject length lower bound = 17
\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
\x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
0: \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
/[ⱥ]/Bi,utf
------------------------------------------------------------------
Bra
/i \x{2c65}
Ket
End
------------------------------------------------------------------
/[^ⱥ]/Bi,utf
------------------------------------------------------------------
Bra
/i [^\x{2c65}] (not)
Ket
End
------------------------------------------------------------------
/\h/I
Capture group count = 0
Starting code units: \x09 \x20 \xa0
Subject length lower bound = 1
/\v/I
Capture group count = 0
Starting code units: \x0a \x0b \x0c \x0d \x85
Subject length lower bound = 1
/\R/I
Capture group count = 0
Starting code units: \x0a \x0b \x0c \x0d \x85
Subject length lower bound = 1
/[[:blank:]]/B,ucp
------------------------------------------------------------------
Bra
[\x09 \xa0]
Ket
End
------------------------------------------------------------------
/\x{212a}+/Ii,utf
Capture group count = 0
Options: caseless utf
Starting code units: K k \xe2
Subject length lower bound = 1
KKkk\x{212a}
0: KKkk\x{212a}
/s+/Ii,utf
Capture group count = 0
Options: caseless utf
Starting code units: S s \xc5
Subject length lower bound = 1
SSss\x{17f}
0: SSss\x{17f}
/\x{100}*A/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
A
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: A \xc4
Last code unit = 'A'
Subject length lower bound = 1
A
0: A
/\x{100}*\d(?R)/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\d
Recurse
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xc4
Subject length lower bound = 1
/[Z\x{100}]/IB,utf
------------------------------------------------------------------
Bra
[Z\x{100}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: Z \xc4
Subject length lower bound = 1
Z\x{100}
0: Z
\x{100}
0: \x{100}
\x{100}Z
0: \x{100}
/[z-\x{100}]/IB,utf
------------------------------------------------------------------
Bra
[z-\xff\x{100}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: z { | } ~ \x7f \xc2 \xc3 \xc4
Subject length lower bound = 1
/[z\Qa-d]Ā\E]/IB,utf
------------------------------------------------------------------
Bra
[\-\]adz\x{100}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: - ] a d z \xc4
Subject length lower bound = 1
\x{100}
0: \x{100}
Ā
0: \x{100}
/[ab\x{100}]abc(xyz(?1))/IB,utf
------------------------------------------------------------------
Bra
[ab\x{100}]
abc
CBra 1
xyz
Recurse
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: a b \xc4
Last code unit = 'z'
Subject length lower bound = 7
/\x{100}*\s/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\s
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xc4
Subject length lower bound = 1
/\x{100}*\d/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\d
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xc4
Subject length lower bound = 1
/\x{100}*\w/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\w
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
\xc4
Subject length lower bound = 1
/\x{100}*\D/IB,utf
------------------------------------------------------------------
Bra
\x{100}*
\D
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / : ; < = >
? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2
\xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1
\xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0
\xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef
\xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe
\xff
Subject length lower bound = 1
/\x{100}*\S/IB,utf
------------------------------------------------------------------
Bra
\x{100}*
\S
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
\x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C
D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h
i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3 \xc4
\xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3
\xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 \xe2
\xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 \xf1
\xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
/\x{100}*\W/IB,utf
------------------------------------------------------------------
Bra
\x{100}*
\W
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / : ; < = >
? @ [ \ ] ^ ` { | } ~ \x7f \xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9
\xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8
\xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7
\xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6
\xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
/[\x{105}-\x{109}]/IBi,utf
------------------------------------------------------------------
Bra
[\x{104}-\x{109}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: \xc4
Subject length lower bound = 1
\x{104}
0: \x{104}
\x{105}
0: \x{105}
\x{109}
0: \x{109}
\= Expect no match
\x{100}
No match
\x{10a}
No match
/[z-\x{100}]/IBi,utf
------------------------------------------------------------------
Bra
[Zz-\xff\x{100}-\x{101}\x{178}\x{39c}\x{3bc}\x{1e9e}\x{212b}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: Z z { | } ~ \x7f \xc2 \xc3 \xc4 \xc5 \xce \xe1 \xe2
Subject length lower bound = 1
Z
0: Z
z
0: z
\x{39c}
0: \x{39c}
\x{178}
0: \x{178}
|
0: |
\x{80}
0: \x{80}
\x{ff}
0: \x{ff}
\x{100}
0: \x{100}
\x{101}
0: \x{101}
\= Expect no match
\x{102}
No match
Y
No match
y
No match
/[z-\x{100}]/IBi,utf
------------------------------------------------------------------
Bra
[Zz-\xff\x{100}-\x{101}\x{178}\x{39c}\x{3bc}\x{1e9e}\x{212b}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: Z z { | } ~ \x7f \xc2 \xc3 \xc4 \xc5 \xce \xe1 \xe2
Subject length lower bound = 1
/\x{3a3}B/IBi,utf
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3
/i B
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: \xce \xcf
Last code unit = 'B' (caseless)
Subject length lower bound = 2
/abc/utf,replace=<3D>
abc
Failed: error -3: UTF-8 error: 1 byte missing at end
/(?<=(a)(?-1))x/I,utf
Capture group count = 1
Max lookbehind = 2
Options: utf
First code unit = 'x'
Subject length lower bound = 1
a\x80zx\=offset=3
Failed: error -22: UTF-8 error: isolated byte with 0x80 bit set at offset 1
/[\W\p{Any}]/B
------------------------------------------------------------------
Bra
AllAny
Ket
End
------------------------------------------------------------------
abc
0: a
123
0: 1
/[\W\pL]/B
------------------------------------------------------------------
Bra
[\x00-/:-^`-\xff\p{L}]
Ket
End
------------------------------------------------------------------
abc
0: a
\= Expect no match
123
No match
/(*:*++++++++++++''''''''''''''''''''+''+++'+++x+++++++++++++++++++++++++++++++++++(++++++++++++++++++++:++++++%++:''''''''''''''''''''''''+++++++++++++++++++++++++++++++++++++++++++++++++++++-++++++++k+++++++''''+++'+++++++++++++++++++++++''''++++++++++++':ƿ)/utf
Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)
/[\s[:^ascii:]]/B,ucp
------------------------------------------------------------------
Bra
[\x09-\x0d \x80-\xff\p{Xsp}]
Ket
End
------------------------------------------------------------------
# A special extra option allows excaped surrogate code points in 8-bit mode,
# but subjects containing them must not be UTF-checked.
/\x{d800}/I,utf,allow_surrogate_escapes
Capture group count = 0
Options: utf
Extra options: allow_surrogate_escapes
First code unit = \xed
Last code unit = \x80
Subject length lower bound = 1
\x{d800}\=no_utf_check
0: \x{d800}
/\udfff\o{157401}/utf,alt_bsux,allow_surrogate_escapes
\x{dfff}\x{df01}\=no_utf_check
0: \x{dfff}\x{df01}
# This has different starting code units in 8-bit mode.
/^[^ab]/IB,utf
------------------------------------------------------------------
Bra
^
[^ab]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Compile options: utf
Overall options: anchored utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
\xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0
\xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf
\xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee
\xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd
\xfe \xff
Subject length lower bound = 1
c
0: c
\x{ff}
0: \x{ff}
\x{100}
0: \x{100}
\= Expect no match
aaa
No match
# Offsets are different in 8-bit mode.
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
123abcáyzabcdef789abcሴqr
1(2) Old 6 6 "" New 6 8 "<>"
2(2) Old 13 13 "" New 15 17 "<>"
3(2) Old 13 16 "def" New 17 22 "<def>"
4(2) Old 22 22 "" New 28 30 "<>"
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
# Check name length with non-ASCII characters
/(?'ABáC678901234567890123456789012012345678901234567890123456789AB012345678901234567890123456789AB012345678901234567890123456789AB'...)/utf
/(?'ABáC6789012345678901234567890123012345678901234567890123456789AB012345678901234567890123456789AB012345678901234567890123456789AB'...)/utf
Failed: error 148 at offset 132: subpattern name is too long (maximum 128 code units)
/(?'ABZC6789012345678901234567890123012345678901234567890123456789AB012345678901234567890123456789AB012345678901234567890123456789AB'...)/utf
/(?(n/utf
Failed: error 142 at offset 4: syntax error in subpattern name (missing terminator?)
/(?(á/utf
Failed: error 142 at offset 5: syntax error in subpattern name (missing terminator?)
# Invalid UTF-8 tests
/.../g,match_invalid_utf
abcd\x80wxzy\x80pqrs
0: abc
0: wxz
0: pqr
abcd\x{80}wxzy\x80pqrs
0: abc
0: d\x{80}w
0: xzy
0: pqr
/abc/match_invalid_utf
ab\x80ab\=ph
Partial match: ab
\= Expect no match
ab\x80cdef\=ph
No match
/.a/match_invalid_utf
ab\=ph
Partial match: b
ab\=ps
Partial match: b
b\xf0\x91\x88b\=ph
Partial match: b
b\xf0\x91\x88b\=ps
Partial match: b
b\xf0\x91\x88\xb4a
0: \x{11234}a
\= Expect no match
b\x80\=ph
No match
b\x80\=ps
No match
b\xf0\x91\x88\=ph
No match
b\xf0\x91\x88\=ps
No match
/.a$/match_invalid_utf
ab\=ph
Partial match: b
ab\=ps
Partial match: b
\= Expect no match
b\xf0\x91\x98\=ph
No match
b\xf0\x91\x98\=ps
No match
/ab$/match_invalid_utf
ab\x80cdeab
0: ab
\= Expect no match
ab\x80cde
No match
/.../g,match_invalid_utf
abcd\x{80}wxzy\x80pqrs
0: abc
0: d\x{80}w
0: xzy
0: pqr
/(?<=x)../g,match_invalid_utf
abcd\x{80}wxzy\x80pqrs
0: zy
abcd\x{80}wxzy\x80xpqrs
0: zy
0: pq
/X$/match_invalid_utf
\= Expect no match
X\xc4
No match
/(?<=..)X/match_invalid_utf,aftertext
AB\x80AQXYZ
0: X
0+ YZ
AB\x80AQXYZ\=offset=5
0: X
0+ YZ
AB\x80\x80AXYZXC\=offset=5
0: X
0+ C
\= Expect no match
AB\x80XYZ
No match
AB\x80XYZ\=offset=3
No match
AB\xfeXYZ
No match
AB\xffXYZ\=offset=3
No match
AB\x80AXYZ
No match
AB\x80AXYZ\=offset=4
No match
AB\x80\x80AXYZ\=offset=5
No match
/.../match_invalid_utf
AB\xc4CCC
0: CCC
\= Expect no match
A\x{d800}B
No match
A\x{110000}B
No match
A\xc4B
No match
/\bX/match_invalid_utf
A\x80X
0: X
/\BX/match_invalid_utf
\= Expect no match
A\x80X
No match
/(?<=...)X/match_invalid_utf
AAA\x80BBBXYZ
0: X
\= Expect no match
AAA\x80BXYZ
No match
AAA\x80BBXYZ
No match
# -------------------------------------
/(*UTF)(?=\x{123})/I
Capture group count = 0
May match empty string
Compile options: <none>
Overall options: utf
First code unit = \xc4
Last code unit = \xa3
Subject length lower bound = 1
/[\x{c1}\x{e1}]X[\x{145}\x{146}]/I,utf
Capture group count = 0
Options: utf
Starting code units: \xc3
Last code unit = 'X'
Subject length lower bound = 3
/[󿾟,]/BI,utf
------------------------------------------------------------------
Bra
[,\x{fff9f}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: , \xf3
Subject length lower bound = 1
/[\x{fff4}-\x{ffff8}]/I,utf
Capture group count = 0
Options: utf
Starting code units: \xef \xf0 \xf1 \xf2 \xf3
Subject length lower bound = 1
/[\x{fff4}-\x{afff8}\x{10ffff}]/I,utf
Capture group count = 0
Options: utf
Starting code units: \xef \xf0 \xf1 \xf2 \xf4
Subject length lower bound = 1
/[\xff\x{ffff}]/I,utf
Capture group count = 0
Options: utf
Starting code units: \xc3 \xef
Subject length lower bound = 1
/[\xff\x{ff}]/I,utf
Capture group count = 0
Options: utf
Starting code units: \xc3
Subject length lower bound = 1
abc\x{ff}def
0: \x{ff}
/[\xff\x{ff}]/I
Capture group count = 0
First code unit = \xff
Subject length lower bound = 1
abc\x{ff}def
0: \xff
/[Ss]/I
Capture group count = 0
First code unit = 'S' (caseless)
Subject length lower bound = 1
/[Ss]/I,utf
Capture group count = 0
Options: utf
Starting code units: S s
Subject length lower bound = 1
/(?:\x{ff}|\x{3000})/I,utf
Capture group count = 0
Options: utf
Starting code units: \xc3 \xe3
Subject length lower bound = 1
/x/utf
abxyz
0: x
\x80\=startchar
Failed: error -22: UTF-8 error: isolated byte with 0x80 bit set at offset 0
abc\x80\=startchar
Failed: error -22: UTF-8 error: isolated byte with 0x80 bit set at offset 3
abc\x80\=startchar,offset=3
Error -36 (bad UTF-8 offset)
/\x{c1}+\x{e1}/iIB,ucp
------------------------------------------------------------------
Bra
/i \x{c1}+
/i \x{e1}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless ucp
First code unit = \xc1 (caseless)
Last code unit = \xe1 (caseless)
Subject length lower bound = 2
\x{c1}\x{c1}\x{c1}
0: \xc1\xc1\xc1
\x{e1}\x{e1}\x{e1}
0: \xe1\xe1\xe1
/a|\x{c1}/iI,ucp
Capture group count = 0
Options: caseless ucp
Starting code units: A a \xc1 \xe1
Subject length lower bound = 1
\x{e1}xxx
0: \xe1
/a|\x{c1}/iI,utf
Capture group count = 0
Options: caseless utf
Starting code units: A a \xc3
Subject length lower bound = 1
\x{e1}xxx
0: \x{e1}
/\x{c1}|\x{e1}/iI,ucp
Capture group count = 0
Options: caseless ucp
First code unit = \xc1 (caseless)
Subject length lower bound = 1
/X(\x{e1})Y/ucp,replace=>\U$1<,substitute_extended
X\x{e1}Y
1: >\xc1<
/X(\x{e1})Y/i,ucp,replace=>\L$1<,substitute_extended
X\x{c1}Y
1: >\xe1<
# Without UTF or UCP characters > 127 have only one case in the default locale.
/X(\x{e1})Y/replace=>\U$1<,substitute_extended
X\x{e1}Y
1: >\xe1<
/A/utf,match_invalid_utf,caseless
\xe5A
0: A
/\bch\b/utf,match_invalid_utf
qchq\=ph
Partial match:
qchq\=ps
Partial match:
/line1\nbreak/firstline,utf,match_invalid_utf
line1\nbreak
0: line1\x{0a}break
line0\nline1\nbreak
No match
/A\z/utf,match_invalid_utf
A\x80\x42\n
No match
/ab$/match_invalid_utf
\= Expect no match
ab\x80cde
No match
/ab\z/match_invalid_utf
\= Expect no match
ab\x80cde
No match
/ab\Z/match_invalid_utf
\= Expect no match
ab\x80cde
No match
/(..)(*scs:(1)ab\z)/match_invalid_utf
ab\x80cde
0: ab
1: ab
/(..)(*scs:(1)ab\Z)/match_invalid_utf
ab\x80cde
0: ab
1: ab
/(..)(*scs:(1)ab$)/match_invalid_utf
ab\x80cde
0: ab
1: ab
/(.) \1/i,ucp
i I
0: i I
1: i
/(.) \1/i,ucp,turkish_casing
Failed: error 205 at offset 0: PCRE2_EXTRA_TURKISH_CASING requires UTF in 8-bit mode
/[\x60-\x7f]/i,ucp,turkish_casing
Failed: error 205 at offset 0: PCRE2_EXTRA_TURKISH_CASING requires UTF in 8-bit mode
i
\= Expect no match
I
/[\x60-\xc0]/i,ucp,turkish_casing
Failed: error 205 at offset 0: PCRE2_EXTRA_TURKISH_CASING requires UTF in 8-bit mode
i
\= Expect no match
I
/[\x80-\xc0]/i,ucp,turkish_casing
Failed: error 205 at offset 0: PCRE2_EXTRA_TURKISH_CASING requires UTF in 8-bit mode
\= Expect no match
i
I
# python_octal
/\400/
Failed: error 151 at offset 4: octal value is greater than \377 in 8-bit non-UTF-8 mode
/abc/substitute_extended
abc\=replace=\400
Failed: error -57 at offset 4 in replacement: bad escape sequence in replacement string
/\400/python_octal
Failed: error 202 at offset 4: octal value given by \ddd is greater than \377 (forbidden by PCRE2_EXTRA_PYTHON_OCTAL)
/abc/substitute_extended,python_octal
abc\=replace=\400
Failed: error -57 at offset 4 in replacement: bad escape sequence in replacement string
/\400/utf
/abc/utf,substitute_extended
abc\=replace=\400
1: \x{100}
/\400/utf,python_octal
Failed: error 202 at offset 4: octal value given by \ddd is greater than \377 (forbidden by PCRE2_EXTRA_PYTHON_OCTAL)
/abc/utf,substitute_extended,python_octal
abc\=replace=\400
Failed: error -57 at offset 4 in replacement: bad escape sequence in replacement string
/[\x00-\x2f\x11-\xff]+/B
------------------------------------------------------------------
Bra
AllAny++
Ket
End
------------------------------------------------------------------
abcd
0: abcd
/[\x00-\x2f\x11-\xff]{4,}/B,utf
------------------------------------------------------------------
Bra
[\x00-\xff]{4,}+
Ket
End
------------------------------------------------------------------
abcd
0: abcd
# End of testinput10

853
3rd/pcre2/testdata/testoutput11-16 vendored Normal file

@@ -0,0 +1,853 @@
# This set of tests is for the 16-bit and 32-bit libraries' basic (non-UTF)
# features that are not compatible with the 8-bit library, or which give
# different output in 16-bit or 32-bit mode. The output for the two widths is
# different, so they have separate output files.
#forbid_utf
#newline_default LF ANY ANYCRLF
/[^\x{c4}]/IB
------------------------------------------------------------------
Bra
[^\x{c4}] (not)
Ket
End
------------------------------------------------------------------
Capture group count = 0
Subject length lower bound = 1
/\x{100}/I
Capture group count = 0
First code unit = \x{100}
Subject length lower bound = 1
/ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # optional leading comment
(?: (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # initial word
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) )* # further okay, if led by a period
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
# address
| # or
(?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # one word, optionally followed by....
(?:
[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] | # atom and space parts, or...
\(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) | # comments, or...
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
# quoted strings
)*
< (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # leading <
(?: @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* , (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
)* # further okay, if led by comma
: # closing colon
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* )? # optional route
(?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # initial word
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) )* # further okay, if led by a period
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
# address spec
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* > # trailing >
# name and address
) (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # optional trailing comment
/Ix
Capture group count = 0
Contains explicit CR or LF match
Options: extended
Starting code units: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e
f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xff
Subject length lower bound = 3
/[\h]/B
------------------------------------------------------------------
Bra
[\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]
Ket
End
------------------------------------------------------------------
>\x09<
0: \x09
/[\h]+/B
------------------------------------------------------------------
Bra
[\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]++
Ket
End
------------------------------------------------------------------
>\x09\x20\xa0<
0: \x09 \xa0
/[\v]/B
------------------------------------------------------------------
Bra
[\x0a-\x0d\x85\x{2028}-\x{2029}]
Ket
End
------------------------------------------------------------------
/[^\h]/B
------------------------------------------------------------------
Bra
[^\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]
Ket
End
------------------------------------------------------------------
/\h+/I
Capture group count = 0
Starting code units: \x09 \x20 \xa0 \xff
Subject length lower bound = 1
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
0: \x{1680}\x{2000}\x{202f}\x{3000}
\x{3001}\x{2fff}\x{200a}\xa0\x{2000}
0: \x{200a}\xa0\x{2000}
/[\h\x{dc00}]+/IB
------------------------------------------------------------------
Bra
[\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}\x{dc00}]++
Ket
End
------------------------------------------------------------------
Capture group count = 0
Starting code units: \x09 \x20 \xa0 \xff
Subject length lower bound = 1
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
0: \x{1680}\x{2000}\x{202f}\x{3000}
\x{3001}\x{2fff}\x{200a}\xa0\x{2000}
0: \x{200a}\xa0\x{2000}
/\H+/I
Capture group count = 0
Subject length lower bound = 1
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
0: \x{167f}\x{1681}\x{180d}\x{180f}
\x{2000}\x{200a}\x{1fff}\x{200b}
0: \x{1fff}\x{200b}
\x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
0: \x{202e}\x{2030}\x{205e}\x{2060}
\xa0\x{3000}\x9f\xa1\x{2fff}\x{3001}
0: \x9f\xa1\x{2fff}\x{3001}
/[\H\x{d800}]+/
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
0: \x{167f}\x{1681}\x{180d}\x{180f}
\x{2000}\x{200a}\x{1fff}\x{200b}
0: \x{1fff}\x{200b}
\x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
0: \x{202e}\x{2030}\x{205e}\x{2060}
\xa0\x{3000}\x9f\xa1\x{2fff}\x{3001}
0: \x9f\xa1\x{2fff}\x{3001}
/\v+/I
Capture group count = 0
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Subject length lower bound = 1
\x{2027}\x{2030}\x{2028}\x{2029}
0: \x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
0: \x85\x0a\x0b\x0c\x0d
/[\v\x{dc00}]+/IB
------------------------------------------------------------------
Bra
[\x0a-\x0d\x85\x{2028}-\x{2029}\x{dc00}]++
Ket
End
------------------------------------------------------------------
Capture group count = 0
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Subject length lower bound = 1
\x{2027}\x{2030}\x{2028}\x{2029}
0: \x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
0: \x85\x0a\x0b\x0c\x0d
/\V+/I
Capture group count = 0
Subject length lower bound = 1
\x{2028}\x{2029}\x{2027}\x{2030}
0: \x{2027}\x{2030}
\x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
0: \x09\x0e\x84\x86
/[\V\x{d800}]+/
\x{2028}\x{2029}\x{2027}\x{2030}
0: \x{2027}\x{2030}
\x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
0: \x09\x0e\x84\x86
/\R+/I,bsr=unicode
Capture group count = 0
\R matches any Unicode newline
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Subject length lower bound = 1
\x{2027}\x{2030}\x{2028}\x{2029}
0: \x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
0: \x85\x0a\x0b\x0c\x0d
/\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}/I
Capture group count = 0
First code unit = \x{d800}
Last code unit = \x{dd00}
Subject length lower bound = 6
\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}
0: \x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}
/[^\x{80}][^\x{ff}][^\x{100}][^\x{1000}][^\x{ffff}]/B
------------------------------------------------------------------
Bra
[^\x{80}] (not)
[^\x{ff}] (not)
[^\x{100}] (not)
[^\x{1000}] (not)
[^\x{ffff}] (not)
Ket
End
------------------------------------------------------------------
/[^\x{80}][^\x{ff}][^\x{100}][^\x{1000}][^\x{ffff}]/Bi
------------------------------------------------------------------
Bra
/i [^\x{80}] (not)
/i [^\x{ff}] (not)
/i [^\x{100}] (not)
/i [^\x{1000}] (not)
/i [^\x{ffff}] (not)
Ket
End
------------------------------------------------------------------
/[^\x{100}]*[^\x{1000}]+[^\x{ffff}]??[^\x{8000}]{4,}[^\x{7fff}]{2,9}?[^\x{100}]{5,6}+/B
------------------------------------------------------------------
Bra
[^\x{100}]* (not)
[^\x{1000}]+ (not)
[^\x{ffff}]?? (not)
[^\x{8000}]{4} (not)
[^\x{8000}]* (not)
[^\x{7fff}]{2} (not)
[^\x{7fff}]{0,7}? (not)
[^\x{100}]{5} (not)
[^\x{100}]?+ (not)
Ket
End
------------------------------------------------------------------
/[^\x{100}]*[^\x{1000}]+[^\x{ffff}]??[^\x{8000}]{4,}[^\x{7fff}]{2,9}?[^\x{100}]{5,6}+/Bi
------------------------------------------------------------------
Bra
/i [^\x{100}]* (not)
/i [^\x{1000}]+ (not)
/i [^\x{ffff}]?? (not)
/i [^\x{8000}]{4} (not)
/i [^\x{8000}]* (not)
/i [^\x{7fff}]{2} (not)
/i [^\x{7fff}]{0,7}? (not)
/i [^\x{100}]{5} (not)
/i [^\x{100}]?+ (not)
Ket
End
------------------------------------------------------------------
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark
XX
0: XX
MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark
XX
0: XX
MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE
/\u0100/B,alt_bsux,allow_empty_class,match_unset_backref
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
/[\u0100-\u0200]/B,alt_bsux,allow_empty_class,match_unset_backref
------------------------------------------------------------------
Bra
[\x{100}-\x{200}]
Ket
End
------------------------------------------------------------------
/\ud800/B,alt_bsux,allow_empty_class,match_unset_backref
------------------------------------------------------------------
Bra
\x{d800}
Ket
End
------------------------------------------------------------------
/^\x{ffff}+/i
\x{ffff}
0: \x{ffff}
/^\x{ffff}?/i
\x{ffff}
0: \x{ffff}
/^\x{ffff}*/i
\x{ffff}
0: \x{ffff}
/^\x{ffff}{3}/i
\x{ffff}\x{ffff}\x{ffff}
0: \x{ffff}\x{ffff}\x{ffff}
/^\x{ffff}{0,3}/i
\x{ffff}
0: \x{ffff}
/[^\x00-a]{12,}[^b-\xff]*/B
------------------------------------------------------------------
Bra
[^\x00-a]{12,}
[^b-\xff]*+
Ket
End
------------------------------------------------------------------
/[^\s]*\s* [^\W]+\W+ [^\d]*?\d0 [^\d\w]{4,6}?\w*A/B
------------------------------------------------------------------
Bra
[^\x09-\x0d ]*
\s*
[0-9A-Z_a-z]++
\W+
[^0-9]*?
\d
0
[^0-9A-Z_a-z]{4,6}?
\w*
A
Ket
End
------------------------------------------------------------------
/a*[b-\x{200}]?a#a*[b-\x{200}]?b#[a-f]*[g-\x{200}]*#[g-\x{200}]*[a-c]*#[g-\x{200}]*[a-h]*/B
------------------------------------------------------------------
Bra
a*
[b-\xff\x{100}-\x{200}]?+
a#
a*+
[b-\xff\x{100}-\x{200}]?
b#
[a-f]*+
[g-\xff\x{100}-\x{200}]*+
#
[g-\xff\x{100}-\x{200}]*+
[a-c]*+
#
[g-\xff\x{100}-\x{200}]*
[a-h]*+
Ket
End
------------------------------------------------------------------
/^[\x{1234}\x{4321}]{2,4}?/
\x{1234}\x{1234}\x{1234}
0: \x{1234}\x{1234}
# Check maximum non-UTF character size for the 16-bit library.
/\x{ffff}/
A\x{ffff}B
0: \x{ffff}
/\x{10000}/
Failed: error 134 at offset 8: character code point value in \x{} or \o{} is too large
/\o{20000}/
# Check maximum character size for the 32-bit library. These will all give
# errors in the 16-bit library.
/\x{110000}/
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
/\x{7fffffff}/
Failed: error 134 at offset 11: character code point value in \x{} or \o{} is too large
/\x{80000000}/
Failed: error 134 at offset 11: character code point value in \x{} or \o{} is too large
/\x{ffffffff}/
Failed: error 134 at offset 11: character code point value in \x{} or \o{} is too large
/\x{100000000}/
Failed: error 134 at offset 12: character code point value in \x{} or \o{} is too large
/\o{17777777777}/
Failed: error 134 at offset 14: character code point value in \x{} or \o{} is too large
/\o{20000000000}/
Failed: error 134 at offset 14: character code point value in \x{} or \o{} is too large
/\o{37777777777}/
Failed: error 134 at offset 14: character code point value in \x{} or \o{} is too large
/\o{40000000000}/
Failed: error 134 at offset 14: character code point value in \x{} or \o{} is too large
/\x{7fffffff}\x{7fffffff}/I
Failed: error 134 at offset 11: character code point value in \x{} or \o{} is too large
/\x{80000000}\x{80000000}/I
Failed: error 134 at offset 11: character code point value in \x{} or \o{} is too large
/\x{ffffffff}\x{ffffffff}/I
Failed: error 134 at offset 11: character code point value in \x{} or \o{} is too large
# Non-UTF characters
/.{2,3}/
\x{400000}\x{400001}\x{400002}\x{400003}
** Character \x{400000} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{400001} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{400002} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{400003} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
0: \x00\x01\x02
/\x{400000}\x{800000}/IBi
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
# Check character ranges
/[\H]/IB
------------------------------------------------------------------
Bra
[\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{ffff}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b
\x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a
\x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9
: ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^
_ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80
\x81 \x82 \x83 \x84 \x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f
\x90 \x91 \x92 \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e
\x9f \xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae
\xaf \xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd
\xbe \xbf \xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc
\xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb
\xdc \xdd \xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea
\xeb \xec \xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9
\xfa \xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
/[\V]/IB
------------------------------------------------------------------
Bra
[\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{ffff}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
\x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
\x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = >
? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82
\x83 \x84 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92
\x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1
\xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0
\xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf
\xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce
\xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd
\xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec
\xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb
\xfc \xfd \xfe \xff
Subject length lower bound = 1
/(*THEN:\[A]{65501})/expand
# We can use pcre2test's utf8_input modifier to create wide pattern characters,
# even though this test is run when UTF is not supported.
/a\x{d800}b/utf8_input
a<><61><EFBFBD>b
0: a\x{d800}b
a\x{d800}b
0: a\x{d800}b
a\o{154000}b
0: a\x{d800}b
\= Expect warning unless 32bit
a\N{U+d800}b
** Warning: character \N{U+d800} is a surrogate and should not be encoded as UTF-16
0: a\x{d800}b
/a\x{ffff}b/utf8_input
a￿b
0: a\x{ffff}b
a\x{ffff}b
0: a\x{ffff}b
a\o{177777}b
0: a\x{ffff}b
a\N{U+ffff}b
0: a\x{ffff}b
/ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/utf8_input
** Failed: character value greater than 0xffff cannot be converted to 16-bit in non-UTF mode
ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z
ab\x{7fffffff}z
ab\o{17777777777}z
ab\N{U+7fffffff}z
/ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/utf8_input
** Failed: invalid UTF-8 string cannot be converted to 16-bit string
ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z
ab\x{ffffffff}z
/ab<61>Az/utf8_input
** Failed: invalid UTF-8 string cannot be converted to 16-bit string
ab<61>Az
ab\x{80000041}z
\= Expect no match
abAz
aAz
ab\377Az
ab\xff\N{U+0041}z
ab\N{U+ff}\N{U+41}z
/ab\x{80000041}z/
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
ab\x{80000041}z
/(?i:A{1,}\6666666666)/
A\x{1b6}6666666
0: A\x{1b6}6666666
/abc/substitute_extended,replace=>\777<
abc
1: >\x{1ff}<
/abc/substitute_extended,replace=>\o{012345}<
abc
1: >\x{14e5}<
# Character range merging tests
/[\x{100}-\x{200}\H\x{8000}-\x{9000}]/B
------------------------------------------------------------------
Bra
[\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{ffff}]
Ket
End
------------------------------------------------------------------
/[\x{100}-\x{200}\V\x{8000}-\x{9000}]/B
------------------------------------------------------------------
Bra
[\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{ffff}]
Ket
End
------------------------------------------------------------------
/[\x00-\x{6000}\x{3000}-\x{ffff}]#[\x00-\x{6000}\x{3000}-\x{ffff}]{5,7}?/B
------------------------------------------------------------------
Bra
AllAny
#
AllAny{5}
AllAny{0,2}?
Ket
End
------------------------------------------------------------------
/[\x00-\x{6000}\x{3000}-\x{ffffffff}]#[\x00-\x{6000}\x{3000}-\x{ffffffff}]{5,7}?/B
Failed: error 134 at offset 34: character code point value in \x{} or \o{} is too large
/[\x00-\x2f\x11-\xff]*?!/B
------------------------------------------------------------------
Bra
[\x00-\xff]*?
!
Ket
End
------------------------------------------------------------------
abcd!e
0: abcd!
/i/turkish_casing
Failed: error 204 at offset 0: PCRE2_EXTRA_TURKISH_CASING require Unicode (UTF or UCP) mode
# Character list tests
/([\x{100}-\x{7fff}\x{9000}\x{9002}\x{9004}\x{9006}\x{9008}\x{10000}-\x{7fffffff}]{3,8}?).#/B
Failed: error 134 at offset 66: character code point value in \x{} or \o{} is too large
\x{9001}\x{9007}\x{8000}\x{ffff}\x{9002}\x{7fff}\x{10000}\x{7fffffff}\x{500000}\x{9006}#
/([\x{3000}\x{3001}\x{3003}\x{3004}\x{3006}\x{3007}\x{8000}-\x{ffff}\x{100001}\x{100002}\x{100004}\x{100005}\x{100007}\x{100008}\x{10000a}\x{10000b}\x{80000000}-\x{ffffffff}]{5,}).#/B
Failed: error 134 at offset 76: character code point value in \x{} or \o{} is too large
\x{2fff}\x{3002}\x{7fff}\x{100000}\x{7fffffff}\x{3000}\x{3007}\x{8000}\x{ffff}\x{100001}\x{10000b}\x{80000000}\x{ffffffff}\x{3000}#
/([^\x{4000}\x{4002}\x{4004}\x{4005}\x{4007}\x{4009}\x{400a}\x{f000}\x{f002}\x{f004}\x{f005}\x{f007}\x{f009}\x{f00a}\x{100000}\x{100002}\x{100004}\x{100005}\x{100007}\x{100009}\x{10000a}\x{a0000000}\x{a0000002}\x{a0000004}\x{a0000005}\x{a0000007}\x{a0000009}\x{a000000a}]+).#/B
Failed: error 134 at offset 124: character code point value in \x{} or \o{} is too large
\x{4000}\x{4002}\x{4004}\x{4005}\x{4007}\x{4009}\x{400a}\x{3fff}\x{4001}\x{4003}\x{4006}\x{4008}\x{400b}\x{100}#
\x{f000}\x{f002}\x{f004}\x{f005}\x{f007}\x{f009}\x{f00a}\x{efff}\x{f001}\x{f003}\x{f006}\x{f008}\x{f00b}\x{100}#
\x{100000}\x{100002}\x{100004}\x{100005}\x{100007}\x{100009}\x{10000a}\x{fffff}\x{100001}\x{100003}\x{100006}\x{100008}\x{10000b}\x{100}#
\x{a0000000}\x{a0000002}\x{a0000004}\x{a0000005}\x{a0000007}\x{a0000009}\x{a000000a}\x{9fffffff}\x{a0000001}\x{a0000003}\x{a0000006}\x{a0000008}\x{a000000b}\x{100}#
# --------------
# EXTENDED CHARACTER CLASSES (UTS#18)
# META_BIGVALUE tests
/\x{80000000}/B
Failed: error 134 at offset 11: character code point value in \x{} or \o{} is too large
\x{80000000}
\= Expect no match
\x{7fffffff}
\x{80000001}
/[\x{80000000}-\x{8000000f}\x{8fffffff}]/B
Failed: error 134 at offset 12: character code point value in \x{} or \o{} is too large
\x{80000002}
\x{8fffffff}
\= Expect no match
\x{7fffffff}
\x{90000000}
/\x{80000000}/B,alt_extended_class
Failed: error 134 at offset 11: character code point value in \x{} or \o{} is too large
\x{80000000}
\= Expect no match
\x{7fffffff}
\x{80000001}
/[\x{80000000}-\x{8000000f}\x{8fffffff}]/B,alt_extended_class
Failed: error 134 at offset 12: character code point value in \x{} or \o{} is too large
\x{80000002}
\x{8fffffff}
\= Expect no match
\x{7fffffff}
\x{90000000}
/[\x{80000000}-\x{8000000f}--\x{80000002}]/B,alt_extended_class
Failed: error 134 at offset 12: character code point value in \x{} or \o{} is too large
\x{80000001}
\x{80000003}
\= Expect no match
\x{80000002}
/[[\x{80000000}-\x{8000000f}]--[\x{80000002}]]/B,alt_extended_class
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
\x{80000001}
\x{80000003}
\= Expect no match
\x{80000002}
# --------------
# EXTENDED CHARACTER CLASSES (Perl)
# META_BIGVALUE tests
/(?[[\x{80000000}-\x{8000000f}]+\x{8fffffff}])/B
Failed: error 134 at offset 15: character code point value in \x{} or \o{} is too large
\x{80000002}
\x{8fffffff}
\= Expect no match
\x{7fffffff}
\x{90000000}
/(?[[\x{80000000}-\x{8000000f}]-\x{80000002}])/B
Failed: error 134 at offset 15: character code point value in \x{} or \o{} is too large
\x{80000001}
\x{80000003}
\= Expect no match
\x{80000002}
/(?[[\x{80000000}-\x{8000000f}]-\x{80000002}])/B
Failed: error 134 at offset 15: character code point value in \x{} or \o{} is too large
\x{80000001}
\x{80000003}
\= Expect no match
\x{80000002}
# --------------
# End of testinput11

1012
3rd/pcre2/testdata/testoutput11-32 vendored Normal file

@@ -0,0 +1,1012 @@
# This set of tests is for the 16-bit and 32-bit libraries' basic (non-UTF)
# features that are not compatible with the 8-bit library, or which give
# different output in 16-bit or 32-bit mode. The output for the two widths is
# different, so they have separate output files.
#forbid_utf
#newline_default LF ANY ANYCRLF
/[^\x{c4}]/IB
------------------------------------------------------------------
Bra
[^\x{c4}] (not)
Ket
End
------------------------------------------------------------------
Capture group count = 0
Subject length lower bound = 1
/\x{100}/I
Capture group count = 0
First code unit = \x{100}
Subject length lower bound = 1
/ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # optional leading comment
(?: (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # initial word
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) )* # further okay, if led by a period
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
# address
| # or
(?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # one word, optionally followed by....
(?:
[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] | # atom and space parts, or...
\(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) | # comments, or...
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
# quoted strings
)*
< (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # leading <
(?: @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* , (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
)* # further okay, if led by comma
: # closing colon
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* )? # optional route
(?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # initial word
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) )* # further okay, if led by a period
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
# address spec
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* > # trailing >
# name and address
) (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # optional trailing comment
/Ix
Capture group count = 0
Contains explicit CR or LF match
Options: extended
Starting code units: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e
f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xff
Subject length lower bound = 3
/[\h]/B
------------------------------------------------------------------
Bra
[\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]
Ket
End
------------------------------------------------------------------
>\x09<
0: \x09
/[\h]+/B
------------------------------------------------------------------
Bra
[\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]++
Ket
End
------------------------------------------------------------------
>\x09\x20\xa0<
0: \x09 \xa0
/[\v]/B
------------------------------------------------------------------
Bra
[\x0a-\x0d\x85\x{2028}-\x{2029}]
Ket
End
------------------------------------------------------------------
/[^\h]/B
------------------------------------------------------------------
Bra
[^\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]
Ket
End
------------------------------------------------------------------
/\h+/I
Capture group count = 0
Starting code units: \x09 \x20 \xa0 \xff
Subject length lower bound = 1
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
0: \x{1680}\x{2000}\x{202f}\x{3000}
\x{3001}\x{2fff}\x{200a}\xa0\x{2000}
0: \x{200a}\xa0\x{2000}
/[\h\x{dc00}]+/IB
------------------------------------------------------------------
Bra
[\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}\x{dc00}]++
Ket
End
------------------------------------------------------------------
Capture group count = 0
Starting code units: \x09 \x20 \xa0 \xff
Subject length lower bound = 1
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
0: \x{1680}\x{2000}\x{202f}\x{3000}
\x{3001}\x{2fff}\x{200a}\xa0\x{2000}
0: \x{200a}\xa0\x{2000}
/\H+/I
Capture group count = 0
Subject length lower bound = 1
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
0: \x{167f}\x{1681}\x{180d}\x{180f}
\x{2000}\x{200a}\x{1fff}\x{200b}
0: \x{1fff}\x{200b}
\x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
0: \x{202e}\x{2030}\x{205e}\x{2060}
\xa0\x{3000}\x9f\xa1\x{2fff}\x{3001}
0: \x9f\xa1\x{2fff}\x{3001}
/[\H\x{d800}]+/
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
0: \x{167f}\x{1681}\x{180d}\x{180f}
\x{2000}\x{200a}\x{1fff}\x{200b}
0: \x{1fff}\x{200b}
\x{202f}\x{205f}\x{202e}\x{2030}\x{205e}\x{2060}
0: \x{202e}\x{2030}\x{205e}\x{2060}
\xa0\x{3000}\x9f\xa1\x{2fff}\x{3001}
0: \x9f\xa1\x{2fff}\x{3001}
/\v+/I
Capture group count = 0
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Subject length lower bound = 1
\x{2027}\x{2030}\x{2028}\x{2029}
0: \x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
0: \x85\x0a\x0b\x0c\x0d
/[\v\x{dc00}]+/IB
------------------------------------------------------------------
Bra
[\x0a-\x0d\x85\x{2028}-\x{2029}\x{dc00}]++
Ket
End
------------------------------------------------------------------
Capture group count = 0
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Subject length lower bound = 1
\x{2027}\x{2030}\x{2028}\x{2029}
0: \x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
0: \x85\x0a\x0b\x0c\x0d
/\V+/I
Capture group count = 0
Subject length lower bound = 1
\x{2028}\x{2029}\x{2027}\x{2030}
0: \x{2027}\x{2030}
\x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
0: \x09\x0e\x84\x86
/[\V\x{d800}]+/
\x{2028}\x{2029}\x{2027}\x{2030}
0: \x{2027}\x{2030}
\x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
0: \x09\x0e\x84\x86
/\R+/I,bsr=unicode
Capture group count = 0
\R matches any Unicode newline
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Subject length lower bound = 1
\x{2027}\x{2030}\x{2028}\x{2029}
0: \x{2028}\x{2029}
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
0: \x85\x0a\x0b\x0c\x0d
/\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}/I
Capture group count = 0
First code unit = \x{d800}
Last code unit = \x{dd00}
Subject length lower bound = 6
\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}
0: \x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}
/[^\x{80}][^\x{ff}][^\x{100}][^\x{1000}][^\x{ffff}]/B
------------------------------------------------------------------
Bra
[^\x{80}] (not)
[^\x{ff}] (not)
[^\x{100}] (not)
[^\x{1000}] (not)
[^\x{ffff}] (not)
Ket
End
------------------------------------------------------------------
/[^\x{80}][^\x{ff}][^\x{100}][^\x{1000}][^\x{ffff}]/Bi
------------------------------------------------------------------
Bra
/i [^\x{80}] (not)
/i [^\x{ff}] (not)
/i [^\x{100}] (not)
/i [^\x{1000}] (not)
/i [^\x{ffff}] (not)
Ket
End
------------------------------------------------------------------
/[^\x{100}]*[^\x{1000}]+[^\x{ffff}]??[^\x{8000}]{4,}[^\x{7fff}]{2,9}?[^\x{100}]{5,6}+/B
------------------------------------------------------------------
Bra
[^\x{100}]* (not)
[^\x{1000}]+ (not)
[^\x{ffff}]?? (not)
[^\x{8000}]{4} (not)
[^\x{8000}]* (not)
[^\x{7fff}]{2} (not)
[^\x{7fff}]{0,7}? (not)
[^\x{100}]{5} (not)
[^\x{100}]?+ (not)
Ket
End
------------------------------------------------------------------
/[^\x{100}]*[^\x{1000}]+[^\x{ffff}]??[^\x{8000}]{4,}[^\x{7fff}]{2,9}?[^\x{100}]{5,6}+/Bi
------------------------------------------------------------------
Bra
/i [^\x{100}]* (not)
/i [^\x{1000}]+ (not)
/i [^\x{ffff}]?? (not)
/i [^\x{8000}]{4} (not)
/i [^\x{8000}]* (not)
/i [^\x{7fff}]{2} (not)
/i [^\x{7fff}]{0,7}? (not)
/i [^\x{100}]{5} (not)
/i [^\x{100}]?+ (not)
Ket
End
------------------------------------------------------------------
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark
XX
0: XX
MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark
XX
0: XX
MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE
/\u0100/B,alt_bsux,allow_empty_class,match_unset_backref
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
/[\u0100-\u0200]/B,alt_bsux,allow_empty_class,match_unset_backref
------------------------------------------------------------------
Bra
[\x{100}-\x{200}]
Ket
End
------------------------------------------------------------------
/\ud800/B,alt_bsux,allow_empty_class,match_unset_backref
------------------------------------------------------------------
Bra
\x{d800}
Ket
End
------------------------------------------------------------------
/^\x{ffff}+/i
\x{ffff}
0: \x{ffff}
/^\x{ffff}?/i
\x{ffff}
0: \x{ffff}
/^\x{ffff}*/i
\x{ffff}
0: \x{ffff}
/^\x{ffff}{3}/i
\x{ffff}\x{ffff}\x{ffff}
0: \x{ffff}\x{ffff}\x{ffff}
/^\x{ffff}{0,3}/i
\x{ffff}
0: \x{ffff}
/[^\x00-a]{12,}[^b-\xff]*/B
------------------------------------------------------------------
Bra
[^\x00-a]{12,}
[^b-\xff]*+
Ket
End
------------------------------------------------------------------
/[^\s]*\s* [^\W]+\W+ [^\d]*?\d0 [^\d\w]{4,6}?\w*A/B
------------------------------------------------------------------
Bra
[^\x09-\x0d ]*
\s*
[0-9A-Z_a-z]++
\W+
[^0-9]*?
\d
0
[^0-9A-Z_a-z]{4,6}?
\w*
A
Ket
End
------------------------------------------------------------------
/a*[b-\x{200}]?a#a*[b-\x{200}]?b#[a-f]*[g-\x{200}]*#[g-\x{200}]*[a-c]*#[g-\x{200}]*[a-h]*/B
------------------------------------------------------------------
Bra
a*
[b-\xff\x{100}-\x{200}]?+
a#
a*+
[b-\xff\x{100}-\x{200}]?
b#
[a-f]*+
[g-\xff\x{100}-\x{200}]*+
#
[g-\xff\x{100}-\x{200}]*+
[a-c]*+
#
[g-\xff\x{100}-\x{200}]*
[a-h]*+
Ket
End
------------------------------------------------------------------
/^[\x{1234}\x{4321}]{2,4}?/
\x{1234}\x{1234}\x{1234}
0: \x{1234}\x{1234}
# Check maximum non-UTF character size for the 16-bit library.
/\x{ffff}/
A\x{ffff}B
0: \x{ffff}
/\x{10000}/
/\o{20000}/
# Check maximum character size for the 32-bit library. These will all give
# errors in the 16-bit library.
/\x{110000}/
/\x{7fffffff}/
/\x{80000000}/
/\x{ffffffff}/
/\x{100000000}/
Failed: error 134 at offset 12: character code point value in \x{} or \o{} is too large
/\o{17777777777}/
/\o{20000000000}/
/\o{37777777777}/
/\o{40000000000}/
Failed: error 134 at offset 14: character code point value in \x{} or \o{} is too large
/\x{7fffffff}\x{7fffffff}/I
Capture group count = 0
First code unit = \x{7fffffff}
Last code unit = \x{7fffffff}
Subject length lower bound = 2
/\x{80000000}\x{80000000}/I
Capture group count = 0
First code unit = \x{80000000}
Last code unit = \x{80000000}
Subject length lower bound = 2
/\x{ffffffff}\x{ffffffff}/I
Capture group count = 0
First code unit = \x{ffffffff}
Last code unit = \x{ffffffff}
Subject length lower bound = 2
# Non-UTF characters
/.{2,3}/
\x{400000}\x{400001}\x{400002}\x{400003}
0: \x{400000}\x{400001}\x{400002}
/\x{400000}\x{800000}/IBi
------------------------------------------------------------------
Bra
/i \x{400000}\x{800000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless
First code unit = \x{400000}
Last code unit = \x{800000}
Subject length lower bound = 2
# Check character ranges
/[\H]/IB
------------------------------------------------------------------
Bra
[\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{ffffffff}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b
\x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a
\x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9
: ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^
_ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80
\x81 \x82 \x83 \x84 \x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f
\x90 \x91 \x92 \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e
\x9f \xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae
\xaf \xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd
\xbe \xbf \xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc
\xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb
\xdc \xdd \xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea
\xeb \xec \xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9
\xfa \xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
/[\V]/IB
------------------------------------------------------------------
Bra
[\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{ffffffff}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
\x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
\x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = >
? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82
\x83 \x84 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92
\x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1
\xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0
\xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf
\xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce
\xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd
\xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec
\xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb
\xfc \xfd \xfe \xff
Subject length lower bound = 1
/(*THEN:\[A]{65501})/expand
# We can use pcre2test's utf8_input modifier to create wide pattern characters,
# even though this test is run when UTF is not supported.
/a\x{d800}b/utf8_input
a<><61><EFBFBD>b
0: a\x{d800}b
a\x{d800}b
0: a\x{d800}b
a\o{154000}b
0: a\x{d800}b
\= Expect warning unless 32bit
a\N{U+d800}b
0: a\x{d800}b
/a\x{ffff}b/utf8_input
a￿b
0: a\x{ffff}b
a\x{ffff}b
0: a\x{ffff}b
a\o{177777}b
0: a\x{ffff}b
a\N{U+ffff}b
0: a\x{ffff}b
/ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/utf8_input
ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z
0: ab\x{7fffffff}z
ab\x{7fffffff}z
0: ab\x{7fffffff}z
ab\o{17777777777}z
0: ab\x{7fffffff}z
ab\N{U+7fffffff}z
** Warning: character \N{U+7fffffff} is greater than 0x10ffff and should not be encoded as UTF-32
0: ab\x{7fffffff}z
/ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/utf8_input
ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z
0: ab\x{ffffffff}z
ab\x{ffffffff}z
0: ab\x{ffffffff}z
/ab<61>Az/utf8_input
ab<61>Az
0: ab\x{80000041}z
ab\x{80000041}z
0: ab\x{80000041}z
\= Expect no match
abAz
No match
aAz
No match
ab\377Az
No match
ab\xff\N{U+0041}z
No match
ab\N{U+ff}\N{U+41}z
No match
/ab\x{80000041}z/
ab\x{80000041}z
0: ab\x{80000041}z
/(?i:A{1,}\6666666666)/
A\x{1b6}6666666
0: A\x{1b6}6666666
/abc/substitute_extended,replace=>\777<
abc
1: >\x{1ff}<
/abc/substitute_extended,replace=>\o{012345}<
abc
1: >\x{14e5}<
# Character range merging tests
/[\x{100}-\x{200}\H\x{8000}-\x{9000}]/B
------------------------------------------------------------------
Bra
[\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{ffffffff}]
Ket
End
------------------------------------------------------------------
/[\x{100}-\x{200}\V\x{8000}-\x{9000}]/B
------------------------------------------------------------------
Bra
[\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{ffffffff}]
Ket
End
------------------------------------------------------------------
/[\x00-\x{6000}\x{3000}-\x{ffff}]#[\x00-\x{6000}\x{3000}-\x{ffff}]{5,7}?/B
------------------------------------------------------------------
Bra
[\x00-\xff\x{100}-\x{ffff}]
#
[\x00-\xff\x{100}-\x{ffff}]{5,7}?
Ket
End
------------------------------------------------------------------
/[\x00-\x{6000}\x{3000}-\x{ffffffff}]#[\x00-\x{6000}\x{3000}-\x{ffffffff}]{5,7}?/B
------------------------------------------------------------------
Bra
AllAny
#
AllAny{5}
AllAny{0,2}?
Ket
End
------------------------------------------------------------------
/[\x00-\x2f\x11-\xff]*?!/B
------------------------------------------------------------------
Bra
[\x00-\xff]*?
!
Ket
End
------------------------------------------------------------------
abcd!e
0: abcd!
/i/turkish_casing
Failed: error 204 at offset 0: PCRE2_EXTRA_TURKISH_CASING require Unicode (UTF or UCP) mode
# Character list tests
/([\x{100}-\x{7fff}\x{9000}\x{9002}\x{9004}\x{9006}\x{9008}\x{10000}-\x{7fffffff}]{3,8}?).#/B
------------------------------------------------------------------
Bra
CBra 1
[\x{100}-\x{7fff}\x{9000}\x{9002}\x{9004}\x{9006}\x{9008}\x{10000}-\x{7fffffff}]{3,8}?
Ket
Any
#
Ket
End
------------------------------------------------------------------
\x{9001}\x{9007}\x{8000}\x{ffff}\x{9002}\x{7fff}\x{10000}\x{7fffffff}\x{500000}\x{9006}#
0: \x{9002}\x{7fff}\x{10000}\x{7fffffff}\x{500000}\x{9006}#
1: \x{9002}\x{7fff}\x{10000}\x{7fffffff}\x{500000}
/([\x{3000}\x{3001}\x{3003}\x{3004}\x{3006}\x{3007}\x{8000}-\x{ffff}\x{100001}\x{100002}\x{100004}\x{100005}\x{100007}\x{100008}\x{10000a}\x{10000b}\x{80000000}-\x{ffffffff}]{5,}).#/B
------------------------------------------------------------------
Bra
CBra 1
[\x{3000}-\x{3001}\x{3003}-\x{3004}\x{3006}-\x{3007}\x{8000}-\x{ffff}\x{100001}-\x{100002}\x{100004}-\x{100005}\x{100007}-\x{100008}\x{10000a}-\x{10000b}\x{80000000}-\x{ffffffff}]{5,}
Ket
Any
#
Ket
End
------------------------------------------------------------------
\x{2fff}\x{3002}\x{7fff}\x{100000}\x{7fffffff}\x{3000}\x{3007}\x{8000}\x{ffff}\x{100001}\x{10000b}\x{80000000}\x{ffffffff}\x{3000}#
0: \x{3000}\x{3007}\x{8000}\x{ffff}\x{100001}\x{10000b}\x{80000000}\x{ffffffff}\x{3000}#
1: \x{3000}\x{3007}\x{8000}\x{ffff}\x{100001}\x{10000b}\x{80000000}\x{ffffffff}
/([^\x{4000}\x{4002}\x{4004}\x{4005}\x{4007}\x{4009}\x{400a}\x{f000}\x{f002}\x{f004}\x{f005}\x{f007}\x{f009}\x{f00a}\x{100000}\x{100002}\x{100004}\x{100005}\x{100007}\x{100009}\x{10000a}\x{a0000000}\x{a0000002}\x{a0000004}\x{a0000005}\x{a0000007}\x{a0000009}\x{a000000a}]+).#/B
------------------------------------------------------------------
Bra
CBra 1
[^\x{4000}\x{4002}\x{4004}-\x{4005}\x{4007}\x{4009}-\x{400a}\x{f000}\x{f002}\x{f004}-\x{f005}\x{f007}\x{f009}-\x{f00a}\x{100000}\x{100002}\x{100004}-\x{100005}\x{100007}\x{100009}-\x{10000a}\x{a0000000}\x{a0000002}\x{a0000004}-\x{a0000005}\x{a0000007}\x{a0000009}-\x{a000000a}]+
Ket
Any
#
Ket
End
------------------------------------------------------------------
\x{4000}\x{4002}\x{4004}\x{4005}\x{4007}\x{4009}\x{400a}\x{3fff}\x{4001}\x{4003}\x{4006}\x{4008}\x{400b}\x{100}#
0: \x{3fff}\x{4001}\x{4003}\x{4006}\x{4008}\x{400b}\x{100}#
1: \x{3fff}\x{4001}\x{4003}\x{4006}\x{4008}\x{400b}
\x{f000}\x{f002}\x{f004}\x{f005}\x{f007}\x{f009}\x{f00a}\x{efff}\x{f001}\x{f003}\x{f006}\x{f008}\x{f00b}\x{100}#
0: \x{efff}\x{f001}\x{f003}\x{f006}\x{f008}\x{f00b}\x{100}#
1: \x{efff}\x{f001}\x{f003}\x{f006}\x{f008}\x{f00b}
\x{100000}\x{100002}\x{100004}\x{100005}\x{100007}\x{100009}\x{10000a}\x{fffff}\x{100001}\x{100003}\x{100006}\x{100008}\x{10000b}\x{100}#
0: \x{fffff}\x{100001}\x{100003}\x{100006}\x{100008}\x{10000b}\x{100}#
1: \x{fffff}\x{100001}\x{100003}\x{100006}\x{100008}\x{10000b}
\x{a0000000}\x{a0000002}\x{a0000004}\x{a0000005}\x{a0000007}\x{a0000009}\x{a000000a}\x{9fffffff}\x{a0000001}\x{a0000003}\x{a0000006}\x{a0000008}\x{a000000b}\x{100}#
0: \x{9fffffff}\x{a0000001}\x{a0000003}\x{a0000006}\x{a0000008}\x{a000000b}\x{100}#
1: \x{9fffffff}\x{a0000001}\x{a0000003}\x{a0000006}\x{a0000008}\x{a000000b}
# --------------
# EXTENDED CHARACTER CLASSES (UTS#18)
# META_BIGVALUE tests
/\x{80000000}/B
------------------------------------------------------------------
Bra
\x{80000000}
Ket
End
------------------------------------------------------------------
\x{80000000}
0: \x{80000000}
\= Expect no match
\x{7fffffff}
No match
\x{80000001}
No match
/[\x{80000000}-\x{8000000f}\x{8fffffff}]/B
------------------------------------------------------------------
Bra
[\x{80000000}-\x{8000000f}\x{8fffffff}]
Ket
End
------------------------------------------------------------------
\x{80000002}
0: \x{80000002}
\x{8fffffff}
0: \x{8fffffff}
\= Expect no match
\x{7fffffff}
No match
\x{90000000}
No match
/\x{80000000}/B,alt_extended_class
------------------------------------------------------------------
Bra
\x{80000000}
Ket
End
------------------------------------------------------------------
\x{80000000}
0: \x{80000000}
\= Expect no match
\x{7fffffff}
No match
\x{80000001}
No match
/[\x{80000000}-\x{8000000f}\x{8fffffff}]/B,alt_extended_class
------------------------------------------------------------------
Bra
[\x{80000000}-\x{8000000f}\x{8fffffff}]
Ket
End
------------------------------------------------------------------
\x{80000002}
0: \x{80000002}
\x{8fffffff}
0: \x{8fffffff}
\= Expect no match
\x{7fffffff}
No match
\x{90000000}
No match
/[\x{80000000}-\x{8000000f}--\x{80000002}]/B,alt_extended_class
------------------------------------------------------------------
Bra
eclass[
no bitmap
xclass: [\x{80000000}-\x{8000000f}]
xclass: [^\x{80000002}]
AND
]
Ket
End
------------------------------------------------------------------
\x{80000001}
0: \x{80000001}
\x{80000003}
0: \x{80000003}
\= Expect no match
\x{80000002}
No match
/[[\x{80000000}-\x{8000000f}]--[\x{80000002}]]/B,alt_extended_class
------------------------------------------------------------------
Bra
eclass[
no bitmap
xclass: [\x{80000000}-\x{8000000f}]
xclass: [^\x{80000002}]
AND
]
Ket
End
------------------------------------------------------------------
\x{80000001}
0: \x{80000001}
\x{80000003}
0: \x{80000003}
\= Expect no match
\x{80000002}
No match
# --------------
# EXTENDED CHARACTER CLASSES (Perl)
# META_BIGVALUE tests
/(?[[\x{80000000}-\x{8000000f}]+\x{8fffffff}])/B
------------------------------------------------------------------
Bra
eclass[
no bitmap
xclass: [\x{80000000}-\x{8000000f}]
xclass: [\x{8fffffff}]
OR
]
Ket
End
------------------------------------------------------------------
\x{80000002}
0: \x{80000002}
\x{8fffffff}
0: \x{8fffffff}
\= Expect no match
\x{7fffffff}
No match
\x{90000000}
No match
/(?[[\x{80000000}-\x{8000000f}]-\x{80000002}])/B
------------------------------------------------------------------
Bra
eclass[
no bitmap
xclass: [\x{80000000}-\x{8000000f}]
xclass: [^\x{80000002}]
AND
]
Ket
End
------------------------------------------------------------------
\x{80000001}
0: \x{80000001}
\x{80000003}
0: \x{80000003}
\= Expect no match
\x{80000002}
No match
/(?[[\x{80000000}-\x{8000000f}]-\x{80000002}])/B
------------------------------------------------------------------
Bra
eclass[
no bitmap
xclass: [\x{80000000}-\x{8000000f}]
xclass: [^\x{80000002}]
AND
]
Ket
End
------------------------------------------------------------------
\x{80000001}
0: \x{80000001}
\x{80000003}
0: \x{80000003}
\= Expect no match
\x{80000002}
No match
# --------------
# End of testinput11
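
A note on reading the UTF-16 output that follows: for patterns such as /\x{10000}/ or /\x{212ab}/ with the utf option, the 16-bit library reports a "First code unit" and "Last code unit" in the \x{d800}-\x{dfff} range. Those values are consistent with the standard UTF-16 surrogate-pair encoding of code points above 0xffff. The sketch below is not part of PCRE2 or pcre2test. The function name `utf16_code_units` is a hypothetical helper, included only to show the arithmetic.

```python
def utf16_code_units(cp: int) -> list[int]:
    """Return the UTF-16 code units that encode the code point cp.

    Hypothetical illustration only; this is not a PCRE2 API.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point is outside the Unicode range")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-16")
    if cp < 0x10000:
        # Basic Multilingual Plane: a single code unit.
        return [cp]
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    v = cp - 0x10000
    high = 0xD800 + (v >> 10)    # high (leading) surrogate
    low = 0xDC00 + (v & 0x3FF)   # low (trailing) surrogate
    return [high, low]


if __name__ == "__main__":
    for cp in (0xFFFF, 0x10000, 0x100000, 0x10FFFF, 0x212AB):
        units = ", ".join(f"\\x{{{u:x}}}" for u in utf16_code_units(cp))
        print(f"\\x{{{cp:x}}} -> {units}")
```

For example, 0x212ab gives the pair d844/deab, and 0x10ffff gives dbff/dfff, which can be compared against the code-unit lines in the output below.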

2040
3rd/pcre2/testdata/testoutput12-16 vendored Normal file

@@ -0,0 +1,2040 @@
# This set of tests is for UTF-16 and UTF-32 support, including Unicode
# properties. It is relevant only to the 16-bit and 32-bit libraries. The
# output is different for each library, so there are separate output files.
/<2F><><EFBFBD>xxx/IB,utf,no_utf_check
** Failed: invalid UTF-8 string cannot be converted to 16-bit string
/abc/utf
<20>]
** Failed: invalid UTF-8 string cannot be used as input in UTF mode
# Check maximum character size
/\x{ffff}/IB,utf
------------------------------------------------------------------
Bra
\x{ffff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{ffff}
Subject length lower bound = 1
/\x{10000}/IB,utf
------------------------------------------------------------------
Bra
\x{10000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{d800}
Last code unit = \x{dc00}
Subject length lower bound = 1
/\x{100}/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Subject length lower bound = 1
/\x{1000}/IB,utf
------------------------------------------------------------------
Bra
\x{1000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{1000}
Subject length lower bound = 1
/\x{10000}/IB,utf
------------------------------------------------------------------
Bra
\x{10000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{d800}
Last code unit = \x{dc00}
Subject length lower bound = 1
/\x{100000}/IB,utf
------------------------------------------------------------------
Bra
\x{100000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{dbc0}
Last code unit = \x{dc00}
Subject length lower bound = 1
/\x{10ffff}/IB,utf
------------------------------------------------------------------
Bra
\x{10ffff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{dbff}
Last code unit = \x{dfff}
Subject length lower bound = 1
/[\x{ff}]/IB,utf
------------------------------------------------------------------
Bra
\x{ff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xff
Subject length lower bound = 1
/[\x{100}]/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Subject length lower bound = 1
/\x80/IB,utf
------------------------------------------------------------------
Bra
\x{80}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x80
Subject length lower bound = 1
/\xff/IB,utf
------------------------------------------------------------------
Bra
\x{ff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xff
Subject length lower bound = 1
/\x{D55c}\x{ad6d}\x{C5B4}/IB,utf
------------------------------------------------------------------
Bra
\x{d55c}\x{ad6d}\x{c5b4}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{d55c}
Last code unit = \x{c5b4}
Subject length lower bound = 3
\x{D55c}\x{ad6d}\x{C5B4}
0: \x{d55c}\x{ad6d}\x{c5b4}
/\x{65e5}\x{672c}\x{8a9e}/IB,utf
------------------------------------------------------------------
Bra
\x{65e5}\x{672c}\x{8a9e}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{65e5}
Last code unit = \x{8a9e}
Subject length lower bound = 3
\x{65e5}\x{672c}\x{8a9e}
0: \x{65e5}\x{672c}\x{8a9e}
/\x{80}/IB,utf
------------------------------------------------------------------
Bra
\x{80}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x80
Subject length lower bound = 1
/\x{084}/IB,utf
------------------------------------------------------------------
Bra
\x{84}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x84
Subject length lower bound = 1
/\x{104}/IB,utf
------------------------------------------------------------------
Bra
\x{104}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{104}
Subject length lower bound = 1
/\x{861}/IB,utf
------------------------------------------------------------------
Bra
\x{861}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{861}
Subject length lower bound = 1
/\x{212ab}/IB,utf
------------------------------------------------------------------
Bra
\x{212ab}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{d844}
Last code unit = \x{deab}
Subject length lower bound = 1
/[^ab\xC0-\xF0]/IB,utf
------------------------------------------------------------------
Bra
[^ab\xc0-\xf0]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
\x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e
\x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d
\x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac
\xad \xae \xaf \xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb
\xbc \xbd \xbe \xbf \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb
\xfc \xfd \xfe \xff
Subject length lower bound = 1
\x{f1}
0: \x{f1}
\x{bf}
0: \x{bf}
\x{100}
0: \x{100}
\x{1000}
0: \x{1000}
\= Expect no match
\x{c0}
No match
\x{f0}
No match
/(\x{100}+|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}++
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: x \xff
Subject length lower bound = 1
/(\x{100}*a|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}*+
a
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: a x \xff
Subject length lower bound = 1
/(\x{100}{0,2}a|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}{0,2}+
a
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: a x \xff
Subject length lower bound = 1
/(\x{100}{1,2}a|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}
\x{100}{0,1}+
a
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: x \xff
Subject length lower bound = 1
/\x{100}/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Subject length lower bound = 1
/a\x{100}\x{101}*/IB,utf
------------------------------------------------------------------
Bra
a\x{100}
\x{101}*+
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'a'
Last code unit = \x{100}
Subject length lower bound = 2
/a\x{100}\x{101}+/IB,utf
------------------------------------------------------------------
Bra
a\x{100}
\x{101}++
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'a'
Last code unit = \x{101}
Subject length lower bound = 3
/[^\x{c4}]/IB
------------------------------------------------------------------
Bra
[^\x{c4}] (not)
Ket
End
------------------------------------------------------------------
Capture group count = 0
Subject length lower bound = 1
/[\x{100}]/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Subject length lower bound = 1
\x{100}
0: \x{100}
Z\x{100}
0: \x{100}
\x{100}Z
0: \x{100}
/[\xff]/IB,utf
------------------------------------------------------------------
Bra
\x{ff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xff
Subject length lower bound = 1
>\x{ff}<
0: \x{ff}
/[^\xff]/IB,utf
------------------------------------------------------------------
Bra
[^\x{ff}] (not)
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Subject length lower bound = 1
/\x{100}abc(xyz(?1))/IB,utf
------------------------------------------------------------------
Bra
\x{100}abc
CBra 1
xyz
Recurse
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
First code unit = \x{100}
Last code unit = 'z'
Subject length lower bound = 7
/\777/I,utf
Capture group count = 0
Options: utf
First code unit = \x{1ff}
Subject length lower bound = 1
\x{1ff}
0: \x{1ff}
\777
0: \x{1ff}
/\x{100}+\x{200}/IB,utf
------------------------------------------------------------------
Bra
\x{100}++
\x{200}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Last code unit = \x{200}
Subject length lower bound = 2
/\x{100}+X/IB,utf
------------------------------------------------------------------
Bra
\x{100}++
X
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Last code unit = 'X'
Subject length lower bound = 2
/^[\QĀ\E-\QŐ\E/B,utf
Failed: error 106 at offset 13: missing terminating ] for character class
/X/utf
XX\x{d800}\=no_utf_check
0: X
XX\x{da00}\=no_utf_check
0: X
XX\x{dc00}\=no_utf_check
0: X
XX\x{de00}\=no_utf_check
0: X
XX\x{dfff}\=no_utf_check
0: X
\= Expect UTF error
XX\x{d800}
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
XX\x{da00}
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
XX\x{dc00}
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
XX\x{de00}
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
XX\x{dfff}
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
XX\x{110000}
** Failed: character \N{U+110000} is greater than 0x10ffff and therefore cannot be encoded as UTF-16
XX\x{d800}\x{1234}
Failed: error -25: UTF-16 error: invalid low surrogate at offset 2
\= Expect no match
XX\x{d800}\=offset=3
No match
/(?<=.)X/utf
XX\x{d800}\=offset=3
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
/(*UTF16)\x{11234}/
abcd\x{11234}pqr
0: \x{11234}
/(*UTF)\x{11234}/I
Capture group count = 0
Compile options: <none>
Overall options: utf
First code unit = \x{d804}
Last code unit = \x{de34}
Subject length lower bound = 1
abcd\x{11234}pqr
0: \x{11234}
/(*UTF-32)\x{11234}/
Failed: error 160 at offset 5: (*VERB) not recognized or malformed
abcd\x{11234}pqr
/(*UTF-32)\x{112}/
Failed: error 160 at offset 5: (*VERB) not recognized or malformed
abcd\x{11234}pqr
/(*CRLF)(*UTF16)(*BSR_UNICODE)a\Rb/I
Capture group count = 0
Compile options: <none>
Overall options: utf
\R matches any Unicode newline
Forced newline is CRLF
First code unit = 'a'
Last code unit = 'b'
Subject length lower bound = 3
/(*CRLF)(*UTF32)(*BSR_UNICODE)a\Rb/I
Failed: error 160 at offset 14: (*VERB) not recognized or malformed
/\h/I,utf
Capture group count = 0
Options: utf
Starting code units: \x09 \x20 \xa0 \xff
Subject length lower bound = 1
ABC\x{09}
0: \x{09}
ABC\x{20}
0:
ABC\x{a0}
0: \x{a0}
ABC\x{1680}
0: \x{1680}
ABC\x{180e}
0: \x{180e}
ABC\x{2000}
0: \x{2000}
ABC\x{202f}
0: \x{202f}
ABC\x{205f}
0: \x{205f}
ABC\x{3000}
0: \x{3000}
/\v/I,utf
Capture group count = 0
Options: utf
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Subject length lower bound = 1
ABC\x{0a}
0: \x{0a}
ABC\x{0b}
0: \x{0b}
ABC\x{0c}
0: \x{0c}
ABC\x{0d}
0: \x{0d}
ABC\x{85}
0: \x{85}
ABC\x{2028}
0: \x{2028}
/\h*A/I,utf
Capture group count = 0
Options: utf
Starting code units: \x09 \x20 A \xa0 \xff
Last code unit = 'A'
Subject length lower bound = 1
CDBABC
0: A
\x{2000}ABC
0: \x{2000}A
/\R*A/I,bsr=unicode,utf
Capture group count = 0
Options: utf
\R matches any Unicode newline
Starting code units: \x0a \x0b \x0c \x0d A \x85 \xff
Last code unit = 'A'
Subject length lower bound = 1
CDBABC
0: A
\x{2028}A
0: \x{2028}A
/\v+A/I,utf
Capture group count = 0
Options: utf
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Last code unit = 'A'
Subject length lower bound = 2
/\s?xxx\s/I,utf
Capture group count = 0
Options: utf
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 x
Last code unit = 'x'
Subject length lower bound = 4
/\sxxx\s/I,utf,tables=2
Capture group count = 0
Options: utf
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \x85 \xa0
Last code unit = 'x'
Subject length lower bound = 5
AB\x{85}xxx\x{a0}XYZ
0: \x{85}xxx\x{a0}
AB\x{a0}xxx\x{85}XYZ
0: \x{a0}xxx\x{85}
/\S \S/I,utf,tables=2
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
\x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C
D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h
i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84
\x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94
\x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa1 \xa2 \xa3 \xa4
\xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3
\xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2
\xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1
\xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0
\xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef
\xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe
\xff
Last code unit = ' '
Subject length lower bound = 3
\x{a2} \x{84}
0: \x{a2} \x{84}
A Z
0: A Z
/a+/utf
a\x{123}aa\=offset=1
0: aa
a\x{123}aa\=offset=2
0: aa
a\x{123}aa\=offset=3
0: a
\= Expect no match
a\x{123}aa\=offset=4
No match
\= Expect bad offset error
a\x{123}aa\=offset=5
Failed: error -33: bad offset value
a\x{123}aa\=offset=6
Failed: error -33: bad offset value
/\x{1234}+/Ii,utf
Capture group count = 0
Options: caseless utf
First code unit = \x{1234}
Subject length lower bound = 1
/\x{1234}+?/Ii,utf
Capture group count = 0
Options: caseless utf
First code unit = \x{1234}
Subject length lower bound = 1
/\x{1234}++/Ii,utf
Capture group count = 0
Options: caseless utf
First code unit = \x{1234}
Subject length lower bound = 1
/\x{1234}{2}/Ii,utf
Capture group count = 0
Options: caseless utf
First code unit = \x{1234}
Last code unit = \x{1234}
Subject length lower bound = 2
/[^\x{c4}]/IB,utf
------------------------------------------------------------------
Bra
[^\x{c4}] (not)
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Subject length lower bound = 1
/X+\x{200}/IB,utf
------------------------------------------------------------------
Bra
X++
\x{200}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'X'
Last code unit = \x{200}
Subject length lower bound = 2
/\R/I,utf
Capture group count = 0
Options: utf
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Subject length lower bound = 1
# Check bad offset
/a/utf
\= Expect bad UTF-16 offset, or no match in 32-bit
\x{10000}\=offset=1
Error -36 (bad UTF-16 offset)
\x{10000}ab\=offset=1
Error -36 (bad UTF-16 offset)
\= Expect 16-bit match, 32-bit no match
\x{10000}ab\=offset=2
0: a
\= Expect no match
\x{10000}ab\=offset=3
No match
\= Expect no match in 16-bit, bad offset in 32-bit
\x{10000}ab\=offset=4
No match
\= Expect bad offset
\x{10000}ab\=offset=5
Failed: error -33: bad offset value
/<2F><><EFBFBD>/utf
Failed: error -26 at offset 0: UTF-16 error: isolated low surrogate
/\w+\x{C4}/B,utf
------------------------------------------------------------------
Bra
\w++
\x{c4}
Ket
End
------------------------------------------------------------------
a\x{C4}\x{C4}
0: a\x{c4}
/\w+\x{C4}/B,utf,tables=2
------------------------------------------------------------------
Bra
\w+
\x{c4}
Ket
End
------------------------------------------------------------------
a\x{C4}\x{C4}
0: a\x{c4}\x{c4}
/\W+\x{C4}/B,utf
------------------------------------------------------------------
Bra
\W+
\x{c4}
Ket
End
------------------------------------------------------------------
!\x{C4}
0: !\x{c4}
/\W+\x{C4}/B,utf,tables=2
------------------------------------------------------------------
Bra
\W++
\x{c4}
Ket
End
------------------------------------------------------------------
!\x{C4}
0: !\x{c4}
/\W+\x{A1}/B,utf
------------------------------------------------------------------
Bra
\W+
\x{a1}
Ket
End
------------------------------------------------------------------
!\x{A1}
0: !\x{a1}
/\W+\x{A1}/B,utf,tables=2
------------------------------------------------------------------
Bra
\W+
\x{a1}
Ket
End
------------------------------------------------------------------
!\x{A1}
0: !\x{a1}
/X\s+\x{A0}/B,utf
------------------------------------------------------------------
Bra
X
\s++
\x{a0}
Ket
End
------------------------------------------------------------------
X\x20\x{A0}\x{A0}
0: X \x{a0}
/X\s+\x{A0}/B,utf,tables=2
------------------------------------------------------------------
Bra
X
\s+
\x{a0}
Ket
End
------------------------------------------------------------------
X\x20\x{A0}\x{A0}
0: X \x{a0}\x{a0}
/\S+\x{A0}/B,utf
------------------------------------------------------------------
Bra
\S+
\x{a0}
Ket
End
------------------------------------------------------------------
X\x{A0}\x{A0}
0: X\x{a0}\x{a0}
/\S+\x{A0}/B,utf,tables=2
------------------------------------------------------------------
Bra
\S++
\x{a0}
Ket
End
------------------------------------------------------------------
X\x{A0}\x{A0}
0: X\x{a0}
/\x{a0}+\s!/B,utf
------------------------------------------------------------------
Bra
\x{a0}++
\s
!
Ket
End
------------------------------------------------------------------
\x{a0}\x20!
0: \x{a0} !
/\x{a0}+\s!/B,utf,tables=2
------------------------------------------------------------------
Bra
\x{a0}+
\s
!
Ket
End
------------------------------------------------------------------
\x{a0}\x20!
0: \x{a0} !
/(*UTF)abc/never_utf
Failed: error 174 at offset 6: using UTF is disabled by the application
/abc/utf,never_utf
Failed: error 174 at offset 0: using UTF is disabled by the application
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/IBi,utf
------------------------------------------------------------------
Bra
/i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
First code unit = 'A' (caseless)
Last code unit = \x{1fb0} (caseless)
Subject length lower bound = 5
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/IB,utf
------------------------------------------------------------------
Bra
A\x{391}\x{10427}\x{ff3a}\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = \x{1fb0}
Subject length lower bound = 5
/AB\x{1fb0}/IB,utf
------------------------------------------------------------------
Bra
AB\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = \x{1fb0}
Subject length lower bound = 3
/AB\x{1fb0}/IBi,utf
------------------------------------------------------------------
Bra
/i AB\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
First code unit = 'A' (caseless)
Last code unit = \x{1fb0} (caseless)
Subject length lower bound = 3
/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/Ii,utf
Capture group count = 0
Options: caseless utf
First code unit = \x{401} (caseless)
Last code unit = \x{42f} (caseless)
Subject length lower bound = 17
\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
\x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
0: \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
/[ⱥ]/Bi,utf
------------------------------------------------------------------
Bra
/i \x{2c65}
Ket
End
------------------------------------------------------------------
/[^ⱥ]/Bi,utf
------------------------------------------------------------------
Bra
/i [^\x{2c65}] (not)
Ket
End
------------------------------------------------------------------
/[[:blank:]]/B,ucp
------------------------------------------------------------------
Bra
[\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]
Ket
End
------------------------------------------------------------------
/\x{212a}+/Ii,utf
Capture group count = 0
Options: caseless utf
Starting code units: K k \xff
Subject length lower bound = 1
KKkk\x{212a}
0: KKkk\x{212a}
/s+/Ii,utf
Capture group count = 0
Options: caseless utf
Starting code units: S s \xff
Subject length lower bound = 1
SSss\x{17f}
0: SSss\x{17f}
# Non-UTF characters should give errors in both 16-bit and 32-bit modes.
/\x{110000}/utf
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
/\o{4200000}/utf
Failed: error 134 at offset 10: character code point value in \x{} or \o{} is too large
/\x{100}*A/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
A
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: A \xff
Last code unit = 'A'
Subject length lower bound = 1
A
0: A
/\x{100}*\d(?R)/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\d
Recurse
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
Subject length lower bound = 1
/[Z\x{100}]/IB,utf
------------------------------------------------------------------
Bra
[Z\x{100}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: Z \xff
Subject length lower bound = 1
Z\x{100}
0: Z
\x{100}
0: \x{100}
\x{100}Z
0: \x{100}
/[z-\x{100}]/IB,utf
------------------------------------------------------------------
Bra
[z-\xff\x{100}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87
\x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96
\x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4 \xa5
\xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3 \xb4
\xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2 \xc3
\xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2
\xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1
\xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0
\xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
/[z\Qa-d]Ā\E]/IB,utf
------------------------------------------------------------------
Bra
[\-\]adz\x{100}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: - ] a d z \xff
Subject length lower bound = 1
\x{100}
0: \x{100}
Ā
0: \x{100}
/[ab\x{100}]abc(xyz(?1))/IB,utf
------------------------------------------------------------------
Bra
[ab\x{100}]
abc
CBra 1
xyz
Recurse
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: a b \xff
Last code unit = 'z'
Subject length lower bound = 7
/\x{100}*\s/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\s
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xff
Subject length lower bound = 1
/\x{100}*\d/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\d
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
Subject length lower bound = 1
/\x{100}*\w/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\w
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
\xff
Subject length lower bound = 1
/\x{100}*\D/IB,utf
------------------------------------------------------------------
Bra
\x{100}*
\D
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / : ; < = >
? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82
\x83 \x84 \x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91
\x92 \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0
\xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf
\xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe
\xbf \xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd
\xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc
\xdd \xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb
\xec \xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa
\xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
/\x{100}*\S/IB,utf
------------------------------------------------------------------
Bra
\x{100}*
\S
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
\x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C
D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h
i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84
\x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93
\x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 \xa2
\xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1
\xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0
\xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf
\xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde
\xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed
\xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc
\xfd \xfe \xff
Subject length lower bound = 1
/\x{100}*\W/IB,utf
------------------------------------------------------------------
Bra
\x{100}*
\W
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / : ; < = >
? @ [ \ ] ^ ` { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87 \x88 \x89
\x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96 \x97 \x98
\x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7
\xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6
\xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2 \xc3 \xc4 \xc5
\xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4
\xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 \xe2 \xe3
\xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 \xf1 \xf2
\xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
/[\x{105}-\x{109}]/IBi,utf
------------------------------------------------------------------
Bra
[\x{104}-\x{109}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: \xff
Subject length lower bound = 1
\x{104}
0: \x{104}
\x{105}
0: \x{105}
\x{109}
0: \x{109}
\= Expect no match
\x{100}
No match
\x{10a}
No match
/[z-\x{100}]/IBi,utf
------------------------------------------------------------------
Bra
[Zz-\xff\x{100}-\x{101}\x{178}\x{39c}\x{3bc}\x{1e9e}\x{212b}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
\x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4
\xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3
\xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2
\xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1
\xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0
\xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef
\xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe
\xff
Subject length lower bound = 1
Z
0: Z
z
0: z
\x{39c}
0: \x{39c}
\x{178}
0: \x{178}
|
0: |
\x{80}
0: \x{80}
\x{ff}
0: \x{ff}
\x{100}
0: \x{100}
\x{101}
0: \x{101}
\= Expect no match
\x{102}
No match
Y
No match
y
No match
/[z-\x{100}]/IBi,utf
------------------------------------------------------------------
Bra
[Zz-\xff\x{100}-\x{101}\x{178}\x{39c}\x{3bc}\x{1e9e}\x{212b}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
\x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4
\xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3
\xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2
\xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1
\xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0
\xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef
\xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe
\xff
Subject length lower bound = 1
/\x{3a3}B/IBi,utf
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3
/i B
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: \xff
Last code unit = 'B' (caseless)
Subject length lower bound = 2
/./utf
\x{110000}
** Failed: character \N{U+110000} is greater than 0x10ffff and therefore cannot be encoded as UTF-16
/(*UTF)ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/B
------------------------------------------------------------------
Bra
ab\x{fd}\x{bf}\x{bf}\x{bf}\x{bf}\x{bf}z
Ket
End
------------------------------------------------------------------
/ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/utf
** Failed: character value greater than 0x10ffff cannot be converted to UTF
/[\W\p{Any}]/B
------------------------------------------------------------------
Bra
AllAny
Ket
End
------------------------------------------------------------------
abc
0: a
123
0: 1
/[\W\pL]/B
------------------------------------------------------------------
Bra
[^0-9_]
Ket
End
------------------------------------------------------------------
abc
0: a
\x{100}
0: \x{100}
\x{308}
0: \x{308}
\= Expect no match
123
No match
/[\s[:^ascii:]]/B,ucp
------------------------------------------------------------------
Bra
[^\x00-\x08\x0e-\x1f!-\x7f]
Ket
End
------------------------------------------------------------------
/\pP/ucp
\x{7fffffff}
** Character \x{7fffffff} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
No match
# A special extra option allows escaped surrogate code points in 32-bit mode,
# but subjects containing them must not be UTF-checked. These patterns give
# errors in 16-bit mode.
/\x{d800}/I,utf,allow_surrogate_escapes
Failed: error 191 at offset 0: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is not allowed in UTF-16 mode
\x{d800}\=no_utf_check
/\udfff\o{157401}/utf,alt_bsux,allow_surrogate_escapes
Failed: error 191 at offset 0: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is not allowed in UTF-16 mode
\x{dfff}\x{df01}\=no_utf_check
# This has different starting code units in 8-bit mode.
/^[^ab]/IB,utf
------------------------------------------------------------------
Bra
^
[^ab]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Compile options: utf
Overall options: anchored utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
\x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e
\x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d
\x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac
\xad \xae \xaf \xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb
\xbc \xbd \xbe \xbf \xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca
\xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9
\xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8
\xe9 \xea \xeb \xec \xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7
\xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
c
0: c
\x{ff}
0: \x{ff}
\x{100}
0: \x{100}
\= Expect no match
aaa
No match
# Offsets are different in 8-bit mode.
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
123abcáyzabcdef789abcሴqr
1(2) Old 6 6 "" New 6 8 "<>"
2(2) Old 12 12 "" New 14 16 "<>"
3(2) Old 12 15 "def" New 16 21 "<def>"
4(2) Old 21 21 "" New 27 29 "<>"
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
# A few script run tests in non-UTF mode (but they need Unicode support)
/^(*script_run:.{4})/
\x{3041}\x{30a1}\x{3007}\x{3007} Hiragana Katakana Han Han
0: \x{3041}\x{30a1}\x{3007}\x{3007}
\x{30a1}\x{3041}\x{3007}\x{3007} Katakana Hiragana Han Han
0: \x{30a1}\x{3041}\x{3007}\x{3007}
\x{1100}\x{2e80}\x{2e80}\x{1101} Hangul Han Han Hangul
0: \x{1100}\x{2e80}\x{2e80}\x{1101}
/^(*sr:.*)/utf,allow_surrogate_escapes
Failed: error 191 at offset 0: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is not allowed in UTF-16 mode
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
\x{d800}\x{dfff} Surrogates (Unknown) \=no_utf_check
/(?(n/utf
Failed: error 142 at offset 4: syntax error in subpattern name (missing terminator?)
/(?(á/utf
Failed: error 142 at offset 4: syntax error in subpattern name (missing terminator?)
# Invalid UTF-16/32 tests.
/.../g,match_invalid_utf
abcd\x{df00}wxzy\x{df00}pqrs
0: abc
0: wxz
0: pqr
abcd\x{80}wxzy\x{df00}pqrs
0: abc
0: d\x{80}w
0: xzy
0: pqr
/abc/match_invalid_utf
ab\x{df00}ab\=ph
Partial match: ab
\= Expect no match
ab\x{df00}cdef\=ph
No match
/.a/match_invalid_utf
ab\=ph
Partial match: b
ab\=ps
Partial match: b
\= Expect no match
b\x{df00}\=ph
No match
b\x{df00}\=ps
No match
/.a$/match_invalid_utf
ab\=ph
Partial match: b
ab\=ps
Partial match: b
\= Expect no match
b\x{df00}\=ph
No match
b\x{df00}\=ps
No match
/ab$/match_invalid_utf
ab\x{df00}cdeab
0: ab
\= Expect no match
ab\x{df00}cde
No match
/.../g,match_invalid_utf
abcd\x{80}wxzy\x{df00}pqrs
0: abc
0: d\x{80}w
0: xzy
0: pqr
/(?<=x)../g,match_invalid_utf
abcd\x{80}wxzy\x{df00}pqrs
0: zy
abcd\x{80}wxzy\x{df00}xpqrs
0: zy
0: pq
/X$/match_invalid_utf
\= Expect no match
X\x{df00}
No match
/(?<=..)X/match_invalid_utf,aftertext
AB\x{df00}AQXYZ
0: X
0+ YZ
AB\x{df00}AQXYZ\=offset=5
0: X
0+ YZ
AB\x{df00}\x{df00}AXYZXC\=offset=5
0: X
0+ C
\= Expect no match
AB\x{df00}XYZ
No match
AB\x{df00}XYZ\=offset=3
No match
AB\x{df00}AXYZ
No match
AB\x{df00}AXYZ\=offset=4
No match
AB\x{df00}\x{df00}AXYZ\=offset=5
No match
/.../match_invalid_utf
\= Expect no match
A\x{d800}B
No match
A\x{110000}B
** Failed: character \N{U+110000} is greater than 0x10ffff and therefore cannot be encoded as UTF-16
/aa/utf,ucp,match_invalid_utf,global
aa\x{d800}aa
0: aa
0: aa
/aa/utf,ucp,match_invalid_utf,global
\x{d800}aa
0: aa
/A\z/utf,match_invalid_utf
A\x{df00}\n
No match
/ab$/match_invalid_utf
\= Expect no match
ab\x{df00}cde
No match
/ab\z/match_invalid_utf
\= Expect no match
ab\x{df00}cde
No match
/ab\Z/match_invalid_utf
\= Expect no match
ab\x{df00}cde
No match
/(..)(*scs:(1)ab\z)/match_invalid_utf
ab\x{df00}cde
0: ab
1: ab
/(..)(*scs:(1)ab\Z)/match_invalid_utf
ab\x{df00}cde
0: ab
1: ab
/(..)(*scs:(1)ab$)/match_invalid_utf
ab\x{df00}cde
0: ab
1: ab
# ----------------------------------------------------
/(*UTF)(?=\x{123})/I
Capture group count = 0
May match empty string
Compile options: <none>
Overall options: utf
First code unit = \x{123}
Subject length lower bound = 1
/[\x{c1}\x{e1}]X[\x{145}\x{146}]/I,utf
Capture group count = 0
Options: utf
First code unit = \xc1 (caseless)
Last code unit = \x{145} (caseless)
Subject length lower bound = 3
/[\xff\x{ffff}]/I,utf
Capture group count = 0
Options: utf
Starting code units: \xff
Subject length lower bound = 1
/[\xff\x{ff}]/I,utf
Capture group count = 0
Options: utf
Starting code units: \xff
Subject length lower bound = 1
/[\xff\x{ff}]/I
Capture group count = 0
Starting code units: \xff
Subject length lower bound = 1
/[Ss]/I
Capture group count = 0
First code unit = 'S' (caseless)
Subject length lower bound = 1
/[Ss]/I,utf
Capture group count = 0
Options: utf
Starting code units: S s
Subject length lower bound = 1
/(?:\x{ff}|\x{3000})/I,utf
Capture group count = 0
Options: utf
Starting code units: \xff
Subject length lower bound = 1
# ----------------------------------------------------
# UCP and casing tests
/\x{120}/iI
Capture group count = 0
Options: caseless
First code unit = \x{120}
Subject length lower bound = 1
/\x{c1}/iI,ucp
Capture group count = 0
Options: caseless ucp
First code unit = \xc1 (caseless)
Subject length lower bound = 1
/[\x{120}\x{121}]/iB,ucp
------------------------------------------------------------------
Bra
/i \x{120}
Ket
End
------------------------------------------------------------------
/[ab\x{120}]+/iB,ucp
------------------------------------------------------------------
Bra
[ABab\x{120}-\x{121}]++
Ket
End
------------------------------------------------------------------
aABb\x{121}\x{120}
0: aABb\x{121}\x{120}
/\x{c1}/i,no_start_optimize
\= Expect no match
\x{e1}
No match
/\x{120}\x{c1}/i,ucp,no_start_optimize
\x{121}\x{e1}
0: \x{121}\xe1
/\x{120}\x{c1}/i,ucp
\x{121}\x{e1}
0: \x{121}\xe1
/[^\x{120}]/i,no_start_optimize
\x{121}
0: \x{121}
/[^\x{120}]/i,ucp,no_start_optimize
\= Expect no match
\x{121}
No match
/[^\x{120}]/i
\x{121}
0: \x{121}
/[^\x{120}]/i,ucp
\= Expect no match
\x{121}
No match
/\x{120}{2}/i,ucp
\x{121}\x{121}
0: \x{121}\x{121}
/[^\x{120}]{2}/i,ucp
\= Expect no match
\x{121}\x{121}
No match
/\x{c1}+\x{e1}/iB,ucp
------------------------------------------------------------------
Bra
/i \x{c1}+
/i \x{e1}
Ket
End
------------------------------------------------------------------
\x{c1}\x{c1}\x{c1}
0: \xc1\xc1\xc1
/\x{c1}+\x{e1}/iIB,ucp
------------------------------------------------------------------
Bra
/i \x{c1}+
/i \x{e1}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless ucp
First code unit = \xc1 (caseless)
Last code unit = \xe1 (caseless)
Subject length lower bound = 2
\x{c1}\x{c1}\x{c1}
0: \xc1\xc1\xc1
\x{e1}\x{e1}\x{e1}
0: \xe1\xe1\xe1
/a|\x{c1}/iI,ucp
Capture group count = 0
Options: caseless ucp
Starting code units: A a \xc1 \xe1
Subject length lower bound = 1
\x{e1}xxx
0: \xe1
/\x{c1}|\x{e1}/iI,ucp
Capture group count = 0
Options: caseless ucp
First code unit = \xc1 (caseless)
Subject length lower bound = 1
/X(\x{e1})Y/ucp,replace=>\U$1<,substitute_extended
X\x{e1}Y
1: >\xc1<
/X(\x{121})Y/ucp,replace=>\U$1<,substitute_extended
X\x{121}Y
1: >\x{120}<
/s/i,ucp
\x{17f}
0: \x{17f}
/s/i,utf
\x{17f}
0: \x{17f}
/[^s]/i,ucp
\= Expect no match
\x{17f}
No match
/[^s]/i,utf
\= Expect no match
\x{17f}
No match
/(.) \1/i,ucp
i I
0: i I
1: i
/(.) \1/i,ucp,turkish_casing
\= Expect no match
i I
No match
/(.) \1/i,ucp
i I
0: i I
1: i
\x{212a} k
0: \x{212a} k
1: \x{212a}
\= Expect no match
i \x{0130}
No match
\x{0131} I
No match
/(.) \1/i,ucp,turkish_casing
\x{212a} k
0: \x{212a} k
1: \x{212a}
i \x{0130}
0: i \x{130}
1: i
\x{0131} I
0: \x{131} I
1: \x{131}
\= Expect no match
i I
No match
/(.) (?r:\1)/i,ucp,turkish_casing
i I
0: i I
1: i
\= Expect no match
i \x{0130}
No match
\x{0131} I
No match
\x{212a} k
No match
/[a-z][^i]I/ucp,turkish_casing
bII
0: bII
b\x{0130}I
0: b\x{130}I
b\x{0131}I
0: b\x{131}I
\= Expect no match
biI
No match
/[a-z][^i]I/i,ucp,turkish_casing
b\x{0131}I
0: b\x{131}I
bII
0: bII
\= Expect no match
biI
No match
b\x{0130}I
No match
/[a-z](?r:[^i])I/i,ucp,turkish_casing
b\x{0131}I
0: b\x{131}I
b\x{0130}I
0: b\x{130}I
\= Expect no match
bII
No match
biI
No match
/b(?r:[\x{00FF}-\x{FFEE}])/i,ucp,turkish_casing
b\x{0130}
0: b\x{130}
b\x{0131}
0: b\x{131}
B\x{212a}
0: B\x{212a}
\= Expect no match
bi
No match
bI
No match
bk
No match
/[\x60-\x7f]/i,ucp,turkish_casing
i
0: i
\= Expect no match
I
No match
/[\x60-\xc0]/i,ucp,turkish_casing
i
0: i
\= Expect no match
I
No match
/[\x80-\xc0]/i,ucp,turkish_casing
\= Expect no match
i
No match
I
No match
# ----------------------------------------------------
/b[\x{00FF}-\x{FFEE}]/ir
b\x{0130}
0: b\x{130}
b\x{0131}
0: b\x{131}
B\x{212a}
0: B\x{212a}
\= Expect no match
bi
No match
bI
No match
bk
No match
# Quantifier after a literal that has the value of META_ACCEPT (not UTF). This
# fails in 16-bit mode, but is OK for 32-bit.
/\x{802a0000}*/
Failed: error 134 at offset 11: character code point value in \x{} or \o{} is too large
\x{802a0000}\x{802a0000}
# UTF matching without UTF, check invalid UTF characters
/\X++/
a\x{110000}\x{ffffffff}
** Character \x{110000} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{ffffffff} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
0: a\x00\x{ffff}
# This used to loop in 32-bit mode; it will fail in 16-bit mode.
/[\x{ffffffff}]/caseless,ucp
Failed: error 134 at offset 12: character code point value in \x{} or \o{} is too large
\x{ffffffff}xyz
# These are 32-bit tests for handling 0xffffffff when in UCP caseless mode. They
# will give errors in 16-bit mode.
/k*\x{ffffffff}/caseless,ucp
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
\x{ffffffff}
/k+\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}
/k{2}\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 15: character code point value in \x{} or \o{} is too large
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 12: character code point value in \x{} or \o{} is too large
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k{2,}?Z/caseless,ucp,no_start_optimize,no_auto_possess
\= Expect no match
Kk\x{ffffffff}\x{ffffffff}\x{ffffffff}Z
** Character \x{ffffffff} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{ffffffff} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{ffffffff} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
No match
/[sk](?r:[sk])[sk]/Bi,ucp
------------------------------------------------------------------
Bra
[KSks\x{17f}\x{212a}]
Bra
[KSks]
Ket
[KSks\x{17f}\x{212a}]
Ket
End
------------------------------------------------------------------
SKS
0: SKS
sks
0: sks
\x{212a}S\x{17f}
0: \x{212a}S\x{17f}
\x{17f}K\x{212a}
0: \x{17f}K\x{212a}
\= Expect no match
s\x{212a}s
No match
K\x{17f}K
No match
# ---------------------------------------------------------
# End of testinput12

2030
3rd/pcre2/testdata/testoutput12-32 vendored Normal file
@@ -0,0 +1,2030 @@
# This set of tests is for UTF-16 and UTF-32 support, including Unicode
# properties. It is relevant only to the 16-bit and 32-bit libraries. The
# output is different for each library, so there are separate output files.
/<2F><><EFBFBD>xxx/IB,utf,no_utf_check
** Failed: invalid UTF-8 string cannot be converted to 32-bit string
/abc/utf
<20>]
** Failed: invalid UTF-8 string cannot be used as input in UTF mode
# Check maximum character size
/\x{ffff}/IB,utf
------------------------------------------------------------------
Bra
\x{ffff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{ffff}
Subject length lower bound = 1
/\x{10000}/IB,utf
------------------------------------------------------------------
Bra
\x{10000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{10000}
Subject length lower bound = 1
/\x{100}/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Subject length lower bound = 1
/\x{1000}/IB,utf
------------------------------------------------------------------
Bra
\x{1000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{1000}
Subject length lower bound = 1
/\x{10000}/IB,utf
------------------------------------------------------------------
Bra
\x{10000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{10000}
Subject length lower bound = 1
/\x{100000}/IB,utf
------------------------------------------------------------------
Bra
\x{100000}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100000}
Subject length lower bound = 1
/\x{10ffff}/IB,utf
------------------------------------------------------------------
Bra
\x{10ffff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{10ffff}
Subject length lower bound = 1
/[\x{ff}]/IB,utf
------------------------------------------------------------------
Bra
\x{ff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xff
Subject length lower bound = 1
/[\x{100}]/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Subject length lower bound = 1
/\x80/IB,utf
------------------------------------------------------------------
Bra
\x{80}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x80
Subject length lower bound = 1
/\xff/IB,utf
------------------------------------------------------------------
Bra
\x{ff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xff
Subject length lower bound = 1
/\x{D55c}\x{ad6d}\x{C5B4}/IB,utf
------------------------------------------------------------------
Bra
\x{d55c}\x{ad6d}\x{c5b4}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{d55c}
Last code unit = \x{c5b4}
Subject length lower bound = 3
\x{D55c}\x{ad6d}\x{C5B4}
0: \x{d55c}\x{ad6d}\x{c5b4}
/\x{65e5}\x{672c}\x{8a9e}/IB,utf
------------------------------------------------------------------
Bra
\x{65e5}\x{672c}\x{8a9e}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{65e5}
Last code unit = \x{8a9e}
Subject length lower bound = 3
\x{65e5}\x{672c}\x{8a9e}
0: \x{65e5}\x{672c}\x{8a9e}
/\x{80}/IB,utf
------------------------------------------------------------------
Bra
\x{80}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x80
Subject length lower bound = 1
/\x{084}/IB,utf
------------------------------------------------------------------
Bra
\x{84}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x84
Subject length lower bound = 1
/\x{104}/IB,utf
------------------------------------------------------------------
Bra
\x{104}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{104}
Subject length lower bound = 1
/\x{861}/IB,utf
------------------------------------------------------------------
Bra
\x{861}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{861}
Subject length lower bound = 1
/\x{212ab}/IB,utf
------------------------------------------------------------------
Bra
\x{212ab}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{212ab}
Subject length lower bound = 1
/[^ab\xC0-\xF0]/IB,utf
------------------------------------------------------------------
Bra
[^ab\xc0-\xf0]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
\x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e
\x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d
\x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac
\xad \xae \xaf \xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb
\xbc \xbd \xbe \xbf \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb
\xfc \xfd \xfe \xff
Subject length lower bound = 1
\x{f1}
0: \x{f1}
\x{bf}
0: \x{bf}
\x{100}
0: \x{100}
\x{1000}
0: \x{1000}
\= Expect no match
\x{c0}
No match
\x{f0}
No match
/(\x{100}+|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}++
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: x \xff
Subject length lower bound = 1
/(\x{100}*a|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}*+
a
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: a x \xff
Subject length lower bound = 1
/(\x{100}{0,2}a|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}{0,2}+
a
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: a x \xff
Subject length lower bound = 1
/(\x{100}{1,2}a|x)/IB,utf
------------------------------------------------------------------
Bra
CBra 1
\x{100}
\x{100}{0,1}+
a
Alt
x
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: x \xff
Subject length lower bound = 1
/\x{100}/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Subject length lower bound = 1
/a\x{100}\x{101}*/IB,utf
------------------------------------------------------------------
Bra
a\x{100}
\x{101}*+
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'a'
Last code unit = \x{100}
Subject length lower bound = 2
/a\x{100}\x{101}+/IB,utf
------------------------------------------------------------------
Bra
a\x{100}
\x{101}++
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'a'
Last code unit = \x{101}
Subject length lower bound = 3
/[^\x{c4}]/IB
------------------------------------------------------------------
Bra
[^\x{c4}] (not)
Ket
End
------------------------------------------------------------------
Capture group count = 0
Subject length lower bound = 1
/[\x{100}]/IB,utf
------------------------------------------------------------------
Bra
\x{100}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Subject length lower bound = 1
\x{100}
0: \x{100}
Z\x{100}
0: \x{100}
\x{100}Z
0: \x{100}
/[\xff]/IB,utf
------------------------------------------------------------------
Bra
\x{ff}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xff
Subject length lower bound = 1
>\x{ff}<
0: \x{ff}
/[^\xff]/IB,utf
------------------------------------------------------------------
Bra
[^\x{ff}] (not)
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Subject length lower bound = 1
/\x{100}abc(xyz(?1))/IB,utf
------------------------------------------------------------------
Bra
\x{100}abc
CBra 1
xyz
Recurse
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
First code unit = \x{100}
Last code unit = 'z'
Subject length lower bound = 7
/\777/I,utf
Capture group count = 0
Options: utf
First code unit = \x{1ff}
Subject length lower bound = 1
\x{1ff}
0: \x{1ff}
\777
0: \x{1ff}
/\x{100}+\x{200}/IB,utf
------------------------------------------------------------------
Bra
\x{100}++
\x{200}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Last code unit = \x{200}
Subject length lower bound = 2
/\x{100}+X/IB,utf
------------------------------------------------------------------
Bra
\x{100}++
X
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{100}
Last code unit = 'X'
Subject length lower bound = 2
/^[\QĀ\E-\QŐ\E/B,utf
Failed: error 106 at offset 13: missing terminating ] for character class
/X/utf
XX\x{d800}\=no_utf_check
0: X
XX\x{da00}\=no_utf_check
0: X
XX\x{dc00}\=no_utf_check
0: X
XX\x{de00}\=no_utf_check
0: X
XX\x{dfff}\=no_utf_check
0: X
\= Expect UTF error
XX\x{d800}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{da00}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{dc00}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{de00}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{dfff}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{110000}
Failed: error -28: UTF-32 error: code points greater than 0x10ffff are not defined at offset 2
XX\x{d800}\x{1234}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
\= Expect no match
XX\x{d800}\=offset=3
No match
/(?<=.)X/utf
XX\x{d800}\=offset=3
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
/(*UTF16)\x{11234}/
Failed: error 160 at offset 7: (*VERB) not recognized or malformed
abcd\x{11234}pqr
/(*UTF)\x{11234}/I
Capture group count = 0
Compile options: <none>
Overall options: utf
First code unit = \x{11234}
Subject length lower bound = 1
abcd\x{11234}pqr
0: \x{11234}
/(*UTF-32)\x{11234}/
Failed: error 160 at offset 5: (*VERB) not recognized or malformed
abcd\x{11234}pqr
/(*UTF-32)\x{112}/
Failed: error 160 at offset 5: (*VERB) not recognized or malformed
abcd\x{11234}pqr
/(*CRLF)(*UTF16)(*BSR_UNICODE)a\Rb/I
Failed: error 160 at offset 14: (*VERB) not recognized or malformed
/(*CRLF)(*UTF32)(*BSR_UNICODE)a\Rb/I
Capture group count = 0
Compile options: <none>
Overall options: utf
\R matches any Unicode newline
Forced newline is CRLF
First code unit = 'a'
Last code unit = 'b'
Subject length lower bound = 3
/\h/I,utf
Capture group count = 0
Options: utf
Starting code units: \x09 \x20 \xa0 \xff
Subject length lower bound = 1
ABC\x{09}
0: \x{09}
ABC\x{20}
0:
ABC\x{a0}
0: \x{a0}
ABC\x{1680}
0: \x{1680}
ABC\x{180e}
0: \x{180e}
ABC\x{2000}
0: \x{2000}
ABC\x{202f}
0: \x{202f}
ABC\x{205f}
0: \x{205f}
ABC\x{3000}
0: \x{3000}
/\v/I,utf
Capture group count = 0
Options: utf
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Subject length lower bound = 1
ABC\x{0a}
0: \x{0a}
ABC\x{0b}
0: \x{0b}
ABC\x{0c}
0: \x{0c}
ABC\x{0d}
0: \x{0d}
ABC\x{85}
0: \x{85}
ABC\x{2028}
0: \x{2028}
/\h*A/I,utf
Capture group count = 0
Options: utf
Starting code units: \x09 \x20 A \xa0 \xff
Last code unit = 'A'
Subject length lower bound = 1
CDBABC
0: A
\x{2000}ABC
0: \x{2000}A
/\R*A/I,bsr=unicode,utf
Capture group count = 0
Options: utf
\R matches any Unicode newline
Starting code units: \x0a \x0b \x0c \x0d A \x85 \xff
Last code unit = 'A'
Subject length lower bound = 1
CDBABC
0: A
\x{2028}A
0: \x{2028}A
/\v+A/I,utf
Capture group count = 0
Options: utf
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Last code unit = 'A'
Subject length lower bound = 2
/\s?xxx\s/I,utf
Capture group count = 0
Options: utf
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 x
Last code unit = 'x'
Subject length lower bound = 4
/\sxxx\s/I,utf,tables=2
Capture group count = 0
Options: utf
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \x85 \xa0
Last code unit = 'x'
Subject length lower bound = 5
AB\x{85}xxx\x{a0}XYZ
0: \x{85}xxx\x{a0}
AB\x{a0}xxx\x{85}XYZ
0: \x{a0}xxx\x{85}
/\S \S/I,utf,tables=2
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
\x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C
D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h
i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84
\x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94
\x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa1 \xa2 \xa3 \xa4
\xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3
\xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2
\xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1
\xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0
\xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef
\xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe
\xff
Last code unit = ' '
Subject length lower bound = 3
\x{a2} \x{84}
0: \x{a2} \x{84}
A Z
0: A Z
/a+/utf
a\x{123}aa\=offset=1
0: aa
a\x{123}aa\=offset=2
0: aa
a\x{123}aa\=offset=3
0: a
\= Expect no match
a\x{123}aa\=offset=4
No match
\= Expect bad offset error
a\x{123}aa\=offset=5
Failed: error -33: bad offset value
a\x{123}aa\=offset=6
Failed: error -33: bad offset value
/\x{1234}+/Ii,utf
Capture group count = 0
Options: caseless utf
First code unit = \x{1234}
Subject length lower bound = 1
/\x{1234}+?/Ii,utf
Capture group count = 0
Options: caseless utf
First code unit = \x{1234}
Subject length lower bound = 1
/\x{1234}++/Ii,utf
Capture group count = 0
Options: caseless utf
First code unit = \x{1234}
Subject length lower bound = 1
/\x{1234}{2}/Ii,utf
Capture group count = 0
Options: caseless utf
First code unit = \x{1234}
Last code unit = \x{1234}
Subject length lower bound = 2
/[^\x{c4}]/IB,utf
------------------------------------------------------------------
Bra
[^\x{c4}] (not)
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Subject length lower bound = 1
/X+\x{200}/IB,utf
------------------------------------------------------------------
Bra
X++
\x{200}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'X'
Last code unit = \x{200}
Subject length lower bound = 2
/\R/I,utf
Capture group count = 0
Options: utf
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
Subject length lower bound = 1
# Check bad offset
/a/utf
\= Expect bad UTF-16 offset, or no match in 32-bit
\x{10000}\=offset=1
No match
\x{10000}ab\=offset=1
0: a
\= Expect 16-bit match, 32-bit no match
\x{10000}ab\=offset=2
No match
\= Expect no match
\x{10000}ab\=offset=3
No match
\= Expect no match in 16-bit, bad offset in 32-bit
\x{10000}ab\=offset=4
Failed: error -33: bad offset value
\= Expect bad offset
\x{10000}ab\=offset=5
Failed: error -33: bad offset value
/<2F><><EFBFBD>/utf
Failed: error -27 at offset 0: UTF-32 error: code points 0xd800-0xdfff are not defined
/\w+\x{C4}/B,utf
------------------------------------------------------------------
Bra
\w++
\x{c4}
Ket
End
------------------------------------------------------------------
a\x{C4}\x{C4}
0: a\x{c4}
/\w+\x{C4}/B,utf,tables=2
------------------------------------------------------------------
Bra
\w+
\x{c4}
Ket
End
------------------------------------------------------------------
a\x{C4}\x{C4}
0: a\x{c4}\x{c4}
/\W+\x{C4}/B,utf
------------------------------------------------------------------
Bra
\W+
\x{c4}
Ket
End
------------------------------------------------------------------
!\x{C4}
0: !\x{c4}
/\W+\x{C4}/B,utf,tables=2
------------------------------------------------------------------
Bra
\W++
\x{c4}
Ket
End
------------------------------------------------------------------
!\x{C4}
0: !\x{c4}
/\W+\x{A1}/B,utf
------------------------------------------------------------------
Bra
\W+
\x{a1}
Ket
End
------------------------------------------------------------------
!\x{A1}
0: !\x{a1}
/\W+\x{A1}/B,utf,tables=2
------------------------------------------------------------------
Bra
\W+
\x{a1}
Ket
End
------------------------------------------------------------------
!\x{A1}
0: !\x{a1}
/X\s+\x{A0}/B,utf
------------------------------------------------------------------
Bra
X
\s++
\x{a0}
Ket
End
------------------------------------------------------------------
X\x20\x{A0}\x{A0}
0: X \x{a0}
/X\s+\x{A0}/B,utf,tables=2
------------------------------------------------------------------
Bra
X
\s+
\x{a0}
Ket
End
------------------------------------------------------------------
X\x20\x{A0}\x{A0}
0: X \x{a0}\x{a0}
/\S+\x{A0}/B,utf
------------------------------------------------------------------
Bra
\S+
\x{a0}
Ket
End
------------------------------------------------------------------
X\x{A0}\x{A0}
0: X\x{a0}\x{a0}
/\S+\x{A0}/B,utf,tables=2
------------------------------------------------------------------
Bra
\S++
\x{a0}
Ket
End
------------------------------------------------------------------
X\x{A0}\x{A0}
0: X\x{a0}
/\x{a0}+\s!/B,utf
------------------------------------------------------------------
Bra
\x{a0}++
\s
!
Ket
End
------------------------------------------------------------------
\x{a0}\x20!
0: \x{a0} !
/\x{a0}+\s!/B,utf,tables=2
------------------------------------------------------------------
Bra
\x{a0}+
\s
!
Ket
End
------------------------------------------------------------------
\x{a0}\x20!
0: \x{a0} !
/(*UTF)abc/never_utf
Failed: error 174 at offset 6: using UTF is disabled by the application
/abc/utf,never_utf
Failed: error 174 at offset 0: using UTF is disabled by the application
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/IBi,utf
------------------------------------------------------------------
Bra
/i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
First code unit = 'A' (caseless)
Last code unit = \x{1fb0} (caseless)
Subject length lower bound = 5
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/IB,utf
------------------------------------------------------------------
Bra
A\x{391}\x{10427}\x{ff3a}\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = \x{1fb0}
Subject length lower bound = 5
/AB\x{1fb0}/IB,utf
------------------------------------------------------------------
Bra
AB\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = \x{1fb0}
Subject length lower bound = 3
/AB\x{1fb0}/IBi,utf
------------------------------------------------------------------
Bra
/i AB\x{1fb0}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
First code unit = 'A' (caseless)
Last code unit = \x{1fb0} (caseless)
Subject length lower bound = 3
/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/Ii,utf
Capture group count = 0
Options: caseless utf
First code unit = \x{401} (caseless)
Last code unit = \x{42f} (caseless)
Subject length lower bound = 17
\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
\x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
0: \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
/[ⱥ]/Bi,utf
------------------------------------------------------------------
Bra
/i \x{2c65}
Ket
End
------------------------------------------------------------------
/[^ⱥ]/Bi,utf
------------------------------------------------------------------
Bra
/i [^\x{2c65}] (not)
Ket
End
------------------------------------------------------------------
/[[:blank:]]/B,ucp
------------------------------------------------------------------
Bra
[\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]
Ket
End
------------------------------------------------------------------
/\x{212a}+/Ii,utf
Capture group count = 0
Options: caseless utf
Starting code units: K k \xff
Subject length lower bound = 1
KKkk\x{212a}
0: KKkk\x{212a}
/s+/Ii,utf
Capture group count = 0
Options: caseless utf
Starting code units: S s \xff
Subject length lower bound = 1
SSss\x{17f}
0: SSss\x{17f}
# Non-UTF characters should give errors in both 16-bit and 32-bit modes.
/\x{110000}/utf
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
/\o{4200000}/utf
Failed: error 134 at offset 10: character code point value in \x{} or \o{} is too large
/\x{100}*A/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
A
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: A \xff
Last code unit = 'A'
Subject length lower bound = 1
A
0: A
/\x{100}*\d(?R)/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\d
Recurse
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
Subject length lower bound = 1
/[Z\x{100}]/IB,utf
------------------------------------------------------------------
Bra
[Z\x{100}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: Z \xff
Subject length lower bound = 1
Z\x{100}
0: Z
\x{100}
0: \x{100}
\x{100}Z
0: \x{100}
/[z-\x{100}]/IB,utf
------------------------------------------------------------------
Bra
[z-\xff\x{100}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87
\x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96
\x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4 \xa5
\xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3 \xb4
\xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2 \xc3
\xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2
\xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1
\xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0
\xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
/[z\Qa-d]Ā\E]/IB,utf
------------------------------------------------------------------
Bra
[\-\]adz\x{100}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: - ] a d z \xff
Subject length lower bound = 1
\x{100}
0: \x{100}
Ā
0: \x{100}
/[ab\x{100}]abc(xyz(?1))/IB,utf
------------------------------------------------------------------
Bra
[ab\x{100}]
abc
CBra 1
xyz
Recurse
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Options: utf
Starting code units: a b \xff
Last code unit = 'z'
Subject length lower bound = 7
/\x{100}*\s/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\s
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xff
Subject length lower bound = 1
/\x{100}*\d/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\d
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
Subject length lower bound = 1
/\x{100}*\w/IB,utf
------------------------------------------------------------------
Bra
\x{100}*+
\w
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
\xff
Subject length lower bound = 1
/\x{100}*\D/IB,utf
------------------------------------------------------------------
Bra
\x{100}*
\D
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / : ; < = >
? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82
\x83 \x84 \x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91
\x92 \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0
\xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf
\xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe
\xbf \xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd
\xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc
\xdd \xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb
\xec \xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa
\xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
/\x{100}*\S/IB,utf
------------------------------------------------------------------
Bra
\x{100}*
\S
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
\x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C
D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h
i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84
\x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93
\x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 \xa2
\xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1
\xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0
\xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf
\xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde
\xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed
\xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc
\xfd \xfe \xff
Subject length lower bound = 1
/\x{100}*\W/IB,utf
------------------------------------------------------------------
Bra
\x{100}*
\W
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / : ; < = >
? @ [ \ ] ^ ` { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87 \x88 \x89
\x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96 \x97 \x98
\x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7
\xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6
\xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2 \xc3 \xc4 \xc5
\xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4
\xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 \xe2 \xe3
\xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 \xf1 \xf2
\xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
/[\x{105}-\x{109}]/IBi,utf
------------------------------------------------------------------
Bra
[\x{104}-\x{109}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: \xff
Subject length lower bound = 1
\x{104}
0: \x{104}
\x{105}
0: \x{105}
\x{109}
0: \x{109}
\= Expect no match
\x{100}
No match
\x{10a}
No match
/[z-\x{100}]/IBi,utf
------------------------------------------------------------------
Bra
[Zz-\xff\x{100}-\x{101}\x{178}\x{39c}\x{3bc}\x{1e9e}\x{212b}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
\x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4
\xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3
\xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2
\xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1
\xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0
\xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef
\xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe
\xff
Subject length lower bound = 1
Z
0: Z
z
0: z
\x{39c}
0: \x{39c}
\x{178}
0: \x{178}
|
0: |
\x{80}
0: \x{80}
\x{ff}
0: \x{ff}
\x{100}
0: \x{100}
\x{101}
0: \x{101}
\= Expect no match
\x{102}
No match
Y
No match
y
No match
/[z-\x{100}]/IBi,utf
------------------------------------------------------------------
Bra
[Zz-\xff\x{100}-\x{101}\x{178}\x{39c}\x{3bc}\x{1e9e}\x{212b}]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
\x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4
\xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3
\xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2
\xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1
\xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0
\xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef
\xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe
\xff
Subject length lower bound = 1
/\x{3a3}B/IBi,utf
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3
/i B
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless utf
Starting code units: \xff
Last code unit = 'B' (caseless)
Subject length lower bound = 2
/./utf
\x{110000}
Failed: error -28: UTF-32 error: code points greater than 0x10ffff are not defined at offset 0
/(*UTF)ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/B
------------------------------------------------------------------
Bra
ab\x{fd}\x{bf}\x{bf}\x{bf}\x{bf}\x{bf}z
Ket
End
------------------------------------------------------------------
/ab<61><62><EFBFBD><EFBFBD><EFBFBD><EFBFBD>z/utf
** Failed: character value greater than 0x10ffff cannot be converted to UTF
/[\W\p{Any}]/B
------------------------------------------------------------------
Bra
AllAny
Ket
End
------------------------------------------------------------------
abc
0: a
123
0: 1
/[\W\pL]/B
------------------------------------------------------------------
Bra
[^0-9_]
Ket
End
------------------------------------------------------------------
abc
0: a
\x{100}
0: \x{100}
\x{308}
0: \x{308}
\= Expect no match
123
No match
/[\s[:^ascii:]]/B,ucp
------------------------------------------------------------------
Bra
[^\x00-\x08\x0e-\x1f!-\x7f]
Ket
End
------------------------------------------------------------------
/\pP/ucp
\x{7fffffff}
No match
# A special extra option allows escaped surrogate code points in 32-bit mode,
# but subjects containing them must not be UTF-checked. These patterns give
# errors in 16-bit mode.
/\x{d800}/I,utf,allow_surrogate_escapes
Capture group count = 0
Options: utf
Extra options: allow_surrogate_escapes
First code unit = \x{d800}
Subject length lower bound = 1
\x{d800}\=no_utf_check
0: \x{d800}
/\udfff\o{157401}/utf,alt_bsux,allow_surrogate_escapes
\x{dfff}\x{df01}\=no_utf_check
0: \x{dfff}\x{df01}
# This has different starting code units in 8-bit mode.
/^[^ab]/IB,utf
------------------------------------------------------------------
Bra
^
[^ab]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Compile options: utf
Overall options: anchored utf
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
\x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
\x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e
\x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d
\x9e \x9f \xa0 \xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac
\xad \xae \xaf \xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb
\xbc \xbd \xbe \xbf \xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca
\xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9
\xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8
\xe9 \xea \xeb \xec \xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7
\xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
Subject length lower bound = 1
c
0: c
\x{ff}
0: \x{ff}
\x{100}
0: \x{100}
\= Expect no match
aaa
No match
# Offsets are different in 8-bit mode.
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
123abcáyzabcdef789abcሴqr
1(2) Old 6 6 "" New 6 8 "<>"
2(2) Old 12 12 "" New 14 16 "<>"
3(2) Old 12 15 "def" New 16 21 "<def>"
4(2) Old 21 21 "" New 27 29 "<>"
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
# A few script run tests in non-UTF mode (but they need Unicode support)
/^(*script_run:.{4})/
\x{3041}\x{30a1}\x{3007}\x{3007} Hiragana Katakana Han Han
0: \x{3041}\x{30a1}\x{3007}\x{3007}
\x{30a1}\x{3041}\x{3007}\x{3007} Katakana Hiragana Han Han
0: \x{30a1}\x{3041}\x{3007}\x{3007}
\x{1100}\x{2e80}\x{2e80}\x{1101} Hangul Han Han Hangul
0: \x{1100}\x{2e80}\x{2e80}\x{1101}
/^(*sr:.*)/utf,allow_surrogate_escapes
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
0: \x{2e80}\x{3105}\x{2e80}
\x{d800}\x{dfff} Surrogates (Unknown) \=no_utf_check
0: \x{d800}
/(?(n/utf
Failed: error 142 at offset 4: syntax error in subpattern name (missing terminator?)
/(?(á/utf
Failed: error 142 at offset 4: syntax error in subpattern name (missing terminator?)
# Invalid UTF-16/32 tests.
/.../g,match_invalid_utf
abcd\x{df00}wxzy\x{df00}pqrs
0: abc
0: wxz
0: pqr
abcd\x{80}wxzy\x{df00}pqrs
0: abc
0: d\x{80}w
0: xzy
0: pqr
/abc/match_invalid_utf
ab\x{df00}ab\=ph
Partial match: ab
\= Expect no match
ab\x{df00}cdef\=ph
No match
/.a/match_invalid_utf
ab\=ph
Partial match: b
ab\=ps
Partial match: b
\= Expect no match
b\x{df00}\=ph
No match
b\x{df00}\=ps
No match
/.a$/match_invalid_utf
ab\=ph
Partial match: b
ab\=ps
Partial match: b
\= Expect no match
b\x{df00}\=ph
No match
b\x{df00}\=ps
No match
/ab$/match_invalid_utf
ab\x{df00}cdeab
0: ab
\= Expect no match
ab\x{df00}cde
No match
/.../g,match_invalid_utf
abcd\x{80}wxzy\x{df00}pqrs
0: abc
0: d\x{80}w
0: xzy
0: pqr
/(?<=x)../g,match_invalid_utf
abcd\x{80}wxzy\x{df00}pqrs
0: zy
abcd\x{80}wxzy\x{df00}xpqrs
0: zy
0: pq
/X$/match_invalid_utf
\= Expect no match
X\x{df00}
No match
/(?<=..)X/match_invalid_utf,aftertext
AB\x{df00}AQXYZ
0: X
0+ YZ
AB\x{df00}AQXYZ\=offset=5
0: X
0+ YZ
AB\x{df00}\x{df00}AXYZXC\=offset=5
0: X
0+ C
\= Expect no match
AB\x{df00}XYZ
No match
AB\x{df00}XYZ\=offset=3
No match
AB\x{df00}AXYZ
No match
AB\x{df00}AXYZ\=offset=4
No match
AB\x{df00}\x{df00}AXYZ\=offset=5
No match
/.../match_invalid_utf
\= Expect no match
A\x{d800}B
No match
A\x{110000}B
No match
/aa/utf,ucp,match_invalid_utf,global
aa\x{d800}aa
0: aa
0: aa
/aa/utf,ucp,match_invalid_utf,global
\x{d800}aa
0: aa
/A\z/utf,match_invalid_utf
A\x{df00}\n
No match
/ab$/match_invalid_utf
\= Expect no match
ab\x{df00}cde
No match
/ab\z/match_invalid_utf
\= Expect no match
ab\x{df00}cde
No match
/ab\Z/match_invalid_utf
\= Expect no match
ab\x{df00}cde
No match
/(..)(*scs:(1)ab\z)/match_invalid_utf
ab\x{df00}cde
0: ab
1: ab
/(..)(*scs:(1)ab\Z)/match_invalid_utf
ab\x{df00}cde
0: ab
1: ab
/(..)(*scs:(1)ab$)/match_invalid_utf
ab\x{df00}cde
0: ab
1: ab
# ----------------------------------------------------
/(*UTF)(?=\x{123})/I
Capture group count = 0
May match empty string
Compile options: <none>
Overall options: utf
First code unit = \x{123}
Subject length lower bound = 1
/[\x{c1}\x{e1}]X[\x{145}\x{146}]/I,utf
Capture group count = 0
Options: utf
First code unit = \xc1 (caseless)
Last code unit = \x{145} (caseless)
Subject length lower bound = 3
/[\xff\x{ffff}]/I,utf
Capture group count = 0
Options: utf
Starting code units: \xff
Subject length lower bound = 1
/[\xff\x{ff}]/I,utf
Capture group count = 0
Options: utf
Starting code units: \xff
Subject length lower bound = 1
/[\xff\x{ff}]/I
Capture group count = 0
Starting code units: \xff
Subject length lower bound = 1
/[Ss]/I
Capture group count = 0
First code unit = 'S' (caseless)
Subject length lower bound = 1
/[Ss]/I,utf
Capture group count = 0
Options: utf
Starting code units: S s
Subject length lower bound = 1
/(?:\x{ff}|\x{3000})/I,utf
Capture group count = 0
Options: utf
Starting code units: \xff
Subject length lower bound = 1
# ----------------------------------------------------
# UCP and casing tests
/\x{120}/iI
Capture group count = 0
Options: caseless
First code unit = \x{120}
Subject length lower bound = 1
/\x{c1}/iI,ucp
Capture group count = 0
Options: caseless ucp
First code unit = \xc1 (caseless)
Subject length lower bound = 1
/[\x{120}\x{121}]/iB,ucp
------------------------------------------------------------------
Bra
/i \x{120}
Ket
End
------------------------------------------------------------------
/[ab\x{120}]+/iB,ucp
------------------------------------------------------------------
Bra
[ABab\x{120}-\x{121}]++
Ket
End
------------------------------------------------------------------
aABb\x{121}\x{120}
0: aABb\x{121}\x{120}
/\x{c1}/i,no_start_optimize
\= Expect no match
\x{e1}
No match
/\x{120}\x{c1}/i,ucp,no_start_optimize
\x{121}\x{e1}
0: \x{121}\xe1
/\x{120}\x{c1}/i,ucp
\x{121}\x{e1}
0: \x{121}\xe1
/[^\x{120}]/i,no_start_optimize
\x{121}
0: \x{121}
/[^\x{120}]/i,ucp,no_start_optimize
\= Expect no match
\x{121}
No match
/[^\x{120}]/i
\x{121}
0: \x{121}
/[^\x{120}]/i,ucp
\= Expect no match
\x{121}
No match
/\x{120}{2}/i,ucp
\x{121}\x{121}
0: \x{121}\x{121}
/[^\x{120}]{2}/i,ucp
\= Expect no match
\x{121}\x{121}
No match
/\x{c1}+\x{e1}/iB,ucp
------------------------------------------------------------------
Bra
/i \x{c1}+
/i \x{e1}
Ket
End
------------------------------------------------------------------
\x{c1}\x{c1}\x{c1}
0: \xc1\xc1\xc1
/\x{c1}+\x{e1}/iIB,ucp
------------------------------------------------------------------
Bra
/i \x{c1}+
/i \x{e1}
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: caseless ucp
First code unit = \xc1 (caseless)
Last code unit = \xe1 (caseless)
Subject length lower bound = 2
\x{c1}\x{c1}\x{c1}
0: \xc1\xc1\xc1
\x{e1}\x{e1}\x{e1}
0: \xe1\xe1\xe1
/a|\x{c1}/iI,ucp
Capture group count = 0
Options: caseless ucp
Starting code units: A a \xc1 \xe1
Subject length lower bound = 1
\x{e1}xxx
0: \xe1
/\x{c1}|\x{e1}/iI,ucp
Capture group count = 0
Options: caseless ucp
First code unit = \xc1 (caseless)
Subject length lower bound = 1
/X(\x{e1})Y/ucp,replace=>\U$1<,substitute_extended
X\x{e1}Y
1: >\xc1<
/X(\x{121})Y/ucp,replace=>\U$1<,substitute_extended
X\x{121}Y
1: >\x{120}<
/s/i,ucp
\x{17f}
0: \x{17f}
/s/i,utf
\x{17f}
0: \x{17f}
/[^s]/i,ucp
\= Expect no match
\x{17f}
No match
/[^s]/i,utf
\= Expect no match
\x{17f}
No match
/(.) \1/i,ucp
i I
0: i I
1: i
/(.) \1/i,ucp,turkish_casing
\= Expect no match
i I
No match
/(.) \1/i,ucp
i I
0: i I
1: i
\x{212a} k
0: \x{212a} k
1: \x{212a}
\= Expect no match
i \x{0130}
No match
\x{0131} I
No match
/(.) \1/i,ucp,turkish_casing
\x{212a} k
0: \x{212a} k
1: \x{212a}
i \x{0130}
0: i \x{130}
1: i
\x{0131} I
0: \x{131} I
1: \x{131}
\= Expect no match
i I
No match
/(.) (?r:\1)/i,ucp,turkish_casing
i I
0: i I
1: i
\= Expect no match
i \x{0130}
No match
\x{0131} I
No match
\x{212a} k
No match
/[a-z][^i]I/ucp,turkish_casing
bII
0: bII
b\x{0130}I
0: b\x{130}I
b\x{0131}I
0: b\x{131}I
\= Expect no match
biI
No match
/[a-z][^i]I/i,ucp,turkish_casing
b\x{0131}I
0: b\x{131}I
bII
0: bII
\= Expect no match
biI
No match
b\x{0130}I
No match
/[a-z](?r:[^i])I/i,ucp,turkish_casing
b\x{0131}I
0: b\x{131}I
b\x{0130}I
0: b\x{130}I
\= Expect no match
bII
No match
biI
No match
/b(?r:[\x{00FF}-\x{FFEE}])/i,ucp,turkish_casing
b\x{0130}
0: b\x{130}
b\x{0131}
0: b\x{131}
B\x{212a}
0: B\x{212a}
\= Expect no match
bi
No match
bI
No match
bk
No match
/[\x60-\x7f]/i,ucp,turkish_casing
i
0: i
\= Expect no match
I
No match
/[\x60-\xc0]/i,ucp,turkish_casing
i
0: i
\= Expect no match
I
No match
/[\x80-\xc0]/i,ucp,turkish_casing
\= Expect no match
i
No match
I
No match
# ----------------------------------------------------
/b[\x{00FF}-\x{FFEE}]/ir
b\x{0130}
0: b\x{130}
b\x{0131}
0: b\x{131}
B\x{212a}
0: B\x{212a}
\= Expect no match
bi
No match
bI
No match
bk
No match
# Quantifier after a literal that has the value of META_ACCEPT (not UTF). This
# fails in 16-bit mode, but is OK for 32-bit.
/\x{802a0000}*/
\x{802a0000}\x{802a0000}
0: \x{802a0000}\x{802a0000}
# UTF matching without UTF, check invalid UTF characters
/\X++/
a\x{110000}\x{ffffffff}
0: a\x{110000}\x{ffffffff}
# This used to loop in 32-bit mode; it will fail in 16-bit mode.
/[\x{ffffffff}]/caseless,ucp
\x{ffffffff}xyz
0: \x{ffffffff}
# These are 32-bit tests for handling 0xffffffff when in UCP caseless mode. They
# will give errors in 16-bit mode.
/k*\x{ffffffff}/caseless,ucp
\x{ffffffff}
0: \x{ffffffff}
/k+\x{ffffffff}/caseless,ucp,no_start_optimize
K\x{ffffffff}
0: K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}
No match
/k{2}\x{ffffffff}/caseless,ucp,no_start_optimize
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
No match
/k\x{ffffffff}/caseless,ucp,no_start_optimize
K\x{ffffffff}
0: K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
No match
/k{2,}?Z/caseless,ucp,no_start_optimize,no_auto_possess
\= Expect no match
Kk\x{ffffffff}\x{ffffffff}\x{ffffffff}Z
No match
/[sk](?r:[sk])[sk]/Bi,ucp
------------------------------------------------------------------
Bra
[KSks\x{17f}\x{212a}]
Bra
[KSks]
Ket
[KSks\x{17f}\x{212a}]
Ket
End
------------------------------------------------------------------
SKS
0: SKS
sks
0: sks
\x{212a}S\x{17f}
0: \x{212a}S\x{17f}
\x{17f}K\x{212a}
0: \x{17f}K\x{212a}
\= Expect no match
s\x{212a}s
No match
K\x{17f}K
No match
# ---------------------------------------------------------
# End of testinput12

27
3rd/pcre2/testdata/testoutput13 vendored Normal file

@@ -0,0 +1,27 @@
# These DFA tests are for the handling of characters greater than 255 in
# 16-bit or 32-bit, non-UTF mode.
#forbid_utf
#subject dfa
/^\x{ffff}+/i
\x{ffff}
0: \x{ffff}
/^\x{ffff}?/i
\x{ffff}
0: \x{ffff}
/^\x{ffff}*/i
\x{ffff}
0: \x{ffff}
/^\x{ffff}{3}/i
\x{ffff}\x{ffff}\x{ffff}
0: \x{ffff}\x{ffff}\x{ffff}
/^\x{ffff}{0,3}/i
\x{ffff}
0: \x{ffff}
# End of testinput13

163
3rd/pcre2/testdata/testoutput14-16 vendored Normal file

@@ -0,0 +1,163 @@
# These test special UTF and UCP features of DFA matching. The output is
# different for the different widths.
#subject dfa
# ----------------------------------------------------
# These are a selection of the more comprehensive tests that are run for
# non-DFA matching.
/X/utf
XX\x{d800}
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
XX\x{d800}\=offset=3
No match
XX\x{d800}\=no_utf_check
0: X
XX\x{da00}
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
XX\x{da00}\=no_utf_check
0: X
XX\x{dc00}
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
XX\x{dc00}\=no_utf_check
0: X
XX\x{de00}
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
XX\x{de00}\=no_utf_check
0: X
XX\x{dfff}
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
XX\x{dfff}\=no_utf_check
0: X
XX\x{110000}
** Failed: character \N{U+110000} is greater than 0x10ffff and therefore cannot be encoded as UTF-16
XX\x{d800}\x{1234}
Failed: error -25: UTF-16 error: invalid low surrogate at offset 2
/badutf/utf
X\xdf
No match
XX\xef
No match
XXX\xef\x80
No match
X\xf7
No match
XX\xf7\x80
No match
XXX\xf7\x80\x80
No match
/shortutf/utf
XX\xdf\=ph
No match
XX\xef\=ph
No match
XX\xef\x80\=ph
No match
\xf7\=ph
No match
\xf7\x80\=ph
No match
# ----------------------------------------------------
# UCP and casing tests - except for the first two, these will all fail in 8-bit
# mode because they are testing UCP without UTF and use characters > 255.
/\x{c1}/i,no_start_optimize
\= Expect no match
\x{e1}
No match
/\x{c1}+\x{e1}/iB,ucp
------------------------------------------------------------------
Bra
/i \x{c1}+
/i \x{e1}
Ket
End
------------------------------------------------------------------
\x{c1}\x{c1}\x{c1}
0: \xc1\xc1\xc1
1: \xc1\xc1
\x{e1}\x{e1}\x{e1}
0: \xe1\xe1\xe1
1: \xe1\xe1
/\x{120}\x{c1}/i,ucp,no_start_optimize
\x{121}\x{e1}
0: \x{121}\xe1
/\x{120}\x{c1}/i,ucp
\x{121}\x{e1}
0: \x{121}\xe1
/[^\x{120}]/i,no_start_optimize
\x{121}
0: \x{121}
/[^\x{120}]/i,ucp,no_start_optimize
\= Expect no match
\x{121}
No match
/[^\x{120}]/i
\x{121}
0: \x{121}
/[^\x{120}]/i,ucp
\= Expect no match
\x{121}
No match
/\x{120}{2}/i,ucp
\x{121}\x{121}
0: \x{121}\x{121}
/[^\x{120}]{2}/i,ucp
\= Expect no match
\x{121}\x{121}
No match
# ----------------------------------------------------
# ----------------------------------------------------
# Tests for handling 0xffffffff in caseless UCP mode. They only apply to 32-bit
# mode; for the other widths they will fail.
/k*\x{ffffffff}/caseless,ucp
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
\x{ffffffff}
/k+\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}
/k{2}\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 15: character code point value in \x{} or \o{} is too large
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 12: character code point value in \x{} or \o{} is too large
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k{2,}?Z/caseless,ucp,no_start_optimize,no_auto_possess
\= Expect no match
Kk\x{ffffffff}\x{ffffffff}\x{ffffffff}Z
** Character \x{ffffffff} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{ffffffff} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{ffffffff} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
No match
# ----------------------------------------------------
# End of testinput14

159
3rd/pcre2/testdata/testoutput14-32 vendored Normal file

@@ -0,0 +1,159 @@
# These test special UTF and UCP features of DFA matching. The output is
# different for the different widths.
#subject dfa
# ----------------------------------------------------
# These are a selection of the more comprehensive tests that are run for
# non-DFA matching.
/X/utf
XX\x{d800}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{d800}\=offset=3
No match
XX\x{d800}\=no_utf_check
0: X
XX\x{da00}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{da00}\=no_utf_check
0: X
XX\x{dc00}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{dc00}\=no_utf_check
0: X
XX\x{de00}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{de00}\=no_utf_check
0: X
XX\x{dfff}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{dfff}\=no_utf_check
0: X
XX\x{110000}
Failed: error -28: UTF-32 error: code points greater than 0x10ffff are not defined at offset 2
XX\x{d800}\x{1234}
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
/badutf/utf
X\xdf
No match
XX\xef
No match
XXX\xef\x80
No match
X\xf7
No match
XX\xf7\x80
No match
XXX\xf7\x80\x80
No match
/shortutf/utf
XX\xdf\=ph
No match
XX\xef\=ph
No match
XX\xef\x80\=ph
No match
\xf7\=ph
No match
\xf7\x80\=ph
No match
# ----------------------------------------------------
# UCP and casing tests - except for the first two, these will all fail in 8-bit
# mode because they are testing UCP without UTF and use characters > 255.
/\x{c1}/i,no_start_optimize
\= Expect no match
\x{e1}
No match
/\x{c1}+\x{e1}/iB,ucp
------------------------------------------------------------------
Bra
/i \x{c1}+
/i \x{e1}
Ket
End
------------------------------------------------------------------
\x{c1}\x{c1}\x{c1}
0: \xc1\xc1\xc1
1: \xc1\xc1
\x{e1}\x{e1}\x{e1}
0: \xe1\xe1\xe1
1: \xe1\xe1
/\x{120}\x{c1}/i,ucp,no_start_optimize
\x{121}\x{e1}
0: \x{121}\xe1
/\x{120}\x{c1}/i,ucp
\x{121}\x{e1}
0: \x{121}\xe1
/[^\x{120}]/i,no_start_optimize
\x{121}
0: \x{121}
/[^\x{120}]/i,ucp,no_start_optimize
\= Expect no match
\x{121}
No match
/[^\x{120}]/i
\x{121}
0: \x{121}
/[^\x{120}]/i,ucp
\= Expect no match
\x{121}
No match
/\x{120}{2}/i,ucp
\x{121}\x{121}
0: \x{121}\x{121}
/[^\x{120}]{2}/i,ucp
\= Expect no match
\x{121}\x{121}
No match
# ----------------------------------------------------
# ----------------------------------------------------
# Tests for handling 0xffffffff in caseless UCP mode. They only apply to 32-bit
# mode; for the other widths they will fail.
/k*\x{ffffffff}/caseless,ucp
\x{ffffffff}
0: \x{ffffffff}
/k+\x{ffffffff}/caseless,ucp,no_start_optimize
K\x{ffffffff}
0: K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}
No match
/k{2}\x{ffffffff}/caseless,ucp,no_start_optimize
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
No match
/k\x{ffffffff}/caseless,ucp,no_start_optimize
K\x{ffffffff}
0: K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
No match
/k{2,}?Z/caseless,ucp,no_start_optimize,no_auto_possess
\= Expect no match
Kk\x{ffffffff}\x{ffffffff}\x{ffffffff}Z
No match
# ----------------------------------------------------
# End of testinput14

163
3rd/pcre2/testdata/testoutput14-8 vendored Normal file

@@ -0,0 +1,163 @@
# These test special UTF and UCP features of DFA matching. The output is
# different for the different widths.
#subject dfa
# ----------------------------------------------------
# These are a selection of the more comprehensive tests that are run for
# non-DFA matching.
/X/utf
XX\x{d800}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{d800}\=offset=3
Error -36 (bad UTF-8 offset)
XX\x{d800}\=no_utf_check
0: X
XX\x{da00}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{da00}\=no_utf_check
0: X
XX\x{dc00}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{dc00}\=no_utf_check
0: X
XX\x{de00}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{de00}\=no_utf_check
0: X
XX\x{dfff}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{dfff}\=no_utf_check
0: X
XX\x{110000}
Failed: error -15: UTF-8 error: code points greater than 0x10ffff are not defined at offset 2
XX\x{d800}\x{1234}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
/badutf/utf
X\xdf
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 1
XX\xef
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
XXX\xef\x80
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 3
X\xf7
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 1
XX\xf7\x80
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
XXX\xf7\x80\x80
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 3
/shortutf/utf
XX\xdf\=ph
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 2
XX\xef\=ph
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
XX\xef\x80\=ph
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 2
\xf7\=ph
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 0
\xf7\x80\=ph
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 0
# ----------------------------------------------------
# UCP and casing tests - except for the first two, these will all fail in 8-bit
# mode because they are testing UCP without UTF and use characters > 255.
/\x{c1}/i,no_start_optimize
\= Expect no match
\x{e1}
No match
/\x{c1}+\x{e1}/iB,ucp
------------------------------------------------------------------
Bra
/i \x{c1}+
/i \x{e1}
Ket
End
------------------------------------------------------------------
\x{c1}\x{c1}\x{c1}
0: \xc1\xc1\xc1
1: \xc1\xc1
\x{e1}\x{e1}\x{e1}
0: \xe1\xe1\xe1
1: \xe1\xe1
/\x{120}\x{c1}/i,ucp,no_start_optimize
Failed: error 134 at offset 6: character code point value in \x{} or \o{} is too large
\x{121}\x{e1}
/\x{120}\x{c1}/i,ucp
Failed: error 134 at offset 6: character code point value in \x{} or \o{} is too large
\x{121}\x{e1}
/[^\x{120}]/i,no_start_optimize
Failed: error 134 at offset 8: character code point value in \x{} or \o{} is too large
\x{121}
/[^\x{120}]/i,ucp,no_start_optimize
Failed: error 134 at offset 8: character code point value in \x{} or \o{} is too large
\= Expect no match
\x{121}
/[^\x{120}]/i
Failed: error 134 at offset 8: character code point value in \x{} or \o{} is too large
\x{121}
/[^\x{120}]/i,ucp
Failed: error 134 at offset 8: character code point value in \x{} or \o{} is too large
\= Expect no match
\x{121}
/\x{120}{2}/i,ucp
Failed: error 134 at offset 6: character code point value in \x{} or \o{} is too large
\x{121}\x{121}
/[^\x{120}]{2}/i,ucp
Failed: error 134 at offset 8: character code point value in \x{} or \o{} is too large
\= Expect no match
\x{121}\x{121}
# ----------------------------------------------------
# ----------------------------------------------------
# Tests for handling 0xffffffff in caseless UCP mode. They only apply to 32-bit
# mode; for the other widths they will fail.
/k*\x{ffffffff}/caseless,ucp
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
\x{ffffffff}
/k+\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}
/k{2}\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 15: character code point value in \x{} or \o{} is too large
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 12: character code point value in \x{} or \o{} is too large
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k{2,}?Z/caseless,ucp,no_start_optimize,no_auto_possess
\= Expect no match
Kk\x{ffffffff}\x{ffffffff}\x{ffffffff}Z
** Character \x{ffffffff} is greater than 255 and UTF-8 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{ffffffff} is greater than 255 and UTF-8 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{ffffffff} is greater than 255 and UTF-8 mode is not enabled.
** Truncation will probably give the wrong result.
No match
# ----------------------------------------------------
# End of testinput14

542
3rd/pcre2/testdata/testoutput15 vendored Normal file

@@ -0,0 +1,542 @@
# These are:
#
# (1) Tests of the match-limiting features. The results are different for
# interpretive or JIT matching, so this test should not be run with JIT. The
# same tests are run using JIT in test 17.
# (2) Other tests that must not be run with JIT.
# These tests are first so that they don't inherit a large enough heap frame
# vector from a previous test.
/(*LIMIT_HEAP=21)\[(a)]{60}/expand
\[a]{60}
Failed: error -63: heap limit exceeded
"(*LIMIT_HEAP=21)()((?))()()()()()()()()()()()()()()()()()()()()()()()(())()()()()()()()()()()()()()()()()()()()()()(())()()()()()()()()()()()()()"
xx
Failed: error -63: heap limit exceeded
# -----------------------------------------------------------------------
/(a+)*zz/I
Capture group count = 1
Starting code units: a z
Last code unit = 'z'
Subject length lower bound = 2
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzbbbbbb\=find_limits_noheap
Minimum match limit = 7
Minimum depth limit = 7
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazz
1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaz\=find_limits_noheap
Minimum match limit = 20481
Minimum depth limit = 30
No match
!((?:\s|//.*\\n|/[*](?:\\n|.)*?[*]/)*)!I
Capture group count = 1
May match empty string
Subject length lower bound = 0
/* this is a C style comment */\=find_limits_noheap
Minimum match limit = 64
Minimum depth limit = 7
0: /* this is a C style comment */
1: /* this is a C style comment */
/^(?>a)++/
aa\=find_limits_noheap
Minimum match limit = 5
Minimum depth limit = 3
0: aa
aaaaaaaaa\=find_limits_noheap
Minimum match limit = 12
Minimum depth limit = 3
0: aaaaaaaaa
/(a)(?1)++/
aa\=find_limits_noheap
Minimum match limit = 7
Minimum depth limit = 5
0: aa
1: a
aaaaaaaaa\=find_limits_noheap
Minimum match limit = 21
Minimum depth limit = 5
0: aaaaaaaaa
1: a
/a(?:.)*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
Minimum match limit = 24
Minimum depth limit = 3
0: abbbbbbbbbbbbbbbbbbbbba
/a(?:.(*THEN))*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
Minimum match limit = 66
Minimum depth limit = 45
0: abbbbbbbbbbbbbbbbbbbbba
/a(?:.(*THEN:ABC))*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
Minimum match limit = 66
Minimum depth limit = 45
0: abbbbbbbbbbbbbbbbbbbbba
/^(?>a+)(?>b+)(?>c+)(?>d+)(?>e+)/
aabbccddee\=find_limits_noheap
Minimum match limit = 7
Minimum depth limit = 7
0: aabbccddee
/^(?>(a+))(?>(b+))(?>(c+))(?>(d+))(?>(e+))/
aabbccddee\=find_limits_noheap
Minimum match limit = 12
Minimum depth limit = 12
0: aabbccddee
1: aa
2: bb
3: cc
4: dd
5: ee
/^(?>(a+))(?>b+)(?>(c+))(?>d+)(?>(e+))/
aabbccddee\=find_limits_noheap
Minimum match limit = 10
Minimum depth limit = 10
0: aabbccddee
1: aa
2: cc
3: ee
/(*LIMIT_MATCH=12bc)abc/
Failed: error 160 at offset 16: (*VERB) not recognized or malformed
/(*LIMIT_MATCH=4294967290)abc/
Failed: error 160 at offset 23: (*VERB) not recognized or malformed
/(*LIMIT_DEPTH=4294967280)abc/I
Capture group count = 0
Depth limit = 4294967280
First code unit = 'a'
Last code unit = 'c'
Subject length lower bound = 3
/(a+)*zz/
\= Expect no match
aaaaaaaaaaaaaz
No match
\= Expect limit exceeded
aaaaaaaaaaaaaz\=match_limit=3000
Failed: error -47: match limit exceeded
/(a+)*zz/
\= Expect limit exceeded
aaaaaaaaaaaaaz\=depth_limit=10
Failed: error -53: matching depth limit exceeded
/(*LIMIT_MATCH=3000)(a+)*zz/I
Capture group count = 1
Match limit = 3000
Starting code units: a z
Last code unit = 'z'
Subject length lower bound = 2
\= Expect limit exceeded
aaaaaaaaaaaaaz
Failed: error -47: match limit exceeded
\= Expect limit exceeded
aaaaaaaaaaaaaz\=match_limit=60000
Failed: error -47: match limit exceeded
/(*LIMIT_MATCH=60000)(*LIMIT_MATCH=3000)(a+)*zz/I
Capture group count = 1
Match limit = 3000
Starting code units: a z
Last code unit = 'z'
Subject length lower bound = 2
\= Expect limit exceeded
aaaaaaaaaaaaaz
Failed: error -47: match limit exceeded
/(*LIMIT_MATCH=60000)(a+)*zz/I
Capture group count = 1
Match limit = 60000
Starting code units: a z
Last code unit = 'z'
Subject length lower bound = 2
\= Expect no match
aaaaaaaaaaaaaz
No match
\= Expect limit exceeded
aaaaaaaaaaaaaz\=match_limit=3000
Failed: error -47: match limit exceeded
/(*LIMIT_DEPTH=10)(a+)*zz/I
Capture group count = 1
Depth limit = 10
Starting code units: a z
Last code unit = 'z'
Subject length lower bound = 2
\= Expect limit exceeded
aaaaaaaaaaaaaz
Failed: error -53: matching depth limit exceeded
\= Expect limit exceeded
aaaaaaaaaaaaaz\=depth_limit=1000
Failed: error -53: matching depth limit exceeded
/(*LIMIT_DEPTH=10)(*LIMIT_DEPTH=1000)(a+)*zz/I
Capture group count = 1
Depth limit = 1000
Starting code units: a z
Last code unit = 'z'
Subject length lower bound = 2
\= Expect no match
aaaaaaaaaaaaaz
No match
/(*LIMIT_DEPTH=1000)(a+)*zz/I
Capture group count = 1
Depth limit = 1000
Starting code units: a z
Last code unit = 'z'
Subject length lower bound = 2
\= Expect no match
aaaaaaaaaaaaaz
No match
\= Expect limit exceeded
aaaaaaaaaaaaaz\=depth_limit=10
Failed: error -53: matching depth limit exceeded
# These three have infinitely nested recursions.
/((?2))((?1))/
abc
Failed: error -52: nested recursion at the same subject position
/((?(R2)a+|(?1)b))()/
aaaabcde
Failed: error -52: nested recursion at the same subject position
/(?(R)a*(?1)|((?R))b)/
aaaabcde
Failed: error -52: nested recursion at the same subject position
# The allusedtext modifier does not work with JIT, which does not maintain
# the leftchar/rightchar data.
/abc(?=xyz)/allusedtext
abcxyzpqr
0: abcxyz
>>>
abcxyzpqr\=aftertext
0: abcxyz
>>>
0+ xyzpqr
/(?<=pqr)abc(?=xyz)/allusedtext
xyzpqrabcxyzpqr
0: pqrabcxyz
<<< >>>
xyzpqrabcxyzpqr\=aftertext
0: pqrabcxyz
<<< >>>
0+ xyzpqr
/a\b/
a.\=allusedtext
0: a.
>
a\=allusedtext
0: a
/abc\Kxyz/
abcxyz\=allusedtext
0: abcxyz
<<<
/abc(?=xyz(*ACCEPT))/
abcxyz\=allusedtext
0: abcxyz
>>>
/abc(?=abcde)(?=ab)/allusedtext
abcabcdefg
0: abcabcde
>>>>>
#subject allusedtext
/(?<=abc)123/
xyzabc123pqr
0: abc123
<<<
xyzabc12\=ps
Partial match: abc12
<<<
xyzabc12\=ph
Partial match: abc12
<<<
/\babc\b/
+++abc+++
0: +abc+
< >
+++ab\=ps
Partial match: +ab
<
+++ab\=ph
Partial match: +ab
<
/(?<=abc)def/
abc\=ph
Partial match: abc
<<<
/(?<=123)(*MARK:xx)abc/mark
xxxx123a\=ph
Partial match, mark=xx: 123a
<<<
xxxx123a\=ps
Partial match, mark=xx: 123a
<<<
/(?<=(?<=a)b)c.*/I
Capture group count = 0
Max lookbehind = 1
First code unit = 'c'
Subject length lower bound = 1
abc\=ph
Partial match: abc
<<
\= Expect no match
xbc\=ph
No match
/(?<=ab)c.*/I
Capture group count = 0
Max lookbehind = 2
First code unit = 'c'
Subject length lower bound = 1
abc\=ph
Partial match: abc
<<
\= Expect no match
xbc\=ph
No match
/abc(?<=bc)def/
xxxabcd\=ph
Partial match: abcd
/(?<=ab)cdef/
xxabcd\=ph
Partial match: abcd
<<
/(?<=(?<=(?<=a)b)c)./I
Capture group count = 0
Max lookbehind = 1
Subject length lower bound = 1
123abcXYZ
0: abcX
<<<
/(?<=ab(cd(?<=...)))./I
Capture group count = 1
Max lookbehind = 4
Subject length lower bound = 1
abcdX
0: abcdX
<<<<
1: cd
/(?<=ab((?<=...)cd))./I
Capture group count = 1
Max lookbehind = 4
Subject length lower bound = 1
ZabcdX
0: ZabcdX
<<<<<
1: cd
/(?<=((?<=(?<=ab).))(?1)(?1))./I
Capture group count = 1
Max lookbehind = 2
Subject length lower bound = 1
abxZ
0: abxZ
<<<
1:
#subject
# -------------------------------------------------------------------
# These tests provoke recursion loops, which give a different error message
# when JIT is used.
/(?R)/I
Capture group count = 0
May match empty string
Subject length lower bound = 0
abcd
Failed: error -52: nested recursion at the same subject position
/(a|(?R))/I
Capture group count = 1
May match empty string
Subject length lower bound = 0
abcd
0: a
1: a
defg
Failed: error -52: nested recursion at the same subject position
/(ab|(bc|(de|(?R))))/I
Capture group count = 3
May match empty string
Subject length lower bound = 0
abcd
0: ab
1: ab
fghi
Failed: error -52: nested recursion at the same subject position
/(ab|(bc|(de|(?1))))/I
Capture group count = 3
May match empty string
Subject length lower bound = 0
abcd
0: ab
1: ab
fghi
Failed: error -52: nested recursion at the same subject position
/x(ab|(bc|(de|(?1)x)x)x)/I
Capture group count = 3
First code unit = 'x'
Subject length lower bound = 3
xab123
0: xab
1: ab
xfghi
Failed: error -52: nested recursion at the same subject position
/(?!\w)(?R)/
abcd
Failed: error -52: nested recursion at the same subject position
=abc
Failed: error -52: nested recursion at the same subject position
/(?=\w)(?R)/
=abc
Failed: error -52: nested recursion at the same subject position
abcd
Failed: error -52: nested recursion at the same subject position
/(?<!\w)(?R)/
abcd
Failed: error -52: nested recursion at the same subject position
/(?<=\w)(?R)/
abcd
Failed: error -52: nested recursion at the same subject position
/(a+|(?R)b)/
aaa
0: aaa
1: aaa
bbb
Failed: error -52: nested recursion at the same subject position
/[^\xff]((?1))/BI
------------------------------------------------------------------
Bra
[^\x{ff}] (not)
CBra 1
Recurse
Ket
Ket
End
------------------------------------------------------------------
Capture group count = 1
Subject length lower bound = 1
abcd
Failed: error -52: nested recursion at the same subject position
# These tests don't behave the same with JIT
/\w+(?C1)/BI,no_auto_possess
------------------------------------------------------------------
Bra
\w+
Callout 1 8 0
Ket
End
------------------------------------------------------------------
Capture group count = 0
Options: no_auto_possess
Optimizations: dotstar_anchor,start_optimize
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
Subject length lower bound = 1
abc\=callout_fail=1
--->abc
1 ^ ^ End of pattern
1 ^ ^ End of pattern
1 ^^ End of pattern
1 ^ ^ End of pattern
1 ^^ End of pattern
1 ^^ End of pattern
No match
/(*NO_AUTO_POSSESS)\w+(?C1)/BI
------------------------------------------------------------------
Bra
\w+
Callout 1 26 0
Ket
End
------------------------------------------------------------------
Capture group count = 0
Compile options: <none>
Overall options: no_auto_possess
Optimizations: dotstar_anchor,start_optimize
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
Subject length lower bound = 1
abc\=callout_fail=1
--->abc
1 ^ ^ End of pattern
1 ^ ^ End of pattern
1 ^^ End of pattern
1 ^ ^ End of pattern
1 ^^ End of pattern
1 ^^ End of pattern
No match
# This test breaks the JIT stack limit
/(|]+){2,2452}/
(|]+){2,2452}
0:
1:
/b(?<!ax)(?!cx)/allusedtext
abc
0: abc
< >
abcz
0: abcz
< >>
# This test triggers the recursion limit in the interpreter, but completes in
# JIT. It's in testinput2 with disable_recurse_loop_check to get it to work
# in the interpreter.
/(a(?1)z||(?1)++)$/
abcd
Failed: error -52: nested recursion at the same subject position
# End of testinput15

18
3rd/pcre2/testdata/testoutput16 vendored Normal file

@@ -0,0 +1,18 @@
# This test is run only when JIT support is not available. It checks that an
# attempt to use it has the expected behaviour. It also tests things that
# are different without JIT.
/abc/I,jit,jitverify
JIT compilation was not successful (bad JIT option)
Capture group count = 0
First code unit = 'a'
Last code unit = 'c'
Subject length lower bound = 3
JIT support is not available in this version of PCRE2
/a*/I
Capture group count = 0
May match empty string
Subject length lower bound = 0
# End of testinput16

570
3rd/pcre2/testdata/testoutput17 vendored Normal file

File diff suppressed because one or more lines are too long

230
3rd/pcre2/testdata/testoutput18 vendored Normal file

@@ -0,0 +1,230 @@
# This set of tests is run only with the 8-bit library. It tests the POSIX
# interface, which is supported only with the 8-bit library. This test should
# not be run with JIT (which is not available for the POSIX interface).
#forbid_utf
#pattern posix
# Test some invalid options
/abc/auto_callout
** Ignored with POSIX interface: auto_callout
/abc/
abc\=find_limits
** Ignored with POSIX interface: find_limits
0: abc
/abc/
abc\=partial_hard
** Ignored with POSIX interface: partial_hard
0: abc
/a(())bc/parens_nest_limit=1
** Ignored with POSIX interface: parens_nest_limit
/abc/allow_surrogate_escapes,max_pattern_length=2
** Ignored with POSIX interface: allow_surrogate_escapes max_pattern_length
# Real tests
/abc/
abc
0: abc
/^abc|def/
abcdef
0: abc
abcdef\=notbol
0: def
/.*((abc)$|(def))/
defabc
0: defabc
1: abc
2: abc
defabc\=noteol
0: def
1: def
2: <unset>
3: def
/the quick brown fox/
the quick brown fox
0: the quick brown fox
\= Expect no match
The Quick Brown Fox
No match: POSIX code 17: match failed
/the quick brown fox/i
the quick brown fox
0: the quick brown fox
The Quick Brown Fox
0: The Quick Brown Fox
/(*LF)abc.def/
\= Expect no match
abc\ndef
No match: POSIX code 17: match failed
/(*LF)abc$/
abc
0: abc
abc\n
0: abc
/(abc)\2/
Failed: POSIX code 15: bad back reference at offset 6
/(abc\1)/
\= Expect no match
abc
No match: POSIX code 17: match failed
/a*(b+)(z)(z)/
aaaabbbbzzzz
0: aaaabbbbzz
1: bbbb
2: z
3: z
aaaabbbbzzzz\=ovector=0
Matched without capture
aaaabbbbzzzz\=ovector=1
0: aaaabbbbzz
aaaabbbbzzzz\=ovector=2
0: aaaabbbbzz
1: bbbb
/(*ANY)ab.cd/
ab-cd
0: ab-cd
ab=cd
0: ab=cd
\= Expect no match
ab\ncd
No match: POSIX code 17: match failed
/ab.cd/s
ab-cd
0: ab-cd
ab=cd
0: ab=cd
ab\ncd
0: ab\x0acd
/a(b)c/posix_nosub
abc
Matched with REG_NOSUB
/a(?P<name>b)c/posix_nosub
abc
Matched with REG_NOSUB
/(a)\1/posix_nosub
zaay
Matched with REG_NOSUB
/a?|b?/
abc
0: a
\= Expect no match
ddd\=notempty
No match: POSIX code 17: match failed
/\w+A/
CDAAAAB
0: CDAAAA
/\w+A/ungreedy
CDAAAAB
0: CDA
/\Biss\B/I,aftertext
** Ignored with POSIX interface: info
Mississippi
0: iss
0+ issippi
/abc/\
Failed: POSIX code 9: bad escape sequence at offset 4
"(?(?C)"
Failed: POSIX code 11: unbalanced () at offset 6
"(?(?C))"
Failed: POSIX code 3: pattern error at offset 6
/abcd/substitute_extended
** Ignored with POSIX interface: substitute_extended
/\[A]{1000000}**/expand,regerror_buffsize=31
Failed: POSIX code 4: ? * + invalid at offset 100000
** regerror() message truncated
/\[A]{1000000}**/expand,regerror_buffsize=32
Failed: POSIX code 4: ? * + invalid at offset 1000001
//posix_nosub
\=offset=70000
** Ignored with POSIX interface: offset
Matched with REG_NOSUB
/^d(e)$/posix
acdef\=posix_startend=2:4
0: de
1: e
acde\=posix_startend=2
0: de
1: e
\= Expect no match
acdef
No match: POSIX code 17: match failed
acdef\=posix_startend=2
No match: POSIX code 17: match failed
/^a\x{00}b$/posix
a\x{00}b\=posix_startend=0:3
0: a\x00b
/"A" 00 "B"/hex
A\x{00}B\=posix_startend=0:3
0: A\x00B
/ABC/use_length
ABC
0: ABC
/a\b(c/literal,posix
a\\b(c
0: a\b(c
/a\b(c/literal,posix,dotall
Failed: POSIX code 16: bad argument at offset 0
/((a)(b)?(c))/posix
123ace
0: ac
1: ac
2: a
3: <unset>
4: c
123ace\=posix_startend=2:6
0: ac
1: ac
2: a
3: <unset>
4: c
//posix
\= Expect errors
\=null_subject
No match: POSIX code 16: bad argument
abc\=null_subject
No match: POSIX code 16: bad argument
/(*LIMIT_HEAP=0)xx/posix
\= Expect error
xxxx
No match: POSIX code 14: failed to get memory
# End of testdata/testinput18

30
3rd/pcre2/testdata/testoutput19 vendored Normal file

@@ -0,0 +1,30 @@
# This set of tests is run only with the 8-bit library. It tests the POSIX
# interface with UTF/UCP support, which is supported only with the 8-bit
# library. This test should not be run with JIT (which is not available for the
# POSIX interface).
#pattern posix
/a\x{1234}b/utf
a\x{1234}b
0: a\x{1234}b
/\w/
\= Expect no match
+++\x{c2}
No match: POSIX code 17: match failed
/\w/ucp
+++\x{c2}
0: \xc2
/"^AB" 00 "\x{1234}$"/hex,utf
AB\x{00}\x{1234}\=posix_startend=0:6
0: AB\x{00}\x{1234}
/\w/utf
\= Expect UTF error
A\xabB
No match: POSIX code 16: bad argument
# End of testdata/testinput19

21830
3rd/pcre2/testdata/testoutput2 vendored Normal file

File diff suppressed because it is too large Load Diff

161
3rd/pcre2/testdata/testoutput20 vendored Normal file

@@ -0,0 +1,161 @@
# This set of tests exercises the serialization/deserialization and code copy
# functions in the library. It does not use UTF or JIT.
#forbid_utf
# Compile several patterns, push them onto the stack, and then write them
# all to a file.
#pattern push
/(?<NAME>(?&NAME_PAT))\s+(?<ADDR>(?&ADDRESS_PAT))
(?(DEFINE)
(?<NAME_PAT>[a-z]+)
(?<ADDRESS_PAT>\d+)
)/x
/^(?:((.)(?1)\2|)|((.)(?3)\4|.))$/i
#save testsaved1
# Do it again for some more patterns.
/(*MARK:A)(*SKIP:B)(C|X)/mark
** Ignored when compiled pattern is stacked with 'push': mark
/(?:(?<n>foo)|(?<n>bar))\k<n>/dupnames
#save testsaved2
#pattern -push
# Reload the patterns, then pop them one by one and check them.
#load testsaved1
#load testsaved2
#pop info
Capture group count = 2
Max back reference = 2
Named capture groups:
n 1
n 2
Options: dupnames
Starting code units: b f
Subject length lower bound = 6
foofoo
0: foofoo
1: foo
barbar
0: barbar
1: <unset>
2: bar
#pop mark
C
0: C
1: C
MK: A
\= Expect no match
D
No match, mark = A
#pop
AmanaplanacanalPanama
0: AmanaplanacanalPanama
1: <unset>
2: <unset>
3: AmanaplanacanalPanama
4: A
#pop info
Capture group count = 4
Named capture groups:
ADDR 2
ADDRESS_PAT 4
NAME 1
NAME_PAT 3
Options: extended
Subject length lower bound = 3
metcalfe 33
0: metcalfe 33
1: metcalfe
2: 33
# Check for an error when different tables are used.
/abc/push,tables=1
/xyz/push,tables=2
#save testsaved1
Serialization failed: error -30: patterns do not all use the same character tables
#pop
xyz
0: xyz
#pop
abc
0: abc
#pop should give an error
** Can't pop off an empty stack
pqr
/abcd/pushcopy
abcd
0: abcd
#pop
abcd
0: abcd
#pop should give an error
** Can't pop off an empty stack
/abcd/push
#popcopy
abcd
0: abcd
#pop
abcd
0: abcd
/abcd/push
#save testsaved1
#pop should give an error
** Can't pop off an empty stack
#load testsaved1
#popcopy
abcd
0: abcd
#pop
abcd
0: abcd
#pop should give an error
** Can't pop off an empty stack
/abcd/pushtablescopy
abcd
0: abcd
#popcopy
abcd
0: abcd
#pop
abcd
0: abcd
# Must only specify one of these
//push,pushcopy
** Not allowed together: push pushcopy
//push,pushtablescopy
** Not allowed together: push pushtablescopy
//pushcopy,pushtablescopy
** Not allowed together: pushcopy pushtablescopy
# End of testinput20

97
3rd/pcre2/testdata/testoutput21 vendored Normal file

@@ -0,0 +1,97 @@
# These are tests of \C that do not involve UTF. They are not run when \C is
# disabled by compiling with --enable-never-backslash-C.
/\C+\D \C+\d \C+\S \C+\s \C+\W \C+\w \C+. \C+\R \C+\H \C+\h \C+\V \C+\v \C+\Z \C+\z \C+$/Bx
------------------------------------------------------------------
Bra
AllAny+
\D
AllAny+
\d
AllAny+
\S
AllAny+
\s
AllAny+
\W
AllAny+
\w
AllAny+
Any
AllAny+
\R
AllAny+
\H
AllAny+
\h
AllAny+
\V
AllAny+
\v
AllAny+
\Z
AllAny++
\z
AllAny+
$
Ket
End
------------------------------------------------------------------
/\D+\C \d+\C \S+\C \s+\C \W+\C \w+\C .+\C \R+\C \H+\C \h+\C \V+\C \v+\C a+\C \n+\C \C+\C/Bx
------------------------------------------------------------------
Bra
\D+
AllAny
\d+
AllAny
\S+
AllAny
\s+
AllAny
\W+
AllAny
\w+
AllAny
Any+
AllAny
\R+
AllAny
\H+
AllAny
\h+
AllAny
\V+
AllAny
\v+
AllAny
a+
AllAny
\x0a+
AllAny
AllAny+
AllAny
Ket
End
------------------------------------------------------------------
/ab\Cde/never_backslash_c
Failed: error 183 at offset 4: using \C is disabled by the application
/ab\Cde/info
Capture group count = 0
Contains \C
First code unit = 'a'
Last code unit = 'e'
Subject length lower bound = 5
abXde
0: abXde
/(?<=ab\Cde)X/
abZdeX
0: X
/[\C]/
Failed: error 107 at offset 2: escape sequence is invalid in character class
# End of testinput21

182
3rd/pcre2/testdata/testoutput22-16 vendored Normal file

@@ -0,0 +1,182 @@
# Tests of \C when Unicode support is available. Note that \C is not supported
# for DFA matching in UTF mode, so this test is not run with -dfa. The output
# of this test is different in 8-, 16-, and 32-bit modes. Some tests may match
# in some widths and not in others.
/ab\Cde/utf,info
Capture group count = 0
Contains \C
Options: utf
First code unit = 'a'
Last code unit = 'e'
Subject length lower bound = 2
abXde
0: abXde
# This should produce an error diagnostic (\C in UTF lookbehind) in 8-bit and
# 16-bit modes, but not in 32-bit mode.
/(?<=ab\Cde)X/utf
Failed: error 136 at offset 0: \C is not allowed in a lookbehind assertion in UTF-16 mode
ab!deXYZ
# Autopossessification tests
/\C+\X \X+\C/Bx
------------------------------------------------------------------
Bra
AllAny+
extuni
extuni+
AllAny
Ket
End
------------------------------------------------------------------
/\C+\X \X+\C/Bx,utf
------------------------------------------------------------------
Bra
Anybyte+
extuni
extuni+
Anybyte
Ket
End
------------------------------------------------------------------
/\C\X*TӅ;
{0,6}\v+
F
/utf
\= Expect no match
Ӆ\x0a
No match
/\C(\W?ſ)'?{{/utf
\= Expect no match
\\C(\\W?ſ)'?{{
No match
/X(\C{3})/utf
X\x{1234}
No match
X\x{11234}Y
0: X\x{11234}Y
1: \x{11234}Y
X\x{11234}YZ
0: X\x{11234}Y
1: \x{11234}Y
/X(\C{4})/utf
X\x{1234}YZ
No match
X\x{11234}YZ
0: X\x{11234}YZ
1: \x{11234}YZ
X\x{11234}YZW
0: X\x{11234}YZ
1: \x{11234}YZ
/X\C*/utf
XYZabcdce
0: XYZabcdce
/X\C*?/utf
XYZabcde
0: X
/X\C{3,5}/utf
Xabcdefg
0: Xabcde
X\x{1234}
No match
X\x{1234}YZ
0: X\x{1234}YZ
X\x{1234}\x{512}
No match
X\x{1234}\x{512}YZ
0: X\x{1234}\x{512}YZ
X\x{11234}Y
0: X\x{11234}Y
X\x{11234}YZ
0: X\x{11234}YZ
X\x{11234}\x{512}
0: X\x{11234}\x{512}
X\x{11234}\x{512}YZ
0: X\x{11234}\x{512}YZ
X\x{11234}\x{512}\x{11234}Z
0: X\x{11234}\x{512}\x{11234}
/X\C{3,5}?/utf
Xabcdefg
0: Xabc
X\x{1234}
No match
X\x{1234}YZ
0: X\x{1234}YZ
X\x{1234}\x{512}
No match
X\x{11234}Y
0: X\x{11234}Y
X\x{11234}YZ
0: X\x{11234}Y
X\x{11234}\x{512}YZ
0: X\x{11234}\x{512}
X\x{11234}
No match
/a\Cb/utf
aXb
0: aXb
a\nb
0: a\x{0a}b
a\x{100}b
0: a\x{100}b
/a\C\Cb/utf
a\x{100}b
No match
a\x{12257}b
0: a\x{12257}b
a\x{12257}\x{11234}b
No match
/ab\Cde/utf
abXde
0: abXde
# This one is here not because it's different to Perl, but because the way
# the captured single code unit is displayed. (In Perl it becomes a character,
# and you can't tell the difference.)
/X(\C)(.*)/utf
X\x{1234}
0: X\x{1234}
1: \x{1234}
2:
X\nabc
0: X\x{0a}abc
1: \x{0a}
2: abc
# This one is here because Perl gives out a grumbly error message (quite
# correctly, but that messes up comparisons).
/a\Cb/utf
\= Expect no match in 8-bit mode
a\x{100}b
0: a\x{100}b
/^ab\C/utf,no_start_optimize
\= Expect no match - tests \C at end of subject
ab
No match
/\C[^\v]+\x80/utf
[AΏBŀC]
No match
/\C[^\d]+\x80/utf
[AΏBŀC]
No match

180
3rd/pcre2/testdata/testoutput22-32 vendored Normal file

@@ -0,0 +1,180 @@
# Tests of \C when Unicode support is available. Note that \C is not supported
# for DFA matching in UTF mode, so this test is not run with -dfa. The output
# of this test is different in 8-, 16-, and 32-bit modes. Some tests may match
# in some widths and not in others.
/ab\Cde/utf,info
Capture group count = 0
Contains \C
Options: utf
First code unit = 'a'
Last code unit = 'e'
Subject length lower bound = 5
abXde
0: abXde
# This should produce an error diagnostic (\C in UTF lookbehind) in 8-bit and
# 16-bit modes, but not in 32-bit mode.
/(?<=ab\Cde)X/utf
ab!deXYZ
0: X
# Autopossessification tests
/\C+\X \X+\C/Bx
------------------------------------------------------------------
Bra
AllAny+
extuni
extuni+
AllAny
Ket
End
------------------------------------------------------------------
/\C+\X \X+\C/Bx,utf
------------------------------------------------------------------
Bra
AllAny+
extuni
extuni+
AllAny
Ket
End
------------------------------------------------------------------
/\C\X*TӅ;
{0,6}\v+
F
/utf
\= Expect no match
Ӆ\x0a
No match
/\C(\W?ſ)'?{{/utf
\= Expect no match
\\C(\\W?ſ)'?{{
No match
/X(\C{3})/utf
X\x{1234}
No match
X\x{11234}Y
No match
X\x{11234}YZ
0: X\x{11234}YZ
1: \x{11234}YZ
/X(\C{4})/utf
X\x{1234}YZ
No match
X\x{11234}YZ
No match
X\x{11234}YZW
0: X\x{11234}YZW
1: \x{11234}YZW
/X\C*/utf
XYZabcdce
0: XYZabcdce
/X\C*?/utf
XYZabcde
0: X
/X\C{3,5}/utf
Xabcdefg
0: Xabcde
X\x{1234}
No match
X\x{1234}YZ
0: X\x{1234}YZ
X\x{1234}\x{512}
No match
X\x{1234}\x{512}YZ
0: X\x{1234}\x{512}YZ
X\x{11234}Y
No match
X\x{11234}YZ
0: X\x{11234}YZ
X\x{11234}\x{512}
No match
X\x{11234}\x{512}YZ
0: X\x{11234}\x{512}YZ
X\x{11234}\x{512}\x{11234}Z
0: X\x{11234}\x{512}\x{11234}Z
/X\C{3,5}?/utf
Xabcdefg
0: Xabc
X\x{1234}
No match
X\x{1234}YZ
0: X\x{1234}YZ
X\x{1234}\x{512}
No match
X\x{11234}Y
No match
X\x{11234}YZ
0: X\x{11234}YZ
X\x{11234}\x{512}YZ
0: X\x{11234}\x{512}Y
X\x{11234}
No match
/a\Cb/utf
aXb
0: aXb
a\nb
0: a\x{0a}b
a\x{100}b
0: a\x{100}b
/a\C\Cb/utf
a\x{100}b
No match
a\x{12257}b
No match
a\x{12257}\x{11234}b
0: a\x{12257}\x{11234}b
/ab\Cde/utf
abXde
0: abXde
# This one is here not because it's different to Perl, but because the way
# the captured single code unit is displayed. (In Perl it becomes a character,
# and you can't tell the difference.)
/X(\C)(.*)/utf
X\x{1234}
0: X\x{1234}
1: \x{1234}
2:
X\nabc
0: X\x{0a}abc
1: \x{0a}
2: abc
# This one is here because Perl gives out a grumbly error message (quite
# correctly, but that messes up comparisons).
/a\Cb/utf
\= Expect no match in 8-bit mode
a\x{100}b
0: a\x{100}b
/^ab\C/utf,no_start_optimize
\= Expect no match - tests \C at end of subject
ab
No match
/\C[^\v]+\x80/utf
[AΏBŀC]
No match
/\C[^\d]+\x80/utf
[AΏBŀC]
No match

184
3rd/pcre2/testdata/testoutput22-8 vendored Normal file

@@ -0,0 +1,184 @@
# Tests of \C when Unicode support is available. Note that \C is not supported
# for DFA matching in UTF mode, so this test is not run with -dfa. The output
# of this test is different in 8-, 16-, and 32-bit modes. Some tests may match
# in some widths and not in others.
/ab\Cde/utf,info
Capture group count = 0
Contains \C
Options: utf
First code unit = 'a'
Last code unit = 'e'
Subject length lower bound = 2
abXde
0: abXde
# This should produce an error diagnostic (\C in UTF lookbehind) in 8-bit and
# 16-bit modes, but not in 32-bit mode.
/(?<=ab\Cde)X/utf
Failed: error 136 at offset 0: \C is not allowed in a lookbehind assertion in UTF-8 mode
ab!deXYZ
# Autopossessification tests
/\C+\X \X+\C/Bx
------------------------------------------------------------------
Bra
AllAny+
extuni
extuni+
AllAny
Ket
End
------------------------------------------------------------------
/\C+\X \X+\C/Bx,utf
------------------------------------------------------------------
Bra
Anybyte+
extuni
extuni+
Anybyte
Ket
End
------------------------------------------------------------------
/\C\X*TӅ;
{0,6}\v+
F
/utf
\= Expect no match
Ӆ\x0a
No match
/\C(\W?ſ)'?{{/utf
\= Expect no match
\\C(\\W?ſ)'?{{
No match
/X(\C{3})/utf
X\x{1234}
0: X\x{1234}
1: \x{1234}
X\x{11234}Y
0: X\x{f0}\x{91}\x{88}
1: \x{f0}\x{91}\x{88}
X\x{11234}YZ
0: X\x{f0}\x{91}\x{88}
1: \x{f0}\x{91}\x{88}
/X(\C{4})/utf
X\x{1234}YZ
0: X\x{1234}Y
1: \x{1234}Y
X\x{11234}YZ
0: X\x{11234}
1: \x{11234}
X\x{11234}YZW
0: X\x{11234}
1: \x{11234}
/X\C*/utf
XYZabcdce
0: XYZabcdce
/X\C*?/utf
XYZabcde
0: X
/X\C{3,5}/utf
Xabcdefg
0: Xabcde
X\x{1234}
0: X\x{1234}
X\x{1234}YZ
0: X\x{1234}YZ
X\x{1234}\x{512}
0: X\x{1234}\x{512}
X\x{1234}\x{512}YZ
0: X\x{1234}\x{512}
X\x{11234}Y
0: X\x{11234}Y
X\x{11234}YZ
0: X\x{11234}Y
X\x{11234}\x{512}
0: X\x{11234}\x{d4}
X\x{11234}\x{512}YZ
0: X\x{11234}\x{d4}
X\x{11234}\x{512}\x{11234}Z
0: X\x{11234}\x{d4}
/X\C{3,5}?/utf
Xabcdefg
0: Xabc
X\x{1234}
0: X\x{1234}
X\x{1234}YZ
0: X\x{1234}
X\x{1234}\x{512}
0: X\x{1234}
X\x{11234}Y
0: X\x{f0}\x{91}\x{88}
X\x{11234}YZ
0: X\x{f0}\x{91}\x{88}
X\x{11234}\x{512}YZ
0: X\x{f0}\x{91}\x{88}
X\x{11234}
0: X\x{f0}\x{91}\x{88}
/a\Cb/utf
aXb
0: aXb
a\nb
0: a\x{0a}b
a\x{100}b
No match
/a\C\Cb/utf
a\x{100}b
0: a\x{100}b
a\x{12257}b
No match
a\x{12257}\x{11234}b
No match
/ab\Cde/utf
abXde
0: abXde
# This one is here not because it's different to Perl, but because the way
# the captured single code unit is displayed. (In Perl it becomes a character,
# and you can't tell the difference.)
/X(\C)(.*)/utf
X\x{1234}
0: X\x{1234}
1: \x{e1}
2: \x{88}\x{b4}
X\nabc
0: X\x{0a}abc
1: \x{0a}
2: abc
# This one is here because Perl gives out a grumbly error message (quite
# correctly, but that messes up comparisons).
/a\Cb/utf
\= Expect no match in 8-bit mode
a\x{100}b
No match
/^ab\C/utf,no_start_optimize
\= Expect no match - tests \C at end of subject
ab
No match
/\C[^\v]+\x80/utf
[AΏBŀC]
No match
/\C[^\d]+\x80/utf
[AΏBŀC]
No match

11
3rd/pcre2/testdata/testoutput23 vendored Normal file

@@ -0,0 +1,11 @@
# This test is run when PCRE2 has been built with --enable-never-backslash-C,
# which disables the use of \C. All we can do is check that it gives the
# correct error message.
/a\Cb/
Failed: error 185 at offset 3: using \C is disabled in this PCRE2 library
/a[\C]b/
Failed: error 107 at offset 3: escape sequence is invalid in character class
# End of testinput23

624
3rd/pcre2/testdata/testoutput24 vendored Normal file

@@ -0,0 +1,624 @@
# This file tests the auxiliary pattern conversion features of the PCRE2
# library, in non-UTF mode.
#forbid_utf
#newline_default lf any anycrlf
# -------- Tests of glob conversion --------
# Set the glob separator explicitly so that different OS defaults are not a
# problem. Then test various errors.
#pattern convert=glob,convert_glob_escape=\,convert_glob_separator=/
/abc/posix
** The convert and posix modifiers are mutually exclusive
# Separator must be / \ or .
/a*b/convert_glob_separator=%
** Invalid glob separator '%'
# Can't have separator in a class
"[ab/cd]"
(?s)\A[ab/cd](?<!/)\z
"[,-/]"
(?s)\A[,-/](?<!/)\z
/[ab/
** Pattern conversion error at offset 3: missing terminating ] for character class
# Length check
/abc/convert_length=11
** Pattern conversion error at offset 3: no more memory
/abc/convert_length=12
(?s)\Aabc\z
# Now some actual tests
/a?b[]xy]*c/
(?s)\Aa[^/]b[\]xy](*COMMIT)[^/]*?c\z
azb]1234c
0: azb]1234c
# Tests from the gitwildmatch list, with some additions
/foo/
(?s)\Afoo\z
foo
0: foo
/= Expect no match
No match
bar
No match
//
(?s)\A\z
\
0:
/???/
(?s)\A[^/][^/][^/]\z
foo
0: foo
\= Expect no match
foobar
No match
/*/
(?s)\A[^/]*+\z
foo
0: foo
\
0:
/f*/
(?s)\Af(*COMMIT)[^/]*+\z
foo
0: foo
f
0: f
/*f/
(?s)\A[^/]*?f\z
oof
0: oof
\= Expect no match
foo
No match
/*foo*/
(?s)\A[^/]*?foo(*COMMIT)[^/]*+\z
foo
0: foo
food
0: food
aprilfool
0: aprilfool
/*ob*a*r*/
(?s)\A[^/]*?ob(*COMMIT)[^/]*?a(*COMMIT)[^/]*?r(*COMMIT)[^/]*+\z
foobar
0: foobar
/*ab/
(?s)\A[^/]*?ab\z
aaaaaaabababab
0: aaaaaaabababab
/foo\*/
(?s)\Afoo\*\z
foo*
0: foo*
/foo\*bar/
(?s)\Afoo\*bar\z
\= Expect no match
foobar
No match
/f\\oo/
(?s)\Af\\oo\z
f\\oo
0: f\oo
/*[al]?/
(?s)\A[^/]*?[al][^/]\z
ball
0: ball
/[ten]/
(?s)\A[ten]\z
\= Expect no match
ten
No match
/t[a-g]n/
(?s)\At[a-g]n\z
ten
0: ten
/a[]]b/
(?s)\Aa[\]]b\z
a]b
0: a]b
/a[]a-]b/
(?s)\Aa[\]a\-]b\z
/a[]-]b/
(?s)\Aa[\]\-]b\z
a-b
0: a-b
a]b
0: a]b
\= Expect no match
aab
No match
/a[]a-z]b/
(?s)\Aa[\]a-z]b\z
aab
0: aab
/]/
(?s)\A\]\z
]
0: ]
/t[!a-g]n/
(?s)\At[^/a-g]n\z
ton
0: ton
\= Expect no match
ten
No match
'[[:alpha:]][[:digit:]][[:upper:]]'
(?s)\A[[:alpha:]][[:digit:]][[:upper:]]\z
a1B
0: a1B
'[[:digit:][:upper:][:space:]]'
(?s)\A[[:digit:][:upper:][:space:]]\z
A
0: A
1
0: 1
\ \=
0:
\= Expect no match
a
No match
.
No match
'[a-c[:digit:]x-z]'
(?s)\A[a-c[:digit:]x-z]\z
5
0: 5
b
0: b
y
0: y
\= Expect no match
q
No match
# End of gitwildmatch tests
/*.j?g/
(?s)\A[^/]*?\.j[^/]g\z
pic01.jpg
0: pic01.jpg
.jpg
0: .jpg
pic02.jxg
0: pic02.jxg
\= Expect no match
pic03.j/g
No match
/A[+-0]B/
(?s)\AA[+-0](?<!/)B\z
A+B
0: A+B
A.B
0: A.B
A0B
0: A0B
\= Expect no match
A/B
No match
/*x?z/
(?s)\A[^/]*?x[^/]z\z
abc.xyz
0: abc.xyz
\= Expect no match
.xyz
0: .xyz
/?x?z/
(?s)\A[^/]x[^/]z\z
axyz
0: axyz
\= Expect no match
.xyz
0: .xyz
"[,-0]x?z"
(?s)\A[,-0](?<!/)x[^/]z\z
,xyz
0: ,xyz
\= Expect no match
/xyz
No match
.xyz
0: .xyz
".x*"
(?s)\A\.x(*COMMIT)[^/]*+\z
.xabc
0: .xabc
/a[--0]z/
(?s)\Aa[\--0](?<!/)z\z
a-z
0: a-z
a.z
0: a.z
a0z
0: a0z
\= Expect no match
a/z
No match
a1z
No match
/<[a-c-d]>/
(?s)\A<[a-c\-d]>\z
<a>
0: <a>
<b>
0: <b>
<c>
0: <c>
<d>
0: <d>
<->
0: <->
/a[[:digit:].]z/
(?s)\Aa[[:digit:].]z\z
a1z
0: a1z
a.z
0: a.z
\= Expect no match
a:z
No match
/a[[:digit].]z/
(?s)\Aa[\[:digit]\.\]z\z
a[.]z
0: a[.]z
a:.]z
0: a:.]z
ad.]z
0: ad.]z
/<[[:a[:digit:]b]>/
(?s)\A<[\[:a[:digit:]b]>\z
<[>
0: <[>
<:>
0: <:>
<a>
0: <a>
<9>
0: <9>
<b>
0: <b>
\= Expect no match
<d>
No match
/a*b/convert_glob_separator=\
(?s)\Aa(*COMMIT)[^\\]*?b\z
/a*b/convert_glob_separator=.
(?s)\Aa(*COMMIT)[^\.]*?b\z
/a*b/convert_glob_separator=/
(?s)\Aa(*COMMIT)[^/]*?b\z
# Non control character checking
/A\B\\C\D/
(?s)\AAB\\CD\z
/\\{}\?\*+\[\]()|.^$/
(?s)\A\\\{\}\?\*\+\[\]\(\)\|\.\^\$\z
/*a*\/*b*/
(?s)\A[^/]*?a(*COMMIT)[^/]*?/(*COMMIT)[^/]*?b(*COMMIT)[^/]*+\z
/?a?\/?b?/
(?s)\A[^/]a[^/]/[^/]b[^/]\z
/[a\\b\c][]][-][\]\-]/
(?s)\A[a\\bc][\]][\-][\]\-]\z
/[^a\\b\c][!]][!-][^\]\-]/
(?s)\A[^/a\\bc][^/\]][^/\-][^/\]\-]\z
/[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:word:][:xdigit:]]/
(?s)\A[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:word:][:xdigit:]](?<!/)\z
"[/-/]"
(?s)\A[/-/](?<!/)\z
/[-----]/
(?s)\A[\--\-\-\-]\z
/[------]/
(?s)\A[\--\-\--\-]\z
/[!------]/
(?s)\A[^/\--\-\--\-]\z
/[[:alpha:]-a]/
(?s)\A[[:alpha:]\-a]\z
/[[:alpha:]][[:punct:]][[:ascii:]]/
(?s)\A[[:alpha:]][[:punct:]](?<!/)[[:ascii:]](?<!/)\z
/[a-[:alpha:]]/
** Pattern conversion error at offset 4: invalid syntax
/[[:alpha:/
** Pattern conversion error at offset 9: missing terminating ] for character class
/[[:alpha:]/
** Pattern conversion error at offset 10: missing terminating ] for character class
/[[:alphaa:]]/
(?s)\A[\[:alphaa:]\]\z
/[[:xdigi:]]/
(?s)\A[\[:xdigi:]\]\z
/[[:xdigit::]]/
(?s)\A[\[:xdigit::]\]\z
/****/
(?s)
/**\/abc/
(?s)(?:\A|/)abc\z
abc
0: abc
x/abc
0: /abc
xabc
No match
/abc\/**/
(?s)\Aabc/
/abc\/**\/abc/
(?s)\Aabc/(*COMMIT)(?:.*?/)??abc\z
/**\/*a*b*g*n*t/
(?s)(?:\A|/)(?>[^/]*?a)(?>[^/]*?b)(?>[^/]*?g)(?>[^/]*?n)(?>[^/]*?t\z)
abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt
0: /abcdefghijklmnop.txt
/**\/*a*\/**/
(?s)(?:\A|/)(?>[^/]*?a)(?>[^/]*?/)
xx/xx/xx/xax/xx/xb
0: /xax/
/**\/*a*/
(?s)(?:\A|/)(?>[^/]*?a)(?>[^/]*+\z)
xx/xx/xx/xax
0: /xax
xx/xx/xx/xax/xx
No match
/**\/*a*\/**\/*b*/
(?s)(?:\A|/)(?>[^/]*?a)(?>[^/]*?/)(*COMMIT)(?:.*?/)??(?>[^/]*?b)(?>[^/]*+\z)
xx/xx/xx/xax/xx/xb
0: /xax/xx/xb
xx/xx/xx/xax/xx/x
No match
"**a"convert=glob
(?s)a\z
a
0: a
c/b/a
0: a
c/b/aaa
0: a
"a**/b"convert=glob
(?s)\Aa(*COMMIT).*?/b\z
a/b
0: a/b
ab
No match
"a/**b"convert=glob
(?s)\Aa/(*COMMIT).*?b\z
a/b
0: a/b
ab
No match
#pattern convert=glob:glob_no_starstar
/***/
(?s)\A[^/]*+\z
/**a**/
(?s)\A[^/]*?a(*COMMIT)[^/]*+\z
#pattern convert=unset
#pattern convert=glob:glob_no_wild_separator
/*/
(?s)
/*a*/
(?s)a
/**a**/
(?s)a
/a*b/
(?s)\Aa(*COMMIT).*?b\z
/*a*b*/
(?s)a(*COMMIT).*?b
/??a??/
(?s)\A..a..\z
#pattern convert=unset
#pattern convert=glob,convert_glob_escape=0
/a\b\cd/
(?s)\Aa\\b\\cd\z
/**\/a/
(?s)\\/a\z
/a`*b/convert_glob_escape=`
(?s)\Aa\*b\z
/a`*b/convert_glob_escape=0
(?s)\Aa`(*COMMIT)[^/]*?b\z
/a`*b/convert_glob_escape=x
** Invalid glob escape 'x'
# -------- Tests of extended POSIX conversion --------
#pattern convert=unset:posix_extended
/<[[:a[:digit:]b]>/
(*NUL)<[[:a[:digit:]b]>
<[>
0: <[>
<:>
0: <:>
<a>
0: <a>
<9>
0: <9>
<b>
0: <b>
\= Expect no match
<d>
No match
/a+\1b\\c|d[ab\c]/
(*NUL)a+1b\\c|d[ab\\c]
/<[]bc]>/
(*NUL)<[]bc]>
<]>
0: <]>
<b>
0: <b>
<c>
0: <c>
/<[^]bc]>/
(*NUL)<[^]bc]>
<.>
0: <.>
\= Expect no match
<]>
No match
<b>
No match
/(a)\1b/
(*NUL)(a)1b
a1b
0: a1b
1: a
\= Expect no match
aab
No match
/(ab)c)d]/
(*NUL)(ab)c\)d\]
Xabc)d]Y
0: abc)d]
1: ab
/a***b/
(*NUL)a*b
# -------- Tests of basic POSIX conversion --------
#pattern convert=unset:posix_basic
/a*b+c\+[def](ab)\(cd\)/
(*NUL)a*b\+c\+[def]\(ab\)(cd)
/\(a\)\1b/
(*NUL)(a)\1b
aab
0: aab
1: a
\= Expect no match
a1b
No match
/how.to how\.to/
(*NUL)how.to how\.to
how\nto how.to
0: how\x0ato how.to
\= Expect no match
how\x{0}to how.to
No match
/^how to \^how to/
(*NUL)^how to \^how to
/^*abc/
(*NUL)^\*abc
/*abc/
(*NUL)\*abc
X*abcY
0: *abc
/**abc/
(*NUL)\**abc
XabcY
0: abc
X*abcY
0: *abc
X**abcY
0: **abc
/*ab\(*cd\)/
(*NUL)\*ab(\*cd)
/^b\(c^d\)\(^e^f\)/
(*NUL)^b(c\^d)(^e\^f)
/a***b/
(*NUL)a*b
# End of testinput24

25
3rd/pcre2/testdata/testoutput25 vendored Normal file

File diff suppressed because one or more lines are too long

3515
3rd/pcre2/testdata/testoutput26 vendored Normal file

@@ -0,0 +1,3515 @@
# These tests were generated by maint/GenerateTest.py using PCRE2's UCP
# data, do not edit unless that data has changed and they are reflecting
# a previous version.
# Unicode Script Extension tests for version 15.0.0
#perltest
# Base script check
/^\p{sc=Latin}/utf
A
0: A
/^\p{Script=Latn}/utf
\x{1df2a}
0: \x{1df2a}
# Script extension check
/^\p{Latin}/utf
\x{363}
0: \x{363}
/^\p{scx=Latn}/utf
\x{a92e}
0: \x{a92e}
# Script extension only character
/^\p{Latin}/utf
\x{363}
0: \x{363}
/^\p{sc=Latin}/utf
\x{363}
No match
# Character not in script
/^\p{Latin}/utf
\x{1df2b}
No match
# Base script check
/^\p{sc=Greek}/utf
\x{370}
0: \x{370}
/^\p{Script=Grek}/utf
\x{1d245}
0: \x{1d245}
# Script extension check
/^\p{Greek}/utf
\x{342}
0: \x{342}
/^\p{Script_Extensions=Grek}/utf
\x{1dc1}
0: \x{1dc1}
# Script extension only character
/^\p{Greek}/utf
\x{342}
0: \x{342}
/^\p{sc=Greek}/utf
\x{342}
No match
# Character not in script
/^\p{Greek}/utf
\x{1d246}
No match
# Base script check
/^\p{sc=Cyrillic}/utf
\x{400}
0: \x{400}
/^\p{Script=Cyrl}/utf
\x{1e08f}
0: \x{1e08f}
# Script extension check
/^\p{Cyrillic}/utf
\x{483}
0: \x{483}
/^\p{scx=Cyrl}/utf
\x{a66f}
0: \x{a66f}
# Script extension only character
/^\p{Cyrillic}/utf
\x{2e43}
0: \x{2e43}
/^\p{sc=Cyrillic}/utf
\x{2e43}
No match
# Character not in script
/^\p{Cyrillic}/utf
\x{1e090}
No match
# Base script check
/^\p{sc=Arabic}/utf
\x{600}
0: \x{600}
/^\p{Script=Arab}/utf
\x{1eef1}
0: \x{1eef1}
# Script extension check
/^\p{Arabic}/utf
\x{60c}
0: \x{60c}
/^\p{Script_Extensions=Arab}/utf
\x{102fb}
0: \x{102fb}
# Script extension only character
/^\p{Arabic}/utf
\x{102e0}
0: \x{102e0}
/^\p{sc=Arabic}/utf
\x{102e0}
No match
# Character not in script
/^\p{Arabic}/utf
\x{1eef2}
No match
# Base script check
/^\p{sc=Syriac}/utf
\x{700}
0: \x{700}
/^\p{Script=Syrc}/utf
\x{86a}
0: \x{86a}
# Script extension check
/^\p{Syriac}/utf
\x{60c}
0: \x{60c}
/^\p{scx=Syrc}/utf
\x{1dfa}
0: \x{1dfa}
# Script extension only character
/^\p{Syriac}/utf
\x{1dfa}
0: \x{1dfa}
/^\p{sc=Syriac}/utf
\x{1dfa}
No match
# Character not in script
/^\p{Syriac}/utf
\x{1dfb}
No match
# Base script check
/^\p{sc=Thaana}/utf
\x{780}
0: \x{780}
/^\p{Script=Thaa}/utf
\x{7b1}
0: \x{7b1}
# Script extension check
/^\p{Thaana}/utf
\x{60c}
0: \x{60c}
/^\p{Script_Extensions=Thaa}/utf
\x{fdfd}
0: \x{fdfd}
# Script extension only character
/^\p{Thaana}/utf
\x{fdf2}
0: \x{fdf2}
/^\p{sc=Thaana}/utf
\x{fdf2}
No match
# Character not in script
/^\p{Thaana}/utf
\x{fdfe}
No match
# Base script check
/^\p{sc=Devanagari}/utf
\x{900}
0: \x{900}
/^\p{Script=Deva}/utf
\x{11b09}
0: \x{11b09}
# Script extension check
/^\p{Devanagari}/utf
\x{951}
0: \x{951}
/^\p{scx=Deva}/utf
\x{a8f3}
0: \x{a8f3}
# Script extension only character
/^\p{Devanagari}/utf
\x{1cd1}
0: \x{1cd1}
/^\p{sc=Devanagari}/utf
\x{1cd1}
No match
# Character not in script
/^\p{Devanagari}/utf
\x{11b0a}
No match
# Base script check
/^\p{sc=Bengali}/utf
\x{980}
0: \x{980}
/^\p{Script=Beng}/utf
\x{9fe}
0: \x{9fe}
# Script extension check
/^\p{Bengali}/utf
\x{951}
0: \x{951}
/^\p{Script_Extensions=Beng}/utf
\x{a8f1}
0: \x{a8f1}
# Script extension only character
/^\p{Bengali}/utf
\x{1cf7}
0: \x{1cf7}
/^\p{sc=Bengali}/utf
\x{1cf7}
No match
# Character not in script
/^\p{Bengali}/utf
\x{a8f2}
No match
# Base script check
/^\p{sc=Gurmukhi}/utf
\x{a01}
0: \x{a01}
/^\p{Script=Guru}/utf
\x{a76}
0: \x{a76}
# Script extension check
/^\p{Gurmukhi}/utf
\x{951}
0: \x{951}
/^\p{scx=Guru}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Gurmukhi}/utf
\x{a836}
0: \x{a836}
/^\p{sc=Gurmukhi}/utf
\x{a836}
No match
# Character not in script
/^\p{Gurmukhi}/utf
\x{a83a}
No match
# Base script check
/^\p{sc=Gujarati}/utf
\x{a81}
0: \x{a81}
/^\p{Script=Gujr}/utf
\x{aff}
0: \x{aff}
# Script extension check
/^\p{Gujarati}/utf
\x{951}
0: \x{951}
/^\p{Script_Extensions=Gujr}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Gujarati}/utf
\x{a836}
0: \x{a836}
/^\p{sc=Gujarati}/utf
\x{a836}
No match
# Character not in script
/^\p{Gujarati}/utf
\x{a83a}
No match
# Base script check
/^\p{sc=Oriya}/utf
\x{b01}
0: \x{b01}
/^\p{Script=Orya}/utf
\x{b77}
0: \x{b77}
# Script extension check
/^\p{Oriya}/utf
\x{951}
0: \x{951}
/^\p{scx=Orya}/utf
\x{1cf2}
0: \x{1cf2}
# Script extension only character
/^\p{Oriya}/utf
\x{1cda}
0: \x{1cda}
/^\p{sc=Oriya}/utf
\x{1cda}
No match
# Character not in script
/^\p{Oriya}/utf
\x{1cf3}
No match
# Base script check
/^\p{sc=Tamil}/utf
\x{b82}
0: \x{b82}
/^\p{Script=Taml}/utf
\x{11fff}
0: \x{11fff}
# Script extension check
/^\p{Tamil}/utf
\x{951}
0: \x{951}
/^\p{Script_Extensions=Taml}/utf
\x{11fd3}
0: \x{11fd3}
# Script extension only character
/^\p{Tamil}/utf
\x{a8f3}
0: \x{a8f3}
/^\p{sc=Tamil}/utf
\x{a8f3}
No match
# Character not in script
/^\p{Tamil}/utf
\x{12000}
No match
# Base script check
/^\p{sc=Telugu}/utf
\x{c00}
0: \x{c00}
/^\p{Script=Telu}/utf
\x{c7f}
0: \x{c7f}
# Script extension check
/^\p{Telugu}/utf
\x{951}
0: \x{951}
/^\p{scx=Telu}/utf
\x{1cf2}
0: \x{1cf2}
# Script extension only character
/^\p{Telugu}/utf
\x{1cda}
0: \x{1cda}
/^\p{sc=Telugu}/utf
\x{1cda}
No match
# Character not in script
/^\p{Telugu}/utf
\x{1cf3}
No match
# Base script check
/^\p{sc=Kannada}/utf
\x{c80}
0: \x{c80}
/^\p{Script=Knda}/utf
\x{cf3}
0: \x{cf3}
# Script extension check
/^\p{Kannada}/utf
\x{951}
0: \x{951}
/^\p{Script_Extensions=Knda}/utf
\x{a835}
0: \x{a835}
# Script extension only character
/^\p{Kannada}/utf
\x{1cf4}
0: \x{1cf4}
/^\p{sc=Kannada}/utf
\x{1cf4}
No match
# Character not in script
/^\p{Kannada}/utf
\x{a836}
No match
# Base script check
/^\p{sc=Malayalam}/utf
\x{d00}
0: \x{d00}
/^\p{Script=Mlym}/utf
\x{d7f}
0: \x{d7f}
# Script extension check
/^\p{Malayalam}/utf
\x{951}
0: \x{951}
/^\p{scx=Mlym}/utf
\x{a832}
0: \x{a832}
# Script extension only character
/^\p{Malayalam}/utf
\x{1cda}
0: \x{1cda}
/^\p{sc=Malayalam}/utf
\x{1cda}
No match
# Character not in script
/^\p{Malayalam}/utf
\x{a833}
No match
# Base script check
/^\p{sc=Sinhala}/utf
\x{d81}
0: \x{d81}
/^\p{Script=Sinh}/utf
\x{111f4}
0: \x{111f4}
# Script extension check
/^\p{Sinhala}/utf
\x{964}
0: \x{964}
/^\p{Script_Extensions=Sinh}/utf
\x{965}
0: \x{965}
# Script extension only character
/^\p{Sinhala}/utf
\x{964}
0: \x{964}
/^\p{sc=Sinhala}/utf
\x{964}
No match
# Character not in script
/^\p{Sinhala}/utf
\x{111f5}
No match
# Base script check
/^\p{sc=Myanmar}/utf
\x{1000}
0: \x{1000}
/^\p{Script=Mymr}/utf
\x{aa7f}
0: \x{aa7f}
# Script extension check
/^\p{Myanmar}/utf
\x{1040}
0: \x{1040}
/^\p{scx=Mymr}/utf
\x{a92e}
0: \x{a92e}
# Script extension only character
/^\p{Myanmar}/utf
\x{a92e}
0: \x{a92e}
/^\p{sc=Myanmar}/utf
\x{a92e}
No match
# Character not in script
/^\p{Myanmar}/utf
\x{aa80}
No match
# Base script check
/^\p{sc=Georgian}/utf
\x{10a0}
0: \x{10a0}
/^\p{Script=Geor}/utf
\x{2d2d}
0: \x{2d2d}
# Script extension check
/^\p{Georgian}/utf
\x{10fb}
0: \x{10fb}
/^\p{Script_Extensions=Geor}/utf
\x{10fb}
0: \x{10fb}
# Script extension only character
/^\p{Georgian}/utf
\x{10fb}
0: \x{10fb}
/^\p{sc=Georgian}/utf
\x{10fb}
No match
# Character not in script
/^\p{Georgian}/utf
\x{2d2e}
No match
# Base script check
/^\p{sc=Hangul}/utf
\x{1100}
0: \x{1100}
/^\p{Script=Hang}/utf
\x{ffdc}
0: \x{ffdc}
# Script extension check
/^\p{Hangul}/utf
\x{3001}
0: \x{3001}
/^\p{scx=Hang}/utf
\x{ff65}
0: \x{ff65}
# Script extension only character
/^\p{Hangul}/utf
\x{3003}
0: \x{3003}
/^\p{sc=Hangul}/utf
\x{3003}
No match
# Character not in script
/^\p{Hangul}/utf
\x{ffdd}
No match
# Base script check
/^\p{sc=Mongolian}/utf
\x{1800}
0: \x{1800}
/^\p{Script=Mong}/utf
\x{1166c}
0: \x{1166c}
# Script extension check
/^\p{Mongolian}/utf
\x{1802}
0: \x{1802}
/^\p{Script_Extensions=Mong}/utf
\x{202f}
0: \x{202f}
# Script extension only character
/^\p{Mongolian}/utf
\x{202f}
0: \x{202f}
/^\p{sc=Mongolian}/utf
\x{202f}
No match
# Character not in script
/^\p{Mongolian}/utf
\x{1166d}
No match
# Base script check
/^\p{sc=Hiragana}/utf
\x{3041}
0: \x{3041}
/^\p{Script=Hira}/utf
\x{1f200}
0: \x{1f200}
# Script extension check
/^\p{Hiragana}/utf
\x{3001}
0: \x{3001}
/^\p{scx=Hira}/utf
\x{ff9f}
0: \x{ff9f}
# Script extension only character
/^\p{Hiragana}/utf
\x{3031}
0: \x{3031}
/^\p{sc=Hiragana}/utf
\x{3031}
No match
# Character not in script
/^\p{Hiragana}/utf
\x{1f201}
No match
# Base script check
/^\p{sc=Katakana}/utf
\x{30a1}
0: \x{30a1}
/^\p{Script=Kana}/utf
\x{1b167}
0: \x{1b167}
# Script extension check
/^\p{Katakana}/utf
\x{3001}
0: \x{3001}
/^\p{Script_Extensions=Kana}/utf
\x{ff9f}
0: \x{ff9f}
# Script extension only character
/^\p{Katakana}/utf
\x{3031}
0: \x{3031}
/^\p{sc=Katakana}/utf
\x{3031}
No match
# Character not in script
/^\p{Katakana}/utf
\x{1b168}
No match
# Base script check
/^\p{sc=Bopomofo}/utf
\x{2ea}
0: \x{2ea}
/^\p{Script=Bopo}/utf
\x{31bf}
0: \x{31bf}
# Script extension check
/^\p{Bopomofo}/utf
\x{3001}
0: \x{3001}
/^\p{scx=Bopo}/utf
\x{ff65}
0: \x{ff65}
# Script extension only character
/^\p{Bopomofo}/utf
\x{302a}
0: \x{302a}
/^\p{sc=Bopomofo}/utf
\x{302a}
No match
# Character not in script
/^\p{Bopomofo}/utf
\x{ff66}
No match
# Base script check
/^\p{sc=Han}/utf
\x{2e80}
0: \x{2e80}
/^\p{Script=Hani}/utf
\x{323af}
0: \x{323af}
# Script extension check
/^\p{Han}/utf
\x{3001}
0: \x{3001}
/^\p{Script_Extensions=Hani}/utf
\x{1f251}
0: \x{1f251}
# Script extension only character
/^\p{Han}/utf
\x{3006}
0: \x{3006}
/^\p{sc=Han}/utf
\x{3006}
No match
# Character not in script
/^\p{Han}/utf
\x{323b0}
No match
# Base script check
/^\p{sc=Yi}/utf
\x{a000}
0: \x{a000}
/^\p{Script=Yiii}/utf
\x{a4c6}
0: \x{a4c6}
# Script extension check
/^\p{Yi}/utf
\x{3001}
0: \x{3001}
/^\p{scx=Yiii}/utf
\x{ff65}
0: \x{ff65}
# Script extension only character
/^\p{Yi}/utf
\x{3001}
0: \x{3001}
/^\p{sc=Yi}/utf
\x{3001}
No match
# Character not in script
/^\p{Yi}/utf
\x{ff66}
No match
# Base script check
/^\p{sc=Tagalog}/utf
\x{1700}
0: \x{1700}
/^\p{Script=Tglg}/utf
\x{171f}
0: \x{171f}
# Script extension check
/^\p{Tagalog}/utf
\x{1735}
0: \x{1735}
/^\p{Script_Extensions=Tglg}/utf
\x{1736}
0: \x{1736}
# Script extension only character
/^\p{Tagalog}/utf
\x{1735}
0: \x{1735}
/^\p{sc=Tagalog}/utf
\x{1735}
No match
# Character not in script
/^\p{Tagalog}/utf
\x{1737}
No match
# Base script check
/^\p{sc=Hanunoo}/utf
\x{1720}
0: \x{1720}
/^\p{Script=Hano}/utf
\x{1734}
0: \x{1734}
# Script extension check
/^\p{Hanunoo}/utf
\x{1735}
0: \x{1735}
/^\p{scx=Hano}/utf
\x{1736}
0: \x{1736}
# Script extension only character
/^\p{Hanunoo}/utf
\x{1735}
0: \x{1735}
/^\p{sc=Hanunoo}/utf
\x{1735}
No match
# Character not in script
/^\p{Hanunoo}/utf
\x{1737}
No match
# Base script check
/^\p{sc=Buhid}/utf
\x{1740}
0: \x{1740}
/^\p{Script=Buhd}/utf
\x{1753}
0: \x{1753}
# Script extension check
/^\p{Buhid}/utf
\x{1735}
0: \x{1735}
/^\p{Script_Extensions=Buhd}/utf
\x{1736}
0: \x{1736}
# Script extension only character
/^\p{Buhid}/utf
\x{1735}
0: \x{1735}
/^\p{sc=Buhid}/utf
\x{1735}
No match
# Character not in script
/^\p{Buhid}/utf
\x{1754}
No match
# Base script check
/^\p{sc=Tagbanwa}/utf
\x{1760}
0: \x{1760}
/^\p{Script=Tagb}/utf
\x{1773}
0: \x{1773}
# Script extension check
/^\p{Tagbanwa}/utf
\x{1735}
0: \x{1735}
/^\p{scx=Tagb}/utf
\x{1736}
0: \x{1736}
# Script extension only character
/^\p{Tagbanwa}/utf
\x{1735}
0: \x{1735}
/^\p{sc=Tagbanwa}/utf
\x{1735}
No match
# Character not in script
/^\p{Tagbanwa}/utf
\x{1774}
No match
# Base script check
/^\p{sc=Limbu}/utf
\x{1900}
0: \x{1900}
/^\p{Script=Limb}/utf
\x{194f}
0: \x{194f}
# Script extension check
/^\p{Limbu}/utf
\x{965}
0: \x{965}
/^\p{Script_Extensions=Limb}/utf
\x{965}
0: \x{965}
# Script extension only character
/^\p{Limbu}/utf
\x{965}
0: \x{965}
/^\p{sc=Limbu}/utf
\x{965}
No match
# Character not in script
/^\p{Limbu}/utf
\x{1950}
No match
# Base script check
/^\p{sc=Tai_Le}/utf
\x{1950}
0: \x{1950}
/^\p{Script=Tale}/utf
\x{1974}
0: \x{1974}
# Script extension check
/^\p{Tai_Le}/utf
\x{1040}
0: \x{1040}
/^\p{scx=Tale}/utf
\x{1049}
0: \x{1049}
# Script extension only character
/^\p{Tai_Le}/utf
\x{1040}
0: \x{1040}
/^\p{sc=Tai_Le}/utf
\x{1040}
No match
# Character not in script
/^\p{Tai_Le}/utf
\x{1975}
No match
# Base script check
/^\p{sc=Linear_B}/utf
\x{10000}
0: \x{10000}
/^\p{Script=Linb}/utf
\x{100fa}
0: \x{100fa}
# Script extension check
/^\p{Linear_B}/utf
\x{10100}
0: \x{10100}
/^\p{Script_Extensions=Linb}/utf
\x{1013f}
0: \x{1013f}
# Script extension only character
/^\p{Linear_B}/utf
\x{10102}
0: \x{10102}
/^\p{sc=Linear_B}/utf
\x{10102}
No match
# Character not in script
/^\p{Linear_B}/utf
\x{10140}
No match
# Base script check
/^\p{sc=Cypriot}/utf
\x{10800}
0: \x{10800}
/^\p{Script=Cprt}/utf
\x{1083f}
0: \x{1083f}
# Script extension check
/^\p{Cypriot}/utf
\x{10100}
0: \x{10100}
/^\p{scx=Cprt}/utf
\x{1013f}
0: \x{1013f}
# Script extension only character
/^\p{Cypriot}/utf
\x{10102}
0: \x{10102}
/^\p{sc=Cypriot}/utf
\x{10102}
No match
# Character not in script
/^\p{Cypriot}/utf
\x{10840}
No match
# Base script check
/^\p{sc=Buginese}/utf
\x{1a00}
0: \x{1a00}
/^\p{Script=Bugi}/utf
\x{1a1f}
0: \x{1a1f}
# Script extension check
/^\p{Buginese}/utf
\x{a9cf}
0: \x{a9cf}
/^\p{Script_Extensions=Bugi}/utf
\x{a9cf}
0: \x{a9cf}
# Script extension only character
/^\p{Buginese}/utf
\x{a9cf}
0: \x{a9cf}
/^\p{sc=Buginese}/utf
\x{a9cf}
No match
# Character not in script
/^\p{Buginese}/utf
\x{a9d0}
No match
# Base script check
/^\p{sc=Coptic}/utf
\x{3e2}
0: \x{3e2}
/^\p{Script=Copt}/utf
\x{2cff}
0: \x{2cff}
# Script extension check
/^\p{Coptic}/utf
\x{102e0}
0: \x{102e0}
/^\p{scx=Copt}/utf
\x{102fb}
0: \x{102fb}
# Script extension only character
/^\p{Coptic}/utf
\x{102e0}
0: \x{102e0}
/^\p{sc=Coptic}/utf
\x{102e0}
No match
# Character not in script
/^\p{Coptic}/utf
\x{102fc}
No match
# Base script check
/^\p{sc=Glagolitic}/utf
\x{2c00}
0: \x{2c00}
/^\p{Script=Glag}/utf
\x{1e02a}
0: \x{1e02a}
# Script extension check
/^\p{Glagolitic}/utf
\x{484}
0: \x{484}
/^\p{Script_Extensions=Glag}/utf
\x{a66f}
0: \x{a66f}
# Script extension only character
/^\p{Glagolitic}/utf
\x{484}
0: \x{484}
/^\p{sc=Glagolitic}/utf
\x{484}
No match
# Character not in script
/^\p{Glagolitic}/utf
\x{1e02b}
No match
# Base script check
/^\p{sc=Syloti_Nagri}/utf
\x{a800}
0: \x{a800}
/^\p{Script=Sylo}/utf
\x{a82c}
0: \x{a82c}
# Script extension check
/^\p{Syloti_Nagri}/utf
\x{964}
0: \x{964}
/^\p{scx=Sylo}/utf
\x{9ef}
0: \x{9ef}
# Script extension only character
/^\p{Syloti_Nagri}/utf
\x{9e6}
0: \x{9e6}
/^\p{sc=Syloti_Nagri}/utf
\x{9e6}
No match
# Character not in script
/^\p{Syloti_Nagri}/utf
\x{a82d}
No match
# Base script check
/^\p{sc=Phags_Pa}/utf
\x{a840}
0: \x{a840}
/^\p{Script=Phag}/utf
\x{a877}
0: \x{a877}
# Script extension check
/^\p{Phags_Pa}/utf
\x{1802}
0: \x{1802}
/^\p{Script_Extensions=Phag}/utf
\x{1805}
0: \x{1805}
# Script extension only character
/^\p{Phags_Pa}/utf
\x{1802}
0: \x{1802}
/^\p{sc=Phags_Pa}/utf
\x{1802}
No match
# Character not in script
/^\p{Phags_Pa}/utf
\x{a878}
No match
# Base script check
/^\p{sc=Nko}/utf
\x{7c0}
0: \x{7c0}
/^\p{Script=Nkoo}/utf
\x{7ff}
0: \x{7ff}
# Script extension check
/^\p{Nko}/utf
\x{60c}
0: \x{60c}
/^\p{scx=Nkoo}/utf
\x{fd3f}
0: \x{fd3f}
# Script extension only character
/^\p{Nko}/utf
\x{fd3e}
0: \x{fd3e}
/^\p{sc=Nko}/utf
\x{fd3e}
No match
# Character not in script
/^\p{Nko}/utf
\x{fd40}
No match
# Base script check
/^\p{sc=Kayah_Li}/utf
\x{a900}
0: \x{a900}
/^\p{Script=Kali}/utf
\x{a92f}
0: \x{a92f}
# Script extension check
/^\p{Kayah_Li}/utf
\x{a92e}
0: \x{a92e}
/^\p{Script_Extensions=Kali}/utf
\x{a92e}
0: \x{a92e}
# Script extension only character
/^\p{Kayah_Li}/utf
\x{a92e}
0: \x{a92e}
/^\p{sc=Kayah_Li}/utf
\x{a92e}
No match
# Character not in script
/^\p{Kayah_Li}/utf
\x{a930}
No match
# Base script check
/^\p{sc=Javanese}/utf
\x{a980}
0: \x{a980}
/^\p{Script=Java}/utf
\x{a9df}
0: \x{a9df}
# Script extension check
/^\p{Javanese}/utf
\x{a9cf}
0: \x{a9cf}
/^\p{scx=Java}/utf
\x{a9cf}
0: \x{a9cf}
# Script extension only character
/^\p{Javanese}/utf
\x{a9cf}
0: \x{a9cf}
/^\p{sc=Javanese}/utf
\x{a9cf}
No match
# Character not in script
/^\p{Javanese}/utf
\x{a9e0}
No match
# Base script check
/^\p{sc=Kaithi}/utf
\x{11080}
0: \x{11080}
/^\p{Script=Kthi}/utf
\x{110cd}
0: \x{110cd}
# Script extension check
/^\p{Kaithi}/utf
\x{966}
0: \x{966}
/^\p{Script_Extensions=Kthi}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Kaithi}/utf
\x{966}
0: \x{966}
/^\p{sc=Kaithi}/utf
\x{966}
No match
# Character not in script
/^\p{Kaithi}/utf
\x{110ce}
No match
# Base script check
/^\p{sc=Mandaic}/utf
\x{840}
0: \x{840}
/^\p{Script=Mand}/utf
\x{85e}
0: \x{85e}
# Script extension check
/^\p{Mandaic}/utf
\x{640}
0: \x{640}
/^\p{scx=Mand}/utf
\x{640}
0: \x{640}
# Script extension only character
/^\p{Mandaic}/utf
\x{640}
0: \x{640}
/^\p{sc=Mandaic}/utf
\x{640}
No match
# Character not in script
/^\p{Mandaic}/utf
\x{85f}
No match
# Base script check
/^\p{sc=Chakma}/utf
\x{11100}
0: \x{11100}
/^\p{Script=Cakm}/utf
\x{11147}
0: \x{11147}
# Script extension check
/^\p{Chakma}/utf
\x{9e6}
0: \x{9e6}
/^\p{Script_Extensions=Cakm}/utf
\x{1049}
0: \x{1049}
# Script extension only character
/^\p{Chakma}/utf
\x{9e6}
0: \x{9e6}
/^\p{sc=Chakma}/utf
\x{9e6}
No match
# Character not in script
/^\p{Chakma}/utf
\x{11148}
No match
# Base script check
/^\p{sc=Sharada}/utf
\x{11180}
0: \x{11180}
/^\p{Script=Shrd}/utf
\x{111df}
0: \x{111df}
# Script extension check
/^\p{Sharada}/utf
\x{951}
0: \x{951}
/^\p{scx=Shrd}/utf
\x{1ce0}
0: \x{1ce0}
# Script extension only character
/^\p{Sharada}/utf
\x{1cd7}
0: \x{1cd7}
/^\p{sc=Sharada}/utf
\x{1cd7}
No match
# Character not in script
/^\p{Sharada}/utf
\x{111e0}
No match
# Base script check
/^\p{sc=Takri}/utf
\x{11680}
0: \x{11680}
/^\p{Script=Takr}/utf
\x{116c9}
0: \x{116c9}
# Script extension check
/^\p{Takri}/utf
\x{964}
0: \x{964}
/^\p{Script_Extensions=Takr}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Takri}/utf
\x{a836}
0: \x{a836}
/^\p{sc=Takri}/utf
\x{a836}
No match
# Character not in script
/^\p{Takri}/utf
\x{116ca}
No match
# Base script check
/^\p{sc=Duployan}/utf
\x{1bc00}
0: \x{1bc00}
/^\p{Script=Dupl}/utf
\x{1bc9f}
0: \x{1bc9f}
# Script extension check
/^\p{Duployan}/utf
\x{1bca0}
0: \x{1bca0}
/^\p{scx=Dupl}/utf
\x{1bca3}
0: \x{1bca3}
# Script extension only character
/^\p{Duployan}/utf
\x{1bca0}
0: \x{1bca0}
/^\p{sc=Duployan}/utf
\x{1bca0}
No match
# Character not in script
/^\p{Duployan}/utf
\x{1bca4}
No match
# Base script check
/^\p{sc=Grantha}/utf
\x{11300}
0: \x{11300}
/^\p{Script=Gran}/utf
\x{11374}
0: \x{11374}
# Script extension check
/^\p{Grantha}/utf
\x{951}
0: \x{951}
/^\p{Script_Extensions=Gran}/utf
\x{11fd3}
0: \x{11fd3}
# Script extension only character
/^\p{Grantha}/utf
\x{1cd3}
0: \x{1cd3}
/^\p{sc=Grantha}/utf
\x{1cd3}
No match
# Character not in script
/^\p{Grantha}/utf
\x{11fd4}
No match
# Base script check
/^\p{sc=Khojki}/utf
\x{11200}
0: \x{11200}
/^\p{Script=Khoj}/utf
\x{11241}
0: \x{11241}
# Script extension check
/^\p{Khojki}/utf
\x{ae6}
0: \x{ae6}
/^\p{scx=Khoj}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Khojki}/utf
\x{ae6}
0: \x{ae6}
/^\p{sc=Khojki}/utf
\x{ae6}
No match
# Character not in script
/^\p{Khojki}/utf
\x{11242}
No match
# Base script check
/^\p{sc=Linear_A}/utf
\x{10600}
0: \x{10600}
/^\p{Script=Lina}/utf
\x{10767}
0: \x{10767}
# Script extension check
/^\p{Linear_A}/utf
\x{10107}
0: \x{10107}
/^\p{Script_Extensions=Lina}/utf
\x{10133}
0: \x{10133}
# Script extension only character
/^\p{Linear_A}/utf
\x{10107}
0: \x{10107}
/^\p{sc=Linear_A}/utf
\x{10107}
No match
# Character not in script
/^\p{Linear_A}/utf
\x{10768}
No match
# Base script check
/^\p{sc=Mahajani}/utf
\x{11150}
0: \x{11150}
/^\p{Script=Mahj}/utf
\x{11176}
0: \x{11176}
# Script extension check
/^\p{Mahajani}/utf
\x{964}
0: \x{964}
/^\p{scx=Mahj}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Mahajani}/utf
\x{966}
0: \x{966}
/^\p{sc=Mahajani}/utf
\x{966}
No match
# Character not in script
/^\p{Mahajani}/utf
\x{11177}
No match
# Base script check
/^\p{sc=Manichaean}/utf
\x{10ac0}
0: \x{10ac0}
/^\p{Script=Mani}/utf
\x{10af6}
0: \x{10af6}
# Script extension check
/^\p{Manichaean}/utf
\x{640}
0: \x{640}
/^\p{Script_Extensions=Mani}/utf
\x{10af2}
0: \x{10af2}
# Script extension only character
/^\p{Manichaean}/utf
\x{640}
0: \x{640}
/^\p{sc=Manichaean}/utf
\x{640}
No match
# Character not in script
/^\p{Manichaean}/utf
\x{10af7}
No match
# Base script check
/^\p{sc=Modi}/utf
\x{11600}
0: \x{11600}
/^\p{Script=Modi}/utf
\x{11659}
0: \x{11659}
# Script extension check
/^\p{Modi}/utf
\x{a830}
0: \x{a830}
/^\p{scx=Modi}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Modi}/utf
\x{a836}
0: \x{a836}
/^\p{sc=Modi}/utf
\x{a836}
No match
# Character not in script
/^\p{Modi}/utf
\x{1165a}
No match
# Base script check
/^\p{sc=Old_Permic}/utf
\x{10350}
0: \x{10350}
/^\p{Script=Perm}/utf
\x{1037a}
0: \x{1037a}
# Script extension check
/^\p{Old_Permic}/utf
\x{483}
0: \x{483}
/^\p{Script_Extensions=Perm}/utf
\x{483}
0: \x{483}
# Script extension only character
/^\p{Old_Permic}/utf
\x{483}
0: \x{483}
/^\p{sc=Old_Permic}/utf
\x{483}
No match
# Character not in script
/^\p{Old_Permic}/utf
\x{1037b}
No match
# Base script check
/^\p{sc=Psalter_Pahlavi}/utf
\x{10b80}
0: \x{10b80}
/^\p{Script=Phlp}/utf
\x{10baf}
0: \x{10baf}
# Script extension check
/^\p{Psalter_Pahlavi}/utf
\x{640}
0: \x{640}
/^\p{scx=Phlp}/utf
\x{640}
0: \x{640}
# Script extension only character
/^\p{Psalter_Pahlavi}/utf
\x{640}
0: \x{640}
/^\p{sc=Psalter_Pahlavi}/utf
\x{640}
No match
# Character not in script
/^\p{Psalter_Pahlavi}/utf
\x{10bb0}
No match
# Base script check
/^\p{sc=Khudawadi}/utf
\x{112b0}
0: \x{112b0}
/^\p{Script=Sind}/utf
\x{112f9}
0: \x{112f9}
# Script extension check
/^\p{Khudawadi}/utf
\x{964}
0: \x{964}
/^\p{Script_Extensions=Sind}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Khudawadi}/utf
\x{a836}
0: \x{a836}
/^\p{sc=Khudawadi}/utf
\x{a836}
No match
# Character not in script
/^\p{Khudawadi}/utf
\x{112fa}
No match
# Base script check
/^\p{sc=Tirhuta}/utf
\x{11480}
0: \x{11480}
/^\p{Script=Tirh}/utf
\x{114d9}
0: \x{114d9}
# Script extension check
/^\p{Tirhuta}/utf
\x{951}
0: \x{951}
/^\p{scx=Tirh}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Tirhuta}/utf
\x{1cf2}
0: \x{1cf2}
/^\p{sc=Tirhuta}/utf
\x{1cf2}
No match
# Character not in script
/^\p{Tirhuta}/utf
\x{114da}
No match
# Base script check
/^\p{sc=Multani}/utf
\x{11280}
0: \x{11280}
/^\p{Script=Mult}/utf
\x{112a9}
0: \x{112a9}
# Script extension check
/^\p{Multani}/utf
\x{a66}
0: \x{a66}
/^\p{Script_Extensions=Mult}/utf
\x{a6f}
0: \x{a6f}
# Script extension only character
/^\p{Multani}/utf
\x{a66}
0: \x{a66}
/^\p{sc=Multani}/utf
\x{a66}
No match
# Character not in script
/^\p{Multani}/utf
\x{112aa}
No match
# Base script check
/^\p{sc=Adlam}/utf
\x{1e900}
0: \x{1e900}
/^\p{Script=Adlm}/utf
\x{1e95f}
0: \x{1e95f}
# Script extension check
/^\p{Adlam}/utf
\x{61f}
0: \x{61f}
/^\p{scx=Adlm}/utf
\x{640}
0: \x{640}
# Script extension only character
/^\p{Adlam}/utf
\x{61f}
0: \x{61f}
/^\p{sc=Adlam}/utf
\x{61f}
No match
# Character not in script
/^\p{Adlam}/utf
\x{1e960}
No match
# Base script check
/^\p{sc=Masaram_Gondi}/utf
\x{11d00}
0: \x{11d00}
/^\p{Script=Gonm}/utf
\x{11d59}
0: \x{11d59}
# Script extension check
/^\p{Masaram_Gondi}/utf
\x{964}
0: \x{964}
/^\p{Script_Extensions=Gonm}/utf
\x{965}
0: \x{965}
# Script extension only character
/^\p{Masaram_Gondi}/utf
\x{964}
0: \x{964}
/^\p{sc=Masaram_Gondi}/utf
\x{964}
No match
# Character not in script
/^\p{Masaram_Gondi}/utf
\x{11d5a}
No match
# Base script check
/^\p{sc=Dogra}/utf
\x{11800}
0: \x{11800}
/^\p{Script=Dogr}/utf
\x{1183b}
0: \x{1183b}
# Script extension check
/^\p{Dogra}/utf
\x{964}
0: \x{964}
/^\p{scx=Dogr}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Dogra}/utf
\x{966}
0: \x{966}
/^\p{sc=Dogra}/utf
\x{966}
No match
# Character not in script
/^\p{Dogra}/utf
\x{1183c}
No match
# Base script check
/^\p{sc=Gunjala_Gondi}/utf
\x{11d60}
0: \x{11d60}
/^\p{Script=Gong}/utf
\x{11da9}
0: \x{11da9}
# Script extension check
/^\p{Gunjala_Gondi}/utf
\x{964}
0: \x{964}
/^\p{Script_Extensions=Gong}/utf
\x{965}
0: \x{965}
# Script extension only character
/^\p{Gunjala_Gondi}/utf
\x{964}
0: \x{964}
/^\p{sc=Gunjala_Gondi}/utf
\x{964}
No match
# Character not in script
/^\p{Gunjala_Gondi}/utf
\x{11daa}
No match
# Base script check
/^\p{sc=Hanifi_Rohingya}/utf
\x{10d00}
0: \x{10d00}
/^\p{Script=Rohg}/utf
\x{10d39}
0: \x{10d39}
# Script extension check
/^\p{Hanifi_Rohingya}/utf
\x{60c}
0: \x{60c}
/^\p{scx=Rohg}/utf
\x{6d4}
0: \x{6d4}
# Script extension only character
/^\p{Hanifi_Rohingya}/utf
\x{6d4}
0: \x{6d4}
/^\p{sc=Hanifi_Rohingya}/utf
\x{6d4}
No match
# Character not in script
/^\p{Hanifi_Rohingya}/utf
\x{10d3a}
No match
# Base script check
/^\p{sc=Sogdian}/utf
\x{10f30}
0: \x{10f30}
/^\p{Script=Sogd}/utf
\x{10f59}
0: \x{10f59}
# Script extension check
/^\p{Sogdian}/utf
\x{640}
0: \x{640}
/^\p{Script_Extensions=Sogd}/utf
\x{640}
0: \x{640}
# Script extension only character
/^\p{Sogdian}/utf
\x{640}
0: \x{640}
/^\p{sc=Sogdian}/utf
\x{640}
No match
# Character not in script
/^\p{Sogdian}/utf
\x{10f5a}
No match
# Base script check
/^\p{sc=Nandinagari}/utf
\x{119a0}
0: \x{119a0}
/^\p{Script=Nand}/utf
\x{119e4}
0: \x{119e4}
# Script extension check
/^\p{Nandinagari}/utf
\x{964}
0: \x{964}
/^\p{scx=Nand}/utf
\x{a835}
0: \x{a835}
# Script extension only character
/^\p{Nandinagari}/utf
\x{1cfa}
0: \x{1cfa}
/^\p{sc=Nandinagari}/utf
\x{1cfa}
No match
# Character not in script
/^\p{Nandinagari}/utf
\x{119e5}
No match
# Base script check
/^\p{sc=Yezidi}/utf
\x{10e80}
0: \x{10e80}
/^\p{Script=Yezi}/utf
\x{10eb1}
0: \x{10eb1}
# Script extension check
/^\p{Yezidi}/utf
\x{60c}
0: \x{60c}
/^\p{Script_Extensions=Yezi}/utf
\x{669}
0: \x{669}
# Script extension only character
/^\p{Yezidi}/utf
\x{660}
0: \x{660}
/^\p{sc=Yezidi}/utf
\x{660}
No match
# Character not in script
/^\p{Yezidi}/utf
\x{10eb2}
No match
# Base script check
/^\p{sc=Cypro_Minoan}/utf
\x{12f90}
0: \x{12f90}
/^\p{Script=Cpmn}/utf
\x{12ff2}
0: \x{12ff2}
# Script extension check
/^\p{Cypro_Minoan}/utf
\x{10100}
0: \x{10100}
/^\p{scx=Cpmn}/utf
\x{10101}
0: \x{10101}
# Script extension only character
/^\p{Cypro_Minoan}/utf
\x{10100}
0: \x{10100}
/^\p{sc=Cypro_Minoan}/utf
\x{10100}
No match
# Character not in script
/^\p{Cypro_Minoan}/utf
\x{12ff3}
No match
# Base script check
/^\p{sc=Old_Uyghur}/utf
\x{10f70}
0: \x{10f70}
/^\p{Script=Ougr}/utf
\x{10f89}
0: \x{10f89}
# Script extension check
/^\p{Old_Uyghur}/utf
\x{640}
0: \x{640}
/^\p{Script_Extensions=Ougr}/utf
\x{10af2}
0: \x{10af2}
# Script extension only character
/^\p{Old_Uyghur}/utf
\x{10af2}
0: \x{10af2}
/^\p{sc=Old_Uyghur}/utf
\x{10af2}
No match
# Character not in script
/^\p{Old_Uyghur}/utf
\x{10f8a}
No match
# Base script check
/^\p{sc=Common}/utf
\x{00}
0: \x{00}
/^\p{Script=Zyyy}/utf
\x{e007f}
0: \x{e007f}
# Character not in script
/^\p{Common}/utf
\x{e0080}
No match
# Base script check
/^\p{sc=Armenian}/utf
\x{531}
0: \x{531}
/^\p{Script=Armn}/utf
\x{fb17}
0: \x{fb17}
# Character not in script
/^\p{Armenian}/utf
\x{fb18}
No match
# Base script check
/^\p{sc=Hebrew}/utf
\x{591}
0: \x{591}
/^\p{Script=Hebr}/utf
\x{fb4f}
0: \x{fb4f}
# Character not in script
/^\p{Hebrew}/utf
\x{fb50}
No match
# Base script check
/^\p{sc=Thai}/utf
\x{e01}
0: \x{e01}
/^\p{Script=Thai}/utf
\x{e5b}
0: \x{e5b}
# Character not in script
/^\p{Thai}/utf
\x{e5c}
No match
# Base script check
/^\p{sc=Lao}/utf
\x{e81}
0: \x{e81}
/^\p{Script=Laoo}/utf
\x{edf}
0: \x{edf}
# Character not in script
/^\p{Lao}/utf
\x{ee0}
No match
# Base script check
/^\p{sc=Tibetan}/utf
\x{f00}
0: \x{f00}
/^\p{Script=Tibt}/utf
\x{fda}
0: \x{fda}
# Character not in script
/^\p{Tibetan}/utf
\x{fdb}
No match
# Base script check
/^\p{sc=Ethiopic}/utf
\x{1200}
0: \x{1200}
/^\p{Script=Ethi}/utf
\x{1e7fe}
0: \x{1e7fe}
# Character not in script
/^\p{Ethiopic}/utf
\x{1e7ff}
No match
# Base script check
/^\p{sc=Cherokee}/utf
\x{13a0}
0: \x{13a0}
/^\p{Script=Cher}/utf
\x{abbf}
0: \x{abbf}
# Character not in script
/^\p{Cherokee}/utf
\x{abc0}
No match
# Base script check
/^\p{sc=Canadian_Aboriginal}/utf
\x{1400}
0: \x{1400}
/^\p{Script=Cans}/utf
\x{11abf}
0: \x{11abf}
# Character not in script
/^\p{Canadian_Aboriginal}/utf
\x{11ac0}
No match
# Base script check
/^\p{sc=Ogham}/utf
\x{1680}
0: \x{1680}
/^\p{Script=Ogam}/utf
\x{169c}
0: \x{169c}
# Character not in script
/^\p{Ogham}/utf
\x{169d}
No match
# Base script check
/^\p{sc=Runic}/utf
\x{16a0}
0: \x{16a0}
/^\p{Script=Runr}/utf
\x{16f8}
0: \x{16f8}
# Character not in script
/^\p{Runic}/utf
\x{16f9}
No match
# Base script check
/^\p{sc=Khmer}/utf
\x{1780}
0: \x{1780}
/^\p{Script=Khmr}/utf
\x{19ff}
0: \x{19ff}
# Character not in script
/^\p{Khmer}/utf
\x{1a00}
No match
# Base script check
/^\p{sc=Old_Italic}/utf
\x{10300}
0: \x{10300}
/^\p{Script=Ital}/utf
\x{1032f}
0: \x{1032f}
# Character not in script
/^\p{Old_Italic}/utf
\x{10330}
No match
# Base script check
/^\p{sc=Gothic}/utf
\x{10330}
0: \x{10330}
/^\p{Script=Goth}/utf
\x{1034a}
0: \x{1034a}
# Character not in script
/^\p{Gothic}/utf
\x{1034b}
No match
# Base script check
/^\p{sc=Deseret}/utf
\x{10400}
0: \x{10400}
/^\p{Script=Dsrt}/utf
\x{1044f}
0: \x{1044f}
# Character not in script
/^\p{Deseret}/utf
\x{10450}
No match
# Base script check
/^\p{sc=Inherited}/utf
\x{300}
0: \x{300}
/^\p{Script=Zinh}/utf
\x{e01ef}
0: \x{e01ef}
# Character not in script
/^\p{Inherited}/utf
\x{e01f0}
No match
# Base script check
/^\p{sc=Ugaritic}/utf
\x{10380}
0: \x{10380}
/^\p{Script=Ugar}/utf
\x{1039f}
0: \x{1039f}
# Character not in script
/^\p{Ugaritic}/utf
\x{103a0}
No match
# Base script check
/^\p{sc=Shavian}/utf
\x{10450}
0: \x{10450}
/^\p{Script=Shaw}/utf
\x{1047f}
0: \x{1047f}
# Character not in script
/^\p{Shavian}/utf
\x{10480}
No match
# Base script check
/^\p{sc=Osmanya}/utf
\x{10480}
0: \x{10480}
/^\p{Script=Osma}/utf
\x{104a9}
0: \x{104a9}
# Character not in script
/^\p{Osmanya}/utf
\x{104aa}
No match
# Base script check
/^\p{sc=Braille}/utf
\x{2800}
0: \x{2800}
/^\p{Script=Brai}/utf
\x{28ff}
0: \x{28ff}
# Character not in script
/^\p{Braille}/utf
\x{2900}
No match
# Base script check
/^\p{sc=New_Tai_Lue}/utf
\x{1980}
0: \x{1980}
/^\p{Script=Talu}/utf
\x{19df}
0: \x{19df}
# Character not in script
/^\p{New_Tai_Lue}/utf
\x{19e0}
No match
# Base script check
/^\p{sc=Tifinagh}/utf
\x{2d30}
0: \x{2d30}
/^\p{Script=Tfng}/utf
\x{2d7f}
0: \x{2d7f}
# Character not in script
/^\p{Tifinagh}/utf
\x{2d80}
No match
# Base script check
/^\p{sc=Old_Persian}/utf
\x{103a0}
0: \x{103a0}
/^\p{Script=Xpeo}/utf
\x{103d5}
0: \x{103d5}
# Character not in script
/^\p{Old_Persian}/utf
\x{103d6}
No match
# Base script check
/^\p{sc=Kharoshthi}/utf
\x{10a00}
0: \x{10a00}
/^\p{Script=Khar}/utf
\x{10a58}
0: \x{10a58}
# Character not in script
/^\p{Kharoshthi}/utf
\x{10a59}
No match
# Base script check
/^\p{sc=Balinese}/utf
\x{1b00}
0: \x{1b00}
/^\p{Script=Bali}/utf
\x{1b7e}
0: \x{1b7e}
# Character not in script
/^\p{Balinese}/utf
\x{1b8f}
No match
# Base script check
/^\p{sc=Cuneiform}/utf
\x{12000}
0: \x{12000}
/^\p{Script=Xsux}/utf
\x{12543}
0: \x{12543}
# Character not in script
/^\p{Cuneiform}/utf
\x{12544}
No match
# Base script check
/^\p{sc=Phoenician}/utf
\x{10900}
0: \x{10900}
/^\p{Script=Phnx}/utf
\x{1091f}
0: \x{1091f}
# Character not in script
/^\p{Phoenician}/utf
\x{10920}
No match
# Base script check
/^\p{sc=Sundanese}/utf
\x{1b80}
0: \x{1b80}
/^\p{Script=Sund}/utf
\x{1cc7}
0: \x{1cc7}
# Character not in script
/^\p{Sundanese}/utf
\x{1cc8}
No match
# Base script check
/^\p{sc=Lepcha}/utf
\x{1c00}
0: \x{1c00}
/^\p{Script=Lepc}/utf
\x{1c4f}
0: \x{1c4f}
# Character not in script
/^\p{Lepcha}/utf
\x{1c50}
No match
# Base script check
/^\p{sc=Ol_Chiki}/utf
\x{1c50}
0: \x{1c50}
/^\p{Script=Olck}/utf
\x{1c7f}
0: \x{1c7f}
# Character not in script
/^\p{Ol_Chiki}/utf
\x{1c80}
No match
# Base script check
/^\p{sc=Vai}/utf
\x{a500}
0: \x{a500}
/^\p{Script=Vaii}/utf
\x{a62b}
0: \x{a62b}
# Character not in script
/^\p{Vai}/utf
\x{a62c}
No match
# Base script check
/^\p{sc=Saurashtra}/utf
\x{a880}
0: \x{a880}
/^\p{Script=Saur}/utf
\x{a8d9}
0: \x{a8d9}
# Character not in script
/^\p{Saurashtra}/utf
\x{a8da}
No match
# Base script check
/^\p{sc=Rejang}/utf
\x{a930}
0: \x{a930}
/^\p{Script=Rjng}/utf
\x{a95f}
0: \x{a95f}
# Character not in script
/^\p{Rejang}/utf
\x{a960}
No match
# Base script check
/^\p{sc=Lycian}/utf
\x{10280}
0: \x{10280}
/^\p{Script=Lyci}/utf
\x{1029c}
0: \x{1029c}
# Character not in script
/^\p{Lycian}/utf
\x{1029d}
No match
# Base script check
/^\p{sc=Carian}/utf
\x{102a0}
0: \x{102a0}
/^\p{Script=Cari}/utf
\x{102d0}
0: \x{102d0}
# Character not in script
/^\p{Carian}/utf
\x{102d1}
No match
# Base script check
/^\p{sc=Lydian}/utf
\x{10920}
0: \x{10920}
/^\p{Script=Lydi}/utf
\x{1093f}
0: \x{1093f}
# Character not in script
/^\p{Lydian}/utf
\x{10940}
No match
# Base script check
/^\p{sc=Cham}/utf
\x{aa00}
0: \x{aa00}
/^\p{Script=Cham}/utf
\x{aa5f}
0: \x{aa5f}
# Character not in script
/^\p{Cham}/utf
\x{aa60}
No match
# Base script check
/^\p{sc=Tai_Tham}/utf
\x{1a20}
0: \x{1a20}
/^\p{Script=Lana}/utf
\x{1aad}
0: \x{1aad}
# Character not in script
/^\p{Tai_Tham}/utf
\x{1aae}
No match
# Base script check
/^\p{sc=Tai_Viet}/utf
\x{aa80}
0: \x{aa80}
/^\p{Script=Tavt}/utf
\x{aadf}
0: \x{aadf}
# Character not in script
/^\p{Tai_Viet}/utf
\x{aae0}
No match
# Base script check
/^\p{sc=Avestan}/utf
\x{10b00}
0: \x{10b00}
/^\p{Script=Avst}/utf
\x{10b3f}
0: \x{10b3f}
# Character not in script
/^\p{Avestan}/utf
\x{10b40}
No match
# Base script check
/^\p{sc=Egyptian_Hieroglyphs}/utf
\x{13000}
0: \x{13000}
/^\p{Script=Egyp}/utf
\x{13455}
0: \x{13455}
# Character not in script
/^\p{Egyptian_Hieroglyphs}/utf
\x{13456}
No match
# Base script check
/^\p{sc=Samaritan}/utf
\x{800}
0: \x{800}
/^\p{Script=Samr}/utf
\x{83e}
0: \x{83e}
# Character not in script
/^\p{Samaritan}/utf
\x{83f}
No match
# Base script check
/^\p{sc=Lisu}/utf
\x{a4d0}
0: \x{a4d0}
/^\p{Script=Lisu}/utf
\x{11fb0}
0: \x{11fb0}
# Character not in script
/^\p{Lisu}/utf
\x{11fb1}
No match
# Base script check
/^\p{sc=Bamum}/utf
\x{a6a0}
0: \x{a6a0}
/^\p{Script=Bamu}/utf
\x{16a38}
0: \x{16a38}
# Character not in script
/^\p{Bamum}/utf
\x{16a39}
No match
# Base script check
/^\p{sc=Meetei_Mayek}/utf
\x{aae0}
0: \x{aae0}
/^\p{Script=Mtei}/utf
\x{abf9}
0: \x{abf9}
# Character not in script
/^\p{Meetei_Mayek}/utf
\x{abfa}
No match
# Base script check
/^\p{sc=Imperial_Aramaic}/utf
\x{10840}
0: \x{10840}
/^\p{Script=Armi}/utf
\x{1085f}
0: \x{1085f}
# Character not in script
/^\p{Imperial_Aramaic}/utf
\x{10860}
No match
# Base script check
/^\p{sc=Old_South_Arabian}/utf
\x{10a60}
0: \x{10a60}
/^\p{Script=Sarb}/utf
\x{10a7f}
0: \x{10a7f}
# Character not in script
/^\p{Old_South_Arabian}/utf
\x{10a80}
No match
# Base script check
/^\p{sc=Inscriptional_Parthian}/utf
\x{10b40}
0: \x{10b40}
/^\p{Script=Prti}/utf
\x{10b5f}
0: \x{10b5f}
# Character not in script
/^\p{Inscriptional_Parthian}/utf
\x{10b60}
No match
# Base script check
/^\p{sc=Inscriptional_Pahlavi}/utf
\x{10b60}
0: \x{10b60}
/^\p{Script=Phli}/utf
\x{10b7f}
0: \x{10b7f}
# Character not in script
/^\p{Inscriptional_Pahlavi}/utf
\x{10b80}
No match
# Base script check
/^\p{sc=Old_Turkic}/utf
\x{10c00}
0: \x{10c00}
/^\p{Script=Orkh}/utf
\x{10c48}
0: \x{10c48}
# Character not in script
/^\p{Old_Turkic}/utf
\x{10c49}
No match
# Base script check
/^\p{sc=Batak}/utf
\x{1bc0}
0: \x{1bc0}
/^\p{Script=Batk}/utf
\x{1bff}
0: \x{1bff}
# Character not in script
/^\p{Batak}/utf
\x{1c00}
No match
# Base script check
/^\p{sc=Brahmi}/utf
\x{11000}
0: \x{11000}
/^\p{Script=Brah}/utf
\x{1107f}
0: \x{1107f}
# Character not in script
/^\p{Brahmi}/utf
\x{11080}
No match
# Base script check
/^\p{sc=Meroitic_Cursive}/utf
\x{109a0}
0: \x{109a0}
/^\p{Script=Merc}/utf
\x{109ff}
0: \x{109ff}
# Character not in script
/^\p{Meroitic_Cursive}/utf
\x{10a00}
No match
# Base script check
/^\p{sc=Meroitic_Hieroglyphs}/utf
\x{10980}
0: \x{10980}
/^\p{Script=Mero}/utf
\x{1099f}
0: \x{1099f}
# Character not in script
/^\p{Meroitic_Hieroglyphs}/utf
\x{109a0}
No match
# Base script check
/^\p{sc=Miao}/utf
\x{16f00}
0: \x{16f00}
/^\p{Script=Plrd}/utf
\x{16f9f}
0: \x{16f9f}
# Character not in script
/^\p{Miao}/utf
\x{16fa0}
No match
# Base script check
/^\p{sc=Sora_Sompeng}/utf
\x{110d0}
0: \x{110d0}
/^\p{Script=Sora}/utf
\x{110f9}
0: \x{110f9}
# Character not in script
/^\p{Sora_Sompeng}/utf
\x{110fa}
No match
# Base script check
/^\p{sc=Caucasian_Albanian}/utf
\x{10530}
0: \x{10530}
/^\p{Script=Aghb}/utf
\x{1056f}
0: \x{1056f}
# Character not in script
/^\p{Caucasian_Albanian}/utf
\x{10570}
No match
# Base script check
/^\p{sc=Bassa_Vah}/utf
\x{16ad0}
0: \x{16ad0}
/^\p{Script=Bass}/utf
\x{16af5}
0: \x{16af5}
# Character not in script
/^\p{Bassa_Vah}/utf
\x{16af6}
No match
# Base script check
/^\p{sc=Elbasan}/utf
\x{10500}
0: \x{10500}
/^\p{Script=Elba}/utf
\x{10527}
0: \x{10527}
# Character not in script
/^\p{Elbasan}/utf
\x{10528}
No match
# Base script check
/^\p{sc=Pahawh_Hmong}/utf
\x{16b00}
0: \x{16b00}
/^\p{Script=Hmng}/utf
\x{16b8f}
0: \x{16b8f}
# Character not in script
/^\p{Pahawh_Hmong}/utf
\x{16b90}
No match
# Base script check
/^\p{sc=Mende_Kikakui}/utf
\x{1e800}
0: \x{1e800}
/^\p{Script=Mend}/utf
\x{1e8d6}
0: \x{1e8d6}
# Character not in script
/^\p{Mende_Kikakui}/utf
\x{1e8d7}
No match
# Base script check
/^\p{sc=Mro}/utf
\x{16a40}
0: \x{16a40}
/^\p{Script=Mroo}/utf
\x{16a6f}
0: \x{16a6f}
# Character not in script
/^\p{Mro}/utf
\x{16a70}
No match
# Base script check
/^\p{sc=Old_North_Arabian}/utf
\x{10a80}
0: \x{10a80}
/^\p{Script=Narb}/utf
\x{10a9f}
0: \x{10a9f}
# Character not in script
/^\p{Old_North_Arabian}/utf
\x{10aa0}
No match
# Base script check
/^\p{sc=Nabataean}/utf
\x{10880}
0: \x{10880}
/^\p{Script=Nbat}/utf
\x{108af}
0: \x{108af}
# Character not in script
/^\p{Nabataean}/utf
\x{108b0}
No match
# Base script check
/^\p{sc=Palmyrene}/utf
\x{10860}
0: \x{10860}
/^\p{Script=Palm}/utf
\x{1087f}
0: \x{1087f}
# Character not in script
/^\p{Palmyrene}/utf
\x{10880}
No match
# Base script check
/^\p{sc=Pau_Cin_Hau}/utf
\x{11ac0}
0: \x{11ac0}
/^\p{Script=Pauc}/utf
\x{11af8}
0: \x{11af8}
# Character not in script
/^\p{Pau_Cin_Hau}/utf
\x{11af9}
No match
# Base script check
/^\p{sc=Siddham}/utf
\x{11580}
0: \x{11580}
/^\p{Script=Sidd}/utf
\x{115dd}
0: \x{115dd}
# Character not in script
/^\p{Siddham}/utf
\x{115de}
No match
# Base script check
/^\p{sc=Warang_Citi}/utf
\x{118a0}
0: \x{118a0}
/^\p{Script=Wara}/utf
\x{118ff}
0: \x{118ff}
# Character not in script
/^\p{Warang_Citi}/utf
\x{11900}
No match
# Base script check
/^\p{sc=Ahom}/utf
\x{11700}
0: \x{11700}
/^\p{Script=Ahom}/utf
\x{11746}
0: \x{11746}
# Character not in script
/^\p{Ahom}/utf
\x{11747}
No match
# Base script check
/^\p{sc=Anatolian_Hieroglyphs}/utf
\x{14400}
0: \x{14400}
/^\p{Script=Hluw}/utf
\x{14646}
0: \x{14646}
# Character not in script
/^\p{Anatolian_Hieroglyphs}/utf
\x{14647}
No match
# Base script check
/^\p{sc=Hatran}/utf
\x{108e0}
0: \x{108e0}
/^\p{Script=Hatr}/utf
\x{108ff}
0: \x{108ff}
# Character not in script
/^\p{Hatran}/utf
\x{10900}
No match
# Base script check
/^\p{sc=Old_Hungarian}/utf
\x{10c80}
0: \x{10c80}
/^\p{Script=Hung}/utf
\x{10cff}
0: \x{10cff}
# Character not in script
/^\p{Old_Hungarian}/utf
\x{10d00}
No match
# Base script check
/^\p{sc=SignWriting}/utf
\x{1d800}
0: \x{1d800}
/^\p{Script=Sgnw}/utf
\x{1daaf}
0: \x{1daaf}
# Character not in script
/^\p{SignWriting}/utf
\x{1dab0}
No match
# Base script check
/^\p{sc=Bhaiksuki}/utf
\x{11c00}
0: \x{11c00}
/^\p{Script=Bhks}/utf
\x{11c6c}
0: \x{11c6c}
# Character not in script
/^\p{Bhaiksuki}/utf
\x{11c6d}
No match
# Base script check
/^\p{sc=Marchen}/utf
\x{11c70}
0: \x{11c70}
/^\p{Script=Marc}/utf
\x{11cb6}
0: \x{11cb6}
# Character not in script
/^\p{Marchen}/utf
\x{11cb7}
No match
# Base script check
/^\p{sc=Newa}/utf
\x{11400}
0: \x{11400}
/^\p{Script=Newa}/utf
\x{11461}
0: \x{11461}
# Character not in script
/^\p{Newa}/utf
\x{11462}
No match
# Base script check
/^\p{sc=Osage}/utf
\x{104b0}
0: \x{104b0}
/^\p{Script=Osge}/utf
\x{104fb}
0: \x{104fb}
# Character not in script
/^\p{Osage}/utf
\x{104fc}
No match
# Base script check
/^\p{sc=Tangut}/utf
\x{16fe0}
0: \x{16fe0}
/^\p{Script=Tang}/utf
\x{18d08}
0: \x{18d08}
# Character not in script
/^\p{Tangut}/utf
\x{18d09}
No match
# Base script check
/^\p{sc=Nushu}/utf
\x{16fe1}
0: \x{16fe1}
/^\p{Script=Nshu}/utf
\x{1b2fb}
0: \x{1b2fb}
# Character not in script
/^\p{Nushu}/utf
\x{1b2fc}
No match
# Base script check
/^\p{sc=Soyombo}/utf
\x{11a50}
0: \x{11a50}
/^\p{Script=Soyo}/utf
\x{11aa2}
0: \x{11aa2}
# Character not in script
/^\p{Soyombo}/utf
\x{11aa3}
No match
# Base script check
/^\p{sc=Zanabazar_Square}/utf
\x{11a00}
0: \x{11a00}
/^\p{Script=Zanb}/utf
\x{11a47}
0: \x{11a47}
# Character not in script
/^\p{Zanabazar_Square}/utf
\x{11a48}
No match
# Base script check
/^\p{sc=Makasar}/utf
\x{11ee0}
0: \x{11ee0}
/^\p{Script=Maka}/utf
\x{11ef8}
0: \x{11ef8}
# Character not in script
/^\p{Makasar}/utf
\x{11ef9}
No match
# Base script check
/^\p{sc=Medefaidrin}/utf
\x{16e40}
0: \x{16e40}
/^\p{Script=Medf}/utf
\x{16e9a}
0: \x{16e9a}
# Character not in script
/^\p{Medefaidrin}/utf
\x{16e9b}
No match
# Base script check
/^\p{sc=Old_Sogdian}/utf
\x{10f00}
0: \x{10f00}
/^\p{Script=Sogo}/utf
\x{10f27}
0: \x{10f27}
# Character not in script
/^\p{Old_Sogdian}/utf
\x{10f28}
No match
# Base script check
/^\p{sc=Elymaic}/utf
\x{10fe0}
0: \x{10fe0}
/^\p{Script=Elym}/utf
\x{10ff6}
0: \x{10ff6}
# Character not in script
/^\p{Elymaic}/utf
\x{10ff7}
No match
# Base script check
/^\p{sc=Nyiakeng_Puachue_Hmong}/utf
\x{1e100}
0: \x{1e100}
/^\p{Script=Hmnp}/utf
\x{1e14f}
0: \x{1e14f}
# Character not in script
/^\p{Nyiakeng_Puachue_Hmong}/utf
\x{1e150}
No match
# Base script check
/^\p{sc=Wancho}/utf
\x{1e2c0}
0: \x{1e2c0}
/^\p{Script=Wcho}/utf
\x{1e2ff}
0: \x{1e2ff}
# Character not in script
/^\p{Wancho}/utf
\x{1e300}
No match
# Base script check
/^\p{sc=Chorasmian}/utf
\x{10fb0}
0: \x{10fb0}
/^\p{Script=Chrs}/utf
\x{10fcb}
0: \x{10fcb}
# Character not in script
/^\p{Chorasmian}/utf
\x{10fcc}
No match
# Base script check
/^\p{sc=Dives_Akuru}/utf
\x{11900}
0: \x{11900}
/^\p{Script=Diak}/utf
\x{11959}
0: \x{11959}
# Character not in script
/^\p{Dives_Akuru}/utf
\x{1195a}
No match
# Base script check
/^\p{sc=Khitan_Small_Script}/utf
\x{16fe4}
0: \x{16fe4}
/^\p{Script=Kits}/utf
\x{18cd5}
0: \x{18cd5}
# Character not in script
/^\p{Khitan_Small_Script}/utf
\x{18cd6}
No match
# Base script check
/^\p{sc=Tangsa}/utf
\x{16a70}
0: \x{16a70}
/^\p{Script=Tnsa}/utf
\x{16ac9}
0: \x{16ac9}
# Character not in script
/^\p{Tangsa}/utf
\x{16aca}
No match
# Base script check
/^\p{sc=Toto}/utf
\x{1e290}
0: \x{1e290}
/^\p{Script=Toto}/utf
\x{1e2ae}
0: \x{1e2ae}
# Character not in script
/^\p{Toto}/utf
\x{1e2af}
No match
# Base script check
/^\p{sc=Vithkuqi}/utf
\x{10570}
0: \x{10570}
/^\p{Script=Vith}/utf
\x{105bc}
0: \x{105bc}
# Character not in script
/^\p{Vithkuqi}/utf
\x{105bd}
No match
# Base script check
/^\p{sc=Kawi}/utf
\x{11f00}
0: \x{11f00}
/^\p{Script=Kawi}/utf
\x{11f59}
0: \x{11f59}
# Character not in script
/^\p{Kawi}/utf
\x{11f6a}
No match
# Base script check
/^\p{sc=Nag_Mundari}/utf
\x{1e4d0}
0: \x{1e4d0}
/^\p{Script=Nagm}/utf
\x{1e4f9}
0: \x{1e4f9}
# Character not in script
/^\p{Nag_Mundari}/utf
\x{1e4fa}
No match
# End of testinput26

4153
3rd/pcre2/testdata/testoutput27 vendored Normal file

@@ -0,0 +1,4153 @@
# These tests were generated by maint/GenerateTest.py using PCRE2's UCP
# data, do not edit unless that data has changed and they are reflecting
# a previous version.
# Unicode Script Extension tests for version 16.0.0
#perltest
# Base script check
/^\p{sc=Latin}/utf
A
0: A
/^\p{Script=Latn}/utf
\x{1df2a}
0: \x{1df2a}
# Script extension check
/^\p{Latin}/utf
\x{b7}
0: \x{b7}
/^\p{scx=Latn}/utf
\x{a92e}
0: \x{a92e}
# Script extension only character
/^\p{Latin}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Latin}/utf
\x{b7}
No match
# Character not in script
/^\p{Latin}/utf
\x{1df2b}
No match
# Base script check
/^\p{sc=Greek}/utf
\x{370}
0: \x{370}
/^\p{Script=Grek}/utf
\x{1d245}
0: \x{1d245}
# Script extension check
/^\p{Greek}/utf
\x{b7}
0: \x{b7}
/^\p{Script_Extensions=Grek}/utf
\x{205d}
0: \x{205d}
# Script extension only character
/^\p{Greek}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Greek}/utf
\x{b7}
No match
# Character not in script
/^\p{Greek}/utf
\x{1d246}
No match
# Base script check
/^\p{sc=Cyrillic}/utf
\x{400}
0: \x{400}
/^\p{Script=Cyrl}/utf
\x{1e08f}
0: \x{1e08f}
# Script extension check
/^\p{Cyrillic}/utf
\x{2bc}
0: \x{2bc}
/^\p{scx=Cyrl}/utf
\x{a66f}
0: \x{a66f}
# Script extension only character
/^\p{Cyrillic}/utf
\x{2bc}
0: \x{2bc}
/^\p{sc=Cyrillic}/utf
\x{2bc}
No match
# Character not in script
/^\p{Cyrillic}/utf
\x{1e090}
No match
# Base script check
/^\p{sc=Armenian}/utf
\x{531}
0: \x{531}
/^\p{Script=Armn}/utf
\x{fb17}
0: \x{fb17}
# Script extension check
/^\p{Armenian}/utf
\x{308}
0: \x{308}
/^\p{Script_Extensions=Armn}/utf
\x{589}
0: \x{589}
# Script extension only character
/^\p{Armenian}/utf
\x{308}
0: \x{308}
/^\p{sc=Armenian}/utf
\x{308}
No match
# Character not in script
/^\p{Armenian}/utf
\x{fb18}
No match
# Base script check
/^\p{sc=Hebrew}/utf
\x{591}
0: \x{591}
/^\p{Script=Hebr}/utf
\x{fb4f}
0: \x{fb4f}
# Script extension check
/^\p{Hebrew}/utf
\x{307}
0: \x{307}
/^\p{scx=Hebr}/utf
\x{308}
0: \x{308}
# Script extension only character
/^\p{Hebrew}/utf
\x{307}
0: \x{307}
/^\p{sc=Hebrew}/utf
\x{307}
No match
# Character not in script
/^\p{Hebrew}/utf
\x{fb50}
No match
# Base script check
/^\p{sc=Arabic}/utf
\x{600}
0: \x{600}
/^\p{Script=Arab}/utf
\x{1eef1}
0: \x{1eef1}
# Script extension check
/^\p{Arabic}/utf
\x{60c}
0: \x{60c}
/^\p{Script_Extensions=Arab}/utf
\x{102fb}
0: \x{102fb}
# Script extension only character
/^\p{Arabic}/utf
\x{60c}
0: \x{60c}
/^\p{sc=Arabic}/utf
\x{60c}
No match
# Character not in script
/^\p{Arabic}/utf
\x{1eef2}
No match
# Base script check
/^\p{sc=Syriac}/utf
\x{700}
0: \x{700}
/^\p{Script=Syrc}/utf
\x{86a}
0: \x{86a}
# Script extension check
/^\p{Syriac}/utf
\x{303}
0: \x{303}
/^\p{scx=Syrc}/utf
\x{1dfa}
0: \x{1dfa}
# Script extension only character
/^\p{Syriac}/utf
\x{303}
0: \x{303}
/^\p{sc=Syriac}/utf
\x{303}
No match
# Character not in script
/^\p{Syriac}/utf
\x{1dfb}
No match
# Base script check
/^\p{sc=Thaana}/utf
\x{780}
0: \x{780}
/^\p{Script=Thaa}/utf
\x{7b1}
0: \x{7b1}
# Script extension check
/^\p{Thaana}/utf
\x{60c}
0: \x{60c}
/^\p{Script_Extensions=Thaa}/utf
\x{fdfd}
0: \x{fdfd}
# Script extension only character
/^\p{Thaana}/utf
\x{60c}
0: \x{60c}
/^\p{sc=Thaana}/utf
\x{60c}
No match
# Character not in script
/^\p{Thaana}/utf
\x{fdfe}
No match
# Base script check
/^\p{sc=Devanagari}/utf
\x{900}
0: \x{900}
/^\p{Script=Deva}/utf
\x{11b09}
0: \x{11b09}
# Script extension check
/^\p{Devanagari}/utf
\x{2bc}
0: \x{2bc}
/^\p{scx=Deva}/utf
\x{a8f3}
0: \x{a8f3}
# Script extension only character
/^\p{Devanagari}/utf
\x{2bc}
0: \x{2bc}
/^\p{sc=Devanagari}/utf
\x{2bc}
No match
# Character not in script
/^\p{Devanagari}/utf
\x{11b0a}
No match
# Base script check
/^\p{sc=Bengali}/utf
\x{980}
0: \x{980}
/^\p{Script=Beng}/utf
\x{9fe}
0: \x{9fe}
# Script extension check
/^\p{Bengali}/utf
\x{2bc}
0: \x{2bc}
/^\p{Script_Extensions=Beng}/utf
\x{a8f1}
0: \x{a8f1}
# Script extension only character
/^\p{Bengali}/utf
\x{2bc}
0: \x{2bc}
/^\p{sc=Bengali}/utf
\x{2bc}
No match
# Character not in script
/^\p{Bengali}/utf
\x{a8f2}
No match
# Base script check
/^\p{sc=Gurmukhi}/utf
\x{a01}
0: \x{a01}
/^\p{Script=Guru}/utf
\x{a76}
0: \x{a76}
# Script extension check
/^\p{Gurmukhi}/utf
\x{951}
0: \x{951}
/^\p{scx=Guru}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Gurmukhi}/utf
\x{951}
0: \x{951}
/^\p{sc=Gurmukhi}/utf
\x{951}
No match
# Character not in script
/^\p{Gurmukhi}/utf
\x{a83a}
No match
# Base script check
/^\p{sc=Gujarati}/utf
\x{a81}
0: \x{a81}
/^\p{Script=Gujr}/utf
\x{aff}
0: \x{aff}
# Script extension check
/^\p{Gujarati}/utf
\x{951}
0: \x{951}
/^\p{Script_Extensions=Gujr}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Gujarati}/utf
\x{951}
0: \x{951}
/^\p{sc=Gujarati}/utf
\x{951}
No match
# Character not in script
/^\p{Gujarati}/utf
\x{a83a}
No match
# Base script check
/^\p{sc=Oriya}/utf
\x{b01}
0: \x{b01}
/^\p{Script=Orya}/utf
\x{b77}
0: \x{b77}
# Script extension check
/^\p{Oriya}/utf
\x{951}
0: \x{951}
/^\p{scx=Orya}/utf
\x{1cf2}
0: \x{1cf2}
# Script extension only character
/^\p{Oriya}/utf
\x{951}
0: \x{951}
/^\p{sc=Oriya}/utf
\x{951}
No match
# Character not in script
/^\p{Oriya}/utf
\x{1cf3}
No match
# Base script check
/^\p{sc=Tamil}/utf
\x{b82}
0: \x{b82}
/^\p{Script=Taml}/utf
\x{11fff}
0: \x{11fff}
# Script extension check
/^\p{Tamil}/utf
\x{951}
0: \x{951}
/^\p{Script_Extensions=Taml}/utf
\x{11fd3}
0: \x{11fd3}
# Script extension only character
/^\p{Tamil}/utf
\x{951}
0: \x{951}
/^\p{sc=Tamil}/utf
\x{951}
No match
# Character not in script
/^\p{Tamil}/utf
\x{12000}
No match
# Base script check
/^\p{sc=Telugu}/utf
\x{c00}
0: \x{c00}
/^\p{Script=Telu}/utf
\x{c7f}
0: \x{c7f}
# Script extension check
/^\p{Telugu}/utf
\x{951}
0: \x{951}
/^\p{scx=Telu}/utf
\x{1cf2}
0: \x{1cf2}
# Script extension only character
/^\p{Telugu}/utf
\x{951}
0: \x{951}
/^\p{sc=Telugu}/utf
\x{951}
No match
# Character not in script
/^\p{Telugu}/utf
\x{1cf3}
No match
# Base script check
/^\p{sc=Kannada}/utf
\x{c80}
0: \x{c80}
/^\p{Script=Knda}/utf
\x{cf3}
0: \x{cf3}
# Script extension check
/^\p{Kannada}/utf
\x{951}
0: \x{951}
/^\p{Script_Extensions=Knda}/utf
\x{a835}
0: \x{a835}
# Script extension only character
/^\p{Kannada}/utf
\x{951}
0: \x{951}
/^\p{sc=Kannada}/utf
\x{951}
No match
# Character not in script
/^\p{Kannada}/utf
\x{a836}
No match
# Base script check
/^\p{sc=Malayalam}/utf
\x{d00}
0: \x{d00}
/^\p{Script=Mlym}/utf
\x{d7f}
0: \x{d7f}
# Script extension check
/^\p{Malayalam}/utf
\x{951}
0: \x{951}
/^\p{scx=Mlym}/utf
\x{a832}
0: \x{a832}
# Script extension only character
/^\p{Malayalam}/utf
\x{951}
0: \x{951}
/^\p{sc=Malayalam}/utf
\x{951}
No match
# Character not in script
/^\p{Malayalam}/utf
\x{a833}
No match
# Base script check
/^\p{sc=Sinhala}/utf
\x{d81}
0: \x{d81}
/^\p{Script=Sinh}/utf
\x{111f4}
0: \x{111f4}
# Script extension check
/^\p{Sinhala}/utf
\x{964}
0: \x{964}
/^\p{Script_Extensions=Sinh}/utf
\x{1cf2}
0: \x{1cf2}
# Script extension only character
/^\p{Sinhala}/utf
\x{964}
0: \x{964}
/^\p{sc=Sinhala}/utf
\x{964}
No match
# Character not in script
/^\p{Sinhala}/utf
\x{111f5}
No match
# Base script check
/^\p{sc=Thai}/utf
\x{e01}
0: \x{e01}
/^\p{Script=Thai}/utf
\x{e5b}
0: \x{e5b}
# Script extension check
/^\p{Thai}/utf
\x{2bc}
0: \x{2bc}
/^\p{scx=Thai}/utf
\x{331}
0: \x{331}
# Script extension only character
/^\p{Thai}/utf
\x{2bc}
0: \x{2bc}
/^\p{sc=Thai}/utf
\x{2bc}
No match
# Character not in script
/^\p{Thai}/utf
\x{e5c}
No match
# Base script check
/^\p{sc=Tibetan}/utf
\x{f00}
0: \x{f00}
/^\p{Script=Tibt}/utf
\x{fda}
0: \x{fda}
# Script extension check
/^\p{Tibetan}/utf
\x{3008}
0: \x{3008}
/^\p{Script_Extensions=Tibt}/utf
\x{300b}
0: \x{300b}
# Script extension only character
/^\p{Tibetan}/utf
\x{3008}
0: \x{3008}
/^\p{sc=Tibetan}/utf
\x{3008}
No match
# Character not in script
/^\p{Tibetan}/utf
\x{300c}
No match
# Base script check
/^\p{sc=Myanmar}/utf
\x{1000}
0: \x{1000}
/^\p{Script=Mymr}/utf
\x{116e3}
0: \x{116e3}
# Script extension check
/^\p{Myanmar}/utf
\x{1040}
0: \x{1040}
/^\p{scx=Mymr}/utf
\x{a92e}
0: \x{a92e}
# Script extension only character
/^\p{Myanmar}/utf
\x{a92e}
0: \x{a92e}
/^\p{sc=Myanmar}/utf
\x{a92e}
No match
# Character not in script
/^\p{Myanmar}/utf
\x{116e4}
No match
# Base script check
/^\p{sc=Georgian}/utf
\x{10a0}
0: \x{10a0}
/^\p{Script=Geor}/utf
\x{2d2d}
0: \x{2d2d}
# Script extension check
/^\p{Georgian}/utf
\x{b7}
0: \x{b7}
/^\p{Script_Extensions=Geor}/utf
\x{2e31}
0: \x{2e31}
# Script extension only character
/^\p{Georgian}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Georgian}/utf
\x{b7}
No match
# Character not in script
/^\p{Georgian}/utf
\x{2e32}
No match
# Base script check
/^\p{sc=Hangul}/utf
\x{1100}
0: \x{1100}
/^\p{Script=Hang}/utf
\x{ffdc}
0: \x{ffdc}
# Script extension check
/^\p{Hangul}/utf
\x{3001}
0: \x{3001}
/^\p{scx=Hang}/utf
\x{ff65}
0: \x{ff65}
# Script extension only character
/^\p{Hangul}/utf
\x{3001}
0: \x{3001}
/^\p{sc=Hangul}/utf
\x{3001}
No match
# Character not in script
/^\p{Hangul}/utf
\x{ffdd}
No match
# Base script check
/^\p{sc=Ethiopic}/utf
\x{1200}
0: \x{1200}
/^\p{Script=Ethi}/utf
\x{1e7fe}
0: \x{1e7fe}
# Script extension check
/^\p{Ethiopic}/utf
\x{30e}
0: \x{30e}
/^\p{Script_Extensions=Ethi}/utf
\x{30e}
0: \x{30e}
# Script extension only character
/^\p{Ethiopic}/utf
\x{30e}
0: \x{30e}
/^\p{sc=Ethiopic}/utf
\x{30e}
No match
# Character not in script
/^\p{Ethiopic}/utf
\x{1e7ff}
No match
# Base script check
/^\p{sc=Cherokee}/utf
\x{13a0}
0: \x{13a0}
/^\p{Script=Cher}/utf
\x{abbf}
0: \x{abbf}
# Script extension check
/^\p{Cherokee}/utf
\x{300}
0: \x{300}
/^\p{scx=Cher}/utf
\x{331}
0: \x{331}
# Script extension only character
/^\p{Cherokee}/utf
\x{300}
0: \x{300}
/^\p{sc=Cherokee}/utf
\x{300}
No match
# Character not in script
/^\p{Cherokee}/utf
\x{abc0}
No match
# Base script check
/^\p{sc=Runic}/utf
\x{16a0}
0: \x{16a0}
/^\p{Script=Runr}/utf
\x{16f8}
0: \x{16f8}
# Script extension check
/^\p{Runic}/utf
\x{16eb}
0: \x{16eb}
/^\p{Script_Extensions=Runr}/utf
\x{16ed}
0: \x{16ed}
# Script extension only character
/^\p{Runic}/utf
\x{16eb}
0: \x{16eb}
/^\p{sc=Runic}/utf
\x{16eb}
No match
# Character not in script
/^\p{Runic}/utf
\x{16f9}
No match
# Base script check
/^\p{sc=Mongolian}/utf
\x{1800}
0: \x{1800}
/^\p{Script=Mong}/utf
\x{1166c}
0: \x{1166c}
# Script extension check
/^\p{Mongolian}/utf
\x{1802}
0: \x{1802}
/^\p{scx=Mong}/utf
\x{300b}
0: \x{300b}
# Script extension only character
/^\p{Mongolian}/utf
\x{1802}
0: \x{1802}
/^\p{sc=Mongolian}/utf
\x{1802}
No match
# Character not in script
/^\p{Mongolian}/utf
\x{1166d}
No match
# Base script check
/^\p{sc=Hiragana}/utf
\x{3041}
0: \x{3041}
/^\p{Script=Hira}/utf
\x{1f200}
0: \x{1f200}
# Script extension check
/^\p{Hiragana}/utf
\x{3001}
0: \x{3001}
/^\p{Script_Extensions=Hira}/utf
\x{ff9f}
0: \x{ff9f}
# Script extension only character
/^\p{Hiragana}/utf
\x{3001}
0: \x{3001}
/^\p{sc=Hiragana}/utf
\x{3001}
No match
# Character not in script
/^\p{Hiragana}/utf
\x{1f201}
No match
# Base script check
/^\p{sc=Katakana}/utf
\x{30a1}
0: \x{30a1}
/^\p{Script=Kana}/utf
\x{1b167}
0: \x{1b167}
# Script extension check
/^\p{Katakana}/utf
\x{305}
0: \x{305}
/^\p{scx=Kana}/utf
\x{ff9f}
0: \x{ff9f}
# Script extension only character
/^\p{Katakana}/utf
\x{305}
0: \x{305}
/^\p{sc=Katakana}/utf
\x{305}
No match
# Character not in script
/^\p{Katakana}/utf
\x{1b168}
No match
# Base script check
/^\p{sc=Bopomofo}/utf
\x{2ea}
0: \x{2ea}
/^\p{Script=Bopo}/utf
\x{31bf}
0: \x{31bf}
# Script extension check
/^\p{Bopomofo}/utf
\x{2c7}
0: \x{2c7}
/^\p{Script_Extensions=Bopo}/utf
\x{ff65}
0: \x{ff65}
# Script extension only character
/^\p{Bopomofo}/utf
\x{2c7}
0: \x{2c7}
/^\p{sc=Bopomofo}/utf
\x{2c7}
No match
# Character not in script
/^\p{Bopomofo}/utf
\x{ff66}
No match
# Base script check
/^\p{sc=Han}/utf
\x{2e80}
0: \x{2e80}
/^\p{Script=Hani}/utf
\x{323af}
0: \x{323af}
# Script extension check
/^\p{Han}/utf
\x{b7}
0: \x{b7}
/^\p{scx=Hani}/utf
\x{1f251}
0: \x{1f251}
# Script extension only character
/^\p{Han}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Han}/utf
\x{b7}
No match
# Character not in script
/^\p{Han}/utf
\x{323b0}
No match
# Base script check
/^\p{sc=Yi}/utf
\x{a000}
0: \x{a000}
/^\p{Script=Yiii}/utf
\x{a4c6}
0: \x{a4c6}
# Script extension check
/^\p{Yi}/utf
\x{3001}
0: \x{3001}
/^\p{Script_Extensions=Yiii}/utf
\x{ff65}
0: \x{ff65}
# Script extension only character
/^\p{Yi}/utf
\x{3001}
0: \x{3001}
/^\p{sc=Yi}/utf
\x{3001}
No match
# Character not in script
/^\p{Yi}/utf
\x{ff66}
No match
# Base script check
/^\p{sc=Gothic}/utf
\x{10330}
0: \x{10330}
/^\p{Script=Goth}/utf
\x{1034a}
0: \x{1034a}
# Script extension check
/^\p{Gothic}/utf
\x{b7}
0: \x{b7}
/^\p{scx=Goth}/utf
\x{331}
0: \x{331}
# Script extension only character
/^\p{Gothic}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Gothic}/utf
\x{b7}
No match
# Character not in script
/^\p{Gothic}/utf
\x{1034b}
No match
# Base script check
/^\p{sc=Tagalog}/utf
\x{1700}
0: \x{1700}
/^\p{Script=Tglg}/utf
\x{171f}
0: \x{171f}
# Script extension check
/^\p{Tagalog}/utf
\x{1735}
0: \x{1735}
/^\p{Script_Extensions=Tglg}/utf
\x{1736}
0: \x{1736}
# Script extension only character
/^\p{Tagalog}/utf
\x{1735}
0: \x{1735}
/^\p{sc=Tagalog}/utf
\x{1735}
No match
# Character not in script
/^\p{Tagalog}/utf
\x{1737}
No match
# Base script check
/^\p{sc=Hanunoo}/utf
\x{1720}
0: \x{1720}
/^\p{Script=Hano}/utf
\x{1734}
0: \x{1734}
# Script extension check
/^\p{Hanunoo}/utf
\x{1735}
0: \x{1735}
/^\p{scx=Hano}/utf
\x{1736}
0: \x{1736}
# Script extension only character
/^\p{Hanunoo}/utf
\x{1735}
0: \x{1735}
/^\p{sc=Hanunoo}/utf
\x{1735}
No match
# Character not in script
/^\p{Hanunoo}/utf
\x{1737}
No match
# Base script check
/^\p{sc=Buhid}/utf
\x{1740}
0: \x{1740}
/^\p{Script=Buhd}/utf
\x{1753}
0: \x{1753}
# Script extension check
/^\p{Buhid}/utf
\x{1735}
0: \x{1735}
/^\p{Script_Extensions=Buhd}/utf
\x{1736}
0: \x{1736}
# Script extension only character
/^\p{Buhid}/utf
\x{1735}
0: \x{1735}
/^\p{sc=Buhid}/utf
\x{1735}
No match
# Character not in script
/^\p{Buhid}/utf
\x{1754}
No match
# Base script check
/^\p{sc=Tagbanwa}/utf
\x{1760}
0: \x{1760}
/^\p{Script=Tagb}/utf
\x{1773}
0: \x{1773}
# Script extension check
/^\p{Tagbanwa}/utf
\x{1735}
0: \x{1735}
/^\p{scx=Tagb}/utf
\x{1736}
0: \x{1736}
# Script extension only character
/^\p{Tagbanwa}/utf
\x{1735}
0: \x{1735}
/^\p{sc=Tagbanwa}/utf
\x{1735}
No match
# Character not in script
/^\p{Tagbanwa}/utf
\x{1774}
No match
# Base script check
/^\p{sc=Limbu}/utf
\x{1900}
0: \x{1900}
/^\p{Script=Limb}/utf
\x{194f}
0: \x{194f}
# Script extension check
/^\p{Limbu}/utf
\x{965}
0: \x{965}
/^\p{Script_Extensions=Limb}/utf
\x{965}
0: \x{965}
# Script extension only character
/^\p{Limbu}/utf
\x{965}
0: \x{965}
/^\p{sc=Limbu}/utf
\x{965}
No match
# Character not in script
/^\p{Limbu}/utf
\x{1950}
No match
# Base script check
/^\p{sc=Tai_Le}/utf
\x{1950}
0: \x{1950}
/^\p{Script=Tale}/utf
\x{1974}
0: \x{1974}
# Script extension check
/^\p{Tai_Le}/utf
\x{300}
0: \x{300}
/^\p{scx=Tale}/utf
\x{1049}
0: \x{1049}
# Script extension only character
/^\p{Tai_Le}/utf
\x{300}
0: \x{300}
/^\p{sc=Tai_Le}/utf
\x{300}
No match
# Character not in script
/^\p{Tai_Le}/utf
\x{1975}
No match
# Base script check
/^\p{sc=Linear_B}/utf
\x{10000}
0: \x{10000}
/^\p{Script=Linb}/utf
\x{100fa}
0: \x{100fa}
# Script extension check
/^\p{Linear_B}/utf
\x{10100}
0: \x{10100}
/^\p{Script_Extensions=Linb}/utf
\x{1013f}
0: \x{1013f}
# Script extension only character
/^\p{Linear_B}/utf
\x{10100}
0: \x{10100}
/^\p{sc=Linear_B}/utf
\x{10100}
No match
# Character not in script
/^\p{Linear_B}/utf
\x{10140}
No match
# Base script check
/^\p{sc=Shavian}/utf
\x{10450}
0: \x{10450}
/^\p{Script=Shaw}/utf
\x{1047f}
0: \x{1047f}
# Script extension check
/^\p{Shavian}/utf
\x{b7}
0: \x{b7}
/^\p{scx=Shaw}/utf
\x{b7}
0: \x{b7}
# Script extension only character
/^\p{Shavian}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Shavian}/utf
\x{b7}
No match
# Character not in script
/^\p{Shavian}/utf
\x{10480}
No match
# Base script check
/^\p{sc=Cypriot}/utf
\x{10800}
0: \x{10800}
/^\p{Script=Cprt}/utf
\x{1083f}
0: \x{1083f}
# Script extension check
/^\p{Cypriot}/utf
\x{10100}
0: \x{10100}
/^\p{Script_Extensions=Cprt}/utf
\x{1013f}
0: \x{1013f}
# Script extension only character
/^\p{Cypriot}/utf
\x{10100}
0: \x{10100}
/^\p{sc=Cypriot}/utf
\x{10100}
No match
# Character not in script
/^\p{Cypriot}/utf
\x{10840}
No match
# Base script check
/^\p{sc=Buginese}/utf
\x{1a00}
0: \x{1a00}
/^\p{Script=Bugi}/utf
\x{1a1f}
0: \x{1a1f}
# Script extension check
/^\p{Buginese}/utf
\x{a9cf}
0: \x{a9cf}
/^\p{scx=Bugi}/utf
\x{a9cf}
0: \x{a9cf}
# Script extension only character
/^\p{Buginese}/utf
\x{a9cf}
0: \x{a9cf}
/^\p{sc=Buginese}/utf
\x{a9cf}
No match
# Character not in script
/^\p{Buginese}/utf
\x{a9d0}
No match
# Base script check
/^\p{sc=Coptic}/utf
\x{3e2}
0: \x{3e2}
/^\p{Script=Copt}/utf
\x{2cff}
0: \x{2cff}
# Script extension check
/^\p{Coptic}/utf
\x{b7}
0: \x{b7}
/^\p{Script_Extensions=Copt}/utf
\x{102fb}
0: \x{102fb}
# Script extension only character
/^\p{Coptic}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Coptic}/utf
\x{b7}
No match
# Character not in script
/^\p{Coptic}/utf
\x{102fc}
No match
# Base script check
/^\p{sc=Glagolitic}/utf
\x{2c00}
0: \x{2c00}
/^\p{Script=Glag}/utf
\x{1e02a}
0: \x{1e02a}
# Script extension check
/^\p{Glagolitic}/utf
\x{b7}
0: \x{b7}
/^\p{scx=Glag}/utf
\x{a66f}
0: \x{a66f}
# Script extension only character
/^\p{Glagolitic}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Glagolitic}/utf
\x{b7}
No match
# Character not in script
/^\p{Glagolitic}/utf
\x{1e02b}
No match
# Base script check
/^\p{sc=Tifinagh}/utf
\x{2d30}
0: \x{2d30}
/^\p{Script=Tfng}/utf
\x{2d7f}
0: \x{2d7f}
# Script extension check
/^\p{Tifinagh}/utf
\x{302}
0: \x{302}
/^\p{Script_Extensions=Tfng}/utf
\x{309}
0: \x{309}
# Script extension only character
/^\p{Tifinagh}/utf
\x{302}
0: \x{302}
/^\p{sc=Tifinagh}/utf
\x{302}
No match
# Character not in script
/^\p{Tifinagh}/utf
\x{2d80}
No match
# Base script check
/^\p{sc=Syloti_Nagri}/utf
\x{a800}
0: \x{a800}
/^\p{Script=Sylo}/utf
\x{a82c}
0: \x{a82c}
# Script extension check
/^\p{Syloti_Nagri}/utf
\x{964}
0: \x{964}
/^\p{scx=Sylo}/utf
\x{9ef}
0: \x{9ef}
# Script extension only character
/^\p{Syloti_Nagri}/utf
\x{964}
0: \x{964}
/^\p{sc=Syloti_Nagri}/utf
\x{964}
No match
# Character not in script
/^\p{Syloti_Nagri}/utf
\x{a82d}
No match
# Base script check
/^\p{sc=Phags_Pa}/utf
\x{a840}
0: \x{a840}
/^\p{Script=Phag}/utf
\x{a877}
0: \x{a877}
# Script extension check
/^\p{Phags_Pa}/utf
\x{1802}
0: \x{1802}
/^\p{Script_Extensions=Phag}/utf
\x{3002}
0: \x{3002}
# Script extension only character
/^\p{Phags_Pa}/utf
\x{1802}
0: \x{1802}
/^\p{sc=Phags_Pa}/utf
\x{1802}
No match
# Character not in script
/^\p{Phags_Pa}/utf
\x{a878}
No match
# Base script check
/^\p{sc=Nko}/utf
\x{7c0}
0: \x{7c0}
/^\p{Script=Nkoo}/utf
\x{7ff}
0: \x{7ff}
# Script extension check
/^\p{Nko}/utf
\x{60c}
0: \x{60c}
/^\p{scx=Nkoo}/utf
\x{fd3f}
0: \x{fd3f}
# Script extension only character
/^\p{Nko}/utf
\x{60c}
0: \x{60c}
/^\p{sc=Nko}/utf
\x{60c}
No match
# Character not in script
/^\p{Nko}/utf
\x{fd40}
No match
# Base script check
/^\p{sc=Kayah_Li}/utf
\x{a900}
0: \x{a900}
/^\p{Script=Kali}/utf
\x{a92f}
0: \x{a92f}
# Script extension check
/^\p{Kayah_Li}/utf
\x{a92e}
0: \x{a92e}
/^\p{Script_Extensions=Kali}/utf
\x{a92e}
0: \x{a92e}
# Script extension only character
/^\p{Kayah_Li}/utf
\x{a92e}
0: \x{a92e}
/^\p{sc=Kayah_Li}/utf
\x{a92e}
No match
# Character not in script
/^\p{Kayah_Li}/utf
\x{a930}
No match
# Base script check
/^\p{sc=Lycian}/utf
\x{10280}
0: \x{10280}
/^\p{Script=Lyci}/utf
\x{1029c}
0: \x{1029c}
# Script extension check
/^\p{Lycian}/utf
\x{205a}
0: \x{205a}
/^\p{scx=Lyci}/utf
\x{205a}
0: \x{205a}
# Script extension only character
/^\p{Lycian}/utf
\x{205a}
0: \x{205a}
/^\p{sc=Lycian}/utf
\x{205a}
No match
# Character not in script
/^\p{Lycian}/utf
\x{1029d}
No match
# Base script check
/^\p{sc=Carian}/utf
\x{102a0}
0: \x{102a0}
/^\p{Script=Cari}/utf
\x{102d0}
0: \x{102d0}
# Script extension check
/^\p{Carian}/utf
\x{b7}
0: \x{b7}
/^\p{Script_Extensions=Cari}/utf
\x{2e31}
0: \x{2e31}
# Script extension only character
/^\p{Carian}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Carian}/utf
\x{b7}
No match
# Character not in script
/^\p{Carian}/utf
\x{102d1}
No match
# Base script check
/^\p{sc=Lydian}/utf
\x{10920}
0: \x{10920}
/^\p{Script=Lydi}/utf
\x{1093f}
0: \x{1093f}
# Script extension check
/^\p{Lydian}/utf
\x{b7}
0: \x{b7}
/^\p{scx=Lydi}/utf
\x{2e31}
0: \x{2e31}
# Script extension only character
/^\p{Lydian}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Lydian}/utf
\x{b7}
No match
# Character not in script
/^\p{Lydian}/utf
\x{10940}
No match
# Base script check
/^\p{sc=Avestan}/utf
\x{10b00}
0: \x{10b00}
/^\p{Script=Avst}/utf
\x{10b3f}
0: \x{10b3f}
# Script extension check
/^\p{Avestan}/utf
\x{b7}
0: \x{b7}
/^\p{Script_Extensions=Avst}/utf
\x{2e31}
0: \x{2e31}
# Script extension only character
/^\p{Avestan}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Avestan}/utf
\x{b7}
No match
# Character not in script
/^\p{Avestan}/utf
\x{10b40}
No match
# Base script check
/^\p{sc=Samaritan}/utf
\x{800}
0: \x{800}
/^\p{Script=Samr}/utf
\x{83e}
0: \x{83e}
# Script extension check
/^\p{Samaritan}/utf
\x{2e31}
0: \x{2e31}
/^\p{scx=Samr}/utf
\x{2e31}
0: \x{2e31}
# Script extension only character
/^\p{Samaritan}/utf
\x{2e31}
0: \x{2e31}
/^\p{sc=Samaritan}/utf
\x{2e31}
No match
# Character not in script
/^\p{Samaritan}/utf
\x{2e32}
No match
# Base script check
/^\p{sc=Lisu}/utf
\x{a4d0}
0: \x{a4d0}
/^\p{Script=Lisu}/utf
\x{11fb0}
0: \x{11fb0}
# Script extension check
/^\p{Lisu}/utf
\x{2bc}
0: \x{2bc}
/^\p{Script_Extensions=Lisu}/utf
\x{300b}
0: \x{300b}
# Script extension only character
/^\p{Lisu}/utf
\x{2bc}
0: \x{2bc}
/^\p{sc=Lisu}/utf
\x{2bc}
No match
# Character not in script
/^\p{Lisu}/utf
\x{11fb1}
No match
# Base script check
/^\p{sc=Javanese}/utf
\x{a980}
0: \x{a980}
/^\p{Script=Java}/utf
\x{a9df}
0: \x{a9df}
# Script extension check
/^\p{Javanese}/utf
\x{a9cf}
0: \x{a9cf}
/^\p{scx=Java}/utf
\x{a9cf}
0: \x{a9cf}
# Script extension only character
/^\p{Javanese}/utf
\x{a9cf}
0: \x{a9cf}
/^\p{sc=Javanese}/utf
\x{a9cf}
No match
# Character not in script
/^\p{Javanese}/utf
\x{a9e0}
No match
# Base script check
/^\p{sc=Old_Turkic}/utf
\x{10c00}
0: \x{10c00}
/^\p{Script=Orkh}/utf
\x{10c48}
0: \x{10c48}
# Script extension check
/^\p{Old_Turkic}/utf
\x{205a}
0: \x{205a}
/^\p{Script_Extensions=Orkh}/utf
\x{2e30}
0: \x{2e30}
# Script extension only character
/^\p{Old_Turkic}/utf
\x{205a}
0: \x{205a}
/^\p{sc=Old_Turkic}/utf
\x{205a}
No match
# Character not in script
/^\p{Old_Turkic}/utf
\x{10c49}
No match
# Base script check
/^\p{sc=Kaithi}/utf
\x{11080}
0: \x{11080}
/^\p{Script=Kthi}/utf
\x{110cd}
0: \x{110cd}
# Script extension check
/^\p{Kaithi}/utf
\x{966}
0: \x{966}
/^\p{scx=Kthi}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Kaithi}/utf
\x{966}
0: \x{966}
/^\p{sc=Kaithi}/utf
\x{966}
No match
# Character not in script
/^\p{Kaithi}/utf
\x{110ce}
No match
# Base script check
/^\p{sc=Mandaic}/utf
\x{840}
0: \x{840}
/^\p{Script=Mand}/utf
\x{85e}
0: \x{85e}
# Script extension check
/^\p{Mandaic}/utf
\x{640}
0: \x{640}
/^\p{Script_Extensions=Mand}/utf
\x{640}
0: \x{640}
# Script extension only character
/^\p{Mandaic}/utf
\x{640}
0: \x{640}
/^\p{sc=Mandaic}/utf
\x{640}
No match
# Character not in script
/^\p{Mandaic}/utf
\x{85f}
No match
# Base script check
/^\p{sc=Chakma}/utf
\x{11100}
0: \x{11100}
/^\p{Script=Cakm}/utf
\x{11147}
0: \x{11147}
# Script extension check
/^\p{Chakma}/utf
\x{9e6}
0: \x{9e6}
/^\p{scx=Cakm}/utf
\x{1049}
0: \x{1049}
# Script extension only character
/^\p{Chakma}/utf
\x{9e6}
0: \x{9e6}
/^\p{sc=Chakma}/utf
\x{9e6}
No match
# Character not in script
/^\p{Chakma}/utf
\x{11148}
No match
# Base script check
/^\p{sc=Meroitic_Hieroglyphs}/utf
\x{10980}
0: \x{10980}
/^\p{Script=Mero}/utf
\x{1099f}
0: \x{1099f}
# Script extension check
/^\p{Meroitic_Hieroglyphs}/utf
\x{205d}
0: \x{205d}
/^\p{Script_Extensions=Mero}/utf
\x{205d}
0: \x{205d}
# Script extension only character
/^\p{Meroitic_Hieroglyphs}/utf
\x{205d}
0: \x{205d}
/^\p{sc=Meroitic_Hieroglyphs}/utf
\x{205d}
No match
# Character not in script
/^\p{Meroitic_Hieroglyphs}/utf
\x{109a0}
No match
# Base script check
/^\p{sc=Sharada}/utf
\x{11180}
0: \x{11180}
/^\p{Script=Shrd}/utf
\x{111df}
0: \x{111df}
# Script extension check
/^\p{Sharada}/utf
\x{951}
0: \x{951}
/^\p{scx=Shrd}/utf
\x{a838}
0: \x{a838}
# Script extension only character
/^\p{Sharada}/utf
\x{951}
0: \x{951}
/^\p{sc=Sharada}/utf
\x{951}
No match
# Character not in script
/^\p{Sharada}/utf
\x{111e0}
No match
# Base script check
/^\p{sc=Takri}/utf
\x{11680}
0: \x{11680}
/^\p{Script=Takr}/utf
\x{116c9}
0: \x{116c9}
# Script extension check
/^\p{Takri}/utf
\x{964}
0: \x{964}
/^\p{Script_Extensions=Takr}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Takri}/utf
\x{964}
0: \x{964}
/^\p{sc=Takri}/utf
\x{964}
No match
# Character not in script
/^\p{Takri}/utf
\x{116ca}
No match
# Base script check
/^\p{sc=Caucasian_Albanian}/utf
\x{10530}
0: \x{10530}
/^\p{Script=Aghb}/utf
\x{1056f}
0: \x{1056f}
# Script extension check
/^\p{Caucasian_Albanian}/utf
\x{304}
0: \x{304}
/^\p{scx=Aghb}/utf
\x{35e}
0: \x{35e}
# Script extension only character
/^\p{Caucasian_Albanian}/utf
\x{304}
0: \x{304}
/^\p{sc=Caucasian_Albanian}/utf
\x{304}
No match
# Character not in script
/^\p{Caucasian_Albanian}/utf
\x{10570}
No match
# Base script check
/^\p{sc=Duployan}/utf
\x{1bc00}
0: \x{1bc00}
/^\p{Script=Dupl}/utf
\x{1bc9f}
0: \x{1bc9f}
# Script extension check
/^\p{Duployan}/utf
\x{b7}
0: \x{b7}
/^\p{Script_Extensions=Dupl}/utf
\x{1bca3}
0: \x{1bca3}
# Script extension only character
/^\p{Duployan}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Duployan}/utf
\x{b7}
No match
# Character not in script
/^\p{Duployan}/utf
\x{1bca4}
No match
# Base script check
/^\p{sc=Elbasan}/utf
\x{10500}
0: \x{10500}
/^\p{Script=Elba}/utf
\x{10527}
0: \x{10527}
# Script extension check
/^\p{Elbasan}/utf
\x{b7}
0: \x{b7}
/^\p{scx=Elba}/utf
\x{305}
0: \x{305}
# Script extension only character
/^\p{Elbasan}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Elbasan}/utf
\x{b7}
No match
# Character not in script
/^\p{Elbasan}/utf
\x{10528}
No match
# Base script check
/^\p{sc=Grantha}/utf
\x{11300}
0: \x{11300}
/^\p{Script=Gran}/utf
\x{11374}
0: \x{11374}
# Script extension check
/^\p{Grantha}/utf
\x{951}
0: \x{951}
/^\p{Script_Extensions=Gran}/utf
\x{11fd3}
0: \x{11fd3}
# Script extension only character
/^\p{Grantha}/utf
\x{951}
0: \x{951}
/^\p{sc=Grantha}/utf
\x{951}
No match
# Character not in script
/^\p{Grantha}/utf
\x{11fd4}
No match
# Base script check
/^\p{sc=Khojki}/utf
\x{11200}
0: \x{11200}
/^\p{Script=Khoj}/utf
\x{11241}
0: \x{11241}
# Script extension check
/^\p{Khojki}/utf
\x{ae6}
0: \x{ae6}
/^\p{scx=Khoj}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Khojki}/utf
\x{ae6}
0: \x{ae6}
/^\p{sc=Khojki}/utf
\x{ae6}
No match
# Character not in script
/^\p{Khojki}/utf
\x{11242}
No match
# Base script check
/^\p{sc=Linear_A}/utf
\x{10600}
0: \x{10600}
/^\p{Script=Lina}/utf
\x{10767}
0: \x{10767}
# Script extension check
/^\p{Linear_A}/utf
\x{10107}
0: \x{10107}
/^\p{Script_Extensions=Lina}/utf
\x{10133}
0: \x{10133}
# Script extension only character
/^\p{Linear_A}/utf
\x{10107}
0: \x{10107}
/^\p{sc=Linear_A}/utf
\x{10107}
No match
# Character not in script
/^\p{Linear_A}/utf
\x{10768}
No match
# Base script check
/^\p{sc=Mahajani}/utf
\x{11150}
0: \x{11150}
/^\p{Script=Mahj}/utf
\x{11176}
0: \x{11176}
# Script extension check
/^\p{Mahajani}/utf
\x{b7}
0: \x{b7}
/^\p{scx=Mahj}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Mahajani}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Mahajani}/utf
\x{b7}
No match
# Character not in script
/^\p{Mahajani}/utf
\x{11177}
No match
# Base script check
/^\p{sc=Manichaean}/utf
\x{10ac0}
0: \x{10ac0}
/^\p{Script=Mani}/utf
\x{10af6}
0: \x{10af6}
# Script extension check
/^\p{Manichaean}/utf
\x{640}
0: \x{640}
/^\p{Script_Extensions=Mani}/utf
\x{10af2}
0: \x{10af2}
# Script extension only character
/^\p{Manichaean}/utf
\x{640}
0: \x{640}
/^\p{sc=Manichaean}/utf
\x{640}
No match
# Character not in script
/^\p{Manichaean}/utf
\x{10af7}
No match
# Base script check
/^\p{sc=Modi}/utf
\x{11600}
0: \x{11600}
/^\p{Script=Modi}/utf
\x{11659}
0: \x{11659}
# Script extension check
/^\p{Modi}/utf
\x{a830}
0: \x{a830}
/^\p{scx=Modi}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Modi}/utf
\x{a830}
0: \x{a830}
/^\p{sc=Modi}/utf
\x{a830}
No match
# Character not in script
/^\p{Modi}/utf
\x{1165a}
No match
# Base script check
/^\p{sc=Old_Permic}/utf
\x{10350}
0: \x{10350}
/^\p{Script=Perm}/utf
\x{1037a}
0: \x{1037a}
# Script extension check
/^\p{Old_Permic}/utf
\x{b7}
0: \x{b7}
/^\p{Script_Extensions=Perm}/utf
\x{483}
0: \x{483}
# Script extension only character
/^\p{Old_Permic}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Old_Permic}/utf
\x{b7}
No match
# Character not in script
/^\p{Old_Permic}/utf
\x{1037b}
No match
# Base script check
/^\p{sc=Psalter_Pahlavi}/utf
\x{10b80}
0: \x{10b80}
/^\p{Script=Phlp}/utf
\x{10baf}
0: \x{10baf}
# Script extension check
/^\p{Psalter_Pahlavi}/utf
\x{640}
0: \x{640}
/^\p{scx=Phlp}/utf
\x{640}
0: \x{640}
# Script extension only character
/^\p{Psalter_Pahlavi}/utf
\x{640}
0: \x{640}
/^\p{sc=Psalter_Pahlavi}/utf
\x{640}
No match
# Character not in script
/^\p{Psalter_Pahlavi}/utf
\x{10bb0}
No match
# Base script check
/^\p{sc=Khudawadi}/utf
\x{112b0}
0: \x{112b0}
/^\p{Script=Sind}/utf
\x{112f9}
0: \x{112f9}
# Script extension check
/^\p{Khudawadi}/utf
\x{964}
0: \x{964}
/^\p{Script_Extensions=Sind}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Khudawadi}/utf
\x{964}
0: \x{964}
/^\p{sc=Khudawadi}/utf
\x{964}
No match
# Character not in script
/^\p{Khudawadi}/utf
\x{112fa}
No match
# Base script check
/^\p{sc=Tirhuta}/utf
\x{11480}
0: \x{11480}
/^\p{Script=Tirh}/utf
\x{114d9}
0: \x{114d9}
# Script extension check
/^\p{Tirhuta}/utf
\x{951}
0: \x{951}
/^\p{scx=Tirh}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Tirhuta}/utf
\x{951}
0: \x{951}
/^\p{sc=Tirhuta}/utf
\x{951}
No match
# Character not in script
/^\p{Tirhuta}/utf
\x{114da}
No match
# Base script check
/^\p{sc=Multani}/utf
\x{11280}
0: \x{11280}
/^\p{Script=Mult}/utf
\x{112a9}
0: \x{112a9}
# Script extension check
/^\p{Multani}/utf
\x{a66}
0: \x{a66}
/^\p{Script_Extensions=Mult}/utf
\x{a6f}
0: \x{a6f}
# Script extension only character
/^\p{Multani}/utf
\x{a66}
0: \x{a66}
/^\p{sc=Multani}/utf
\x{a66}
No match
# Character not in script
/^\p{Multani}/utf
\x{112aa}
No match
# Base script check
/^\p{sc=Old_Hungarian}/utf
\x{10c80}
0: \x{10c80}
/^\p{Script=Hung}/utf
\x{10cff}
0: \x{10cff}
# Script extension check
/^\p{Old_Hungarian}/utf
\x{205a}
0: \x{205a}
/^\p{scx=Hung}/utf
\x{2e41}
0: \x{2e41}
# Script extension only character
/^\p{Old_Hungarian}/utf
\x{205a}
0: \x{205a}
/^\p{sc=Old_Hungarian}/utf
\x{205a}
No match
# Character not in script
/^\p{Old_Hungarian}/utf
\x{10d00}
No match
# Base script check
/^\p{sc=Adlam}/utf
\x{1e900}
0: \x{1e900}
/^\p{Script=Adlm}/utf
\x{1e95f}
0: \x{1e95f}
# Script extension check
/^\p{Adlam}/utf
\x{61f}
0: \x{61f}
/^\p{Script_Extensions=Adlm}/utf
\x{2e41}
0: \x{2e41}
# Script extension only character
/^\p{Adlam}/utf
\x{61f}
0: \x{61f}
/^\p{sc=Adlam}/utf
\x{61f}
No match
# Character not in script
/^\p{Adlam}/utf
\x{1e960}
No match
# Base script check
/^\p{sc=Osage}/utf
\x{104b0}
0: \x{104b0}
/^\p{Script=Osge}/utf
\x{104fb}
0: \x{104fb}
# Script extension check
/^\p{Osage}/utf
\x{301}
0: \x{301}
/^\p{scx=Osge}/utf
\x{358}
0: \x{358}
# Script extension only character
/^\p{Osage}/utf
\x{301}
0: \x{301}
/^\p{sc=Osage}/utf
\x{301}
No match
# Character not in script
/^\p{Osage}/utf
\x{104fc}
No match
# Base script check
/^\p{sc=Tangut}/utf
\x{16fe0}
0: \x{16fe0}
/^\p{Script=Tang}/utf
\x{18d08}
0: \x{18d08}
# Script extension check
/^\p{Tangut}/utf
\x{2ff0}
0: \x{2ff0}
/^\p{Script_Extensions=Tang}/utf
\x{31ef}
0: \x{31ef}
# Script extension only character
/^\p{Tangut}/utf
\x{2ff0}
0: \x{2ff0}
/^\p{sc=Tangut}/utf
\x{2ff0}
No match
# Character not in script
/^\p{Tangut}/utf
\x{18d09}
No match
# Base script check
/^\p{sc=Masaram_Gondi}/utf
\x{11d00}
0: \x{11d00}
/^\p{Script=Gonm}/utf
\x{11d59}
0: \x{11d59}
# Script extension check
/^\p{Masaram_Gondi}/utf
\x{964}
0: \x{964}
/^\p{scx=Gonm}/utf
\x{965}
0: \x{965}
# Script extension only character
/^\p{Masaram_Gondi}/utf
\x{964}
0: \x{964}
/^\p{sc=Masaram_Gondi}/utf
\x{964}
No match
# Character not in script
/^\p{Masaram_Gondi}/utf
\x{11d5a}
No match
# Base script check
/^\p{sc=Dogra}/utf
\x{11800}
0: \x{11800}
/^\p{Script=Dogr}/utf
\x{1183b}
0: \x{1183b}
# Script extension check
/^\p{Dogra}/utf
\x{964}
0: \x{964}
/^\p{Script_Extensions=Dogr}/utf
\x{a839}
0: \x{a839}
# Script extension only character
/^\p{Dogra}/utf
\x{964}
0: \x{964}
/^\p{sc=Dogra}/utf
\x{964}
No match
# Character not in script
/^\p{Dogra}/utf
\x{1183c}
No match
# Base script check
/^\p{sc=Gunjala_Gondi}/utf
\x{11d60}
0: \x{11d60}
/^\p{Script=Gong}/utf
\x{11da9}
0: \x{11da9}
# Script extension check
/^\p{Gunjala_Gondi}/utf
\x{b7}
0: \x{b7}
/^\p{scx=Gong}/utf
\x{965}
0: \x{965}
# Script extension only character
/^\p{Gunjala_Gondi}/utf
\x{b7}
0: \x{b7}
/^\p{sc=Gunjala_Gondi}/utf
\x{b7}
No match
# Character not in script
/^\p{Gunjala_Gondi}/utf
\x{11daa}
No match
# Base script check
/^\p{sc=Hanifi_Rohingya}/utf
\x{10d00}
0: \x{10d00}
/^\p{Script=Rohg}/utf
\x{10d39}
0: \x{10d39}
# Script extension check
/^\p{Hanifi_Rohingya}/utf
\x{60c}
0: \x{60c}
/^\p{Script_Extensions=Rohg}/utf
\x{6d4}
0: \x{6d4}
# Script extension only character
/^\p{Hanifi_Rohingya}/utf
\x{60c}
0: \x{60c}
/^\p{sc=Hanifi_Rohingya}/utf
\x{60c}
No match
# Character not in script
/^\p{Hanifi_Rohingya}/utf
\x{10d3a}
No match
# Base script check
/^\p{sc=Sogdian}/utf
\x{10f30}
0: \x{10f30}
/^\p{Script=Sogd}/utf
\x{10f59}
0: \x{10f59}
# Script extension check
/^\p{Sogdian}/utf
\x{640}
0: \x{640}
/^\p{scx=Sogd}/utf
\x{640}
0: \x{640}
# Script extension only character
/^\p{Sogdian}/utf
\x{640}
0: \x{640}
/^\p{sc=Sogdian}/utf
\x{640}
No match
# Character not in script
/^\p{Sogdian}/utf
\x{10f5a}
No match
# Base script check
/^\p{sc=Nandinagari}/utf
\x{119a0}
0: \x{119a0}
/^\p{Script=Nand}/utf
\x{119e4}
0: \x{119e4}
# Script extension check
/^\p{Nandinagari}/utf
\x{964}
0: \x{964}
/^\p{Script_Extensions=Nand}/utf
\x{a835}
0: \x{a835}
# Script extension only character
/^\p{Nandinagari}/utf
\x{964}
0: \x{964}
/^\p{sc=Nandinagari}/utf
\x{964}
No match
# Character not in script
/^\p{Nandinagari}/utf
\x{119e5}
No match
# Base script check
/^\p{sc=Yezidi}/utf
\x{10e80}
0: \x{10e80}
/^\p{Script=Yezi}/utf
\x{10eb1}
0: \x{10eb1}
# Script extension check
/^\p{Yezidi}/utf
\x{60c}
0: \x{60c}
/^\p{scx=Yezi}/utf
\x{669}
0: \x{669}
# Script extension only character
/^\p{Yezidi}/utf
\x{60c}
0: \x{60c}
/^\p{sc=Yezidi}/utf
\x{60c}
No match
# Character not in script
/^\p{Yezidi}/utf
\x{10eb2}
No match
# Base script check
/^\p{sc=Cypro_Minoan}/utf
\x{12f90}
0: \x{12f90}
/^\p{Script=Cpmn}/utf
\x{12ff2}
0: \x{12ff2}
# Script extension check
/^\p{Cypro_Minoan}/utf
\x{10100}
0: \x{10100}
/^\p{Script_Extensions=Cpmn}/utf
\x{10101}
0: \x{10101}
# Script extension only character
/^\p{Cypro_Minoan}/utf
\x{10100}
0: \x{10100}
/^\p{sc=Cypro_Minoan}/utf
\x{10100}
No match
# Character not in script
/^\p{Cypro_Minoan}/utf
\x{12ff3}
No match
# Base script check
/^\p{sc=Old_Uyghur}/utf
\x{10f70}
0: \x{10f70}
/^\p{Script=Ougr}/utf
\x{10f89}
0: \x{10f89}
# Script extension check
/^\p{Old_Uyghur}/utf
\x{640}
0: \x{640}
/^\p{scx=Ougr}/utf
\x{10af2}
0: \x{10af2}
# Script extension only character
/^\p{Old_Uyghur}/utf
\x{640}
0: \x{640}
/^\p{sc=Old_Uyghur}/utf
\x{640}
No match
# Character not in script
/^\p{Old_Uyghur}/utf
\x{10f8a}
No match
# Base script check
/^\p{sc=Toto}/utf
\x{1e290}
0: \x{1e290}
/^\p{Script=Toto}/utf
\x{1e2ae}
0: \x{1e2ae}
# Script extension check
/^\p{Toto}/utf
\x{2bc}
0: \x{2bc}
/^\p{Script_Extensions=Toto}/utf
\x{2bc}
0: \x{2bc}
# Script extension only character
/^\p{Toto}/utf
\x{2bc}
0: \x{2bc}
/^\p{sc=Toto}/utf
\x{2bc}
No match
# Character not in script
/^\p{Toto}/utf
\x{1e2af}
No match
# Base script check
/^\p{sc=Garay}/utf
\x{10d40}
0: \x{10d40}
/^\p{Script=Gara}/utf
\x{10d8f}
0: \x{10d8f}
# Script extension check
/^\p{Garay}/utf
\x{60c}
0: \x{60c}
/^\p{scx=Gara}/utf
\x{61f}
0: \x{61f}
# Script extension only character
/^\p{Garay}/utf
\x{60c}
0: \x{60c}
/^\p{sc=Garay}/utf
\x{60c}
No match
# Character not in script
/^\p{Garay}/utf
\x{10d90}
No match
# Base script check
/^\p{sc=Gurung_Khema}/utf
\x{16100}
0: \x{16100}
/^\p{Script=Gukh}/utf
\x{16139}
0: \x{16139}
# Script extension check
/^\p{Gurung_Khema}/utf
\x{965}
0: \x{965}
/^\p{Script_Extensions=Gukh}/utf
\x{965}
0: \x{965}
# Script extension only character
/^\p{Gurung_Khema}/utf
\x{965}
0: \x{965}
/^\p{sc=Gurung_Khema}/utf
\x{965}
No match
# Character not in script
/^\p{Gurung_Khema}/utf
\x{1613a}
No match
# Base script check
/^\p{sc=Ol_Onal}/utf
\x{1e5d0}
0: \x{1e5d0}
/^\p{Script=Onao}/utf
\x{1e5ff}
0: \x{1e5ff}
# Script extension check
/^\p{Ol_Onal}/utf
\x{964}
0: \x{964}
/^\p{scx=Onao}/utf
\x{965}
0: \x{965}
# Script extension only character
/^\p{Ol_Onal}/utf
\x{964}
0: \x{964}
/^\p{sc=Ol_Onal}/utf
\x{964}
No match
# Character not in script
/^\p{Ol_Onal}/utf
\x{1e600}
No match
# Base script check
/^\p{sc=Sunuwar}/utf
\x{11bc0}
0: \x{11bc0}
/^\p{Script=Sunu}/utf
\x{11bf9}
0: \x{11bf9}
# Script extension check
/^\p{Sunuwar}/utf
\x{300}
0: \x{300}
/^\p{Script_Extensions=Sunu}/utf
\x{331}
0: \x{331}
# Script extension only character
/^\p{Sunuwar}/utf
\x{300}
0: \x{300}
/^\p{sc=Sunuwar}/utf
\x{300}
No match
# Character not in script
/^\p{Sunuwar}/utf
\x{11bfa}
No match
# Base script check
/^\p{sc=Todhri}/utf
\x{105c0}
0: \x{105c0}
/^\p{Script=Todr}/utf
\x{105f3}
0: \x{105f3}
# Script extension check
/^\p{Todhri}/utf
\x{301}
0: \x{301}
/^\p{scx=Todr}/utf
\x{35e}
0: \x{35e}
# Script extension only character
/^\p{Todhri}/utf
\x{301}
0: \x{301}
/^\p{sc=Todhri}/utf
\x{301}
No match
# Character not in script
/^\p{Todhri}/utf
\x{105f4}
No match
# Base script check
/^\p{sc=Tulu_Tigalari}/utf
\x{11380}
0: \x{11380}
/^\p{Script=Tutg}/utf
\x{113e2}
0: \x{113e2}
# Script extension check
/^\p{Tulu_Tigalari}/utf
\x{ce6}
0: \x{ce6}
/^\p{Script_Extensions=Tutg}/utf
\x{a8f1}
0: \x{a8f1}
# Script extension only character
/^\p{Tulu_Tigalari}/utf
\x{ce6}
0: \x{ce6}
/^\p{sc=Tulu_Tigalari}/utf
\x{ce6}
No match
# Character not in script
/^\p{Tulu_Tigalari}/utf
\x{113e3}
No match
# Base script check
/^\p{sc=Common}/utf
\x{00}
0: \x{00}
/^\p{Script=Zyyy}/utf
\x{e007f}
0: \x{e007f}
# Character not in script
/^\p{Common}/utf
\x{e0080}
No match
# Base script check
/^\p{sc=Lao}/utf
\x{e81}
0: \x{e81}
/^\p{Script=Laoo}/utf
\x{edf}
0: \x{edf}
# Character not in script
/^\p{Lao}/utf
\x{ee0}
No match
# Base script check
/^\p{sc=Canadian_Aboriginal}/utf
\x{1400}
0: \x{1400}
/^\p{Script=Cans}/utf
\x{11abf}
0: \x{11abf}
# Character not in script
/^\p{Canadian_Aboriginal}/utf
\x{11ac0}
No match
# Base script check
/^\p{sc=Ogham}/utf
\x{1680}
0: \x{1680}
/^\p{Script=Ogam}/utf
\x{169c}
0: \x{169c}
# Character not in script
/^\p{Ogham}/utf
\x{169d}
No match
# Base script check
/^\p{sc=Khmer}/utf
\x{1780}
0: \x{1780}
/^\p{Script=Khmr}/utf
\x{19ff}
0: \x{19ff}
# Character not in script
/^\p{Khmer}/utf
\x{1a00}
No match
# Base script check
/^\p{sc=Old_Italic}/utf
\x{10300}
0: \x{10300}
/^\p{Script=Ital}/utf
\x{1032f}
0: \x{1032f}
# Character not in script
/^\p{Old_Italic}/utf
\x{10330}
No match
# Base script check
/^\p{sc=Deseret}/utf
\x{10400}
0: \x{10400}
/^\p{Script=Dsrt}/utf
\x{1044f}
0: \x{1044f}
# Character not in script
/^\p{Deseret}/utf
\x{10450}
No match
# Base script check
/^\p{sc=Inherited}/utf
\x{300}
0: \x{300}
/^\p{Script=Zinh}/utf
\x{e01ef}
0: \x{e01ef}
# Character not in script
/^\p{Inherited}/utf
\x{e01f0}
No match
# Base script check
/^\p{sc=Ugaritic}/utf
\x{10380}
0: \x{10380}
/^\p{Script=Ugar}/utf
\x{1039f}
0: \x{1039f}
# Character not in script
/^\p{Ugaritic}/utf
\x{103a0}
No match
# Base script check
/^\p{sc=Osmanya}/utf
\x{10480}
0: \x{10480}
/^\p{Script=Osma}/utf
\x{104a9}
0: \x{104a9}
# Character not in script
/^\p{Osmanya}/utf
\x{104aa}
No match
# Base script check
/^\p{sc=Braille}/utf
\x{2800}
0: \x{2800}
/^\p{Script=Brai}/utf
\x{28ff}
0: \x{28ff}
# Character not in script
/^\p{Braille}/utf
\x{2900}
No match
# Base script check
/^\p{sc=New_Tai_Lue}/utf
\x{1980}
0: \x{1980}
/^\p{Script=Talu}/utf
\x{19df}
0: \x{19df}
# Character not in script
/^\p{New_Tai_Lue}/utf
\x{19e0}
No match
# Base script check
/^\p{sc=Old_Persian}/utf
\x{103a0}
0: \x{103a0}
/^\p{Script=Xpeo}/utf
\x{103d5}
0: \x{103d5}
# Character not in script
/^\p{Old_Persian}/utf
\x{103d6}
No match
# Base script check
/^\p{sc=Kharoshthi}/utf
\x{10a00}
0: \x{10a00}
/^\p{Script=Khar}/utf
\x{10a58}
0: \x{10a58}
# Character not in script
/^\p{Kharoshthi}/utf
\x{10a59}
No match
# Base script check
/^\p{sc=Balinese}/utf
\x{1b00}
0: \x{1b00}
/^\p{Script=Bali}/utf
\x{1b7f}
0: \x{1b7f}
# Character not in script
/^\p{Balinese}/utf
\x{1b80}
No match
# Base script check
/^\p{sc=Cuneiform}/utf
\x{12000}
0: \x{12000}
/^\p{Script=Xsux}/utf
\x{12543}
0: \x{12543}
# Character not in script
/^\p{Cuneiform}/utf
\x{12544}
No match
# Base script check
/^\p{sc=Phoenician}/utf
\x{10900}
0: \x{10900}
/^\p{Script=Phnx}/utf
\x{1091f}
0: \x{1091f}
# Character not in script
/^\p{Phoenician}/utf
\x{10920}
No match
# Base script check
/^\p{sc=Sundanese}/utf
\x{1b80}
0: \x{1b80}
/^\p{Script=Sund}/utf
\x{1cc7}
0: \x{1cc7}
# Character not in script
/^\p{Sundanese}/utf
\x{1cc8}
No match
# Base script check
/^\p{sc=Lepcha}/utf
\x{1c00}
0: \x{1c00}
/^\p{Script=Lepc}/utf
\x{1c4f}
0: \x{1c4f}
# Character not in script
/^\p{Lepcha}/utf
\x{1c50}
No match
# Base script check
/^\p{sc=Ol_Chiki}/utf
\x{1c50}
0: \x{1c50}
/^\p{Script=Olck}/utf
\x{1c7f}
0: \x{1c7f}
# Character not in script
/^\p{Ol_Chiki}/utf
\x{1c80}
No match
# Base script check
/^\p{sc=Vai}/utf
\x{a500}
0: \x{a500}
/^\p{Script=Vaii}/utf
\x{a62b}
0: \x{a62b}
# Character not in script
/^\p{Vai}/utf
\x{a62c}
No match
# Base script check
/^\p{sc=Saurashtra}/utf
\x{a880}
0: \x{a880}
/^\p{Script=Saur}/utf
\x{a8d9}
0: \x{a8d9}
# Character not in script
/^\p{Saurashtra}/utf
\x{a8da}
No match
# Base script check
/^\p{sc=Rejang}/utf
\x{a930}
0: \x{a930}
/^\p{Script=Rjng}/utf
\x{a95f}
0: \x{a95f}
# Character not in script
/^\p{Rejang}/utf
\x{a960}
No match
# Base script check
/^\p{sc=Cham}/utf
\x{aa00}
0: \x{aa00}
/^\p{Script=Cham}/utf
\x{aa5f}
0: \x{aa5f}
# Character not in script
/^\p{Cham}/utf
\x{aa60}
No match
# Base script check
/^\p{sc=Tai_Tham}/utf
\x{1a20}
0: \x{1a20}
/^\p{Script=Lana}/utf
\x{1aad}
0: \x{1aad}
# Character not in script
/^\p{Tai_Tham}/utf
\x{1aae}
No match
# Base script check
/^\p{sc=Tai_Viet}/utf
\x{aa80}
0: \x{aa80}
/^\p{Script=Tavt}/utf
\x{aadf}
0: \x{aadf}
# Character not in script
/^\p{Tai_Viet}/utf
\x{aae0}
No match
# Base script check
/^\p{sc=Egyptian_Hieroglyphs}/utf
\x{13000}
0: \x{13000}
/^\p{Script=Egyp}/utf
\x{143fa}
0: \x{143fa}
# Character not in script
/^\p{Egyptian_Hieroglyphs}/utf
\x{143fb}
No match
# Base script check
/^\p{sc=Bamum}/utf
\x{a6a0}
0: \x{a6a0}
/^\p{Script=Bamu}/utf
\x{16a38}
0: \x{16a38}
# Character not in script
/^\p{Bamum}/utf
\x{16a39}
No match
# Base script check
/^\p{sc=Meetei_Mayek}/utf
\x{aae0}
0: \x{aae0}
/^\p{Script=Mtei}/utf
\x{abf9}
0: \x{abf9}
# Character not in script
/^\p{Meetei_Mayek}/utf
\x{abfa}
No match
# Base script check
/^\p{sc=Imperial_Aramaic}/utf
\x{10840}
0: \x{10840}
/^\p{Script=Armi}/utf
\x{1085f}
0: \x{1085f}
# Character not in script
/^\p{Imperial_Aramaic}/utf
\x{10860}
No match
# Base script check
/^\p{sc=Old_South_Arabian}/utf
\x{10a60}
0: \x{10a60}
/^\p{Script=Sarb}/utf
\x{10a7f}
0: \x{10a7f}
# Character not in script
/^\p{Old_South_Arabian}/utf
\x{10a80}
No match
# Base script check
/^\p{sc=Inscriptional_Parthian}/utf
\x{10b40}
0: \x{10b40}
/^\p{Script=Prti}/utf
\x{10b5f}
0: \x{10b5f}
# Character not in script
/^\p{Inscriptional_Parthian}/utf
\x{10b60}
No match
# Base script check
/^\p{sc=Inscriptional_Pahlavi}/utf
\x{10b60}
0: \x{10b60}
/^\p{Script=Phli}/utf
\x{10b7f}
0: \x{10b7f}
# Character not in script
/^\p{Inscriptional_Pahlavi}/utf
\x{10b80}
No match
# Base script check
/^\p{sc=Batak}/utf
\x{1bc0}
0: \x{1bc0}
/^\p{Script=Batk}/utf
\x{1bff}
0: \x{1bff}
# Character not in script
/^\p{Batak}/utf
\x{1c00}
No match
# Base script check
/^\p{sc=Brahmi}/utf
\x{11000}
0: \x{11000}
/^\p{Script=Brah}/utf
\x{1107f}
0: \x{1107f}
# Character not in script
/^\p{Brahmi}/utf
\x{11080}
No match
# Base script check
/^\p{sc=Meroitic_Cursive}/utf
\x{109a0}
0: \x{109a0}
/^\p{Script=Merc}/utf
\x{109ff}
0: \x{109ff}
# Character not in script
/^\p{Meroitic_Cursive}/utf
\x{10a00}
No match
# Base script check
/^\p{sc=Miao}/utf
\x{16f00}
0: \x{16f00}
/^\p{Script=Plrd}/utf
\x{16f9f}
0: \x{16f9f}
# Character not in script
/^\p{Miao}/utf
\x{16fa0}
No match
# Base script check
/^\p{sc=Sora_Sompeng}/utf
\x{110d0}
0: \x{110d0}
/^\p{Script=Sora}/utf
\x{110f9}
0: \x{110f9}
# Character not in script
/^\p{Sora_Sompeng}/utf
\x{110fa}
No match
# Base script check
/^\p{sc=Bassa_Vah}/utf
\x{16ad0}
0: \x{16ad0}
/^\p{Script=Bass}/utf
\x{16af5}
0: \x{16af5}
# Character not in script
/^\p{Bassa_Vah}/utf
\x{16af6}
No match
# Base script check
/^\p{sc=Pahawh_Hmong}/utf
\x{16b00}
0: \x{16b00}
/^\p{Script=Hmng}/utf
\x{16b8f}
0: \x{16b8f}
# Character not in script
/^\p{Pahawh_Hmong}/utf
\x{16b90}
No match
# Base script check
/^\p{sc=Mende_Kikakui}/utf
\x{1e800}
0: \x{1e800}
/^\p{Script=Mend}/utf
\x{1e8d6}
0: \x{1e8d6}
# Character not in script
/^\p{Mende_Kikakui}/utf
\x{1e8d7}
No match
# Base script check
/^\p{sc=Mro}/utf
\x{16a40}
0: \x{16a40}
/^\p{Script=Mroo}/utf
\x{16a6f}
0: \x{16a6f}
# Character not in script
/^\p{Mro}/utf
\x{16a70}
No match
# Base script check
/^\p{sc=Old_North_Arabian}/utf
\x{10a80}
0: \x{10a80}
/^\p{Script=Narb}/utf
\x{10a9f}
0: \x{10a9f}
# Character not in script
/^\p{Old_North_Arabian}/utf
\x{10aa0}
No match
# Base script check
/^\p{sc=Nabataean}/utf
\x{10880}
0: \x{10880}
/^\p{Script=Nbat}/utf
\x{108af}
0: \x{108af}
# Character not in script
/^\p{Nabataean}/utf
\x{108b0}
No match
# Base script check
/^\p{sc=Palmyrene}/utf
\x{10860}
0: \x{10860}
/^\p{Script=Palm}/utf
\x{1087f}
0: \x{1087f}
# Character not in script
/^\p{Palmyrene}/utf
\x{10880}
No match
# Base script check
/^\p{sc=Pau_Cin_Hau}/utf
\x{11ac0}
0: \x{11ac0}
/^\p{Script=Pauc}/utf
\x{11af8}
0: \x{11af8}
# Character not in script
/^\p{Pau_Cin_Hau}/utf
\x{11af9}
No match
# Base script check
/^\p{sc=Siddham}/utf
\x{11580}
0: \x{11580}
/^\p{Script=Sidd}/utf
\x{115dd}
0: \x{115dd}
# Character not in script
/^\p{Siddham}/utf
\x{115de}
No match
# Base script check
/^\p{sc=Warang_Citi}/utf
\x{118a0}
0: \x{118a0}
/^\p{Script=Wara}/utf
\x{118ff}
0: \x{118ff}
# Character not in script
/^\p{Warang_Citi}/utf
\x{11900}
No match
# Base script check
/^\p{sc=Ahom}/utf
\x{11700}
0: \x{11700}
/^\p{Script=Ahom}/utf
\x{11746}
0: \x{11746}
# Character not in script
/^\p{Ahom}/utf
\x{11747}
No match
# Base script check
/^\p{sc=Anatolian_Hieroglyphs}/utf
\x{14400}
0: \x{14400}
/^\p{Script=Hluw}/utf
\x{14646}
0: \x{14646}
# Character not in script
/^\p{Anatolian_Hieroglyphs}/utf
\x{14647}
No match
# Base script check
/^\p{sc=Hatran}/utf
\x{108e0}
0: \x{108e0}
/^\p{Script=Hatr}/utf
\x{108ff}
0: \x{108ff}
# Character not in script
/^\p{Hatran}/utf
\x{10900}
No match
# Base script check
/^\p{sc=SignWriting}/utf
\x{1d800}
0: \x{1d800}
/^\p{Script=Sgnw}/utf
\x{1daaf}
0: \x{1daaf}
# Character not in script
/^\p{SignWriting}/utf
\x{1dab0}
No match
# Base script check
/^\p{sc=Bhaiksuki}/utf
\x{11c00}
0: \x{11c00}
/^\p{Script=Bhks}/utf
\x{11c6c}
0: \x{11c6c}
# Character not in script
/^\p{Bhaiksuki}/utf
\x{11c6d}
No match
# Base script check
/^\p{sc=Marchen}/utf
\x{11c70}
0: \x{11c70}
/^\p{Script=Marc}/utf
\x{11cb6}
0: \x{11cb6}
# Character not in script
/^\p{Marchen}/utf
\x{11cb7}
No match
# Base script check
/^\p{sc=Newa}/utf
\x{11400}
0: \x{11400}
/^\p{Script=Newa}/utf
\x{11461}
0: \x{11461}
# Character not in script
/^\p{Newa}/utf
\x{11462}
No match
# Base script check
/^\p{sc=Nushu}/utf
\x{16fe1}
0: \x{16fe1}
/^\p{Script=Nshu}/utf
\x{1b2fb}
0: \x{1b2fb}
# Character not in script
/^\p{Nushu}/utf
\x{1b2fc}
No match
# Base script check
/^\p{sc=Soyombo}/utf
\x{11a50}
0: \x{11a50}
/^\p{Script=Soyo}/utf
\x{11aa2}
0: \x{11aa2}
# Character not in script
/^\p{Soyombo}/utf
\x{11aa3}
No match
# Base script check
/^\p{sc=Zanabazar_Square}/utf
\x{11a00}
0: \x{11a00}
/^\p{Script=Zanb}/utf
\x{11a47}
0: \x{11a47}
# Character not in script
/^\p{Zanabazar_Square}/utf
\x{11a48}
No match
# Base script check
/^\p{sc=Makasar}/utf
\x{11ee0}
0: \x{11ee0}
/^\p{Script=Maka}/utf
\x{11ef8}
0: \x{11ef8}
# Character not in script
/^\p{Makasar}/utf
\x{11ef9}
No match
# Base script check
/^\p{sc=Medefaidrin}/utf
\x{16e40}
0: \x{16e40}
/^\p{Script=Medf}/utf
\x{16e9a}
0: \x{16e9a}
# Character not in script
/^\p{Medefaidrin}/utf
\x{16e9b}
No match
# Base script check
/^\p{sc=Old_Sogdian}/utf
\x{10f00}
0: \x{10f00}
/^\p{Script=Sogo}/utf
\x{10f27}
0: \x{10f27}
# Character not in script
/^\p{Old_Sogdian}/utf
\x{10f28}
No match
# Base script check
/^\p{sc=Elymaic}/utf
\x{10fe0}
0: \x{10fe0}
/^\p{Script=Elym}/utf
\x{10ff6}
0: \x{10ff6}
# Character not in script
/^\p{Elymaic}/utf
\x{10ff7}
No match
# Base script check
/^\p{sc=Nyiakeng_Puachue_Hmong}/utf
\x{1e100}
0: \x{1e100}
/^\p{Script=Hmnp}/utf
\x{1e14f}
0: \x{1e14f}
# Character not in script
/^\p{Nyiakeng_Puachue_Hmong}/utf
\x{1e150}
No match
# Base script check
/^\p{sc=Wancho}/utf
\x{1e2c0}
0: \x{1e2c0}
/^\p{Script=Wcho}/utf
\x{1e2ff}
0: \x{1e2ff}
# Character not in script
/^\p{Wancho}/utf
\x{1e300}
No match
# Base script check
/^\p{sc=Chorasmian}/utf
\x{10fb0}
0: \x{10fb0}
/^\p{Script=Chrs}/utf
\x{10fcb}
0: \x{10fcb}
# Character not in script
/^\p{Chorasmian}/utf
\x{10fcc}
No match
# Base script check
/^\p{sc=Dives_Akuru}/utf
\x{11900}
0: \x{11900}
/^\p{Script=Diak}/utf
\x{11959}
0: \x{11959}
# Character not in script
/^\p{Dives_Akuru}/utf
\x{1195a}
No match
# Base script check
/^\p{sc=Khitan_Small_Script}/utf
\x{16fe4}
0: \x{16fe4}
/^\p{Script=Kits}/utf
\x{18cff}
0: \x{18cff}
# Character not in script
/^\p{Khitan_Small_Script}/utf
\x{18d00}
No match
# Base script check
/^\p{sc=Tangsa}/utf
\x{16a70}
0: \x{16a70}
/^\p{Script=Tnsa}/utf
\x{16ac9}
0: \x{16ac9}
# Character not in script
/^\p{Tangsa}/utf
\x{16aca}
No match
# Base script check
/^\p{sc=Vithkuqi}/utf
\x{10570}
0: \x{10570}
/^\p{Script=Vith}/utf
\x{105bc}
0: \x{105bc}
# Character not in script
/^\p{Vithkuqi}/utf
\x{105bd}
No match
# Base script check
/^\p{sc=Kawi}/utf
\x{11f00}
0: \x{11f00}
/^\p{Script=Kawi}/utf
\x{11f5a}
0: \x{11f5a}
# Character not in script
/^\p{Kawi}/utf
\x{11f5b}
No match
# Base script check
/^\p{sc=Nag_Mundari}/utf
\x{1e4d0}
0: \x{1e4d0}
/^\p{Script=Nagm}/utf
\x{1e4f9}
0: \x{1e4f9}
# Character not in script
/^\p{Nag_Mundari}/utf
\x{1e4fa}
No match
# Base script check
/^\p{sc=Kirat_Rai}/utf
\x{16d40}
0: \x{16d40}
/^\p{Script=Krai}/utf
\x{16d79}
0: \x{16d79}
# Character not in script
/^\p{Kirat_Rai}/utf
\x{16d7a}
No match
# End of test

177
3rd/pcre2/testdata/testoutput3 vendored Normal file

@@ -0,0 +1,177 @@
# This set of tests checks locale-specific features, using the "fr_FR" locale.
# It is almost Perl-compatible. When run via RunTest, the locale is edited to
# be whichever of "fr_FR", "french", or "fr" is found to exist. There is a
# different version of this file called wintestinput3 for use on Windows,
# where the locale is called "french" and the tests are run using
# RunTest.bat.
#forbid_utf
/^[\w]+/
\= Expect no match
École
No match
/^[\w]+/locale=fr_FR
École
0: École
/^[\W]+/
École
0: \xc9
/^[\W]+/locale=fr_FR
\= Expect no match
École
No match
/[\b]/
\b
0: \x08
\= Expect no match
a
No match
/[\b]/locale=fr_FR
\b
0: \x08
\= Expect no match
a
No match
/^\w+/
\= Expect no match
École
No match
/^\w+/locale=fr_FR
École
0: École
/(.+)\b(.+)/
École
0: \xc9cole
1: \xc9
2: cole
/(.+)\b(.+)/locale=fr_FR
\= Expect no match
École
No match
/École/i
École
0: \xc9cole
\= Expect no match
école
No match
/École/i,locale=fr_FR
École
0: École
école
0: école
/\w/I
Capture group count = 0
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
Subject length lower bound = 1
/\w/I,locale=fr_FR
Capture group count = 0
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
ª µ º À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â
ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ø ù ú û ü ý þ ÿ
Subject length lower bound = 1
# All remaining tests are in the fr_FR locale, so set the default.
#pattern locale=fr_FR
/^[\xc8-\xc9]/i
École
0: É
école
0: é
/^[\xc8-\xc9]/
École
0: É
\= Expect no match
école
No match
/\xb5/i
µ
0: µ
\= Expect no match
\x9c
No match
/ÿ/i
\xff
0: ÿ
\= Expect no match
y
No match
/(.)\1/i
\xfe\xde
0: þÞ
1: þ
/\W+/
>>>\xaa<<<
0: >>>
>>>\xba<<<
0: >>>
/[\W]+/
>>>\xaa<<<
0: >>>
>>>\xba<<<
0: >>>
/[^[:alpha:]]+/
>>>\xaa<<<
0: >>>
>>>\xba<<<
0: >>>
/\w+/
>>>\xaa<<<
0: ª
>>>\xba<<<
0: º
/[\w]+/
>>>\xaa<<<
0: ª
>>>\xba<<<
0: º
/[[:alpha:]]+/
>>>\xaa<<<
0: ª
>>>\xba<<<
0: º
/[[:alpha:]][[:lower:]][[:upper:]]/IB
------------------------------------------------------------------
Bra
[A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff]
[a-z\xb5\xdf-\xf6\xf8-\xff]
[A-Z\xc0-\xd6\xd8-\xde]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Starting code units: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z ª µ º À Á Â Ã Ä Å Æ Ç
È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í
î ï ð ñ ò ó ô õ ö ø ù ú û ü ý þ ÿ
Subject length lower bound = 3
# End of testinput3

177
3rd/pcre2/testdata/testoutput3A vendored Normal file

@@ -0,0 +1,177 @@
# This set of tests checks locale-specific features, using the "fr_FR" locale.
# It is almost Perl-compatible. When run via RunTest, the locale is edited to
# be whichever of "fr_FR", "french", or "fr" is found to exist. There is a
# different version of this file called wintestinput3 for use on Windows,
# where the locale is called "french" and the tests are run using
# RunTest.bat.
#forbid_utf
/^[\w]+/
\= Expect no match
École
No match
/^[\w]+/locale=fr_FR
École
0: École
/^[\W]+/
École
0: \xc9
/^[\W]+/locale=fr_FR
\= Expect no match
École
No match
/[\b]/
\b
0: \x08
\= Expect no match
a
No match
/[\b]/locale=fr_FR
\b
0: \x08
\= Expect no match
a
No match
/^\w+/
\= Expect no match
École
No match
/^\w+/locale=fr_FR
École
0: École
/(.+)\b(.+)/
École
0: \xc9cole
1: \xc9
2: cole
/(.+)\b(.+)/locale=fr_FR
\= Expect no match
École
No match
/École/i
École
0: \xc9cole
\= Expect no match
école
No match
/École/i,locale=fr_FR
École
0: École
école
0: école
/\w/I
Capture group count = 0
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
Subject length lower bound = 1
/\w/I,locale=fr_FR
Capture group count = 0
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
<20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20>
<20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20>
Subject length lower bound = 1
# All remaining tests are in the fr_FR locale, so set the default.
#pattern locale=fr_FR
/^[\xc8-\xc9]/i
<20>cole
0: <20>
<20>cole
0: <20>
/^[\xc8-\xc9]/
<20>cole
0: <20>
\= Expect no match
<20>cole
No match
/\xb5/i
<20>
0: <20>
\= Expect no match
\x9c
No match
/<2F>/i
\xff
0: <20>
\= Expect no match
y
No match
/(.)\1/i
\xfe\xde
0: <20><>
1: <20>
/\W+/
>>>\xaa<<<
0: >>>
>>>\xba<<<
0: >>>
/[\W]+/
>>>\xaa<<<
0: >>>
>>>\xba<<<
0: >>>
/[^[:alpha:]]+/
>>>\xaa<<<
0: >>>
>>>\xba<<<
0: >>>
/\w+/
>>>\xaa<<<
0: <20>
>>>\xba<<<
0: <20>
/[\w]+/
>>>\xaa<<<
0: <20>
>>>\xba<<<
0: <20>
/[[:alpha:]]+/
>>>\xaa<<<
0: <20>
>>>\xba<<<
0: <20>
/[[:alpha:]][[:lower:]][[:upper:]]/IB
------------------------------------------------------------------
Bra
[A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff]
[a-z\xaa\xb5\xba\xdf-\xf6\xf8-\xff]
[A-Z\xc0-\xd6\xd8-\xde]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Starting code units: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20>
<20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20>
<20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20>
Subject length lower bound = 3
# End of testinput3

177
3rd/pcre2/testdata/testoutput3B vendored Normal file

@@ -0,0 +1,177 @@
# This set of tests checks local-specific features, using the "fr_FR" locale.
# It is almost Perl-compatible. When run via RunTest, the locale is edited to
# be whichever of "fr_FR", "french", or "fr" is found to exist. There is
# different version of this file called wintestinput3 for use on Windows,
# where the locale is called "french" and the tests are run using
# RunTest.bat.
#forbid_utf
/^[\w]+/
\= Expect no match
<20>cole
No match
/^[\w]+/locale=fr_FR
<20>cole
0: <20>cole
/^[\W]+/
<20>cole
0: \xc9
/^[\W]+/locale=fr_FR
\= Expect no match
<20>cole
No match
/[\b]/
\b
0: \x08
\= Expect no match
a
No match
/[\b]/locale=fr_FR
\b
0: \x08
\= Expect no match
a
No match
/^\w+/
\= Expect no match
<20>cole
No match
/^\w+/locale=fr_FR
<20>cole
0: <20>cole
/(.+)\b(.+)/
<20>cole
0: \xc9cole
1: \xc9
2: cole
/(.+)\b(.+)/locale=fr_FR
\= Expect no match
<20>cole
No match
/<2F>cole/i
<20>cole
0: \xc9cole
\= Expect no match
<20>cole
No match
/<2F>cole/i,locale=fr_FR
<20>cole
0: <20>cole
<20>cole
0: <20>cole
/\w/I
Capture group count = 0
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
Subject length lower bound = 1
/\w/I,locale=fr_FR
Capture group count = 0
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
<20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20>
<20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20>
Subject length lower bound = 1
# All remaining tests are in the fr_FR locale, so set the default.
#pattern locale=fr_FR
/^[\xc8-\xc9]/i
<20>cole
0: <20>
<20>cole
0: <20>
/^[\xc8-\xc9]/
<20>cole
0: <20>
\= Expect no match
<20>cole
No match
/\xb5/i
<20>
0: <20>
\= Expect no match
\x9c
No match
/<2F>/i
\xff
0: <20>
\= Expect no match
y
No match
/(.)\1/i
\xfe\xde
0: <20><>
1: <20>
/\W+/
>>>\xaa<<<
0: >>>
>>>\xba<<<
0: >>>
/[\W]+/
>>>\xaa<<<
0: >>>
>>>\xba<<<
0: >>>
/[^[:alpha:]]+/
>>>\xaa<<<
0: >>>
>>>\xba<<<
0: >>>
/\w+/
>>>\xaa<<<
0: <20>
>>>\xba<<<
0: <20>
/[\w]+/
>>>\xaa<<<
0: <20>
>>>\xba<<<
0: <20>
/[[:alpha:]]+/
>>>\xaa<<<
0: <20>
>>>\xba<<<
0: <20>
/[[:alpha:]][[:lower:]][[:upper:]]/IB
------------------------------------------------------------------
Bra
[A-Za-z\x83\x8a\x8c\x8e\x9a\x9c\x9e\x9f\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff]
[a-z\x83\x9a\x9c\x9e\xaa\xb5\xba\xdf-\xf6\xf8-\xff]
[A-Z\x8a\x8c\x8e\x9f\xc0-\xd6\xd8-\xde]
Ket
End
------------------------------------------------------------------
Capture group count = 0
Starting code units: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20>
<20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20>
<20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20> <20>
Subject length lower bound = 3
# End of testinput3

4991
3rd/pcre2/testdata/testoutput4 vendored Normal file

@@ -0,0 +1,4991 @@
# This set of tests is for UTF support, including Unicode properties. The
# Unicode tests are all compatible with all versions of Perl >= 5.10, but
# some of the property tests may differ because of different versions of
# Unicode in use by PCRE2 and Perl.
# WARNING: Use only / as the pattern delimiter. Although pcre2test supports
# a number of delimiters, all those other than / give problems with the
# perltest.sh script.
#newline_default lf anycrlf any
#perltest
/a.b/utf
acb
0: acb
a\x7fb
0: a\x{7f}b
a\x{100}b
0: a\x{100}b
\= Expect no match
a\nb
No match
/a(.{3})b/utf
a\x{4000}xyb
0: a\x{4000}xyb
1: \x{4000}xy
a\x{4000}\x7fyb
0: a\x{4000}\x{7f}yb
1: \x{4000}\x{7f}y
a\x{4000}\x{100}yb
0: a\x{4000}\x{100}yb
1: \x{4000}\x{100}y
\= Expect no match
a\x{4000}b
No match
ac\ncb
No match
/a(.*?)(.)/
a\xc0\x88b
0: a\xc0
1:
2: \xc0
/a(.*?)(.)/utf
a\x{100}b
0: a\x{100}
1:
2: \x{100}
/a(.*)(.)/
a\xc0\x88b
0: a\xc0\x88b
1: \xc0\x88
2: b
/a(.*)(.)/utf
a\x{100}b
0: a\x{100}b
1: \x{100}
2: b
/a(.)(.)/
a\xc0\x92bcd
0: a\xc0\x92
1: \xc0
2: \x92
/a(.)(.)/utf
a\x{240}bcd
0: a\x{240}b
1: \x{240}
2: b
/a(.?)(.)/
a\xc0\x92bcd
0: a\xc0\x92
1: \xc0
2: \x92
/a(.?)(.)/utf
a\x{240}bcd
0: a\x{240}b
1: \x{240}
2: b
/a(.??)(.)/
a\xc0\x92bcd
0: a\xc0
1:
2: \xc0
/a(.??)(.)/utf
a\x{240}bcd
0: a\x{240}
1:
2: \x{240}
/a(.{3})b/utf
a\x{1234}xyb
0: a\x{1234}xyb
1: \x{1234}xy
a\x{1234}\x{4321}yb
0: a\x{1234}\x{4321}yb
1: \x{1234}\x{4321}y
a\x{1234}\x{4321}\x{3412}b
0: a\x{1234}\x{4321}\x{3412}b
1: \x{1234}\x{4321}\x{3412}
\= Expect no match
a\x{1234}b
No match
ac\ncb
No match
/a(.{3,})b/utf
a\x{1234}xyb
0: a\x{1234}xyb
1: \x{1234}xy
a\x{1234}\x{4321}yb
0: a\x{1234}\x{4321}yb
1: \x{1234}\x{4321}y
a\x{1234}\x{4321}\x{3412}b
0: a\x{1234}\x{4321}\x{3412}b
1: \x{1234}\x{4321}\x{3412}
axxxxbcdefghijb
0: axxxxbcdefghijb
1: xxxxbcdefghij
a\x{1234}\x{4321}\x{3412}\x{3421}b
0: a\x{1234}\x{4321}\x{3412}\x{3421}b
1: \x{1234}\x{4321}\x{3412}\x{3421}
\= Expect no match
a\x{1234}b
No match
/a(.{3,}?)b/utf
a\x{1234}xyb
0: a\x{1234}xyb
1: \x{1234}xy
a\x{1234}\x{4321}yb
0: a\x{1234}\x{4321}yb
1: \x{1234}\x{4321}y
a\x{1234}\x{4321}\x{3412}b
0: a\x{1234}\x{4321}\x{3412}b
1: \x{1234}\x{4321}\x{3412}
axxxxbcdefghijb
0: axxxxb
1: xxxx
a\x{1234}\x{4321}\x{3412}\x{3421}b
0: a\x{1234}\x{4321}\x{3412}\x{3421}b
1: \x{1234}\x{4321}\x{3412}\x{3421}
\= Expect no match
a\x{1234}b
No match
/a(.{3,5})b/utf
a\x{1234}xyb
0: a\x{1234}xyb
1: \x{1234}xy
a\x{1234}\x{4321}yb
0: a\x{1234}\x{4321}yb
1: \x{1234}\x{4321}y
a\x{1234}\x{4321}\x{3412}b
0: a\x{1234}\x{4321}\x{3412}b
1: \x{1234}\x{4321}\x{3412}
axxxxbcdefghijb
0: axxxxb
1: xxxx
a\x{1234}\x{4321}\x{3412}\x{3421}b
0: a\x{1234}\x{4321}\x{3412}\x{3421}b
1: \x{1234}\x{4321}\x{3412}\x{3421}
axbxxbcdefghijb
0: axbxxb
1: xbxx
axxxxxbcdefghijb
0: axxxxxb
1: xxxxx
\= Expect no match
a\x{1234}b
No match
axxxxxxbcdefghijb
No match
/a(.{3,5}?)b/utf
a\x{1234}xyb
0: a\x{1234}xyb
1: \x{1234}xy
a\x{1234}\x{4321}yb
0: a\x{1234}\x{4321}yb
1: \x{1234}\x{4321}y
a\x{1234}\x{4321}\x{3412}b
0: a\x{1234}\x{4321}\x{3412}b
1: \x{1234}\x{4321}\x{3412}
axxxxbcdefghijb
0: axxxxb
1: xxxx
a\x{1234}\x{4321}\x{3412}\x{3421}b
0: a\x{1234}\x{4321}\x{3412}\x{3421}b
1: \x{1234}\x{4321}\x{3412}\x{3421}
axbxxbcdefghijb
0: axbxxb
1: xbxx
axxxxxbcdefghijb
0: axxxxxb
1: xxxxx
\= Expect no match
a\x{1234}b
No match
axxxxxxbcdefghijb
No match
/^[a\x{c0}]/utf
\= Expect no match
\x{100}
No match
/(?<=aXb)cd/utf
aXbcd
0: cd
/(?<=a\x{100}b)cd/utf
a\x{100}bcd
0: cd
/(?<=a\x{100000}b)cd/utf
a\x{100000}bcd
0: cd
/(?:\x{100}){3}b/utf
\x{100}\x{100}\x{100}b
0: \x{100}\x{100}\x{100}b
\= Expect no match
\x{100}\x{100}b
No match
/\x{ab}/utf
\x{ab}
0: \x{ab}
\xc2\xab
0: \x{ab}
\= Expect no match
\x00{ab}
No match
/(?<=(.))X/utf
WXYZ
0: X
1: W
\x{256}XYZ
0: X
1: \x{256}
\= Expect no match
XYZ
No match
/[^a]+/g,utf
bcd
0: bcd
\x{100}aY\x{256}Z
0: \x{100}
0: Y\x{256}Z
/^[^a]{2}/utf
\x{100}bc
0: \x{100}b
/^[^a]{2,}/utf
\x{100}bcAa
0: \x{100}bcA
/^[^a]{2,}?/utf
\x{100}bca
0: \x{100}b
/[^a]+/gi,utf
bcd
0: bcd
\x{100}aY\x{256}Z
0: \x{100}
0: Y\x{256}Z
/^[^a]{2}/i,utf
\x{100}bc
0: \x{100}b
/^[^a]{2,}/i,utf
\x{100}bcAa
0: \x{100}bc
/^[^a]{2,}?/i,utf
\x{100}bca
0: \x{100}b
/\x{100}{0,0}/utf
abcd
0:
/\x{100}?/utf
abcd
0:
\x{100}\x{100}
0: \x{100}
/\x{100}{0,3}/utf
\x{100}\x{100}
0: \x{100}\x{100}
\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}
/\x{100}*/utf
abce
0:
\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}\x{100}
/\x{100}{1,1}/utf
abcd\x{100}\x{100}\x{100}\x{100}
0: \x{100}
/\x{100}{1,3}/utf
abcd\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}
/\x{100}+/utf
abcd\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}\x{100}
/\x{100}{3}/utf
abcd\x{100}\x{100}\x{100}XX
0: \x{100}\x{100}\x{100}
/\x{100}{3,5}/utf
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
0: \x{100}\x{100}\x{100}\x{100}\x{100}
/\x{100}{3,}/utf
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
0: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
/(?<=a\x{100}{2}b)X/utf,aftertext
Xyyya\x{100}\x{100}bXzzz
0: X
0+ zzz
/\D*/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
/\D*/utf
\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
/\D/utf
1X2
0: X
1\x{100}2
0: \x{100}
/>\S/utf
> >X Y
0: >X
> >\x{100} Y
0: >\x{100}
/\d/utf
\x{100}3
0: 3
/\s/utf
\x{100} X
0:
/\D+/utf
12abcd34
0: abcd
\= Expect no match
1234
No match
/\D{2,3}/utf
12abcd34
0: abc
12ab34
0: ab
\= Expect no match
1234
No match
12a34
No match
/\D{2,3}?/utf
12abcd34
0: ab
12ab34
0: ab
\= Expect no match
1234
No match
12a34
No match
/\d+/utf
12abcd34
0: 12
/\d{2,3}/utf
12abcd34
0: 12
1234abcd
0: 123
\= Expect no match
1.4
No match
/\d{2,3}?/utf
12abcd34
0: 12
1234abcd
0: 12
\= Expect no match
1.4
No match
/\S+/utf
12abcd34
0: 12abcd34
\= Expect no match
\ \
No match
/\S{2,3}/utf
12abcd34
0: 12a
1234abcd
0: 123
\= Expect no match
\ \
No match
/\S{2,3}?/utf
12abcd34
0: 12
1234abcd
0: 12
\= Expect no match
\ \
No match
/>\s+</utf,aftertext
12> <34
0: > <
0+ 34
/>\s{2,3}</utf,aftertext
ab> <cd
0: > <
0+ cd
ab> <ce
0: > <
0+ ce
\= Expect no match
ab> <cd
No match
/>\s{2,3}?</utf,aftertext
ab> <cd
0: > <
0+ cd
ab> <ce
0: > <
0+ ce
\= Expect no match
ab> <cd
No match
/\w+/utf
12 34
0: 12
\= Expect no match
+++=*!
No match
/\w{2,3}/utf
ab cd
0: ab
abcd ce
0: abc
\= Expect no match
a.b.c
No match
/\w{2,3}?/utf
ab cd
0: ab
abcd ce
0: ab
\= Expect no match
a.b.c
No match
/\W+/utf
12====34
0: ====
\= Expect no match
abcd
No match
/\W{2,3}/utf
ab====cd
0: ===
ab==cd
0: ==
\= Expect no match
a.b.c
No match
/\W{2,3}?/utf
ab====cd
0: ==
ab==cd
0: ==
\= Expect no match
a.b.c
No match
/[\x{100}]/utf
\x{100}
0: \x{100}
Z\x{100}
0: \x{100}
\x{100}Z
0: \x{100}
/[Z\x{100}]/utf
Z\x{100}
0: Z
\x{100}
0: \x{100}
\x{100}Z
0: \x{100}
/[\x{100}\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
/[\x{100}-\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{111}cd
0: \x{111}
/[z-\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{111}cd
0: \x{111}
abzcd
0: z
ab|cd
0: |
/[Q\x{100}\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
Q?
0: Q
/[Q\x{100}-\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{111}cd
0: \x{111}
Q?
0: Q
/[Qz-\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{111}cd
0: \x{111}
abzcd
0: z
ab|cd
0: |
Q?
0: Q
/[\x{100}\x{200}]{1,3}/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{200}\x{100}\x{200}\x{100}cd
0: \x{200}\x{100}\x{200}
/[\x{100}\x{200}]{1,3}?/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{200}\x{100}\x{200}\x{100}cd
0: \x{200}
/[Q\x{100}\x{200}]{1,3}/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{200}\x{100}\x{200}\x{100}cd
0: \x{200}\x{100}\x{200}
/[Q\x{100}\x{200}]{1,3}?/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{200}\x{100}\x{200}\x{100}cd
0: \x{200}
/(?<=[\x{100}\x{200}])X/utf
abc\x{200}X
0: X
abc\x{100}X
0: X
\= Expect no match
X
No match
/(?<=[Q\x{100}\x{200}])X/utf
abc\x{200}X
0: X
abc\x{100}X
0: X
abQX
0: X
\= Expect no match
X
No match
/(?<=[\x{100}\x{200}]{3})X/utf
abc\x{100}\x{200}\x{100}X
0: X
\= Expect no match
abc\x{200}X
No match
X
No match
/[^\x{100}\x{200}]X/utf
AX
0: AX
\x{150}X
0: \x{150}X
\x{500}X
0: \x{500}X
\= Expect no match
\x{100}X
No match
\x{200}X
No match
/[^Q\x{100}\x{200}]X/utf
AX
0: AX
\x{150}X
0: \x{150}X
\x{500}X
0: \x{500}X
\= Expect no match
\x{100}X
No match
\x{200}X
No match
QX
No match
/[^\x{100}-\x{200}]X/utf
AX
0: AX
\x{500}X
0: \x{500}X
\= Expect no match
\x{100}X
No match
\x{150}X
No match
\x{200}X
No match
/[z-\x{100}]/i,utf
z
0: z
Z
0: Z
\x{100}
0: \x{100}
\= Expect no match
\x{102}
No match
y
No match
/[\xFF]/
>\xff<
0: \xff
/[\xff]/utf
>\x{ff}<
0: \x{ff}
/[^\xFF]/
XYZ
0: X
/[^\xff]/utf
XYZ
0: X
\x{123}
0: \x{123}
/^[ac]*b/utf
\= Expect no match
xb
No match
/^[ac\x{100}]*b/utf
\= Expect no match
xb
No match
/^[^x]*b/i,utf
\= Expect no match
xb
No match
/^[^x]*b/utf
\= Expect no match
xb
No match
/^\d*b/utf
\= Expect no match
xb
No match
/(|a)/g,utf
catac
0:
1:
0:
1:
0: a
1: a
0:
1:
0:
1:
0: a
1: a
0:
1:
0:
1:
a\x{256}a
0:
1:
0: a
1: a
0:
1:
0:
1:
0: a
1: a
0:
1:
/^\x{85}$/i,utf
\x{85}
0: \x{85}
/^ሴ/utf
0: \x{1234}
/^\ሴ/utf
0: \x{1234}
/(?s)(.{1,5})/utf
abcdefg
0: abcde
1: abcde
ab
0: ab
1: ab
/a*\x{100}*\w/utf
a
0: a
/\S\S/g,utf
A\x{a3}BC
0: A\x{a3}
0: BC
/\S{2}/g,utf
A\x{a3}BC
0: A\x{a3}
0: BC
/\W\W/g,utf
+\x{a3}==
0: +\x{a3}
0: ==
/\W{2}/g,utf
+\x{a3}==
0: +\x{a3}
0: ==
/\S/g,utf
\x{442}\x{435}\x{441}\x{442}
0: \x{442}
0: \x{435}
0: \x{441}
0: \x{442}
/[\S]/g,utf
\x{442}\x{435}\x{441}\x{442}
0: \x{442}
0: \x{435}
0: \x{441}
0: \x{442}
/\D/g,utf
\x{442}\x{435}\x{441}\x{442}
0: \x{442}
0: \x{435}
0: \x{441}
0: \x{442}
/[\D]/g,utf
\x{442}\x{435}\x{441}\x{442}
0: \x{442}
0: \x{435}
0: \x{441}
0: \x{442}
/\W/g,utf
\x{2442}\x{2435}\x{2441}\x{2442}
0: \x{2442}
0: \x{2435}
0: \x{2441}
0: \x{2442}
/[\W]/g,utf
\x{2442}\x{2435}\x{2441}\x{2442}
0: \x{2442}
0: \x{2435}
0: \x{2441}
0: \x{2442}
/[\S\s]*/utf
abc\n\r\x{442}\x{435}\x{441}\x{442}xyz
0: abc\x{0a}\x{0d}\x{442}\x{435}\x{441}\x{442}xyz
/[\x{41f}\S]/g,utf
\x{442}\x{435}\x{441}\x{442}
0: \x{442}
0: \x{435}
0: \x{441}
0: \x{442}
/.[^\S]./g,utf
abc def\x{442}\x{443}xyz\npqr
0: c d
0: z\x{0a}p
/.[^\S\n]./g,utf
abc def\x{442}\x{443}xyz\npqr
0: c d
/[[:^alnum:]]/g,utf
+\x{2442}
0: +
0: \x{2442}
/[[:^alpha:]]/g,utf
+\x{2442}
0: +
0: \x{2442}
/[[:^ascii:]]/g,utf
A\x{442}
0: \x{442}
/[[:^blank:]]/g,utf
A\x{442}
0: A
0: \x{442}
/[[:^cntrl:]]/g,utf
A\x{442}
0: A
0: \x{442}
/[[:^digit:]]/g,utf
A\x{442}
0: A
0: \x{442}
/[[:^graph:]]/g,utf
\x19\x{e01ff}
0: \x{19}
0: \x{e01ff}
/[[:^lower:]]/g,utf
A\x{422}
0: A
0: \x{422}
/[[:^print:]]/g,utf
\x{19}\x{e01ff}
0: \x{19}
0: \x{e01ff}
/[[:^punct:]]/g,utf
A\x{442}
0: A
0: \x{442}
/[[:^space:]]/g,utf
A\x{442}
0: A
0: \x{442}
/[[:^upper:]]/g,utf
a\x{442}
0: a
0: \x{442}
/[[:^word:]]/g,utf
+\x{2442}
0: +
0: \x{2442}
/[[:^xdigit:]]/g,utf
M\x{442}
0: M
0: \x{442}
/[^ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞĀĂĄĆĈĊČĎĐĒĔĖĘĚĜĞĠĢĤĦĨĪĬĮİIJĴĶĹĻĽĿŁŃŅŇŊŌŎŐŒŔŖŘŚŜŞŠŢŤŦŨŪŬŮŰŲŴŶŸŹŻŽƁƂƄƆƇƉƊƋƎƏƐƑƓƔƖƗƘƜƝƟƠƢƤƦƧƩƬƮƯƱƲƳƵƷƸƼDŽLJNJǍǏǑǓǕǗǙǛǞǠǢǤǦǨǪǬǮDZǴǶǷǸǺǼǾȀȂȄȆȈȊȌȎȐȒȔȖȘȚȜȞȠȢȤȦȨȪȬȮȰȲȺȻȽȾɁΆΈΉΊΌΎΏΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩΪΫϒϓϔϘϚϜϞϠϢϤϦϨϪϬϮϴϷϹϺϽϾϿЀЁЂЃЄЅІЇЈЉЊЋЌЍЎЏАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯѠѢѤѦѨѪѬѮѰѲѴѶѸѺѼѾҀҊҌҎҐҒҔҖҘҚҜҞҠҢҤҦҨҪҬҮҰҲҴҶҸҺҼҾӀӁӃӅӇӉӋӍӐӒӔӖӘӚӜӞӠӢӤӦӨӪӬӮӰӲӴӶӸԀԂԄԆԈԊԌԎԱԲԳԴԵԶԷԸԹԺԻԼԽԾԿՀՁՂՃՄՅՆՇՈՉՊՋՌՍՎՏՐՑՒՓՔՕՖႠႡႢႣႤႥႦႧႨႩႪႫႬႭႮႯႰႱႲႳႴႵႶႷႸႹႺႻႼႽႾႿჀჁჂჃჄჅḀḂḄḆḈḊḌḎḐḒḔḖḘḚḜḞḠḢḤḦḨḪḬḮḰḲḴḶḸḺḼḾṀṂṄṆṈṊṌṎṐṒṔṖṘṚṜṞṠṢṤṦṨṪṬṮṰṲṴṶṸṺṼṾẀẂẄẆẈẊẌẎẐẒẔẠẢẤẦẨẪẬẮẰẲẴẶẸẺẼẾỀỂỄỆỈỊỌỎỐỒỔỖỘỚỜỞỠỢỤỦỨỪỬỮỰỲỴỶỸἈἉἊἋἌἍἎἏἘἙἚἛἜἝἨἩἪἫἬἭἮἯἸἹἺἻἼἽἾἿὈὉὊὋὌὍὙὛὝὟὨὩὪὫὬὭὮὯᾸᾹᾺΆῈΈῊΉῘῙῚΊῨῩῪΎῬῸΌῺΏabcdefghijklmnopqrstuvwxyzªµºßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿāăąćĉċčďđēĕėęěĝğġģĥħĩīĭįıijĵķĸĺļľŀłńņňʼnŋōŏőœŕŗřśŝşšţťŧũūŭůűųŵŷźżžſƀƃƅƈƌƍƒƕƙƚƛƞơƣƥƨƪƫƭưƴƶƹƺƽƾƿdžljnjǎǐǒǔǖǘǚǜǝǟǡǣǥǧǩǫǭǯǰdzǵǹǻǽǿȁȃȅȇȉȋȍȏȑȓȕȗșțȝȟȡȣȥȧȩȫȭȯȱȳȴȵȶȷȸȹȼȿɀɐɑɒɓɔɕɖɗɘəɚɛɜɝɞɟɠɡɢɣɤɥɦɧɨɩɪɫɬɭɮɯɰɱɲɳɴɵɶɷɸɹɺɻɼɽɾɿʀʁʂʃʄʅʆʇʈʉʊʋʌʍʎʏʐʑʒʓʔʕʖʗʘʙʚʛʜʝʞʟʠʡʢʣʤʥʦʧʨʩʪʫʬʭʮʯΐάέήίΰαβγδεζηθικλμνξοπρςστυφχψωϊϋόύώϐϑϕϖϗϙϛϝϟϡϣϥϧϩϫϭϯϰϱϲϳϵϸϻϼабвгдежзийклмнопрстуфхцчшщъыьэюяѐёђѓєѕіїјљњћќѝўџѡѣѥѧѩѫѭѯѱѳѵѷѹѻѽѿҁҋҍҏґғҕҗҙқҝҟҡңҥҧҩҫҭүұҳҵҷҹһҽҿӂӄӆӈӊӌӎӑӓӕӗәӛӝӟӡӣӥӧөӫӭӯӱӳӵӷӹԁԃԅԇԉԋԍԏաբգդեզէըթժիլխծկհձղճմյնշոչպջռսվտրցւփքօֆևᴀᴁᴂᴃᴅᴆᴇᴈᴉᴊᴋᴌᴍᴎᴒᴓᴔᴕᴖᴗᴘᴙᴚᴛᴝᴞᴟᴣᴤᴥᴧᴨᴩᴪᴫᵢᵣᵤᵥᵦᵧᵨᵩᵪᵫᵬᵭᵮᵯᵰᵱᵲᵳᵴᵵᵶᵷᵹᵺᵻᵼᵽᵾᵿᶀᶁᶂᶄᶅᶆᶇᶈᶉᶊᶋᶍᶎᶏᶐᶑᶒᶓᶔᶕᶖᶗᶘᶙᶚḁḃḅḇḉḋḍḏḑḓḕḗḙḛḝḟḡḣḥḧḩḫḭḯḱḳḵḷḹḻḽḿṁṃṅṇṉṋṍṏṑṓṕṗṙṛṝṟṡṣṥṧṩṫṭṯṱṳṵṷṹṻṽṿẁẃẅẇẉẋẍẏẑẓẕẖẗẘẙẚẛạảấầẩẫậắằẳẵặẹẻẽếềểễệỉịọỏốồổỗộớờởỡợụủứừửữựỳỵỷỹἀἁἂἃἄἅἆἇἐἑἒἓἔἕἠἡἢἣἤἥἦἧἰἱἲἳἴἵἶἷὀὁὂὃὄὅὐὑὒὓὔὕὖὗὠὡὢὣὤὥὦὧὰάὲέὴήὶίὸόὺύὼώᾀᾁᾂᾃᾄᾅᾆᾇᾐᾑᾒᾓᾔᾕᾖᾗᾠᾡᾢᾣᾤᾥᾦᾧᾰᾱᾲᾳᾴᾶᾷῂῃῄῆῇῐῑῒΐῖῗῠῡῢΰῤῥῦῧῲῳῴῶῷⲁⲃⲇⲉⲋⲍⲏⲑⲓⲕⲗⲙⲛⲝⲧⲩⲫⲭⲯⲱⲳⲵⲷⲹⲻⲽⲿⳁⳃⳅⳇⳉⳋⳍⳏⳑⳓⳕⳗⳙⳛⳝⳟⳡⳣⳤⴀⴁⴂⴃⴄⴅⴆⴇⴈⴉⴊⴋⴌⴍⴎⴏⴐⴑⴒⴓⴔⴕⴖⴗⴘⴙⴚⴛⴜⴝⴞⴟⴠⴡⴢⴣⴤⴥfffiflffifflſtstﬓﬔﬕﬖﬗ\d_^]/utf
/^[^d]*?$/
abc
0: abc
/^[^d]*?$/utf
abc
0: abc
/^[^d]*?$/i
abc
0: abc
/^[^d]*?$/i,utf
abc
0: abc
/(?i)[\xc3\xa9\xc3\xbd]|[\xc3\xa9\xc3\xbdA]/utf
/^[a\x{c0}]b/utf
\x{c0}b
0: \x{c0}b
/^([a\x{c0}]*?)aa/utf
a\x{c0}aaaa/
0: a\x{c0}aa
1: a\x{c0}
/^([a\x{c0}]*?)aa/utf
a\x{c0}aaaa/
0: a\x{c0}aa
1: a\x{c0}
a\x{c0}a\x{c0}aaa/
0: a\x{c0}a\x{c0}aa
1: a\x{c0}a\x{c0}
/^([a\x{c0}]*)aa/utf
a\x{c0}aaaa/
0: a\x{c0}aaaa
1: a\x{c0}aa
a\x{c0}a\x{c0}aaa/
0: a\x{c0}a\x{c0}aaa
1: a\x{c0}a\x{c0}a
/^([a\x{c0}]*)a\x{c0}/utf
a\x{c0}aaaa/
0: a\x{c0}
1:
a\x{c0}a\x{c0}aaa/
0: a\x{c0}a\x{c0}
1: a\x{c0}
/A*/g,utf
AAB\x{123}BAA
0: AA
0:
0:
0:
0: AA
0:
/(abc)\1/i,utf
\= Expect no match
abc
No match
/(abc)\1/utf
\= Expect no match
abc
No match
/a(*:a\x{1234}b)/utf,mark
abc
0: a
MK: a\x{1234}b
/a(*:a£b)/utf,mark
abc
0: a
MK: a\x{a3}b
# Noncharacters
/./utf
\x{fffe}
0: \x{fffe}
\x{ffff}
0: \x{ffff}
\x{1fffe}
0: \x{1fffe}
\x{1ffff}
0: \x{1ffff}
\x{2fffe}
0: \x{2fffe}
\x{2ffff}
0: \x{2ffff}
\x{3fffe}
0: \x{3fffe}
\x{3ffff}
0: \x{3ffff}
\x{4fffe}
0: \x{4fffe}
\x{4ffff}
0: \x{4ffff}
\x{5fffe}
0: \x{5fffe}
\x{5ffff}
0: \x{5ffff}
\x{6fffe}
0: \x{6fffe}
\x{6ffff}
0: \x{6ffff}
\x{7fffe}
0: \x{7fffe}
\x{7ffff}
0: \x{7ffff}
\x{8fffe}
0: \x{8fffe}
\x{8ffff}
0: \x{8ffff}
\x{9fffe}
0: \x{9fffe}
\x{9ffff}
0: \x{9ffff}
\x{afffe}
0: \x{afffe}
\x{affff}
0: \x{affff}
\x{bfffe}
0: \x{bfffe}
\x{bffff}
0: \x{bffff}
\x{cfffe}
0: \x{cfffe}
\x{cffff}
0: \x{cffff}
\x{dfffe}
0: \x{dfffe}
\x{dffff}
0: \x{dffff}
\x{efffe}
0: \x{efffe}
\x{effff}
0: \x{effff}
\x{ffffe}
0: \x{ffffe}
\x{fffff}
0: \x{fffff}
\x{10fffe}
0: \x{10fffe}
\x{10ffff}
0: \x{10ffff}
\x{fdd0}
0: \x{fdd0}
\x{fdd1}
0: \x{fdd1}
\x{fdd2}
0: \x{fdd2}
\x{fdd3}
0: \x{fdd3}
\x{fdd4}
0: \x{fdd4}
\x{fdd5}
0: \x{fdd5}
\x{fdd6}
0: \x{fdd6}
\x{fdd7}
0: \x{fdd7}
\x{fdd8}
0: \x{fdd8}
\x{fdd9}
0: \x{fdd9}
\x{fdda}
0: \x{fdda}
\x{fddb}
0: \x{fddb}
\x{fddc}
0: \x{fddc}
\x{fddd}
0: \x{fddd}
\x{fdde}
0: \x{fdde}
\x{fddf}
0: \x{fddf}
\x{fde0}
0: \x{fde0}
\x{fde1}
0: \x{fde1}
\x{fde2}
0: \x{fde2}
\x{fde3}
0: \x{fde3}
\x{fde4}
0: \x{fde4}
\x{fde5}
0: \x{fde5}
\x{fde6}
0: \x{fde6}
\x{fde7}
0: \x{fde7}
\x{fde8}
0: \x{fde8}
\x{fde9}
0: \x{fde9}
\x{fdea}
0: \x{fdea}
\x{fdeb}
0: \x{fdeb}
\x{fdec}
0: \x{fdec}
\x{fded}
0: \x{fded}
\x{fdee}
0: \x{fdee}
\x{fdef}
0: \x{fdef}
/^\d*\w{4}/utf
1234
0: 1234
\= Expect no match
123
No match
/^[^b]*\w{4}/utf
aaaa
0: aaaa
\= Expect no match
aaa
No match
/^[^b]*\w{4}/i,utf
aaaa
0: aaaa
\= Expect no match
aaa
No match
/^\x{100}*.{4}/utf
\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}\x{100}
\= Expect no match
\x{100}\x{100}\x{100}
No match
/^\x{100}*.{4}/i,utf
\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}\x{100}
\= Expect no match
\x{100}\x{100}\x{100}
No match
/^a+[a\x{200}]/utf
aa
0: aa
/^.\B.\B./utf
\x{10123}\x{10124}\x{10125}
0: \x{10123}\x{10124}\x{10125}
/^#[^\x{ffff}]#[^\x{ffff}]#[^\x{ffff}]#/utf
#\x{10000}#\x{100}#\x{10ffff}#
0: #\x{10000}#\x{100}#\x{10ffff}#
# Unicode property support tests
/^\pC\pL\pM\pN\pP\pS\pZ</utf
\x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
0: \x{7f}\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
\np\x{300}9!\$ <
0: \x{0a}p\x{300}9!$ <
\= Expect no match
ap\x{300}9!\$ <
No match
/^\PC/utf
X
0: X
\= Expect no match
\x7f
No match
/^\PL/utf
9
0: 9
\= Expect no match
\x{c0}
No match
/^\PM/utf
X
0: X
\= Expect no match
\x{30f}
No match
/^\PN/utf
X
0: X
\= Expect no match
\x{660}
No match
/^\PP/utf
X
0: X
\= Expect no match
\x{66c}
No match
/^\PS/utf
X
0: X
\= Expect no match
\x{f01}
No match
/^\PZ/utf
X
0: X
\= Expect no match
\x{1680}
No match
/^\p{Cc}/utf
\x{017}
0: \x{17}
\x{09f}
0: \x{9f}
\= Expect no match
\x{0600}
No match
/^\p{Cf}/utf
\x{601}
0: \x{601}
\= Expect no match
\x{09f}
No match
/^\p{Cn}/utf
\x{e0000}
0: \x{e0000}
\= Expect no match
\x{09f}
No match
/^\p{Co}/utf
\x{f8ff}
0: \x{f8ff}
\= Expect no match
\x{09f}
No match
/^\p{Ll}/utf
a
0: a
\= Expect no match
Z
No match
\x{e000}
No match
/^\p{Lm}/utf
\x{2b0}
0: \x{2b0}
\= Expect no match
a
No match
/^\p{Lo}/utf
\x{1bb}
0: \x{1bb}
\x{3400}
0: \x{3400}
\x{3401}
0: \x{3401}
\x{4d00}
0: \x{4d00}
\x{4db4}
0: \x{4db4}
\x{4db5}
0: \x{4db5}
\x{4db6}
0: \x{4db6}
\= Expect no match
a
No match
\x{2b0}
No match
/^\p{Lt}/utf
\x{1c5}
0: \x{1c5}
\= Expect no match
a
No match
\x{2b0}
No match
/^\p{Lu}/utf
A
0: A
\= Expect no match
\x{2b0}
No match
/^\p{Mc}/utf
\x{903}
0: \x{903}
\= Expect no match
X
No match
\x{300}
No match
/^\p{Me}/utf
\x{488}
0: \x{488}
\= Expect no match
X
No match
\x{903}
No match
\x{300}
No match
/^\p{Mn}/utf
\x{300}
0: \x{300}
\= Expect no match
X
No match
\x{903}
No match
/^\p{Nd}+/utf
0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}\x{66a}
0: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}
\x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}\x{6fa}
0: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}
\x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}\x{970}
0: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}
\= Expect no match
X
No match
/^\p{Nl}/utf
\x{16ee}
0: \x{16ee}
\= Expect no match
X
No match
\x{966}
No match
/^\p{No}/utf
\x{b2}
0: \x{b2}
\x{b3}
0: \x{b3}
\= Expect no match
X
No match
\x{16ee}
No match
/^\p{Pc}/utf
\x5f
0: _
\x{203f}
0: \x{203f}
\= Expect no match
X
No match
-
No match
\x{58a}
No match
/^\p{Pd}/utf
-
0: -
\x{58a}
0: \x{58a}
\= Expect no match
X
No match
\x{203f}
No match
/^\p{Pe}/utf
)
0: )
]
0: ]
}
0: }
\x{f3b}
0: \x{f3b}
\= Expect no match
X
No match
\x{203f}
No match
(
No match
[
No match
{
No match
\x{f3c}
No match
/^\p{Pf}/utf
\x{bb}
0: \x{bb}
\x{2019}
0: \x{2019}
\= Expect no match
X
No match
\x{203f}
No match
/^\p{Pi}/utf
\x{ab}
0: \x{ab}
\x{2018}
0: \x{2018}
\= Expect no match
X
No match
\x{203f}
No match
/^\p{Po}/utf
!
0: !
\x{37e}
0: \x{37e}
\= Expect no match
X
No match
\x{203f}
No match
/^\p{Ps}/utf
(
0: (
[
0: [
{
0: {
\x{f3c}
0: \x{f3c}
\= Expect no match
X
No match
)
No match
]
No match
}
No match
\x{f3b}
No match
/^\p{Sk}/utf
\x{2c2}
0: \x{2c2}
\= Expect no match
X
No match
\x{9f2}
No match
/^\p{Sm}+/utf
+<|~\x{ac}\x{2044}
0: +<|~\x{ac}\x{2044}
\= Expect no match
X
No match
\x{9f2}
No match
/^\p{So}/utf
\x{a6}
0: \x{a6}
\x{482}
0: \x{482}
\= Expect no match
X
No match
\x{9f2}
No match
/^\p{Zl}/utf
\x{2028}
0: \x{2028}
\= Expect no match
X
No match
\x{2029}
No match
/^\p{Zp}/utf
\x{2029}
0: \x{2029}
\= Expect no match
X
No match
\x{2028}
No match
/\p{Nd}+(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: AB
/\p{Nd}+?(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}
1: \x{661}\x{662}
/\p{Nd}{2,}(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: AB
/\p{Nd}{2,}?(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}A
1: \x{662}A
/\p{Nd}*(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: AB
/\p{Nd}*?(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}
1: \x{660}\x{661}
/\p{Nd}{2}(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}A
1: \x{662}A
/\p{Nd}{2,3}(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: AB
/\p{Nd}{2,3}?(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}A
1: \x{662}A
/\p{Nd}?(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}
1: \x{661}\x{662}
/\p{Nd}??(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}
1: \x{660}\x{661}
/\p{Nd}*+(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: AB
/\p{Nd}*+(...)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}ABC
1: ABC
/\p{Nd}*+(....)/utf
\= Expect no match
\x{660}\x{661}\x{662}ABC
No match
/^\pN{3,}+(.)/utf
\x{7c0}8\x{662}\x{966}\x{95c}
0: \x{7c0}8\x{662}\x{966}\x{95c}
1: \x{95c}
\x{7c0}8\x{662}\x{95c}
0: \x{7c0}8\x{662}\x{95c}
1: \x{95c}
\= Expect no match
\x{7c0}8\x{662}\x{966}
No match
\x{7c0}8\x{95c}
No match
/(?<=A\p{Nd})XYZ/utf
A2XYZ
0: XYZ
123A5XYZPQR
0: XYZ
ABA\x{660}XYZpqr
0: XYZ
\= Expect no match
AXYZ
No match
XYZ
No match
/(?<!\pL)XYZ/utf
1XYZ
0: XYZ
AB=XYZ..
0: XYZ
XYZ
0: XYZ
\= Expect no match
WXYZ
No match
/[\P{Nd}]+/utf
abcd
0: abcd
\= Expect no match
1234
No match
/\D+/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
No match
/\P{Nd}+/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
No match
/[\D]+/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
No match
/[\P{Nd}]+/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
No match
/[\D\P{Nd}]+/utf
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
No match
/\pL/utf
a
0: a
A
0: A
/\pL/i,utf
a
0: a
A
0: A
/\p{Lu}/utf
A
0: A
aZ
0: Z
\= Expect no match
abc
No match
/\p{Ll}/utf
a
0: a
Az
0: z
\= Expect no match
ABC
No match
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
A\x{391}\x{10427}\x{ff3a}\x{1fb0}
0: A\x{391}\x{10427}\x{ff3a}\x{1fb0}
\= Expect no match
a\x{391}\x{10427}\x{ff3a}\x{1fb0}
No match
A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
No match
A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
No match
A\x{391}\x{10427}\x{ff5a}\x{1fb0}
No match
A\x{391}\x{10427}\x{ff3a}\x{1fb8}
No match
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
A\x{391}\x{10427}\x{ff3a}\x{1fb0}
0: A\x{391}\x{10427}\x{ff3a}\x{1fb0}
a\x{391}\x{10427}\x{ff3a}\x{1fb0}
0: a\x{391}\x{10427}\x{ff3a}\x{1fb0}
A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
0: A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
0: A\x{391}\x{1044f}\x{ff3a}\x{1fb0}
A\x{391}\x{10427}\x{ff5a}\x{1fb0}
0: A\x{391}\x{10427}\x{ff5a}\x{1fb0}
A\x{391}\x{10427}\x{ff3a}\x{1fb8}
0: A\x{391}\x{10427}\x{ff3a}\x{1fb8}
/\x{391}+/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
/\x{391}{3,5}(.)/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
1: X
/\x{391}{3,5}?(.)/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
0: \x{391}\x{3b1}\x{3b1}\x{3b1}
1: \x{3b1}
/[\x{391}\x{ff3a}]/i,utf
\x{391}
0: \x{391}
\x{ff3a}
0: \x{ff3a}
\x{3b1}
0: \x{3b1}
\x{ff5a}
0: \x{ff5a}
/^(\X*)C/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
0: A\x{300}\x{301}\x{302}BC
1: A\x{300}\x{301}\x{302}B
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
1: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
/^(\X*?)C/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
0: A\x{300}\x{301}\x{302}BC
1: A\x{300}\x{301}\x{302}B
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
0: A\x{300}\x{301}\x{302}BC
1: A\x{300}\x{301}\x{302}B
/^(\X*)(.)/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
0: A\x{300}\x{301}\x{302}BCA
1: A\x{300}\x{301}\x{302}BC
2: A
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
1: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
2: C
/^(\X*?)(.)/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
0: A
1:
2: A
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
0: A
1:
2: A
/^\X(.)/utf
\= Expect no match
A\x{300}\x{301}\x{302}
No match
/^\X{2,3}(.)/utf
A\x{300}\x{301}B\x{300}X
0: A\x{300}\x{301}B\x{300}X
1: X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}
0: A\x{300}\x{301}B\x{300}C
1: C
A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
1: X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}D
1: D
/^\X{2,3}?(.)/utf
A\x{300}\x{301}B\x{300}X
0: A\x{300}\x{301}B\x{300}X
1: X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}
0: A\x{300}\x{301}B\x{300}C
1: C
A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
0: A\x{300}\x{301}B\x{300}C
1: C
A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
0: A\x{300}\x{301}B\x{300}C
1: C
/^\X{3,}+/utf
A\x{300}B\x{301}U\x{303}\x{0301}
0: A\x{300}B\x{301}U\x{303}\x{301}
A\x{300}B\x{301}U\x{303}\x{0301}X
0: A\x{300}B\x{301}U\x{303}\x{301}X
\= Expect no match
A\x{300}
No match
A\x{300}B\x{301}
No match
A\x{300}U\x{303}\x{0301}
No match
/^\X/utf
A
0: A
A\x{300}BC
0: A\x{300}
A\x{300}\x{301}\x{302}BC
0: A\x{300}\x{301}\x{302}
\x{300}
0: \x{300}
/^\p{Han}+/utf
\x{2e81}\x{3007}\x{2f804}\x{31a0}
0: \x{2e81}\x{3007}\x{2f804}
\= Expect no match
\x{2e7f}
No match
/^[\p{Arabic}]/utf
\x{06e9}
0: \x{6e9}
\x{060b}
0: \x{60b}
\= Expect no match
X\x{06e9}
No match
/^\P{Katakana}+/utf
\x{3105}
0: \x{3105}
\= Expect no match
\x{30ff}
No match
/^[\P{Yi}]/utf
\x{2f800}
0: \x{2f800}
\= Expect no match
\x{a014}
No match
\x{a4c6}
No match
/^\p{Any}X/utf
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
\= Expect no match
X
No match
/^\P{Any}X/utf
\= Expect no match
AX
No match
/^\p{Any}?X/utf
XYZ
0: X
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
\= Expect no match
ABXYZ
No match
/^\P{Any}?X/utf
XYZ
0: X
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
ABXYZ
No match
/^\p{Any}+X/utf
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
A\x{1234}XYZ
0: A\x{1234}X
\= Expect no match
XYZ
No match
/^\P{Any}+X/utf
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
A\x{1234}XYZ
No match
XYZ
No match
/^\p{Any}*X/utf
XYZ
0: X
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
A\x{1234}XYZ
0: A\x{1234}X
/^\P{Any}*X/utf
XYZ
0: X
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
A\x{1234}XYZ
No match
/^[\p{Any}]X/utf
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
\= Expect no match
X
No match
/^[\P{Any}]X/utf
\= Expect no match
AX
No match
/^[\p{Any}]?X/utf
XYZ
0: X
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
\= Expect no match
ABXYZ
No match
/^[\P{Any}]?X/utf
XYZ
0: X
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
ABXYZ
No match
/^[\p{Any}]+X/utf
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
A\x{1234}XYZ
0: A\x{1234}X
\= Expect no match
XYZ
No match
/^[\P{Any}]+X/utf
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
A\x{1234}XYZ
No match
XYZ
No match
/^[\p{Any}]*X/utf
XYZ
0: X
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
A\x{1234}XYZ
0: A\x{1234}X
/^[\P{Any}]*X/utf
XYZ
0: X
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
A\x{1234}XYZ
No match
/^\p{Any}{3,5}?/utf
abcdefgh
0: abc
\x{1234}\n\r\x{3456}xyz
0: \x{1234}\x{0a}\x{0d}
/^\p{Any}{3,5}/utf
abcdefgh
0: abcde
\x{1234}\n\r\x{3456}xyz
0: \x{1234}\x{0a}\x{0d}\x{3456}x
/^\P{Any}{3,5}?/utf
\= Expect no match
abcdefgh
No match
\x{1234}\n\r\x{3456}xyz
No match
/^\p{L&}X/utf
AXY
0: AX
aXY
0: aX
\x{1c5}XY
0: \x{1c5}X
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^[\p{L&}]X/utf
AXY
0: AX
aXY
0: aX
\x{1c5}XY
0: \x{1c5}X
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^\p{L&}+X/utf
AXY
0: AX
aXY
0: aX
AbcdeXyz
0: AbcdeX
\x{1c5}AbXY
0: \x{1c5}AbX
abcDEXypqreXlmn
0: abcDEXypqreX
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^[\p{L&}]+X/utf
AXY
0: AX
aXY
0: aX
AbcdeXyz
0: AbcdeX
\x{1c5}AbXY
0: \x{1c5}AbX
abcDEXypqreXlmn
0: abcDEXypqreX
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^\p{L&}+?X/utf
AXY
0: AX
aXY
0: aX
AbcdeXyz
0: AbcdeX
\x{1c5}AbXY
0: \x{1c5}AbX
abcDEXypqreXlmn
0: abcDEX
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^[\p{L&}]+?X/utf
AXY
0: AX
aXY
0: aX
AbcdeXyz
0: AbcdeX
\x{1c5}AbXY
0: \x{1c5}AbX
abcDEXypqreXlmn
0: abcDEX
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^\P{L&}X/utf
!XY
0: !X
\x{1bb}XY
0: \x{1bb}X
\x{2b0}XY
0: \x{2b0}X
\= Expect no match
\x{1c5}XY
No match
AXY
No match
/^[\P{L&}]X/utf
!XY
0: !X
\x{1bb}XY
0: \x{1bb}X
\x{2b0}XY
0: \x{2b0}X
\= Expect no match
\x{1c5}XY
No match
AXY
No match
/^(\p{Z}[^\p{C}\p{Z}]+)*$/
\xa0!
0: \xa0!
1: \xa0!
/^[\pL](abc)(?1)/
AabcabcYZ
0: Aabcabc
1: abc
/([\pL]=(abc))*X/
L=abcX
0: L=abcX
1: L=abc
2: abc
/^\p{Balinese}\p{Cuneiform}\p{Nko}\p{Phags_Pa}\p{Phoenician}/utf
\x{1b00}\x{12000}\x{7c0}\x{a840}\x{10900}
0: \x{1b00}\x{12000}\x{7c0}\x{a840}\x{10900}
# Check property support in non-UTF mode
/\p{L}{4}/
123abcdefg
0: abcd
123abc\xc4\xc5zz
0: abc\xc4
/\X{1,3}\d/
\= Expect no match
\x8aBCD
No match
/\X?\d/
\= Expect no match
\x8aBCD
No match
/\P{L}?\d/
\= Expect no match
\x8aBCD
No match
/[\PPP\x8a]{1,}\x80/
A\x80
0: A\x80
/^[\p{Arabic}]/utf
\x{604}
0: \x{604}
\x{60e}
0: \x{60e}
\x{656}
0: \x{656}
\x{657}
0: \x{657}
\x{658}
0: \x{658}
\x{659}
0: \x{659}
\x{65a}
0: \x{65a}
\x{65b}
0: \x{65b}
\x{65c}
0: \x{65c}
\x{65d}
0: \x{65d}
\x{65e}
0: \x{65e}
\x{65f}
0: \x{65f}
\x{66a}
0: \x{66a}
\x{6e9}
0: \x{6e9}
\x{6ef}
0: \x{6ef}
\x{6fa}
0: \x{6fa}
/^\p{Cyrillic}/utf
\x{1d2b}
0: \x{1d2b}
/^\p{Common}/utf
\x{2116}
0: \x{2116}
\x{1D183}
0: \x{1d183}
/^\p{Inherited}/utf
\x{200c}
0: \x{200c}
\= Expect no match
\x{64a}
No match
\x{656}
No match
/^\p{Shavian}/utf
\x{10450}
0: \x{10450}
\x{1047f}
0: \x{1047f}
/^\p{Deseret}/utf
\x{10400}
0: \x{10400}
\x{1044f}
0: \x{1044f}
/^\p{Osmanya}/utf
\x{10480}
0: \x{10480}
\x{1049d}
0: \x{1049d}
\x{104a0}
0: \x{104a0}
\x{104a9}
0: \x{104a9}
\= Expect no match
\x{1049e}
No match
\x{1049f}
No match
\x{104aa}
No match
/\p{katakana}/utf
\x{30a1}
0: \x{30a1}
\x{3001}
0: \x{3001}
/\p{scx:katakana}/utf
\x{30a1}
0: \x{30a1}
\x{3001}
0: \x{3001}
/\p{script extensions:katakana}/utf
\x{30a1}
0: \x{30a1}
\x{3001}
0: \x{3001}
/\p{sc:katakana}/utf
\x{30a1}
0: \x{30a1}
\= Expect no match
\x{3001}
No match
/\p{script:katakana}/utf
\x{30a1}
0: \x{30a1}
\= Expect no match
\x{3001}
No match
/\p{sc:katakana}{3,}/utf
\x{30a1}\x{30fa}\x{32d0}\x{1b122}\x{ff66}\x{3001}ABC
0: \x{30a1}\x{30fa}\x{32d0}\x{1b122}\x{ff66}
/\p{sc:katakana}{3,}?/utf
\x{30a1}\x{30fa}\x{32d0}\x{1b122}\x{ff66}\x{3001}ABC
0: \x{30a1}\x{30fa}\x{32d0}
/\p{Carian}\p{Cham}\p{Kayah_Li}\p{Lepcha}\p{Lycian}\p{Lydian}\p{Ol_Chiki}\p{Rejang}\p{Saurashtra}\p{Sundanese}\p{Vai}/utf
\x{102A4}\x{AA52}\x{A91D}\x{1C46}\x{10283}\x{1092E}\x{1C6B}\x{A93B}\x{A8BF}\x{1BA0}\x{A50A}====
0: \x{102a4}\x{aa52}\x{a91d}\x{1c46}\x{10283}\x{1092e}\x{1c6b}\x{a93b}\x{a8bf}\x{1ba0}\x{a50a}
/\x{a77d}\x{1d79}/i,utf
\x{a77d}\x{1d79}
0: \x{a77d}\x{1d79}
\x{1d79}\x{a77d}
0: \x{1d79}\x{a77d}
/\x{a77d}\x{1d79}/utf
\x{a77d}\x{1d79}
0: \x{a77d}\x{1d79}
\= Expect no match
\x{1d79}\x{a77d}
No match
/(A)\1/i,utf
AA
0: AA
1: A
Aa
0: Aa
1: A
aa
0: aa
1: a
aA
0: aA
1: a
/(\x{10a})\1/i,utf
\x{10a}\x{10a}
0: \x{10a}\x{10a}
1: \x{10a}
\x{10a}\x{10b}
0: \x{10a}\x{10b}
1: \x{10a}
\x{10b}\x{10b}
0: \x{10b}\x{10b}
1: \x{10b}
\x{10b}\x{10a}
0: \x{10b}\x{10a}
1: \x{10b}
# The next two tests are for property support in non-UTF mode
/(?:\p{Lu}|\x20)+/
\x41\x20\x50\xC2\x54\xC9\x20\x54\x4F\x44\x41\x59
0: A P\xc2T\xc9 TODAY
/[\p{Lu}\x20]+/
\x41\x20\x50\xC2\x54\xC9\x20\x54\x4F\x44\x41\x59
0: A P\xc2T\xc9 TODAY
/\p{Avestan}\p{Bamum}\p{Egyptian_Hieroglyphs}\p{Imperial_Aramaic}\p{Inscriptional_Pahlavi}\p{Inscriptional_Parthian}\p{Javanese}\p{Kaithi}\p{Lisu}\p{Meetei_Mayek}\p{Old_South_Arabian}\p{Old_Turkic}\p{Samaritan}\p{Tai_Tham}\p{Tai_Viet}/utf
\x{10b00}\x{a6ef}\x{13007}\x{10857}\x{10b78}\x{10b58}\x{a980}\x{110c1}\x{a4ff}\x{abc0}\x{10a7d}\x{10c48}\x{0800}\x{1aad}\x{aac0}
0: \x{10b00}\x{a6ef}\x{13007}\x{10857}\x{10b78}\x{10b58}\x{a980}\x{110c1}\x{a4ff}\x{abc0}\x{10a7d}\x{10c48}\x{800}\x{1aad}\x{aac0}
/^\w+/utf,ucp
Az_\x{aa}\x{c0}\x{1c5}\x{2b0}\x{3b6}\x{1d7c9}\x{2fa1d}1\x{660}\x{bef}\x{16ee}
0: Az_\x{aa}\x{c0}\x{1c5}\x{2b0}\x{3b6}\x{1d7c9}\x{2fa1d}1\x{660}\x{bef}\x{16ee}
/^[[:xdigit:]]*/utf,ucp
1a\x{660}\x{bef}\x{16ee}
0: 1a
/^\d+/utf,ucp
1\x{660}\x{bef}\x{16ee}
0: 1\x{660}\x{bef}
/^[[:digit:]]+/utf,ucp
1\x{660}\x{bef}\x{16ee}
0: 1\x{660}\x{bef}
/^>\s+/utf,ucp
>\x{20}\x{a0}\x{1680}\x{2028}\x{2029}\x{202f}\x{9}\x{b}
0: > \x{a0}\x{1680}\x{2028}\x{2029}\x{202f}\x{09}\x{0b}
/^>\pZ+/utf,ucp
>\x{20}\x{a0}\x{1680}\x{2028}\x{2029}\x{202f}\x{9}\x{b}
0: > \x{a0}\x{1680}\x{2028}\x{2029}\x{202f}
/^>[[:space:]]*/utf,ucp
>\x{20}\x{a0}\x{1680}\x{2028}\x{2029}\x{202f}\x{9}\x{b}
0: > \x{a0}\x{1680}\x{2028}\x{2029}\x{202f}\x{09}\x{0b}
/^>[[:blank:]]*/utf,ucp
>\x{20}\x{a0}\x{1680}\x{2000}\x{202f}\x{9}\x{b}\x{2028}
0: > \x{a0}\x{1680}\x{2000}\x{202f}\x{09}
/^[[:alpha:]]*/utf,ucp
Az\x{aa}\x{c0}\x{1c5}\x{2b0}\x{3b6}\x{1d7c9}\x{2fa1d}
0: Az\x{aa}\x{c0}\x{1c5}\x{2b0}\x{3b6}\x{1d7c9}\x{2fa1d}
/^[[:alnum:]]*/utf,ucp
Az\x{aa}\x{c0}\x{1c5}\x{2b0}\x{3b6}\x{1d7c9}\x{2fa1d}1\x{660}\x{bef}\x{16ee}
0: Az\x{aa}\x{c0}\x{1c5}\x{2b0}\x{3b6}\x{1d7c9}\x{2fa1d}1\x{660}\x{bef}\x{16ee}
/^[[:cntrl:]]*/utf,ucp
\x{0}\x{09}\x{1f}\x{7f}\x{9f}
0: \x{00}\x{09}\x{1f}\x{7f}\x{9f}
/^[[:graph:]]*/utf,ucp
A\x{a1}\x{a0}
0: A\x{a1}
/^[[:print:]]*/utf,ucp
A z\x{a0}\x{a1}
0: A z\x{a0}\x{a1}
/^[[:punct:]]*/utf,ucp
.+\x{a1}\x{a0}
0: .+\x{a1}
/\p{Zs}*?\R/
\= Expect no match
a\xFCb
No match
/\p{Zs}*\R/
\= Expect no match
a\xFCb
No match
/ⱥ/i,utf
ⱥ
0: \x{2c65}
Ⱥx
0: \x{23a}
Ⱥ
0: \x{23a}
/[ⱥ]/i,utf
ⱥ
0: \x{2c65}
Ⱥx
0: \x{23a}
Ⱥ
0: \x{23a}
/Ⱥ/i,utf
Ⱥ
0: \x{23a}
ⱥ
0: \x{2c65}
# These are tests for extended grapheme clusters
/^\X/utf,aftertext
G\x{34e}\x{34e}X
0: G\x{34e}\x{34e}
0+ X
\x{34e}\x{34e}X
0: \x{34e}\x{34e}
0+ X
\x04X
0: \x{04}
0+ X
\x{1100}X
0: \x{1100}
0+ X
\x{1100}\x{34e}X
0: \x{1100}\x{34e}
0+ X
\x{1b04}\x{1b04}X
0: \x{1b04}\x{1b04}
0+ X
*These match up to the roman letters
0: *
0+ These match up to the roman letters
\x{1111}\x{1111}L,L
0: \x{1111}\x{1111}
0+ L,L
\x{1111}\x{1111}\x{1169}L,L,V
0: \x{1111}\x{1111}\x{1169}
0+ L,L,V
\x{1111}\x{ae4c}L, LV
0: \x{1111}\x{ae4c}
0+ L, LV
\x{1111}\x{ad89}L, LVT
0: \x{1111}\x{ad89}
0+ L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
0: \x{1111}\x{ae4c}\x{1169}
0+ L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
0: \x{1111}\x{ae4c}\x{1169}\x{1169}
0+ L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
0: \x{1111}\x{ae4c}\x{1169}\x{11fe}
0+ L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
0: \x{1111}\x{ad89}\x{11fe}
0+ L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
0: \x{1111}\x{ad89}\x{11fe}\x{11fe}
0+ L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
0: \x{ad89}\x{11fe}\x{11fe}
0+ LVT, T, T
*These match just the first codepoint (invalid sequence)
0: *
0+ These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
0: \x{1111}
0+ \x{11fe}L, T
\x{ae4c}\x{1111}LV, L
0: \x{ae4c}
0+ \x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
0: \x{ae4c}
0+ \x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
0: \x{ae4c}
0+ \x{ad89}LV, LVT
\x{1169}\x{1111}V, L
0: \x{1169}
0+ \x{1111}V, L
\x{1169}\x{ae4c}V, LV
0: \x{1169}
0+ \x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
0: \x{1169}
0+ \x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
0: \x{ad89}
0+ \x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
0: \x{ad89}
0+ \x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
0: \x{ad89}
0+ \x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
0: \x{ad89}
0+ \x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
0: \x{11fe}
0+ \x{1111}T, L
\x{11fe}\x{1169}T, V
0: \x{11fe}
0+ \x{1169}T, V
\x{11fe}\x{ae4c}T, LV
0: \x{11fe}
0+ \x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
0: \x{11fe}
0+ \x{ad89}T, LVT
*Test extend and spacing mark
0: *
0+ Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
0: \x{1111}\x{ae4c}\x{711}
0+ L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}
0+ L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}\x{711}\x{1b04}
0+ L, LV, spacing mark, extend, spacing mark
*Test CR, LF, and control
0: *
0+ Test CR, LF, and control
\x0d\x{0711}CR, extend
0: \x{0d}
0+ \x{711}CR, extend
\x0d\x{1b04}CR, spacingmark
0: \x{0d}
0+ \x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
0: \x{0a}
0+ \x{711}LF, extend
\x0a\x{1b04}LF, spacingmark
0: \x{0a}
0+ \x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
0: \x{0b}
0+ \x{711}Control, extend
\x09\x{1b04}Control, spacingmark
0: \x{09}
0+ \x{1b04}Control, spacingmark
*Test Extended Pictographic after bug fix
0: *
0+ Test Extended Pictographic after bug fix
\x{261d}\x{261d}B Extended_Pictographic Extended_Pictographic
0: \x{261d}
0+ \x{261d}B Extended_Pictographic Extended_Pictographic
\x{261D}\x{1F3FB}\x{261d}B Extended_Pictographic Extend E-P
0: \x{261d}\x{1f3fb}
0+ \x{261d}B Extended_Pictographic Extend E-P
\x{261D}\x{1F3FB}\x{200d}\x{261d}B Extended_Pictographic Extend ZWJ E-P
0: \x{261d}\x{1f3fb}\x{200d}\x{261d}
0+ B Extended_Pictographic Extend ZWJ E-P
\x{1f3f3}\x{fe0f}\x{200d}\x{1f308}\x{1f3f4}\x{200d}\x{2620}\x{fe0f}\x{1f3f3}\x{fe0f}\x{200d}\x{1f308}\x{1f3f4}\x{200d}\x{2620}\x{fe0f}
0: \x{1f3f3}\x{fe0f}\x{200d}\x{1f308}
0+ \x{1f3f4}\x{200d}\x{2620}\x{fe0f}\x{1f3f3}\x{fe0f}\x{200d}\x{1f308}\x{1f3f4}\x{200d}\x{2620}\x{fe0f}
A\x{200d}\x{1f308}B
0: A\x{200d}
0+ \x{1f308}B
A\x{200d}B A ZWJ
0: A\x{200d}
0+ B A ZWJ
\x{261D}\x{1F3FB}B Extended_Pictographic Extend
0: \x{261d}\x{1f3fb}
0+ B Extended_Pictographic Extend
\x{1F1E6}\x{1F1E7}B RegionalIndicator RegionalIndicator
0: \x{1f1e6}\x{1f1e7}
0+ B RegionalIndicator RegionalIndicator
*There are no Prepend characters, so we can't test Prepend, CR
0: *
0+ There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}?X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/\X*Z/utf,no_start_optimize
\= Expect no match
A\x{300}
No match
/\X*(.)/utf,no_start_optimize
A\x{1111}\x{ae4c}\x{1169}
0: A\x{1111}
1: \x{1111}
# --------------------------------------------
/\x{1e9e}+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/[z\x{1e9e}]+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{00df}+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/[z\x{00df}]+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{1f88}+/i,utf
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/[z\x{1f88}]+/i,utf
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
# Check a reference with more than one other case
/^(\x{00b5})\1{2}$/i,utf
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
1: \x{b5}
# Characters with more than one other case; test in classes
/[z\x{00b5}]+/i,utf
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/[z\x{039c}]+/i,utf
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/[z\x{03bc}]+/i,utf
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/[z\x{00c5}]+/i,utf
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/[z\x{00e5}]+/i,utf
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/[z\x{212b}]+/i,utf
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/[z\x{01c4}]+/i,utf
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/[z\x{01c5}]+/i,utf
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/[z\x{01c6}]+/i,utf
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/[z\x{01c7}]+/i,utf
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/[z\x{01c8}]+/i,utf
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/[z\x{01c9}]+/i,utf
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/[z\x{01ca}]+/i,utf
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/[z\x{01cb}]+/i,utf
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/[z\x{01cc}]+/i,utf
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/[z\x{01f1}]+/i,utf
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/[z\x{01f2}]+/i,utf
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/[z\x{01f3}]+/i,utf
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/[z\x{0345}]+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{0399}]+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{03b9}]+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{1fbe}]+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{0392}]+/i,utf
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/[z\x{03b2}]+/i,utf
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/[z\x{03d0}]+/i,utf
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/[z\x{0395}]+/i,utf
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/[z\x{03b5}]+/i,utf
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/[z\x{03f5}]+/i,utf
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/[z\x{0398}]+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{03b8}]+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{03d1}]+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{03f4}]+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{039a}]+/i,utf
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/[z\x{03ba}]+/i,utf
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/[z\x{03f0}]+/i,utf
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/[z\x{03a0}]+/i,utf
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/[z\x{03c0}]+/i,utf
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/[z\x{03d6}]+/i,utf
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/[z\x{03a1}]+/i,utf
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/[z\x{03c1}]+/i,utf
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/[z\x{03f1}]+/i,utf
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/[z\x{03a3}]+/i,utf
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/[z\x{03c2}]+/i,utf
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/[z\x{03c3}]+/i,utf
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/[z\x{03a6}]+/i,utf
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/[z\x{03c6}]+/i,utf
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/[z\x{03d5}]+/i,utf
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/[z\x{03c9}]+/i,utf
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/[z\x{03a9}]+/i,utf
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/[z\x{2126}]+/i,utf
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/[z\x{1e60}]+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e61}]+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e9b}]+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
# Perl 5.12.4 gets these wrong, but 5.15.3 is OK
/[z\x{004b}]+/i,utf
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/[z\x{006b}]+/i,utf
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/[z\x{212a}]+/i,utf
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/[z\x{0053}]+/i,utf
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/[z\x{0073}]+/i,utf
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/[z\x{017f}]+/i,utf
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/^[a-z\x{500}-\x{1000}]{3,}[a-h]|x/utf
ab\x{600}ijklmh
0: ab\x{600}ijklmh
ab\x{600}hijklm
0: ab\x{600}h
\= Expect no match
ab\x{600}ijklm
No match
/^[a-z\x{500}-\x{1000}]{4,7}[a-h]|x/utf
ab\x{600}\x{700}ijkh
0: ab\x{600}\x{700}ijkh
ab\x{600}\x{700}hijkl
0: ab\x{600}\x{700}h
\= Expect no match
ab\x{600}\x{700}ijklh
No match
ab\x{600}h\x{700}ijklmh
No match
/([a-z\x{1000}\x{2000}]{1,2}?u)+$/utf
\x{1000}uu\x{2000}u
0: \x{1000}uu\x{2000}u
1: u\x{2000}u
\x{1001}uuuu
0: uuuu
1: uu
\x{2001}uuuuu
0: uuuuu
1: uuu
uuuu\x{1fff}#u#\x{2000}\x{1000}u\x{2000}u
0: \x{2000}\x{1000}u\x{2000}u
1: \x{2000}u
\= Expect no match
abuabuabuabu!
No match
uuuuuuuuuuuu#
No match
# --------------------------------------
/(ΣΆΜΟΣ) \1/i,utf
ΣΆΜΟΣ ΣΆΜΟΣ
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
ΣΆΜΟΣ σάμος
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
σάμος σάμος
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος σάμοσ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος ΣΆΜΟΣ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
/(σάμος) \1/i,utf
ΣΆΜΟΣ ΣΆΜΟΣ
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
ΣΆΜΟΣ σάμος
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
σάμος σάμος
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος σάμοσ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος ΣΆΜΟΣ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
/(ΣΆΜΟΣ) \1*/i,utf
ΣΆΜΟΣ\x20
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
ΣΆΜΟΣ ΣΆΜΟΣσάμοςσάμος
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}\x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}\x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
# Perl matches these
/\x{00b5}+/i,utf
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{039c}+/i,utf
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{03bc}+/i,utf
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{00c5}+/i,utf
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{00e5}+/i,utf
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{212b}+/i,utf
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{01c4}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c5}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c6}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c7}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01c8}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01c9}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01ca}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01cb}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01cc}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01f1}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{01f2}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{01f3}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{0345}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{0399}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{03b9}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{1fbe}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{0392}+/i,utf
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{03b2}+/i,utf
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{03d0}+/i,utf
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{0395}+/i,utf
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{03b5}+/i,utf
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{03f5}+/i,utf
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{0398}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03b8}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03d1}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03f4}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{039a}+/i,utf
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03ba}+/i,utf
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03f0}+/i,utf
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03a0}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03c0}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03d6}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03a1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03c1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03f1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03a3}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03c2}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03c3}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03a6}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03c6}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03d5}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03c9}+/i,utf
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{03a9}+/i,utf
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{2126}+/i,utf
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{1e60}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e61}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e9b}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e9e}+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{00df}+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{1f88}+/i,utf
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/\x{1f80}+/i,utf
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
# Perl 5.12.4 gets these wrong, but 5.15.3 is OK
/\x{004b}+/i,utf
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{006b}+/i,utf
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{212a}+/i,utf
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{0053}+/i,utf
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/\x{0073}+/i,utf
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/\x{017f}+/i,utf
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/^\p{Any}*\d{4}/utf
1234
0: 1234
\= Expect no match
123
No match
/^\X*\w{4}/utf
1234
0: 1234
\= Expect no match
123
No match
/^A\s+Z/utf,ucp
A\x{2005}Z
0: A\x{2005}Z
A\x{85}\x{2005}Z
0: A\x{85}\x{2005}Z
/^A[\s]+Z/utf,ucp
A\x{2005}Z
0: A\x{2005}Z
A\x{85}\x{2005}Z
0: A\x{85}\x{2005}Z
/^[[:graph:]]+$/utf,ucp
Letter:ABC
0: Letter:ABC
Mark:\x{300}\x{1d172}\x{1d17b}
0: Mark:\x{300}\x{1d172}\x{1d17b}
Number:9\x{660}
0: Number:9\x{660}
Punctuation:\x{66a},;
0: Punctuation:\x{66a},;
Symbol:\x{6de}<>\x{fffc}
0: Symbol:\x{6de}<>\x{fffc}
Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
0: Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
\x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
0: \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
\x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
0: \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
\x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
0: \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
\x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
0: \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
\x{feff}
0: \x{feff}
\x{fff9}\x{fffa}\x{fffb}
0: \x{fff9}\x{fffa}\x{fffb}
\x{110bd}
0: \x{110bd}
\x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
0: \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
\x{e0001}
0: \x{e0001}
\x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
0: \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
\= Expect no match
\x{09}
No match
\x{0a}
No match
\x{1D}
No match
\x{20}
No match
\x{85}
No match
\x{a0}
No match
\x{1680}
No match
\x{2028}
No match
\x{2029}
No match
\x{202f}
No match
\x{2065}
No match
\x{3000}
No match
\x{e0002}
No match
\x{e001f}
No match
\x{e0080}
No match
/^[[:print:]]+$/utf,ucp
Space: \x{a0}
0: Space: \x{a0}
\x{1680}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}
0: \x{1680}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}
\x{2006}\x{2007}\x{2008}\x{2009}\x{200a}
0: \x{2006}\x{2007}\x{2008}\x{2009}\x{200a}
\x{202f}\x{205f}
0: \x{202f}\x{205f}
\x{3000}
0: \x{3000}
Letter:ABC
0: Letter:ABC
Mark:\x{300}\x{1d172}\x{1d17b}
0: Mark:\x{300}\x{1d172}\x{1d17b}
Number:9\x{660}
0: Number:9\x{660}
Punctuation:\x{66a},;
0: Punctuation:\x{66a},;
Symbol:\x{6de}<>\x{fffc}
0: Symbol:\x{6de}<>\x{fffc}
Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
0: Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
\x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
0: \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
\x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
0: \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
\x{202f}
0: \x{202f}
\x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
0: \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
\x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
0: \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
\x{feff}
0: \x{feff}
\x{fff9}\x{fffa}\x{fffb}
0: \x{fff9}\x{fffa}\x{fffb}
\x{110bd}
0: \x{110bd}
\x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
0: \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
\x{e0001}
0: \x{e0001}
\x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
0: \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
\= Expect no match
\x{09}
No match
\x{1D}
No match
\x{85}
No match
\x{2028}
No match
\x{2029}
No match
\x{2065}
No match
\x{e0002}
No match
\x{e001f}
No match
\x{e0080}
No match
/^[[:punct:]]+$/utf,ucp
\$+<=>^`|~
0: $+<=>^`|~
!\"#%&'()*,-./:;?@[\\]_{}
0: !"#%&'()*,-./:;?@[\]_{}
\x{a1}\x{a7}
0: \x{a1}\x{a7}
\x{37e}
0: \x{37e}
\= Expect no match
abcde
No match
/^[[:^graph:]]+$/utf,ucp
\x{09}\x{0a}\x{1D}\x{20}\x{85}\x{a0}\x{1680}
0: \x{09}\x{0a}\x{1d} \x{85}\x{a0}\x{1680}
\x{2028}\x{2029}\x{202f}\x{2065}
0: \x{2028}\x{2029}\x{202f}\x{2065}
\x{3000}\x{e0002}\x{e001f}\x{e0080}
0: \x{3000}\x{e0002}\x{e001f}\x{e0080}
\= Expect no match
Letter:ABC
No match
Mark:\x{300}\x{1d172}\x{1d17b}
No match
Number:9\x{660}
No match
Punctuation:\x{66a},;
No match
Symbol:\x{6de}<>\x{fffc}
No match
Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
No match
\x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
No match
\x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
No match
\x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
No match
\x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
No match
\x{feff}
No match
\x{fff9}\x{fffa}\x{fffb}
No match
\x{110bd}
No match
\x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
No match
\x{e0001}
No match
\x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
No match
/^[[:^print:]]+$/utf,ucp
\x{09}\x{1D}\x{85}\x{2028}\x{2029}\x{2065}
0: \x{09}\x{1d}\x{85}\x{2028}\x{2029}\x{2065}
\x{e0002}\x{e001f}\x{e0080}
0: \x{e0002}\x{e001f}\x{e0080}
\= Expect no match
Space: \x{a0}
No match
\x{1680}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}
No match
\x{2006}\x{2007}\x{2008}\x{2009}\x{200a}
No match
\x{202f}\x{205f}
No match
\x{3000}
No match
Letter:ABC
No match
Mark:\x{300}\x{1d172}\x{1d17b}
No match
Number:9\x{660}
No match
Punctuation:\x{66a},;
No match
Symbol:\x{6de}<>\x{fffc}
No match
Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
No match
\x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
No match
\x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
No match
\x{202f}
No match
\x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
No match
\x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
No match
\x{feff}
No match
\x{fff9}\x{fffa}\x{fffb}
No match
\x{110bd}
No match
\x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
No match
\x{e0001}
No match
\x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
No match
/^[[:^punct:]]+$/utf,ucp
abcde
0: abcde
\= Expect no match
\$+<=>^`|~
No match
!\"#%&'()*,-./:;?@[\\]_{}
No match
\x{a1}\x{a7}
No match
\x{37e}
No match
/[RST]+/i,utf,ucp
Ss\x{17f}
0: Ss\x{17f}
/[R-T]+/i,utf,ucp
Ss\x{17f}
0: Ss\x{17f}
/[q-u]+/i,utf,ucp
Ss\x{17f}
0: Ss\x{17f}
/^s?c/im,utf
scat
0: sc
# The next four tests are for repeated caseless back references when the
# code unit length of the matched text is different to that of the original
# group in the UTF-8 case.
/^(\x{23a})\1*(.)/i,utf
\x{23a}\x{23a}\x{23a}\x{23a}
0: \x{23a}\x{23a}\x{23a}\x{23a}
1: \x{23a}
2: \x{23a}
\x{23a}\x{2c65}\x{2c65}\x{2c65}
0: \x{23a}\x{2c65}\x{2c65}\x{2c65}
1: \x{23a}
2: \x{2c65}
\x{23a}\x{23a}\x{2c65}\x{23a}
0: \x{23a}\x{23a}\x{2c65}\x{23a}
1: \x{23a}
2: \x{23a}
/^(\x{23a})\1*(..)/i,utf
\x{23a}\x{2c65}\x{2c65}\x{2c65}
0: \x{23a}\x{2c65}\x{2c65}\x{2c65}
1: \x{23a}
2: \x{2c65}\x{2c65}
\x{23a}\x{23a}\x{2c65}\x{23a}
0: \x{23a}\x{23a}\x{2c65}\x{23a}
1: \x{23a}
2: \x{2c65}\x{23a}
/^(\x{23a})\1*(...)/i,utf
\x{23a}\x{2c65}\x{2c65}\x{2c65}
0: \x{23a}\x{2c65}\x{2c65}\x{2c65}
1: \x{23a}
2: \x{2c65}\x{2c65}\x{2c65}
\x{23a}\x{23a}\x{2c65}\x{23a}
0: \x{23a}\x{23a}\x{2c65}\x{23a}
1: \x{23a}
2: \x{23a}\x{2c65}\x{23a}
/^(\x{23a})\1*(....)/i,utf
\= Expect no match
\x{23a}\x{2c65}\x{2c65}\x{2c65}
No match
\x{23a}\x{23a}\x{2c65}\x{23a}
No match
/[A-`]/i,utf
abcdefghijklmno
0: a
/[\S\V\H]/utf
/[^\p{Any}]*+x/utf
x
0: x
/[[:punct:]]/utf,ucp
\x{b4}
No match
/[[:^ascii:]]/utf,ucp
\x{100}
0: \x{100}
\x{200}
0: \x{200}
\x{300}
0: \x{300}
\x{37e}
0: \x{37e}
\= Expect no match
aa
No match
99
No match
/[[:^ascii:]\w]/utf,ucp
aa
0: a
99
0: 9
gg
0: g
\x{100}
0: \x{100}
\x{200}
0: \x{200}
\x{300}
0: \x{300}
\x{37e}
0: \x{37e}
/[\w[:^ascii:]]/utf,ucp
aa
0: a
99
0: 9
gg
0: g
\x{100}
0: \x{100}
\x{200}
0: \x{200}
\x{300}
0: \x{300}
\x{37e}
0: \x{37e}
/[^[:ascii:]\W]/utf,ucp
\x{100}
0: \x{100}
\x{200}
0: \x{200}
\= Expect no match
aa
No match
99
No match
gg
No match
\x{37e}
No match
/[^[:^ascii:]\d]/utf,ucp
a
0: a
~
0: ~
\a
0: \x{07}
\x{7f}
0: \x{7f}
\= Expect no match
0
No match
\x{389}
No match
\x{20ac}
No match
/(?=.*b)\pL/
11bb
0: b
/(?(?=.*b)(?=.*b)\pL|.*c)/
11bb
0: b
/^\x{123}+?$/utf,no_auto_possess
\x{123}\x{123}\x{123}
0: \x{123}\x{123}\x{123}
/^\x{123}+?$/i,utf,no_auto_possess
\x{123}\x{122}\x{123}
0: \x{123}\x{122}\x{123}
\= Expect no match
\x{123}\x{124}\x{123}
No match
/\N{U+1234}/utf
\x{1234}
0: \x{1234}
/[\N{U+1234}]/utf
\x{1234}
0: \x{1234}
/(\x{1234}) \1/utf
\N{U+1234} \o{11064}
0: \x{1234} \x{1234}
1: \x{1234}
# Test the full list of Unicode "Pattern White Space" characters that are to
# be ignored by /x. The pattern lines below may show up oddly in text editors
# or when listed to the screen. Note that characters such as U+2002, which are
# matched as space by \h and \v are *not* "Pattern White Space".
/A…B/x,utf
AB
0: AB
/AB/x,utf
A\x{2002}B
0: A\x{2002}B
\= Expect no match
AB
No match
# -------
/[^\x{100}-\x{ffff}]*[\x80-\xff]/utf
\x{99}\x{99}\x{99}
0: \x{99}\x{99}\x{99}
/[^\x{100}-\x{ffff}ABC]*[\x80-\xff]/utf
\x{99}\x{99}\x{99}
0: \x{99}\x{99}\x{99}
/[^\x{100}-\x{ffff}]*[\x80-\xff]/i,utf
\x{99}\x{99}\x{99}
0: \x{99}\x{99}\x{99}
# Script run tests
/^(*script_run:.{4})/utf
abcd Latin x4
0: abcd
\x{2e80}\x{2fa1d}\x{3041}\x{30a1} Han Han Hiragana Katakana
0: \x{2e80}\x{2fa1d}\x{3041}\x{30a1}
\x{3041}\x{30a1}\x{3007}\x{3007} Hiragana Katakana Han Han
0: \x{3041}\x{30a1}\x{3007}\x{3007}
\x{30a1}\x{3041}\x{3007}\x{3007} Katakana Hiragana Han Han
0: \x{30a1}\x{3041}\x{3007}\x{3007}
\x{1100}\x{2e80}\x{2e80}\x{1101} Hangul Han Han Hangul
0: \x{1100}\x{2e80}\x{2e80}\x{1101}
\x{2e80}\x{3105}\x{2e80}\x{3105} Han Bopomofo Han Bopomofo
0: \x{2e80}\x{3105}\x{2e80}\x{3105}
\x{02ea}\x{2e80}\x{2e80}\x{3105} Bopomofo-Sk Han Han Bopomofo
0: \x{2ea}\x{2e80}\x{2e80}\x{3105}
\x{3105}\x{2e80}\x{2e80}\x{3105} Bopomofo Han Han Bopomofo
0: \x{3105}\x{2e80}\x{2e80}\x{3105}
\x{0300}cd! Inherited Latin Latin Common
0: \x{300}cd!
\x{0391}12\x{03a9} Greek Common-digits Greek
0: \x{391}12\x{3a9}
\x{0400}12\x{fe2f} Cyrillic Common-digits Cyrillic
0: \x{400}12\x{fe2f}
\x{0531}12\x{fb17} Armenian Common-digits Armenian
0: \x{531}12\x{fb17}
\x{0591}12\x{fb4f} Hebrew Common-digits Hebrew
0: \x{591}12\x{fb4f}
\x{0600}12\x{1eef1} Arabic Common-digits Arabic
0: \x{600}12\x{1eef1}
\x{0600}\x{0660}\x{0669}\x{1eef1} Arabic Arabic-digits Arabic
0: \x{600}\x{660}\x{669}\x{1eef1}
\x{0700}12\x{086a} Syriac Common-digits Syriac
0: \x{700}12\x{86a}
\x{1200}12\x{ab2e} Ethiopic Common-digits Ethiopic
0: \x{1200}12\x{ab2e}
\x{1680}12\x{169c} Ogham Common-digits Ogham
0: \x{1680}12\x{169c}
\x{3041}12\x{3041} Hiragana Common-digits Hiragana
0: \x{3041}12\x{3041}
\x{0980}\x{09e6}\x{09e7}\x{0993} Bengali Bengali-digits Bengali
0: \x{980}\x{9e6}\x{9e7}\x{993}
!cde Common Latin Latin Latin
0: !cde
A..B Latin Common Common Latin
0: A..B
0abc Ascii-digit Latin Latin Latin
0: 0abc
1\x{0700}\x{0700}\x{0700} Ascii-digit Syriac x 3
0: 1\x{700}\x{700}\x{700}
\x{1A80}\x{1A80}\x{1a40}\x{1a41} Tai Tham Hora digits, letters
0: \x{1a80}\x{1a80}\x{1a40}\x{1a41}
\= Expect no match
a\x{370}bcd Latin Greek Latin Latin
No match
\x{1100}\x{02ea}\x{02ea}\x{02ea} Hangul Bopomofo x3
No match
\x{02ea}\x{02ea}\x{02ea}\x{1100} Bopomofo x3 Hangul
No match
\x{1100}\x{2e80}\x{3041}\x{1101} Hangul Han Hiragana Hangul
No match
\x{0391}\x{09e6}\x{09e7}\x{03a9} Greek Bengali digits Greek
No match
\x{0600}7\x{0669}\x{1eef1} Arabic ascii-digit Arabic-digit Arabic
No match
\x{0600}\x{0669}7\x{1eef1} Arabic Arabic-digit ascii-digit Arabic
No match
A5\x{ff19}B Latin Common-ascii/notascii-digits Latin
No match
\x{0300}cd\x{0391} Inherited Latin Latin Greek
No match
!cd\x{0391} Common Latin Latin Greek
No match
\x{1A80}\x{1A90}\x{1a40}\x{1a41} Tai Tham Hora digit, Tham digit, letters
No match
A\x{1d7ce}\x{1d7ff}B Common fancy-common-2-sets-digits Common
No match
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
No match
/^(*sr:.{4}|..)/utf
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
0: \x{2e80}\x{3105}
/^(*atomic_script_run:.{4}|..)/utf
\= Expect no match
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
No match
/^(*asr:.*)/utf
\= Expect no match
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
No match
/^(?>(*sr:.*))/utf
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
0: \x{2e80}\x{3105}\x{2e80}
/^(*sr:.*)/utf
\x{2e80}\x{3105}\x{2e80}\x{30a1} Han Bopomofo Han Katakana
0: \x{2e80}\x{3105}\x{2e80}
\x{10fffd}\x{10fffd}\x{10fffd} Private use (Unknown)
0: \x{10fffd}
/^(*sr:\x{2e80}*)/utf
\x{2e80}\x{2e80}\x{3105} Han Han Bopomofo
0: \x{2e80}\x{2e80}
/^(*sr:\x{2e80}*)\x{2e80}/utf
\x{2e80}\x{2e80}\x{3105} Han Han Bopomofo
0: \x{2e80}\x{2e80}
/^(*sr:.*)Test/utf
Test script run on an empty string
0: Test
/^(*sr:(.{2})){2}/utf
\x{0600}7\x{0669}\x{1eef1} Arabic ascii-digit Arabic-digit Arabic
0: \x{600}7\x{669}\x{1eef1}
1: \x{669}\x{1eef1}
\x{1A80}\x{1A80}\x{1a40}\x{1a41} Tai Tham Hora digits, letters
0: \x{1a80}\x{1a80}\x{1a40}\x{1a41}
1: \x{1a40}\x{1a41}
\x{1A80}\x{1a40}\x{1A90}\x{1a41} Tai Tham Hora digit, letter, Tham digit, letter
0: \x{1a80}\x{1a40}\x{1a90}\x{1a41}
1: \x{1a90}\x{1a41}
\= Expect no match
\x{1100}\x{2e80}\x{3041}\x{1101} Hangul Han Hiragana Hangul
No match
/^(*sr:\S*)/utf
\x{1cf4}\x{20f0}\x{900}\x{11305} [Dev,Gran,Kan] [Dev,Gran,Lat] Dev Gran
0: \x{1cf4}\x{20f0}\x{900}
\x{1cf4}\x{20f0}\x{11305}\x{900} [Dev,Gran,Kan] [Dev,Gran,Lat] Gran Dev
0: \x{1cf4}\x{20f0}\x{11305}
\x{1cf4}\x{20f0}\x{900}ABC [Dev,Gran,Kan] [Dev,Gran,Lat] Dev Lat
0: \x{1cf4}\x{20f0}\x{900}
\x{1cf4}\x{20f0}ABC [Dev,Gran,Kan] [Dev,Gran,Lat] Lat
0: \x{1cf4}\x{20f0}
\x{20f0}ABC [Dev,Gran,Lat] Lat
0: \x{20f0}ABC
XYZ\x{20f0}ABC Lat [Dev,Gran,Lat] Lat
0: XYZ\x{20f0}ABC
\x{a36}\x{a33}\x{900} [Dev,...] [Dev,...] Dev
0: \x{a36}\x{a33}
\x{3001}\x{2e80}\x{3041}\x{30a1} [Bopo, Han, etc] Han Hira Kata
0: \x{3001}\x{2e80}\x{3041}\x{30a1}
\x{3001}\x{30a1}\x{2e80}\x{3041} [Bopo, Han, etc] Kata Han Hira
0: \x{3001}\x{30a1}\x{2e80}\x{3041}
\x{3001}\x{3105}\x{2e80}\x{1101} [Bopo, Han, etc] Bopomofo Han Hangul
0: \x{3001}\x{3105}\x{2e80}
\x{3105}\x{3001}\x{2e80}\x{1101} Bopomofo [Bopo, Han, etc] Han Hangul
0: \x{3105}\x{3001}\x{2e80}
\x{3031}\x{3041}\x{30a1}\x{2e80} [Hira Kata] Hira Kata Han
0: \x{3031}\x{3041}\x{30a1}\x{2e80}
\x{060c}\x{06d4}\x{0600}\x{10d00}\x{0700} [Arab Rohg Syrc Thaa] [Arab Rohg] Arab Rohg Syrc
0: \x{60c}\x{6d4}\x{600}
\x{060c}\x{06d4}\x{0700}\x{0600}\x{10d00} [Arab Rohg Syrc Thaa] [Arab Rohg] Syrc Arab Rohg
0: \x{60c}\x{6d4}
\x{2e80}\x{3041}\x{3001}\x{3031}\x{2e80} Han Hira [Bopo, Han, etc] [Hira Kata] Han
0: \x{2e80}\x{3041}\x{3001}\x{3031}\x{2e80}
/(?<!)(*sr:)/
/(?<!X(*sr:B)C)/
/(?<=abc(?=X(*sr:BCY)Z)XBCYZ)./
abcXBCYZ!
0: !
/(?<=abc(?=X(*sr:BXY)CCC)XBXYCCC)./
abcXBXYCCC!
0: !
/^(*sr:\S*)/utf
\x{10d00}\x{10d00}\x{06d4} Rohingya Rohingya Arabic-full-stop
0: \x{10d00}\x{10d00}\x{6d4}
\x{06d4}\x{10d00}\x{10d00} Arabic-full-stop Rohingya Rohingya
0: \x{6d4}\x{10d00}\x{10d00}
\x{10d00}\x{10d00}\x{0363} Rohingya Rohingya Inherited-extend-Latin
0: \x{10d00}\x{10d00}
\x{0363}\x{10d00}\x{10d00} Inherited-extend-Latin Rohingya Rohingya
0: \x{363}
AB\x{0363} Latin Latin Inherited-extend-Latin
0: AB\x{363}
\x{0363}AB Inherited-extend-Latin Latin Latin
0: \x{363}AB
AB\x{1cf7} Latin Latin Common-extended-Beng
0: AB
\x{1cf7}AB Common-extend-Beng Latin Latin
0: \x{1cf7}
\x{1cf7}\x{0993} Common-extend-Beng Bengali
0: \x{1cf7}\x{993}
A\x{1abe}BC Test enclosing mark
0: A\x{1abe}BC
\x{0370}\x{1abe}\x{0371} Which can occur with any script (Greek here)
0: \x{370}\x{1abe}\x{371}
\x{3001}\x{adf9}\x{3001} [.. Hangul ..] Hangul [.. Hangul ..]
0: \x{3001}\x{adf9}\x{3001}
\x{3400}\x{3001}XXX Han [Han etc.]
0: \x{3400}\x{3001}
\x{3400}\x{1cd5} Han [Bengali Devanagari]
0: \x{3400}
\x{ac01}\x{3400} Hangul [.. Hangul ..]
0: \x{ac01}\x{3400}
\x{ac01}\x{1cd5} Hangul [Bengali Devanagari]
0: \x{ac01}
\x{102e0}\x{06d4}\x{1ee4d} [Arabic Coptic] [Arab Rohingya] Arabic
0: \x{102e0}\x{6d4}\x{1ee4d}
\x{102e0}\x{06d4}\x{2cc9} [Arabic Coptic] [Arab Rohingya] Coptic
0: \x{102e0}\x{6d4}
\x{102e0}\x{06d4}\x{10d30} [Arabic Coptic] [Arab Rohingya] Rohingya
0: \x{102e0}\x{6d4}
# Test loop breaking for empty string match
/^(*sr:A|)*BCD/utf
AABCD
0: AABCD
ABCD
0: ABCD
BCD
0: BCD
# The use of (*ACCEPT) breaks script run checking
/^(*sr:.*(*ACCEPT)ZZ)/utf
\x{1100}\x{2e80}\x{3041}\x{1101} Hangul Han Hiragana Hangul
0: \x{1100}\x{2e80}\x{3041}\x{1101} Hangul Han Hiragana Hangul
# -------
# Test group names containing non-ASCII letters and digits
/(?'ABáC'...)\g{ABáC}/utf
abcabcdefg
0: abcabc
1: abc
/(?'XʰABC'...)/utf
xyzpq
0: xyz
1: xyz
/(?'XאABC'...)/utf
12345
0: 123
1: 123
/(?'XᾈABC'...)/utf
%^&*(...
0: %^&
1: %^&
/(?'𐨐ABC'...)/utf
abcde
0: abc
1: abc
/^(?'אABC'...)(?&אABC)(?P=אABC)/utf
123123123456
0: 123123123
1: 123
/^(?'אABC'...)(?&אABC)/utf
123123123456
0: 123123
1: 123
/\X*/
\xF3aaa\xE4\xEA\xEB\xFEa
0: \xf3aaa\xe4\xea\xeb\xfea
/Я/i,utf
\x{42f}
0: \x{42f}
\x{44f}
0: \x{44f}
/(?=Я)/i,utf
\x{42f}
0:
\x{44f}
0:
# -----------------------------------------------------------------------------
# Tests for bidi control and bidi class properties.
/\p{ bidi_control }/utf
-->\x{202c}<--
0: \x{202c}
/\p{bidicontrol}+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}\x{2067}\x{2068}\x{2069}
/\p{bidic}+?/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}
/\p{bidi_control}++/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}\x{2067}\x{2068}\x{2069}
/[\p{bidi_c}]/utf
-->\x{202c}<--
0: \x{202c}
/[\p{bidicontrol}]+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}\x{2067}\x{2068}\x{2069}
/[\p{bidicontrol}]+?/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}
/[\p{bidicontrol}]++/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}\x{2067}\x{2068}\x{2069}
/[\p{bidicontrol}<>]+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: >\x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: >\x{2066}\x{2067}\x{2068}\x{2069}<
/\P{bidicontrol}+/g,utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: -->
0: <--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: -->
0: <--
/\p{^bidicontrol}+/g,utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: -->
0: <--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: -->
0: <--
/\p{bidi class = al}/utf
-->\x{061D}<--
0: \x{61d}
/\p{bc = al}+/utf
-->\x{061D}\x{061e}\x{061f}<--
0: \x{61d}\x{61e}\x{61f}
/\p{bidi_class : AL}+?/utf
-->\x{061D}\x{061e}\x{061f}<--
0: \x{61d}
/\p{Bidi_Class : AL}++/utf
-->\x{061D}\x{061e}\x{061f}<--
0: \x{61d}\x{61e}\x{61f}
/\p{b_c = aN}+/utf
-->\x{061D}\x{0602}\x{0604}\x{061f}<--
0: \x{602}\x{604}
/\p{bidi class = B}+/utf
-->\x{0a}\x{0d}\x{01c}\x{01e}\x{085}\x{2029}<--
0: \x{0a}\x{0d}\x{1c}\x{1e}\x{85}\x{2029}
/\p{bidi class:BN}+/utf
-->\x{0}\x{08}\x{200c}\x{fffe}\x{dfffe}\x{10ffff}<--
0: \x{00}\x{08}\x{200c}\x{fffe}\x{dfffe}\x{10ffff}
/\p{bidiclass:cs}+/utf
-->,.\x{060c}\x{ff1a}<--
0: ,.\x{60c}\x{ff1a}
/\p{bidiclass:En}+/utf
-->09\x{b2}\x{2074}\x{1fbf9}<--
0: 09\x{b2}\x{2074}\x{1fbf9}
/\p{bidiclass:es}+/utf
==>+-\x{207a}\x{ff0d}<==
0: +-\x{207a}\x{ff0d}
/\p{bidiclass:et}+/utf
-->#\{24}%\x{a2}\x{A838}\x{1e2ff}<--
0: #
/\p{bidiclass:FSI}+/utf
-->\x{2068}<--
0: \x{2068}
/\p{bidi class:L}+/utf
-->ABC<--
0: ABC
/\P{bidi class:L}+/utf
-->ABC<--
0: -->
/\p{bidi class:LRE}+\p{bidiclass=lri}*\p{bidiclass:lro}/utf
-->\x{202a}\x{2066}\x{202d}<--
0: \x{202a}\x{2066}\x{202d}
/\p{bidi class:NSM}+/utf
-->\x{9bc}\x{a71}\x{e31}<--
0: \x{9bc}\x{a71}\x{e31}
/\p{bidi class:ON}+/utf
-->\x{21}'()*;@\x{384}\x{2039}<=-
0: >!'()*;@\x{384}\x{2039}<=
/\p{bidiclass:pdf}\p{bidiclass:pdi}/utf
-->\x{202c}\x{2069}<--
0: \x{202c}\x{2069}
/\p{bidi class:R}+/utf
-->\x{590}\x{5c6}\x{200f}\x{10805}<--
0: \x{590}\x{5c6}\x{200f}\x{10805}
/\p{bidi class:RLE}+\p{bidi class:RLI}*\p{bidi class:RLO}+/utf
-->\x{202b}\x{2067}\x{202e}<--
0: \x{202b}\x{2067}\x{202e}
/\p{bidi class:S}+\p{bidiclass:WS}+/utf
-->\x{9}\x{b}\x{1f} \x{c} \x{2000} \x{3000}<--
0: \x{09}\x{0b}\x{1f} \x{0c} \x{2000} \x{3000}
# -----------------------------------------------------------------------------
/[\p{taml}\p{sc:ugar}]+/utf
\x{0b82}\x{10380}
0: \x{b82}\x{10380}
/^[\p{sc:Arabic}]/utf
\= Expect no match
\x{650}
No match
\x{651}
No match
\x{652}
No match
\x{653}
No match
\x{654}
No match
\x{655}
No match
# -----------------------------------------------------------------------------
# Tests for newly-added Boolean Properties
/\p{ahex}\p{asciihexdigit}/utf
>4F<
0: 4F
/\p{alpha}\p{alphabetic}/g,utf
>AB<>\x{148}\x{1234}
0: AB
0: \x{148}\x{1234}
/\p{ascii}\p{ascii}/g,utf
>AB<>\x{148}\x{1234}
0: >A
0: B<
/\p{Bidi_C}\p{bidicontrol}/g,utf
>\x{202d}\x{2069}<
0: \x{202d}\x{2069}
/\p{Bidi_M}\p{bidimirrored}/g,utf
>\x{202d}\x{2069}<>\x{298b}\x{bb}<
0: <>
0: \x{298b}\x{bb}
/\p{cased}\p{cased}/g,utf
>AN<>\x{149}\x{120}<
0: AN
0: \x{149}\x{120}
/\p{caseignorable}\p{ci}/g,utf
>AN<>\x{60}\x{859}<
0: `\x{859}
/\p{changeswhencasefolded}\p{cwcf}/g,utf
>AN<>\x{149}\x{120}<
0: AN
0: \x{149}\x{120}
/\p{changeswhencasemapped}\p{cwcm}/g,utf
>AN<>\x{149}\x{120}<
0: AN
0: \x{149}\x{120}
/\p{changeswhenlowercased}\p{cwl}/g,utf
>AN<>\x{149}\x{120}<>yz<
0: AN
/\p{changeswhenuppercased}\p{cwu}/g,utf
>AN<>\x{149}\x{120}<>yz<
0: yz
/\p{changeswhentitlecased}\p{cwt}/g,utf
>AN<>\x{149}\x{120}<>yz<
0: yz
/\p{dash}\p{dash}/g,utf
>\x{2d}\x{1400}<>yz<
0: -\x{1400}
/\p{defaultignorablecodepoint}\p{di}/g,utf
>AN<>\x{ad}\x{e0fff}<>yz<
0: \x{ad}\x{e0fff}
/\p{deprecated}\p{dep}/g,utf
>AN<>\x{149}\x{e0001}<>yz<
0: \x{149}\x{e0001}
/\p{diacritic}\p{dia}/g,utf
>AN<>\x{f84}\x{5e}<>yz<
0: \x{f84}^
/\p{emojicomponent}\p{ecomp}/g,utf
>AN<>\x{200d}\x{e007f}<>yz<
0: \x{200d}\x{e007f}
/\p{emojimodifier}\p{emod}/g,utf
>AN<>\x{1f3fb}\x{1f3ff}<>yz<
0: \x{1f3fb}\x{1f3ff}
/\p{emojipresentation}\p{epres}/g,utf
>AN<>\x{2653}\x{1f6d2}<>yz<
0: \x{2653}\x{1f6d2}
/\p{extender}\p{ext}/g,utf
>AN<>\x{1e944}\x{b7}<>yz<
0: \x{1e944}\x{b7}
/\p{extendedpictographic}\p{extpict}/g,utf
>AN<>\x{26cf}\x{ae}<>yz<
0: \x{26cf}\x{ae}
/\p{graphemebase}\p{grbase}/g,utf
>AN<>\x{10f}\x{60}<>yz<
0: >A
0: N<
0: >\x{10f}
0: `<
0: >y
0: z<
/\p{graphemeextend}\p{grext}/g,utf
>AN<>\x{300}\x{b44}<>yz<
0: \x{300}\x{b44}
/\p{hexdigit}\p{hex}/g,utf
>AF23<>\x{ff46}\x{ff10}<>yz<
0: AF
0: 23
0: \x{ff46}\x{ff10}
/\p{idcontinue}\p{idc}/g,utf
>AF23<>\x{146}\x{7a}<>yz<
0: AF
0: 23
0: \x{146}z
0: yz
/\p{ideographic}\p{ideo}/g,utf
>AF23<>\x{30000}\x{3006}<>yz<
0: \x{30000}\x{3006}
/\p{idstart}\p{ids}/g,utf
>AF23<>\x{146}\x{7a}<>yz<
0: AF
0: \x{146}z
0: yz
/\p{idsbinaryoperator}\p{idsb}/g,utf
>AF23<>\x{2ff0}\x{2ffb}<>yz<\x{2ff2}\x{2ff1}
0: \x{2ff0}\x{2ffb}
/\p{idstrinaryoperator}\p{idst}/g,utf
>AF23<>\x{2ff2}\x{2ff3}<>yz<
0: \x{2ff2}\x{2ff3}
/\p{Join Control}\p{joinc}/g,utf
>AF23<>\x{200c}\x{200d}<>yz<
0: \x{200c}\x{200d}
/\p{logical_order_exception}\p{loe}/g,utf
>AF23<>\x{e40}\x{aabc}<>yz<
0: \x{e40}\x{aabc}
/\p{Lowercase}\p{lower}/g,utf
>AF23<>\x{146}\x{7a}<>yz<
0: \x{146}z
0: yz
/\p{math}\p{math}/g,utf
>AF23<>\x{2215}\x{2b}<>yz<
0: <>
0: \x{2215}+
0: <>
/\p{Non Character Code Point}\p{nchar}/g,utf
>AF23<>\x{10ffff}\x{fdd0}<>yz<
0: \x{10ffff}\x{fdd0}
/\p{patternsyntax}\p{patsyn}/g,utf
>AF23<>\x{21cd}\x{21}<>yz<
0: <>
0: \x{21cd}!
0: <>
/\p{patternwhitespace}\p{patws}/g,utf
>AF23<>\x{2029}\x{85}<>yz<
0: \x{2029}\x{85}
/\p{prependedconcatenationmark}\p{pcm}/g,utf
>AF23<>\x{600}\x{110cd}<>yz<
0: \x{600}\x{110cd}
/\p{quotationmark}\p{qmark}/g,utf
>AF23<>\x{ff63}\x{22}<>yz<
0: \x{ff63}"
/\p{radical}\p{radical}/g,utf
>AF23<>\x{2fd5}\x{2e80}<>yz<
0: \x{2fd5}\x{2e80}
/\p{regionalindicator}\p{ri}/g,utf
>AF23<>\x{1f1e6}\x{1f1ff}<>yz<
0: \x{1f1e6}\x{1f1ff}
/=\p{whitespace}\p{space}\p{wspace}=/g,utf
>AF23<=\x{d}\x{1680}\x{3000}=>yz<
0: =\x{0d}\x{1680}\x{3000}=
/\p{sentenceterminal}\p{sterm}/g,utf
>AF23<>\x{1da88}\x{2e}<>yz<
0: \x{1da88}.
/\p{terminalpunctuation}\p{term}/g,utf
>AF23<>\x{1da88}\x{2e}<>yz<
0: \x{1da88}.
/\p{unified ideograph}\p{uideo}/g,utf
>AF23<>\x{30000}\x{3400}<>yz<
0: \x{30000}\x{3400}
/\p{UPPERcase}\p{upper}/g,utf
>AF23<>\x{146}\x{7a}<>yz<
0: AF
/\p{variationselector}\p{vs}/g,utf
>AF23<>\x{180b}\x{e01ef}<>yz<
0: \x{180b}\x{e01ef}
/\p{xidcontinue}\p{xidc}/g,utf
>AF23<>\x{146}\x{30}<>yz<
0: AF
0: 23
0: \x{146}0
0: yz
# -----------------------------------------------------------------------------
# Variable-length lookbehinds.
/(?<=áb?c).../g,utf
ábcdèfgácxyz
0: d\x{e8}f
0: xyz
/(?<=PQR|áb?c).../g,utf
ábcdèfgácxyzPQR123
0: d\x{e8}f
0: xyz
0: 123
/(?<=áb?c|PQR).../g,utf
ábcdèfgácxyzPQR123
0: d\x{e8}f
0: xyz
0: 123
/(?<=PQ|áb?c).../g,utf
ábcdèfgácxyzPQR123
0: d\x{e8}f
0: xyz
0: R12
/(?<=áb?c|PQ).../g,utf
ábcdèfgácxyzPQR123
0: d\x{e8}f
0: xyz
0: R12
/(?<=á(b?c|d?è?è)f)X./g,utf
ácfX1zzzáèfX2zzzádèèfX3zzzX4zzz
0: X1
1: c
0: X2
1: \x{e8}
0: X3
1: d\x{e8}\x{e8}
/(?<!á(b?c|d?è?è)f)X./g,utf
ácfX1zzzáèfX2zzzádèèfX3zzzX4zzz
0: X4
/(?(?<=áb?c)d|è)/utf
ábcdèfg
0: d
ácdèfg
0: d
áxdèfg
0: \x{e8}
/(?<=\d{2,3}|áBC)./utf
áBCD
0: D
/(?<=á(b?c){3}d)X/utf
ZXácbccdXYZ
0: X
1: c
/(?<=á(b?c){0}d)X/utf
ZXádXYZ
0: X
/(?<=á?(b?c){0}d)X./utf
ZXádXYZ
0: XY
# --------------------------------------------------------------------------
/\N{ U+1234 }/utf
\x{1234}
0: \x{1234}
/\o{ 1234 }/utf
x\o{1234}y
0: \x{29c}
/\x{ 1234 }/utf
x\x{1234}y
0: \x{1234}
/\p{ L }/
23AB56
0: A
/\w+/utf,ucp
--cafe\x{300}_au\x{203f}lait!
0: cafe\x{300}_au\x{203f}lait
/[\w]+/utf,ucp
--cafe\x{300}_au\x{203f}lait!
0: cafe\x{300}_au\x{203f}lait
/[[:word:]]+/utf,ucp
--cafe\x{300}_au\x{203f}lait!
0: cafe\x{300}_au\x{203f}lait
/[[:xdigit:]]+/utf,ucp
--123ef\x{ff10}\x{ff19}\x{ff21}\x{ff26}\x{ff1a}
0: 123ef\x{ff10}\x{ff19}\x{ff21}\x{ff26}
/\b.+?\b/utf,ucp
--cafe\x{300}_au\x{203f}lait!
0: cafe\x{300}_au\x{203f}lait
/caf\B.+?\B/utf,ucp
--cafe\x{300}_au\x{203f}lait!
0: cafe
# --------------------------------------------------------------------------
# Case-independent matching property tests added after changing PCRE2 to be
# compatible with Perl. All three cases (upper, lower, title) conflate.
/\p{Lu}\p{Ll}\P{Lu}\P{Ll}/utf
>AbbD<
0: AbbD
>Abb\x{01c5}<
0: Abb\x{1c5}
\= Expect no match
>aBBd<
No match
>aB!!<
No match
/\p{Lu}\p{Ll}\P{Lu}\P{Ll}/i,utf
>aB!!<
0: aB!!
>\x{01c5}B!!<
0: \x{1c5}B!!
\= Expect no match
>AbbD<
No match
>aBBd<
No match
>Abb\x{01c5}<
No match
/[.\p{Lu}][.\p{Ll}][.\P{Lu}][.\P{Ll}]/i,utf
>aB!!<
0: aB!!
\= Expect no match
>AbbD<
No match
>aBBd<
No match
>Abb\x{01c5}<
No match
/[\p{Lt}\x{36b}][\P{Lt}\x{10a0}]/i,utf
>A!<
0: A!
>\x{3c9}\x{58d}<
0: \x{3c9}\x{58d}
>\x{413}\x{940}<
0: \x{413}\x{940}
\= Expect no match
\x{3c9}\x{3c9}
No match
\x{58d}\x{58d}
No match
\x{413}\x{413}
No match
\x{940}\x{940}
No match
/^\p{Lt}+/i,utf
\x{1c5}AB
0: \x{1c5}AB
# --------------------------------------------------------------------------
/\p{ ^ L u }/
AbCd
0: b
# hex
/c3 b1/hex,utf
\N{U+00F1}
0: \x{f1}
/[^\P{Lu}1]/i,utf,ucp
a
0: a
A
0: A
\x{3a3}
0: \x{3a3}
\x{3c3}
0: \x{3c3}
\= Expect no match
1
No match
2
No match
/[^\P{Lu}1]/utf,ucp
A
0: A
\x{3a3}
0: \x{3a3}
\= Expect no match
1
No match
2
No match
a
No match
\x{3c3}
No match
/[\P{Lu}1]/i,utf,ucp
1
0: 1
2
0: 2
\= Expect no match
a
No match
A
No match
\x{3a3}
No match
\x{3c3}
No match
/[\P{Lu}1]/utf,ucp
1
0: 1
2
0: 2
a
0: a
\x{3c3}
0: \x{3c3}
\= Expect no match
A
No match
\x{3a3}
No match
# --------------
# EXTENDED CHARACTER CLASSES (Perl)
/(?[\p{L} - \p{Lu}])/
a
0: a
\= Expect no match
A
No match
1
No match
/(?[\p{L} & \p{Lu}])/
A
0: A
\= Expect no match
a
No match
1
No match
/(?[[\p{Lu}z] ^ [\p{Ll}G]])/
A
0: A
p
0: p
\= Expect no match
G
No match
z
No match
1
No match
/(?[\p{Ll} | \p{Nd}])/
a
0: a
1
0: 1
\= Expect no match
A
No match
/(?[\p{Ll} + [\p{Nd}]])/
a
0: a
1
0: 1
\= Expect no match
A
No match
/(?[ ![\p{Nd}z] ])/
_
0: _
Z
0: Z
\= Expect no match
1
No match
z
No match
/(?[ \P{Nd} + [2] ])/
_
0: _
Z
0: Z
2
0: 2
\= Expect no match
1
No match
3
No match
/(?[ ![\P{Nd}] ])/
1
0: 1
2
0: 2
\= Expect no match
_
No match
z
No match
# caseless tests
/(?[ \p{Lu} ^ \p{Ll} ])/
a
0: a
A
0: A
\= Expect no match
_
No match
1
No match
/(?[ [\p{Lu}1] ^ \p{Ll} ])/i
1
0: 1
\= Expect no match
a
No match
A
No match
_
No match
/(?[ [\p{Lu}1] & [\p{Ll}1] ])/
1
0: 1
\= Expect no match
a
No match
A
No match
_
No match
2
No match
/(?[ [\p{Lu}1] & [\p{Ll}1] ])/i
a
0: a
A
0: A
1
0: 1
\= Expect no match
_
No match
2
No match
/(?[ \p{Lu} + \p{Ll} & [a-z] ])/utf
\x{0411}
0: \x{411}
a
0: a
A
0: A
\= Expect no match
\x{0431}
No match
/(?[ (\p{Lu} + \p{Ll}) & [a-z] ])/utf
a
0: a
\= Expect no match
\x{0411}
No match
\x{0431}
No match
A
No match
/(?[ [a-z] & \p{Lu} + \p{Ll} ])/utf
a
0: a
\x{0431}
0: \x{431}
\= Expect no match
\x{0411}
No match
A
No match
/(?[ [a-z] & (\p{Lu} + \p{Ll}) ])/utf
a
0: a
\= Expect no match
\x{0431}
No match
\x{0411}
No match
A
No match
# --------------
# End of testinput4

8225
3rd/pcre2/testdata/testoutput5 vendored Normal file

File diff suppressed because it is too large

8167
3rd/pcre2/testdata/testoutput6 vendored Normal file

File diff suppressed because it is too large

4552
3rd/pcre2/testdata/testoutput7 vendored Normal file

@@ -0,0 +1,4552 @@
# This set of tests checks UTF and Unicode property support with the DFA
# matching functionality of pcre2_dfa_match(). A default subject modifier is
# used to force DFA matching for all tests.
#subject dfa
#newline_default LF any anyCRLF
/\x{100}ab/utf
\x{100}ab
0: \x{100}ab
/a\x{100}*b/utf
ab
0: ab
a\x{100}b
0: a\x{100}b
a\x{100}\x{100}b
0: a\x{100}\x{100}b
/a\x{100}+b/utf
a\x{100}b
0: a\x{100}b
a\x{100}\x{100}b
0: a\x{100}\x{100}b
\= Expect no match
ab
No match
/\bX/utf
Xoanon
0: X
+Xoanon
0: X
\x{300}Xoanon
0: X
\= Expect no match
YXoanon
No match
/\BX/utf
YXoanon
0: X
\= Expect no match
Xoanon
No match
+Xoanon
No match
\x{300}Xoanon
No match
/X\b/utf
X+oanon
0: X
ZX\x{300}oanon
0: X
FAX
0: X
\= Expect no match
Xoanon
No match
/X\B/utf
Xoanon
0: X
\= Expect no match
X+oanon
No match
ZX\x{300}oanon
No match
FAX
No match
/[^a]/utf
abcd
0: b
a\x{100}
0: \x{100}
/^[abc\x{123}\x{400}-\x{402}]{2,3}\d/utf
ab99
0: ab9
\x{123}\x{123}45
0: \x{123}\x{123}4
\x{400}\x{401}\x{402}6
0: \x{400}\x{401}\x{402}6
\= Expect no match
d99
No match
\x{123}\x{122}4
No match
\x{400}\x{403}6
No match
\x{400}\x{401}\x{402}\x{402}6
No match
/a.b/utf
acb
0: acb
a\x7fb
0: a\x{7f}b
a\x{100}b
0: a\x{100}b
\= Expect no match
a\nb
No match
/a(.{3})b/utf
a\x{4000}xyb
0: a\x{4000}xyb
a\x{4000}\x7fyb
0: a\x{4000}\x{7f}yb
a\x{4000}\x{100}yb
0: a\x{4000}\x{100}yb
\= Expect no match
a\x{4000}b
No match
ac\ncb
No match
/a(.*?)(.)/
a\xc0\x88b
0: a\xc0\x88b
1: a\xc0\x88
2: a\xc0
/a(.*?)(.)/utf
a\x{100}b
0: a\x{100}b
1: a\x{100}
/a(.*)(.)/
a\xc0\x88b
0: a\xc0\x88b
1: a\xc0\x88
2: a\xc0
/a(.*)(.)/utf
a\x{100}b
0: a\x{100}b
1: a\x{100}
/a(.)(.)/
a\xc0\x92bcd
0: a\xc0\x92
/a(.)(.)/utf
a\x{240}bcd
0: a\x{240}b
/a(.?)(.)/
a\xc0\x92bcd
0: a\xc0\x92
1: a\xc0
/a(.?)(.)/utf
a\x{240}bcd
0: a\x{240}b
1: a\x{240}
/a(.??)(.)/
a\xc0\x92bcd
0: a\xc0\x92
1: a\xc0
/a(.??)(.)/utf
a\x{240}bcd
0: a\x{240}b
1: a\x{240}
/a(.{3})b/utf
a\x{1234}xyb
0: a\x{1234}xyb
a\x{1234}\x{4321}yb
0: a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
0: a\x{1234}\x{4321}\x{3412}b
\= Expect no match
a\x{1234}b
No match
ac\ncb
No match
/a(.{3,})b/utf
a\x{1234}xyb
0: a\x{1234}xyb
a\x{1234}\x{4321}yb
0: a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
0: a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
0: axxxxbcdefghijb
1: axxxxb
a\x{1234}\x{4321}\x{3412}\x{3421}b
0: a\x{1234}\x{4321}\x{3412}\x{3421}b
\= Expect no match
a\x{1234}b
No match
/a(.{3,}?)b/utf
a\x{1234}xyb
0: a\x{1234}xyb
a\x{1234}\x{4321}yb
0: a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
0: a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
0: axxxxbcdefghijb
1: axxxxb
a\x{1234}\x{4321}\x{3412}\x{3421}b
0: a\x{1234}\x{4321}\x{3412}\x{3421}b
\= Expect no match
a\x{1234}b
No match
/a(.{3,5})b/utf
a\x{1234}xyb
0: a\x{1234}xyb
a\x{1234}\x{4321}yb
0: a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
0: a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
0: axxxxb
a\x{1234}\x{4321}\x{3412}\x{3421}b
0: a\x{1234}\x{4321}\x{3412}\x{3421}b
axbxxbcdefghijb
0: axbxxb
axxxxxbcdefghijb
0: axxxxxb
\= Expect no match
a\x{1234}b
No match
axxxxxxbcdefghijb
No match
/a(.{3,5}?)b/utf
a\x{1234}xyb
0: a\x{1234}xyb
a\x{1234}\x{4321}yb
0: a\x{1234}\x{4321}yb
a\x{1234}\x{4321}\x{3412}b
0: a\x{1234}\x{4321}\x{3412}b
axxxxbcdefghijb
0: axxxxb
a\x{1234}\x{4321}\x{3412}\x{3421}b
0: a\x{1234}\x{4321}\x{3412}\x{3421}b
axbxxbcdefghijb
0: axbxxb
axxxxxbcdefghijb
0: axxxxxb
\= Expect no match
a\x{1234}b
No match
axxxxxxbcdefghijb
No match
/^[a\x{c0}]/utf
\= Expect no match
\x{100}
No match
/(?<=aXb)cd/utf
aXbcd
0: cd
/(?<=a\x{100}b)cd/utf
a\x{100}bcd
0: cd
/(?<=a\x{100000}b)cd/utf
a\x{100000}bcd
0: cd
/(?:\x{100}){3}b/utf
\x{100}\x{100}\x{100}b
0: \x{100}\x{100}\x{100}b
\= Expect no match
\x{100}\x{100}b
No match
/\x{ab}/utf
\x{ab}
0: \x{ab}
\xc2\xab
0: \x{ab}
\= Expect no match
\x00{ab}
No match
/(?<=(.))X/utf
WXYZ
0: X
\x{256}XYZ
0: X
\= Expect no match
XYZ
No match
/[^a]+/g,utf
bcd
0: bcd
\x{100}aY\x{256}Z
0: \x{100}
0: Y\x{256}Z
/^[^a]{2}/utf
\x{100}bc
0: \x{100}b
/^[^a]{2,}/utf
\x{100}bcAa
0: \x{100}bcA
/^[^a]{2,}?/utf
\x{100}bca
0: \x{100}bc
1: \x{100}b
/[^a]+/gi,utf
bcd
0: bcd
\x{100}aY\x{256}Z
0: \x{100}
0: Y\x{256}Z
/^[^a]{2}/i,utf
\x{100}bc
0: \x{100}b
/^[^a]{2,}/i,utf
\x{100}bcAa
0: \x{100}bc
/^[^a]{2,}?/i,utf
\x{100}bca
0: \x{100}bc
1: \x{100}b
/\x{100}{0,0}/utf
abcd
0:
/\x{100}?/utf
abcd
0:
\x{100}\x{100}
0: \x{100}
/\x{100}{0,3}/utf
\x{100}\x{100}
0: \x{100}\x{100}
\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}
/\x{100}*/utf
abce
0:
\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}\x{100}
/\x{100}{1,1}/utf
abcd\x{100}\x{100}\x{100}\x{100}
0: \x{100}
/\x{100}{1,3}/utf
abcd\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}
/\x{100}+/utf
abcd\x{100}\x{100}\x{100}\x{100}
0: \x{100}\x{100}\x{100}\x{100}
/\x{100}{3}/utf
abcd\x{100}\x{100}\x{100}XX
0: \x{100}\x{100}\x{100}
/\x{100}{3,5}/utf
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
0: \x{100}\x{100}\x{100}\x{100}\x{100}
/\x{100}{3,}/utf,no_auto_possess
abcd\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}XX
0: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
1: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
2: \x{100}\x{100}\x{100}\x{100}\x{100}
3: \x{100}\x{100}\x{100}\x{100}
4: \x{100}\x{100}\x{100}
/(?<=a\x{100}{2}b)X/utf
Xyyya\x{100}\x{100}bXzzz
0: X
/\D*/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Matched, but offsets vector is too small to show all matches
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
/\D*/utf,no_auto_possess
\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
Matched, but offsets vector is too small to show all matches
0: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
1: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
2: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
3: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
4: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
5: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
6: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
7: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
8: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
9: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
10: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
11: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
12: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
13: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
14: \x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}\x{100}
/\D/utf
1X2
0: X
1\x{100}2
0: \x{100}
/>\S/utf
> >X Y
0: >X
> >\x{100} Y
0: >\x{100}
/\d/utf
\x{100}3
0: 3
/\s/utf
\x{100} X
0:
/\D+/utf
12abcd34
0: abcd
\= Expect no match
1234
No match
/\D{2,3}/utf
12abcd34
0: abc
12ab34
0: ab
\= Expect no match
1234
No match
12a34
No match
/\D{2,3}?/utf
12abcd34
0: abc
1: ab
12ab34
0: ab
\= Expect no match
1234
No match
12a34
No match
/\d+/utf
12abcd34
0: 12
/\d{2,3}/utf
12abcd34
0: 12
1234abcd
0: 123
\= Expect no match
1.4
No match
/\d{2,3}?/utf
12abcd34
0: 12
1234abcd
0: 123
1: 12
\= Expect no match
1.4
No match
/\S+/utf
12abcd34
0: 12abcd34
\= Expect no match
\ \
No match
/\S{2,3}/utf
12abcd34
0: 12a
1234abcd
0: 123
\= Expect no match
\ \
No match
/\S{2,3}?/utf
12abcd34
0: 12a
1: 12
1234abcd
0: 123
1: 12
\= Expect no match
\ \
No match
/>\s+</utf
12> <34
0: > <
/>\s{2,3}</utf
ab> <cd
0: > <
ab> <ce
0: > <
\= Expect no match
ab> <cd
No match
/>\s{2,3}?</utf
ab> <cd
0: > <
ab> <ce
0: > <
\= Expect no match
ab> <cd
No match
/\w+/utf
12 34
0: 12
\= Expect no match
+++=*!
No match
/\w{2,3}/utf
ab cd
0: ab
abcd ce
0: abc
\= Expect no match
a.b.c
No match
/\w{2,3}?/utf
ab cd
0: ab
abcd ce
0: abc
1: ab
\= Expect no match
a.b.c
No match
/\W+/utf
12====34
0: ====
\= Expect no match
abcd
No match
/\W{2,3}/utf
ab====cd
0: ===
ab==cd
0: ==
\= Expect no match
a.b.c
No match
/\W{2,3}?/utf
ab====cd
0: ===
1: ==
ab==cd
0: ==
\= Expect no match
a.b.c
No match
/[\x{100}]/utf
\x{100}
0: \x{100}
Z\x{100}
0: \x{100}
\x{100}Z
0: \x{100}
/[Z\x{100}]/utf
Z\x{100}
0: Z
\x{100}
0: \x{100}
\x{100}Z
0: \x{100}
/[\x{100}\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
/[\x{100}-\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{111}cd
0: \x{111}
/[z-\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{111}cd
0: \x{111}
abzcd
0: z
ab|cd
0: |
/[Q\x{100}\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
Q?
0: Q
/[Q\x{100}-\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{111}cd
0: \x{111}
Q?
0: Q
/[Qz-\x{200}]/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{111}cd
0: \x{111}
abzcd
0: z
ab|cd
0: |
Q?
0: Q
/[\x{100}\x{200}]{1,3}/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{200}\x{100}\x{200}\x{100}cd
0: \x{200}\x{100}\x{200}
/[\x{100}\x{200}]{1,3}?/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{200}\x{100}\x{200}\x{100}cd
0: \x{200}\x{100}\x{200}
1: \x{200}\x{100}
2: \x{200}
/[Q\x{100}\x{200}]{1,3}/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{200}\x{100}\x{200}\x{100}cd
0: \x{200}\x{100}\x{200}
/[Q\x{100}\x{200}]{1,3}?/utf
ab\x{100}cd
0: \x{100}
ab\x{200}cd
0: \x{200}
ab\x{200}\x{100}\x{200}\x{100}cd
0: \x{200}\x{100}\x{200}
1: \x{200}\x{100}
2: \x{200}
/(?<=[\x{100}\x{200}])X/utf
abc\x{200}X
0: X
abc\x{100}X
0: X
\= Expect no match
X
No match
/(?<=[Q\x{100}\x{200}])X/utf
abc\x{200}X
0: X
abc\x{100}X
0: X
abQX
0: X
\= Expect no match
X
No match
/(?<=[\x{100}\x{200}]{3})X/utf
abc\x{100}\x{200}\x{100}X
0: X
\= Expect no match
abc\x{200}X
No match
X
No match
/[^\x{100}\x{200}]X/utf
AX
0: AX
\x{150}X
0: \x{150}X
\x{500}X
0: \x{500}X
\= Expect no match
\x{100}X
No match
\x{200}X
No match
/[^Q\x{100}\x{200}]X/utf
AX
0: AX
\x{150}X
0: \x{150}X
\x{500}X
0: \x{500}X
\= Expect no match
\x{100}X
No match
\x{200}X
No match
QX
No match
/[^\x{100}-\x{200}]X/utf
AX
0: AX
\x{500}X
0: \x{500}X
\= Expect no match
\x{100}X
No match
\x{150}X
No match
\x{200}X
No match
/[z-\x{100}]/i,utf
z
0: z
Z
0: Z
\x{100}
0: \x{100}
\= Expect no match
\x{102}
No match
y
No match
/[\xFF]/
>\xff<
0: \xff
/[\xff]/utf
>\x{ff}<
0: \x{ff}
/[^\xFF]/
XYZ
0: X
/[^\xff]/utf
XYZ
0: X
\x{123}
0: \x{123}
/^[ac]*b/utf
\= Expect no match
xb
No match
/^[ac\x{100}]*b/utf
\= Expect no match
xb
No match
/^[^x]*b/i,utf
\= Expect no match
xb
No match
/^[^x]*b/utf
\= Expect no match
xb
No match
/^\d*b/utf
\= Expect no match
xb
No match
/(|a)/g,utf
catac
0:
0: a
1:
0:
0: a
1:
0:
0:
a\x{256}a
0: a
1:
0:
0: a
1:
0:
/^\x{85}$/i,utf
\x{85}
0: \x{85}
/^abc./gmx,newline=any,utf
abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x{0085}abc7 \x{2028}abc8 \x{2029}abc9 JUNK
0: abc1
0: abc2
0: abc3
0: abc4
0: abc5
0: abc6
0: abc7
0: abc8
0: abc9
/abc.$/gmx,newline=any,utf
abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x{0085} abc7\x{2028} abc8\x{2029} abc9
0: abc1
0: abc2
0: abc3
0: abc4
0: abc5
0: abc6
0: abc7
0: abc8
0: abc9
/^a\Rb/bsr=unicode,utf
a\nb
0: a\x{0a}b
a\rb
0: a\x{0d}b
a\r\nb
0: a\x{0d}\x{0a}b
a\x0bb
0: a\x{0b}b
a\x0cb
0: a\x{0c}b
a\x{85}b
0: a\x{85}b
a\x{2028}b
0: a\x{2028}b
a\x{2029}b
0: a\x{2029}b
\= Expect no match
a\n\rb
No match
/^a\R*b/bsr=unicode,utf
ab
0: ab
a\nb
0: a\x{0a}b
a\rb
0: a\x{0d}b
a\r\nb
0: a\x{0d}\x{0a}b
a\x0bb
0: a\x{0b}b
a\x0c\x{2028}\x{2029}b
0: a\x{0c}\x{2028}\x{2029}b
a\x{85}b
0: a\x{85}b
a\n\rb
0: a\x{0a}\x{0d}b
a\n\r\x{85}\x0cb
0: a\x{0a}\x{0d}\x{85}\x{0c}b
/^a\R+b/bsr=unicode,utf
a\nb
0: a\x{0a}b
a\rb
0: a\x{0d}b
a\r\nb
0: a\x{0d}\x{0a}b
a\x0bb
0: a\x{0b}b
a\x0c\x{2028}\x{2029}b
0: a\x{0c}\x{2028}\x{2029}b
a\x{85}b
0: a\x{85}b
a\n\rb
0: a\x{0a}\x{0d}b
a\n\r\x{85}\x0cb
0: a\x{0a}\x{0d}\x{85}\x{0c}b
\= Expect no match
ab
No match
/^a\R{1,3}b/bsr=unicode,utf
a\nb
0: a\x{0a}b
a\n\rb
0: a\x{0a}\x{0d}b
a\n\r\x{85}b
0: a\x{0a}\x{0d}\x{85}b
a\r\n\r\nb
0: a\x{0d}\x{0a}\x{0d}\x{0a}b
a\r\n\r\n\r\nb
0: a\x{0d}\x{0a}\x{0d}\x{0a}\x{0d}\x{0a}b
a\n\r\n\rb
0: a\x{0a}\x{0d}\x{0a}\x{0d}b
a\n\n\r\nb
0: a\x{0a}\x{0a}\x{0d}\x{0a}b
\= Expect no match
a\n\n\n\rb
No match
a\r
No match
/\h+\V?\v{3,4}/utf,no_auto_possess
\x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
0: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}\x{0d}
1: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}
/\V?\v{3,4}/utf,no_auto_possess
\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
0: X\x{0a}\x{0b}\x{0c}\x{0d}
1: X\x{0a}\x{0b}\x{0c}
/\h+\V?\v{3,4}/utf,no_auto_possess
>\x09\x20\x{a0}X\x0a\x0a\x0a<
0: \x{09} \x{a0}X\x{0a}\x{0a}\x{0a}
/\V?\v{3,4}/utf,no_auto_possess
>\x09\x20\x{a0}X\x0a\x0a\x0a<
0: X\x{0a}\x{0a}\x{0a}
/\H\h\V\v/utf
X X\x0a
0: X X\x{0a}
X\x09X\x0b
0: X\x{09}X\x{0b}
\= Expect no match
\x{a0} X\x0a
No match
/\H*\h+\V?\v{3,4}/utf,no_auto_possess
\x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
0: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}\x{0d}
1: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}
\x09\x20\x{a0}\x0a\x0b\x0c\x0d\x0a
0: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}\x{0d}
1: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}
\x09\x20\x{a0}\x0a\x0b\x0c
0: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}
\= Expect no match
\x09\x20\x{a0}\x0a\x0b
No match
/\H\h\V\v/utf
\x{3001}\x{3000}\x{2030}\x{2028}
0: \x{3001}\x{3000}\x{2030}\x{2028}
X\x{180e}X\x{85}
0: X\x{180e}X\x{85}
\= Expect no match
\x{2009} X\x0a
No match
/\H*\h+\V?\v{3,4}/utf,no_auto_possess
\x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x0c\x0d\x0a
0: \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x{0c}\x{0d}
1: \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x{0c}
\x09\x{205f}\x{a0}\x0a\x{2029}\x0c\x{2028}\x0a
0: \x{09}\x{205f}\x{a0}\x{0a}\x{2029}\x{0c}\x{2028}
1: \x{09}\x{205f}\x{a0}\x{0a}\x{2029}\x{0c}
\x09\x20\x{202f}\x0a\x0b\x0c
0: \x{09} \x{202f}\x{0a}\x{0b}\x{0c}
\= Expect no match
\x09\x{200a}\x{a0}\x{2028}\x0b
No match
/a\Rb/I,bsr=anycrlf,utf
Capture group count = 0
Options: utf
\R matches CR, LF, or CRLF
First code unit = 'a'
Last code unit = 'b'
Subject length lower bound = 3
a\rb
0: a\x{0d}b
a\nb
0: a\x{0a}b
a\r\nb
0: a\x{0d}\x{0a}b
\= Expect no match
a\x{85}b
No match
a\x0bb
No match
/a\Rb/I,bsr=unicode,utf
Capture group count = 0
Options: utf
\R matches any Unicode newline
First code unit = 'a'
Last code unit = 'b'
Subject length lower bound = 3
a\rb
0: a\x{0d}b
a\nb
0: a\x{0a}b
a\r\nb
0: a\x{0d}\x{0a}b
a\x{85}b
0: a\x{85}b
a\x0bb
0: a\x{0b}b
/a\R?b/I,bsr=anycrlf,utf
Capture group count = 0
Options: utf
\R matches CR, LF, or CRLF
First code unit = 'a'
Last code unit = 'b'
Subject length lower bound = 2
a\rb
0: a\x{0d}b
a\nb
0: a\x{0a}b
a\r\nb
0: a\x{0d}\x{0a}b
\= Expect no match
a\x{85}b
No match
a\x0bb
No match
/a\R?b/I,bsr=unicode,utf
Capture group count = 0
Options: utf
\R matches any Unicode newline
First code unit = 'a'
Last code unit = 'b'
Subject length lower bound = 2
a\rb
0: a\x{0d}b
a\nb
0: a\x{0a}b
a\r\nb
0: a\x{0d}\x{0a}b
a\x{85}b
0: a\x{85}b
a\x0bb
0: a\x{0b}b
/X/newline=any,utf,firstline
A\x{1ec5}ABCXYZ
0: X
/abcd*/utf
xxxxabcd\=ps
0: abcd
xxxxabcd\=ph
Partial match: abcd
/abcd*/i,utf
xxxxabcd\=ps
0: abcd
xxxxabcd\=ph
Partial match: abcd
XXXXABCD\=ps
0: ABCD
XXXXABCD\=ph
Partial match: ABCD
/abc\d*/utf
xxxxabc1\=ps
0: abc1
xxxxabc1\=ph
Partial match: abc1
/abc[de]*/utf
xxxxabcde\=ps
0: abcde
xxxxabcde\=ph
Partial match: abcde
/\bthe cat\b/utf
the cat\=ps
0: the cat
the cat\=ph
Partial match: the cat
/./newline=crlf,utf
\r\=ps
0: \x{0d}
\r\=ph
Partial match: \x{0d}
/.{2,3}/newline=crlf,utf
\r\=ps
Partial match: \x{0d}
\r\=ph
Partial match: \x{0d}
\r\r\=ps
0: \x{0d}\x{0d}
\r\r\=ph
Partial match: \x{0d}\x{0d}
\r\r\r\=ps
0: \x{0d}\x{0d}\x{0d}
\r\r\r\=ph
Partial match: \x{0d}\x{0d}\x{0d}
/.{2,3}?/newline=crlf,utf
\r\=ps
Partial match: \x{0d}
\r\=ph
Partial match: \x{0d}
\r\r\=ps
0: \x{0d}\x{0d}
\r\r\=ph
Partial match: \x{0d}\x{0d}
\r\r\r\=ps
0: \x{0d}\x{0d}\x{0d}
1: \x{0d}\x{0d}
\r\r\r\=ph
Partial match: \x{0d}\x{0d}\x{0d}
/[^\x{100}]/utf
\x{100}\x{101}X
0: \x{101}
/[^\x{100}]+/utf
\x{100}\x{101}X
0: \x{101}X
/\pL\P{Nd}/utf
AB
0: AB
\= Expect no match
A0
No match
00
No match
/\X./utf
AB
0: AB
A\x{300}BC
0: A\x{300}B
A\x{300}\x{301}\x{302}BC
0: A\x{300}\x{301}\x{302}B
\= Expect no match
\x{300}
No match
/\X\X/utf
ABC
0: AB
A\x{300}B\x{300}\x{301}C
0: A\x{300}B\x{300}\x{301}
A\x{300}\x{301}\x{302}BC
0: A\x{300}\x{301}\x{302}B
\= Expect no match
\x{300}
No match
/^\pL+/utf
abcd
0: abcd
a
0: a
/^\PL+/utf
1234
0: 1234
=
0: =
\= Expect no match
abcd
No match
/^\X+/utf
abcdA\x{300}\x{301}\x{302}
0: abcdA\x{300}\x{301}\x{302}
A\x{300}\x{301}\x{302}
0: A\x{300}\x{301}\x{302}
A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}
0: A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}
a
0: a
\x{300}\x{301}\x{302}
0: \x{300}\x{301}\x{302}
/\X?abc/utf
abc
0: abc
A\x{300}abc
0: A\x{300}abc
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
0: A\x{300}abc
\x{300}abc
0: \x{300}abc
/^\X?abc/utf
abc
0: abc
A\x{300}abc
0: A\x{300}abc
\x{300}abc
0: \x{300}abc
\= Expect no match
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
No match
/\X*abc/utf
abc
0: abc
A\x{300}abc
0: A\x{300}abc
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
0: A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abc
\x{300}abc
0: \x{300}abc
/^\X*abc/utf
abc
0: abc
A\x{300}abc
0: A\x{300}abc
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
0: A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abc
\x{300}abc
0: \x{300}abc
/^\pL?=./utf
A=b
0: A=b
=c
0: =c
\= Expect no match
1=2
No match
AAAA=b
No match
/^\pL*=./utf
AAAA=b
0: AAAA=b
=c
0: =c
\= Expect no match
1=2
No match
/^\X{2,3}X/utf
A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
0: A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
0: A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
\= Expect no match
X
No match
A\x{300}\x{301}\x{302}X
No match
A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}A\x{300}\x{301}\x{302}X
No match
/^\pC\pL\pM\pN\pP\pS\pZ</utf
\x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
0: \x{7f}\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
\np\x{300}9!\$ <
0: \x{0a}p\x{300}9!$ <
\= Expect no match
ap\x{300}9!\$ <
No match
/^\PC/utf
X
0: X
\= Expect no match
\x7f
No match
/^\PL/utf
9
0: 9
\= Expect no match
\x{c0}
No match
/^\PM/utf
X
0: X
\= Expect no match
\x{30f}
No match
/^\PN/utf
X
0: X
\= Expect no match
\x{660}
No match
/^\PP/utf
X
0: X
\= Expect no match
\x{66c}
No match
/^\PS/utf
X
0: X
\= Expect no match
\x{f01}
No match
/^\PZ/utf
X
0: X
\= Expect no match
\x{1680}
No match
/^\p{Cc}/utf
\x{017}
0: \x{17}
\x{09f}
0: \x{9f}
\= Expect no match
\x{0600}
No match
/^\p{Cf}/utf
\x{601}
0: \x{601}
\x{180e}
0: \x{180e}
\x{061c}
0: \x{61c}
\x{2066}
0: \x{2066}
\x{2067}
0: \x{2067}
\x{2068}
0: \x{2068}
\x{2069}
0: \x{2069}
\= Expect no match
\x{09f}
No match
/^\p{Cn}/utf
\= Expect no match
\x{09f}
No match
/^\p{Co}/utf
\x{f8ff}
0: \x{f8ff}
\= Expect no match
\x{09f}
No match
/^\p{Cs}/utf
\x{dfff}\=no_utf_check
0: \x{dfff}
\= Expect no match
\x{09f}
No match
/^\p{Ll}/utf
a
0: a
\= Expect no match
Z
No match
\x{e000}
No match
/^\p{Lm}/utf
\x{2b0}
0: \x{2b0}
\= Expect no match
a
No match
/^\p{Lo}/utf
\x{1bb}
0: \x{1bb}
\= Expect no match
a
No match
\x{2b0}
No match
/^\p{Lt}/utf
\x{1c5}
0: \x{1c5}
\= Expect no match
a
No match
\x{2b0}
No match
/^\p{Lu}/utf
A
0: A
\= Expect no match
\x{2b0}
No match
/^\p{Mc}/utf
\x{903}
0: \x{903}
\= Expect no match
X
No match
\x{300}
No match
/^\p{Me}/utf
\x{488}
0: \x{488}
\= Expect no match
X
No match
\x{903}
No match
\x{300}
No match
/^\p{Mn}/utf
\x{300}
0: \x{300}
\x{1a1b}
0: \x{1a1b}
\= Expect no match
X
No match
\x{903}
No match
/^\p{Nd}+/utf,no_auto_possess
0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}\x{66a}
Matched, but offsets vector is too small to show all matches
0: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}\x{669}
1: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}\x{668}
2: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}\x{667}
3: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}\x{666}
4: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}\x{665}
5: 0123456789\x{660}\x{661}\x{662}\x{663}\x{664}
6: 0123456789\x{660}\x{661}\x{662}\x{663}
7: 0123456789\x{660}\x{661}\x{662}
8: 0123456789\x{660}\x{661}
9: 0123456789\x{660}
10: 0123456789
11: 012345678
12: 01234567
13: 0123456
14: 012345
\x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}\x{6fa}
0: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}\x{6f9}
1: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}\x{6f8}
2: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}\x{6f7}
3: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}\x{6f6}
4: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}\x{6f5}
5: \x{6f0}\x{6f1}\x{6f2}\x{6f3}\x{6f4}
6: \x{6f0}\x{6f1}\x{6f2}\x{6f3}
7: \x{6f0}\x{6f1}\x{6f2}
8: \x{6f0}\x{6f1}
9: \x{6f0}
\x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}\x{970}
0: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}\x{96f}
1: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}\x{96e}
2: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}\x{96d}
3: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}\x{96c}
4: \x{966}\x{967}\x{968}\x{969}\x{96a}\x{96b}
5: \x{966}\x{967}\x{968}\x{969}\x{96a}
6: \x{966}\x{967}\x{968}\x{969}
7: \x{966}\x{967}\x{968}
8: \x{966}\x{967}
9: \x{966}
\= Expect no match
X
No match
/^\p{Nl}/utf
\x{16ee}
0: \x{16ee}
\= Expect no match
X
No match
\x{966}
No match
/^\p{No}/utf
\x{b2}
0: \x{b2}
\x{b3}
0: \x{b3}
\= Expect no match
X
No match
\x{16ee}
No match
/^\p{Pc}/utf
\x5f
0: _
\x{203f}
0: \x{203f}
\= Expect no match
X
No match
-
No match
\x{58a}
No match
/^\p{Pd}/utf
-
0: -
\x{58a}
0: \x{58a}
\= Expect no match
X
No match
\x{203f}
No match
/^\p{Pe}/utf
)
0: )
]
0: ]
}
0: }
\x{f3b}
0: \x{f3b}
\x{2309}
0: \x{2309}
\x{230b}
0: \x{230b}
\= Expect no match
X
No match
\x{203f}
No match
(
No match
[
No match
{
No match
\x{f3c}
No match
/^\p{Pf}/utf
\x{bb}
0: \x{bb}
\x{2019}
0: \x{2019}
\= Expect no match
X
No match
\x{203f}
No match
/^\p{Pi}/utf
\x{ab}
0: \x{ab}
\x{2018}
0: \x{2018}
\= Expect no match
X
No match
\x{203f}
No match
/^\p{Po}/utf
!
0: !
\x{37e}
0: \x{37e}
\= Expect no match
X
No match
\x{203f}
No match
/^\p{Ps}/utf
(
0: (
[
0: [
{
0: {
\x{f3c}
0: \x{f3c}
\x{2308}
0: \x{2308}
\x{230a}
0: \x{230a}
\= Expect no match
X
No match
)
No match
]
No match
}
No match
\x{f3b}
No match
/^\p{Sc}+/utf
$\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
0: $\x{a2}\x{a3}\x{a4}\x{a5}
\x{9f2}
0: \x{9f2}
\= Expect no match
X
No match
\x{2c2}
No match
/^\p{Sk}/utf
\x{2c2}
0: \x{2c2}
\= Expect no match
X
No match
\x{9f2}
No match
/^\p{Sm}+/utf
+<|~\x{ac}\x{2044}
0: +<|~\x{ac}\x{2044}
\= Expect no match
X
No match
\x{9f2}
No match
/^\p{So}/utf
\x{a6}
0: \x{a6}
\x{482}
0: \x{482}
\= Expect no match
X
No match
\x{9f2}
No match
/^\p{Zl}/utf
\x{2028}
0: \x{2028}
\= Expect no match
X
No match
\x{2029}
No match
/^\p{Zp}/utf
\x{2029}
0: \x{2029}
\= Expect no match
X
No match
\x{2028}
No match
/^\p{Zs}/utf
\ \
0:
\x{a0}
0: \x{a0}
\x{1680}
0: \x{1680}
\x{2000}
0: \x{2000}
\x{2001}
0: \x{2001}
\= Expect no match
\x{2028}
No match
\x{200d}
No match
/\p{Nd}+(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: \x{660}\x{661}\x{662}A
2: \x{660}\x{661}\x{662}
/\p{Nd}+?(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: \x{660}\x{661}\x{662}A
2: \x{660}\x{661}\x{662}
/\p{Nd}{2,}(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: \x{660}\x{661}\x{662}A
/\p{Nd}{2,}?(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: \x{660}\x{661}\x{662}A
/\p{Nd}*(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: \x{660}\x{661}\x{662}A
2: \x{660}\x{661}\x{662}
3: \x{660}\x{661}
/\p{Nd}*?(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: \x{660}\x{661}\x{662}A
2: \x{660}\x{661}\x{662}
3: \x{660}\x{661}
/\p{Nd}{2}(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}A
/\p{Nd}{2,3}(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: \x{660}\x{661}\x{662}A
/\p{Nd}{2,3}?(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
1: \x{660}\x{661}\x{662}A
/\p{Nd}?(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}
1: \x{660}\x{661}
/\p{Nd}??(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}
1: \x{660}\x{661}
/\p{Nd}*+(..)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}AB
/\p{Nd}*+(...)/utf
\x{660}\x{661}\x{662}ABC
0: \x{660}\x{661}\x{662}ABC
/\p{Nd}*+(....)/utf
\= Expect no match
\x{660}\x{661}\x{662}ABC
No match
/\p{^Lu}/i,utf
1234
0: 1
\= Expect no match
ABC
No match
/\P{Lu}/i,utf
1234
0: 1
\= Expect no match
ABC
No match
/(?<=A\p{Nd})XYZ/utf
A2XYZ
0: XYZ
123A5XYZPQR
0: XYZ
ABA\x{660}XYZpqr
0: XYZ
\= Expect no match
AXYZ
No match
XYZ
No match
/(?<!\pL)XYZ/utf
1XYZ
0: XYZ
AB=XYZ..
0: XYZ
XYZ
0: XYZ
\= Expect no match
WXYZ
No match
/[\p{Nd}]/utf
1234
0: 1
/[\p{Nd}+-]+/utf
1234
0: 1234
12-34
0: 12-34
12+\x{661}-34
0: 12+\x{661}-34
\= Expect no match
abcd
No match
/[\P{Nd}]+/utf
abcd
0: abcd
\= Expect no match
1234
No match
/\D+/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Matched, but offsets vector is too small to show all matches
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
No match
/\P{Nd}+/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Matched, but offsets vector is too small to show all matches
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
No match
/[\D]+/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Matched, but offsets vector is too small to show all matches
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
No match
/[\P{Nd}]+/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Matched, but offsets vector is too small to show all matches
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
No match
/[\D\P{Nd}]+/utf,no_auto_possess
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Matched, but offsets vector is too small to show all matches
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
2: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
3: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
4: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
5: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
6: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
7: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
8: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
9: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
10: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
11: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
12: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
13: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
14: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
\= Expect no match
11111111111111111111111111111111111111111111111111111111111111111111111
No match
/\pL/utf
a
0: a
A
0: A
/\pL/i,utf
a
0: a
A
0: A
/^\x{c0}$/i,utf
\x{c0}
0: \x{c0}
\x{e0}
0: \x{e0}
/^\x{e0}$/i,utf
\x{c0}
0: \x{c0}
\x{e0}
0: \x{e0}
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
A\x{391}\x{10427}\x{ff3a}\x{1fb0}
0: A\x{391}\x{10427}\x{ff3a}\x{1fb0}
\= Expect no match
a\x{391}\x{10427}\x{ff3a}\x{1fb0}
No match
A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
No match
A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
No match
A\x{391}\x{10427}\x{ff5a}\x{1fb0}
No match
A\x{391}\x{10427}\x{ff3a}\x{1fb8}
No match
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
A\x{391}\x{10427}\x{ff3a}\x{1fb0}
0: A\x{391}\x{10427}\x{ff3a}\x{1fb0}
a\x{391}\x{10427}\x{ff3a}\x{1fb0}
0: a\x{391}\x{10427}\x{ff3a}\x{1fb0}
A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
0: A\x{3b1}\x{10427}\x{ff3a}\x{1fb0}
A\x{391}\x{1044F}\x{ff3a}\x{1fb0}
0: A\x{391}\x{1044f}\x{ff3a}\x{1fb0}
A\x{391}\x{10427}\x{ff5a}\x{1fb0}
0: A\x{391}\x{10427}\x{ff5a}\x{1fb0}
A\x{391}\x{10427}\x{ff3a}\x{1fb8}
0: A\x{391}\x{10427}\x{ff3a}\x{1fb8}
/\x{391}+/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
/\x{391}{3,5}(.)/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
1: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
2: \x{391}\x{3b1}\x{3b1}\x{3b1}
/\x{391}{3,5}?(.)/i,utf
\x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
0: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}X
1: \x{391}\x{3b1}\x{3b1}\x{3b1}\x{391}
2: \x{391}\x{3b1}\x{3b1}\x{3b1}
/[\x{391}\x{ff3a}]/i,utf
\x{391}
0: \x{391}
\x{ff3a}
0: \x{ff3a}
\x{3b1}
0: \x{3b1}
\x{ff5a}
0: \x{ff5a}
/[\x{c0}\x{391}]/i,utf
\x{c0}
0: \x{c0}
\x{e0}
0: \x{e0}
/[\x{105}-\x{109}]/i,utf
\x{104}
0: \x{104}
\x{105}
0: \x{105}
\x{109}
0: \x{109}
\= Expect no match
\x{100}
No match
\x{10a}
No match
/[z-\x{100}]/i,utf
Z
0: Z
z
0: z
\x{39c}
0: \x{39c}
\x{178}
0: \x{178}
|
0: |
\x{80}
0: \x{80}
\x{ff}
0: \x{ff}
\x{100}
0: \x{100}
\x{101}
0: \x{101}
\= Expect no match
\x{102}
No match
Y
No match
y
No match
/[z-\x{100}]/i,utf
/^\X/utf
A
0: A
A\x{300}BC
0: A\x{300}
A\x{300}\x{301}\x{302}BC
0: A\x{300}\x{301}\x{302}
\x{300}
0: \x{300}
/^(\X*)C/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
0: A\x{300}\x{301}\x{302}BC
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
1: A\x{300}\x{301}\x{302}BC
/^(\X*?)C/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
0: A\x{300}\x{301}\x{302}BC
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
1: A\x{300}\x{301}\x{302}BC
/^(\X*)(.)/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
0: A\x{300}\x{301}\x{302}BCA
1: A\x{300}\x{301}\x{302}BC
2: A\x{300}\x{301}\x{302}B
3: A
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
1: A\x{300}\x{301}\x{302}BCA
2: A\x{300}\x{301}\x{302}BC
3: A\x{300}\x{301}\x{302}B
4: A
/^(\X*?)(.)/utf
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}
0: A\x{300}\x{301}\x{302}BCA
1: A\x{300}\x{301}\x{302}BC
2: A\x{300}\x{301}\x{302}B
3: A
A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
0: A\x{300}\x{301}\x{302}BCA\x{300}\x{301}C
1: A\x{300}\x{301}\x{302}BCA
2: A\x{300}\x{301}\x{302}BC
3: A\x{300}\x{301}\x{302}B
4: A
/^\X(.)/utf
\= Expect no match
A\x{300}\x{301}\x{302}
No match
/^\X{2,3}(.)/utf
A\x{300}\x{301}B\x{300}X
0: A\x{300}\x{301}B\x{300}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}
0: A\x{300}\x{301}B\x{300}C
A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
1: A\x{300}\x{301}B\x{300}C
A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}D
1: A\x{300}\x{301}B\x{300}C
/^\X{2,3}?(.)/utf
A\x{300}\x{301}B\x{300}X
0: A\x{300}\x{301}B\x{300}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}
0: A\x{300}\x{301}B\x{300}C
A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
1: A\x{300}\x{301}B\x{300}C
A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
0: A\x{300}\x{301}B\x{300}C\x{300}\x{301}D
1: A\x{300}\x{301}B\x{300}C
/^\pN{2,3}X/
12X
0: 12X
123X
0: 123X
\= Expect no match
X
No match
1X
No match
1234X
No match
/\x{100}/i,utf
\x{100}
0: \x{100}
\x{101}
0: \x{101}
/^\p{Han}+/utf
\x{2e81}\x{3007}\x{2f804}\x{31a0}
0: \x{2e81}\x{3007}\x{2f804}
\= Expect no match
\x{2e7f}
No match
/^\P{Katakana}+/utf
\x{3105}
0: \x{3105}
\= Expect no match
\x{30ff}
No match
/^[\p{Arabic}]/utf
\x{06e9}
0: \x{6e9}
\x{060b}
0: \x{60b}
\= Expect no match
X\x{06e9}
No match
/^[\P{Yi}]/utf
\x{2f800}
0: \x{2f800}
\= Expect no match
\x{a014}
No match
\x{a4c6}
No match
/^\p{Any}X/utf
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
\= Expect no match
X
No match
/^\P{Any}X/utf
\= Expect no match
AX
No match
/^\p{Any}?X/utf
XYZ
0: X
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
\= Expect no match
ABXYZ
No match
/^\P{Any}?X/utf
XYZ
0: X
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
ABXYZ
No match
/^\p{Any}+X/utf
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
A\x{1234}XYZ
0: A\x{1234}X
\= Expect no match
XYZ
No match
/^\P{Any}+X/utf
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
A\x{1234}XYZ
No match
XYZ
No match
/^\p{Any}*X/utf
XYZ
0: X
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
A\x{1234}XYZ
0: A\x{1234}X
/^\P{Any}*X/utf
XYZ
0: X
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
A\x{1234}XYZ
No match
/^[\p{Any}]X/utf
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
\= Expect no match
X
No match
/^[\P{Any}]X/utf
\= Expect no match
AX
No match
/^[\p{Any}]?X/utf
XYZ
0: X
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
\= Expect no match
ABXYZ
No match
/^[\P{Any}]?X/utf
XYZ
0: X
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
ABXYZ
No match
/^[\p{Any}]+X/utf
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
A\x{1234}XYZ
0: A\x{1234}X
\= Expect no match
XYZ
No match
/^[\P{Any}]+X/utf
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
A\x{1234}XYZ
No match
XYZ
No match
/^[\p{Any}]*X/utf
XYZ
0: X
AXYZ
0: AX
\x{1234}XYZ
0: \x{1234}X
A\x{1234}XYZ
0: A\x{1234}X
/^[\P{Any}]*X/utf
XYZ
0: X
\= Expect no match
AXYZ
No match
\x{1234}XYZ
No match
A\x{1234}XYZ
No match
/^\p{Any}{3,5}?/utf
abcdefgh
0: abcde
1: abcd
2: abc
\x{1234}\n\r\x{3456}xyz
0: \x{1234}\x{0a}\x{0d}\x{3456}x
1: \x{1234}\x{0a}\x{0d}\x{3456}
2: \x{1234}\x{0a}\x{0d}
/^\p{Any}{3,5}/utf
abcdefgh
0: abcde
\x{1234}\n\r\x{3456}xyz
0: \x{1234}\x{0a}\x{0d}\x{3456}x
/^\P{Any}{3,5}?/utf
\= Expect no match
abcdefgh
No match
\x{1234}\n\r\x{3456}xyz
No match
/^\p{L&}X/utf
AXY
0: AX
aXY
0: aX
\x{1c5}XY
0: \x{1c5}X
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^[\p{L&}]X/utf
AXY
0: AX
aXY
0: aX
\x{1c5}XY
0: \x{1c5}X
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^\p{L&}+X/utf
AXY
0: AX
aXY
0: aX
AbcdeXyz
0: AbcdeX
\x{1c5}AbXY
0: \x{1c5}AbX
abcDEXypqreXlmn
0: abcDEXypqreX
1: abcDEX
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^[\p{L&}]+X/utf
AXY
0: AX
aXY
0: aX
AbcdeXyz
0: AbcdeX
\x{1c5}AbXY
0: \x{1c5}AbX
abcDEXypqreXlmn
0: abcDEXypqreX
1: abcDEX
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^\p{L&}+?X/utf
AXY
0: AX
aXY
0: aX
AbcdeXyz
0: AbcdeX
\x{1c5}AbXY
0: \x{1c5}AbX
abcDEXypqreXlmn
0: abcDEXypqreX
1: abcDEX
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^[\p{L&}]+?X/utf
AXY
0: AX
aXY
0: aX
AbcdeXyz
0: AbcdeX
\x{1c5}AbXY
0: \x{1c5}AbX
abcDEXypqreXlmn
0: abcDEXypqreX
1: abcDEX
\= Expect no match
\x{1bb}XY
No match
\x{2b0}XY
No match
!XY
No match
/^\P{L&}X/utf
!XY
0: !X
\x{1bb}XY
0: \x{1bb}X
\x{2b0}XY
0: \x{2b0}X
\= Expect no match
\x{1c5}XY
No match
AXY
No match
/^[\P{L&}]X/utf
!XY
0: !X
\x{1bb}XY
0: \x{1bb}X
\x{2b0}XY
0: \x{2b0}X
\= Expect no match
\x{1c5}XY
No match
AXY
No match
/^\x{023a}+?(\x{0130}+)/i,utf
\x{023a}\x{2c65}\x{0130}
0: \x{23a}\x{2c65}\x{130}
/^\x{023a}+([^X])/i,utf
\x{023a}\x{2c65}X
0: \x{23a}\x{2c65}
/\x{c0}+\x{116}+/i,utf
\x{c0}\x{e0}\x{116}\x{117}
0: \x{c0}\x{e0}\x{116}\x{117}
/[\x{c0}\x{116}]+/i,utf
\x{c0}\x{e0}\x{116}\x{117}
0: \x{c0}\x{e0}\x{116}\x{117}
# Check property support in non-UTF-8 mode
/\p{L}{4}/
123abcdefg
0: abcd
123abc\xc4\xc5zz
0: abc\xc4
/\p{Carian}\p{Cham}\p{Kayah_Li}\p{Lepcha}\p{Lycian}\p{Lydian}\p{Ol_Chiki}\p{Rejang}\p{Saurashtra}\p{Sundanese}\p{Vai}/utf
\x{102A4}\x{AA52}\x{A91D}\x{1C46}\x{10283}\x{1092E}\x{1C6B}\x{A93B}\x{A8BF}\x{1BA0}\x{A50A}====
0: \x{102a4}\x{aa52}\x{a91d}\x{1c46}\x{10283}\x{1092e}\x{1c6b}\x{a93b}\x{a8bf}\x{1ba0}\x{a50a}
/\x{a77d}\x{1d79}/i,utf
\x{a77d}\x{1d79}
0: \x{a77d}\x{1d79}
\x{1d79}\x{a77d}
0: \x{1d79}\x{a77d}
/\x{a77d}\x{1d79}/utf
\x{a77d}\x{1d79}
0: \x{a77d}\x{1d79}
\= Expect no match
\x{1d79}\x{a77d}
No match
/^\p{Xan}/utf
ABCD
0: A
1234
0: 1
\x{6ca}
0: \x{6ca}
\x{a6c}
0: \x{a6c}
\x{10a7}
0: \x{10a7}
\= Expect no match
_ABC
No match
/^\p{Xan}+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
\= Expect no match
_ABC
No match
/^\p{Xan}*/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
/^\p{Xan}{2,9}/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}
/^[\p{Xan}]/utf
ABCD1234_
0: A
1234abcd_
0: 1
\x{6ca}
0: \x{6ca}
\x{a6c}
0: \x{a6c}
\x{10a7}
0: \x{10a7}
\= Expect no match
_ABC
No match
/^[\p{Xan}]+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
\= Expect no match
_ABC
No match
/^>\p{Xsp}/utf
>\x{1680}\x{2028}\x{0b}
0: >\x{1680}
\= Expect no match
\x{0b}
No match
/^>\p{Xsp}+/utf,no_auto_possess
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
4: > \x{09}\x{0a}\x{0c}\x{0d}
5: > \x{09}\x{0a}\x{0c}
6: > \x{09}\x{0a}
7: > \x{09}
8: >
/^>\p{Xsp}*/utf,no_auto_possess
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
4: > \x{09}\x{0a}\x{0c}\x{0d}
5: > \x{09}\x{0a}\x{0c}
6: > \x{09}\x{0a}
7: > \x{09}
8: >
9: >
/^>\p{Xsp}{2,9}/utf,no_auto_possess
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
4: > \x{09}\x{0a}\x{0c}\x{0d}
5: > \x{09}\x{0a}\x{0c}
6: > \x{09}\x{0a}
7: > \x{09}
/^>[\p{Xsp}]/utf,no_auto_possess
>\x{2028}\x{0b}
0: >\x{2028}
/^>[\p{Xsp}]+/utf,no_auto_possess
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
4: > \x{09}\x{0a}\x{0c}\x{0d}
5: > \x{09}\x{0a}\x{0c}
6: > \x{09}\x{0a}
7: > \x{09}
8: >
/^>\p{Xps}/utf
>\x{1680}\x{2028}\x{0b}
0: >\x{1680}
>\x{a0}
0: >\x{a0}
\= Expect no match
\x{0b}
No match
/^>\p{Xps}+/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}+?/utf
>\x{1680}\x{2028}\x{0b}
0: >\x{1680}\x{2028}\x{0b}
1: >\x{1680}\x{2028}
2: >\x{1680}
/^>\p{Xps}*/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}{2,9}/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^>\p{Xps}{2,9}?/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
1: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}
2: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}
3: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}
4: > \x{09}\x{0a}\x{0c}\x{0d}
5: > \x{09}\x{0a}\x{0c}
6: > \x{09}\x{0a}
7: > \x{09}
/^>[\p{Xps}]/utf
>\x{2028}\x{0b}
0: >\x{2028}
/^>[\p{Xps}]+/utf
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
0: > \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
/^\p{Xwd}/utf
ABCD
0: A
1234
0: 1
\x{6ca}
0: \x{6ca}
\x{a6c}
0: \x{a6c}
\x{10a7}
0: \x{10a7}
_ABC
0: _
\= Expect no match
[]
No match
/^\p{Xwd}+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
/^\p{Xwd}*/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
/^\p{Xwd}{2,9}/utf
A_12\x{6ca}\x{a6c}\x{10a7}
0: A_12\x{6ca}\x{a6c}\x{10a7}
/^[\p{Xwd}]/utf
ABCD1234_
0: A
1234abcd_
0: 1
\x{6ca}
0: \x{6ca}
\x{a6c}
0: \x{a6c}
\x{10a7}
0: \x{10a7}
_ABC
0: _
\= Expect no match
[]
No match
/^[\p{Xwd}]+/utf
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}_
# Unicode properties for \b and \B
/\b...\B/utf,ucp
abc_
0: abc
\x{37e}abc\x{376}
0: abc
\x{37e}\x{376}\x{371}\x{393}\x{394}
0: \x{376}\x{371}\x{393}
!\x{c0}++\x{c1}\x{c2}
0: ++\x{c1}
!\x{c0}+++++
0: \x{c0}++
# Without PCRE2_UCP, non-ASCII always fail, even if < 256
/\b...\B/utf
abc_
0: abc
\= Expect no match
\x{37e}abc\x{376}
No match
\x{37e}\x{376}\x{371}\x{393}\x{394}
No match
!\x{c0}++\x{c1}\x{c2}
No match
!\x{c0}+++++
No match
# With PCRE2_UCP, non-UTF8 chars that are < 256 still check properties
/\b...\B/ucp
abc_
0: abc
!\x{c0}++\x{c1}\x{c2}
0: ++\xc1
!\x{c0}+++++
0: \xc0++
# Caseless single negated characters > 127 need UCP support
/[^\x{100}]/i,utf
\x{100}\x{101}X
0: X
/[^\x{100}]+/i,utf
\x{100}\x{101}XX
0: XX
/^\X/utf
A\=ps
0: A
A\=ph
Partial match: A
A\x{300}\x{301}\=ps
0: A\x{300}\x{301}
A\x{300}\x{301}\=ph
Partial match: A\x{300}\x{301}
A\x{301}\=ps
0: A\x{301}
A\x{301}\=ph
Partial match: A\x{301}
/^\X{2,3}/utf
A\=ps
Partial match: A
A\=ph
Partial match: A
AA\=ps
0: AA
AA\=ph
Partial match: AA
A\x{300}\x{301}\=ps
Partial match: A\x{300}\x{301}
A\x{300}\x{301}\=ph
Partial match: A\x{300}\x{301}
A\x{300}\x{301}A\x{300}\x{301}\=ps
0: A\x{300}\x{301}A\x{300}\x{301}
A\x{300}\x{301}A\x{300}\x{301}\=ph
Partial match: A\x{300}\x{301}A\x{300}\x{301}
/^\X{2}/utf
AA\=ps
0: AA
AA\=ph
Partial match: AA
A\x{300}\x{301}A\x{300}\x{301}\=ps
0: A\x{300}\x{301}A\x{300}\x{301}
A\x{300}\x{301}A\x{300}\x{301}\=ph
Partial match: A\x{300}\x{301}A\x{300}\x{301}
/^\X+/utf
AA\=ps
0: AA
AA\=ph
Partial match: AA
/^\X+?Z/utf
AA\=ps
Partial match: AA
AA\=ph
Partial match: AA
# These are tests for extended grapheme clusters
/^\X/utf,aftertext
G\x{34e}\x{34e}X
0: G\x{34e}\x{34e}
0+ X
\x{34e}\x{34e}X
0: \x{34e}\x{34e}
0+ X
\x04X
0: \x{04}
0+ X
\x{1100}X
0: \x{1100}
0+ X
\x{1100}\x{34e}X
0: \x{1100}\x{34e}
0+ X
\x{1b04}\x{1b04}X
0: \x{1b04}\x{1b04}
0+ X
\= These match up to the roman letters
\x{1111}\x{1111}L,L
0: \x{1111}\x{1111}
0+ L,L
\x{1111}\x{1111}\x{1169}L,L,V
0: \x{1111}\x{1111}\x{1169}
0+ L,L,V
\x{1111}\x{ae4c}L, LV
0: \x{1111}\x{ae4c}
0+ L, LV
\x{1111}\x{ad89}L, LVT
0: \x{1111}\x{ad89}
0+ L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
0: \x{1111}\x{ae4c}\x{1169}
0+ L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
0: \x{1111}\x{ae4c}\x{1169}\x{1169}
0+ L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
0: \x{1111}\x{ae4c}\x{1169}\x{11fe}
0+ L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
0: \x{1111}\x{ad89}\x{11fe}
0+ L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
0: \x{1111}\x{ad89}\x{11fe}\x{11fe}
0+ L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
0: \x{ad89}\x{11fe}\x{11fe}
0+ LVT, T, T
\= These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
0: \x{1111}
0+ \x{11fe}L, T
\x{ae4c}\x{1111}LV, L
0: \x{ae4c}
0+ \x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
0: \x{ae4c}
0+ \x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
0: \x{ae4c}
0+ \x{ad89}LV, LVT
\x{1169}\x{1111}V, L
0: \x{1169}
0+ \x{1111}V, L
\x{1169}\x{ae4c}V, LV
0: \x{1169}
0+ \x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
0: \x{1169}
0+ \x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
0: \x{ad89}
0+ \x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
0: \x{ad89}
0+ \x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
0: \x{ad89}
0+ \x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
0: \x{ad89}
0+ \x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
0: \x{11fe}
0+ \x{1111}T, L
\x{11fe}\x{1169}T, V
0: \x{11fe}
0+ \x{1169}T, V
\x{11fe}\x{ae4c}T, LV
0: \x{11fe}
0+ \x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
0: \x{11fe}
0+ \x{ad89}T, LVT
\= Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
0: \x{1111}\x{ae4c}\x{711}
0+ L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}
0+ L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}\x{711}\x{1b04}
0+ L, LV, spacing mark, extend, spacing mark
\= Test CR, LF, and control
\x0d\x{0711}CR, extend
0: \x{0d}
0+ \x{711}CR, extend
\x0d\x{1b04}CR, spacingmark
0: \x{0d}
0+ \x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
0: \x{0a}
0+ \x{711}LF, extend
\x0a\x{1b04}LF, spacingmark
0: \x{0a}
0+ \x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
0: \x{0b}
0+ \x{711}Control, extend
\x09\x{1b04}Control, spacingmark
0: \x{09}
0+ \x{1b04}Control, spacingmark
\= There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}?X/utf,aftertext
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/\x{1e9e}+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/[z\x{1e9e}]+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{00df}+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/[z\x{00df}]+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{1f88}+/i,utf
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/[z\x{1f88}]+/i,utf
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
# Perl matches these
/\x{00b5}+/i,utf
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{039c}+/i,utf
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{03bc}+/i,utf
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{00c5}+/i,utf
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{00e5}+/i,utf
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{212b}+/i,utf
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{01c4}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c5}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c6}+/i,utf
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c7}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01c8}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01c9}+/i,utf
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01ca}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01cb}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01cc}+/i,utf
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01f1}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{01f2}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{01f3}+/i,utf
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{0345}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{0399}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{03b9}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{1fbe}+/i,utf
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{0392}+/i,utf
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{03b2}+/i,utf
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{03d0}+/i,utf
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{0395}+/i,utf
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{03b5}+/i,utf
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{03f5}+/i,utf
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{0398}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03b8}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03d1}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03f4}+/i,utf
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{039a}+/i,utf
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03ba}+/i,utf
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03f0}+/i,utf
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03a0}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03c0}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03d6}+/i,utf
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03a1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03c1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03f1}+/i,utf
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03a3}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03c2}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03c3}+/i,utf
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03a6}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03c6}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03d5}+/i,utf
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03c9}+/i,utf
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{03a9}+/i,utf
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{2126}+/i,utf
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{1e60}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e61}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e9b}+/i,utf
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e9e}+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{00df}+/i,utf
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{1f88}+/i,utf
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/\x{1f80}+/i,utf
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/\x{004b}+/i,utf
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{006b}+/i,utf
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{212a}+/i,utf
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{0053}+/i,utf
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/\x{0073}+/i,utf
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/\x{017f}+/i,utf
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/ist/i,utf
\= Expect no match
ikt
No match
/is+t/i,utf
iSs\x{17f}t
0: iSs\x{17f}t
\= Expect no match
ikt
No match
/is+?t/i,utf
\= Expect no match
ikt
No match
/is?t/i,utf
\= Expect no match
ikt
No match
/is{2}t/i,utf
\= Expect no match
iskt
No match
/^\p{Xuc}/utf
$abc
0: $
@abc
0: @
`abc
0: `
\x{1234}abc
0: \x{1234}
\= Expect no match
abc
No match
/^\p{Xuc}+/utf
$@`\x{a0}\x{1234}\x{e000}**
0: $@`\x{a0}\x{1234}\x{e000}
\= Expect no match
\x{9f}
No match
/^\p{Xuc}+?/utf
$@`\x{a0}\x{1234}\x{e000}**
0: $@`\x{a0}\x{1234}\x{e000}
1: $@`\x{a0}\x{1234}
2: $@`\x{a0}
3: $@`
4: $@
5: $
\= Expect no match
\x{9f}
No match
/^\p{Xuc}+?\*/utf
$@`\x{a0}\x{1234}\x{e000}**
0: $@`\x{a0}\x{1234}\x{e000}*
\= Expect no match
\x{9f}
No match
/^\p{Xuc}++/utf
$@`\x{a0}\x{1234}\x{e000}**
0: $@`\x{a0}\x{1234}\x{e000}
\= Expect no match
\x{9f}
No match
/^\p{Xuc}{3,5}/utf
$@`\x{a0}\x{1234}\x{e000}**
0: $@`\x{a0}\x{1234}
\= Expect no match
\x{9f}
No match
/^\p{Xuc}{3,5}?/utf
$@`\x{a0}\x{1234}\x{e000}**
0: $@`\x{a0}\x{1234}
1: $@`\x{a0}
2: $@`
\= Expect no match
\x{9f}
No match
/^[\p{Xuc}]/utf
$@`\x{a0}\x{1234}\x{e000}**
0: $
\= Expect no match
\x{9f}
No match
/^[\p{Xuc}]+/utf
$@`\x{a0}\x{1234}\x{e000}**
0: $@`\x{a0}\x{1234}\x{e000}
\= Expect no match
\x{9f}
No match
/^\P{Xuc}/utf
abc
0: a
\= Expect no match
$abc
No match
@abc
No match
`abc
No match
\x{1234}abc
No match
/^[\P{Xuc}]/utf
abc
0: a
\= Expect no match
$abc
No match
@abc
No match
`abc
No match
\x{1234}abc
No match
/^A\s+Z/utf,ucp
A\x{2005}Z
0: A\x{2005}Z
A\x{85}\x{180e}\x{2005}Z
0: A\x{85}\x{180e}\x{2005}Z
/^A[\s]+Z/utf,ucp
A\x{2005}Z
0: A\x{2005}Z
A\x{85}\x{180e}\x{2005}Z
0: A\x{85}\x{180e}\x{2005}Z
/(?<=\x{100})\x{200}(?=\x{300})/utf,allusedtext
\x{100}\x{200}\x{300}
0: \x{100}\x{200}\x{300}
<<<<<<< >>>>>>>
# -----------------------------------------------------------------------------
# Tests for bidi control and bidi class properties
/\p{ bidi_control }/utf
-->\x{202c}<--
0: \x{202c}
/\p{bidicontrol}+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}\x{2067}\x{2068}\x{2069}
/\p{bidicontrol}+?/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}
1: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}
2: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}
3: \x{61c}\x{200e}\x{200f}\x{202a}
4: \x{61c}\x{200e}\x{200f}
5: \x{61c}\x{200e}
6: \x{61c}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}\x{2067}\x{2068}\x{2069}
1: \x{2066}\x{2067}\x{2068}
2: \x{2066}\x{2067}
3: \x{2066}
/\p{bidicontrol}++/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}\x{2067}\x{2068}\x{2069}
/[\p{bidi_control}]/utf
-->\x{202c}<--
0: \x{202c}
/[\p{bidicontrol}]+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}\x{2067}\x{2068}\x{2069}
/[\p{bidicontrol}]+?/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}
1: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}
2: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}
3: \x{61c}\x{200e}\x{200f}\x{202a}
4: \x{61c}\x{200e}\x{200f}
5: \x{61c}\x{200e}
6: \x{61c}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}\x{2067}\x{2068}\x{2069}
1: \x{2066}\x{2067}\x{2068}
2: \x{2066}\x{2067}
3: \x{2066}
/[\p{bidicontrol}]++/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: \x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: \x{2066}\x{2067}\x{2068}\x{2069}
/[\p{bidicontrol}<>]+/utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: >\x{61c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: >\x{2066}\x{2067}\x{2068}\x{2069}<
/\P{bidicontrol}+/g,utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: -->
0: <--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: -->
0: <--
/\p{^bidicontrol}+/g,utf
-->\x{061c}\x{200e}\x{200f}\x{202a}\x{202b}\x{202c}\x{202d}<--
0: -->
0: <--
-->\x{2066}\x{2067}\x{2068}\x{2069}<--
0: -->
0: <--
/\p{bidi class = al}/utf
-->\x{061D}<--
0: \x{61d}
/\p{bidi class = al}+/utf
-->\x{061D}\x{061e}\x{061f}<--
0: \x{61d}\x{61e}\x{61f}
/\p{bidi_class : AL}+?/utf
-->\x{061D}\x{061e}\x{061f}<--
0: \x{61d}\x{61e}\x{61f}
1: \x{61d}\x{61e}
2: \x{61d}
/\p{Bidi_Class : AL}++/utf
-->\x{061D}\x{061e}\x{061f}<--
0: \x{61d}\x{61e}\x{61f}
/\p{bidi class = aN}+/utf
-->\x{061D}\x{0602}\x{0604}\x{061f}<--
0: \x{602}\x{604}
/\p{bidi class = B}+/utf
-->\x{0a}\x{0d}\x{01c}\x{01e}\x{085}\x{2029}<--
0: \x{0a}\x{0d}\x{1c}\x{1e}\x{85}\x{2029}
/\p{bidi class:BN}+/utf
-->\x{0}\x{08}\x{200c}\x{fffe}\x{dfffe}\x{10ffff}<--
0: \x{00}\x{08}\x{200c}\x{fffe}\x{dfffe}\x{10ffff}
/\p{bidiclass:cs}+/utf
-->,.\x{060c}\x{ff1a}<--
0: ,.\x{60c}\x{ff1a}
/\p{bidiclass:En}+/utf
-->09\x{b2}\x{2074}\x{1fbf9}<--
0: 09\x{b2}\x{2074}\x{1fbf9}
/\p{bidiclass:es}+/utf
==>+-\x{207a}\x{ff0d}<==
0: +-\x{207a}\x{ff0d}
/\p{bidiclass:et}+/utf
-->#\{24}%\x{a2}\x{A838}\x{1e2ff}<--
0: #
/\p{bidiclass:FSI}+/utf
-->\x{2068}<--
0: \x{2068}
/\p{bidi class:L}+/utf
-->ABC<--
0: ABC
/\P{bidi class:L}+/utf
-->ABC<--
0: -->
/\p{bidi class:LRE}+\p{bidiclass=lri}*\p{bidiclass:lro}/utf
-->\x{202a}\x{2066}\x{202d}<--
0: \x{202a}\x{2066}\x{202d}
/\p{bidi class:NSM}+/utf
-->\x{9bc}\x{a71}\x{e31}<--
0: \x{9bc}\x{a71}\x{e31}
/\p{bidi class:ON}+/utf
-->\x{21}'()*;@\x{384}\x{2039}<=-
0: >!'()*;@\x{384}\x{2039}<=
/\p{bidiclass:pdf}\p{bidiclass:pdi}/utf
-->\x{202c}\x{2069}<--
0: \x{202c}\x{2069}
/\p{bidi class:R}+/utf
-->\x{590}\x{5c6}\x{200f}\x{10805}<--
0: \x{590}\x{5c6}\x{200f}\x{10805}
/\p{bidi class:RLE}+\p{bidi class:RLI}*\p{bidi class:RLO}+/utf
-->\x{202b}\x{2067}\x{202e}<--
0: \x{202b}\x{2067}\x{202e}
/\p{bidi class:S}+\p{bidiclass:WS}+/utf
-->\x{9}\x{b}\x{1f} \x{c} \x{2000} \x{3000}<--
0: \x{09}\x{0b}\x{1f} \x{0c} \x{2000} \x{3000}
# -----------------------------------------------------------------------------
/\p{katakana}/utf
\x{30a1}
0: \x{30a1}
\x{3001}
0: \x{3001}
/\p{scx:katakana}/utf
\x{30a1}
0: \x{30a1}
\x{3001}
0: \x{3001}
/\p{script extensions:katakana}/utf
\x{30a1}
0: \x{30a1}
\x{3001}
0: \x{3001}
/\p{sc:katakana}/utf
\x{30a1}
0: \x{30a1}
\= Expect no match
\x{3001}
No match
/\p{script:katakana}/utf
\x{30a1}
0: \x{30a1}
\= Expect no match
\x{3001}
No match
/\p{sc:katakana}{3,}/utf
\x{30a1}\x{30fa}\x{32d0}\x{1b122}\x{ff66}\x{3001}ABC
0: \x{30a1}\x{30fa}\x{32d0}\x{1b122}\x{ff66}
/\p{sc:katakana}{3,}?/utf
\x{30a1}\x{30fa}\x{32d0}\x{1b122}\x{ff66}\x{3001}ABC
0: \x{30a1}\x{30fa}\x{32d0}\x{1b122}\x{ff66}
1: \x{30a1}\x{30fa}\x{32d0}\x{1b122}
2: \x{30a1}\x{30fa}\x{32d0}
# Tests for PCRE2_EXTRA_CASELESS_RESTRICT. Compare each test with and without
# the restriction.
/AskZ/i,utf,caseless_restrict
AskZ
0: AskZ
aSKz
0: aSKz
\= Expect no match
A\x{17f}kZ
No match
As\x{212a}Z
No match
/AskZ/i,utf
AskZ
0: AskZ
aSKz
0: aSKz
A\x{17f}kZ
0: A\x{17f}kZ
As\x{212a}Z
0: As\x{212a}Z
/A\x{17f}\x{212a}Z/ir,utf
\= Expect no match
AskZ
No match
/A\x{17f}\x{212a}Z/i,utf
AskZ
0: AskZ
/[AskZ]+/i,utf,caseless_restrict
AskZ
0: AskZ
aSKz
0: aSKz
A\x{17f}kZ
0: A
As\x{212a}Z
0: As
/[AskZ]+/i,utf
AskZ
0: AskZ
aSKz
0: aSKz
A\x{17f}kZ
0: A\x{17f}kZ
As\x{212a}Z
0: As\x{212a}Z
/[\x{17f}\x{212a}]+/ir,utf
\= Expect no match
AskZ
No match
/[\x{17f}\x{212a}]+/i,utf
AskZ
0: sk
/[^s]+/ir,utf
A\x{17f}Z
0: A\x{17f}Z
/[^s]+/i,utf
A\x{17f}Z
0: A
/[^k]+/ir,utf
A\x{212a}Z
0: A\x{212a}Z
/[^k]+/i,utf
A\x{212a}Z
0: A
/[^sk]+/ir,utf
A\x{17f}\x{212a}Z
0: A\x{17f}\x{212a}Z
/[^sk]+/i,utf
A\x{17f}\x{212a}Z
0: A
/[^\x{17f}]+/ir,utf
AsSZ
0: AsSZ
/[^\x{17f}]+/i,utf
AsSZ
0: A
/[Ss]+/irB,utf
------------------------------------------------------------------
Bra
/i S++
Ket
End
------------------------------------------------------------------
Sss\x{17f}ss
0: Sss
/[Ss]+/iB,utf
------------------------------------------------------------------
Bra
[Ss\x{17f}]++
Ket
End
------------------------------------------------------------------
Sss\x{17f}ss
0: Sss\x{17f}ss
/[S\x{17f}]/irB,utf
------------------------------------------------------------------
Bra
[Ss\x{17f}]
Ket
End
------------------------------------------------------------------
/[S\x{17f}]/iB,utf
------------------------------------------------------------------
Bra
[Ss\x{17f}]
Ket
End
------------------------------------------------------------------
/[\x{17f}s]/irB,utf
------------------------------------------------------------------
Bra
[Ss\x{17f}]
Ket
End
------------------------------------------------------------------
/[\x{17f}s]/iB,utf
------------------------------------------------------------------
Bra
[Ss\x{17f}]
Ket
End
------------------------------------------------------------------
/[\x{4b}\x{6b}]/irB,utf
------------------------------------------------------------------
Bra
/i K
Ket
End
------------------------------------------------------------------
/[\x{4b}\x{6b}]/iB,utf
------------------------------------------------------------------
Bra
[Kk\x{212a}]
Ket
End
------------------------------------------------------------------
/s(?r)s(?-r)s(?r:s)s/i,utf
\x{17f}S\x{17f}S\x{17f}
0: \x{17f}S\x{17f}S\x{17f}
\= Expect no match
\x{17f}\x{17f}\x{17f}S\x{17f}
No match
\x{17f}S\x{17f}\x{17f}\x{17f}
No match
/k(?^i)k/ir,utf
K\x{212a}
0: K\x{212a}
\= Expect no match
\x{212a}\x{212a}
No match
# End caseless restrict tests
# TESTS for PCRE2_EXTRA_TURKISH_CASING - again, tests with and without.
/i/i,utf
i
0: i
I
0: I
\= Expect no match
\x{0130}
No match
\x{0131}
No match
/i/i,utf,turkish_casing
i
0: i
\x{0130}
0: \x{130}
\= Expect no match
I
No match
\x{0131}
No match
/I/i,utf
i
0: i
I
0: I
\= Expect no match
\x{0130}
No match
\x{0131}
No match
/I/i,utf,turkish_casing
I
0: I
\x{0131}
0: \x{131}
\= Expect no match
i
No match
\x{0130}
No match
/\x{0130}/i,utf
\x{0130}
0: \x{130}
\= Expect no match
i
No match
I
No match
\x{0131}
No match
/\x{0130}/i,utf,turkish_casing
i
0: i
\x{0130}
0: \x{130}
\= Expect no match
I
No match
\x{0131}
No match
/\x{0131}/i,utf
\x{0131}
0: \x{131}
\= Expect no match
i
No match
I
No match
\x{0130}
No match
/\x{0131}/i,utf,turkish_casing
I
0: I
\x{0131}
0: \x{131}
\= Expect no match
i
No match
\x{0130}
No match
/[i]/i,utf
i
0: i
I
0: I
\= Expect no match
\x{0130}
No match
\x{0131}
No match
/[i]/i,utf,turkish_casing
i
0: i
\x{0130}
0: \x{130}
\= Expect no match
I
No match
\x{0131}
No match
/[\x{0130}]/i,utf
\x{0130}
0: \x{130}
\= Expect no match
i
No match
I
No match
\x{0131}
No match
/[\x{0130}]/i,utf,turkish_casing
i
0: i
\x{0130}
0: \x{130}
\= Expect no match
I
No match
\x{0131}
No match
/[\x{0120}-\x{0130}]/i,utf
\x{0130}
0: \x{130}
\= Expect no match
i
No match
I
No match
\x{0131}
No match
/[\x{0120}-\x{0130}]/i,utf,turkish_casing
i
0: i
\x{0130}
0: \x{130}
\= Expect no match
I
No match
\x{0131}
No match
/[zi]/i,utf
i
0: i
I
0: I
\= Expect no match
\x{0130}
No match
\x{0131}
No match
/[zi]/i,utf,turkish_casing
i
0: i
\x{0130}
0: \x{130}
\= Expect no match
I
No match
\x{0131}
No match
/[z\x{0130}]/i,utf
\x{0130}
0: \x{130}
\= Expect no match
i
No match
I
No match
\x{0131}
No match
/[z\x{0130}]/i,utf,turkish_casing
i
0: i
\x{0130}
0: \x{130}
\= Expect no match
I
No match
\x{0131}
No match
/[iI]/i,utf
i
0: i
I
0: I
\= Expect no match
\x{0130}
No match
\x{0131}
No match
/[iI]/i,utf,turkish_casing
i
0: i
I
0: I
\x{0130}
0: \x{130}
\x{0131}
0: \x{131}
/[i\x{0130}]/i,utf
i
0: i
I
0: I
\x{0130}
0: \x{130}
\= Expect no match
\x{0131}
No match
/[i\x{0130}]/i,utf,turkish_casing
i
0: i
\x{0130}
0: \x{130}
\= Expect no match
I
No match
\x{0131}
No match
# End Turkish casing tests
# TESTS for PCRE2_EXTRA_ASCII_xxx - again, tests with and without.
# DIGITS
/\d+/i,utf
123\x{660}456
0: 123
/\d+/i,utf,ucp
123\x{660}456
0: 123\x{660}456
/\d+/i,utf,ucp,ascii_bsd
123\x{660}456
0: 123
/[\d]+/i,utf
123\x{660}456
0: 123
/[\d]+/i,utf,ucp
123\x{660}456
0: 123\x{660}456
/[\d]+/i,utf,ucp,ascii_bsd
123\x{660}456
0: 123
/\d(?aD)\d(?-aD)\d/utf,ucp
\x{660}9\x{660}
0: \x{660}9\x{660}
\= Expect no match
\x{660}\x{660}\x{660}
No match
/\d(?-aD)\d(?aD)\d/utf,ucp,ascii_bsd
999
0: 999
9\x{660}9
0: 9\x{660}9
/\d(?a)\d(?-a)\d/utf,ucp
\x{660}9\x{660}
0: \x{660}9\x{660}
\= Expect no match
\x{660}\x{660}\x{660}
No match
/\d(?-aD)\d(?aD)\d/utf,ucp,ascii_bsd
999
0: 999
9\x{660}9
0: 9\x{660}9
# SPACES
/>\s+</i,utf
> <
0: > <
\= Expect no match
>\x{a0} <
No match
/>\s+</i,utf,ucp
> <
0: > <
>\x{a0} <
0: >\x{a0} <
/>\s+</i,utf,ucp,ascii_bss
> <
0: > <
\= Expect no match
>\x{a0} <
No match
/>[\s]+</i,utf
> <
0: > <
\= Expect no match
>\x{a0} <
No match
/>[\s]+</i,utf,ucp
> <
0: > <
>\x{a0} <
0: >\x{a0} <
/>[\s]+</i,utf,ucp,ascii_bss
> <
0: > <
\= Expect no match
>\x{a0} <
No match
/>\s(?aS)\s(?-aS)\s</utf,ucp
>\x{a0} \x{a0}<
0: >\x{a0} \x{a0}<
\= Expect no match
>\x{a0}\x{a0}\x{a0}<
No match
/>\s(?a)\s(?-a)\s</utf,ucp
>\x{a0} \x{a0}<
0: >\x{a0} \x{a0}<
\= Expect no match
>\x{a0}\x{a0}\x{a0}<
No match
# WORDS
/\w+/i,utf
123\x{660}abc
0: 123
/\w+/i,utf,ucp
123\x{660}abc
0: 123\x{660}abc
/\w+/i,utf,ucp,ascii_bsw
123\x{660}abc
0: 123
/[\w]+/i,utf
123\x{660}abc
0: 123
/[\w]+/i,utf,ucp
123\x{660}abc
0: 123\x{660}abc
/[\w]+/i,utf,ucp,ascii_bsw
123\x{660}abc
0: 123
/\w(?aW)\w(?-aW)\w/utf,ucp
\x{660}A\x{c0}
0: \x{660}A\x{c0}
\= Expect no match
\x{660}\x{c0}\x{c0}
No match
/\w(?a)\w(?-a)\w/utf,ucp
\x{660}A\x{c0}
0: \x{660}A\x{c0}
\= Expect no match
\x{660}\x{c0}\x{c0}
No match
# POSIX
/^[[:digit:]]+$/utf,ucp
123456
0: 123456
123\x{660}456
0: 123\x{660}456
/^[[:digit:]]+$/utf,ucp,ascii_digit
123456
0: 123456
\= Expect no match
123\x{660}456
No match
/[[:digit:]]+/g,utf,ucp,ascii_digit
123\x{660}456
0: 123
0: 456
/(?-aT)[[:digit:]](?aT)[[:digit:]]/utf,ucp,ascii_digit
11
0: 11
\x{ff11}1
0: \x{ff11}1
\= Expect no match
1\x{ff11}
No match
/(?-aT:[[:digit:]])[[:digit:]]/utf,ucp,ascii_digit
11
0: 11
\x{ff11}1
0: \x{ff11}1
\= Expect no match
1\x{ff11}
No match
/(?-aT:[[:digit:]])[[:digit:]]/utf,never_ucp,ascii_digit
11
0: 11
\= Expect no match
\x{ff11}1
No match
1\x{ff11}
No match
/[[:digit:]]+/utf,ucp,ascii_posix
123\x{660}456
0: 123
/(?-aP)[[:digit:]](?aP)[[:digit:]]/utf,ucp,ascii_posix
11
0: 11
\x{ff11}1
0: \x{ff11}1
\= Expect no match
1\x{ff11}
No match
/(?-aP:[[:digit:]])[[:digit:]]/utf,ucp,ascii_posix
11
0: 11
\x{ff11}1
0: \x{ff11}1
\= Expect no match
1\x{ff11}
No match
/(?-a:[[:digit:]])[[:digit:]]/a,utf,ucp
11
0: 11
\x{ff11}1
0: \x{ff11}1
\= Expect no match
1\x{ff11}
No match
/>[[:space:]]+</utf,ucp
>\x{a0} \x{a0}<
0: >\x{a0} \x{a0}<
>\x{a0}\x{a0}\x{a0}<
0: >\x{a0}\x{a0}\x{a0}<
/>[[:space:]]+</utf,ucp,ascii_posix
\= Expect no match
>\x{a0} \x{a0}<
No match
/(?aP)[[:alnum:]]+/i,ucp,utf
abcáxyz
0: abc
abc\x{660}xyz
0: abc
/(?aP)[[:alnum:]\d]+/i,ucp,utf
abc\x{660}xyz
0: abc\x{660}xyz
/(*UCP)(*UTF)[[:alnum:]](?aP:[[:alnum:]])[[:alnum:]]/
\x{660}A\x{660}
0: \x{660}A\x{660}
\= Expect no match
\x{660}\x{660}\x{660}
No match
# VARIOUS
/[\d\s\w]+/a,ucp,utf
9 A\x{660}À
0: 9 A
9 AÀ\x{660}
0: 9 A
# End PCRE2_EXTRA_ASCII_xxx tests
/\w+/utf,ucp
--cafe\x{300}_au\x{203f}lait!
0: cafe\x{300}_au\x{203f}lait
/[\w]+/utf,ucp
--cafe\x{300}_au\x{203f}lait!
0: cafe\x{300}_au\x{203f}lait
/\b.+?\b/utf,ucp
--cafe\x{300}_au\x{203f}lait!
0: cafe\x{300}_au\x{203f}lait
/caf\B.+?\B/utf,ucp
--cafe\x{300}_au\x{203f}lait!
0: cafe\x{300}_au\x{203f}lait!
1: cafe\x{300}_au\x{203f}lai
2: cafe\x{300}_au\x{203f}la
3: cafe\x{300}_au\x{203f}l
4: cafe\x{300}_au\x{203f}
5: cafe\x{300}_au
6: cafe\x{300}_a
7: cafe\x{300}_
8: cafe\x{300}
9: cafe
# --------------------------------------------------------------------------
# Case-independent matching property tests added after changing PCRE2 to be
# compatible with Perl. All three cases (upper, lower, title) conflate.
/\p{Lu}\p{Ll}\P{Lu}\P{Ll}/utf
>AbbD<
0: AbbD
>Abb\x{01c5}<
0: Abb\x{1c5}
\= Expect no match
>aBBd<
No match
>aB!!<
No match
/\p{Lu}\p{Ll}\P{Lu}\P{Ll}/i,utf
>aB!!<
0: aB!!
\= Expect no match
>AbbD<
No match
>aBBd<
No match
>Abb\x{01c5}<
No match
/[.\p{Lu}][.\p{Ll}][.\P{Lu}][.\P{Ll}]/i,utf
>aB!!<
0: aB!!
\= Expect no match
>AbbD<
No match
>aBBd<
No match
>Abb\x{01c5}<
No match
# --------------
# EXTENDED CHARACTER CLASSES
/[\p{Ll}[\p{Nd}]]C/alt_extended_class
aC
0: aC
1C
0: 1C
\= Expect no match
[C
No match
/[[\p{Ll}][\p{Nd}]]/alt_extended_class
a
0: a
1
0: 1
\= Expect no match
[
No match
]
No match
/[[\p{Ll}]||[\p{Nd}]]/alt_extended_class
a
0: a
1
0: 1
\= Expect no match
C
No match
/[[^\p{Ll}][\p{Nd}]]/alt_extended_class
1
0: 1
A
0: A
\= Expect no match
a
No match
/[^[\p{Ll}][\p{Nd}]]/alt_extended_class
A
0: A
\= Expect no match
a
No match
1
No match
/[^[\p{Ll}]&&[\p{Nd}]]/alt_extended_class
a
0: a
1
0: 1
A
0: A
/(?[[\p{Ll}]+[\p{Nd}]])/
a
0: a
1
0: 1
\= Expect no match
[
No match
]
No match
# --------------
# EXTENDED CHARACTER CLASSES (Perl)
/(?[[\p{Ll}Z]&[\p{Lu}a]])/
a
0: a
Z
0: Z
\= Expect no match
A
No match
z
No match
# --------------------------------------------------------------------------
# End of testinput7

1023
3rd/pcre2/testdata/testoutput8-16-2 vendored Normal file

@@ -0,0 +1,1023 @@
# There are two sorts of patterns in this test. A number of them are
# representative patterns whose lengths and offsets are checked. This is just a
# doublecheck test to ensure the sizes don't go horribly wrong when something
# is changed. The operation of these patterns is checked in other tests.
#
# This file also contains tests whose output varies with code unit size and/or
# link size. Unicode support is required for these tests. There are separate
# output files for each code unit size and link size.
#pattern fullbincode,memory
/((?i)b)/
Memory allocation - code size : 24
------------------------------------------------------------------
0 9 Bra
2 5 CBra 1
5 /i b
7 5 Ket
9 9 Ket
11 End
------------------------------------------------------------------
/(?s)(.*X|^B)/
Memory allocation - code size : 38
------------------------------------------------------------------
0 16 Bra
2 7 CBra 1
5 AllAny*
7 X
9 5 Alt
11 ^
12 B
14 12 Ket
16 16 Ket
18 End
------------------------------------------------------------------
/(?s:.*X|^B)/
Memory allocation - code size : 36
------------------------------------------------------------------
0 15 Bra
2 6 Bra
4 AllAny*
6 X
8 5 Alt
10 ^
11 B
13 11 Ket
15 15 Ket
17 End
------------------------------------------------------------------
/^[[:alnum:]]/
Memory allocation - code size : 46
------------------------------------------------------------------
0 20 Bra
2 ^
3 [0-9A-Za-z]
20 20 Ket
22 End
------------------------------------------------------------------
/#/Ix
Memory allocation - code size : 10
------------------------------------------------------------------
0 2 Bra
2 2 Ket
4 End
------------------------------------------------------------------
Capture group count = 0
May match empty string
Options: extended
Subject length lower bound = 0
/a#/Ix
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
Capture group count = 0
Options: extended
First code unit = 'a'
Subject length lower bound = 1
/x?+/
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 x?+
4 4 Ket
6 End
------------------------------------------------------------------
/x++/
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 x++
4 4 Ket
6 End
------------------------------------------------------------------
/x{1,3}+/
Memory allocation - code size : 20
------------------------------------------------------------------
0 7 Bra
2 x
4 x{0,2}+
7 7 Ket
9 End
------------------------------------------------------------------
/(x)*+/
Memory allocation - code size : 26
------------------------------------------------------------------
0 10 Bra
2 Braposzero
3 5 CBraPos 1
6 x
8 5 KetRpos
10 10 Ket
12 End
------------------------------------------------------------------
/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/
Memory allocation - code size : 142
------------------------------------------------------------------
0 68 Bra
2 ^
3 63 CBra 1
6 5 CBra 2
9 a+
11 5 Ket
13 21 CBra 3
16 [ab]+?
34 21 Ket
36 21 CBra 4
39 [bc]+
57 21 Ket
59 5 CBra 5
62 \w*+
64 5 Ket
66 63 Ket
68 68 Ket
70 End
------------------------------------------------------------------
"8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 1648
------------------------------------------------------------------
0 821 Bra
2 8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
820 \b
821 821 Ket
823 End
------------------------------------------------------------------
"\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 1628
------------------------------------------------------------------
0 811 Bra
2 $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
810 \b
811 811 Ket
813 End
------------------------------------------------------------------
/(a(?1)b)/
Memory allocation - code size : 32
------------------------------------------------------------------
0 13 Bra
2 9 CBra 1
5 a
7 2 Recurse
9 b
11 9 Ket
13 13 Ket
15 End
------------------------------------------------------------------
/(a(?1)+b)/
Memory allocation - code size : 40
------------------------------------------------------------------
0 17 Bra
2 13 CBra 1
5 a
7 4 SBra
9 2 Recurse
11 4 KetRmax
13 b
15 13 Ket
17 17 Ket
19 End
------------------------------------------------------------------
/a(?P<name1>b|c)d(?P<longername2>e)/
Memory allocation - code size : 54
Memory allocation - data size : 52
------------------------------------------------------------------
0 24 Bra
2 a
4 5 CBra 1
7 b
9 4 Alt
11 c
13 9 Ket
15 d
17 5 CBra 2
20 e
22 5 Ket
24 24 Ket
26 End
------------------------------------------------------------------
/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/
Memory allocation - code size : 64
Memory allocation - data size : 18
------------------------------------------------------------------
0 29 Bra
2 18 Bra
4 a
6 12 CBra 1
9 c
11 5 CBra 2
14 d
16 5 Ket
18 12 Ket
20 18 Ket
22 5 CBra 3
25 a
27 5 Ket
29 29 Ket
31 End
------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/
Memory allocation - code size : 54
Memory allocation - data size : 6
------------------------------------------------------------------
0 24 Bra
2 5 CBra 1
5 a
7 5 Ket
9 Any
10 Any
11 Any
12 \1
14 bbb
20 2 Recurse
22 d
24 24 Ket
26 End
------------------------------------------------------------------
/abc(?C255)de(?C)f/
Memory allocation - code size : 50
------------------------------------------------------------------
0 22 Bra
2 abc
8 Callout 255 10 1
12 de
16 Callout 0 16 1
20 f
22 22 Ket
24 End
------------------------------------------------------------------
/abcde/auto_callout
Memory allocation - code size : 78
------------------------------------------------------------------
0 36 Bra
2 Callout 255 0 1
6 a
8 Callout 255 1 1
12 b
14 Callout 255 2 1
18 c
20 Callout 255 3 1
24 d
26 Callout 255 4 1
30 e
32 Callout 255 5 0
36 36 Ket
38 End
------------------------------------------------------------------
/\x{100}/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{1000}/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 \x{1000}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{10000}/utf
Memory allocation - code size : 16
------------------------------------------------------------------
0 5 Bra
2 \x{10000}
5 5 Ket
7 End
------------------------------------------------------------------
/\x{100000}/utf
Memory allocation - code size : 16
------------------------------------------------------------------
0 5 Bra
2 \x{100000}
5 5 Ket
7 End
------------------------------------------------------------------
/\x{10ffff}/utf
Memory allocation - code size : 16
------------------------------------------------------------------
0 5 Bra
2 \x{10ffff}
5 5 Ket
7 End
------------------------------------------------------------------
/\x{110000}/utf
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
/[\x{ff}]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 \x{ff}
4 4 Ket
6 End
------------------------------------------------------------------
/[\x{100}]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/\x80/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 \x{80}
4 4 Ket
6 End
------------------------------------------------------------------
/\xff/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 \x{ff}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{0041}\x{2262}\x{0391}\x{002e}/I,utf
Memory allocation - code size : 26
------------------------------------------------------------------
0 10 Bra
2 A\x{2262}\x{391}.
10 10 Ket
12 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = '.'
Subject length lower bound = 4
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
Memory allocation - code size : 22
------------------------------------------------------------------
0 8 Bra
2 \x{d55c}\x{ad6d}\x{c5b4}
8 8 Ket
10 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{d55c}
Last code unit = \x{c5b4}
Subject length lower bound = 3
/\x{65e5}\x{672c}\x{8a9e}/I,utf
Memory allocation - code size : 22
------------------------------------------------------------------
0 8 Bra
2 \x{65e5}\x{672c}\x{8a9e}
8 8 Ket
10 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{65e5}
Last code unit = \x{8a9e}
Subject length lower bound = 3
/[\x{100}]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/[Z\x{100}]/utf
Memory allocation - code size : 54
------------------------------------------------------------------
0 24 Bra
2 [Z\x{100}]
24 24 Ket
26 End
------------------------------------------------------------------
/^[\x{100}\E-\Q\E\x{150}]/utf
Memory allocation - code size : 26
------------------------------------------------------------------
0 10 Bra
2 ^
3 [\x{100}-\x{150}]
10 10 Ket
12 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E]/utf
Memory allocation - code size : 26
------------------------------------------------------------------
0 10 Bra
2 ^
3 [\x{100}-\x{150}]
10 10 Ket
12 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E/utf
Failed: error 106 at offset 13: missing terminating ] for character class
/[\p{L}]/
Memory allocation - code size : 24
------------------------------------------------------------------
0 9 Bra
2 [\p{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\p{^L}]/
Memory allocation - code size : 24
------------------------------------------------------------------
0 9 Bra
2 [\P{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\P{L}]/
Memory allocation - code size : 24
------------------------------------------------------------------
0 9 Bra
2 [\P{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\P{^L}]/
Memory allocation - code size : 24
------------------------------------------------------------------
0 9 Bra
2 [\p{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[abc\p{L}\x{0660}]/utf
Memory allocation - code size : 60
------------------------------------------------------------------
0 27 Bra
2 [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff\p{L}\x{660}]
27 27 Ket
29 End
------------------------------------------------------------------
/[\p{Nd}]/utf
Memory allocation - code size : 24
------------------------------------------------------------------
0 9 Bra
2 [\p{Nd}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\p{Nd}+-]+/utf
Memory allocation - code size : 58
------------------------------------------------------------------
0 26 Bra
2 [+\-0-9\p{Nd}]++
26 26 Ket
28 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
Memory allocation - code size : 32
------------------------------------------------------------------
0 13 Bra
2 /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
13 13 Ket
15 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
Memory allocation - code size : 32
------------------------------------------------------------------
0 13 Bra
2 A\x{391}\x{10427}\x{ff3a}\x{1fb0}
13 13 Ket
15 End
------------------------------------------------------------------
/[\x{105}-\x{109}]/i,utf
Memory allocation - code size : 24
------------------------------------------------------------------
0 9 Bra
2 [\x{104}-\x{109}]
9 9 Ket
11 End
------------------------------------------------------------------
/( ( (?(1)0|) )* )/x
Memory allocation - code size : 52
------------------------------------------------------------------
0 23 Bra
2 19 CBra 1
5 Brazero
6 13 SCBra 2
9 6 Cond
11 1 Capture ref
13 0
15 2 Alt
17 8 Ket
19 13 KetRmax
21 19 Ket
23 23 Ket
25 End
------------------------------------------------------------------
/( (?(1)0|)* )/x
Memory allocation - code size : 42
------------------------------------------------------------------
0 18 Bra
2 14 CBra 1
5 Brazero
6 6 SCond
8 1 Capture ref
10 0
12 2 Alt
14 8 KetRmax
16 14 Ket
18 18 Ket
20 End
------------------------------------------------------------------
/[a]/
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
/[a]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
/[\xaa]/
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 \x{aa}
4 4 Ket
6 End
------------------------------------------------------------------
/[\xaa]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 \x{aa}
4 4 Ket
6 End
------------------------------------------------------------------
/[^a]/
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 [^a] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^a]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 [^a] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^\xaa]/
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 [^\x{aa}] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^\xaa]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 4 Bra
2 [^\x{aa}] (not)
4 4 Ket
6 End
------------------------------------------------------------------
#pattern -memory
/[^\d]/utf,ucp
------------------------------------------------------------------
0 9 Bra
2 [^\p{Nd}]
9 9 Ket
11 End
------------------------------------------------------------------
/[[:^alpha:][:^cntrl:]]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{L}\P{Cc}]++
13 13 Ket
15 End
------------------------------------------------------------------
/[[:^cntrl:][:^alpha:]]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{Cc}\P{L}]++
13 13 Ket
15 End
------------------------------------------------------------------
/[[:alpha:]]+/utf,ucp
------------------------------------------------------------------
0 10 Bra
2 [\p{L}]++
10 10 Ket
12 End
------------------------------------------------------------------
/[[:^alpha:]\S]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{L}\P{Xsp}]++
13 13 Ket
15 End
------------------------------------------------------------------
/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/
------------------------------------------------------------------
0 60 Bra
2 abc
8 5 CBra 1
11 d
13 4 Alt
15 e
17 9 Ket
19 *THEN
20 x
22 12 CBra 2
25 123
31 *THEN
32 4
34 24 Alt
36 567
42 5 CBra 3
45 b
47 4 Alt
49 q
51 9 Ket
53 *THEN
54 xx
58 36 Ket
60 60 Ket
62 End
------------------------------------------------------------------
/(((a\2)|(a*)\g<-1>))*a?/
------------------------------------------------------------------
0 35 Bra
2 Brazero
3 28 SCBra 1
6 12 CBra 2
9 7 CBra 3
12 a
14 \2
16 7 Ket
18 11 Alt
20 5 CBra 4
23 a*
25 5 Ket
27 20 Recurse
29 23 Ket
31 28 KetRmax
33 a?+
35 35 Ket
37 End
------------------------------------------------------------------
/((?+1)(\1))/
------------------------------------------------------------------
0 16 Bra
2 12 CBra 1
5 7 Recurse
7 5 CBra 2
10 \1
12 5 Ket
14 12 Ket
16 16 Ket
18 End
------------------------------------------------------------------
"(?1)(?#?'){2}(a)"
------------------------------------------------------------------
0 13 Bra
2 6 Recurse
4 6 Recurse
6 5 CBra 1
9 a
11 5 Ket
13 13 Ket
15 End
------------------------------------------------------------------
/.((?2)(?R)|\1|$)()/
------------------------------------------------------------------
0 24 Bra
2 Any
3 7 CBra 1
6 19 Recurse
8 0 Recurse
10 4 Alt
12 \1
14 3 Alt
16 $
17 14 Ket
19 3 CBra 2
22 3 Ket
24 24 Ket
26 End
------------------------------------------------------------------
/.((?3)(?R)()(?2)|\1|$)()/
------------------------------------------------------------------
0 31 Bra
2 Any
3 14 CBra 1
6 26 Recurse
8 0 Recurse
10 3 CBra 2
13 3 Ket
15 10 Recurse
17 4 Alt
19 \1
21 3 Alt
23 $
24 21 Ket
26 3 CBra 3
29 3 Ket
31 31 Ket
33 End
------------------------------------------------------------------
/(?1)()((((((\1++))\x85)+)|))/
------------------------------------------------------------------
0 50 Bra
2 4 Recurse
4 3 CBra 1
7 3 Ket
9 39 CBra 2
12 32 CBra 3
15 27 CBra 4
18 22 CBra 5
21 15 CBra 6
24 10 CBra 7
27 5 Once
29 \1+
32 5 Ket
34 10 Ket
36 15 Ket
38 \x{85}
40 22 KetRmax
42 27 Ket
44 2 Alt
46 34 Ket
48 39 Ket
50 50 Ket
52 End
------------------------------------------------------------------
# Check the absolute limit on nesting (?| etc. This varies with code unit
# width because the workspace is a different number of bytes. It will fail
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|

/parens_nest_limit=1000,-fullbincode
Failed: error 184 at offset 1129: (?| and/or (?J: or (?x: parentheses are too deeply nested
# Use "expand" to create some very long patterns with nested parentheses, in
# order to test workspace overflow. Again, this varies with code unit width,
# and even when it fails in two modes, the error offset differs. It also varies
# with link size - hence multiple tests with different values.
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
Failed: error 186 at offset 12820: regular expression is too complicated
/(?(1)(?1)){8,}+()/debug
------------------------------------------------------------------
0 79 Bra
2 70 Once
4 6 Cond
6 1 Capture ref
8 74 Recurse
10 6 Ket
12 6 Cond
14 1 Capture ref
16 74 Recurse
18 6 Ket
20 6 Cond
22 1 Capture ref
24 74 Recurse
26 6 Ket
28 6 Cond
30 1 Capture ref
32 74 Recurse
34 6 Ket
36 6 Cond
38 1 Capture ref
40 74 Recurse
42 6 Ket
44 6 Cond
46 1 Capture ref
48 74 Recurse
50 6 Ket
52 6 Cond
54 1 Capture ref
56 74 Recurse
58 6 Ket
60 10 SBraPos
62 6 SCond
64 1 Capture ref
66 74 Recurse
68 6 Ket
70 10 KetRpos
72 70 Ket
74 3 CBra 1
77 3 Ket
79 79 Ket
81 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcd
0:
1:
/(?(1)|a(?1)b){2,}+()/debug
------------------------------------------------------------------
0 43 Bra
2 34 Once
4 4 Cond
6 1 Capture ref
8 8 Alt
10 a
12 38 Recurse
14 b
16 12 Ket
18 16 SBraPos
20 4 SCond
22 1 Capture ref
24 8 Alt
26 a
28 38 Recurse
30 b
32 12 Ket
34 16 KetRpos
36 34 Ket
38 3 CBra 1
41 3 Ket
43 43 Ket
45 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcde
No match
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
------------------------------------------------------------------
0 133 Bra
2 41 CBra 1
5 2 Recurse
7 88 Recurse
9 93 Recurse
11 98 Recurse
13 103 Recurse
15 108 Recurse
17 113 Recurse
19 118 Recurse
21 123 Recurse
23 123 Recurse
25 118 Recurse
27 113 Recurse
29 108 Recurse
31 103 Recurse
33 98 Recurse
35 93 Recurse
37 88 Recurse
39 2 Recurse
41 0 Recurse
43 41 Ket
45 41 SCBra 1
48 2 Recurse
50 88 Recurse
52 93 Recurse
54 98 Recurse
56 103 Recurse
58 108 Recurse
60 113 Recurse
62 118 Recurse
64 123 Recurse
66 123 Recurse
68 118 Recurse
70 113 Recurse
72 108 Recurse
74 103 Recurse
76 98 Recurse
78 93 Recurse
80 88 Recurse
82 2 Recurse
84 0 Recurse
86 41 KetRmax
88 3 CBra 2
91 3 Ket
93 3 CBra 3
96 3 Ket
98 3 CBra 4
101 3 Ket
103 3 CBra 5
106 3 Ket
108 3 CBra 6
111 3 Ket
113 3 CBra 7
116 3 Ket
118 3 CBra 8
121 3 Ket
123 3 CBra 9
126 3 Ket
128 3 CBra 10
131 3 Ket
133 133 Ket
135 End
------------------------------------------------------------------
Capture group count = 10
May match empty string
Subject length lower bound = 0

Failed: error 114 at offset 509: missing closing parenthesis
fullbincode
#pattern -fullbincode
/\[()]{65535}/expand
Failed: error 120 at offset 131070: regular expression is too large
# End of testinput8

1021
3rd/pcre2/testdata/testoutput8-16-3 vendored Normal file

@@ -0,0 +1,1021 @@
# There are two sorts of patterns in this test. A number of them are
# representative patterns whose lengths and offsets are checked. This is just a
# doublecheck test to ensure the sizes don't go horribly wrong when something
# is changed. The operation of these patterns is checked in other tests.
#
# This file also contains tests whose output varies with code unit size and/or
# link size. Unicode support is required for these tests. There are separate
# output files for each code unit size and link size.
#pattern fullbincode,memory
/((?i)b)/
Memory allocation - code size : 32
------------------------------------------------------------------
0 12 Bra
3 6 CBra 1
7 /i b
9 6 Ket
12 12 Ket
15 End
------------------------------------------------------------------
/(?s)(.*X|^B)/
Memory allocation - code size : 48
------------------------------------------------------------------
0 20 Bra
3 8 CBra 1
7 AllAny*
9 X
11 6 Alt
14 ^
15 B
17 14 Ket
20 20 Ket
23 End
------------------------------------------------------------------
/(?s:.*X|^B)/
Memory allocation - code size : 46
------------------------------------------------------------------
0 19 Bra
3 7 Bra
6 AllAny*
8 X
10 6 Alt
13 ^
14 B
16 13 Ket
19 19 Ket
22 End
------------------------------------------------------------------
/^[[:alnum:]]/
Memory allocation - code size : 50
------------------------------------------------------------------
0 21 Bra
3 ^
4 [0-9A-Za-z]
21 21 Ket
24 End
------------------------------------------------------------------
/#/Ix
Memory allocation - code size : 14
------------------------------------------------------------------
0 3 Bra
3 3 Ket
6 End
------------------------------------------------------------------
Capture group count = 0
May match empty string
Options: extended
Subject length lower bound = 0
/a#/Ix
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 a
5 5 Ket
8 End
------------------------------------------------------------------
Capture group count = 0
Options: extended
First code unit = 'a'
Subject length lower bound = 1
/x?+/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 x?+
5 5 Ket
8 End
------------------------------------------------------------------
/x++/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 x++
5 5 Ket
8 End
------------------------------------------------------------------
/x{1,3}+/
Memory allocation - code size : 24
------------------------------------------------------------------
0 8 Bra
3 x
5 x{0,2}+
8 8 Ket
11 End
------------------------------------------------------------------
/(x)*+/
Memory allocation - code size : 34
------------------------------------------------------------------
0 13 Bra
3 Braposzero
4 6 CBraPos 1
8 x
10 6 KetRpos
13 13 Ket
16 End
------------------------------------------------------------------
/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/
Memory allocation - code size : 166
------------------------------------------------------------------
0 79 Bra
3 ^
4 72 CBra 1
8 6 CBra 2
12 a+
14 6 Ket
17 22 CBra 3
21 [ab]+?
39 22 Ket
42 22 CBra 4
46 [bc]+
64 22 Ket
67 6 CBra 5
71 \w*+
73 6 Ket
76 72 Ket
79 79 Ket
82 End
------------------------------------------------------------------
"8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 1652
------------------------------------------------------------------
0 822 Bra
3 8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
821 \b
822 822 Ket
825 End
------------------------------------------------------------------
"\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 1632
------------------------------------------------------------------
0 812 Bra
3 $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
811 \b
812 812 Ket
815 End
------------------------------------------------------------------
/(a(?1)b)/
Memory allocation - code size : 42
------------------------------------------------------------------
0 17 Bra
3 11 CBra 1
7 a
9 3 Recurse
12 b
14 11 Ket
17 17 Ket
20 End
------------------------------------------------------------------
/(a(?1)+b)/
Memory allocation - code size : 54
------------------------------------------------------------------
0 23 Bra
3 17 CBra 1
7 a
9 6 SBra
12 3 Recurse
15 6 KetRmax
18 b
20 17 Ket
23 23 Ket
26 End
------------------------------------------------------------------
/a(?P<name1>b|c)d(?P<longername2>e)/
Memory allocation - code size : 68
Memory allocation - data size : 52
------------------------------------------------------------------
0 30 Bra
3 a
5 6 CBra 1
9 b
11 5 Alt
14 c
16 11 Ket
19 d
21 6 CBra 2
25 e
27 6 Ket
30 30 Ket
33 End
------------------------------------------------------------------
/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/
Memory allocation - code size : 84
Memory allocation - data size : 18
------------------------------------------------------------------
0 38 Bra
3 23 Bra
6 a
8 15 CBra 1
12 c
14 6 CBra 2
18 d
20 6 Ket
23 15 Ket
26 23 Ket
29 6 CBra 3
33 a
35 6 Ket
38 38 Ket
41 End
------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/
Memory allocation - code size : 64
Memory allocation - data size : 6
------------------------------------------------------------------
0 28 Bra
3 6 CBra 1
7 a
9 6 Ket
12 Any
13 Any
14 Any
15 \1
17 bbb
23 3 Recurse
26 d
28 28 Ket
31 End
------------------------------------------------------------------
/abc(?C255)de(?C)f/
Memory allocation - code size : 62
------------------------------------------------------------------
0 27 Bra
3 abc
9 Callout 255 10 1
15 de
19 Callout 0 16 1
25 f
27 27 Ket
30 End
------------------------------------------------------------------
/abcde/auto_callout
Memory allocation - code size : 106
------------------------------------------------------------------
0 49 Bra
3 Callout 255 0 1
9 a
11 Callout 255 1 1
17 b
19 Callout 255 2 1
25 c
27 Callout 255 3 1
33 d
35 Callout 255 4 1
41 e
43 Callout 255 5 0
49 49 Ket
52 End
------------------------------------------------------------------
/\x{100}/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{100}
5 5 Ket
8 End
------------------------------------------------------------------
/\x{1000}/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{1000}
5 5 Ket
8 End
------------------------------------------------------------------
/\x{10000}/utf
Memory allocation - code size : 20
------------------------------------------------------------------
0 6 Bra
3 \x{10000}
6 6 Ket
9 End
------------------------------------------------------------------
/\x{100000}/utf
Memory allocation - code size : 20
------------------------------------------------------------------
0 6 Bra
3 \x{100000}
6 6 Ket
9 End
------------------------------------------------------------------
/\x{10ffff}/utf
Memory allocation - code size : 20
------------------------------------------------------------------
0 6 Bra
3 \x{10ffff}
6 6 Ket
9 End
------------------------------------------------------------------
/\x{110000}/utf
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
/[\x{ff}]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{ff}
5 5 Ket
8 End
------------------------------------------------------------------
/[\x{100}]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{100}
5 5 Ket
8 End
------------------------------------------------------------------
/\x80/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{80}
5 5 Ket
8 End
------------------------------------------------------------------
/\xff/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{ff}
5 5 Ket
8 End
------------------------------------------------------------------
/\x{0041}\x{2262}\x{0391}\x{002e}/I,utf
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 A\x{2262}\x{391}.
11 11 Ket
14 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = '.'
Subject length lower bound = 4
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
Memory allocation - code size : 26
------------------------------------------------------------------
0 9 Bra
3 \x{d55c}\x{ad6d}\x{c5b4}
9 9 Ket
12 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{d55c}
Last code unit = \x{c5b4}
Subject length lower bound = 3
/\x{65e5}\x{672c}\x{8a9e}/I,utf
Memory allocation - code size : 26
------------------------------------------------------------------
0 9 Bra
3 \x{65e5}\x{672c}\x{8a9e}
9 9 Ket
12 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{65e5}
Last code unit = \x{8a9e}
Subject length lower bound = 3
/[\x{100}]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{100}
5 5 Ket
8 End
------------------------------------------------------------------
/[Z\x{100}]/utf
Memory allocation - code size : 60
------------------------------------------------------------------
0 26 Bra
3 [Z\x{100}]
26 26 Ket
29 End
------------------------------------------------------------------
/^[\x{100}\E-\Q\E\x{150}]/utf
Memory allocation - code size : 32
------------------------------------------------------------------
0 12 Bra
3 ^
4 [\x{100}-\x{150}]
12 12 Ket
15 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E]/utf
Memory allocation - code size : 32
------------------------------------------------------------------
0 12 Bra
3 ^
4 [\x{100}-\x{150}]
12 12 Ket
15 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E/utf
Failed: error 106 at offset 13: missing terminating ] for character class
/[\p{L}]/
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\p{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\p{^L}]/
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\P{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\P{L}]/
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\P{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\P{^L}]/
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\p{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[abc\p{L}\x{0660}]/utf
Memory allocation - code size : 66
------------------------------------------------------------------
0 29 Bra
3 [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff\p{L}\x{660}]
29 29 Ket
32 End
------------------------------------------------------------------
/[\p{Nd}]/utf
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\p{Nd}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\p{Nd}+-]+/utf
Memory allocation - code size : 64
------------------------------------------------------------------
0 28 Bra
3 [+\-0-9\p{Nd}]++
28 28 Ket
31 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
Memory allocation - code size : 36
------------------------------------------------------------------
0 14 Bra
3 /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
14 14 Ket
17 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
Memory allocation - code size : 36
------------------------------------------------------------------
0 14 Bra
3 A\x{391}\x{10427}\x{ff3a}\x{1fb0}
14 14 Ket
17 End
------------------------------------------------------------------
/[\x{105}-\x{109}]/i,utf
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\x{104}-\x{109}]
11 11 Ket
14 End
------------------------------------------------------------------
/( ( (?(1)0|) )* )/x
Memory allocation - code size : 70
------------------------------------------------------------------
0 31 Bra
3 25 CBra 1
7 Brazero
8 17 SCBra 2
12 7 Cond
15 1 Capture ref
17 0
19 3 Alt
22 10 Ket
25 17 KetRmax
28 25 Ket
31 31 Ket
34 End
------------------------------------------------------------------
/( (?(1)0|)* )/x
Memory allocation - code size : 56
------------------------------------------------------------------
0 24 Bra
3 18 CBra 1
7 Brazero
8 7 SCond
11 1 Capture ref
13 0
15 3 Alt
18 10 KetRmax
21 18 Ket
24 24 Ket
27 End
------------------------------------------------------------------
/[a]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 a
5 5 Ket
8 End
------------------------------------------------------------------
/[a]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 a
5 5 Ket
8 End
------------------------------------------------------------------
/[\xaa]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{aa}
5 5 Ket
8 End
------------------------------------------------------------------
/[\xaa]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{aa}
5 5 Ket
8 End
------------------------------------------------------------------
/[^a]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 [^a] (not)
5 5 Ket
8 End
------------------------------------------------------------------
/[^a]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 [^a] (not)
5 5 Ket
8 End
------------------------------------------------------------------
/[^\xaa]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 [^\x{aa}] (not)
5 5 Ket
8 End
------------------------------------------------------------------
/[^\xaa]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 [^\x{aa}] (not)
5 5 Ket
8 End
------------------------------------------------------------------
#pattern -memory
/[^\d]/utf,ucp
------------------------------------------------------------------
0 11 Bra
3 [^\p{Nd}]
11 11 Ket
14 End
------------------------------------------------------------------
/[[:^alpha:][:^cntrl:]]+/utf,ucp
------------------------------------------------------------------
0 15 Bra
3 [\P{L}\P{Cc}]++
15 15 Ket
18 End
------------------------------------------------------------------
/[[:^cntrl:][:^alpha:]]+/utf,ucp
------------------------------------------------------------------
0 15 Bra
3 [\P{Cc}\P{L}]++
15 15 Ket
18 End
------------------------------------------------------------------
/[[:alpha:]]+/utf,ucp
------------------------------------------------------------------
0 12 Bra
3 [\p{L}]++
12 12 Ket
15 End
------------------------------------------------------------------
/[[:^alpha:]\S]+/utf,ucp
------------------------------------------------------------------
0 15 Bra
3 [\P{L}\P{Xsp}]++
15 15 Ket
18 End
------------------------------------------------------------------
/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/
------------------------------------------------------------------
0 70 Bra
3 abc
9 6 CBra 1
13 d
15 5 Alt
18 e
20 11 Ket
23 *THEN
24 x
26 13 CBra 2
30 123
36 *THEN
37 4
39 28 Alt
42 567
48 6 CBra 3
52 b
54 5 Alt
57 q
59 11 Ket
62 *THEN
63 xx
67 41 Ket
70 70 Ket
73 End
------------------------------------------------------------------
/(((a\2)|(a*)\g<-1>))*a?/
------------------------------------------------------------------
0 46 Bra
3 Brazero
4 37 SCBra 1
8 15 CBra 2
12 8 CBra 3
16 a
18 \2
20 8 Ket
23 15 Alt
26 6 CBra 4
30 a*
32 6 Ket
35 26 Recurse
38 30 Ket
41 37 KetRmax
44 a?+
46 46 Ket
49 End
------------------------------------------------------------------
/((?+1)(\1))/
------------------------------------------------------------------
0 22 Bra
3 16 CBra 1
7 10 Recurse
10 6 CBra 2
14 \1
16 6 Ket
19 16 Ket
22 22 Ket
25 End
------------------------------------------------------------------
"(?1)(?#?'){2}(a)"
------------------------------------------------------------------
0 18 Bra
3 9 Recurse
6 9 Recurse
9 6 CBra 1
13 a
15 6 Ket
18 18 Ket
21 End
------------------------------------------------------------------
/.((?2)(?R)|\1|$)()/
------------------------------------------------------------------
0 33 Bra
3 Any
4 10 CBra 1
8 26 Recurse
11 0 Recurse
14 5 Alt
17 \1
19 4 Alt
22 $
23 19 Ket
26 4 CBra 2
30 4 Ket
33 33 Ket
36 End
------------------------------------------------------------------
/.((?3)(?R)()(?2)|\1|$)()/
------------------------------------------------------------------
0 43 Bra
3 Any
4 20 CBra 1
8 36 Recurse
11 0 Recurse
14 4 CBra 2
18 4 Ket
21 14 Recurse
24 5 Alt
27 \1
29 4 Alt
32 $
33 29 Ket
36 4 CBra 3
40 4 Ket
43 43 Ket
46 End
------------------------------------------------------------------
/(?1)()((((((\1++))\x85)+)|))/
------------------------------------------------------------------
0 69 Bra
3 6 Recurse
6 4 CBra 1
10 4 Ket
13 53 CBra 2
17 43 CBra 3
21 36 CBra 4
25 29 CBra 5
29 20 CBra 6
33 13 CBra 7
37 6 Once
40 \1+
43 6 Ket
46 13 Ket
49 20 Ket
52 \x{85}
54 29 KetRmax
57 36 Ket
60 3 Alt
63 46 Ket
66 53 Ket
69 69 Ket
72 End
------------------------------------------------------------------
# Check the absolute limit on nesting (?| etc. This varies with code unit
# width because the workspace is a different number of bytes. It will fail
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|

/parens_nest_limit=1000,-fullbincode
# Use "expand" to create some very long patterns with nested parentheses, in
# order to test workspace overflow. Again, this varies with code unit width,
# and even when it fails in two modes, the error offset differs. It also varies
# with link size - hence multiple tests with different values.
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
Failed: error 186 at offset 12820: regular expression is too complicated
/(?(1)(?1)){8,}+()/debug
------------------------------------------------------------------
0 110 Bra
3 97 Once
6 8 Cond
9 1 Capture ref
11 103 Recurse
14 8 Ket
17 8 Cond
20 1 Capture ref
22 103 Recurse
25 8 Ket
28 8 Cond
31 1 Capture ref
33 103 Recurse
36 8 Ket
39 8 Cond
42 1 Capture ref
44 103 Recurse
47 8 Ket
50 8 Cond
53 1 Capture ref
55 103 Recurse
58 8 Ket
61 8 Cond
64 1 Capture ref
66 103 Recurse
69 8 Ket
72 8 Cond
75 1 Capture ref
77 103 Recurse
80 8 Ket
83 14 SBraPos
86 8 SCond
89 1 Capture ref
91 103 Recurse
94 8 Ket
97 14 KetRpos
100 97 Ket
103 4 CBra 1
107 4 Ket
110 110 Ket
113 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcd
0:
1:
/(?(1)|a(?1)b){2,}+()/debug
------------------------------------------------------------------
0 58 Bra
3 45 Once
6 5 Cond
9 1 Capture ref
11 10 Alt
14 a
16 51 Recurse
19 b
21 15 Ket
24 21 SBraPos
27 5 SCond
30 1 Capture ref
32 10 Alt
35 a
37 51 Recurse
40 b
42 15 Ket
45 21 KetRpos
48 45 Ket
51 4 CBra 1
55 4 Ket
58 58 Ket
61 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcde
No match
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
------------------------------------------------------------------
0 194 Bra
3 61 CBra 1
7 3 Recurse
10 131 Recurse
13 138 Recurse
16 145 Recurse
19 152 Recurse
22 159 Recurse
25 166 Recurse
28 173 Recurse
31 180 Recurse
34 180 Recurse
37 173 Recurse
40 166 Recurse
43 159 Recurse
46 152 Recurse
49 145 Recurse
52 138 Recurse
55 131 Recurse
58 3 Recurse
61 0 Recurse
64 61 Ket
67 61 SCBra 1
71 3 Recurse
74 131 Recurse
77 138 Recurse
80 145 Recurse
83 152 Recurse
86 159 Recurse
89 166 Recurse
92 173 Recurse
95 180 Recurse
98 180 Recurse
101 173 Recurse
104 166 Recurse
107 159 Recurse
110 152 Recurse
113 145 Recurse
116 138 Recurse
119 131 Recurse
122 3 Recurse
125 0 Recurse
128 61 KetRmax
131 4 CBra 2
135 4 Ket
138 4 CBra 3
142 4 Ket
145 4 CBra 4
149 4 Ket
152 4 CBra 5
156 4 Ket
159 4 CBra 6
163 4 Ket
166 4 CBra 7
170 4 Ket
173 4 CBra 8
177 4 Ket
180 4 CBra 9
184 4 Ket
187 4 CBra 10
191 4 Ket
194 194 Ket
197 End
------------------------------------------------------------------
Capture group count = 10
May match empty string
Subject length lower bound = 0

Failed: error 114 at offset 509: missing closing parenthesis
fullbincode
#pattern -fullbincode
/\[()]{65535}/expand
# End of testinput8

1021
3rd/pcre2/testdata/testoutput8-16-4 vendored Normal file

@@ -0,0 +1,1021 @@
# There are two sorts of patterns in this test. A number of them are
# representative patterns whose lengths and offsets are checked. This is just a
# doublecheck test to ensure the sizes don't go horribly wrong when something
# is changed. The operation of these patterns is checked in other tests.
#
# This file also contains tests whose output varies with code unit size and/or
# link size. Unicode support is required for these tests. There are separate
# output files for each code unit size and link size.
#pattern fullbincode,memory
/((?i)b)/
Memory allocation - code size : 32
------------------------------------------------------------------
0 12 Bra
3 6 CBra 1
7 /i b
9 6 Ket
12 12 Ket
15 End
------------------------------------------------------------------
/(?s)(.*X|^B)/
Memory allocation - code size : 48
------------------------------------------------------------------
0 20 Bra
3 8 CBra 1
7 AllAny*
9 X
11 6 Alt
14 ^
15 B
17 14 Ket
20 20 Ket
23 End
------------------------------------------------------------------
/(?s:.*X|^B)/
Memory allocation - code size : 46
------------------------------------------------------------------
0 19 Bra
3 7 Bra
6 AllAny*
8 X
10 6 Alt
13 ^
14 B
16 13 Ket
19 19 Ket
22 End
------------------------------------------------------------------
/^[[:alnum:]]/
Memory allocation - code size : 50
------------------------------------------------------------------
0 21 Bra
3 ^
4 [0-9A-Za-z]
21 21 Ket
24 End
------------------------------------------------------------------
/#/Ix
Memory allocation - code size : 14
------------------------------------------------------------------
0 3 Bra
3 3 Ket
6 End
------------------------------------------------------------------
Capture group count = 0
May match empty string
Options: extended
Subject length lower bound = 0
/a#/Ix
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 a
5 5 Ket
8 End
------------------------------------------------------------------
Capture group count = 0
Options: extended
First code unit = 'a'
Subject length lower bound = 1
/x?+/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 x?+
5 5 Ket
8 End
------------------------------------------------------------------
/x++/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 x++
5 5 Ket
8 End
------------------------------------------------------------------
/x{1,3}+/
Memory allocation - code size : 24
------------------------------------------------------------------
0 8 Bra
3 x
5 x{0,2}+
8 8 Ket
11 End
------------------------------------------------------------------
/(x)*+/
Memory allocation - code size : 34
------------------------------------------------------------------
0 13 Bra
3 Braposzero
4 6 CBraPos 1
8 x
10 6 KetRpos
13 13 Ket
16 End
------------------------------------------------------------------
/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/
Memory allocation - code size : 166
------------------------------------------------------------------
0 79 Bra
3 ^
4 72 CBra 1
8 6 CBra 2
12 a+
14 6 Ket
17 22 CBra 3
21 [ab]+?
39 22 Ket
42 22 CBra 4
46 [bc]+
64 22 Ket
67 6 CBra 5
71 \w*+
73 6 Ket
76 72 Ket
79 79 Ket
82 End
------------------------------------------------------------------
"8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 1652
------------------------------------------------------------------
0 822 Bra
3 8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
821 \b
822 822 Ket
825 End
------------------------------------------------------------------
"\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 1632
------------------------------------------------------------------
0 812 Bra
3 $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
811 \b
812 812 Ket
815 End
------------------------------------------------------------------
/(a(?1)b)/
Memory allocation - code size : 42
------------------------------------------------------------------
0 17 Bra
3 11 CBra 1
7 a
9 3 Recurse
12 b
14 11 Ket
17 17 Ket
20 End
------------------------------------------------------------------
/(a(?1)+b)/
Memory allocation - code size : 54
------------------------------------------------------------------
0 23 Bra
3 17 CBra 1
7 a
9 6 SBra
12 3 Recurse
15 6 KetRmax
18 b
20 17 Ket
23 23 Ket
26 End
------------------------------------------------------------------
/a(?P<name1>b|c)d(?P<longername2>e)/
Memory allocation - code size : 68
Memory allocation - data size : 52
------------------------------------------------------------------
0 30 Bra
3 a
5 6 CBra 1
9 b
11 5 Alt
14 c
16 11 Ket
19 d
21 6 CBra 2
25 e
27 6 Ket
30 30 Ket
33 End
------------------------------------------------------------------
/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/
Memory allocation - code size : 84
Memory allocation - data size : 18
------------------------------------------------------------------
0 38 Bra
3 23 Bra
6 a
8 15 CBra 1
12 c
14 6 CBra 2
18 d
20 6 Ket
23 15 Ket
26 23 Ket
29 6 CBra 3
33 a
35 6 Ket
38 38 Ket
41 End
------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/
Memory allocation - code size : 64
Memory allocation - data size : 6
------------------------------------------------------------------
0 28 Bra
3 6 CBra 1
7 a
9 6 Ket
12 Any
13 Any
14 Any
15 \1
17 bbb
23 3 Recurse
26 d
28 28 Ket
31 End
------------------------------------------------------------------
/abc(?C255)de(?C)f/
Memory allocation - code size : 62
------------------------------------------------------------------
0 27 Bra
3 abc
9 Callout 255 10 1
15 de
19 Callout 0 16 1
25 f
27 27 Ket
30 End
------------------------------------------------------------------
/abcde/auto_callout
Memory allocation - code size : 106
------------------------------------------------------------------
0 49 Bra
3 Callout 255 0 1
9 a
11 Callout 255 1 1
17 b
19 Callout 255 2 1
25 c
27 Callout 255 3 1
33 d
35 Callout 255 4 1
41 e
43 Callout 255 5 0
49 49 Ket
52 End
------------------------------------------------------------------
/\x{100}/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{100}
5 5 Ket
8 End
------------------------------------------------------------------
/\x{1000}/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{1000}
5 5 Ket
8 End
------------------------------------------------------------------
/\x{10000}/utf
Memory allocation - code size : 20
------------------------------------------------------------------
0 6 Bra
3 \x{10000}
6 6 Ket
9 End
------------------------------------------------------------------
/\x{100000}/utf
Memory allocation - code size : 20
------------------------------------------------------------------
0 6 Bra
3 \x{100000}
6 6 Ket
9 End
------------------------------------------------------------------
/\x{10ffff}/utf
Memory allocation - code size : 20
------------------------------------------------------------------
0 6 Bra
3 \x{10ffff}
6 6 Ket
9 End
------------------------------------------------------------------
/\x{110000}/utf
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
/[\x{ff}]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{ff}
5 5 Ket
8 End
------------------------------------------------------------------
/[\x{100}]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{100}
5 5 Ket
8 End
------------------------------------------------------------------
/\x80/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{80}
5 5 Ket
8 End
------------------------------------------------------------------
/\xff/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{ff}
5 5 Ket
8 End
------------------------------------------------------------------
/\x{0041}\x{2262}\x{0391}\x{002e}/I,utf
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 A\x{2262}\x{391}.
11 11 Ket
14 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = '.'
Subject length lower bound = 4
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
Memory allocation - code size : 26
------------------------------------------------------------------
0 9 Bra
3 \x{d55c}\x{ad6d}\x{c5b4}
9 9 Ket
12 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{d55c}
Last code unit = \x{c5b4}
Subject length lower bound = 3
/\x{65e5}\x{672c}\x{8a9e}/I,utf
Memory allocation - code size : 26
------------------------------------------------------------------
0 9 Bra
3 \x{65e5}\x{672c}\x{8a9e}
9 9 Ket
12 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{65e5}
Last code unit = \x{8a9e}
Subject length lower bound = 3
/[\x{100}]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{100}
5 5 Ket
8 End
------------------------------------------------------------------
/[Z\x{100}]/utf
Memory allocation - code size : 60
------------------------------------------------------------------
0 26 Bra
3 [Z\x{100}]
26 26 Ket
29 End
------------------------------------------------------------------
/^[\x{100}\E-\Q\E\x{150}]/utf
Memory allocation - code size : 32
------------------------------------------------------------------
0 12 Bra
3 ^
4 [\x{100}-\x{150}]
12 12 Ket
15 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E]/utf
Memory allocation - code size : 32
------------------------------------------------------------------
0 12 Bra
3 ^
4 [\x{100}-\x{150}]
12 12 Ket
15 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E/utf
Failed: error 106 at offset 13: missing terminating ] for character class
/[\p{L}]/
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\p{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\p{^L}]/
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\P{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\P{L}]/
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\P{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\P{^L}]/
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\p{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[abc\p{L}\x{0660}]/utf
Memory allocation - code size : 66
------------------------------------------------------------------
0 29 Bra
3 [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff\p{L}\x{660}]
29 29 Ket
32 End
------------------------------------------------------------------
/[\p{Nd}]/utf
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\p{Nd}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\p{Nd}+-]+/utf
Memory allocation - code size : 64
------------------------------------------------------------------
0 28 Bra
3 [+\-0-9\p{Nd}]++
28 28 Ket
31 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
Memory allocation - code size : 36
------------------------------------------------------------------
0 14 Bra
3 /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
14 14 Ket
17 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
Memory allocation - code size : 36
------------------------------------------------------------------
0 14 Bra
3 A\x{391}\x{10427}\x{ff3a}\x{1fb0}
14 14 Ket
17 End
------------------------------------------------------------------
/[\x{105}-\x{109}]/i,utf
Memory allocation - code size : 30
------------------------------------------------------------------
0 11 Bra
3 [\x{104}-\x{109}]
11 11 Ket
14 End
------------------------------------------------------------------
/( ( (?(1)0|) )* )/x
Memory allocation - code size : 70
------------------------------------------------------------------
0 31 Bra
3 25 CBra 1
7 Brazero
8 17 SCBra 2
12 7 Cond
15 1 Capture ref
17 0
19 3 Alt
22 10 Ket
25 17 KetRmax
28 25 Ket
31 31 Ket
34 End
------------------------------------------------------------------
/( (?(1)0|)* )/x
Memory allocation - code size : 56
------------------------------------------------------------------
0 24 Bra
3 18 CBra 1
7 Brazero
8 7 SCond
11 1 Capture ref
13 0
15 3 Alt
18 10 KetRmax
21 18 Ket
24 24 Ket
27 End
------------------------------------------------------------------
/[a]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 a
5 5 Ket
8 End
------------------------------------------------------------------
/[a]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 a
5 5 Ket
8 End
------------------------------------------------------------------
/[\xaa]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{aa}
5 5 Ket
8 End
------------------------------------------------------------------
/[\xaa]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 \x{aa}
5 5 Ket
8 End
------------------------------------------------------------------
/[^a]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 [^a] (not)
5 5 Ket
8 End
------------------------------------------------------------------
/[^a]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 [^a] (not)
5 5 Ket
8 End
------------------------------------------------------------------
/[^\xaa]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 [^\x{aa}] (not)
5 5 Ket
8 End
------------------------------------------------------------------
/[^\xaa]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 5 Bra
3 [^\x{aa}] (not)
5 5 Ket
8 End
------------------------------------------------------------------
#pattern -memory
/[^\d]/utf,ucp
------------------------------------------------------------------
0 11 Bra
3 [^\p{Nd}]
11 11 Ket
14 End
------------------------------------------------------------------
/[[:^alpha:][:^cntrl:]]+/utf,ucp
------------------------------------------------------------------
0 15 Bra
3 [\P{L}\P{Cc}]++
15 15 Ket
18 End
------------------------------------------------------------------
/[[:^cntrl:][:^alpha:]]+/utf,ucp
------------------------------------------------------------------
0 15 Bra
3 [\P{Cc}\P{L}]++
15 15 Ket
18 End
------------------------------------------------------------------
/[[:alpha:]]+/utf,ucp
------------------------------------------------------------------
0 12 Bra
3 [\p{L}]++
12 12 Ket
15 End
------------------------------------------------------------------
/[[:^alpha:]\S]+/utf,ucp
------------------------------------------------------------------
0 15 Bra
3 [\P{L}\P{Xsp}]++
15 15 Ket
18 End
------------------------------------------------------------------
/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/
------------------------------------------------------------------
0 70 Bra
3 abc
9 6 CBra 1
13 d
15 5 Alt
18 e
20 11 Ket
23 *THEN
24 x
26 13 CBra 2
30 123
36 *THEN
37 4
39 28 Alt
42 567
48 6 CBra 3
52 b
54 5 Alt
57 q
59 11 Ket
62 *THEN
63 xx
67 41 Ket
70 70 Ket
73 End
------------------------------------------------------------------
/(((a\2)|(a*)\g<-1>))*a?/
------------------------------------------------------------------
0 46 Bra
3 Brazero
4 37 SCBra 1
8 15 CBra 2
12 8 CBra 3
16 a
18 \2
20 8 Ket
23 15 Alt
26 6 CBra 4
30 a*
32 6 Ket
35 26 Recurse
38 30 Ket
41 37 KetRmax
44 a?+
46 46 Ket
49 End
------------------------------------------------------------------
/((?+1)(\1))/
------------------------------------------------------------------
0 22 Bra
3 16 CBra 1
7 10 Recurse
10 6 CBra 2
14 \1
16 6 Ket
19 16 Ket
22 22 Ket
25 End
------------------------------------------------------------------
"(?1)(?#?'){2}(a)"
------------------------------------------------------------------
0 18 Bra
3 9 Recurse
6 9 Recurse
9 6 CBra 1
13 a
15 6 Ket
18 18 Ket
21 End
------------------------------------------------------------------
/.((?2)(?R)|\1|$)()/
------------------------------------------------------------------
0 33 Bra
3 Any
4 10 CBra 1
8 26 Recurse
11 0 Recurse
14 5 Alt
17 \1
19 4 Alt
22 $
23 19 Ket
26 4 CBra 2
30 4 Ket
33 33 Ket
36 End
------------------------------------------------------------------
/.((?3)(?R)()(?2)|\1|$)()/
------------------------------------------------------------------
0 43 Bra
3 Any
4 20 CBra 1
8 36 Recurse
11 0 Recurse
14 4 CBra 2
18 4 Ket
21 14 Recurse
24 5 Alt
27 \1
29 4 Alt
32 $
33 29 Ket
36 4 CBra 3
40 4 Ket
43 43 Ket
46 End
------------------------------------------------------------------
/(?1)()((((((\1++))\x85)+)|))/
------------------------------------------------------------------
0 69 Bra
3 6 Recurse
6 4 CBra 1
10 4 Ket
13 53 CBra 2
17 43 CBra 3
21 36 CBra 4
25 29 CBra 5
29 20 CBra 6
33 13 CBra 7
37 6 Once
40 \1+
43 6 Ket
46 13 Ket
49 20 Ket
52 \x{85}
54 29 KetRmax
57 36 Ket
60 3 Alt
63 46 Ket
66 53 Ket
69 69 Ket
72 End
------------------------------------------------------------------
# Check the absolute limit on nesting (?| etc. This varies with code unit
# width because the workspace is a different number of bytes. It will fail
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|

/parens_nest_limit=1000,-fullbincode
# Use "expand" to create some very long patterns with nested parentheses, in
# order to test workspace overflow. Again, this varies with code unit width,
# and even when it fails in two modes, the error offset differs. It also varies
# with link size - hence multiple tests with different values.
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
Failed: error 186 at offset 12820: regular expression is too complicated
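The `expand` modifier used in the patterns above rewrites each `\[<characters>]{<count>}` construct into that many copies of the characters before the pattern is compiled. A minimal Python sketch of that rewriting, assuming the non-nested form described in the pcre2test documentation (the helper name `expand_pattern` is hypothetical and not part of pcre2test):

```python
import re

# Matches \[<characters>]{<count>}; the construct cannot be nested.
_REPEAT = re.compile(r"\\\[(.+?)\]\{(\d+)\}")


def expand_pattern(pattern: str) -> str:
    """Replace every repetition construct with its characters repeated <count> times."""
    return _REPEAT.sub(lambda m: m.group(1) * int(m.group(2)), pattern)


# The third pattern from the listing, without its delimiters and modifiers.
third = r"(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}"
print(len(expand_pattern(third)))
```

Under this reading the third pattern expands to 12820 characters (7 + 1793*6 + 6 + 255 + 1794), which equals the reported error offset, so the failure is raised at the very end of the expanded pattern.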
/(?(1)(?1)){8,}+()/debug
------------------------------------------------------------------
0 110 Bra
3 97 Once
6 8 Cond
9 1 Capture ref
11 103 Recurse
14 8 Ket
17 8 Cond
20 1 Capture ref
22 103 Recurse
25 8 Ket
28 8 Cond
31 1 Capture ref
33 103 Recurse
36 8 Ket
39 8 Cond
42 1 Capture ref
44 103 Recurse
47 8 Ket
50 8 Cond
53 1 Capture ref
55 103 Recurse
58 8 Ket
61 8 Cond
64 1 Capture ref
66 103 Recurse
69 8 Ket
72 8 Cond
75 1 Capture ref
77 103 Recurse
80 8 Ket
83 14 SBraPos
86 8 SCond
89 1 Capture ref
91 103 Recurse
94 8 Ket
97 14 KetRpos
100 97 Ket
103 4 CBra 1
107 4 Ket
110 110 Ket
113 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcd
0:
1:
/(?(1)|a(?1)b){2,}+()/debug
------------------------------------------------------------------
0 58 Bra
3 45 Once
6 5 Cond
9 1 Capture ref
11 10 Alt
14 a
16 51 Recurse
19 b
21 15 Ket
24 21 SBraPos
27 5 SCond
30 1 Capture ref
32 10 Alt
35 a
37 51 Recurse
40 b
42 15 Ket
45 21 KetRpos
48 45 Ket
51 4 CBra 1
55 4 Ket
58 58 Ket
61 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcde
No match
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
------------------------------------------------------------------
0 194 Bra
3 61 CBra 1
7 3 Recurse
10 131 Recurse
13 138 Recurse
16 145 Recurse
19 152 Recurse
22 159 Recurse
25 166 Recurse
28 173 Recurse
31 180 Recurse
34 180 Recurse
37 173 Recurse
40 166 Recurse
43 159 Recurse
46 152 Recurse
49 145 Recurse
52 138 Recurse
55 131 Recurse
58 3 Recurse
61 0 Recurse
64 61 Ket
67 61 SCBra 1
71 3 Recurse
74 131 Recurse
77 138 Recurse
80 145 Recurse
83 152 Recurse
86 159 Recurse
89 166 Recurse
92 173 Recurse
95 180 Recurse
98 180 Recurse
101 173 Recurse
104 166 Recurse
107 159 Recurse
110 152 Recurse
113 145 Recurse
116 138 Recurse
119 131 Recurse
122 3 Recurse
125 0 Recurse
128 61 KetRmax
131 4 CBra 2
135 4 Ket
138 4 CBra 3
142 4 Ket
145 4 CBra 4
149 4 Ket
152 4 CBra 5
156 4 Ket
159 4 CBra 6
163 4 Ket
166 4 CBra 7
170 4 Ket
173 4 CBra 8
177 4 Ket
180 4 CBra 9
184 4 Ket
187 4 CBra 10
191 4 Ket
194 194 Ket
197 End
------------------------------------------------------------------
Capture group count = 10
May match empty string
Subject length lower bound = 0
/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
Failed: error 114 at offset 509: missing closing parenthesis
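The offset in this error is simply the length of the pattern: the compiler reaches the end while groups are still open. A small Python sketch, assuming the failing pattern is 100 repetitions of `([00]` followed by `(*ACCEPT)` (the helper `unclosed_groups` is hypothetical and only counts characters, it does not parse regular expressions):

```python
# Assumption: the failing pattern is 100 unclosed groups followed by (*ACCEPT).
pattern = "([00]" * 100 + "(*ACCEPT)"


def unclosed_groups(text: str) -> int:
    """Count '(' minus ')'. Adequate here because no parenthesis is escaped
    or placed inside a character class in this particular pattern."""
    return text.count("(") - text.count(")")


print(len(pattern), unclosed_groups(pattern))  # prints "509 100"
```

Appending 100 closing parentheses, as the next pattern does, balances the groups.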
/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))/-fullbincode
#pattern -fullbincode
/\[()]{65535}/expand
# End of testinput8

1021
3rd/pcre2/testdata/testoutput8-32-2 vendored Normal file

@@ -0,0 +1,1021 @@
# There are two sorts of patterns in this test. A number of them are
# representative patterns whose lengths and offsets are checked. This is just a
# doublecheck test to ensure the sizes don't go horribly wrong when something
# is changed. The operation of these patterns is checked in other tests.
#
# This file also contains tests whose output varies with code unit size and/or
# link size. Unicode support is required for these tests. There are separate
# output files for each code unit size and link size.
#pattern fullbincode,memory
/((?i)b)/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 5 CBra 1
5 /i b
7 5 Ket
9 9 Ket
11 End
------------------------------------------------------------------
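In this 32-bit listing the reported code size appears to equal the number of code units up to and including `End`, multiplied by four bytes per unit. A quick Python check against values copied from this file (the relation is an observation drawn from these numbers, not a documented guarantee):

```python
CODE_UNIT_BYTES = 4  # 32-bit code units

# (pattern, offset of the End opcode, reported "code size") taken from the listing.
SAMPLES = [
    ("/((?i)b)/", 11, 48),
    ("/(?s)(.*X|^B)/", 18, 76),
    ("/#/Ix", 4, 20),
    ("/(a(?1)b)/", 15, 64),
]


def expected_code_size(end_offset: int) -> int:
    """Bytes needed for code units 0..end_offset inclusive."""
    return (end_offset + 1) * CODE_UNIT_BYTES


for pattern, end_offset, reported in SAMPLES:
    print(pattern, expected_code_size(end_offset) == reported)
```

The same arithmetic holds for the long literal patterns in this file, for example `End` at 823 with a reported size of 3296.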
/(?s)(.*X|^B)/
Memory allocation - code size : 76
------------------------------------------------------------------
0 16 Bra
2 7 CBra 1
5 AllAny*
7 X
9 5 Alt
11 ^
12 B
14 12 Ket
16 16 Ket
18 End
------------------------------------------------------------------
/(?s:.*X|^B)/
Memory allocation - code size : 72
------------------------------------------------------------------
0 15 Bra
2 6 Bra
4 AllAny*
6 X
8 5 Alt
10 ^
11 B
13 11 Ket
15 15 Ket
17 End
------------------------------------------------------------------
/^[[:alnum:]]/
Memory allocation - code size : 60
------------------------------------------------------------------
0 12 Bra
2 ^
3 [0-9A-Za-z]
12 12 Ket
14 End
------------------------------------------------------------------
/#/Ix
Memory allocation - code size : 20
------------------------------------------------------------------
0 2 Bra
2 2 Ket
4 End
------------------------------------------------------------------
Capture group count = 0
May match empty string
Options: extended
Subject length lower bound = 0
/a#/Ix
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
Capture group count = 0
Options: extended
First code unit = 'a'
Subject length lower bound = 1
/x?+/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 x?+
4 4 Ket
6 End
------------------------------------------------------------------
/x++/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 x++
4 4 Ket
6 End
------------------------------------------------------------------
/x{1,3}+/
Memory allocation - code size : 40
------------------------------------------------------------------
0 7 Bra
2 x
4 x{0,2}+
7 7 Ket
9 End
------------------------------------------------------------------
/(x)*+/
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 Braposzero
3 5 CBraPos 1
6 x
8 5 KetRpos
10 10 Ket
12 End
------------------------------------------------------------------
/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/
Memory allocation - code size : 220
------------------------------------------------------------------
0 52 Bra
2 ^
3 47 CBra 1
6 5 CBra 2
9 a+
11 5 Ket
13 13 CBra 3
16 [ab]+?
26 13 Ket
28 13 CBra 4
31 [bc]+
41 13 Ket
43 5 CBra 5
46 \w*+
48 5 Ket
50 47 Ket
52 52 Ket
54 End
------------------------------------------------------------------
"8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 3296
------------------------------------------------------------------
0 821 Bra
2 8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
820 \b
821 821 Ket
823 End
------------------------------------------------------------------
"\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 3256
------------------------------------------------------------------
0 811 Bra
2 $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
810 \b
811 811 Ket
813 End
------------------------------------------------------------------
/(a(?1)b)/
Memory allocation - code size : 64
------------------------------------------------------------------
0 13 Bra
2 9 CBra 1
5 a
7 2 Recurse
9 b
11 9 Ket
13 13 Ket
15 End
------------------------------------------------------------------
/(a(?1)+b)/
Memory allocation - code size : 80
------------------------------------------------------------------
0 17 Bra
2 13 CBra 1
5 a
7 4 SBra
9 2 Recurse
11 4 KetRmax
13 b
15 13 Ket
17 17 Ket
19 End
------------------------------------------------------------------
/a(?P<name1>b|c)d(?P<longername2>e)/
Memory allocation - code size : 108
Memory allocation - data size : 104
------------------------------------------------------------------
0 24 Bra
2 a
4 5 CBra 1
7 b
9 4 Alt
11 c
13 9 Ket
15 d
17 5 CBra 2
20 e
22 5 Ket
24 24 Ket
26 End
------------------------------------------------------------------
/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/
Memory allocation - code size : 128
Memory allocation - data size : 36
------------------------------------------------------------------
0 29 Bra
2 18 Bra
4 a
6 12 CBra 1
9 c
11 5 CBra 2
14 d
16 5 Ket
18 12 Ket
20 18 Ket
22 5 CBra 3
25 a
27 5 Ket
29 29 Ket
31 End
------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/
Memory allocation - code size : 108
Memory allocation - data size : 12
------------------------------------------------------------------
0 24 Bra
2 5 CBra 1
5 a
7 5 Ket
9 Any
10 Any
11 Any
12 \1
14 bbb
20 2 Recurse
22 d
24 24 Ket
26 End
------------------------------------------------------------------
/abc(?C255)de(?C)f/
Memory allocation - code size : 100
------------------------------------------------------------------
0 22 Bra
2 abc
8 Callout 255 10 1
12 de
16 Callout 0 16 1
20 f
22 22 Ket
24 End
------------------------------------------------------------------
/abcde/auto_callout
Memory allocation - code size : 156
------------------------------------------------------------------
0 36 Bra
2 Callout 255 0 1
6 a
8 Callout 255 1 1
12 b
14 Callout 255 2 1
18 c
20 Callout 255 3 1
24 d
26 Callout 255 4 1
30 e
32 Callout 255 5 0
36 36 Ket
38 End
------------------------------------------------------------------
/\x{100}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{1000}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{1000}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{10000}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{10000}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{100000}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100000}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{10ffff}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{10ffff}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{110000}/utf
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
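The limit behind this error is the top of the Unicode code space, U+10FFFF, which the preceding `\x{10ffff}` test still accepts. Python's `chr` enforces the same bound, which gives a quick illustration (this shows Python behaviour only and does not call PCRE2; `is_valid_code_point` is a hypothetical helper):

```python
MAX_CODE_POINT = 0x10FFFF  # highest Unicode code point


def is_valid_code_point(value: int) -> bool:
    """Return True if Python can build a character from this code point."""
    try:
        chr(value)
    except ValueError:
        return False
    return True


print(is_valid_code_point(0x10FFFF), is_valid_code_point(0x110000))
```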
/[\x{ff}]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{ff}
4 4 Ket
6 End
------------------------------------------------------------------
/[\x{100}]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/\x80/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{80}
4 4 Ket
6 End
------------------------------------------------------------------
/\xff/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{ff}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{0041}\x{2262}\x{0391}\x{002e}/I,utf
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 A\x{2262}\x{391}.
10 10 Ket
12 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = '.'
Subject length lower bound = 4
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
Memory allocation - code size : 44
------------------------------------------------------------------
0 8 Bra
2 \x{d55c}\x{ad6d}\x{c5b4}
8 8 Ket
10 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{d55c}
Last code unit = \x{c5b4}
Subject length lower bound = 3
/\x{65e5}\x{672c}\x{8a9e}/I,utf
Memory allocation - code size : 44
------------------------------------------------------------------
0 8 Bra
2 \x{65e5}\x{672c}\x{8a9e}
8 8 Ket
10 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{65e5}
Last code unit = \x{8a9e}
Subject length lower bound = 3
/[\x{100}]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/[Z\x{100}]/utf
Memory allocation - code size : 76
------------------------------------------------------------------
0 16 Bra
2 [Z\x{100}]
16 16 Ket
18 End
------------------------------------------------------------------
/^[\x{100}\E-\Q\E\x{150}]/utf
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 ^
3 [\x{100}-\x{150}]
10 10 Ket
12 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E]/utf
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 ^
3 [\x{100}-\x{150}]
10 10 Ket
12 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E/utf
Failed: error 106 at offset 13: missing terminating ] for character class
/[\p{L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\p{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\p{^L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\P{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\P{L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\P{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\P{^L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\p{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[abc\p{L}\x{0660}]/utf
Memory allocation - code size : 88
------------------------------------------------------------------
0 19 Bra
2 [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff\p{L}\x{660}]
19 19 Ket
21 End
------------------------------------------------------------------
/[\p{Nd}]/utf
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\p{Nd}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\p{Nd}+-]+/utf
Memory allocation - code size : 84
------------------------------------------------------------------
0 18 Bra
2 [+\-0-9\p{Nd}]++
18 18 Ket
20 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
Memory allocation - code size : 60
------------------------------------------------------------------
0 12 Bra
2 /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
12 12 Ket
14 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
Memory allocation - code size : 60
------------------------------------------------------------------
0 12 Bra
2 A\x{391}\x{10427}\x{ff3a}\x{1fb0}
12 12 Ket
14 End
------------------------------------------------------------------
/[\x{105}-\x{109}]/i,utf
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\x{104}-\x{109}]
9 9 Ket
11 End
------------------------------------------------------------------
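The class in this listing starts at `\x{104}` although the pattern starts at `\x{105}`: with `/i`, each character's other case is added, and U+0105 has U+0104 as its upper-case form. A rough Python sketch of that widening, using Python's own single-character case mappings as a stand-in for PCRE2's tables (an assumption; `caseless_closure` is a hypothetical helper):

```python
def caseless_closure(first: int, last: int) -> list[int]:
    """Return the sorted code points in first..last plus their other-case forms."""
    points = set()
    for cp in range(first, last + 1):
        ch = chr(cp)
        for variant in (ch, ch.upper(), ch.lower()):
            # Skip multi-character mappings; a class holds single code points.
            if len(variant) == 1:
                points.add(ord(variant))
    return sorted(points)


print([hex(cp) for cp in caseless_closure(0x105, 0x109)])
```

For this range the result is the contiguous run U+0104 to U+0109, matching the `[\x{104}-\x{109}]` shown in the compiled output.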
/( ( (?(1)0|) )* )/x
Memory allocation - code size : 104
------------------------------------------------------------------
0 23 Bra
2 19 CBra 1
5 Brazero
6 13 SCBra 2
9 6 Cond
11 1 Capture ref
13 0
15 2 Alt
17 8 Ket
19 13 KetRmax
21 19 Ket
23 23 Ket
25 End
------------------------------------------------------------------
/( (?(1)0|)* )/x
Memory allocation - code size : 84
------------------------------------------------------------------
0 18 Bra
2 14 CBra 1
5 Brazero
6 6 SCond
8 1 Capture ref
10 0
12 2 Alt
14 8 KetRmax
16 14 Ket
18 18 Ket
20 End
------------------------------------------------------------------
/[a]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
/[a]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
/[\xaa]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{aa}
4 4 Ket
6 End
------------------------------------------------------------------
/[\xaa]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{aa}
4 4 Ket
6 End
------------------------------------------------------------------
/[^a]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^a] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^a]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^a] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^\xaa]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^\x{aa}] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^\xaa]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^\x{aa}] (not)
4 4 Ket
6 End
------------------------------------------------------------------
#pattern -memory
/[^\d]/utf,ucp
------------------------------------------------------------------
0 9 Bra
2 [^\p{Nd}]
9 9 Ket
11 End
------------------------------------------------------------------
/[[:^alpha:][:^cntrl:]]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{L}\P{Cc}]++
13 13 Ket
15 End
------------------------------------------------------------------
/[[:^cntrl:][:^alpha:]]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{Cc}\P{L}]++
13 13 Ket
15 End
------------------------------------------------------------------
/[[:alpha:]]+/utf,ucp
------------------------------------------------------------------
0 10 Bra
2 [\p{L}]++
10 10 Ket
12 End
------------------------------------------------------------------
/[[:^alpha:]\S]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{L}\P{Xsp}]++
13 13 Ket
15 End
------------------------------------------------------------------
/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/
------------------------------------------------------------------
0 60 Bra
2 abc
8 5 CBra 1
11 d
13 4 Alt
15 e
17 9 Ket
19 *THEN
20 x
22 12 CBra 2
25 123
31 *THEN
32 4
34 24 Alt
36 567
42 5 CBra 3
45 b
47 4 Alt
49 q
51 9 Ket
53 *THEN
54 xx
58 36 Ket
60 60 Ket
62 End
------------------------------------------------------------------
/(((a\2)|(a*)\g<-1>))*a?/
------------------------------------------------------------------
0 35 Bra
2 Brazero
3 28 SCBra 1
6 12 CBra 2
9 7 CBra 3
12 a
14 \2
16 7 Ket
18 11 Alt
20 5 CBra 4
23 a*
25 5 Ket
27 20 Recurse
29 23 Ket
31 28 KetRmax
33 a?+
35 35 Ket
37 End
------------------------------------------------------------------
/((?+1)(\1))/
------------------------------------------------------------------
0 16 Bra
2 12 CBra 1
5 7 Recurse
7 5 CBra 2
10 \1
12 5 Ket
14 12 Ket
16 16 Ket
18 End
------------------------------------------------------------------
"(?1)(?#?'){2}(a)"
------------------------------------------------------------------
0 13 Bra
2 6 Recurse
4 6 Recurse
6 5 CBra 1
9 a
11 5 Ket
13 13 Ket
15 End
------------------------------------------------------------------
/.((?2)(?R)|\1|$)()/
------------------------------------------------------------------
0 24 Bra
2 Any
3 7 CBra 1
6 19 Recurse
8 0 Recurse
10 4 Alt
12 \1
14 3 Alt
16 $
17 14 Ket
19 3 CBra 2
22 3 Ket
24 24 Ket
26 End
------------------------------------------------------------------
/.((?3)(?R)()(?2)|\1|$)()/
------------------------------------------------------------------
0 31 Bra
2 Any
3 14 CBra 1
6 26 Recurse
8 0 Recurse
10 3 CBra 2
13 3 Ket
15 10 Recurse
17 4 Alt
19 \1
21 3 Alt
23 $
24 21 Ket
26 3 CBra 3
29 3 Ket
31 31 Ket
33 End
------------------------------------------------------------------
/(?1)()((((((\1++))\x85)+)|))/
------------------------------------------------------------------
0 50 Bra
2 4 Recurse
4 3 CBra 1
7 3 Ket
9 39 CBra 2
12 32 CBra 3
15 27 CBra 4
18 22 CBra 5
21 15 CBra 6
24 10 CBra 7
27 5 Once
29 \1+
32 5 Ket
34 10 Ket
36 15 Ket
38 \x{85}
40 22 KetRmax
42 27 Ket
44 2 Alt
46 34 Ket
48 39 Ket
50 50 Ket
52 End
------------------------------------------------------------------
# Check the absolute limit on nesting (?| etc. This varies with code unit
# width because the workspace is a different number of bytes. It will fail
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|

/parens_nest_limit=1000,-fullbincode
# Use "expand" to create some very long patterns with nested parentheses, in
# order to test workspace overflow. Again, this varies with code unit width,
# and even when it fails in two modes, the error offset differs. It also varies
# with link size - hence multiple tests with different values.
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
Failed: error 186 at offset 12820: regular expression is too complicated
/(?(1)(?1)){8,}+()/debug
------------------------------------------------------------------
0 79 Bra
2 70 Once
4 6 Cond
6 1 Capture ref
8 74 Recurse
10 6 Ket
12 6 Cond
14 1 Capture ref
16 74 Recurse
18 6 Ket
20 6 Cond
22 1 Capture ref
24 74 Recurse
26 6 Ket
28 6 Cond
30 1 Capture ref
32 74 Recurse
34 6 Ket
36 6 Cond
38 1 Capture ref
40 74 Recurse
42 6 Ket
44 6 Cond
46 1 Capture ref
48 74 Recurse
50 6 Ket
52 6 Cond
54 1 Capture ref
56 74 Recurse
58 6 Ket
60 10 SBraPos
62 6 SCond
64 1 Capture ref
66 74 Recurse
68 6 Ket
70 10 KetRpos
72 70 Ket
74 3 CBra 1
77 3 Ket
79 79 Ket
81 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcd
0:
1:
/(?(1)|a(?1)b){2,}+()/debug
------------------------------------------------------------------
0 43 Bra
2 34 Once
4 4 Cond
6 1 Capture ref
8 8 Alt
10 a
12 38 Recurse
14 b
16 12 Ket
18 16 SBraPos
20 4 SCond
22 1 Capture ref
24 8 Alt
26 a
28 38 Recurse
30 b
32 12 Ket
34 16 KetRpos
36 34 Ket
38 3 CBra 1
41 3 Ket
43 43 Ket
45 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcde
No match
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
------------------------------------------------------------------
0 133 Bra
2 41 CBra 1
5 2 Recurse
7 88 Recurse
9 93 Recurse
11 98 Recurse
13 103 Recurse
15 108 Recurse
17 113 Recurse
19 118 Recurse
21 123 Recurse
23 123 Recurse
25 118 Recurse
27 113 Recurse
29 108 Recurse
31 103 Recurse
33 98 Recurse
35 93 Recurse
37 88 Recurse
39 2 Recurse
41 0 Recurse
43 41 Ket
45 41 SCBra 1
48 2 Recurse
50 88 Recurse
52 93 Recurse
54 98 Recurse
56 103 Recurse
58 108 Recurse
60 113 Recurse
62 118 Recurse
64 123 Recurse
66 123 Recurse
68 118 Recurse
70 113 Recurse
72 108 Recurse
74 103 Recurse
76 98 Recurse
78 93 Recurse
80 88 Recurse
82 2 Recurse
84 0 Recurse
86 41 KetRmax
88 3 CBra 2
91 3 Ket
93 3 CBra 3
96 3 Ket
98 3 CBra 4
101 3 Ket
103 3 CBra 5
106 3 Ket
108 3 CBra 6
111 3 Ket
113 3 CBra 7
116 3 Ket
118 3 CBra 8
121 3 Ket
123 3 CBra 9
126 3 Ket
128 3 CBra 10
131 3 Ket
133 133 Ket
135 End
------------------------------------------------------------------
Capture group count = 10
May match empty string
Subject length lower bound = 0

Failed: error 114 at offset 509: missing closing parenthesis
fullbincode
#pattern -fullbincode
/\[()]{65535}/expand
# End of testinput8

1021
3rd/pcre2/testdata/testoutput8-32-3 vendored Normal file

@@ -0,0 +1,1021 @@
# There are two sorts of patterns in this test. A number of them are
# representative patterns whose lengths and offsets are checked. This is just a
# doublecheck test to ensure the sizes don't go horribly wrong when something
# is changed. The operation of these patterns is checked in other tests.
#
# This file also contains tests whose output varies with code unit size and/or
# link size. Unicode support is required for these tests. There are separate
# output files for each code unit size and link size.
#pattern fullbincode,memory
/((?i)b)/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 5 CBra 1
5 /i b
7 5 Ket
9 9 Ket
11 End
------------------------------------------------------------------
/(?s)(.*X|^B)/
Memory allocation - code size : 76
------------------------------------------------------------------
0 16 Bra
2 7 CBra 1
5 AllAny*
7 X
9 5 Alt
11 ^
12 B
14 12 Ket
16 16 Ket
18 End
------------------------------------------------------------------
/(?s:.*X|^B)/
Memory allocation - code size : 72
------------------------------------------------------------------
0 15 Bra
2 6 Bra
4 AllAny*
6 X
8 5 Alt
10 ^
11 B
13 11 Ket
15 15 Ket
17 End
------------------------------------------------------------------
/^[[:alnum:]]/
Memory allocation - code size : 60
------------------------------------------------------------------
0 12 Bra
2 ^
3 [0-9A-Za-z]
12 12 Ket
14 End
------------------------------------------------------------------
/#/Ix
Memory allocation - code size : 20
------------------------------------------------------------------
0 2 Bra
2 2 Ket
4 End
------------------------------------------------------------------
Capture group count = 0
May match empty string
Options: extended
Subject length lower bound = 0
/a#/Ix
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
Capture group count = 0
Options: extended
First code unit = 'a'
Subject length lower bound = 1
/x?+/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 x?+
4 4 Ket
6 End
------------------------------------------------------------------
/x++/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 x++
4 4 Ket
6 End
------------------------------------------------------------------
/x{1,3}+/
Memory allocation - code size : 40
------------------------------------------------------------------
0 7 Bra
2 x
4 x{0,2}+
7 7 Ket
9 End
------------------------------------------------------------------
/(x)*+/
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 Braposzero
3 5 CBraPos 1
6 x
8 5 KetRpos
10 10 Ket
12 End
------------------------------------------------------------------
/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/
Memory allocation - code size : 220
------------------------------------------------------------------
0 52 Bra
2 ^
3 47 CBra 1
6 5 CBra 2
9 a+
11 5 Ket
13 13 CBra 3
16 [ab]+?
26 13 Ket
28 13 CBra 4
31 [bc]+
41 13 Ket
43 5 CBra 5
46 \w*+
48 5 Ket
50 47 Ket
52 52 Ket
54 End
------------------------------------------------------------------
"8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 3296
------------------------------------------------------------------
0 821 Bra
2 8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
820 \b
821 821 Ket
823 End
------------------------------------------------------------------
"\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 3256
------------------------------------------------------------------
0 811 Bra
2 $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
810 \b
811 811 Ket
813 End
------------------------------------------------------------------
/(a(?1)b)/
Memory allocation - code size : 64
------------------------------------------------------------------
0 13 Bra
2 9 CBra 1
5 a
7 2 Recurse
9 b
11 9 Ket
13 13 Ket
15 End
------------------------------------------------------------------
/(a(?1)+b)/
Memory allocation - code size : 80
------------------------------------------------------------------
0 17 Bra
2 13 CBra 1
5 a
7 4 SBra
9 2 Recurse
11 4 KetRmax
13 b
15 13 Ket
17 17 Ket
19 End
------------------------------------------------------------------
/a(?P<name1>b|c)d(?P<longername2>e)/
Memory allocation - code size : 108
Memory allocation - data size : 104
------------------------------------------------------------------
0 24 Bra
2 a
4 5 CBra 1
7 b
9 4 Alt
11 c
13 9 Ket
15 d
17 5 CBra 2
20 e
22 5 Ket
24 24 Ket
26 End
------------------------------------------------------------------
/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/
Memory allocation - code size : 128
Memory allocation - data size : 36
------------------------------------------------------------------
0 29 Bra
2 18 Bra
4 a
6 12 CBra 1
9 c
11 5 CBra 2
14 d
16 5 Ket
18 12 Ket
20 18 Ket
22 5 CBra 3
25 a
27 5 Ket
29 29 Ket
31 End
------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/
Memory allocation - code size : 108
Memory allocation - data size : 12
------------------------------------------------------------------
0 24 Bra
2 5 CBra 1
5 a
7 5 Ket
9 Any
10 Any
11 Any
12 \1
14 bbb
20 2 Recurse
22 d
24 24 Ket
26 End
------------------------------------------------------------------
/abc(?C255)de(?C)f/
Memory allocation - code size : 100
------------------------------------------------------------------
0 22 Bra
2 abc
8 Callout 255 10 1
12 de
16 Callout 0 16 1
20 f
22 22 Ket
24 End
------------------------------------------------------------------
/abcde/auto_callout
Memory allocation - code size : 156
------------------------------------------------------------------
0 36 Bra
2 Callout 255 0 1
6 a
8 Callout 255 1 1
12 b
14 Callout 255 2 1
18 c
20 Callout 255 3 1
24 d
26 Callout 255 4 1
30 e
32 Callout 255 5 0
36 36 Ket
38 End
------------------------------------------------------------------
/\x{100}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{1000}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{1000}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{10000}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{10000}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{100000}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100000}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{10ffff}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{10ffff}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{110000}/utf
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
/[\x{ff}]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{ff}
4 4 Ket
6 End
------------------------------------------------------------------
/[\x{100}]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/\x80/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{80}
4 4 Ket
6 End
------------------------------------------------------------------
/\xff/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{ff}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{0041}\x{2262}\x{0391}\x{002e}/I,utf
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 A\x{2262}\x{391}.
10 10 Ket
12 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = '.'
Subject length lower bound = 4
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
Memory allocation - code size : 44
------------------------------------------------------------------
0 8 Bra
2 \x{d55c}\x{ad6d}\x{c5b4}
8 8 Ket
10 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{d55c}
Last code unit = \x{c5b4}
Subject length lower bound = 3
/\x{65e5}\x{672c}\x{8a9e}/I,utf
Memory allocation - code size : 44
------------------------------------------------------------------
0 8 Bra
2 \x{65e5}\x{672c}\x{8a9e}
8 8 Ket
10 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{65e5}
Last code unit = \x{8a9e}
Subject length lower bound = 3
/[\x{100}]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/[Z\x{100}]/utf
Memory allocation - code size : 76
------------------------------------------------------------------
0 16 Bra
2 [Z\x{100}]
16 16 Ket
18 End
------------------------------------------------------------------
/^[\x{100}\E-\Q\E\x{150}]/utf
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 ^
3 [\x{100}-\x{150}]
10 10 Ket
12 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E]/utf
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 ^
3 [\x{100}-\x{150}]
10 10 Ket
12 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E/utf
Failed: error 106 at offset 13: missing terminating ] for character class
/[\p{L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\p{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\p{^L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\P{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\P{L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\P{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\P{^L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\p{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[abc\p{L}\x{0660}]/utf
Memory allocation - code size : 88
------------------------------------------------------------------
0 19 Bra
2 [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff\p{L}\x{660}]
19 19 Ket
21 End
------------------------------------------------------------------
/[\p{Nd}]/utf
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\p{Nd}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\p{Nd}+-]+/utf
Memory allocation - code size : 84
------------------------------------------------------------------
0 18 Bra
2 [+\-0-9\p{Nd}]++
18 18 Ket
20 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
Memory allocation - code size : 60
------------------------------------------------------------------
0 12 Bra
2 /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
12 12 Ket
14 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
Memory allocation - code size : 60
------------------------------------------------------------------
0 12 Bra
2 A\x{391}\x{10427}\x{ff3a}\x{1fb0}
12 12 Ket
14 End
------------------------------------------------------------------
/[\x{105}-\x{109}]/i,utf
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\x{104}-\x{109}]
9 9 Ket
11 End
------------------------------------------------------------------
/( ( (?(1)0|) )* )/x
Memory allocation - code size : 104
------------------------------------------------------------------
0 23 Bra
2 19 CBra 1
5 Brazero
6 13 SCBra 2
9 6 Cond
11 1 Capture ref
13 0
15 2 Alt
17 8 Ket
19 13 KetRmax
21 19 Ket
23 23 Ket
25 End
------------------------------------------------------------------
/( (?(1)0|)* )/x
Memory allocation - code size : 84
------------------------------------------------------------------
0 18 Bra
2 14 CBra 1
5 Brazero
6 6 SCond
8 1 Capture ref
10 0
12 2 Alt
14 8 KetRmax
16 14 Ket
18 18 Ket
20 End
------------------------------------------------------------------
/[a]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
/[a]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
/[\xaa]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{aa}
4 4 Ket
6 End
------------------------------------------------------------------
/[\xaa]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{aa}
4 4 Ket
6 End
------------------------------------------------------------------
/[^a]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^a] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^a]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^a] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^\xaa]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^\x{aa}] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^\xaa]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^\x{aa}] (not)
4 4 Ket
6 End
------------------------------------------------------------------
#pattern -memory
/[^\d]/utf,ucp
------------------------------------------------------------------
0 9 Bra
2 [^\p{Nd}]
9 9 Ket
11 End
------------------------------------------------------------------
/[[:^alpha:][:^cntrl:]]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{L}\P{Cc}]++
13 13 Ket
15 End
------------------------------------------------------------------
/[[:^cntrl:][:^alpha:]]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{Cc}\P{L}]++
13 13 Ket
15 End
------------------------------------------------------------------
/[[:alpha:]]+/utf,ucp
------------------------------------------------------------------
0 10 Bra
2 [\p{L}]++
10 10 Ket
12 End
------------------------------------------------------------------
/[[:^alpha:]\S]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{L}\P{Xsp}]++
13 13 Ket
15 End
------------------------------------------------------------------
/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/
------------------------------------------------------------------
0 60 Bra
2 abc
8 5 CBra 1
11 d
13 4 Alt
15 e
17 9 Ket
19 *THEN
20 x
22 12 CBra 2
25 123
31 *THEN
32 4
34 24 Alt
36 567
42 5 CBra 3
45 b
47 4 Alt
49 q
51 9 Ket
53 *THEN
54 xx
58 36 Ket
60 60 Ket
62 End
------------------------------------------------------------------
/(((a\2)|(a*)\g<-1>))*a?/
------------------------------------------------------------------
0 35 Bra
2 Brazero
3 28 SCBra 1
6 12 CBra 2
9 7 CBra 3
12 a
14 \2
16 7 Ket
18 11 Alt
20 5 CBra 4
23 a*
25 5 Ket
27 20 Recurse
29 23 Ket
31 28 KetRmax
33 a?+
35 35 Ket
37 End
------------------------------------------------------------------
/((?+1)(\1))/
------------------------------------------------------------------
0 16 Bra
2 12 CBra 1
5 7 Recurse
7 5 CBra 2
10 \1
12 5 Ket
14 12 Ket
16 16 Ket
18 End
------------------------------------------------------------------
"(?1)(?#?'){2}(a)"
------------------------------------------------------------------
0 13 Bra
2 6 Recurse
4 6 Recurse
6 5 CBra 1
9 a
11 5 Ket
13 13 Ket
15 End
------------------------------------------------------------------
/.((?2)(?R)|\1|$)()/
------------------------------------------------------------------
0 24 Bra
2 Any
3 7 CBra 1
6 19 Recurse
8 0 Recurse
10 4 Alt
12 \1
14 3 Alt
16 $
17 14 Ket
19 3 CBra 2
22 3 Ket
24 24 Ket
26 End
------------------------------------------------------------------
/.((?3)(?R)()(?2)|\1|$)()/
------------------------------------------------------------------
0 31 Bra
2 Any
3 14 CBra 1
6 26 Recurse
8 0 Recurse
10 3 CBra 2
13 3 Ket
15 10 Recurse
17 4 Alt
19 \1
21 3 Alt
23 $
24 21 Ket
26 3 CBra 3
29 3 Ket
31 31 Ket
33 End
------------------------------------------------------------------
/(?1)()((((((\1++))\x85)+)|))/
------------------------------------------------------------------
0 50 Bra
2 4 Recurse
4 3 CBra 1
7 3 Ket
9 39 CBra 2
12 32 CBra 3
15 27 CBra 4
18 22 CBra 5
21 15 CBra 6
24 10 CBra 7
27 5 Once
29 \1+
32 5 Ket
34 10 Ket
36 15 Ket
38 \x{85}
40 22 KetRmax
42 27 Ket
44 2 Alt
46 34 Ket
48 39 Ket
50 50 Ket
52 End
------------------------------------------------------------------
# Check the absolute limit on nesting (?| etc. This varies with code unit
# width because the workspace is a different number of bytes. It will fail
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|

/parens_nest_limit=1000,-fullbincode
# Use "expand" to create some very long patterns with nested parentheses, in
# order to test workspace overflow. Again, this varies with code unit width,
# and even when it fails in two modes, the error offset differs. It also varies
# with link size - hence multiple tests with different values.
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
Failed: error 186 at offset 12820: regular expression is too complicated
/(?(1)(?1)){8,}+()/debug
------------------------------------------------------------------
0 79 Bra
2 70 Once
4 6 Cond
6 1 Capture ref
8 74 Recurse
10 6 Ket
12 6 Cond
14 1 Capture ref
16 74 Recurse
18 6 Ket
20 6 Cond
22 1 Capture ref
24 74 Recurse
26 6 Ket
28 6 Cond
30 1 Capture ref
32 74 Recurse
34 6 Ket
36 6 Cond
38 1 Capture ref
40 74 Recurse
42 6 Ket
44 6 Cond
46 1 Capture ref
48 74 Recurse
50 6 Ket
52 6 Cond
54 1 Capture ref
56 74 Recurse
58 6 Ket
60 10 SBraPos
62 6 SCond
64 1 Capture ref
66 74 Recurse
68 6 Ket
70 10 KetRpos
72 70 Ket
74 3 CBra 1
77 3 Ket
79 79 Ket
81 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcd
0:
1:
/(?(1)|a(?1)b){2,}+()/debug
------------------------------------------------------------------
0 43 Bra
2 34 Once
4 4 Cond
6 1 Capture ref
8 8 Alt
10 a
12 38 Recurse
14 b
16 12 Ket
18 16 SBraPos
20 4 SCond
22 1 Capture ref
24 8 Alt
26 a
28 38 Recurse
30 b
32 12 Ket
34 16 KetRpos
36 34 Ket
38 3 CBra 1
41 3 Ket
43 43 Ket
45 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcde
No match
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
------------------------------------------------------------------
0 133 Bra
2 41 CBra 1
5 2 Recurse
7 88 Recurse
9 93 Recurse
11 98 Recurse
13 103 Recurse
15 108 Recurse
17 113 Recurse
19 118 Recurse
21 123 Recurse
23 123 Recurse
25 118 Recurse
27 113 Recurse
29 108 Recurse
31 103 Recurse
33 98 Recurse
35 93 Recurse
37 88 Recurse
39 2 Recurse
41 0 Recurse
43 41 Ket
45 41 SCBra 1
48 2 Recurse
50 88 Recurse
52 93 Recurse
54 98 Recurse
56 103 Recurse
58 108 Recurse
60 113 Recurse
62 118 Recurse
64 123 Recurse
66 123 Recurse
68 118 Recurse
70 113 Recurse
72 108 Recurse
74 103 Recurse
76 98 Recurse
78 93 Recurse
80 88 Recurse
82 2 Recurse
84 0 Recurse
86 41 KetRmax
88 3 CBra 2
91 3 Ket
93 3 CBra 3
96 3 Ket
98 3 CBra 4
101 3 Ket
103 3 CBra 5
106 3 Ket
108 3 CBra 6
111 3 Ket
113 3 CBra 7
116 3 Ket
118 3 CBra 8
121 3 Ket
123 3 CBra 9
126 3 Ket
128 3 CBra 10
131 3 Ket
133 133 Ket
135 End
------------------------------------------------------------------
Capture group count = 10
May match empty string
Subject length lower bound = 0

Failed: error 114 at offset 509: missing closing parenthesis
fullbincode
#pattern -fullbincode
/\[()]{65535}/expand
# End of testinput8

1021
3rd/pcre2/testdata/testoutput8-32-4 vendored Normal file
View File

@@ -0,0 +1,1021 @@
# There are two sorts of patterns in this test. A number of them are
# representative patterns whose lengths and offsets are checked. This is just a
# doublecheck test to ensure the sizes don't go horribly wrong when something
# is changed. The operation of these patterns is checked in other tests.
#
# This file also contains tests whose output varies with code unit size and/or
# link size. Unicode support is required for these tests. There are separate
# output files for each code unit size and link size.
#pattern fullbincode,memory
/((?i)b)/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 5 CBra 1
5 /i b
7 5 Ket
9 9 Ket
11 End
------------------------------------------------------------------
/(?s)(.*X|^B)/
Memory allocation - code size : 76
------------------------------------------------------------------
0 16 Bra
2 7 CBra 1
5 AllAny*
7 X
9 5 Alt
11 ^
12 B
14 12 Ket
16 16 Ket
18 End
------------------------------------------------------------------
/(?s:.*X|^B)/
Memory allocation - code size : 72
------------------------------------------------------------------
0 15 Bra
2 6 Bra
4 AllAny*
6 X
8 5 Alt
10 ^
11 B
13 11 Ket
15 15 Ket
17 End
------------------------------------------------------------------
/^[[:alnum:]]/
Memory allocation - code size : 60
------------------------------------------------------------------
0 12 Bra
2 ^
3 [0-9A-Za-z]
12 12 Ket
14 End
------------------------------------------------------------------
/#/Ix
Memory allocation - code size : 20
------------------------------------------------------------------
0 2 Bra
2 2 Ket
4 End
------------------------------------------------------------------
Capture group count = 0
May match empty string
Options: extended
Subject length lower bound = 0
/a#/Ix
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
Capture group count = 0
Options: extended
First code unit = 'a'
Subject length lower bound = 1
/x?+/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 x?+
4 4 Ket
6 End
------------------------------------------------------------------
/x++/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 x++
4 4 Ket
6 End
------------------------------------------------------------------
/x{1,3}+/
Memory allocation - code size : 40
------------------------------------------------------------------
0 7 Bra
2 x
4 x{0,2}+
7 7 Ket
9 End
------------------------------------------------------------------
/(x)*+/
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 Braposzero
3 5 CBraPos 1
6 x
8 5 KetRpos
10 10 Ket
12 End
------------------------------------------------------------------
/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/
Memory allocation - code size : 220
------------------------------------------------------------------
0 52 Bra
2 ^
3 47 CBra 1
6 5 CBra 2
9 a+
11 5 Ket
13 13 CBra 3
16 [ab]+?
26 13 Ket
28 13 CBra 4
31 [bc]+
41 13 Ket
43 5 CBra 5
46 \w*+
48 5 Ket
50 47 Ket
52 52 Ket
54 End
------------------------------------------------------------------
"8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 3296
------------------------------------------------------------------
0 821 Bra
2 8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
820 \b
821 821 Ket
823 End
------------------------------------------------------------------
"\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 3256
------------------------------------------------------------------
0 811 Bra
2 $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
810 \b
811 811 Ket
813 End
------------------------------------------------------------------
/(a(?1)b)/
Memory allocation - code size : 64
------------------------------------------------------------------
0 13 Bra
2 9 CBra 1
5 a
7 2 Recurse
9 b
11 9 Ket
13 13 Ket
15 End
------------------------------------------------------------------
/(a(?1)+b)/
Memory allocation - code size : 80
------------------------------------------------------------------
0 17 Bra
2 13 CBra 1
5 a
7 4 SBra
9 2 Recurse
11 4 KetRmax
13 b
15 13 Ket
17 17 Ket
19 End
------------------------------------------------------------------
/a(?P<name1>b|c)d(?P<longername2>e)/
Memory allocation - code size : 108
Memory allocation - data size : 104
------------------------------------------------------------------
0 24 Bra
2 a
4 5 CBra 1
7 b
9 4 Alt
11 c
13 9 Ket
15 d
17 5 CBra 2
20 e
22 5 Ket
24 24 Ket
26 End
------------------------------------------------------------------
/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/
Memory allocation - code size : 128
Memory allocation - data size : 36
------------------------------------------------------------------
0 29 Bra
2 18 Bra
4 a
6 12 CBra 1
9 c
11 5 CBra 2
14 d
16 5 Ket
18 12 Ket
20 18 Ket
22 5 CBra 3
25 a
27 5 Ket
29 29 Ket
31 End
------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/
Memory allocation - code size : 108
Memory allocation - data size : 12
------------------------------------------------------------------
0 24 Bra
2 5 CBra 1
5 a
7 5 Ket
9 Any
10 Any
11 Any
12 \1
14 bbb
20 2 Recurse
22 d
24 24 Ket
26 End
------------------------------------------------------------------
/abc(?C255)de(?C)f/
Memory allocation - code size : 100
------------------------------------------------------------------
0 22 Bra
2 abc
8 Callout 255 10 1
12 de
16 Callout 0 16 1
20 f
22 22 Ket
24 End
------------------------------------------------------------------
/abcde/auto_callout
Memory allocation - code size : 156
------------------------------------------------------------------
0 36 Bra
2 Callout 255 0 1
6 a
8 Callout 255 1 1
12 b
14 Callout 255 2 1
18 c
20 Callout 255 3 1
24 d
26 Callout 255 4 1
30 e
32 Callout 255 5 0
36 36 Ket
38 End
------------------------------------------------------------------
/\x{100}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{1000}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{1000}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{10000}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{10000}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{100000}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100000}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{10ffff}/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{10ffff}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{110000}/utf
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
/[\x{ff}]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{ff}
4 4 Ket
6 End
------------------------------------------------------------------
/[\x{100}]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/\x80/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{80}
4 4 Ket
6 End
------------------------------------------------------------------
/\xff/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{ff}
4 4 Ket
6 End
------------------------------------------------------------------
/\x{0041}\x{2262}\x{0391}\x{002e}/I,utf
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 A\x{2262}\x{391}.
10 10 Ket
12 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = '.'
Subject length lower bound = 4
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
Memory allocation - code size : 44
------------------------------------------------------------------
0 8 Bra
2 \x{d55c}\x{ad6d}\x{c5b4}
8 8 Ket
10 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{d55c}
Last code unit = \x{c5b4}
Subject length lower bound = 3
/\x{65e5}\x{672c}\x{8a9e}/I,utf
Memory allocation - code size : 44
------------------------------------------------------------------
0 8 Bra
2 \x{65e5}\x{672c}\x{8a9e}
8 8 Ket
10 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \x{65e5}
Last code unit = \x{8a9e}
Subject length lower bound = 3
/[\x{100}]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{100}
4 4 Ket
6 End
------------------------------------------------------------------
/[Z\x{100}]/utf
Memory allocation - code size : 76
------------------------------------------------------------------
0 16 Bra
2 [Z\x{100}]
16 16 Ket
18 End
------------------------------------------------------------------
/^[\x{100}\E-\Q\E\x{150}]/utf
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 ^
3 [\x{100}-\x{150}]
10 10 Ket
12 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E]/utf
Memory allocation - code size : 52
------------------------------------------------------------------
0 10 Bra
2 ^
3 [\x{100}-\x{150}]
10 10 Ket
12 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E/utf
Failed: error 106 at offset 13: missing terminating ] for character class
/[\p{L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\p{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\p{^L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\P{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\P{L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\P{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\P{^L}]/
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\p{L}]
9 9 Ket
11 End
------------------------------------------------------------------
/[abc\p{L}\x{0660}]/utf
Memory allocation - code size : 88
------------------------------------------------------------------
0 19 Bra
2 [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff\p{L}\x{660}]
19 19 Ket
21 End
------------------------------------------------------------------
/[\p{Nd}]/utf
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\p{Nd}]
9 9 Ket
11 End
------------------------------------------------------------------
/[\p{Nd}+-]+/utf
Memory allocation - code size : 84
------------------------------------------------------------------
0 18 Bra
2 [+\-0-9\p{Nd}]++
18 18 Ket
20 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
Memory allocation - code size : 60
------------------------------------------------------------------
0 12 Bra
2 /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
12 12 Ket
14 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
Memory allocation - code size : 60
------------------------------------------------------------------
0 12 Bra
2 A\x{391}\x{10427}\x{ff3a}\x{1fb0}
12 12 Ket
14 End
------------------------------------------------------------------
/[\x{105}-\x{109}]/i,utf
Memory allocation - code size : 48
------------------------------------------------------------------
0 9 Bra
2 [\x{104}-\x{109}]
9 9 Ket
11 End
------------------------------------------------------------------
/( ( (?(1)0|) )* )/x
Memory allocation - code size : 104
------------------------------------------------------------------
0 23 Bra
2 19 CBra 1
5 Brazero
6 13 SCBra 2
9 6 Cond
11 1 Capture ref
13 0
15 2 Alt
17 8 Ket
19 13 KetRmax
21 19 Ket
23 23 Ket
25 End
------------------------------------------------------------------
/( (?(1)0|)* )/x
Memory allocation - code size : 84
------------------------------------------------------------------
0 18 Bra
2 14 CBra 1
5 Brazero
6 6 SCond
8 1 Capture ref
10 0
12 2 Alt
14 8 KetRmax
16 14 Ket
18 18 Ket
20 End
------------------------------------------------------------------
/[a]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
/[a]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 a
4 4 Ket
6 End
------------------------------------------------------------------
/[\xaa]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{aa}
4 4 Ket
6 End
------------------------------------------------------------------
/[\xaa]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 \x{aa}
4 4 Ket
6 End
------------------------------------------------------------------
/[^a]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^a] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^a]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^a] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^\xaa]/
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^\x{aa}] (not)
4 4 Ket
6 End
------------------------------------------------------------------
/[^\xaa]/utf
Memory allocation - code size : 28
------------------------------------------------------------------
0 4 Bra
2 [^\x{aa}] (not)
4 4 Ket
6 End
------------------------------------------------------------------
#pattern -memory
/[^\d]/utf,ucp
------------------------------------------------------------------
0 9 Bra
2 [^\p{Nd}]
9 9 Ket
11 End
------------------------------------------------------------------
/[[:^alpha:][:^cntrl:]]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{L}\P{Cc}]++
13 13 Ket
15 End
------------------------------------------------------------------
/[[:^cntrl:][:^alpha:]]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{Cc}\P{L}]++
13 13 Ket
15 End
------------------------------------------------------------------
/[[:alpha:]]+/utf,ucp
------------------------------------------------------------------
0 10 Bra
2 [\p{L}]++
10 10 Ket
12 End
------------------------------------------------------------------
/[[:^alpha:]\S]+/utf,ucp
------------------------------------------------------------------
0 13 Bra
2 [\P{L}\P{Xsp}]++
13 13 Ket
15 End
------------------------------------------------------------------
/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/
------------------------------------------------------------------
0 60 Bra
2 abc
8 5 CBra 1
11 d
13 4 Alt
15 e
17 9 Ket
19 *THEN
20 x
22 12 CBra 2
25 123
31 *THEN
32 4
34 24 Alt
36 567
42 5 CBra 3
45 b
47 4 Alt
49 q
51 9 Ket
53 *THEN
54 xx
58 36 Ket
60 60 Ket
62 End
------------------------------------------------------------------
/(((a\2)|(a*)\g<-1>))*a?/
------------------------------------------------------------------
0 35 Bra
2 Brazero
3 28 SCBra 1
6 12 CBra 2
9 7 CBra 3
12 a
14 \2
16 7 Ket
18 11 Alt
20 5 CBra 4
23 a*
25 5 Ket
27 20 Recurse
29 23 Ket
31 28 KetRmax
33 a?+
35 35 Ket
37 End
------------------------------------------------------------------
/((?+1)(\1))/
------------------------------------------------------------------
0 16 Bra
2 12 CBra 1
5 7 Recurse
7 5 CBra 2
10 \1
12 5 Ket
14 12 Ket
16 16 Ket
18 End
------------------------------------------------------------------
"(?1)(?#?'){2}(a)"
------------------------------------------------------------------
0 13 Bra
2 6 Recurse
4 6 Recurse
6 5 CBra 1
9 a
11 5 Ket
13 13 Ket
15 End
------------------------------------------------------------------
/.((?2)(?R)|\1|$)()/
------------------------------------------------------------------
0 24 Bra
2 Any
3 7 CBra 1
6 19 Recurse
8 0 Recurse
10 4 Alt
12 \1
14 3 Alt
16 $
17 14 Ket
19 3 CBra 2
22 3 Ket
24 24 Ket
26 End
------------------------------------------------------------------
/.((?3)(?R)()(?2)|\1|$)()/
------------------------------------------------------------------
0 31 Bra
2 Any
3 14 CBra 1
6 26 Recurse
8 0 Recurse
10 3 CBra 2
13 3 Ket
15 10 Recurse
17 4 Alt
19 \1
21 3 Alt
23 $
24 21 Ket
26 3 CBra 3
29 3 Ket
31 31 Ket
33 End
------------------------------------------------------------------
/(?1)()((((((\1++))\x85)+)|))/
------------------------------------------------------------------
0 50 Bra
2 4 Recurse
4 3 CBra 1
7 3 Ket
9 39 CBra 2
12 32 CBra 3
15 27 CBra 4
18 22 CBra 5
21 15 CBra 6
24 10 CBra 7
27 5 Once
29 \1+
32 5 Ket
34 10 Ket
36 15 Ket
38 \x{85}
40 22 KetRmax
42 27 Ket
44 2 Alt
46 34 Ket
48 39 Ket
50 50 Ket
52 End
------------------------------------------------------------------
# Check the absolute limit on nesting (?| etc. This varies with code unit
# width because the workspace is a different number of bytes. It will fail
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|

/parens_nest_limit=1000,-fullbincode
# Use "expand" to create some very long patterns with nested parentheses, in
# order to test workspace overflow. Again, this varies with code unit width,
# and even when it fails in two modes, the error offset differs. It also varies
# with link size - hence multiple tests with different values.
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
Failed: error 186 at offset 12820: regular expression is too complicated
/(?(1)(?1)){8,}+()/debug
------------------------------------------------------------------
0 79 Bra
2 70 Once
4 6 Cond
6 1 Capture ref
8 74 Recurse
10 6 Ket
12 6 Cond
14 1 Capture ref
16 74 Recurse
18 6 Ket
20 6 Cond
22 1 Capture ref
24 74 Recurse
26 6 Ket
28 6 Cond
30 1 Capture ref
32 74 Recurse
34 6 Ket
36 6 Cond
38 1 Capture ref
40 74 Recurse
42 6 Ket
44 6 Cond
46 1 Capture ref
48 74 Recurse
50 6 Ket
52 6 Cond
54 1 Capture ref
56 74 Recurse
58 6 Ket
60 10 SBraPos
62 6 SCond
64 1 Capture ref
66 74 Recurse
68 6 Ket
70 10 KetRpos
72 70 Ket
74 3 CBra 1
77 3 Ket
79 79 Ket
81 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcd
0:
1:
/(?(1)|a(?1)b){2,}+()/debug
------------------------------------------------------------------
0 43 Bra
2 34 Once
4 4 Cond
6 1 Capture ref
8 8 Alt
10 a
12 38 Recurse
14 b
16 12 Ket
18 16 SBraPos
20 4 SCond
22 1 Capture ref
24 8 Alt
26 a
28 38 Recurse
30 b
32 12 Ket
34 16 KetRpos
36 34 Ket
38 3 CBra 1
41 3 Ket
43 43 Ket
45 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcde
No match
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
------------------------------------------------------------------
0 133 Bra
2 41 CBra 1
5 2 Recurse
7 88 Recurse
9 93 Recurse
11 98 Recurse
13 103 Recurse
15 108 Recurse
17 113 Recurse
19 118 Recurse
21 123 Recurse
23 123 Recurse
25 118 Recurse
27 113 Recurse
29 108 Recurse
31 103 Recurse
33 98 Recurse
35 93 Recurse
37 88 Recurse
39 2 Recurse
41 0 Recurse
43 41 Ket
45 41 SCBra 1
48 2 Recurse
50 88 Recurse
52 93 Recurse
54 98 Recurse
56 103 Recurse
58 108 Recurse
60 113 Recurse
62 118 Recurse
64 123 Recurse
66 123 Recurse
68 118 Recurse
70 113 Recurse
72 108 Recurse
74 103 Recurse
76 98 Recurse
78 93 Recurse
80 88 Recurse
82 2 Recurse
84 0 Recurse
86 41 KetRmax
88 3 CBra 2
91 3 Ket
93 3 CBra 3
96 3 Ket
98 3 CBra 4
101 3 Ket
103 3 CBra 5
106 3 Ket
108 3 CBra 6
111 3 Ket
113 3 CBra 7
116 3 Ket
118 3 CBra 8
121 3 Ket
123 3 CBra 9
126 3 Ket
128 3 CBra 10
131 3 Ket
133 133 Ket
135 End
------------------------------------------------------------------
Capture group count = 10
May match empty string
Subject length lower bound = 0

Failed: error 114 at offset 509: missing closing parenthesis
fullbincode
#pattern -fullbincode
/\[()]{65535}/expand
# End of testinput8

1023
3rd/pcre2/testdata/testoutput8-8-2 vendored Normal file

@@ -0,0 +1,1023 @@
# There are two sorts of patterns in this test. A number of them are
# representative patterns whose lengths and offsets are checked. This is just a
# doublecheck test to ensure the sizes don't go horribly wrong when something
# is changed. The operation of these patterns is checked in other tests.
#
# This file also contains tests whose output varies with code unit size and/or
# link size. Unicode support is required for these tests. There are separate
# output files for each code unit size and link size.
#pattern fullbincode,memory
/((?i)b)/
Memory allocation - code size : 17
------------------------------------------------------------------
0 13 Bra
3 7 CBra 1
8 /i b
10 7 Ket
13 13 Ket
16 End
------------------------------------------------------------------
/(?s)(.*X|^B)/
Memory allocation - code size : 25
------------------------------------------------------------------
0 21 Bra
3 9 CBra 1
8 AllAny*
10 X
12 6 Alt
15 ^
16 B
18 15 Ket
21 21 Ket
24 End
------------------------------------------------------------------
/(?s:.*X|^B)/
Memory allocation - code size : 23
------------------------------------------------------------------
0 19 Bra
3 7 Bra
6 AllAny*
8 X
10 6 Alt
13 ^
14 B
16 13 Ket
19 19 Ket
22 End
------------------------------------------------------------------
/^[[:alnum:]]/
Memory allocation - code size : 41
------------------------------------------------------------------
0 37 Bra
3 ^
4 [0-9A-Za-z]
37 37 Ket
40 End
------------------------------------------------------------------
/#/Ix
Memory allocation - code size : 7
------------------------------------------------------------------
0 3 Bra
3 3 Ket
6 End
------------------------------------------------------------------
Capture group count = 0
May match empty string
Options: extended
Subject length lower bound = 0
/a#/Ix
Memory allocation - code size : 9
------------------------------------------------------------------
0 5 Bra
3 a
5 5 Ket
8 End
------------------------------------------------------------------
Capture group count = 0
Options: extended
First code unit = 'a'
Subject length lower bound = 1
/x?+/
Memory allocation - code size : 9
------------------------------------------------------------------
0 5 Bra
3 x?+
5 5 Ket
8 End
------------------------------------------------------------------
/x++/
Memory allocation - code size : 9
------------------------------------------------------------------
0 5 Bra
3 x++
5 5 Ket
8 End
------------------------------------------------------------------
/x{1,3}+/
Memory allocation - code size : 13
------------------------------------------------------------------
0 9 Bra
3 x
5 x{0,2}+
9 9 Ket
12 End
------------------------------------------------------------------
/(x)*+/
Memory allocation - code size : 18
------------------------------------------------------------------
0 14 Bra
3 Braposzero
4 7 CBraPos 1
9 x
11 7 KetRpos
14 14 Ket
17 End
------------------------------------------------------------------
/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/
Memory allocation - code size : 120
------------------------------------------------------------------
0 116 Bra
3 ^
4 109 CBra 1
9 7 CBra 2
14 a+
16 7 Ket
19 39 CBra 3
24 [ab]+?
58 39 Ket
61 39 CBra 4
66 [bc]+
100 39 Ket
103 7 CBra 5
108 \w*+
110 7 Ket
113 109 Ket
116 116 Ket
119 End
------------------------------------------------------------------
"8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 826
------------------------------------------------------------------
0 822 Bra
3 8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
821 \b
822 822 Ket
825 End
------------------------------------------------------------------
"\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 816
------------------------------------------------------------------
0 812 Bra
3 $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
811 \b
812 812 Ket
815 End
------------------------------------------------------------------
/(a(?1)b)/
Memory allocation - code size : 22
------------------------------------------------------------------
0 18 Bra
3 12 CBra 1
8 a
10 3 Recurse
13 b
15 12 Ket
18 18 Ket
21 End
------------------------------------------------------------------
/(a(?1)+b)/
Memory allocation - code size : 28
------------------------------------------------------------------
0 24 Bra
3 18 CBra 1
8 a
10 6 SBra
13 3 Recurse
16 6 KetRmax
19 b
21 18 Ket
24 24 Ket
27 End
------------------------------------------------------------------
/a(?P<name1>b|c)d(?P<longername2>e)/
Memory allocation - code size : 36
Memory allocation - data size : 28
------------------------------------------------------------------
0 32 Bra
3 a
5 7 CBra 1
10 b
12 5 Alt
15 c
17 12 Ket
20 d
22 7 CBra 2
27 e
29 7 Ket
32 32 Ket
35 End
------------------------------------------------------------------
/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/
Memory allocation - code size : 45
Memory allocation - data size : 12
------------------------------------------------------------------
0 41 Bra
3 25 Bra
6 a
8 17 CBra 1
13 c
15 7 CBra 2
20 d
22 7 Ket
25 17 Ket
28 25 Ket
31 7 CBra 3
36 a
38 7 Ket
41 41 Ket
44 End
------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/
Memory allocation - code size : 34
Memory allocation - data size : 4
------------------------------------------------------------------
0 30 Bra
3 7 CBra 1
8 a
10 7 Ket
13 Any
14 Any
15 Any
16 \1
19 bbb
25 3 Recurse
28 d
30 30 Ket
33 End
------------------------------------------------------------------
/abc(?C255)de(?C)f/
Memory allocation - code size : 31
------------------------------------------------------------------
0 27 Bra
3 abc
9 Callout 255 10 1
15 de
19 Callout 0 16 1
25 f
27 27 Ket
30 End
------------------------------------------------------------------
/abcde/auto_callout
Memory allocation - code size : 53
------------------------------------------------------------------
0 49 Bra
3 Callout 255 0 1
9 a
11 Callout 255 1 1
17 b
19 Callout 255 2 1
25 c
27 Callout 255 3 1
33 d
35 Callout 255 4 1
41 e
43 Callout 255 5 0
49 49 Ket
52 End
------------------------------------------------------------------
/\x{100}/utf
Memory allocation - code size : 10
------------------------------------------------------------------
0 6 Bra
3 \x{100}
6 6 Ket
9 End
------------------------------------------------------------------
/\x{1000}/utf
Memory allocation - code size : 11
------------------------------------------------------------------
0 7 Bra
3 \x{1000}
7 7 Ket
10 End
------------------------------------------------------------------
/\x{10000}/utf
Memory allocation - code size : 12
------------------------------------------------------------------
0 8 Bra
3 \x{10000}
8 8 Ket
11 End
------------------------------------------------------------------
/\x{100000}/utf
Memory allocation - code size : 12
------------------------------------------------------------------
0 8 Bra
3 \x{100000}
8 8 Ket
11 End
------------------------------------------------------------------
/\x{10ffff}/utf
Memory allocation - code size : 12
------------------------------------------------------------------
0 8 Bra
3 \x{10ffff}
8 8 Ket
11 End
------------------------------------------------------------------
/\x{110000}/utf
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
/[\x{ff}]/utf
Memory allocation - code size : 10
------------------------------------------------------------------
0 6 Bra
3 \x{ff}
6 6 Ket
9 End
------------------------------------------------------------------
/[\x{100}]/utf
Memory allocation - code size : 10
------------------------------------------------------------------
0 6 Bra
3 \x{100}
6 6 Ket
9 End
------------------------------------------------------------------
/\x80/utf
Memory allocation - code size : 10
------------------------------------------------------------------
0 6 Bra
3 \x{80}
6 6 Ket
9 End
------------------------------------------------------------------
/\xff/utf
Memory allocation - code size : 10
------------------------------------------------------------------
0 6 Bra
3 \x{ff}
6 6 Ket
9 End
------------------------------------------------------------------
/\x{0041}\x{2262}\x{0391}\x{002e}/I,utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 14 Bra
3 A\x{2262}\x{391}.
14 14 Ket
17 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = '.'
Subject length lower bound = 4
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
Memory allocation - code size : 19
------------------------------------------------------------------
0 15 Bra
3 \x{d55c}\x{ad6d}\x{c5b4}
15 15 Ket
18 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xed
Last code unit = \xb4
Subject length lower bound = 3
/\x{65e5}\x{672c}\x{8a9e}/I,utf
Memory allocation - code size : 19
------------------------------------------------------------------
0 15 Bra
3 \x{65e5}\x{672c}\x{8a9e}
15 15 Ket
18 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xe6
Last code unit = \x9e
Subject length lower bound = 3
/[\x{100}]/utf
Memory allocation - code size : 10
------------------------------------------------------------------
0 6 Bra
3 \x{100}
6 6 Ket
9 End
------------------------------------------------------------------
/[Z\x{100}]/utf
Memory allocation - code size : 47
------------------------------------------------------------------
0 43 Bra
3 [Z\x{100}]
43 43 Ket
46 End
------------------------------------------------------------------
/^[\x{100}\E-\Q\E\x{150}]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 14 Bra
3 ^
4 [\x{100}-\x{150}]
14 14 Ket
17 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 14 Bra
3 ^
4 [\x{100}-\x{150}]
14 14 Ket
17 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E/utf
Failed: error 106 at offset 15: missing terminating ] for character class
/[\p{L}]/
Memory allocation - code size : 15
------------------------------------------------------------------
0 11 Bra
3 [\p{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\p{^L}]/
Memory allocation - code size : 15
------------------------------------------------------------------
0 11 Bra
3 [\P{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\P{L}]/
Memory allocation - code size : 15
------------------------------------------------------------------
0 11 Bra
3 [\P{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\P{^L}]/
Memory allocation - code size : 15
------------------------------------------------------------------
0 11 Bra
3 [\p{L}]
11 11 Ket
14 End
------------------------------------------------------------------
/[abc\p{L}\x{0660}]/utf
Memory allocation - code size : 50
------------------------------------------------------------------
0 46 Bra
3 [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff\p{L}\x{660}]
46 46 Ket
49 End
------------------------------------------------------------------
/[\p{Nd}]/utf
Memory allocation - code size : 15
------------------------------------------------------------------
0 11 Bra
3 [\p{Nd}]
11 11 Ket
14 End
------------------------------------------------------------------
/[\p{Nd}+-]+/utf
Memory allocation - code size : 48
------------------------------------------------------------------
0 44 Bra
3 [+\-0-9\p{Nd}]++
44 44 Ket
47 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
Memory allocation - code size : 25
------------------------------------------------------------------
0 21 Bra
3 /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
21 21 Ket
24 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
Memory allocation - code size : 25
------------------------------------------------------------------
0 21 Bra
3 A\x{391}\x{10427}\x{ff3a}\x{1fb0}
21 21 Ket
24 End
------------------------------------------------------------------
/[\x{105}-\x{109}]/i,utf
Memory allocation - code size : 17
------------------------------------------------------------------
0 13 Bra
3 [\x{104}-\x{109}]
13 13 Ket
16 End
------------------------------------------------------------------
/( ( (?(1)0|) )* )/x
Memory allocation - code size : 38
------------------------------------------------------------------
0 34 Bra
3 28 CBra 1
8 Brazero
9 19 SCBra 2
14 8 Cond
17 1 Capture ref
20 0
22 3 Alt
25 11 Ket
28 19 KetRmax
31 28 Ket
34 34 Ket
37 End
------------------------------------------------------------------
/( (?(1)0|)* )/x
Memory allocation - code size : 30
------------------------------------------------------------------
0 26 Bra
3 20 CBra 1
8 Brazero
9 8 SCond
12 1 Capture ref
15 0
17 3 Alt
20 11 KetRmax
23 20 Ket
26 26 Ket
29 End
------------------------------------------------------------------
/[a]/
Memory allocation - code size : 9
------------------------------------------------------------------
0 5 Bra
3 a
5 5 Ket
8 End
------------------------------------------------------------------
/[a]/utf
Memory allocation - code size : 9
------------------------------------------------------------------
0 5 Bra
3 a
5 5 Ket
8 End
------------------------------------------------------------------
/[\xaa]/
Memory allocation - code size : 9
------------------------------------------------------------------
0 5 Bra
3 \x{aa}
5 5 Ket
8 End
------------------------------------------------------------------
/[\xaa]/utf
Memory allocation - code size : 10
------------------------------------------------------------------
0 6 Bra
3 \x{aa}
6 6 Ket
9 End
------------------------------------------------------------------
/[^a]/
Memory allocation - code size : 9
------------------------------------------------------------------
0 5 Bra
3 [^a] (not)
5 5 Ket
8 End
------------------------------------------------------------------
/[^a]/utf
Memory allocation - code size : 9
------------------------------------------------------------------
0 5 Bra
3 [^a] (not)
5 5 Ket
8 End
------------------------------------------------------------------
/[^\xaa]/
Memory allocation - code size : 9
------------------------------------------------------------------
0 5 Bra
3 [^\x{aa}] (not)
5 5 Ket
8 End
------------------------------------------------------------------
/[^\xaa]/utf
Memory allocation - code size : 10
------------------------------------------------------------------
0 6 Bra
3 [^\x{aa}] (not)
6 6 Ket
9 End
------------------------------------------------------------------
#pattern -memory
/[^\d]/utf,ucp
------------------------------------------------------------------
0 11 Bra
3 [^\p{Nd}]
11 11 Ket
14 End
------------------------------------------------------------------
/[[:^alpha:][:^cntrl:]]+/utf,ucp
------------------------------------------------------------------
0 15 Bra
3 [\P{L}\P{Cc}]++
15 15 Ket
18 End
------------------------------------------------------------------
/[[:^cntrl:][:^alpha:]]+/utf,ucp
------------------------------------------------------------------
0 15 Bra
3 [\P{Cc}\P{L}]++
15 15 Ket
18 End
------------------------------------------------------------------
/[[:alpha:]]+/utf,ucp
------------------------------------------------------------------
0 12 Bra
3 [\p{L}]++
12 12 Ket
15 End
------------------------------------------------------------------
/[[:^alpha:]\S]+/utf,ucp
------------------------------------------------------------------
0 15 Bra
3 [\P{L}\P{Xsp}]++
15 15 Ket
18 End
------------------------------------------------------------------
/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/
------------------------------------------------------------------
0 73 Bra
3 abc
9 7 CBra 1
14 d
16 5 Alt
19 e
21 12 Ket
24 *THEN
25 x
27 14 CBra 2
32 123
38 *THEN
39 4
41 29 Alt
44 567
50 7 CBra 3
55 b
57 5 Alt
60 q
62 12 Ket
65 *THEN
66 xx
70 43 Ket
73 73 Ket
76 End
------------------------------------------------------------------
/(((a\2)|(a*)\g<-1>))*a?/
------------------------------------------------------------------
0 51 Bra
3 Brazero
4 42 SCBra 1
9 18 CBra 2
14 10 CBra 3
19 a
21 \2
24 10 Ket
27 16 Alt
30 7 CBra 4
35 a*
37 7 Ket
40 30 Recurse
43 34 Ket
46 42 KetRmax
49 a?+
51 51 Ket
54 End
------------------------------------------------------------------
/((?+1)(\1))/
------------------------------------------------------------------
0 25 Bra
3 19 CBra 1
8 11 Recurse
11 8 CBra 2
16 \1
19 8 Ket
22 19 Ket
25 25 Ket
28 End
------------------------------------------------------------------
"(?1)(?#?'){2}(a)"
------------------------------------------------------------------
0 19 Bra
3 9 Recurse
6 9 Recurse
9 7 CBra 1
14 a
16 7 Ket
19 19 Ket
22 End
------------------------------------------------------------------
/.((?2)(?R)|\1|$)()/
------------------------------------------------------------------
0 36 Bra
3 Any
4 11 CBra 1
9 28 Recurse
12 0 Recurse
15 6 Alt
18 \1
21 4 Alt
24 $
25 21 Ket
28 5 CBra 2
33 5 Ket
36 36 Ket
39 End
------------------------------------------------------------------
/.((?3)(?R)()(?2)|\1|$)()/
------------------------------------------------------------------
0 47 Bra
3 Any
4 22 CBra 1
9 39 Recurse
12 0 Recurse
15 5 CBra 2
20 5 Ket
23 15 Recurse
26 6 Alt
29 \1
32 4 Alt
35 $
36 32 Ket
39 5 CBra 3
44 5 Ket
47 47 Ket
50 End
------------------------------------------------------------------
/(?1)()((((((\1++))\x85)+)|))/
------------------------------------------------------------------
0 77 Bra
3 6 Recurse
6 5 CBra 1
11 5 Ket
14 60 CBra 2
19 49 CBra 3
24 41 CBra 4
29 33 CBra 5
34 23 CBra 6
39 15 CBra 7
44 7 Once
47 \1+
51 7 Ket
54 15 Ket
57 23 Ket
60 \x{85}
62 33 KetRmax
65 41 Ket
68 3 Alt
71 52 Ket
74 60 Ket
77 77 Ket
80 End
------------------------------------------------------------------
# Check the absolute limit on nesting (?| etc. This varies with code unit
# width because the workspace is a different number of bytes. It will fail
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|

/parens_nest_limit=1000,-fullbincode
Failed: error 184 at offset 1129: (?| and/or (?J: or (?x: parentheses are too deeply nested
# Use "expand" to create some very long patterns with nested parentheses, in
# order to test workspace overflow. Again, this varies with code unit width,
# and even when it fails in two modes, the error offset differs. It also varies
# with link size - hence multiple tests with different values.
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
Failed: error 186 at offset 12820: regular expression is too complicated
/(?(1)(?1)){8,}+()/debug
------------------------------------------------------------------
0 119 Bra
3 105 Once
6 9 Cond
9 1 Capture ref
12 111 Recurse
15 9 Ket
18 9 Cond
21 1 Capture ref
24 111 Recurse
27 9 Ket
30 9 Cond
33 1 Capture ref
36 111 Recurse
39 9 Ket
42 9 Cond
45 1 Capture ref
48 111 Recurse
51 9 Ket
54 9 Cond
57 1 Capture ref
60 111 Recurse
63 9 Ket
66 9 Cond
69 1 Capture ref
72 111 Recurse
75 9 Ket
78 9 Cond
81 1 Capture ref
84 111 Recurse
87 9 Ket
90 15 SBraPos
93 9 SCond
96 1 Capture ref
99 111 Recurse
102 9 Ket
105 15 KetRpos
108 105 Ket
111 5 CBra 1
116 5 Ket
119 119 Ket
122 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcd
0:
1:
/(?(1)|a(?1)b){2,}+()/debug
------------------------------------------------------------------
0 61 Bra
3 47 Once
6 6 Cond
9 1 Capture ref
12 10 Alt
15 a
17 53 Recurse
20 b
22 16 Ket
25 22 SBraPos
28 6 SCond
31 1 Capture ref
34 10 Alt
37 a
39 53 Recurse
42 b
44 16 Ket
47 22 KetRpos
50 47 Ket
53 5 CBra 1
58 5 Ket
61 61 Ket
64 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcde
No match
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
------------------------------------------------------------------
0 205 Bra
3 62 CBra 1
8 3 Recurse
11 133 Recurse
14 141 Recurse
17 149 Recurse
20 157 Recurse
23 165 Recurse
26 173 Recurse
29 181 Recurse
32 189 Recurse
35 189 Recurse
38 181 Recurse
41 173 Recurse
44 165 Recurse
47 157 Recurse
50 149 Recurse
53 141 Recurse
56 133 Recurse
59 3 Recurse
62 0 Recurse
65 62 Ket
68 62 SCBra 1
73 3 Recurse
76 133 Recurse
79 141 Recurse
82 149 Recurse
85 157 Recurse
88 165 Recurse
91 173 Recurse
94 181 Recurse
97 189 Recurse
100 189 Recurse
103 181 Recurse
106 173 Recurse
109 165 Recurse
112 157 Recurse
115 149 Recurse
118 141 Recurse
121 133 Recurse
124 3 Recurse
127 0 Recurse
130 62 KetRmax
133 5 CBra 2
138 5 Ket
141 5 CBra 3
146 5 Ket
149 5 CBra 4
154 5 Ket
157 5 CBra 5
162 5 Ket
165 5 CBra 6
170 5 Ket
173 5 CBra 7
178 5 Ket
181 5 CBra 8
186 5 Ket
189 5 CBra 9
194 5 Ket
197 5 CBra 10
202 5 Ket
205 205 Ket
208 End
------------------------------------------------------------------
Capture group count = 10
May match empty string
Subject length lower bound = 0
/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
Failed: error 114 at offset 509: missing closing parenthesis
fullbincode
#pattern -fullbincode
/\[()]{65535}/expand
Failed: error 120 at offset 131070: regular expression is too large
# End of testinput8

1021
3rd/pcre2/testdata/testoutput8-8-3 vendored Normal file

@@ -0,0 +1,1021 @@
# There are two sorts of patterns in this test. A number of them are
# representative patterns whose lengths and offsets are checked. This is just a
# doublecheck test to ensure the sizes don't go horribly wrong when something
# is changed. The operation of these patterns is checked in other tests.
#
# This file also contains tests whose output varies with code unit size and/or
# link size. Unicode support is required for these tests. There are separate
# output files for each code unit size and link size.
#pattern fullbincode,memory
/((?i)b)/
Memory allocation - code size : 21
------------------------------------------------------------------
0 16 Bra
4 8 CBra 1
10 /i b
12 8 Ket
16 16 Ket
20 End
------------------------------------------------------------------
/(?s)(.*X|^B)/
Memory allocation - code size : 30
------------------------------------------------------------------
0 25 Bra
4 10 CBra 1
10 AllAny*
12 X
14 7 Alt
18 ^
19 B
21 17 Ket
25 25 Ket
29 End
------------------------------------------------------------------
/(?s:.*X|^B)/
Memory allocation - code size : 28
------------------------------------------------------------------
0 23 Bra
4 8 Bra
8 AllAny*
10 X
12 7 Alt
16 ^
17 B
19 15 Ket
23 23 Ket
27 End
------------------------------------------------------------------
/^[[:alnum:]]/
Memory allocation - code size : 43
------------------------------------------------------------------
0 38 Bra
4 ^
5 [0-9A-Za-z]
38 38 Ket
42 End
------------------------------------------------------------------
/#/Ix
Memory allocation - code size : 9
------------------------------------------------------------------
0 4 Bra
4 4 Ket
8 End
------------------------------------------------------------------
Capture group count = 0
May match empty string
Options: extended
Subject length lower bound = 0
/a#/Ix
Memory allocation - code size : 11
------------------------------------------------------------------
0 6 Bra
4 a
6 6 Ket
10 End
------------------------------------------------------------------
Capture group count = 0
Options: extended
First code unit = 'a'
Subject length lower bound = 1
/x?+/
Memory allocation - code size : 11
------------------------------------------------------------------
0 6 Bra
4 x?+
6 6 Ket
10 End
------------------------------------------------------------------
/x++/
Memory allocation - code size : 11
------------------------------------------------------------------
0 6 Bra
4 x++
6 6 Ket
10 End
------------------------------------------------------------------
/x{1,3}+/
Memory allocation - code size : 15
------------------------------------------------------------------
0 10 Bra
4 x
6 x{0,2}+
10 10 Ket
14 End
------------------------------------------------------------------
/(x)*+/
Memory allocation - code size : 22
------------------------------------------------------------------
0 17 Bra
4 Braposzero
5 8 CBraPos 1
11 x
13 8 KetRpos
17 17 Ket
21 End
------------------------------------------------------------------
/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/
Memory allocation - code size : 132
------------------------------------------------------------------
0 127 Bra
4 ^
5 118 CBra 1
11 8 CBra 2
17 a+
19 8 Ket
23 40 CBra 3
29 [ab]+?
63 40 Ket
67 40 CBra 4
73 [bc]+
107 40 Ket
111 8 CBra 5
117 \w*+
119 8 Ket
123 118 Ket
127 127 Ket
131 End
------------------------------------------------------------------
"8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 828
------------------------------------------------------------------
0 823 Bra
4 8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
822 \b
823 823 Ket
827 End
------------------------------------------------------------------
"\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 818
------------------------------------------------------------------
0 813 Bra
4 $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
812 \b
813 813 Ket
817 End
------------------------------------------------------------------
/(a(?1)b)/
Memory allocation - code size : 27
------------------------------------------------------------------
0 22 Bra
4 14 CBra 1
10 a
12 4 Recurse
16 b
18 14 Ket
22 22 Ket
26 End
------------------------------------------------------------------
/(a(?1)+b)/
Memory allocation - code size : 35
------------------------------------------------------------------
0 30 Bra
4 22 CBra 1
10 a
12 8 SBra
16 4 Recurse
20 8 KetRmax
24 b
26 22 Ket
30 30 Ket
34 End
------------------------------------------------------------------
/a(?P<name1>b|c)d(?P<longername2>e)/
Memory allocation - code size : 43
Memory allocation - data size : 28
------------------------------------------------------------------
0 38 Bra
4 a
6 8 CBra 1
12 b
14 6 Alt
18 c
20 14 Ket
24 d
26 8 CBra 2
32 e
34 8 Ket
38 38 Ket
42 End
------------------------------------------------------------------
/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/
Memory allocation - code size : 55
Memory allocation - data size : 12
------------------------------------------------------------------
0 50 Bra
4 30 Bra
8 a
10 20 CBra 1
16 c
18 8 CBra 2
24 d
26 8 Ket
30 20 Ket
34 30 Ket
38 8 CBra 3
44 a
46 8 Ket
50 50 Ket
54 End
------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/
Memory allocation - code size : 39
Memory allocation - data size : 4
------------------------------------------------------------------
0 34 Bra
4 8 CBra 1
10 a
12 8 Ket
16 Any
17 Any
18 Any
19 \1
22 bbb
28 4 Recurse
32 d
34 34 Ket
38 End
------------------------------------------------------------------
/abc(?C255)de(?C)f/
Memory allocation - code size : 37
------------------------------------------------------------------
0 32 Bra
4 abc
10 Callout 255 10 1
18 de
22 Callout 0 16 1
30 f
32 32 Ket
36 End
------------------------------------------------------------------
/abcde/auto_callout
Memory allocation - code size : 67
------------------------------------------------------------------
0 62 Bra
4 Callout 255 0 1
12 a
14 Callout 255 1 1
22 b
24 Callout 255 2 1
32 c
34 Callout 255 3 1
42 d
44 Callout 255 4 1
52 e
54 Callout 255 5 0
62 62 Ket
66 End
------------------------------------------------------------------
/\x{100}/utf
Memory allocation - code size : 12
------------------------------------------------------------------
0 7 Bra
4 \x{100}
7 7 Ket
11 End
------------------------------------------------------------------
/\x{1000}/utf
Memory allocation - code size : 13
------------------------------------------------------------------
0 8 Bra
4 \x{1000}
8 8 Ket
12 End
------------------------------------------------------------------
/\x{10000}/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 9 Bra
4 \x{10000}
9 9 Ket
13 End
------------------------------------------------------------------
/\x{100000}/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 9 Bra
4 \x{100000}
9 9 Ket
13 End
------------------------------------------------------------------
/\x{10ffff}/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 9 Bra
4 \x{10ffff}
9 9 Ket
13 End
------------------------------------------------------------------
/\x{110000}/utf
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
/[\x{ff}]/utf
Memory allocation - code size : 12
------------------------------------------------------------------
0 7 Bra
4 \x{ff}
7 7 Ket
11 End
------------------------------------------------------------------
/[\x{100}]/utf
Memory allocation - code size : 12
------------------------------------------------------------------
0 7 Bra
4 \x{100}
7 7 Ket
11 End
------------------------------------------------------------------
/\x80/utf
Memory allocation - code size : 12
------------------------------------------------------------------
0 7 Bra
4 \x{80}
7 7 Ket
11 End
------------------------------------------------------------------
/\xff/utf
Memory allocation - code size : 12
------------------------------------------------------------------
0 7 Bra
4 \x{ff}
7 7 Ket
11 End
------------------------------------------------------------------
/\x{0041}\x{2262}\x{0391}\x{002e}/I,utf
Memory allocation - code size : 20
------------------------------------------------------------------
0 15 Bra
4 A\x{2262}\x{391}.
15 15 Ket
19 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = '.'
Subject length lower bound = 4
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
Memory allocation - code size : 21
------------------------------------------------------------------
0 16 Bra
4 \x{d55c}\x{ad6d}\x{c5b4}
16 16 Ket
20 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xed
Last code unit = \xb4
Subject length lower bound = 3
/\x{65e5}\x{672c}\x{8a9e}/I,utf
Memory allocation - code size : 21
------------------------------------------------------------------
0 16 Bra
4 \x{65e5}\x{672c}\x{8a9e}
16 16 Ket
20 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xe6
Last code unit = \x9e
Subject length lower bound = 3
/[\x{100}]/utf
Memory allocation - code size : 12
------------------------------------------------------------------
0 7 Bra
4 \x{100}
7 7 Ket
11 End
------------------------------------------------------------------
/[Z\x{100}]/utf
Memory allocation - code size : 50
------------------------------------------------------------------
0 45 Bra
4 [Z\x{100}]
45 45 Ket
49 End
------------------------------------------------------------------
/^[\x{100}\E-\Q\E\x{150}]/utf
Memory allocation - code size : 21
------------------------------------------------------------------
0 16 Bra
4 ^
5 [\x{100}-\x{150}]
16 16 Ket
20 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E]/utf
Memory allocation - code size : 21
------------------------------------------------------------------
0 16 Bra
4 ^
5 [\x{100}-\x{150}]
16 16 Ket
20 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E/utf
Failed: error 106 at offset 15: missing terminating ] for character class
/[\p{L}]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 13 Bra
4 [\p{L}]
13 13 Ket
17 End
------------------------------------------------------------------
/[\p{^L}]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 13 Bra
4 [\P{L}]
13 13 Ket
17 End
------------------------------------------------------------------
/[\P{L}]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 13 Bra
4 [\P{L}]
13 13 Ket
17 End
------------------------------------------------------------------
/[\P{^L}]/
Memory allocation - code size : 18
------------------------------------------------------------------
0 13 Bra
4 [\p{L}]
13 13 Ket
17 End
------------------------------------------------------------------
/[abc\p{L}\x{0660}]/utf
Memory allocation - code size : 53
------------------------------------------------------------------
0 48 Bra
4 [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff\p{L}\x{660}]
48 48 Ket
52 End
------------------------------------------------------------------
/[\p{Nd}]/utf
Memory allocation - code size : 18
------------------------------------------------------------------
0 13 Bra
4 [\p{Nd}]
13 13 Ket
17 End
------------------------------------------------------------------
/[\p{Nd}+-]+/utf
Memory allocation - code size : 51
------------------------------------------------------------------
0 46 Bra
4 [+\-0-9\p{Nd}]++
46 46 Ket
50 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
Memory allocation - code size : 27
------------------------------------------------------------------
0 22 Bra
4 /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
22 22 Ket
26 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
Memory allocation - code size : 27
------------------------------------------------------------------
0 22 Bra
4 A\x{391}\x{10427}\x{ff3a}\x{1fb0}
22 22 Ket
26 End
------------------------------------------------------------------
/[\x{105}-\x{109}]/i,utf
Memory allocation - code size : 20
------------------------------------------------------------------
0 15 Bra
4 [\x{104}-\x{109}]
15 15 Ket
19 End
------------------------------------------------------------------
/( ( (?(1)0|) )* )/x
Memory allocation - code size : 47
------------------------------------------------------------------
0 42 Bra
4 34 CBra 1
10 Brazero
11 23 SCBra 2
17 9 Cond
21 1 Capture ref
24 0
26 4 Alt
30 13 Ket
34 23 KetRmax
38 34 Ket
42 42 Ket
46 End
------------------------------------------------------------------
/( (?(1)0|)* )/x
Memory allocation - code size : 37
------------------------------------------------------------------
0 32 Bra
4 24 CBra 1
10 Brazero
11 9 SCond
15 1 Capture ref
18 0
20 4 Alt
24 13 KetRmax
28 24 Ket
32 32 Ket
36 End
------------------------------------------------------------------
/[a]/
Memory allocation - code size : 11
------------------------------------------------------------------
0 6 Bra
4 a
6 6 Ket
10 End
------------------------------------------------------------------
/[a]/utf
Memory allocation - code size : 11
------------------------------------------------------------------
0 6 Bra
4 a
6 6 Ket
10 End
------------------------------------------------------------------
/[\xaa]/
Memory allocation - code size : 11
------------------------------------------------------------------
0 6 Bra
4 \x{aa}
6 6 Ket
10 End
------------------------------------------------------------------
/[\xaa]/utf
Memory allocation - code size : 12
------------------------------------------------------------------
0 7 Bra
4 \x{aa}
7 7 Ket
11 End
------------------------------------------------------------------
/[^a]/
Memory allocation - code size : 11
------------------------------------------------------------------
0 6 Bra
4 [^a] (not)
6 6 Ket
10 End
------------------------------------------------------------------
/[^a]/utf
Memory allocation - code size : 11
------------------------------------------------------------------
0 6 Bra
4 [^a] (not)
6 6 Ket
10 End
------------------------------------------------------------------
/[^\xaa]/
Memory allocation - code size : 11
------------------------------------------------------------------
0 6 Bra
4 [^\x{aa}] (not)
6 6 Ket
10 End
------------------------------------------------------------------
/[^\xaa]/utf
Memory allocation - code size : 12
------------------------------------------------------------------
0 7 Bra
4 [^\x{aa}] (not)
7 7 Ket
11 End
------------------------------------------------------------------
#pattern -memory
/[^\d]/utf,ucp
------------------------------------------------------------------
0 13 Bra
4 [^\p{Nd}]
13 13 Ket
17 End
------------------------------------------------------------------
/[[:^alpha:][:^cntrl:]]+/utf,ucp
------------------------------------------------------------------
0 17 Bra
4 [\P{L}\P{Cc}]++
17 17 Ket
21 End
------------------------------------------------------------------
/[[:^cntrl:][:^alpha:]]+/utf,ucp
------------------------------------------------------------------
0 17 Bra
4 [\P{Cc}\P{L}]++
17 17 Ket
21 End
------------------------------------------------------------------
/[[:alpha:]]+/utf,ucp
------------------------------------------------------------------
0 14 Bra
4 [\p{L}]++
14 14 Ket
18 End
------------------------------------------------------------------
/[[:^alpha:]\S]+/utf,ucp
------------------------------------------------------------------
0 17 Bra
4 [\P{L}\P{Xsp}]++
17 17 Ket
21 End
------------------------------------------------------------------
/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/
------------------------------------------------------------------
0 83 Bra
4 abc
10 8 CBra 1
16 d
18 6 Alt
22 e
24 14 Ket
28 *THEN
29 x
31 15 CBra 2
37 123
43 *THEN
44 4
46 33 Alt
50 567
56 8 CBra 3
62 b
64 6 Alt
68 q
70 14 Ket
74 *THEN
75 xx
79 48 Ket
83 83 Ket
87 End
------------------------------------------------------------------
/(((a\2)|(a*)\g<-1>))*a?/
------------------------------------------------------------------
0 62 Bra
4 Brazero
5 51 SCBra 1
11 21 CBra 2
17 11 CBra 3
23 a
25 \2
28 11 Ket
32 20 Alt
36 8 CBra 4
42 a*
44 8 Ket
48 36 Recurse
52 41 Ket
56 51 KetRmax
60 a?+
62 62 Ket
66 End
------------------------------------------------------------------
/((?+1)(\1))/
------------------------------------------------------------------
0 31 Bra
4 23 CBra 1
10 14 Recurse
14 9 CBra 2
20 \1
23 9 Ket
27 23 Ket
31 31 Ket
35 End
------------------------------------------------------------------
"(?1)(?#?'){2}(a)"
------------------------------------------------------------------
0 24 Bra
4 12 Recurse
8 12 Recurse
12 8 CBra 1
18 a
20 8 Ket
24 24 Ket
28 End
------------------------------------------------------------------
/.((?2)(?R)|\1|$)()/
------------------------------------------------------------------
0 45 Bra
4 Any
5 14 CBra 1
11 35 Recurse
15 0 Recurse
19 7 Alt
23 \1
26 5 Alt
30 $
31 26 Ket
35 6 CBra 2
41 6 Ket
45 45 Ket
49 End
------------------------------------------------------------------
/.((?3)(?R)()(?2)|\1|$)()/
------------------------------------------------------------------
0 59 Bra
4 Any
5 28 CBra 1
11 49 Recurse
15 0 Recurse
19 6 CBra 2
25 6 Ket
29 19 Recurse
33 7 Alt
37 \1
40 5 Alt
44 $
45 40 Ket
49 6 CBra 3
55 6 Ket
59 59 Ket
63 End
------------------------------------------------------------------
/(?1)()((((((\1++))\x85)+)|))/
------------------------------------------------------------------
0 96 Bra
4 8 Recurse
8 6 CBra 1
14 6 Ket
18 74 CBra 2
24 60 CBra 3
30 50 CBra 4
36 40 CBra 5
42 28 CBra 6
48 18 CBra 7
54 8 Once
58 \1+
62 8 Ket
66 18 Ket
70 28 Ket
74 \x{85}
76 40 KetRmax
80 50 Ket
84 4 Alt
88 64 Ket
92 74 Ket
96 96 Ket
100 End
------------------------------------------------------------------
# Check the absolute limit on nesting (?| etc. This varies with code unit
# width because the workspace is a different number of bytes. It will fail
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|

/parens_nest_limit=1000,-fullbincode
# Use "expand" to create some very long patterns with nested parentheses, in
# order to test workspace overflow. Again, this varies with code unit width,
# and even when it fails in two modes, the error offset differs. It also varies
# with link size - hence multiple tests with different values.
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
Failed: error 186 at offset 12820: regular expression is too complicated
/(?(1)(?1)){8,}+()/debug
------------------------------------------------------------------
0 150 Bra
4 132 Once
8 11 Cond
12 1 Capture ref
15 140 Recurse
19 11 Ket
23 11 Cond
27 1 Capture ref
30 140 Recurse
34 11 Ket
38 11 Cond
42 1 Capture ref
45 140 Recurse
49 11 Ket
53 11 Cond
57 1 Capture ref
60 140 Recurse
64 11 Ket
68 11 Cond
72 1 Capture ref
75 140 Recurse
79 11 Ket
83 11 Cond
87 1 Capture ref
90 140 Recurse
94 11 Ket
98 11 Cond
102 1 Capture ref
105 140 Recurse
109 11 Ket
113 19 SBraPos
117 11 SCond
121 1 Capture ref
124 140 Recurse
128 11 Ket
132 19 KetRpos
136 132 Ket
140 6 CBra 1
146 6 Ket
150 150 Ket
154 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcd
0:
1:
/(?(1)|a(?1)b){2,}+()/debug
------------------------------------------------------------------
0 76 Bra
4 58 Once
8 7 Cond
12 1 Capture ref
15 12 Alt
19 a
21 66 Recurse
25 b
27 19 Ket
31 27 SBraPos
35 7 SCond
39 1 Capture ref
42 12 Alt
46 a
48 66 Recurse
52 b
54 19 Ket
58 27 KetRpos
62 58 Ket
66 6 CBra 1
72 6 Ket
76 76 Ket
80 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcde
No match
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
------------------------------------------------------------------
0 266 Bra
4 82 CBra 1
10 4 Recurse
14 176 Recurse
18 186 Recurse
22 196 Recurse
26 206 Recurse
30 216 Recurse
34 226 Recurse
38 236 Recurse
42 246 Recurse
46 246 Recurse
50 236 Recurse
54 226 Recurse
58 216 Recurse
62 206 Recurse
66 196 Recurse
70 186 Recurse
74 176 Recurse
78 4 Recurse
82 0 Recurse
86 82 Ket
90 82 SCBra 1
96 4 Recurse
100 176 Recurse
104 186 Recurse
108 196 Recurse
112 206 Recurse
116 216 Recurse
120 226 Recurse
124 236 Recurse
128 246 Recurse
132 246 Recurse
136 236 Recurse
140 226 Recurse
144 216 Recurse
148 206 Recurse
152 196 Recurse
156 186 Recurse
160 176 Recurse
164 4 Recurse
168 0 Recurse
172 82 KetRmax
176 6 CBra 2
182 6 Ket
186 6 CBra 3
192 6 Ket
196 6 CBra 4
202 6 Ket
206 6 CBra 5
212 6 Ket
216 6 CBra 6
222 6 Ket
226 6 CBra 7
232 6 Ket
236 6 CBra 8
242 6 Ket
246 6 CBra 9
252 6 Ket
256 6 CBra 10
262 6 Ket
266 266 Ket
270 End
------------------------------------------------------------------
Capture group count = 10
May match empty string
Subject length lower bound = 0

Failed: error 114 at offset 509: missing closing parenthesis
fullbincode
#pattern -fullbincode
/\[()]{65535}/expand
# End of testinput8

1021
3rd/pcre2/testdata/testoutput8-8-4 vendored Normal file

@@ -0,0 +1,1021 @@
# There are two sorts of patterns in this test. A number of them are
# representative patterns whose lengths and offsets are checked. This is just a
# doublecheck test to ensure the sizes don't go horribly wrong when something
# is changed. The operation of these patterns is checked in other tests.
#
# This file also contains tests whose output varies with code unit size and/or
# link size. Unicode support is required for these tests. There are separate
# output files for each code unit size and link size.
#pattern fullbincode,memory
/((?i)b)/
Memory allocation - code size : 25
------------------------------------------------------------------
0 19 Bra
5 9 CBra 1
12 /i b
14 9 Ket
19 19 Ket
24 End
------------------------------------------------------------------
/(?s)(.*X|^B)/
Memory allocation - code size : 35
------------------------------------------------------------------
0 29 Bra
5 11 CBra 1
12 AllAny*
14 X
16 8 Alt
21 ^
22 B
24 19 Ket
29 29 Ket
34 End
------------------------------------------------------------------
/(?s:.*X|^B)/
Memory allocation - code size : 33
------------------------------------------------------------------
0 27 Bra
5 9 Bra
10 AllAny*
12 X
14 8 Alt
19 ^
20 B
22 17 Ket
27 27 Ket
32 End
------------------------------------------------------------------
/^[[:alnum:]]/
Memory allocation - code size : 45
------------------------------------------------------------------
0 39 Bra
5 ^
6 [0-9A-Za-z]
39 39 Ket
44 End
------------------------------------------------------------------
/#/Ix
Memory allocation - code size : 11
------------------------------------------------------------------
0 5 Bra
5 5 Ket
10 End
------------------------------------------------------------------
Capture group count = 0
May match empty string
Options: extended
Subject length lower bound = 0
/a#/Ix
Memory allocation - code size : 13
------------------------------------------------------------------
0 7 Bra
5 a
7 7 Ket
12 End
------------------------------------------------------------------
Capture group count = 0
Options: extended
First code unit = 'a'
Subject length lower bound = 1
/x?+/
Memory allocation - code size : 13
------------------------------------------------------------------
0 7 Bra
5 x?+
7 7 Ket
12 End
------------------------------------------------------------------
/x++/
Memory allocation - code size : 13
------------------------------------------------------------------
0 7 Bra
5 x++
7 7 Ket
12 End
------------------------------------------------------------------
/x{1,3}+/
Memory allocation - code size : 17
------------------------------------------------------------------
0 11 Bra
5 x
7 x{0,2}+
11 11 Ket
16 End
------------------------------------------------------------------
/(x)*+/
Memory allocation - code size : 26
------------------------------------------------------------------
0 20 Bra
5 Braposzero
6 9 CBraPos 1
13 x
15 9 KetRpos
20 20 Ket
25 End
------------------------------------------------------------------
/^((a+)(?U)([ab]+)(?-U)([bc]+)(\w*))/
Memory allocation - code size : 144
------------------------------------------------------------------
0 138 Bra
5 ^
6 127 CBra 1
13 9 CBra 2
20 a+
22 9 Ket
27 41 CBra 3
34 [ab]+?
68 41 Ket
73 41 CBra 4
80 [bc]+
114 41 Ket
119 9 CBra 5
126 \w*+
128 9 Ket
133 127 Ket
138 138 Ket
143 End
------------------------------------------------------------------
"8J\$WE\<\.rX\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 830
------------------------------------------------------------------
0 824 Bra
5 8J$WE<.rX+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
823 \b
824 824 Ket
829 End
------------------------------------------------------------------
"\$\<\.X\+ix\[d1b\!H\#\?vV0vrK\:ZH1\=2M\>iV\;\?aPhFB\<\*vW\@QW\@sO9\}cfZA\-i\'w\%hKd6gt1UJP\,15_\#QY\$M\^Mss_U\/\]\&LK9\[5vQub\^w\[KDD\<EjmhUZ\?\.akp2dF\>qmj\;2\}YWFdYx\.Ap\]hjCPTP\(n28k\+3\;o\&WXqs\/gOXdr\$\:r\'do0\;b4c\(f_Gr\=\"\\4\)\[01T7ajQJvL\$W\~mL_sS\/4h\:x\*\[ZN\=KLs\&L5zX\/\/\>it\,o\:aU\(\;Z\>pW\&T7oP\'2K\^E\:x9\'c\[\%z\-\,64JQ5AeH_G\#KijUKghQw\^\\vea3a\?kka_G\$8\#\`\*kynsxzBLru\'\]k_\[7FrVx\}\^\=\$blx\>s\-N\%j\;D\*aZDnsw\:YKZ\%Q\.Kne9\#hP\?\+b3\(SOvL\,\^\;\&u5\@\?5C5Bhb\=m\-vEh_L15Jl\]U\)0RP6\{q\%L\^_z5E\'Dw6X\b"
Memory allocation - code size : 820
------------------------------------------------------------------
0 814 Bra
5 $<.X+ix[d1b!H#?vV0vrK:ZH1=2M>iV;?aPhFB<*vW@QW@sO9}cfZA-i'w%hKd6gt1UJP,15_#QY$M^Mss_U/]&LK9[5vQub^w[KDD<EjmhUZ?.akp2dF>qmj;2}YWFdYx.Ap]hjCPTP(n28k+3;o&WXqs/gOXdr$:r'do0;b4c(f_Gr="\4)[01T7ajQJvL$W~mL_sS/4h:x*[ZN=KLs&L5zX//>it,o:aU(;Z>pW&T7oP'2K^E:x9'c[%z-,64JQ5AeH_G#KijUKghQw^\vea3a?kka_G$8#`*kynsxzBLru']k_[7FrVx}^=$blx>s-N%j;D*aZDnsw:YKZ%Q.Kne9#hP?+b3(SOvL,^;&u5@?5C5Bhb=m-vEh_L15Jl]U)0RP6{q%L^_z5E'Dw6X
813 \b
814 814 Ket
819 End
------------------------------------------------------------------
/(a(?1)b)/
Memory allocation - code size : 32
------------------------------------------------------------------
0 26 Bra
5 16 CBra 1
12 a
14 5 Recurse
19 b
21 16 Ket
26 26 Ket
31 End
------------------------------------------------------------------
/(a(?1)+b)/
Memory allocation - code size : 42
------------------------------------------------------------------
0 36 Bra
5 26 CBra 1
12 a
14 10 SBra
19 5 Recurse
24 10 KetRmax
29 b
31 26 Ket
36 36 Ket
41 End
------------------------------------------------------------------
/a(?P<name1>b|c)d(?P<longername2>e)/
Memory allocation - code size : 50
Memory allocation - data size : 28
------------------------------------------------------------------
0 44 Bra
5 a
7 9 CBra 1
14 b
16 7 Alt
21 c
23 16 Ket
28 d
30 9 CBra 2
37 e
39 9 Ket
44 44 Ket
49 End
------------------------------------------------------------------
/(?:a(?P<c>c(?P<d>d)))(?P<a>a)/
Memory allocation - code size : 65
Memory allocation - data size : 12
------------------------------------------------------------------
0 59 Bra
5 35 Bra
10 a
12 23 CBra 1
19 c
21 9 CBra 2
28 d
30 9 Ket
35 23 Ket
40 35 Ket
45 9 CBra 3
52 a
54 9 Ket
59 59 Ket
64 End
------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/
Memory allocation - code size : 44
Memory allocation - data size : 4
------------------------------------------------------------------
0 38 Bra
5 9 CBra 1
12 a
14 9 Ket
19 Any
20 Any
21 Any
22 \1
25 bbb
31 5 Recurse
36 d
38 38 Ket
43 End
------------------------------------------------------------------
/abc(?C255)de(?C)f/
Memory allocation - code size : 43
------------------------------------------------------------------
0 37 Bra
5 abc
11 Callout 255 10 1
21 de
25 Callout 0 16 1
35 f
37 37 Ket
42 End
------------------------------------------------------------------
/abcde/auto_callout
Memory allocation - code size : 81
------------------------------------------------------------------
0 75 Bra
5 Callout 255 0 1
15 a
17 Callout 255 1 1
27 b
29 Callout 255 2 1
39 c
41 Callout 255 3 1
51 d
53 Callout 255 4 1
63 e
65 Callout 255 5 0
75 75 Ket
80 End
------------------------------------------------------------------
/\x{100}/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 8 Bra
5 \x{100}
8 8 Ket
13 End
------------------------------------------------------------------
/\x{1000}/utf
Memory allocation - code size : 15
------------------------------------------------------------------
0 9 Bra
5 \x{1000}
9 9 Ket
14 End
------------------------------------------------------------------
/\x{10000}/utf
Memory allocation - code size : 16
------------------------------------------------------------------
0 10 Bra
5 \x{10000}
10 10 Ket
15 End
------------------------------------------------------------------
/\x{100000}/utf
Memory allocation - code size : 16
------------------------------------------------------------------
0 10 Bra
5 \x{100000}
10 10 Ket
15 End
------------------------------------------------------------------
/\x{10ffff}/utf
Memory allocation - code size : 16
------------------------------------------------------------------
0 10 Bra
5 \x{10ffff}
10 10 Ket
15 End
------------------------------------------------------------------
/\x{110000}/utf
Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too large
/[\x{ff}]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 8 Bra
5 \x{ff}
8 8 Ket
13 End
------------------------------------------------------------------
/[\x{100}]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 8 Bra
5 \x{100}
8 8 Ket
13 End
------------------------------------------------------------------
/\x80/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 8 Bra
5 \x{80}
8 8 Ket
13 End
------------------------------------------------------------------
/\xff/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 8 Bra
5 \x{ff}
8 8 Ket
13 End
------------------------------------------------------------------
/\x{0041}\x{2262}\x{0391}\x{002e}/I,utf
Memory allocation - code size : 22
------------------------------------------------------------------
0 16 Bra
5 A\x{2262}\x{391}.
16 16 Ket
21 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = 'A'
Last code unit = '.'
Subject length lower bound = 4
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
Memory allocation - code size : 23
------------------------------------------------------------------
0 17 Bra
5 \x{d55c}\x{ad6d}\x{c5b4}
17 17 Ket
22 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xed
Last code unit = \xb4
Subject length lower bound = 3
/\x{65e5}\x{672c}\x{8a9e}/I,utf
Memory allocation - code size : 23
------------------------------------------------------------------
0 17 Bra
5 \x{65e5}\x{672c}\x{8a9e}
17 17 Ket
22 End
------------------------------------------------------------------
Capture group count = 0
Options: utf
First code unit = \xe6
Last code unit = \x9e
Subject length lower bound = 3
/[\x{100}]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 8 Bra
5 \x{100}
8 8 Ket
13 End
------------------------------------------------------------------
/[Z\x{100}]/utf
Memory allocation - code size : 53
------------------------------------------------------------------
0 47 Bra
5 [Z\x{100}]
47 47 Ket
52 End
------------------------------------------------------------------
/^[\x{100}\E-\Q\E\x{150}]/utf
Memory allocation - code size : 24
------------------------------------------------------------------
0 18 Bra
5 ^
6 [\x{100}-\x{150}]
18 18 Ket
23 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E]/utf
Memory allocation - code size : 24
------------------------------------------------------------------
0 18 Bra
5 ^
6 [\x{100}-\x{150}]
18 18 Ket
23 End
------------------------------------------------------------------
/^[\QĀ\E-\QŐ\E/utf
Failed: error 106 at offset 15: missing terminating ] for character class
/[\p{L}]/
Memory allocation - code size : 21
------------------------------------------------------------------
0 15 Bra
5 [\p{L}]
15 15 Ket
20 End
------------------------------------------------------------------
/[\p{^L}]/
Memory allocation - code size : 21
------------------------------------------------------------------
0 15 Bra
5 [\P{L}]
15 15 Ket
20 End
------------------------------------------------------------------
/[\P{L}]/
Memory allocation - code size : 21
------------------------------------------------------------------
0 15 Bra
5 [\P{L}]
15 15 Ket
20 End
------------------------------------------------------------------
/[\P{^L}]/
Memory allocation - code size : 21
------------------------------------------------------------------
0 15 Bra
5 [\p{L}]
15 15 Ket
20 End
------------------------------------------------------------------
/[abc\p{L}\x{0660}]/utf
Memory allocation - code size : 56
------------------------------------------------------------------
0 50 Bra
5 [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff\p{L}\x{660}]
50 50 Ket
55 End
------------------------------------------------------------------
/[\p{Nd}]/utf
Memory allocation - code size : 21
------------------------------------------------------------------
0 15 Bra
5 [\p{Nd}]
15 15 Ket
20 End
------------------------------------------------------------------
/[\p{Nd}+-]+/utf
Memory allocation - code size : 54
------------------------------------------------------------------
0 48 Bra
5 [+\-0-9\p{Nd}]++
48 48 Ket
53 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/i,utf
Memory allocation - code size : 29
------------------------------------------------------------------
0 23 Bra
5 /i A\x{391}\x{10427}\x{ff3a}\x{1fb0}
23 23 Ket
28 End
------------------------------------------------------------------
/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/utf
Memory allocation - code size : 29
------------------------------------------------------------------
0 23 Bra
5 A\x{391}\x{10427}\x{ff3a}\x{1fb0}
23 23 Ket
28 End
------------------------------------------------------------------
/[\x{105}-\x{109}]/i,utf
Memory allocation - code size : 23
------------------------------------------------------------------
0 17 Bra
5 [\x{104}-\x{109}]
17 17 Ket
22 End
------------------------------------------------------------------
/( ( (?(1)0|) )* )/x
Memory allocation - code size : 56
------------------------------------------------------------------
0 50 Bra
5 40 CBra 1
12 Brazero
13 27 SCBra 2
20 10 Cond
25 1 Capture ref
28 0
30 5 Alt
35 15 Ket
40 27 KetRmax
45 40 Ket
50 50 Ket
55 End
------------------------------------------------------------------
/( (?(1)0|)* )/x
Memory allocation - code size : 44
------------------------------------------------------------------
0 38 Bra
5 28 CBra 1
12 Brazero
13 10 SCond
18 1 Capture ref
21 0
23 5 Alt
28 15 KetRmax
33 28 Ket
38 38 Ket
43 End
------------------------------------------------------------------
/[a]/
Memory allocation - code size : 13
------------------------------------------------------------------
0 7 Bra
5 a
7 7 Ket
12 End
------------------------------------------------------------------
/[a]/utf
Memory allocation - code size : 13
------------------------------------------------------------------
0 7 Bra
5 a
7 7 Ket
12 End
------------------------------------------------------------------
/[\xaa]/
Memory allocation - code size : 13
------------------------------------------------------------------
0 7 Bra
5 \x{aa}
7 7 Ket
12 End
------------------------------------------------------------------
/[\xaa]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 8 Bra
5 \x{aa}
8 8 Ket
13 End
------------------------------------------------------------------
/[^a]/
Memory allocation - code size : 13
------------------------------------------------------------------
0 7 Bra
5 [^a] (not)
7 7 Ket
12 End
------------------------------------------------------------------
/[^a]/utf
Memory allocation - code size : 13
------------------------------------------------------------------
0 7 Bra
5 [^a] (not)
7 7 Ket
12 End
------------------------------------------------------------------
/[^\xaa]/
Memory allocation - code size : 13
------------------------------------------------------------------
0 7 Bra
5 [^\x{aa}] (not)
7 7 Ket
12 End
------------------------------------------------------------------
/[^\xaa]/utf
Memory allocation - code size : 14
------------------------------------------------------------------
0 8 Bra
5 [^\x{aa}] (not)
8 8 Ket
13 End
------------------------------------------------------------------
#pattern -memory
/[^\d]/utf,ucp
------------------------------------------------------------------
0 15 Bra
5 [^\p{Nd}]
15 15 Ket
20 End
------------------------------------------------------------------
/[[:^alpha:][:^cntrl:]]+/utf,ucp
------------------------------------------------------------------
0 19 Bra
5 [\P{L}\P{Cc}]++
19 19 Ket
24 End
------------------------------------------------------------------
/[[:^cntrl:][:^alpha:]]+/utf,ucp
------------------------------------------------------------------
0 19 Bra
5 [\P{Cc}\P{L}]++
19 19 Ket
24 End
------------------------------------------------------------------
/[[:alpha:]]+/utf,ucp
------------------------------------------------------------------
0 16 Bra
5 [\p{L}]++
16 16 Ket
21 End
------------------------------------------------------------------
/[[:^alpha:]\S]+/utf,ucp
------------------------------------------------------------------
0 19 Bra
5 [\P{L}\P{Xsp}]++
19 19 Ket
24 End
------------------------------------------------------------------
/abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/
------------------------------------------------------------------
0 93 Bra
5 abc
11 9 CBra 1
18 d
20 7 Alt
25 e
27 16 Ket
32 *THEN
33 x
35 16 CBra 2
42 123
48 *THEN
49 4
51 37 Alt
56 567
62 9 CBra 3
69 b
71 7 Alt
76 q
78 16 Ket
83 *THEN
84 xx
88 53 Ket
93 93 Ket
98 End
------------------------------------------------------------------
/(((a\2)|(a*)\g<-1>))*a?/
------------------------------------------------------------------
0 73 Bra
5 Brazero
6 60 SCBra 1
13 24 CBra 2
20 12 CBra 3
27 a
29 \2
32 12 Ket
37 24 Alt
42 9 CBra 4
49 a*
51 9 Ket
56 42 Recurse
61 48 Ket
66 60 KetRmax
71 a?+
73 73 Ket
78 End
------------------------------------------------------------------
/((?+1)(\1))/
------------------------------------------------------------------
0 37 Bra
5 27 CBra 1
12 17 Recurse
17 10 CBra 2
24 \1
27 10 Ket
32 27 Ket
37 37 Ket
42 End
------------------------------------------------------------------
"(?1)(?#?'){2}(a)"
------------------------------------------------------------------
0 29 Bra
5 15 Recurse
10 15 Recurse
15 9 CBra 1
22 a
24 9 Ket
29 29 Ket
34 End
------------------------------------------------------------------
/.((?2)(?R)|\1|$)()/
------------------------------------------------------------------
0 54 Bra
5 Any
6 17 CBra 1
13 42 Recurse
18 0 Recurse
23 8 Alt
28 \1
31 6 Alt
36 $
37 31 Ket
42 7 CBra 2
49 7 Ket
54 54 Ket
59 End
------------------------------------------------------------------
/.((?3)(?R)()(?2)|\1|$)()/
------------------------------------------------------------------
0 71 Bra
5 Any
6 34 CBra 1
13 59 Recurse
18 0 Recurse
23 7 CBra 2
30 7 Ket
35 23 Recurse
40 8 Alt
45 \1
48 6 Alt
53 $
54 48 Ket
59 7 CBra 3
66 7 Ket
71 71 Ket
76 End
------------------------------------------------------------------
/(?1)()((((((\1++))\x85)+)|))/
------------------------------------------------------------------
0 115 Bra
5 10 Recurse
10 7 CBra 1
17 7 Ket
22 88 CBra 2
29 71 CBra 3
36 59 CBra 4
43 47 CBra 5
50 33 CBra 6
57 21 CBra 7
64 9 Once
69 \1+
73 9 Ket
78 21 Ket
83 33 Ket
88 \x{85}
90 47 KetRmax
95 59 Ket
100 5 Alt
105 76 Ket
110 88 Ket
115 115 Ket
120 End
------------------------------------------------------------------
# Check the absolute limit on nesting (?| etc. This varies with code unit
# width because the workspace is a different number of bytes. It will fail
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|

/parens_nest_limit=1000,-fullbincode
# Use "expand" to create some very long patterns with nested parentheses, in
# order to test workspace overflow. Again, this varies with code unit width,
# and even when it fails in two modes, the error offset differs. It also varies
# with link size - hence multiple tests with different values.
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
Failed: error 186 at offset 12820: regular expression is too complicated
/(?(1)(?1)){8,}+()/debug
------------------------------------------------------------------
0 181 Bra
5 159 Once
10 13 Cond
15 1 Capture ref
18 169 Recurse
23 13 Ket
28 13 Cond
33 1 Capture ref
36 169 Recurse
41 13 Ket
46 13 Cond
51 1 Capture ref
54 169 Recurse
59 13 Ket
64 13 Cond
69 1 Capture ref
72 169 Recurse
77 13 Ket
82 13 Cond
87 1 Capture ref
90 169 Recurse
95 13 Ket
100 13 Cond
105 1 Capture ref
108 169 Recurse
113 13 Ket
118 13 Cond
123 1 Capture ref
126 169 Recurse
131 13 Ket
136 23 SBraPos
141 13 SCond
146 1 Capture ref
149 169 Recurse
154 13 Ket
159 23 KetRpos
164 159 Ket
169 7 CBra 1
176 7 Ket
181 181 Ket
186 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcd
0:
1:
/(?(1)|a(?1)b){2,}+()/debug
------------------------------------------------------------------
0 91 Bra
5 69 Once
10 8 Cond
15 1 Capture ref
18 14 Alt
23 a
25 79 Recurse
30 b
32 22 Ket
37 32 SBraPos
42 8 SCond
47 1 Capture ref
50 14 Alt
55 a
57 79 Recurse
62 b
64 22 Ket
69 32 KetRpos
74 69 Ket
79 7 CBra 1
86 7 Ket
91 91 Ket
96 End
------------------------------------------------------------------
Capture group count = 1
Max back reference = 1
May match empty string
Subject length lower bound = 0
abcde
No match
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
------------------------------------------------------------------
0 327 Bra
5 102 CBra 1
12 5 Recurse
17 219 Recurse
22 231 Recurse
27 243 Recurse
32 255 Recurse
37 267 Recurse
42 279 Recurse
47 291 Recurse
52 303 Recurse
57 303 Recurse
62 291 Recurse
67 279 Recurse
72 267 Recurse
77 255 Recurse
82 243 Recurse
87 231 Recurse
92 219 Recurse
97 5 Recurse
102 0 Recurse
107 102 Ket
112 102 SCBra 1
119 5 Recurse
124 219 Recurse
129 231 Recurse
134 243 Recurse
139 255 Recurse
144 267 Recurse
149 279 Recurse
154 291 Recurse
159 303 Recurse
164 303 Recurse
169 291 Recurse
174 279 Recurse
179 267 Recurse
184 255 Recurse
189 243 Recurse
194 231 Recurse
199 219 Recurse
204 5 Recurse
209 0 Recurse
214 102 KetRmax
219 7 CBra 2
226 7 Ket
231 7 CBra 3
238 7 Ket
243 7 CBra 4
250 7 Ket
255 7 CBra 5
262 7 Ket
267 7 CBra 6
274 7 Ket
279 7 CBra 7
286 7 Ket
291 7 CBra 8
298 7 Ket
303 7 CBra 9
310 7 Ket
315 7 CBra 10
322 7 Ket
327 327 Ket
332 End
------------------------------------------------------------------
Capture group count = 10
May match empty string
Subject length lower bound = 0

Failed: error 114 at offset 509: missing closing parenthesis
fullbincode
#pattern -fullbincode
/\[()]{65535}/expand
# End of testinput8

408
3rd/pcre2/testdata/testoutput9 vendored Normal file

@@ -0,0 +1,408 @@
# This set of tests is run only with the 8-bit library. They must not require
# UTF-8 or Unicode property support. */
#forbid_utf
#newline_default lf any anycrlf
/a\xc4\xa3b/
a\N{U+123}b
0: a\xc4\xa3b
\= Expect no match # error message (too big char)
a\x{0123}b
** Character \x{123} is greater than 255 and UTF-8 mode is not enabled.
** Truncation will probably give the wrong result.
No match
a\o{00443}b
** Character \x{123} is greater than 255 and UTF-8 mode is not enabled.
** Truncation will probably give the wrong result.
No match
a\443b
** Character \x{123} is greater than 255 and UTF-8 mode is not enabled.
** Truncation will probably give the wrong result.
No match
/fd bf bf bf bf bf/I,hex
Capture group count = 0
First code unit = \xfd
Last code unit = \xbf
Subject length lower bound = 6
\= Expect warning
\N{U+7fffffff}
** Warning: character \N{U+7fffffff} is greater than 0x10ffff and should not be encoded as UTF-8
0: \xfd\xbf\xbf\xbf\xbf\xbf
\= Expect no match # error message (too big char)
\x{7fffffff}
** Character \x{7fffffff} is greater than 255 and UTF-8 mode is not enabled.
** Truncation will probably give the wrong result.
No match
/\x{100}/I
Failed: error 134 at offset 6: character code point value in \x{} or \o{} is too large
/\o{400}/I
Failed: error 134 at offset 6: character code point value in \x{} or \o{} is too large
/ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # optional leading comment
(?: (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # initial word
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) )* # further okay, if led by a period
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
# address
| # or
(?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # one word, optionally followed by....
(?:
[^()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037] | # atom and space parts, or...
\(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) | # comments, or...
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
# quoted strings
)*
< (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # leading <
(?: @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* , (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
)* # further okay, if led by comma
: # closing colon
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* )? # optional route
(?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) # initial word
(?: (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
|
" (?: # opening quote...
[^\\\x80-\xff\n\015"] # Anything except backslash and quote
| # or
\\ [^\x80-\xff] # Escaped something (something != CR)
)* " # closing quote
) )* # further okay, if led by a period
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* @ (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # initial subdomain
(?: #
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* \. # if led by a period...
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* (?:
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+ # some number of atom characters...
(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]) # ..not followed by something that could be part of an atom
| \[ # [
(?: [^\\\x80-\xff\n\015\[\]] | \\ [^\x80-\xff] )* # stuff
\] # ]
) # ...further okay
)*
# address spec
(?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* > # trailing >
# name and address
) (?: [\040\t] | \(
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
\) )* # optional trailing comment
/Ix
Capture group count = 0
Contains explicit CR or LF match
Options: extended
Starting code units: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e
f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
Subject length lower bound = 3
/\h/I
Capture group count = 0
Starting code units: \x09 \x20 \xa0
Subject length lower bound = 1
/\H/I
Capture group count = 0
Subject length lower bound = 1
/\v/I
Capture group count = 0
Starting code units: \x0a \x0b \x0c \x0d \x85
Subject length lower bound = 1
/\V/I
Capture group count = 0
Subject length lower bound = 1
/\R/I
Capture group count = 0
Starting code units: \x0a \x0b \x0c \x0d \x85
Subject length lower bound = 1
/[\h]/B
------------------------------------------------------------------
Bra
[\x09 \xa0]
Ket
End
------------------------------------------------------------------
>\x09<
0: \x09
/[\h]+/B
------------------------------------------------------------------
Bra
[\x09 \xa0]++
Ket
End
------------------------------------------------------------------
>\x09\x20\xa0<
0: \x09 \xa0
/[\v]/B
------------------------------------------------------------------
Bra
[\x0a-\x0d\x85]
Ket
End
------------------------------------------------------------------
/[\H]/B
------------------------------------------------------------------
Bra
[\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff]
Ket
End
------------------------------------------------------------------
/[^\h]/B
------------------------------------------------------------------
Bra
[^\x09 \xa0]
Ket
End
------------------------------------------------------------------
/[\V]/B
------------------------------------------------------------------
Bra
[\x00-\x09\x0e-\x84\x86-\xff]
Ket
End
------------------------------------------------------------------
/[\x0a\V]/B
------------------------------------------------------------------
Bra
[\x00-\x0a\x0e-\x84\x86-\xff]
Ket
End
------------------------------------------------------------------
/\777/I
Failed: error 151 at offset 4: octal value is greater than \377 in 8-bit non-UTF-8 mode
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark
Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)
XX
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark,alt_verbnames
Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)
XX
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark
XX
0: XX
MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark,alt_verbnames
XX
0: XX
MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE
/\u0100/alt_bsux,allow_empty_class,match_unset_backref,dupnames
Failed: error 177 at offset 6: character code point value in \u.... sequence is too large
/[\u0100-\u0200]/alt_bsux,allow_empty_class,match_unset_backref,dupnames
Failed: error 177 at offset 7: character code point value in \u.... sequence is too large
/[^\x00-a]{12,}[^b-\xff]*/B
------------------------------------------------------------------
Bra
[^\x00-a]{12,}+
[^b-\xff]*+
Ket
End
------------------------------------------------------------------
/[^\s]*\s* [^\W]+\W+ [^\d]*?\d0 [^\d\w]{4,6}?\w*A/B
------------------------------------------------------------------
Bra
[^\x09-\x0d ]*+
\s*
[0-9A-Z_a-z]++
\W+
[^0-9]*+
\d
0
[^0-9A-Z_a-z]{4,6}+
\w*
A
Ket
End
------------------------------------------------------------------
/(*MARK:a\x{100}b)z/alt_verbnames
Failed: error 134 at offset 14: character code point value in \x{} or \o{} is too large
/(*:*++++++++++++''''''''''''''''''''+''+++'+++x+++++++++++++++++++++++++++++++++++(++++++++++++++++++++:++++++%++:''''''''''''''''''''''''+++++++++++++++++++++++++++++++++++++++++++++++++++++-++++++++k+++++++''''+++'+++++++++++++++++++++++''''++++++++++++':ƿ)/
Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)
/(?i:A{1,}\6666666666)/
Failed: error 151 at offset 13: octal value is greater than \377 in 8-bit non-UTF-8 mode
A\x{1b6}6666666
# Should cause an error
/abc/substitute_extended,replace=>\777<
abc
Failed: error -57 at offset 5 in replacement: bad escape sequence in replacement string
# Should cause an error
/abc/substitute_extended,replace=>\o{012345}<
abc
Failed: error -57 at offset 10 in replacement: bad escape sequence in replacement string
/i/turkish_casing
Failed: error 204 at offset 0: PCRE2_EXTRA_TURKISH_CASING require Unicode (UTF or UCP) mode
# End of testinput9

206
3rd/pcre2/testdata/testoutputEBC vendored Normal file

@@ -0,0 +1,206 @@
PCRE2 version 10.32-RC1 2018-02-19
# This is a specialized test for checking, when PCRE2 is compiled with the
# EBCDIC option but in an ASCII environment, that newline, white space, and \c
# functionality is working. It catches cases where explicit values such as 0x0a
# have been used instead of names like CHAR_LF. Needless to say, it is not a
# genuine EBCDIC test! In patterns, alphabetic characters that follow a
# backslash must be in EBCDIC code. In data, NL, NEL, LF, ESC, and DEL must be
# in EBCDIC, but can of course be specified as escapes.
# Test default newline and variations
/^A/m
ABC
0: A
12\x15ABC
0: A
/^A/m,newline=any
12\x15ABC
0: A
12\x0dABC
0: A
12\x0d\x15ABC
0: A
12\x25ABC
0: A
/^A/m,newline=anycrlf
12\x15ABC
0: A
12\x0dABC
0: A
12\x0d\x15ABC
0: A
** Fail
No match
12\x25ABC
No match
# Test \h
/^A\<5C>/
A B
0: A\x20
A\x41B
0: AA
# Test \H
/^A\<5C>/
AB
0: AB
A\x42B
0: AB
** Fail
No match
A B
No match
A\x41B
No match
# Test \R
/^A\<5C>/
A\x15B
0: A\x15
A\x0dB
0: A\x0d
A\x25B
0: A\x25
A\x0bB
0: A\x0b
A\x0cB
0: A\x0c
** Fail
No match
A B
No match
# Test \v
/^A\<5C>/
A\x15B
0: A\x15
A\x0dB
0: A\x0d
A\x25B
0: A\x25
A\x0bB
0: A\x0b
A\x0cB
0: A\x0c
** Fail
No match
A B
No match
# Test \V
/^A\<5C>/
A B
0: A\x20
** Fail
No match
A\x15B
No match
A\x0dB
No match
A\x25B
No match
A\x0bB
No match
A\x0cB
No match
# For repeated items, use an atomic group so that the output is the same
# for DFA matching (otherwise it may show multiple matches).
# Test \h+
/^A(?>\<5C>+)/
A B
0: A\x20
# Test \H+
/^A(?>\<5C>+)/
AB
0: AB
** Fail
No match
A B
No match
# Test \R+
/^A(?>\<5C>+)/
A\x15B
0: A\x15
A\x0dB
0: A\x0d
A\x25B
0: A\x25
A\x0bB
0: A\x0b
A\x0cB
0: A\x0c
** Fail
No match
A B
No match
# Test \v+
/^A(?>\<5C>+)/
A\x15B
0: A\x15
A\x0dB
0: A\x0d
A\x25B
0: A\x25
A\x0bB
0: A\x0b
A\x0cB
0: A\x0c
** Fail
No match
A B
No match
# Test \V+
/^A(?>\<5C>+)/
A B
0: A\x20B
** Fail
No match
A\x15B
No match
A\x0dB
No match
A\x25B
No match
A\x0bB
No match
A\x0cB
No match
# Test \c functionality
/\<5C>@\<5C>A\<5C>b\<5C>C\<5C>d\<5C>E\<5C>f\<5C>G\<5C>h\<5C>I\<5C>J\<5C>K\<5C>l\<5C>m\<5C>N\<5C>O\<5C>p\<5C>q\<5C>r\<5C>S\<5C>T\<5C>u\<5C>V\<5C>W\<5C>X\<5C>y\<5C>Z/
\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f
0: \x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a
/\<5C>[\<5C>\\<5C>]\<5C>^\<5C>_/
\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f
0: \x1b\x1c\x1d\x1e\x1f
/\<5C>?/
A\xffB
0: \xff
/\<5C>&/
Failed: error 168 at offset 3: \c\x20must\x20be\x20followed\x20by\x20a\x20letter\x20or\x20one\x20of\x20[\]^_\x3f
# End

Some files were not shown because too many files have changed in this diff