Source-Changes archive

CVS commit: othersrc/external/bsd/agcre



Module Name:    othersrc
Committed By:   agc
Date:           Wed Aug 16 23:38:14 UTC 2017

Added Files:
        othersrc/external/bsd/agcre: Makefile README
        othersrc/external/bsd/agcre/bin: Makefile
        othersrc/external/bsd/agcre/dist: Makefile.bsd Makefile.in
            Makefile.lib.in Makefile.libtool.in agcre.1 agcre.h agcre_format.7
            comp.c configure error.c exec.c free.c lex.c lex.h libagcre.3
            main.c mkdist new.c set.c set.h unicode.c unicode.h
        othersrc/external/bsd/agcre/dist/tests: 1.expected 1.in 10.expected
            11.expected 12.expected 13.expected 14.expected 15.expected
            16.expected 17.expected 18.expected 19.expected 2.expected 2.in
            20.expected 21.expected 22.expected 23.expected 24.expected
            25.expected 26.expected 27.expected 28.expected 29.expected
            3.expected 3.in 30.expected 31.expected 32.expected 33.expected
            34.expected 35.expected 36.expected 37.expected 38.expected
            39.expected 4.expected 40.expected 41.expected 42.expected
            43.expected 44.expected 45.expected 46.expected 47.expected
            48.expected 49.expected 5.expected 50.expected 51.expected
            52.expected 53.expected 54.expected 55.expected 56.expected
            57.expected 58.expected 59.expected 6.expected 60.expected
            61.expected 62.expected 63.expected 64.expected 65.expected
            66.expected 67.expected 68.expected 69.expected 7.expected
            70.expected 71.expected 72.expected 73.expected 74.expected
            75.expected 76.expected 77.expected 78.expected 79.expected
            8.expected 80.expected 81.expected 82.expected 83.expected
            84.expected 85.expected 86.expected 87.expected 88.expected
            89.expected 9.expected 90.expected 91.expected 92.expected
            UTF-8-demo.txt expression words
        othersrc/external/bsd/agcre/lib: Makefile shlib_version

Log Message:
Just what this world needs - another regexp library. However, for
something I was doing, I needed a regexp library in C, BSD-licensed,
and able to be exposed to a wide range of expressions, some better
controlled than others.

The resulting library is libagcre, which implements regular expression
compilation and execution. It uses the Pike Virtual Machine approach,
and features:

+ standard POSIX features where sane
+ some/most Perl escapes
+ lazy matching via '?'
+ non-capturing parentheses (?:...)
+ in-expression case-insensitivity directives are supported (?i)...(?-i)
+ all case-insensitivity is applied at expression execution time.
Case-insensitivity can be specified at expression compile time
and, if so, it will be remembered, but a compiled expression can
still be used to match both case-sensitively and
case-insensitively
+ UTF-8 is supported both for expressions and for input text when
matching
+ Unicode escapes (in the Java format of \uABCD) are supported
+ exact and bounded repetition specifiers, {N} and {N,M}, are supported
+ backreferences are supported
+ UTF-16 (LE and BE) and UTF-32 (LE and BE) are supported, both for the
expression and for the input being searched
+ at the most basic level, individual 32-bit Unicode characters are
matched
+ an egrep/grep implementation for matching Unicode regexps
is included
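
To illustrate what matching on individual 32-bit characters implies
for UTF-8 input, here is a minimal decoding sketch. It is not taken
from unicode.c; the name utf8_decode and its signature are
hypothetical, and the code only shows the general technique of
turning a byte sequence into one code point before comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not libagcre's API: decode one code point
 * from the n bytes at s.  Returns the number of bytes consumed, or
 * 0 if the sequence is truncated, malformed, overlong, a surrogate,
 * or beyond U+10FFFF.
 */
size_t
utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
	size_t len, i;
	uint32_t c, min;

	assert(s != NULL && cp != NULL);
	if (n == 0) {
		return 0;
	}
	if (s[0] < 0x80) {
		/* single byte: plain ASCII */
		*cp = s[0];
		return 1;
	}
	if ((s[0] & 0xe0) == 0xc0) {
		len = 2; c = s[0] & 0x1f; min = 0x80;
	} else if ((s[0] & 0xf0) == 0xe0) {
		len = 3; c = s[0] & 0x0f; min = 0x800;
	} else if ((s[0] & 0xf8) == 0xf0) {
		len = 4; c = s[0] & 0x07; min = 0x10000;
	} else {
		/* stray continuation byte or invalid lead byte */
		return 0;
	}
	if (n < len) {
		return 0;
	}
	for (i = 1; i < len; i++) {
		/* every continuation byte must look like 10xxxxxx */
		if ((s[i] & 0xc0) != 0x80) {
			return 0;
		}
		c = (c << 6) | (uint32_t)(s[i] & 0x3f);
	}
	/* reject overlong forms, surrogates and out-of-range values */
	if (c < min || c > 0x10ffff || (c >= 0xd800 && c <= 0xdfff)) {
		return 0;
	}
	*cp = c;
	return len;
}
```

A matcher built this way compares decoded code points rather than
bytes, so the same comparison loop could serve UTF-16 and UTF-32
input once each has its own decoder.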

A simple implementation of sets is used to provide inclusion and
exclusion information for unicode characters, which is taken directly
from unicode.org. No bitmasks are used - ranges are specified by
using an upper and a lower bound for the codepoints. Callbacks can
also be added to these sets, to provide functionality similar to
the ctype macros across the whole unicode character set.
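
As a rough illustration of the range-based sets described above,
the following sketch stores inclusive lower and upper bounds and
falls back to an optional callback. The names range_t, cpset_t and
cpset_contains, the negate flag, and the requirement that ranges be
sorted are all assumptions made for this example; set.c may be
organised quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical callback type: non-zero means "cp is a member" */
typedef int (*set_cb_t)(uint32_t cp);

typedef struct {
	uint32_t	lo;	/* inclusive lower bound */
	uint32_t	hi;	/* inclusive upper bound */
} range_t;

typedef struct {
	const range_t	*ranges;	/* sorted by lo, non-overlapping */
	size_t		 nranges;
	set_cb_t	 cb;		/* optional, may be NULL */
	int		 negate;	/* non-zero for an exclusion set */
} cpset_t;

/* Return 1 if cp is matched by the set, 0 otherwise. */
int
cpset_contains(const cpset_t *set, uint32_t cp)
{
	size_t lo, hi, mid;
	int found = 0;

	assert(set != NULL);
	/* binary search over the sorted ranges */
	for (lo = 0, hi = set->nranges; lo < hi; ) {
		mid = lo + (hi - lo) / 2;
		if (cp < set->ranges[mid].lo) {
			hi = mid;
		} else if (cp > set->ranges[mid].hi) {
			lo = mid + 1;
		} else {
			found = 1;
			break;
		}
	}
	/* no range matched: give the callback a chance */
	if (!found && set->cb != NULL) {
		found = (set->cb(cp) != 0);
	}
	return set->negate ? !found : found;
}

/* Example data: three decimal digit blocks plus a callback. */
static const range_t digit_ranges[] = {
	{ 0x0030, 0x0039 },	/* ASCII digits */
	{ 0x0660, 0x0669 },	/* Arabic-Indic digits */
	{ 0x0966, 0x096f }	/* Devanagari digits */
};

static int
is_underscore(uint32_t cp)
{
	return cp == '_';
}

static const cpset_t digits_or_underscore = {
	digit_ranges, 3, is_underscore, 0
};
```

With ranges instead of bitmasks, the storage cost depends on the
number of ranges rather than on the size of the Unicode code space,
and a lookup takes a number of comparisons logarithmic in the
number of ranges.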

The standard regular expression basic3 torture test is passed with
4 known (and, I'd argue, incorrect) results flagged.  As expected,
the expression '(a?){9999}aaaaaaaaaaaaaaaaaaaaaaaaaaaaa' matches
in linear time, as does the expression
'((((((((((((((((((((((((((((((x))))))))))))))))))))))))))))))'

        % time agcre '(a?){9999}aaaaaaaaaaaaaaaaaaaaaaaaaaaaa' dist/tests/2.in
        aaaaaaaaaaaaaaaaaaaaaaaaaaaaa
        0.063u 0.000s 0:00.06 100.0%    0+0k 0+0io 0pf+0w
        % time egrep '(a?){9999}aaaaaaaaaaaaaaaaaaaaaaaaaaaaa' dist/tests/2.in
        ^C88.462u 0.730s 1:29.21 99.9%  0+0k 0+0io 0pf+0w
        %
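
The timing difference comes from the Pike VM approach: all
alternatives are advanced in lockstep over the input, and each
instruction appears at most once in the thread list for a given
input position, so the work per character is bounded by the size of
the compiled program. The sketch below shows that idea in its
simplest form. It is not libagcre's code: the opcodes, the inst_t
layout and pike_match are hypothetical, captures and backreferences
are left out, and the match is anchored at both ends to keep the
example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* hypothetical opcode set, far smaller than a real library's */
enum { OP_CHAR, OP_SPLIT, OP_JMP, OP_MATCH };

typedef struct {
	int	op;
	int	c;	/* OP_CHAR: byte to match */
	int	x, y;	/* OP_JMP: target x; OP_SPLIT: targets x, y */
} inst_t;

#define MAXPROG	64

typedef struct {
	int	pc[MAXPROG];
	int	n;
} threadlist_t;

/*
 * Add a thread at pc, following jumps and splits immediately.
 * seen[] records the step at which each pc was last added, so no
 * pc is ever queued twice for the same input position.
 */
static void
addthread(const inst_t *prog, threadlist_t *l, unsigned *seen,
	unsigned gen, int pc)
{
	if (seen[pc] == gen) {
		return;
	}
	seen[pc] = gen;
	switch (prog[pc].op) {
	case OP_JMP:
		addthread(prog, l, seen, gen, prog[pc].x);
		break;
	case OP_SPLIT:
		addthread(prog, l, seen, gen, prog[pc].x);
		addthread(prog, l, seen, gen, prog[pc].y);
		break;
	default:
		l->pc[l->n++] = pc;
		break;
	}
}

/* Return 1 if prog matches the whole of s, 0 otherwise. */
int
pike_match(const inst_t *prog, int proglen, const char *s)
{
	threadlist_t a, b, *clist = &a, *nlist = &b, *tmp;
	unsigned seen[MAXPROG] = { 0 };
	unsigned gen = 1;
	size_t i, len = strlen(s);
	int t, pc;

	assert(proglen > 0 && proglen <= MAXPROG);
	clist->n = 0;
	addthread(prog, clist, seen, gen, 0);
	for (i = 0; i <= len; i++) {
		gen++;
		nlist->n = 0;
		for (t = 0; t < clist->n; t++) {
			pc = clist->pc[t];
			switch (prog[pc].op) {
			case OP_CHAR:
				if (i < len && (unsigned char)s[i] == prog[pc].c) {
					addthread(prog, nlist, seen, gen, pc + 1);
				}
				break;
			case OP_MATCH:
				if (i == len) {
					return 1;
				}
				break;
			default:
				break;
			}
		}
		tmp = clist; clist = nlist; nlist = tmp;
	}
	return 0;
}

/* Hand-compiled program for the expression a+b. */
static const inst_t prog_a_plus_b[] = {
	{ OP_CHAR,  'a', 0, 0 },	/* 0: match 'a' */
	{ OP_SPLIT, 0,   0, 2 },	/* 1: loop to 0, or go on to 2 */
	{ OP_CHAR,  'b', 0, 0 },	/* 2: match 'b' */
	{ OP_MATCH, 0,   0, 0 }		/* 3: success */
};
```

Calling pike_match(prog_a_plus_b, 4, "aaab") returns 1, while
"aa" and "b" both return 0. A backtracking matcher would instead
explore each OP_SPLIT choice separately, which is where the
exponential running time on expressions like '(a?){9999}a...'
comes from.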

The library and agcre utility have been run through valgrind to
confirm no memory leaks.

In general, the emphasis is on a modern, predictable, VM-style,
well-featured regexp library, in C, with a BSD license. In
particular, sljit has not been used to speed up matching on certain
platforms; most Perl regexp features are supported, as are
backreferences, UTF-8, UTF-16 and UTF-32.

Once again, I wouldn't expect anyone to use this as the main engine
in egrep. But I am always amazed at the uses for some of the things
that I write.

For more information about the Pike VM, and comparison to other
regexp implementations, please see:

        https://swtch.com/~rsc/regexp/regexp2.html

Alistair Crooks
Tue Aug 15 07:43:34 PDT 2017


To generate a diff of this commit:
cvs rdiff -u -r0 -r1.1 othersrc/external/bsd/agcre/Makefile \
    othersrc/external/bsd/agcre/README
cvs rdiff -u -r0 -r1.1 othersrc/external/bsd/agcre/bin/Makefile
cvs rdiff -u -r0 -r1.1 othersrc/external/bsd/agcre/dist/Makefile.bsd \
    othersrc/external/bsd/agcre/dist/Makefile.in \
    othersrc/external/bsd/agcre/dist/Makefile.lib.in \
    othersrc/external/bsd/agcre/dist/Makefile.libtool.in \
    othersrc/external/bsd/agcre/dist/agcre.1 \
    othersrc/external/bsd/agcre/dist/agcre.h \
    othersrc/external/bsd/agcre/dist/agcre_format.7 \
    othersrc/external/bsd/agcre/dist/comp.c \
    othersrc/external/bsd/agcre/dist/configure \
    othersrc/external/bsd/agcre/dist/error.c \
    othersrc/external/bsd/agcre/dist/exec.c \
    othersrc/external/bsd/agcre/dist/free.c \
    othersrc/external/bsd/agcre/dist/lex.c \
    othersrc/external/bsd/agcre/dist/lex.h \
    othersrc/external/bsd/agcre/dist/libagcre.3 \
    othersrc/external/bsd/agcre/dist/main.c \
    othersrc/external/bsd/agcre/dist/mkdist \
    othersrc/external/bsd/agcre/dist/new.c \
    othersrc/external/bsd/agcre/dist/set.c \
    othersrc/external/bsd/agcre/dist/set.h \
    othersrc/external/bsd/agcre/dist/unicode.c \
    othersrc/external/bsd/agcre/dist/unicode.h
cvs rdiff -u -r0 -r1.1 othersrc/external/bsd/agcre/dist/tests/1.expected \
    othersrc/external/bsd/agcre/dist/tests/1.in \
    othersrc/external/bsd/agcre/dist/tests/10.expected \
    othersrc/external/bsd/agcre/dist/tests/11.expected \
    othersrc/external/bsd/agcre/dist/tests/12.expected \
    othersrc/external/bsd/agcre/dist/tests/13.expected \
    othersrc/external/bsd/agcre/dist/tests/14.expected \
    othersrc/external/bsd/agcre/dist/tests/15.expected \
    othersrc/external/bsd/agcre/dist/tests/16.expected \
    othersrc/external/bsd/agcre/dist/tests/17.expected \
    othersrc/external/bsd/agcre/dist/tests/18.expected \
    othersrc/external/bsd/agcre/dist/tests/19.expected \
    othersrc/external/bsd/agcre/dist/tests/2.expected \
    othersrc/external/bsd/agcre/dist/tests/2.in \
    othersrc/external/bsd/agcre/dist/tests/20.expected \
    othersrc/external/bsd/agcre/dist/tests/21.expected \
    othersrc/external/bsd/agcre/dist/tests/22.expected \
    othersrc/external/bsd/agcre/dist/tests/23.expected \
    othersrc/external/bsd/agcre/dist/tests/24.expected \
    othersrc/external/bsd/agcre/dist/tests/25.expected \
    othersrc/external/bsd/agcre/dist/tests/26.expected \
    othersrc/external/bsd/agcre/dist/tests/27.expected \
    othersrc/external/bsd/agcre/dist/tests/28.expected \
    othersrc/external/bsd/agcre/dist/tests/29.expected \
    othersrc/external/bsd/agcre/dist/tests/3.expected \
    othersrc/external/bsd/agcre/dist/tests/3.in \
    othersrc/external/bsd/agcre/dist/tests/30.expected \
    othersrc/external/bsd/agcre/dist/tests/31.expected \
    othersrc/external/bsd/agcre/dist/tests/32.expected \
    othersrc/external/bsd/agcre/dist/tests/33.expected \
    othersrc/external/bsd/agcre/dist/tests/34.expected \
    othersrc/external/bsd/agcre/dist/tests/35.expected \
    othersrc/external/bsd/agcre/dist/tests/36.expected \
    othersrc/external/bsd/agcre/dist/tests/37.expected \
    othersrc/external/bsd/agcre/dist/tests/38.expected \
    othersrc/external/bsd/agcre/dist/tests/39.expected \
    othersrc/external/bsd/agcre/dist/tests/4.expected \
    othersrc/external/bsd/agcre/dist/tests/40.expected \
    othersrc/external/bsd/agcre/dist/tests/41.expected \
    othersrc/external/bsd/agcre/dist/tests/42.expected \
    othersrc/external/bsd/agcre/dist/tests/43.expected \
    othersrc/external/bsd/agcre/dist/tests/44.expected \
    othersrc/external/bsd/agcre/dist/tests/45.expected \
    othersrc/external/bsd/agcre/dist/tests/46.expected \
    othersrc/external/bsd/agcre/dist/tests/47.expected \
    othersrc/external/bsd/agcre/dist/tests/48.expected \
    othersrc/external/bsd/agcre/dist/tests/49.expected \
    othersrc/external/bsd/agcre/dist/tests/5.expected \
    othersrc/external/bsd/agcre/dist/tests/50.expected \
    othersrc/external/bsd/agcre/dist/tests/51.expected \
    othersrc/external/bsd/agcre/dist/tests/52.expected \
    othersrc/external/bsd/agcre/dist/tests/53.expected \
    othersrc/external/bsd/agcre/dist/tests/54.expected \
    othersrc/external/bsd/agcre/dist/tests/55.expected \
    othersrc/external/bsd/agcre/dist/tests/56.expected \
    othersrc/external/bsd/agcre/dist/tests/57.expected \
    othersrc/external/bsd/agcre/dist/tests/58.expected \
    othersrc/external/bsd/agcre/dist/tests/59.expected \
    othersrc/external/bsd/agcre/dist/tests/6.expected \
    othersrc/external/bsd/agcre/dist/tests/60.expected \
    othersrc/external/bsd/agcre/dist/tests/61.expected \
    othersrc/external/bsd/agcre/dist/tests/62.expected \
    othersrc/external/bsd/agcre/dist/tests/63.expected \
    othersrc/external/bsd/agcre/dist/tests/64.expected \
    othersrc/external/bsd/agcre/dist/tests/65.expected \
    othersrc/external/bsd/agcre/dist/tests/66.expected \
    othersrc/external/bsd/agcre/dist/tests/67.expected \
    othersrc/external/bsd/agcre/dist/tests/68.expected \
    othersrc/external/bsd/agcre/dist/tests/69.expected \
    othersrc/external/bsd/agcre/dist/tests/7.expected \
    othersrc/external/bsd/agcre/dist/tests/70.expected \
    othersrc/external/bsd/agcre/dist/tests/71.expected \
    othersrc/external/bsd/agcre/dist/tests/72.expected \
    othersrc/external/bsd/agcre/dist/tests/73.expected \
    othersrc/external/bsd/agcre/dist/tests/74.expected \
    othersrc/external/bsd/agcre/dist/tests/75.expected \
    othersrc/external/bsd/agcre/dist/tests/76.expected \
    othersrc/external/bsd/agcre/dist/tests/77.expected \
    othersrc/external/bsd/agcre/dist/tests/78.expected \
    othersrc/external/bsd/agcre/dist/tests/79.expected \
    othersrc/external/bsd/agcre/dist/tests/8.expected \
    othersrc/external/bsd/agcre/dist/tests/80.expected \
    othersrc/external/bsd/agcre/dist/tests/81.expected \
    othersrc/external/bsd/agcre/dist/tests/82.expected \
    othersrc/external/bsd/agcre/dist/tests/83.expected \
    othersrc/external/bsd/agcre/dist/tests/84.expected \
    othersrc/external/bsd/agcre/dist/tests/85.expected \
    othersrc/external/bsd/agcre/dist/tests/86.expected \
    othersrc/external/bsd/agcre/dist/tests/87.expected \
    othersrc/external/bsd/agcre/dist/tests/88.expected \
    othersrc/external/bsd/agcre/dist/tests/89.expected \
    othersrc/external/bsd/agcre/dist/tests/9.expected \
    othersrc/external/bsd/agcre/dist/tests/90.expected \
    othersrc/external/bsd/agcre/dist/tests/91.expected \
    othersrc/external/bsd/agcre/dist/tests/92.expected \
    othersrc/external/bsd/agcre/dist/tests/UTF-8-demo.txt \
    othersrc/external/bsd/agcre/dist/tests/expression \
    othersrc/external/bsd/agcre/dist/tests/words
cvs rdiff -u -r0 -r1.1 othersrc/external/bsd/agcre/lib/Makefile \
    othersrc/external/bsd/agcre/lib/shlib_version

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.



