grep: fix segfault under -P + PCRE2 <=10.30 + (*NO_JIT)
authorÆvar Arnfjörð Bjarmason <avarab@gmail.com>
Thu, 23 Nov 2017 14:16:58 +0000 (14:16 +0000)
committerJunio C Hamano <gitster@pobox.com>
Fri, 24 Nov 2017 07:12:26 +0000 (16:12 +0900)
Fix a bug in the compilation of PCRE2 patterns under JIT (the most
common runtime configuration). Any pattern with a (*NO_JIT) verb would
segfault in any currently released PCRE2 version:

$ git grep -P '(*NO_JIT)hi.*there'
Segmentation fault

That this segfaulted was a bug in PCRE2 itself, after reporting it[1]
on pcre-dev it's been fixed in a yet-to-be-released version of
PCRE (presumably released first as 10.31). Now it'll die with:

$ git grep -P '(*NO_JIT)hi.*there'
fatal: pcre2_jit_match failed with error code -45: bad JIT option

But the cause of the bug is in our own code dating back to my
94da9193a6 ("grep: add support for PCRE v2", 2017-06-01).

As explained at more length in the comment being added here, it isn't
sufficient to just check pcre2_config() to see whether the JIT should
be used, pcre2_pattern_info() also has to be asked.

This is something I discovered myself when fiddling around with PCRE2
verbs in patterns passed to git. I don't expect that any user of git
has encountered this given the obscurity of passing PCRE2 verbs
through to the library, along with the relative obscurity of (*NO_JIT)
itself.

1. "How am I supposed to use PCRE2 JIT in the face of (*NO_JIT) ?"
(<CACBZZX5mMqDuWuFmi7sRBp3wH6CFyd-ghACukd=v0NN=rBMnJg@mail.gmail.com> &
https://lists.exim.org/lurker/thread/20171123.101502.7f0d38ca.en.html)
on the pcre-dev mailing list

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
grep.c
t/t7810-grep.sh
diff --git a/grep.c b/grep.c
index d0b9b6cdfa74537ed67dcfef4ae486466d7185f8..e8ae0b5d8f78c68ec13038f59b53f0e1abe3482a 100644 (file)
--- a/grep.c
+++ b/grep.c
@@ -477,6 +477,8 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
        int options = PCRE2_MULTILINE;
        const uint8_t *character_tables = NULL;
        int jitret;
+       int patinforet;
+       size_t jitsizearg;
 
        assert(opt->pcre2);
 
@@ -511,6 +513,30 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
                jitret = pcre2_jit_compile(p->pcre2_pattern, PCRE2_JIT_COMPLETE);
                if (jitret)
                        die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
+
+               /*
+                * The pcre2_config(PCRE2_CONFIG_JIT, ...) call just
+                * tells us whether the library itself supports JIT,
+                * but to see whether we're going to be actually using
+                * JIT we need to extract PCRE2_INFO_JITSIZE from the
+                * pattern *after* we do pcre2_jit_compile() above.
+                *
+                * This is because if the pattern contains the
+                * (*NO_JIT) verb (see pcre2syntax(3))
+                * pcre2_jit_compile() will exit early with 0. If we
+                * then proceed to call pcre2_jit_match() further down
+                * the line instead of pcre2_match() we'll either
+                * segfault (pre PCRE 10.31) or run into a fatal error
+                * (post PCRE2 10.31)
+                */
+               patinforet = pcre2_pattern_info(p->pcre2_pattern, PCRE2_INFO_JITSIZE, &jitsizearg);
+               if (patinforet)
+                       BUG("pcre2_pattern_info() failed: %d", patinforet);
+               if (jitsizearg == 0) {
+                       p->pcre2_jit_on = 0;
+                       return;
+               }
+
                p->pcre2_jit_stack = pcre2_jit_stack_create(1, 1024 * 1024, NULL);
                if (!p->pcre2_jit_stack)
                        die("Couldn't allocate PCRE2 JIT stack");
index 2a6679c2f596fb13a34786a4aa4b954e61c61113..c8ff50cc306126e06f58ead078ba0327fe16a48a 100755 (executable)
@@ -1110,6 +1110,12 @@ test_expect_success PCRE 'grep -P pattern' '
        test_cmp expected actual
 '
 
+test_expect_success LIBPCRE2 "grep -P with (*NO_JIT) doesn't error out" '
+       git grep -P "(*NO_JIT)\p{Ps}.*?\p{Pe}" hello.c >actual &&
+       test_cmp expected actual
+
+'
+
 test_expect_success !PCRE 'grep -P pattern errors without PCRE' '
        test_must_fail git grep -P "foo.*bar"
 '