fast-import: duplicate parsed encoding string
author Jeff King <peff@peff.net>
Sun, 25 Aug 2019 08:08:21 +0000 (04:08 -0400)
committer Junio C Hamano <gitster@pobox.com>
Tue, 27 Aug 2019 22:02:49 +0000 (15:02 -0700)
We read each line of the fast-import stream into the command_buf strbuf.
When reading a commit, we parse a line like "encoding foo" by storing a
pointer to "foo", but not making a copy. We may then read an unbounded
number of other lines (e.g., one for each modified file in the commit),
each of which writes into command_buf.

This works out in practice for small cases, because we hand off
ownership of the heap buffer from command_buf to the cmd_hist array, and
read new commands into a fresh heap buffer. And thus the pointer to
"foo" remains valid as long as there aren't so many intermediate lines
that we end up dropping the original "encoding" line from the history.

But as the test modification shows, if we go over our default of 100
lines, we end up with our encoding string pointing into freed heap
memory. This seems to fail reliably by writing garbage into the output,
but running under ASan definitely detects this as a use-after-free.

We can fix it by duplicating the encoding value, just as we do for other
parsed lines (e.g., an author line ends up in parse_ident, which copies
it to a new string).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
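
The lifetime problem described above can be modeled outside of Git. The
following is only a minimal sketch, not the fast-import code: the names
(struct history, hist_read, parse_with_copy, dup_string) are hypothetical,
and the ring of HIST_SIZE owned buffers is an assumed stand-in for the
cmd_hist hand-off. It tracks when a borrowed pointer would have been freed
(without ever dereferencing it) and shows that an owned copy of the
encoding value is unaffected.

```c
#include <stdlib.h>
#include <string.h>

#define HIST_SIZE 100	/* assumed to mirror the "default of 100 lines" */

/* Hypothetical model of a command history: a ring of owned line buffers. */
struct history {
	char *line[HIST_SIZE];
	size_t next;		/* slot the next line will occupy */
	unsigned long total;	/* number of lines read so far */
};

/* Portable stand-in for a duplicating helper such as xstrdup(). */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/* Read one line: the oldest buffer is freed, the new one is kept. */
static const char *hist_read(struct history *h, const char *text)
{
	free(h->line[h->next]);
	h->line[h->next] = dup_string(text);
	const char *stored = h->line[h->next];
	h->next = (h->next + 1) % HIST_SIZE;
	h->total++;
	return stored;
}

static void hist_release(struct history *h)
{
	for (size_t i = 0; i < HIST_SIZE; i++) {
		free(h->line[i]);
		h->line[i] = NULL;
	}
}

/*
 * Parse an "encoding" line, then read `extra` more lines.
 * Returns an owned copy of the value (caller frees). *borrowed_ok is
 * set to 0 if a pointer into the original line would now be dangling.
 */
static char *parse_with_copy(struct history *h, unsigned extra,
			     int *borrowed_ok)
{
	const char *line = hist_read(h, "encoding foo");
	unsigned long seen_at = h->total;
	char *encoding = dup_string(line + strlen("encoding "));

	for (unsigned i = 0; i < extra; i++)
		hist_read(h, "M 644 inline some-file");

	/* The original line is evicted once HIST_SIZE newer lines arrive. */
	*borrowed_ok = (h->total - seen_at) < HIST_SIZE;
	return encoding;
}
```

In this model, calling parse_with_copy() on a zero-initialized struct history
with extra set to HIST_SIZE - 1 leaves *borrowed_ok at 1, while HIST_SIZE or
more sets it to 0; the returned copy is "foo" in both cases, which is the
property the xstrdup() in the patch relies on.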
fast-import.c
t/t9300-fast-import.sh
diff --git a/fast-import.c b/fast-import.c
index b44d6a467ef17f3d2f541475ab1f4ce968504a5e..ee7258037a15cf04170377f4394a8f8dfe6d2179 100644
@@ -2588,7 +2588,7 @@ static void parse_new_commit(const char *arg)
        struct branch *b;
        char *author = NULL;
        char *committer = NULL;
-       const char *encoding = NULL;
+       char *encoding = NULL;
        struct hash_list *merge_list = NULL;
        unsigned int merge_count;
        unsigned char prev_fanout, new_fanout;
@@ -2611,8 +2611,10 @@ static void parse_new_commit(const char *arg)
        }
        if (!committer)
                die("Expected committer but didn't get one");
-       if (skip_prefix(command_buf.buf, "encoding ", &encoding))
+       if (skip_prefix(command_buf.buf, "encoding ", &v)) {
+               encoding = xstrdup(v);
                read_next_command();
+       }
        parse_data(&msg, 0, NULL);
        read_next_command();
        parse_from(b);
@@ -2686,6 +2688,7 @@ static void parse_new_commit(const char *arg)
        strbuf_addbuf(&new_data, &msg);
        free(author);
        free(committer);
+       free(encoding);
 
        if (!store_object(OBJ_COMMIT, &new_data, NULL, &b->oid, next_mark))
                b->pack_id = pack_id;
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 141b7fa35e74b860d91ea7cdabf48730442ed635..cf66b40ebcc509dc17d6b5964ec5bc934a46ba93 100755
@@ -3314,6 +3314,11 @@ test_expect_success 'X: handling encoding' '
 
        printf "Pi: \360\nCOMMIT\n" >>input &&
 
+       for i in $(test_seq 100)
+       do
+               echo "M 644 $EMPTY_BLOB file-$i"
+       done >>input &&
+
        git fast-import <input &&
        git cat-file -p encoding | grep $(printf "\360") &&
        git log -1 --format=%B encoding | grep $(printf "\317\200")