fast-import: duplicate into history rather than passing ownership
author Jeff King <peff@peff.net>
Sun, 25 Aug 2019 08:10:55 +0000 (04:10 -0400)
committer Junio C Hamano <gitster@pobox.com>
Tue, 27 Aug 2019 22:03:01 +0000 (15:03 -0700)
Fast-import's read_next_command() has somewhat odd memory ownership
semantics for the command_buf strbuf. After reading a command, we copy
the strbuf's pointer (without duplicating the string) into our cmd_hist
array of recent commands. And then when we're about to read a new
command, we clear the strbuf by calling strbuf_detach(), dropping
ownership from the strbuf (leaving the cmd_hist reference as the
remaining owner).

This has a few surprising implications:

- if the strbuf hasn't been copied into cmd_hist (e.g., because we
haven't read any commands yet), then the strbuf_detach() will leak
the resulting string

- any modification to command_buf risks invalidating the pointer held
by cmd_hist. There doesn't seem to be any way to trigger this
currently (since we tend to modify it only by detaching and reading
in a new value), but it's subtly dangerous.

- any pointers into an input string will remain valid as long as
cmd_hist points to them. So in general, you can point into
command_buf.buf and call read_next_command() up to 100 times before
your string is cycled out and freed, leaving you with a dangling
pointer. This makes it easy to miss bugs during testing, as they
might trigger only for a sufficiently large commit (e.g., the bug
fixed in the previous commit).

Instead, let's copy the command into a newly allocated string for the
history array, rather than sharing ownership of the original. Then we can drop
the strbuf_detach() calls entirely, and just reuse the same buffer
within command_buf over and over. We'd normally have to strbuf_reset()
it before using it again, but in both cases here we're using
strbuf_getline_lf(), which does it automatically for us.
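
The ownership change described above can be sketched outside of Git as follows. This is a minimal illustration under stated assumptions, not the fast-import code: `struct history`, `history_record()`, `history_release()` and `HIST_SIZE` are hypothetical names, and plain `malloc()` plus `memcpy()` stands in for Git's `xstrdup()`. The point is only that the history keeps private copies, so the caller is free to reuse one input buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HIST_SIZE 4 /* hypothetical; kept small for illustration */

/* Hypothetical stand-in for a ring of recently seen commands. */
struct history {
	char *entries[HIST_SIZE]; /* each slot owns its own copy, or is NULL */
	size_t next;              /* slot to overwrite on the next record */
};

/* Store a private copy of line; the caller keeps ownership of line. */
static void history_record(struct history *h, const char *line)
{
	size_t len = strlen(line) + 1;
	char *copy = malloc(len);

	assert(copy != NULL); /* sketch only; real code would report the failure */
	memcpy(copy, line, len);

	free(h->entries[h->next]); /* drop the oldest entry; free(NULL) is a no-op */
	h->entries[h->next] = copy;
	h->next = (h->next + 1) % HIST_SIZE;
}

/* Free every stored copy and reset the ring. */
static void history_release(struct history *h)
{
	size_t i;

	for (i = 0; i < HIST_SIZE; i++) {
		free(h->entries[i]);
		h->entries[i] = NULL;
	}
	h->next = 0;
}
```

Because `history_record()` never keeps the caller's pointer, overwriting or freeing the input buffer afterwards cannot affect the recorded entries, and no entry is leaked when the buffer was never recorded.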

This fixes the leak, and it means that even a single call to
read_next_command() will invalidate any held pointers, making it easier
to find bugs. In fact, we can drop the extra input lines added to the
test case by the previous commit, as the unfixed bug would now trigger
just from reading the commit message, even without any modified files in
the commit.
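
Since a single read now invalidates pointers into the buffer, a caller that needs part of a command after the next read has to copy it first, much as parse_data() in the diff does with `xstrdup(data)`. The following is a hedged sketch of that pattern, not code from fast-import: the fixed-size `command_buf` array, `read_command()` and `dup_after_prefix()` are hypothetical stand-ins for the real strbuf and reader.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char command_buf[128]; /* hypothetical stand-in for command_buf.buf */

/* Pretend to read the next command: overwrite the shared buffer in place. */
static void read_command(const char *line)
{
	assert(strlen(line) < sizeof(command_buf));
	snprintf(command_buf, sizeof(command_buf), "%s", line);
}

/*
 * Return a heap copy of the text following prefix in cmd, or NULL if cmd
 * does not start with prefix (or allocation fails). The caller frees it.
 */
static char *dup_after_prefix(const char *cmd, const char *prefix)
{
	size_t plen = strlen(prefix);
	size_t len;
	char *copy;

	if (strncmp(cmd, prefix, plen) != 0)
		return NULL;
	len = strlen(cmd + plen) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, cmd + plen, len);
	return copy;
}
```

A caller would take the copy before the next read, for example `term = dup_after_prefix(command_buf, "data <<")`, then call `read_command()` freely and `free(term)` when done. A raw pointer into `command_buf` taken at the same point would refer to the new command's bytes after the read.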

Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import.c
t/t9300-fast-import.sh
diff --git a/fast-import.c b/fast-import.c
index ee7258037a15cf04170377f4394a8f8dfe6d2179..1f9160b645030e4dbbf1af39f4e7f6a76214376a 100644
@@ -1763,7 +1763,6 @@ static int read_next_command(void)
                } else {
                        struct recent_command *rc;
 
-                       strbuf_detach(&command_buf, NULL);
                        stdin_eof = strbuf_getline_lf(&command_buf, stdin);
                        if (stdin_eof)
                                return EOF;
@@ -1784,7 +1783,7 @@ static int read_next_command(void)
                                free(rc->buf);
                        }
 
-                       rc->buf = command_buf.buf;
+                       rc->buf = xstrdup(command_buf.buf);
                        rc->prev = cmd_tail;
                        rc->next = cmd_hist.prev;
                        rc->prev->next = rc;
@@ -1833,7 +1832,6 @@ static int parse_data(struct strbuf *sb, uintmax_t limit, uintmax_t *len_res)
                char *term = xstrdup(data);
                size_t term_len = command_buf.len - (data - command_buf.buf);
 
-               strbuf_detach(&command_buf, NULL);
                for (;;) {
                        if (strbuf_getline_lf(&command_buf, stdin) == EOF)
                                die("EOF in data (terminator '%s' not found)", term);
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index cf66b40ebcc509dc17d6b5964ec5bc934a46ba93..141b7fa35e74b860d91ea7cdabf48730442ed635 100755
@@ -3314,11 +3314,6 @@ test_expect_success 'X: handling encoding' '
 
        printf "Pi: \360\nCOMMIT\n" >>input &&
 
-       for i in $(test_seq 100)
-       do
-               echo "M 644 $EMPTY_BLOB file-$i"
-       done >>input &&
-
        git fast-import <input &&
        git cat-file -p encoding | grep $(printf "\360") &&
        git log -1 --format=%B encoding | grep $(printf "\317\200")