From: Junio C Hamano Date: Mon, 25 Feb 2008 01:23:17 +0000 (-0800) Subject: Merge branch 'jc/apply-whitespace' X-Git-Tag: v1.5.5-rc0~156 X-Git-Url: https://git.lorimer.id.au/gitweb.git/diff_plain/e38f892d1832977511c4e7c82204c7f94c3a3232?hp=-c Merge branch 'jc/apply-whitespace' * jc/apply-whitespace: ws_fix_copy(): move the whitespace fixing function to ws.c apply: do not barf on patch with too large an offset core.whitespace: cr-at-eol git-apply --whitespace=fix: fix whitespace fuzz introduced by previous run builtin-apply.c: pass ws_rule down to match_fragment() builtin-apply.c: move copy_wsfix() function a bit higher. builtin-apply.c: do not feed copy_wsfix() leading '+' builtin-apply.c: simplify calling site to apply_line() builtin-apply.c: clean-up apply_one_fragment() builtin-apply.c: mark common context lines in lineinfo structure. builtin-apply.c: optimize match_beginning/end processing a bit. builtin-apply.c: make it more line oriented builtin-apply.c: push match-beginning/end logic down builtin-apply.c: restructure "offset" matching builtin-apply.c: refactor small part that matches context --- e38f892d1832977511c4e7c82204c7f94c3a3232 diff --combined Documentation/config.txt index 7b676710ba,44cb640fb8..fb6dae0cc2 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@@ -139,51 -139,6 +139,51 @@@ core.autocrlf: "text" (i.e. be subjected to the autocrlf mechanism) is decided purely based on the contents. +core.safecrlf:: + If true, makes git check if converting `CRLF` as controlled by + `core.autocrlf` is reversible. Git will verify if a command + modifies a file in the work tree either directly or indirectly. + For example, committing a file followed by checking out the + same file should yield the original file in the work tree. If + this is not the case for the current setting of + `core.autocrlf`, git will reject the file. The variable can + be set to "warn", in which case git will only warn about an + irreversible conversion but continue the operation. ++ +CRLF conversion bears a slight chance of corrupting data. +autocrlf=true will convert CRLF to LF during commit and LF to +CRLF during checkout. A file that contains a mixture of LF and +CRLF before the commit cannot be recreated by git. For text +files this is the right thing to do: it corrects line endings +such that we have only LF line endings in the repository. +But for binary files that are accidentally classified as text the +conversion can corrupt data. ++ +If you recognize such corruption early you can easily fix it by +setting the conversion type explicitly in .gitattributes. Right +after committing you still have the original file in your work +tree and this file is not yet corrupted. You can explicitly tell +git that this file is binary and git will handle the file +appropriately. ++ +Unfortunately, the desired effect of cleaning up text files with +mixed line endings and the undesired effect of corrupting binary +files cannot be distinguished. In both cases CRLFs are removed +in an irreversible way. For text files this is the right thing +to do because CRLFs are line endings, while for binary files +converting CRLFs corrupts data. ++ +Note, this safety check does not mean that a checkout will generate a +file identical to the original file for a different setting of +`core.autocrlf`, but only for the current one. For example, a text +file with `LF` would be accepted with `core.autocrlf=input` and could +later be checked out with `core.autocrlf=true`, in which case the +resulting file would contain `CRLF`, although the original file +contained `LF`. However, in both work trees the line endings would be +consistent, that is either all `LF` or all `CRLF`, but never mixed. A +file with mixed line endings would be reported by the `core.safecrlf` +mechanism. + core.symlinks:: If false, symbolic links are checked out as small plain files that contain the link text. linkgit:git-update-index[1] and @@@ -353,6 -308,10 +353,10 @@@ core.whitespace: error (enabled by default). * `indent-with-non-tab` treats a line that is indented with 8 or more space characters as an error (not enabled by default). + * `cr-at-eol` treats a carriage-return at the end of line as + part of the line terminator, i.e. with it, `trailing-space` + does not trigger if the character before such a carriage-return + is not a whitespace (not enabled by default). alias.*:: Command aliases for the linkgit:git[1] command wrapper - e.g. @@@ -378,7 -337,7 +382,7 @@@ branch.autosetupmerge: so that linkgit:git-pull[1] will appropriately merge from that remote branch. Note that even if this option is not set, this behavior can be chosen per-branch using the `--track` - and `--no-track` options. This option defaults to false. + and `--no-track` options. This option defaults to true. branch..remote:: When in branch , it tells `git fetch` which remote to fetch. @@@ -489,13 -448,6 +493,13 @@@ color.status.: commit.template:: Specify a file to use as the template for new commit messages. +color.ui:: + When set to `always`, always use colors in all git commands which + are capable of colored output. When false (or `never`), never. When + set to `true` or `auto`, use colors only when the output is to the + terminal. When more specific variables of color.* are set, they always + take precedence over this setting. Defaults to false. + diff.autorefreshindex:: When using `git diff` to compare with work tree files, do not consider stat-only change as changed. @@@ -818,12 -770,6 +822,12 @@@ pack.indexVersion: whenever the corresponding pack is larger than 2 GB. Otherwise the default is 1. +pack.packSizeLimit: + The default maximum size of a pack. This setting only affects + packing to a file, i.e. the git:// protocol is unaffected. It + can be overridden by the `\--max-pack-size` option of + linkgit:git-repack[1]. + pull.octopus:: The default merge strategy to use when pulling multiple branches at once. diff --combined builtin-apply.c index 6a88ff018d,64471a27e7..a3f075df4b --- a/builtin-apply.c +++ b/builtin-apply.c @@@ -161,6 -161,84 +161,84 @@@ struct patch struct patch *next; }; + /* + * A line in a file, len-bytes long (includes the terminating LF, + * except for an incomplete line at the end if the file ends with + * one), and its contents hashes to 'hash'. + */ + struct line { + size_t len; + unsigned hash : 24; + unsigned flag : 8; + #define LINE_COMMON 1 + }; + + /* + * This represents a "file", which is an array of "lines". + */ + struct image { + char *buf; + size_t len; + size_t nr; + size_t alloc; + struct line *line_allocated; + struct line *line; + }; + + static uint32_t hash_line(const char *cp, size_t len) + { + size_t i; + uint32_t h; + for (i = 0, h = 0; i < len; i++) { + if (!isspace(cp[i])) { + h = h * 3 + (cp[i] & 0xff); + } + } + return h; + } + + static void add_line_info(struct image *img, const char *bol, size_t len, unsigned flag) + { + ALLOC_GROW(img->line_allocated, img->nr + 1, img->alloc); + img->line_allocated[img->nr].len = len; + img->line_allocated[img->nr].hash = hash_line(bol, len); + img->line_allocated[img->nr].flag = flag; + img->nr++; + } + + static void prepare_image(struct image *image, char *buf, size_t len, + int prepare_linetable) + { + const char *cp, *ep; + + memset(image, 0, sizeof(*image)); + image->buf = buf; + image->len = len; + + if (!prepare_linetable) + return; + + ep = image->buf + image->len; + cp = image->buf; + while (cp < ep) { + const char *next; + for (next = cp; next < ep && *next != '\n'; next++) + ; + if (next < ep) + next++; + add_line_info(image, cp, next - cp, 0); + cp = next; + } + image->line = image->line_allocated; + } + + static void clear_image(struct image *image) + { + free(image->buf); + image->buf = NULL; + image->len = 0; + } + static void say_patch_name(FILE *output, const char *pre, struct patch *patch, const char *post) { @@@ -1430,234 -1508,345 +1508,345 @@@ static int read_old_data(struct stat *s case S_IFREG: if (strbuf_read_file(buf, path, st->st_size) != st->st_size) return error("unable to open or read %s", path); - convert_to_git(path, buf->buf, buf->len, buf); + convert_to_git(path, buf->buf, buf->len, buf, 0); return 0; default: return -1; } } - static int find_offset(const char *buf, unsigned long size, - const char *fragment, unsigned long fragsize, - int line, int *lines) + static void update_pre_post_images(struct image *preimage, + struct image *postimage, + char *buf, + size_t len) { - int i; - unsigned long start, backwards, forwards; + int i, ctx; + char *new, *old, *fixed; + struct image fixed_preimage; - if (fragsize > size) - return -1; + /* + * Update the preimage with whitespace fixes. Note that we + * are not losing preimage->buf -- apply_one_fragment() will + * free "oldlines". + */ + prepare_image(&fixed_preimage, buf, len, 1); + assert(fixed_preimage.nr == preimage->nr); + for (i = 0; i < preimage->nr; i++) + fixed_preimage.line[i].flag = preimage->line[i].flag; + free(preimage->line_allocated); + *preimage = fixed_preimage; - start = 0; - if (line > 1) { - unsigned long offset = 0; - i = line-1; - while (offset + fragsize <= size) { - if (buf[offset++] == '\n') { - start = offset; - if (!--i) - break; - } + /* + * Adjust the common context lines in postimage, in place. + * This is possible because whitespace fixing does not make + * the string grow. + */ + new = old = postimage->buf; + fixed = preimage->buf; + for (i = ctx = 0; i < postimage->nr; i++) { + size_t len = postimage->line[i].len; + if (!(postimage->line[i].flag & LINE_COMMON)) { + /* an added line -- no counterparts in preimage */ + memmove(new, old, len); + old += len; + new += len; + continue; } + + /* a common context -- skip it in the original postimage */ + old += len; + + /* and find the corresponding one in the fixed preimage */ + while (ctx < preimage->nr && + !(preimage->line[ctx].flag & LINE_COMMON)) { + fixed += preimage->line[ctx].len; + ctx++; + } + if (preimage->nr <= ctx) + die("oops"); + + /* and copy it in, while fixing the line length */ + len = preimage->line[ctx].len; + memcpy(new, fixed, len); + new += len; + fixed += len; + postimage->line[i].len = len; + ctx++; + } + + /* Fix the length of the whole thing */ + postimage->len = new - postimage->buf; + } + + static int match_fragment(struct image *img, + struct image *preimage, + struct image *postimage, + unsigned long try, + int try_lno, + unsigned ws_rule, + int match_beginning, int match_end) + { + int i; + char *fixed_buf, *buf, *orig, *target; + + if (preimage->nr + try_lno > img->nr) + return 0; + + if (match_beginning && try_lno) + return 0; + + if (match_end && preimage->nr + try_lno != img->nr) + return 0; + + /* Quick hash check */ + for (i = 0; i < preimage->nr; i++) + if (preimage->line[i].hash != img->line[try_lno + i].hash) + return 0; + + /* + * Do we have an exact match? If we were told to match + * at the end, size must be exactly at try+fragsize, + * otherwise try+fragsize must be still within the preimage, + * and either case, the old piece should match the preimage + * exactly. + */ + if ((match_end + ? (try + preimage->len == img->len) + : (try + preimage->len <= img->len)) && + !memcmp(img->buf + try, preimage->buf, preimage->len)) + return 1; + + if (ws_error_action != correct_ws_error) + return 0; + + /* + * The hunk does not apply byte-by-byte, but the hash says + * it might with whitespace fuzz. + */ + fixed_buf = xmalloc(preimage->len + 1); + buf = fixed_buf; + orig = preimage->buf; + target = img->buf + try; + for (i = 0; i < preimage->nr; i++) { + size_t fixlen; /* length after fixing the preimage */ + size_t oldlen = preimage->line[i].len; + size_t tgtlen = img->line[try_lno + i].len; + size_t tgtfixlen; /* length after fixing the target line */ + char tgtfixbuf[1024], *tgtfix; + int match; + + /* Try fixing the line in the preimage */ + fixlen = ws_fix_copy(buf, orig, oldlen, ws_rule, NULL); + + /* Try fixing the line in the target */ + if (sizeof(tgtfixbuf) < tgtlen) + tgtfix = tgtfixbuf; + else + tgtfix = xmalloc(tgtlen); + tgtfixlen = ws_fix_copy(tgtfix, target, tgtlen, ws_rule, NULL); + + /* + * If they match, either the preimage was based on + * a version before our tree fixed whitespace breakage, + * or we are lacking a whitespace-fix patch the tree + * the preimage was based on already had (i.e. target + * has whitespace breakage, the preimage doesn't). + * In either case, we are fixing the whitespace breakages + * so we might as well take the fix together with their + * real change. + */ + match = (tgtfixlen == fixlen && !memcmp(tgtfix, buf, fixlen)); + + if (tgtfix != tgtfixbuf) + free(tgtfix); + if (!match) + goto unmatch_exit; + + orig += oldlen; + buf += fixlen; + target += tgtlen; } - /* Exact line number? */ - if ((start + fragsize <= size) && - !memcmp(buf + start, fragment, fragsize)) - return start; + /* + * Yes, the preimage is based on an older version that still + * has whitespace breakages unfixed, and fixing them makes the + * hunk match. Update the context lines in the postimage. + */ + update_pre_post_images(preimage, postimage, + fixed_buf, buf - fixed_buf); + return 1; + + unmatch_exit: + free(fixed_buf); + return 0; + } + + static int find_pos(struct image *img, + struct image *preimage, + struct image *postimage, + int line, + unsigned ws_rule, + int match_beginning, int match_end) + { + int i; + unsigned long backwards, forwards, try; + int backwards_lno, forwards_lno, try_lno; + + if (preimage->nr > img->nr) + return -1; + + /* + * If match_begining or match_end is specified, there is no + * point starting from a wrong line that will never match and + * wander around and wait for a match at the specified end. + */ + if (match_beginning) + line = 0; + else if (match_end) + line = img->nr - preimage->nr; + + if (line > img->nr) + line = img->nr; + + try = 0; + for (i = 0; i < line; i++) + try += img->line[i].len; /* * There's probably some smart way to do this, but I'll leave * that to the smart and beautiful people. I'm simple and stupid. */ - backwards = start; - forwards = start; + backwards = try; + backwards_lno = line; + forwards = try; + forwards_lno = line; + try_lno = line; + for (i = 0; ; i++) { - unsigned long try; - int n; + if (match_fragment(img, preimage, postimage, + try, try_lno, ws_rule, + match_beginning, match_end)) + return try_lno; + + again: + if (backwards_lno == 0 && forwards_lno == img->nr) + break; - /* "backward" */ if (i & 1) { - if (!backwards) { - if (forwards + fragsize > size) - break; - continue; + if (backwards_lno == 0) { + i++; + goto again; } - do { - --backwards; - } while (backwards && buf[backwards-1] != '\n'); + backwards_lno--; + backwards -= img->line[backwards_lno].len; try = backwards; + try_lno = backwards_lno; } else { - while (forwards + fragsize <= size) { - if (buf[forwards++] == '\n') - break; + if (forwards_lno == img->nr) { + i++; + goto again; } + forwards += img->line[forwards_lno].len; + forwards_lno++; try = forwards; + try_lno = forwards_lno; } - if (try + fragsize > size) - continue; - if (memcmp(buf + try, fragment, fragsize)) - continue; - n = (i >> 1)+1; - if (i & 1) - n = -n; - *lines = n; - return try; } - - /* - * We should start searching forward and backward. - */ return -1; } - static void remove_first_line(const char **rbuf, int *rsize) + static void remove_first_line(struct image *img) { - const char *buf = *rbuf; - int size = *rsize; - unsigned long offset; - offset = 0; - while (offset <= size) { - if (buf[offset++] == '\n') - break; - } - *rsize = size - offset; - *rbuf = buf + offset; + img->buf += img->line[0].len; + img->len -= img->line[0].len; + img->line++; + img->nr--; } - static void remove_last_line(const char **rbuf, int *rsize) + static void remove_last_line(struct image *img) { - const char *buf = *rbuf; - int size = *rsize; - unsigned long offset; - offset = size - 1; - while (offset > 0) { - if (buf[--offset] == '\n') - break; - } - *rsize = offset + 1; + img->len -= img->line[--img->nr].len; } - static int apply_line(char *output, const char *patch, int plen, - unsigned ws_rule) + static void update_image(struct image *img, + int applied_pos, + struct image *preimage, + struct image *postimage) { /* - * plen is number of bytes to be copied from patch, - * starting at patch+1 (patch[0] is '+'). Typically - * patch[plen] is '\n', unless this is the incomplete - * last line. - */ - int i; - int add_nl_to_tail = 0; - int fixed = 0; - int last_tab_in_indent = 0; - int last_space_in_indent = 0; - int need_fix_leading_space = 0; - char *buf; - - if ((ws_error_action != correct_ws_error) || !whitespace_error || - *patch != '+') { - memcpy(output, patch + 1, plen); - return plen; - } - - /* - * Strip trailing whitespace - */ - if ((ws_rule & WS_TRAILING_SPACE) && - (1 < plen && isspace(patch[plen-1]))) { - if (patch[plen] == '\n') - add_nl_to_tail = 1; - plen--; - while (0 < plen && isspace(patch[plen])) - plen--; - fixed = 1; - } - - /* - * Check leading whitespaces (indent) + * remove the copy of preimage at offset in img + * and replace it with postimage */ - for (i = 1; i < plen; i++) { - char ch = patch[i]; - if (ch == '\t') { - last_tab_in_indent = i; - if ((ws_rule & WS_SPACE_BEFORE_TAB) && - 0 < last_space_in_indent) - need_fix_leading_space = 1; - } else if (ch == ' ') { - last_space_in_indent = i; - if ((ws_rule & WS_INDENT_WITH_NON_TAB) && - 8 <= i - last_tab_in_indent) - need_fix_leading_space = 1; - } - else - break; - } - - buf = output; - if (need_fix_leading_space) { - int consecutive_spaces = 0; - int last = last_tab_in_indent + 1; - - if (ws_rule & WS_INDENT_WITH_NON_TAB) { - /* have "last" point at one past the indent */ - if (last_tab_in_indent < last_space_in_indent) - last = last_space_in_indent + 1; - else - last = last_tab_in_indent + 1; - } + int i, nr; + size_t remove_count, insert_count, applied_at = 0; + char *result; + for (i = 0; i < applied_pos; i++) + applied_at += img->line[i].len; + + remove_count = 0; + for (i = 0; i < preimage->nr; i++) + remove_count += img->line[applied_pos + i].len; + insert_count = postimage->len; + + /* Adjust the contents */ + result = xmalloc(img->len + insert_count - remove_count + 1); + memcpy(result, img->buf, applied_at); + memcpy(result + applied_at, postimage->buf, postimage->len); + memcpy(result + applied_at + postimage->len, + img->buf + (applied_at + remove_count), + img->len - (applied_at + remove_count)); + free(img->buf); + img->buf = result; + img->len += insert_count - remove_count; + result[img->len] = '\0'; + + /* Adjust the line table */ + nr = img->nr + postimage->nr - preimage->nr; + if (preimage->nr < postimage->nr) { /* - * between patch[1..last], strip the funny spaces, - * updating them to tab as needed. + * NOTE: this knows that we never call remove_first_line() + * on anything other than pre/post image. */ - for (i = 1; i < last; i++, plen--) { - char ch = patch[i]; - if (ch != ' ') { - consecutive_spaces = 0; - *output++ = ch; - } else { - consecutive_spaces++; - if (consecutive_spaces == 8) { - *output++ = '\t'; - consecutive_spaces = 0; - } - } - } - while (0 < consecutive_spaces--) - *output++ = ' '; - fixed = 1; - i = last; + img->line = xrealloc(img->line, nr * sizeof(*img->line)); + img->line_allocated = img->line; } - else - i = 1; - - memcpy(output, patch + i, plen); - if (add_nl_to_tail) - output[plen++] = '\n'; - if (fixed) - applied_after_fixing_ws++; - return output + plen - buf; + if (preimage->nr != postimage->nr) + memmove(img->line + applied_pos + postimage->nr, + img->line + applied_pos + preimage->nr, + (img->nr - (applied_pos + preimage->nr)) * + sizeof(*img->line)); + memcpy(img->line + applied_pos, + postimage->line, + postimage->nr * sizeof(*img->line)); + img->nr = nr; } - static int apply_one_fragment(struct strbuf *buf, struct fragment *frag, + static int apply_one_fragment(struct image *img, struct fragment *frag, int inaccurate_eof, unsigned ws_rule) { int match_beginning, match_end; const char *patch = frag->patch; - int offset, size = frag->size; - char *old = xmalloc(size); - char *new = xmalloc(size); - const char *oldlines, *newlines; - int oldsize = 0, newsize = 0; + int size = frag->size; + char *old, *new, *oldlines, *newlines; int new_blank_lines_at_end = 0; unsigned long leading, trailing; - int pos, lines; + int pos, applied_pos; + struct image preimage; + struct image postimage; + memset(&preimage, 0, sizeof(preimage)); + memset(&postimage, 0, sizeof(postimage)); + oldlines = xmalloc(size); + newlines = xmalloc(size); + + old = oldlines; + new = newlines; while (size > 0) { char first; int len = linelen(patch, size); - int plen; + int plen, added; int added_blank_line = 0; if (!len) @@@ -1670,7 -1859,7 +1859,7 @@@ * followed by "\ No newline", then we also remove the * last one (which is the newline, of course). */ - plen = len-1; + plen = len - 1; if (len < size && patch[len] == '\\') plen--; first = *patch; @@@ -1687,25 -1876,40 +1876,40 @@@ if (plen < 0) /* ... followed by '\No newline'; nothing */ break; - old[oldsize++] = '\n'; - new[newsize++] = '\n'; + *old++ = '\n'; + *new++ = '\n'; + add_line_info(&preimage, "\n", 1, LINE_COMMON); + add_line_info(&postimage, "\n", 1, LINE_COMMON); break; case ' ': case '-': - memcpy(old + oldsize, patch + 1, plen); - oldsize += plen; + memcpy(old, patch + 1, plen); + add_line_info(&preimage, old, plen, + (first == ' ' ? LINE_COMMON : 0)); + old += plen; if (first == '-') break; /* Fall-through for ' ' */ case '+': - if (first != '+' || !no_add) { - int added = apply_line(new + newsize, patch, - plen, ws_rule); - newsize += added; - if (first == '+' && - added == 1 && new[newsize-1] == '\n') - added_blank_line = 1; + /* --no-add does not add new lines */ + if (first == '+' && no_add) + break; + + if (first != '+' || + !whitespace_error || + ws_error_action != correct_ws_error) { + memcpy(new, patch + 1, plen); + added = plen; + } + else { + added = ws_fix_copy(new, patch + 1, plen, ws_rule, &applied_after_fixing_ws); } + add_line_info(&postimage, new, added, + (first == '+' ? 0 : LINE_COMMON)); + new += added; + if (first == '+' && + added == 1 && new[-1] == '\n') + added_blank_line = 1; break; case '@': case '\\': /* Ignore it, we already handled it */ @@@ -1722,16 -1926,13 +1926,13 @@@ patch += len; size -= len; } - if (inaccurate_eof && - oldsize > 0 && old[oldsize - 1] == '\n' && - newsize > 0 && new[newsize - 1] == '\n') { - oldsize--; - newsize--; + old > oldlines && old[-1] == '\n' && + new > newlines && new[-1] == '\n') { + old--; + new--; } - oldlines = old; - newlines = new; leading = frag->leading; trailing = frag->trailing; @@@ -1752,33 -1953,21 +1953,21 @@@ match_end = !trailing; } - lines = 0; - pos = frag->newpos; + pos = frag->newpos ? (frag->newpos - 1) : 0; + preimage.buf = oldlines; + preimage.len = old - oldlines; + postimage.buf = newlines; + postimage.len = new - newlines; + preimage.line = preimage.line_allocated; + postimage.line = postimage.line_allocated; + for (;;) { - offset = find_offset(buf->buf, buf->len, - oldlines, oldsize, pos, &lines); - if (match_end && offset + oldsize != buf->len) - offset = -1; - if (match_beginning && offset) - offset = -1; - if (offset >= 0) { - if (ws_error_action == correct_ws_error && - (buf->len - oldsize - offset == 0)) /* end of file? */ - newsize -= new_blank_lines_at_end; - - /* Warn if it was necessary to reduce the number - * of context lines. - */ - if ((leading != frag->leading) || - (trailing != frag->trailing)) - fprintf(stderr, "Context reduced to (%ld/%ld)" - " to apply fragment at %d\n", - leading, trailing, pos + lines); - - strbuf_splice(buf, offset, oldsize, newlines, newsize); - offset = 0; + + applied_pos = find_pos(img, &preimage, &postimage, pos, + ws_rule, match_beginning, match_end); + + if (applied_pos >= 0) break; - } /* Am I at my context limits? */ if ((leading <= p_context) && (trailing <= p_context)) @@@ -1787,33 -1976,64 +1976,64 @@@ match_beginning = match_end = 0; continue; } + /* * Reduce the number of context lines; reduce both * leading and trailing if they are equal otherwise * just reduce the larger context. */ if (leading >= trailing) { - remove_first_line(&oldlines, &oldsize); - remove_first_line(&newlines, &newsize); + remove_first_line(&preimage); + remove_first_line(&postimage); pos--; leading--; } if (trailing > leading) { - remove_last_line(&oldlines, &oldsize); - remove_last_line(&newlines, &newsize); + remove_last_line(&preimage); + remove_last_line(&postimage); trailing--; } } - if (offset && apply_verbosely) - error("while searching for:\n%.*s", oldsize, oldlines); + if (applied_pos >= 0) { + if (ws_error_action == correct_ws_error && + new_blank_lines_at_end && + postimage.nr + applied_pos == img->nr) { + /* + * If the patch application adds blank lines + * at the end, and if the patch applies at the + * end of the image, remove those added blank + * lines. + */ + while (new_blank_lines_at_end--) + remove_last_line(&postimage); + } - free(old); - free(new); - return offset; + /* + * Warn if it was necessary to reduce the number + * of context lines. + */ + if ((leading != frag->leading) || + (trailing != frag->trailing)) + fprintf(stderr, "Context reduced to (%ld/%ld)" + " to apply fragment at %d\n", + leading, trailing, applied_pos+1); + update_image(img, applied_pos, &preimage, &postimage); + } else { + if (apply_verbosely) + error("while searching for:\n%.*s", + (int)(old - oldlines), oldlines); + } + + free(oldlines); + free(newlines); + free(preimage.line_allocated); + free(postimage.line_allocated); + + return (applied_pos < 0); } - static int apply_binary_fragment(struct strbuf *buf, struct patch *patch) + static int apply_binary_fragment(struct image *img, struct patch *patch) { struct fragment *fragment = patch->fragments; unsigned long len; @@@ -1830,22 -2050,26 +2050,26 @@@ } switch (fragment->binary_patch_method) { case BINARY_DELTA_DEFLATED: - dst = patch_delta(buf->buf, buf->len, fragment->patch, + dst = patch_delta(img->buf, img->len, fragment->patch, fragment->size, &len); if (!dst) return -1; - /* XXX patch_delta NUL-terminates */ - strbuf_attach(buf, dst, len, len + 1); + clear_image(img); + img->buf = dst; + img->len = len; return 0; case BINARY_LITERAL_DEFLATED: - strbuf_reset(buf); - strbuf_add(buf, fragment->patch, fragment->size); + clear_image(img); + img->len = fragment->size; + img->buf = xmalloc(img->len+1); + memcpy(img->buf, fragment->patch, img->len); + img->buf[img->len] = '\0'; return 0; } return -1; } - static int apply_binary(struct strbuf *buf, struct patch *patch) + static int apply_binary(struct image *img, struct patch *patch) { const char *name = patch->old_name ? patch->old_name : patch->new_name; unsigned char sha1[20]; @@@ -1866,7 -2090,7 +2090,7 @@@ * See if the old one matches what the patch * applies to. */ - hash_sha1_file(buf->buf, buf->len, blob_type, sha1); + hash_sha1_file(img->buf, img->len, blob_type, sha1); if (strcmp(sha1_to_hex(sha1), patch->old_sha1_prefix)) return error("the patch applies to '%s' (%s), " "which does not match the " @@@ -1875,14 -2099,14 +2099,14 @@@ } else { /* Otherwise, the old one must be empty. */ - if (buf->len) + if (img->len) return error("the patch applies to an empty " "'%s' but it is not empty", name); } get_sha1_hex(patch->new_sha1_prefix, sha1); if (is_null_sha1(sha1)) { - strbuf_release(buf); + clear_image(img); return 0; /* deletion patch */ } @@@ -1897,20 -2121,21 +2121,21 @@@ return error("the necessary postimage %s for " "'%s' cannot be read", patch->new_sha1_prefix, name); - /* XXX read_sha1_file NUL-terminates */ - strbuf_attach(buf, result, size, size + 1); + clear_image(img); + img->buf = result; + img->len = size; } else { /* * We have verified buf matches the preimage; * apply the patch data to it, which is stored * in the patch->fragments->{patch,size}. */ - if (apply_binary_fragment(buf, patch)) + if (apply_binary_fragment(img, patch)) return error("binary patch does not apply to '%s'", name); /* verify that the result matches */ - hash_sha1_file(buf->buf, buf->len, blob_type, sha1); + hash_sha1_file(img->buf, img->len, blob_type, sha1); if (strcmp(sha1_to_hex(sha1), patch->new_sha1_prefix)) return error("binary patch to '%s' creates incorrect result (expecting %s, got %s)", name, patch->new_sha1_prefix, sha1_to_hex(sha1)); @@@ -1919,7 -2144,7 +2144,7 @@@ return 0; } - static int apply_fragments(struct strbuf *buf, struct patch *patch) + static int apply_fragments(struct image *img, struct patch *patch) { struct fragment *frag = patch->fragments; const char *name = patch->old_name ? patch->old_name : patch->new_name; @@@ -1927,10 -2152,10 +2152,10 @@@ unsigned inaccurate_eof = patch->inaccurate_eof; if (patch->is_binary) - return apply_binary(buf, patch); + return apply_binary(img, patch); while (frag) { - if (apply_one_fragment(buf, frag, inaccurate_eof, ws_rule)) { + if (apply_one_fragment(img, frag, inaccurate_eof, ws_rule)) { error("patch failed: %s:%ld", name, frag->oldpos); if (!apply_with_reject) return -1; @@@ -1946,7 -2171,7 +2171,7 @@@ static int read_file_or_gitlink(struct if (!ce) return 0; - if (S_ISGITLINK(ntohl(ce->ce_mode))) { + if (S_ISGITLINK(ce->ce_mode)) { strbuf_grow(buf, 100); strbuf_addf(buf, "Subproject commit %s\n", sha1_to_hex(ce->sha1)); } else { @@@ -1966,6 -2191,9 +2191,9 @@@ static int apply_data(struct patch *patch, struct stat *st, struct cache_entry *ce) { struct strbuf buf; + struct image image; + size_t len; + char *img; strbuf_init(&buf, 0); if (cached) { @@@ -1988,9 -2216,14 +2216,14 @@@ } } - if (apply_fragments(&buf, patch) < 0) + img = strbuf_detach(&buf, &len); + prepare_image(&image, img, len, !patch->is_binary); + + if (apply_fragments(&image, patch) < 0) return -1; /* note with --reject this succeeds. */ - patch->result = strbuf_detach(&buf, &patch->resultsize); + patch->result = image.buf; + patch->resultsize = image.len; + free(image.line_allocated); if (0 < patch->is_delete && patch->resultsize) return error("removal patch leaves file contents"); @@@ -2023,7 -2256,7 +2256,7 @@@ static int check_to_create_blob(const c static int verify_index_match(struct cache_entry *ce, struct stat *st) { - if (S_ISGITLINK(ntohl(ce->ce_mode))) { + if (S_ISGITLINK(ce->ce_mode)) { if (!S_ISDIR(st->st_mode)) return -1; return 0; @@@ -2082,12 -2315,12 +2315,12 @@@ static int check_patch(struct patch *pa return error("%s: does not match index", old_name); if (cached) - st_mode = ntohl(ce->ce_mode); + st_mode = ce->ce_mode; } else if (stat_ret < 0) return error("%s: %s", old_name, strerror(errno)); if (!cached) - st_mode = ntohl(ce_mode_from_stat(ce, st.st_mode)); + st_mode = ce_mode_from_stat(ce, st.st_mode); if (patch->is_new < 0) patch->is_new = 0; @@@ -2388,7 -2621,7 +2621,7 @@@ static void add_index_file(const char * ce = xcalloc(1, ce_size); memcpy(ce->name, path, namelen); ce->ce_mode = create_ce_mode(mode); - ce->ce_flags = htons(namelen); + ce->ce_flags = namelen; if (S_ISGITLINK(mode)) { const char *s = buf; @@@ -2746,8 -2979,6 +2979,8 @@@ static int apply_patch(int fd, const ch static int git_apply_config(const char *var, const char *value) { if (!strcmp(var, "apply.whitespace")) { + if (!value) + return config_error_nonbool(var); apply_default_whitespace = xstrdup(value); return 0; } diff --combined cache.h index fa5a9e523e,3d4c6b078c..4fa69f0ee4 --- a/cache.h +++ b/cache.h @@@ -3,7 -3,6 +3,7 @@@ #include "git-compat-util.h" #include "strbuf.h" +#include "hash.h" #include SHA1_HEADER #include @@@ -95,148 -94,66 +95,148 @@@ struct cache_time * We save the fields in big-endian order to allow using the * index file over NFS transparently. */ +struct ondisk_cache_entry { + struct cache_time ctime; + struct cache_time mtime; + unsigned int dev; + unsigned int ino; + unsigned int mode; + unsigned int uid; + unsigned int gid; + unsigned int size; + unsigned char sha1[20]; + unsigned short flags; + char name[FLEX_ARRAY]; /* more */ +}; + struct cache_entry { - struct cache_time ce_ctime; - struct cache_time ce_mtime; + unsigned int ce_ctime; + unsigned int ce_mtime; unsigned int ce_dev; unsigned int ce_ino; unsigned int ce_mode; unsigned int ce_uid; unsigned int ce_gid; unsigned int ce_size; + unsigned int ce_flags; unsigned char sha1[20]; - unsigned short ce_flags; + struct cache_entry *next; char name[FLEX_ARRAY]; /* more */ }; #define CE_NAMEMASK (0x0fff) #define CE_STAGEMASK (0x3000) -#define CE_UPDATE (0x4000) #define CE_VALID (0x8000) #define CE_STAGESHIFT 12 -#define create_ce_flags(len, stage) htons((len) | ((stage) << CE_STAGESHIFT)) -#define ce_namelen(ce) (CE_NAMEMASK & ntohs((ce)->ce_flags)) +/* In-memory only */ +#define CE_UPDATE (0x10000) +#define CE_REMOVE (0x20000) +#define CE_UPTODATE (0x40000) + +#define CE_HASHED (0x100000) +#define CE_UNHASHED (0x200000) + +/* + * Copy the sha1 and stat state of a cache entry from one to + * another. But we never change the name, or the hash state! + */ +#define CE_STATE_MASK (CE_HASHED | CE_UNHASHED) +static inline void copy_cache_entry(struct cache_entry *dst, struct cache_entry *src) +{ + unsigned int state = dst->ce_flags & CE_STATE_MASK; + + /* Don't copy hash chain and name */ + memcpy(dst, src, offsetof(struct cache_entry, next)); + + /* Restore the hash state */ + dst->ce_flags = (dst->ce_flags & ~CE_STATE_MASK) | state; +} + +/* + * We don't actually *remove* it, we can just mark it invalid so that + * we won't find it in lookups. + * + * Not only would we have to search the lists (simple enough), but + * we'd also have to rehash other hash buckets in case this makes the + * hash bucket empty (common). So it's much better to just mark + * it. + */ +static inline void remove_index_entry(struct cache_entry *ce) +{ + ce->ce_flags |= CE_UNHASHED; +} + +static inline unsigned create_ce_flags(size_t len, unsigned stage) +{ + if (len >= CE_NAMEMASK) + len = CE_NAMEMASK; + return (len | (stage << CE_STAGESHIFT)); +} + +static inline size_t ce_namelen(const struct cache_entry *ce) +{ + size_t len = ce->ce_flags & CE_NAMEMASK; + if (len < CE_NAMEMASK) + return len; + return strlen(ce->name + CE_NAMEMASK) + CE_NAMEMASK; +} + #define ce_size(ce) cache_entry_size(ce_namelen(ce)) -#define ce_stage(ce) ((CE_STAGEMASK & ntohs((ce)->ce_flags)) >> CE_STAGESHIFT) +#define ondisk_ce_size(ce) ondisk_cache_entry_size(ce_namelen(ce)) +#define ce_stage(ce) ((CE_STAGEMASK & (ce)->ce_flags) >> CE_STAGESHIFT) +#define ce_uptodate(ce) ((ce)->ce_flags & CE_UPTODATE) +#define ce_mark_uptodate(ce) ((ce)->ce_flags |= CE_UPTODATE) #define ce_permissions(mode) (((mode) & 0100) ? 0755 : 0644) static inline unsigned int create_ce_mode(unsigned int mode) { if (S_ISLNK(mode)) - return htonl(S_IFLNK); + return S_IFLNK; if (S_ISDIR(mode) || S_ISGITLINK(mode)) - return htonl(S_IFGITLINK); - return htonl(S_IFREG | ce_permissions(mode)); + return S_IFGITLINK; + return S_IFREG | ce_permissions(mode); } static inline unsigned int ce_mode_from_stat(struct cache_entry *ce, unsigned int mode) { extern int trust_executable_bit, has_symlinks; if (!has_symlinks && S_ISREG(mode) && - ce && S_ISLNK(ntohl(ce->ce_mode))) + ce && S_ISLNK(ce->ce_mode)) return ce->ce_mode; if (!trust_executable_bit && S_ISREG(mode)) { - if (ce && S_ISREG(ntohl(ce->ce_mode))) + if (ce && S_ISREG(ce->ce_mode)) return ce->ce_mode; return create_ce_mode(0666); } return create_ce_mode(mode); } +static inline int ce_to_dtype(const struct cache_entry *ce) +{ + unsigned ce_mode = ntohl(ce->ce_mode); + if (S_ISREG(ce_mode)) + return DT_REG; + else if (S_ISDIR(ce_mode) || S_ISGITLINK(ce_mode)) + return DT_DIR; + else if (S_ISLNK(ce_mode)) + return DT_LNK; + else + return DT_UNKNOWN; +} #define canon_mode(mode) \ (S_ISREG(mode) ? (S_IFREG | ce_permissions(mode)) : \ S_ISLNK(mode) ? S_IFLNK : S_ISDIR(mode) ? S_IFDIR : S_IFGITLINK) #define cache_entry_size(len) ((offsetof(struct cache_entry,name) + (len) + 8) & ~7) +#define ondisk_cache_entry_size(len) ((offsetof(struct ondisk_cache_entry,name) + (len) + 8) & ~7) struct index_state { struct cache_entry **cache; unsigned int cache_nr, cache_alloc, cache_changed; struct cache_tree *cache_tree; time_t timestamp; - void *mmap; - size_t mmap_size; + void *alloc; + unsigned name_hash_initialized : 1; + struct hash_table name_hash; }; extern struct index_state the_index; @@@ -260,7 -177,6 +260,7 @@@ #define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, NULL) #define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), (st), (options)) #define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), (options)) +#define cache_name_exists(name, namelen) index_name_exists(&the_index, (name), (namelen)) #endif enum object_type { @@@ -347,7 -263,6 +347,7 @@@ extern int read_index_from(struct index extern int write_index(struct index_state *, int newfd); extern int discard_index(struct index_state *); extern int verify_path(const char *path); +extern int index_name_exists(struct index_state *istate, const char *name, int namelen); extern int index_name_pos(struct index_state *, const char *name, int namelen); #define ADD_CACHE_OK_TO_ADD 1 /* Ok to add */ #define ADD_CACHE_OK_TO_REPLACE 2 /* Ok to replace file/directory */ @@@ -415,14 -330,6 +415,14 @@@ extern size_t packed_git_limit extern size_t delta_base_cache_limit; extern int auto_crlf; +enum safe_crlf { + SAFE_CRLF_FALSE = 0, + SAFE_CRLF_FAIL = 1, + SAFE_CRLF_WARN = 2, +}; + +extern enum safe_crlf safe_crlf; + #define GIT_REPO_VERSION 0 extern int repository_format_version; extern int check_repository_format(void); @@@ -677,16 -584,11 +677,16 @@@ extern int git_parse_ulong(const char * extern int git_config_int(const char *, const char *); extern unsigned long git_config_ulong(const char *, const char *); extern int git_config_bool(const char *, const char *); +extern int git_config_string(const char **, const char *, const char *); extern int git_config_set(const char *, const char *); extern int git_config_set_multivar(const char *, const char *, const char *, int); extern int git_config_rename_section(const char *, const char *); extern const char *git_etc_gitconfig(void); extern int check_repository_format_version(const char *var, const char *value); +extern int git_env_bool(const char *, int); +extern int git_config_system(void); +extern int git_config_global(void); +extern int config_error_nonbool(const char *); #define MAX_GITNAME (1000) extern char git_default_email[MAX_GITNAME]; @@@ -706,12 -608,12 +706,12 @@@ extern int write_or_whine_pipe(int fd, /* pager.c */ extern void setup_pager(void); -extern char *pager_program; +extern const char *pager_program; extern int pager_in_use(void); extern int pager_use_color; -extern char *editor_program; -extern char *excludes_file; +extern const char *editor_program; +extern const char *excludes_file; /* base85 */ int decode_85(char *dst, const char *line, int linelen); @@@ -731,8 -633,7 +731,8 @@@ extern void trace_argv_printf(const cha /* convert.c */ /* returns 1 if *dst was used */ -extern int convert_to_git(const char *path, const char *src, size_t len, struct strbuf *dst); +extern int convert_to_git(const char *path, const char *src, size_t len, + struct strbuf *dst, enum safe_crlf checksafe); extern int convert_to_working_tree(const char *path, const char *src, size_t len, struct strbuf *dst); /* add */ @@@ -751,6 -652,7 +751,7 @@@ void shift_tree(const unsigned char *, #define WS_TRAILING_SPACE 01 #define WS_SPACE_BEFORE_TAB 02 #define WS_INDENT_WITH_NON_TAB 04 + #define WS_CR_AT_EOL 010 #define WS_DEFAULT_RULE (WS_TRAILING_SPACE|WS_SPACE_BEFORE_TAB) extern unsigned whitespace_rule_cfg; extern unsigned whitespace_rule(const char *); @@@ -759,6 -661,7 +760,7 @@@ extern unsigned check_and_emit_line(con FILE *stream, const char *set, const char *reset, const char *ws); extern char *whitespace_error_string(unsigned ws); + extern int ws_fix_copy(char *, const char *, int, unsigned, int *); /* ls-files */ int pathspec_match(const char **spec, char *matched, const char *filename, int skiplen);