gitweb.git
merge-recursive: improve rename/rename(1to2)/add[/add... Elijah Newren Thu, 8 Nov 2018 04:40:29 +0000 (20:40 -0800)

merge-recursive: improve rename/rename(1to2)/add[/add] handling

When we have a rename/rename(1to2) conflict, each of the renames can
collide with a file addition. Each of these rename/add conflicts suffered
from the same kinds of problems that normal rename/add suffered from.
Make the code use handle_file_conflicts() as well so that we get all the
same fixes and consistent behavior between the different conflict types.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-recursive: use handle_file_collision for add... Elijah Newren Thu, 8 Nov 2018 04:40:28 +0000 (20:40 -0800)

merge-recursive: use handle_file_collision for add/add conflicts

This results in no-net change of behavior, it simply ensures that all
file-collision conflict handling types are being handled the same by
calling the same function.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-recursive: improve handling for rename/rename... Elijah Newren Thu, 8 Nov 2018 04:40:27 +0000 (20:40 -0800)

merge-recursive: improve handling for rename/rename(2to1) conflicts

This makes the rename/rename(2to1) conflicts use the new
handle_file_collision() function. Since that function was based
originally on the rename/rename(2to1) handling code, the main
differences here are in what was added. In particular:

* Instead of storing files at collide_path~HEAD and collide_path~MERGE,
the files are two-way merged and recorded at collide_path.

* Instead of recording the version of the renamed file that existed
on the renamed side in the index (thus ignoring any changes that
were made to the file on the side of history without the rename),
we do a three-way content merge on the renamed path, then store
that at either stage 2 or stage 3.

* Note that since the content merge for each rename may have conflicts,
and then we have to merge the two renamed files, we can end up with
nested conflict markers.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-recursive: fix rename/add conflict handlingElijah Newren Thu, 8 Nov 2018 04:40:26 +0000 (20:40 -0800)

merge-recursive: fix rename/add conflict handling

This makes the rename/add conflict handling make use of the new
handle_file_collision() function, which fixes several bugs and improves
things for the rename/add case significantly. Previously, rename/add
would:

* Not leave any higher order stage entries in the index, making it
appear as if there were no conflict.
* Would place the rename file at the colliding path, and move the
added file elsewhere, which combined with the lack of higher order
stage entries felt really odd. It's not clear to me why the
rename should take precedence over the add; if one should be moved
out of the way, they both probably should.
* In the recursive case, it would do a two way merge of the added
file and the version of the renamed file on the renamed side,
completely excluding modifications to the renamed file on the
unrenamed side of history.

Use the new handle_file_collision() to fix all of these issues.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-recursive: new function for better colliding... Elijah Newren Thu, 8 Nov 2018 04:40:25 +0000 (20:40 -0800)

merge-recursive: new function for better colliding conflict resolutions

There are three conflict types that represent two (possibly entirely
unrelated) files colliding at the same location:
* add/add
* rename/add
* rename/rename(2to1)

These three conflict types already share more similarity than might be
immediately apparent from their description: (1) the handling of the
rename variants already involves removing any entries from the index
corresponding to the original file names[*], thus only leaving entries
in the index for the colliding path; (2) likewise, any trace of the
original file name in the working tree is also removed. So, in all
three cases we're left with how to represent two colliding files in both
the index and the working copy.

[*] Technically, this isn't quite true because rename/rename(2to1)
conflicts in the recursive (o->call_depth > 0) case do an "unrename"
since about seven years ago. But even in that case, Junio felt
compelled to explain that my decision to "unrename" wasn't necessarily
the only or right answer -- search for "Comment from Junio" in t6036 for
details.

My initial motivation for looking at these three conflict types was that
if the handling of these three conflict types is the same, at least in
the limited set of cases where a renamed file is unmodified on the side
of history where the file is not renamed, then a significant performance
improvement for rename detection during merges is possible. However,
while that served as motivation to look at these three types of
conflicts, the actual goal of this new function is to try to improve the
handling for all three cases, not to merely make them the same as each
other in that special circumstance.

=== Handling the working tree ===

The previous behavior for these conflict types in regards to the
working tree (assuming the file collision occurs at 'foo') was:
* add/add does a two-way merge of the two files and records it as 'foo'.
* rename/rename(2to1) records the two different files into two new
uniquely named files (foo~HEAD and foo~$MERGE), while removing 'foo'
from the working tree.
* rename/add records the two different files into two different
locations, recording the add at foo~$SIDE and, oddly, recording
the rename at foo (why is the rename more important than the add?)

So, the question for what to write to the working tree boils down to
whether the two colliding files should be two-way merged and recorded in
place, or recorded into separate files. As per discussion on the git
mailing lit, two-way merging was deemed to always be preferred, as that
makes these cases all more like content conflicts that users can handle
from within their favorite editor, IDE, or merge tool. Note that since
renames already involve a content merge, rename/add and
rename/rename(2to1) conflicts could result in nested conflict markers.

=== Handling of the index ===

For a typical rename, unpack_trees() would set up the index in the
following fashion:
old_path new_path
stage1: 5ca1ab1e 00000000
stage2: f005ba11 00000000
stage3: 00000000 b0a710ad
And merge-recursive would rewrite this to
new_path
stage1: 5ca1ab1e
stage2: f005ba11
stage3: b0a710ad
Removing old_path from the index means the user won't have to `git rm
old_path` manually every time a renamed path has a content conflict.
It also means they can use `git checkout [--ours|--theirs|--conflict|-m]
new_path`, `git diff [--ours|--theirs]` and various other commands that
would be difficult otherwise.

This strategy becomes a problem when we have a rename/add or
rename/rename(2to1) conflict, however, because then we have only three
slots to store blob sha1s and we need either four or six. Previously,
this was handled by continuing to delete old_path from the index, and
just outright ignoring any blob shas from old_path. That had the
downside of deleting any trace of changes made to old_path on the other
side of history. This function instead does a three-way content merge of
the renamed file, and stores the blob sha1 for that at either stage2 or
stage3 for new_path (depending on which side the rename came from). That
has the advantage of bringing information about changes on both sides and
still allows for easy resolution (no need to git rm old_path, etc.), but
does have the downside that if the content merge had conflict markers,
then what we store in the index is the sha1 of a blob with conflict
markers. While that is a downside, it seems less problematic than the
downsides of any obvious alternatives, and certainly makes more sense
than the previous handling. Further, it has a precedent in that when we
do recursive merges, we may accept a file with conflict markers as the
resolution for the merge of the merge-bases, which will then show up in
the index of the outer merge at stage 1 if a conflict exists at the outer
level.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-recursive: increase marker length with depth... Elijah Newren Thu, 8 Nov 2018 04:40:24 +0000 (20:40 -0800)

merge-recursive: increase marker length with depth of recursion

Later patches in this series will modify file collision conflict
handling (e.g. from rename/add and rename/rename(2to1) conflicts) so
that multiply nested conflict markers can arise even before considering
conflicts in the virtual merge base. Including the virtual merge base
will provide a way to get triply (or higher) nested conflict markers.
This new way to get nested conflict markers will force the need for a
more general mechanism to extend the length of conflict markers in order
to differentiate between different nestings.

Along with this change to conflict marker length handling, we want to
make sure that we don't regress handling for other types of conflicts
with nested conflict markers. Add a more involved testcase using
merge.conflictstyle=diff3, where not only does the virtual merge base
contain conflicts, but its virtual merge base does as well (i.e. a case
with triply nested conflict markers). While there are multiple
reasonable ways to handle nested conflict markers in the virtual merge
base for this type of situation, the easiest approach that dovetails
well with the new needs for the file collision conflict handling is to
require that the length of the conflict markers increase with each
subsequent nesting.

Subsequent patches which change the rename/add and rename/rename(2to1)
conflict handling will modify the extra_marker_size flag appropriately
for their new needs.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t6036, t6042: testcases for rename collision of already... Elijah Newren Thu, 8 Nov 2018 04:40:23 +0000 (20:40 -0800)

t6036, t6042: testcases for rename collision of already conflicting files

When a single file is renamed, it can also be modified, yielding the
possibility of that renamed file having content conflicts. If two
different such files are renamed into the same location, then two-way
merging those files may result in nested conflicts. Add a testcase that
makes sure we get this case correct, and uses different lengths of
conflict markers to differentiate between the different nestings.

Also add another case with an extra (i.e. third) level of conflict
markers due to using merge.conflictstyle=diff3 and the virtual merge
base also having conflicts present.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t6042: add tests for consistency in file collision... Elijah Newren Thu, 8 Nov 2018 04:40:22 +0000 (20:40 -0800)

t6042: add tests for consistency in file collision conflict handling

Add testcases dealing with file collisions for the following types of
conflicts:
* add/add
* rename/add
* rename/rename(2to1)

All these conflict types simplify down to two files "colliding"
and should thus be handled similarly. This means that rename/add and
rename/rename(2to1) conflicts need to be modified to behave the same as
add/add conflicts currently do: the colliding files should be two-way
merged (instead of the current behavior of writing the two colliding
files out to separate temporary unique pathnames). Add testcases which
check this; subsequent commits will fix the conflict handling to make
these tests pass.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-recursive: avoid showing conflicts with merge... Elijah Newren Tue, 16 Oct 2018 20:19:48 +0000 (13:19 -0700)

merge-recursive: avoid showing conflicts with merge branch before HEAD

We want to load unmerged entries from HEAD into the index at stage 2 and
from MERGE_HEAD into stage 3. Similarly, folks expect merge conflicts
to look like

<<<<<<<< HEAD
content from our side
========
content from their side
>>>>>>>> MERGE_HEAD

not

<<<<<<<< MERGE_HEAD
content from their side
========
content from our side
>>>>>>>> HEAD

The correct order usually comes naturally and for free, but with renames
we often have data in the form {rename_branch, other_branch}, and
working relative to the rename first (e.g. for rename/add) is more
convenient elsewhere in the code. Address the slight impedance
mismatch by having some functions re-call themselves with flipped
arguments when the branch order is reversed.

Note that setup_rename_conflict_info() has one asymmetry in it, in
setting dst_entry1->processed=0 but not doing similarly for
dst_entry2->processed. When dealing with rename/rename and similar
conflicts, we do not want the processing to happen twice, so the
desire to only set one of the entries to unprocessed is intentional.
So, while this change modifies which branch's entry will be marked as
unprocessed, that dovetails nicely with putting HEAD first so that we
get the index stage entries and conflict markers in the right order.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-recursive: improve auto-merging messages with... Elijah Newren Tue, 16 Oct 2018 20:19:47 +0000 (13:19 -0700)

merge-recursive: improve auto-merging messages with path collisions

Each individual file involved in a rename could have also been modified
on both sides of history, meaning it may need to have content merges.
If two such files are renamed into the same location, then on top of the
two natural auto-merging messages we also have to two-way merge the
result, giving us messages that look like

Auto-merging somefile.c (was somecase.c)
Auto-merging somefile.c (was somefolder.c)
Auto-merging somefile.c

However, despite the fact that I was the one who put the "(was %s)"
portions into the messages (and just a few months ago), I was still
initially confused when running into a rename/rename(2to1) case and
wondered if somefile.c had been merged three times. Update this to
instead be:

Auto-merging version of somefile.c from somecase.c
Auto-merging version of somefile.c from someportfolio.c
Auto-merging somefile.c

This is an admittedly long set of messages for a single path, but you
only get all three messages when dealing with the rare case of a
rename/rename(2to1) conflict where both sides of both original files
were also modified, in conflicting ways.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-recursive: rename merge_file_1() and merge_content()Elijah Newren Wed, 19 Sep 2018 16:14:34 +0000 (09:14 -0700)

merge-recursive: rename merge_file_1() and merge_content()

Summary:
merge_file_1() -> merge_mode_and_contents()
merge_content() -> handle_content_merge()

merge_file_1() is a very unhelpful name. Rename it to
merge_mode_and_contents() to reflect what it does.

merge_content() calls merge_mode_and_contents() to do the main part of
its work, but most of this function was about higher level stuff, e.g.
printing out conflict messages, updating skip_worktree bits, checking
for ability to avoid updating the working tree or for D/F conflicts
being in the way, etc. Since there are several handle_*() functions for
similar levels of checking and handling in merge-recursive.c (e.g.
handle_change_delete(), handle_rename_rename_2to1()), let's rename this
function to handle_content_merge().

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-recursive: remove final remaining caller of merge... Elijah Newren Wed, 19 Sep 2018 16:14:33 +0000 (09:14 -0700)

merge-recursive: remove final remaining caller of merge_file_one()

The function names merge_file_one() and merge_file_1() aren't
particularly intuitive function names, especially since there is no
associated merge_file() function that these are related to. The
previous commit showed that merge_file_one() was prone to be called when
merge_file_1() should be, and since it is just a thin wrapper around
merge_file_1() anyway and only has one caller left, let's just remove
merge_file_one() entirely.

(It also turns out that the one remaining caller of merge_file_one()
has very broken code that needs to be completely rewritten, but that's
the subject of a future patch series; for now, we're just translating
it into a merge_file_1() call.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-recursive: avoid wrapper function when unnecessar... Elijah Newren Wed, 19 Sep 2018 16:14:32 +0000 (09:14 -0700)

merge-recursive: avoid wrapper function when unnecessary and wasteful

merge_file_one() is a convenience function taking a bunch of oids and
modes, combining each pair into a diff_filespec, and then calling
merge_file_1(). When we already start with diff_filespec's, we can
just call merge_file_1() directly instead of splitting out the oids
and modes for the wrapper to recombine into what we already had.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

merge-recursive: set paths correctly when three-way... Elijah Newren Wed, 19 Sep 2018 16:14:31 +0000 (09:14 -0700)

merge-recursive: set paths correctly when three-way merging content

merge_3way() has code to mark different sides of the conflict with info
about where the content comes from. If the names of the files involved
match, it simply uses the branch name. If the names of the files do not
match, it uses branchname:filename. Unfortunately, merge_content()
previously always called it with one.path = a.path = b.path. Granted,
it didn't have other path information available to it for years, but
that was corrected by passing rename_conflict_info in commit
3c217c077a86 ("merge-recursive: Provide more info in conflict markers
with file renames", 2011-08-11). In that commit, instead of just fixing
the bug with the pathnames, it created fake branch names incorporating
both the branch name and file name.

This "fake branch" workaround was extended further when I pulled that
logic out into a special function in commit dac4741554e7
("merge-recursive: Create function for merging with branchname:file
markers", 2011-08-11), and a number of other sites outside of
merge_content() have been added which call into that. However, this
Rube-Goldberg-esque setup is not merely duplicate code and unnecessary
work, it also risked having other callsites invoke it in a way that
would result in markers of the form branchname:filename:filename (i.e.
with the filename repeated).

Fix this whole mess by:
- setting one.path, a.path, and b.path appropriately
- calling merge_file_1() directly
- deleting the merge_file_special_markers() workaround wrapper

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Initial batch post 2.19Junio C Hamano Mon, 17 Sep 2018 21:15:00 +0000 (14:15 -0700)

Initial batch post 2.19

Merge branch 'nd/bisect-show-list-fix'Junio C Hamano Mon, 17 Sep 2018 20:54:00 +0000 (13:54 -0700)

Merge branch 'nd/bisect-show-list-fix'

Debugging aid update.

* nd/bisect-show-list-fix:
bisect.c: make show_list() build again

Merge branch 'ab/fetch-tags-noclobber'Junio C Hamano Mon, 17 Sep 2018 20:54:00 +0000 (13:54 -0700)

Merge branch 'ab/fetch-tags-noclobber'

The rules used by "git push" and "git fetch" to determine if a ref
can or cannot be updated were inconsistent; specifically, fetching
to update existing tags were allowed even though tags are supposed
to be unmoving anchoring points. "git fetch" was taught to forbid
updates to existing tags without the "--force" option.

* ab/fetch-tags-noclobber:
fetch: stop clobbering existing tags without --force
fetch: document local ref updates with/without --force
push doc: correct lies about how push refspecs work
push doc: move mention of "tag <tag>" later in the prose
push doc: remove confusing mention of remote merger
fetch tests: add a test for clobbering tag behavior
push tests: use spaces in interpolated string
push tests: make use of unused $1 in test description
fetch: change "branch" to "reference" in --force -h output

Merge branch 'es/worktree-forced-ops-fix'Junio C Hamano Mon, 17 Sep 2018 20:53:59 +0000 (13:53 -0700)

Merge branch 'es/worktree-forced-ops-fix'

Fix a bug in which the same path could be registered under multiple
worktree entries if the path was missing (for instance, was removed
manually). Also, as a convenience, expand the number of cases in
which --force is applicable.

* es/worktree-forced-ops-fix:
doc-diff: force worktree add
worktree: delete .git/worktrees if empty after 'remove'
worktree: teach 'remove' to override lock when --force given twice
worktree: teach 'move' to override lock when --force given twice
worktree: teach 'add' to respect --force for registered but missing path
worktree: disallow adding same path multiple times
worktree: prepare for more checks of whether path can become worktree
worktree: generalize delete_git_dir() to reduce code duplication
worktree: move delete_git_dir() earlier in file for upcoming new callers
worktree: don't die() in library function find_worktree()

Merge branch 'sg/doc-trace-appends'Junio C Hamano Mon, 17 Sep 2018 20:53:59 +0000 (13:53 -0700)

Merge branch 'sg/doc-trace-appends'

Docfix.

* sg/doc-trace-appends:
Documentation/git.txt: clarify that GIT_TRACE=/path appends

Merge branch 'jk/diff-rendered-docs'Junio C Hamano Mon, 17 Sep 2018 20:53:59 +0000 (13:53 -0700)

Merge branch 'jk/diff-rendered-docs'

Dev doc update.

* jk/diff-rendered-docs:
Revert "doc/Makefile: drop doc-diff worktree and temporary files on "make clean""
doc/Makefile: drop doc-diff worktree and temporary files on "make clean"
doc-diff: add --clean mode to remove temporary working gunk
doc-diff: fix non-portable 'man' invocation
doc-diff: always use oids inside worktree
SubmittingPatches: mention doc-diff

Merge branch 'jk/patch-corrupted-delta-fix'Junio C Hamano Mon, 17 Sep 2018 20:53:58 +0000 (13:53 -0700)

Merge branch 'jk/patch-corrupted-delta-fix'

Malformed or crafted data in packstream can make our code attempt
to read or write past the allocated buffer and abort, instead of
reporting an error, which has been fixed.

* jk/patch-corrupted-delta-fix:
t5303: use printf to generate delta bases
patch-delta: handle truncated copy parameters
patch-delta: consistently report corruption
patch-delta: fix oob read
t5303: test some corrupt deltas
test-delta: read input into a heap buffer

Merge branch 'ds/commit-graph-tests'Junio C Hamano Mon, 17 Sep 2018 20:53:58 +0000 (13:53 -0700)

Merge branch 'ds/commit-graph-tests'

We can now optionally run tests with commit-graph enabled.

* ds/commit-graph-tests:
commit-graph: define GIT_TEST_COMMIT_GRAPH

Merge branch 'jk/pack-objects-with-bitmap-fix'Junio C Hamano Mon, 17 Sep 2018 20:53:58 +0000 (13:53 -0700)

Merge branch 'jk/pack-objects-with-bitmap-fix'

Hotfix of the base topic.

* jk/pack-objects-with-bitmap-fix:
pack-bitmap: drop "loaded" flag
traverse_bitmap_commit_list(): don't free result
t5310: test delta reuse with bitmaps
bitmap_has_sha1_in_uninteresting(): drop BUG check

Merge branch 'rs/mailinfo-format-flowed'Junio C Hamano Mon, 17 Sep 2018 20:53:57 +0000 (13:53 -0700)

Merge branch 'rs/mailinfo-format-flowed'

"git mailinfo" used in "git am" learned to make a best-effort
recovery of a patch corrupted by MUA that sends text/plain with
format=flawed option.

* rs/mailinfo-format-flowed:
mailinfo: support format=flowed

Merge branch 'jk/cocci'Junio C Hamano Mon, 17 Sep 2018 20:53:57 +0000 (13:53 -0700)

Merge branch 'jk/cocci'

spatch transformation to replace boolean uses of !hashcmp() to
newly introduced oideq() is added, and applied, to regain
performance lost due to support of multiple hash algorithms.

* jk/cocci:
show_dirstat: simplify same-content check
read-cache: use oideq() in ce_compare functions
convert hashmap comparison functions to oideq()
convert "hashcmp() != 0" to "!hasheq()"
convert "oidcmp() != 0" to "!oideq()"
convert "hashcmp() == 0" to hasheq()
convert "oidcmp() == 0" to oideq()
introduce hasheq() and oideq()
coccinelle: use <...> for function exclusion

Merge branch 'tg/rerere-doc-updates'Junio C Hamano Mon, 17 Sep 2018 20:53:56 +0000 (13:53 -0700)

Merge branch 'tg/rerere-doc-updates'

Clarify a part of technical documentation for rerere.

* tg/rerere-doc-updates:
rerere: add note about files with existing conflict markers
rerere: mention caveat about unmatched conflict markers

Merge branch 'es/format-patch-rangediff'Junio C Hamano Mon, 17 Sep 2018 20:53:56 +0000 (13:53 -0700)

Merge branch 'es/format-patch-rangediff'

"git format-patch" learned a new "--range-diff" option to explain
the difference between this version and the previous attempt in
the cover letter (or after the tree-dashes as a comment).

* es/format-patch-rangediff:
format-patch: allow --range-diff to apply to a lone-patch
format-patch: add --creation-factor tweak for --range-diff
format-patch: teach --range-diff to respect -v/--reroll-count
format-patch: extend --range-diff to accept revision range
format-patch: add --range-diff option to embed diff in cover letter
range-diff: relieve callers of low-level configuration burden
range-diff: publish default creation factor
range-diff: respect diff_option.file rather than assuming 'stdout'

Merge branch 'es/format-patch-interdiff'Junio C Hamano Mon, 17 Sep 2018 20:53:55 +0000 (13:53 -0700)

Merge branch 'es/format-patch-interdiff'

"git format-patch" learned a new "--interdiff" option to explain
the difference between this version and the previous atttempt in
the cover letter (or after the tree-dashes as a comment).

* es/format-patch-interdiff:
format-patch: allow --interdiff to apply to a lone-patch
log-tree: show_log: make commentary block delimiting reusable
interdiff: teach show_interdiff() to indent interdiff
format-patch: teach --interdiff to respect -v/--reroll-count
format-patch: add --interdiff option to embed diff in cover letter
format-patch: allow additional generated content in make_cover_letter()

Merge branch 'cc/delta-islands'Junio C Hamano Mon, 17 Sep 2018 20:53:55 +0000 (13:53 -0700)

Merge branch 'cc/delta-islands'

Lift code from GitHub to restrict delta computation so that an
object that exists in one fork is not made into a delta against
another object that does not appear in the same forked repository.

* cc/delta-islands:
pack-objects: move 'layer' into 'struct packing_data'
pack-objects: move tree_depth into 'struct packing_data'
t5320: tests for delta islands
repack: add delta-islands support
pack-objects: add delta-islands support
pack-objects: refactor code into compute_layer_order()
Add delta-islands.{c,h}

Merge branch 'jk/trailer-fixes'Junio C Hamano Mon, 17 Sep 2018 20:53:54 +0000 (13:53 -0700)

Merge branch 'jk/trailer-fixes'

"git interpret-trailers" and its underlying machinery had a buggy
code that attempted to ignore patch text after commit log message,
which triggered in various codepaths that will always get the log
message alone and never get such an input.

* jk/trailer-fixes:
append_signoff: use size_t for string offsets
sequencer: ignore "---" divider when parsing trailers
pretty, ref-filter: format %(trailers) with no_divider option
interpret-trailers: allow suppressing "---" divider
interpret-trailers: tighten check for "---" patch boundary
trailer: pass process_trailer_opts to trailer_info_get()
trailer: use size_t for iterating trailer list
trailer: use size_t for string offsets

Merge branch 'sb/range-diff-colors'Junio C Hamano Mon, 17 Sep 2018 20:53:54 +0000 (13:53 -0700)

Merge branch 'sb/range-diff-colors'

The color output support for recently introduced "range-diff"
command got tweaked a bit.

* sb/range-diff-colors:
range-diff: indent special lines as context
range-diff: make use of different output indicators
diff.c: add --output-indicator-{new, old, context}
diff.c: rewrite emit_line_0 more understandably
diff.c: omit check for line prefix in emit_line_0
diff: use emit_line_0 once per line
diff.c: add set_sign to emit_line_0
diff.c: reorder arguments for emit_line_ws_markup
diff.c: simplify caller of emit_line_0
t3206: add color test for range-diff --dual-color
test_decode_color: understand FAINT and ITALIC

Merge branch 'jk/pack-delta-reuse-with-bitmap'Junio C Hamano Mon, 17 Sep 2018 20:53:53 +0000 (13:53 -0700)

Merge branch 'jk/pack-delta-reuse-with-bitmap'

When creating a thin pack, which allows objects to be made into a
delta against another object that is not in the resulting pack but
is known to be present on the receiving end, the code learned to
take advantage of the reachability bitmap; this allows the server
to send a delta against a base beyond the "boundary" commit.

* jk/pack-delta-reuse-with-bitmap:
pack-objects: reuse on-disk deltas for thin "have" objects
pack-bitmap: save "have" bitmap from walk
t/perf: add perf tests for fetches from a bitmapped server
t/perf: add infrastructure for measuring sizes
t/perf: factor out percent calculations
t/perf: factor boilerplate out of test_perf

Merge branch 'nd/unpack-trees-with-cache-tree'Junio C Hamano Mon, 17 Sep 2018 20:53:53 +0000 (13:53 -0700)

Merge branch 'nd/unpack-trees-with-cache-tree'

The unpack_trees() API used in checking out a branch and merging
walks one or more trees along with the index. When the cache-tree
in the index tells us that we are walking a tree whose flattened
contents is known (i.e. matches a span in the index), as linearly
scanning a span in the index is much more efficient than having to
open tree objects recursively and listing their entries, the walk
can be optimized, which is done in this topic.

* nd/unpack-trees-with-cache-tree:
Document update for nd/unpack-trees-with-cache-tree
cache-tree: verify valid cache-tree in the test suite
unpack-trees: add missing cache invalidation
unpack-trees: reuse (still valid) cache-tree from src_index
unpack-trees: reduce malloc in cache-tree walk
unpack-trees: optimize walking same trees with cache-tree
unpack-trees: add performance tracing
trace.h: support nested performance tracing

Merge branch 'ds/reachable'Junio C Hamano Mon, 17 Sep 2018 20:53:52 +0000 (13:53 -0700)

Merge branch 'ds/reachable'

The code for computing history reachability has been shuffled,
obtained a bunch of new tests to cover them, and then being
improved.

* ds/reachable:
commit-reach: correct accidental #include of C file
commit-reach: use can_all_from_reach
commit-reach: make can_all_from_reach... linear
commit-reach: replace ref_newer logic
test-reach: test commit_contains
test-reach: test can_all_from_reach_with_flags
test-reach: test reduce_heads
test-reach: test get_merge_bases_many
test-reach: test is_descendant_of
test-reach: test in_merge_bases
test-reach: create new test tool for ref_newer
commit-reach: move can_all_from_reach_with_flags
upload-pack: generalize commit date cutoff
upload-pack: refactor ok_to_give_up()
upload-pack: make reachable() more generic
commit-reach: move commit_contains from ref-filter
commit-reach: move ref_newer from remote.c
commit.h: remove method declarations
commit-reach: move walk methods from commit.c

Merge branch 'sb/submodule-update-in-c'Junio C Hamano Mon, 17 Sep 2018 20:53:51 +0000 (13:53 -0700)

Merge branch 'sb/submodule-update-in-c'

"git submodule update" is getting rewritten piece-by-piece into C.

* sb/submodule-update-in-c:
submodule--helper: introduce new update-module-mode helper
submodule--helper: replace connect-gitdir-workingtree by ensure-core-worktree
builtin/submodule--helper: factor out method to update a single submodule
builtin/submodule--helper: store update_clone information in a struct
builtin/submodule--helper: factor out submodule updating
git-submodule.sh: rename unused variables
git-submodule.sh: align error reporting for update mode to use path

Merge branch 'tg/rerere'Junio C Hamano Mon, 17 Sep 2018 20:53:51 +0000 (13:53 -0700)

Merge branch 'tg/rerere'

Fixes to "git rerere" corner cases, especially when conflict
markers cannot be parsed in the file.

* tg/rerere:
rerere: recalculate conflict ID when unresolved conflict is committed
rerere: teach rerere to handle nested conflicts
rerere: return strbuf from handle path
rerere: factor out handle_conflict function
rerere: only return whether a path has conflicts or not
rerere: fix crash with files rerere can't handle
rerere: add documentation for conflict normalization
rerere: mark strings for translation
rerere: wrap paths in output in sq
rerere: lowercase error messages
rerere: unify error messages when read_cache fails

Merge branch 'ds/multi-pack-index'Junio C Hamano Mon, 17 Sep 2018 20:53:50 +0000 (13:53 -0700)

Merge branch 'ds/multi-pack-index'

When there are too many packfiles in a repository (which is not
recommended), looking up an object in these would require
consulting many pack .idx files; a new mechanism to have a single
file that consolidates all of these .idx files is introduced.

* ds/multi-pack-index: (32 commits)
pack-objects: consider packs in multi-pack-index
midx: test a few commands that use get_all_packs
treewide: use get_all_packs
packfile: add all_packs list
midx: fix bug that skips midx with alternates
midx: stop reporting garbage
midx: mark bad packed objects
multi-pack-index: store local property
multi-pack-index: provide more helpful usage info
midx: clear midx on repack
packfile: skip loading index if in multi-pack-index
midx: prevent duplicate packfile loads
midx: use midx in approximate_object_count
midx: use existing midx when writing new one
midx: use midx in abbreviation calculations
midx: read objects from multi-pack-index
config: create core.multiPackIndex setting
midx: write object offsets
midx: write object id fanout chunk
midx: write object ids in a chunk
...

Merge branch 'jk/branch-l-1-repurpose'Junio C Hamano Mon, 17 Sep 2018 20:53:50 +0000 (13:53 -0700)

Merge branch 'jk/branch-l-1-repurpose'

Updated plan to repurpose the "-l" option to "git branch".

* jk/branch-l-1-repurpose:
doc/git-branch: remove obsolete "-l" references
branch: make "-l" a synonym for "--list"

Merge branch 'tg/conflict-marker-size'Junio C Hamano Mon, 17 Sep 2018 20:53:49 +0000 (13:53 -0700)

Merge branch 'tg/conflict-marker-size'

Developer aid.

* tg/conflict-marker-size:
.gitattributes: add conflict-marker-size for relevant files

Merge branch 'ts/doc-build-manpage-xsl-quietly'Junio C Hamano Mon, 17 Sep 2018 20:53:49 +0000 (13:53 -0700)

Merge branch 'ts/doc-build-manpage-xsl-quietly'

Build tweak.

* ts/doc-build-manpage-xsl-quietly:
Documentation/Makefile: make manpage-base-url.xsl generation quieter

Merge branch 'jk/rev-list-stdin-noop-is-ok'Junio C Hamano Mon, 17 Sep 2018 20:53:48 +0000 (13:53 -0700)

Merge branch 'jk/rev-list-stdin-noop-is-ok'

"git rev-list --stdin </dev/null" used to be an error; it now shows
no output without an error. "git rev-list --stdin --default HEAD"
still falls back to the given default when nothing is given on the
standard input.

* jk/rev-list-stdin-noop-is-ok:
rev-list: make empty --stdin not an error

Merge branch 'bp/checkout-new-branch-optim'Junio C Hamano Mon, 17 Sep 2018 20:53:48 +0000 (13:53 -0700)

Merge branch 'bp/checkout-new-branch-optim'

"git checkout -b newbranch [HEAD]" should not have to do as much as
checking out a commit different from HEAD. An attempt is made to
optimize this special case.

* bp/checkout-new-branch-optim:
checkout: optimize "git checkout -b <new_branch>"

Merge branch 'sg/t1404-update-ref-test-timeout'Junio C Hamano Mon, 17 Sep 2018 20:53:47 +0000 (13:53 -0700)

Merge branch 'sg/t1404-update-ref-test-timeout'

An attempt to unflake a test a bit.

* sg/t1404-update-ref-test-timeout:
t1404: increase core.packedRefsTimeout to avoid occasional test failure

Merge branch 'nd/clone-case-smashing-warning'Junio C Hamano Mon, 17 Sep 2018 20:53:47 +0000 (13:53 -0700)

Merge branch 'nd/clone-case-smashing-warning'

Running "git clone" against a project that contain two files with
pathnames that differ only in cases on a case insensitive
filesystem would result in one of the files lost because the
underlying filesystem is incapable of holding both at the same
time. An attempt is made to detect such a case and warn.

* nd/clone-case-smashing-warning:
clone: report duplicate entries on case-insensitive filesystems

Merge branch 'mk/http-backend-content-length'Junio C Hamano Mon, 17 Sep 2018 20:53:46 +0000 (13:53 -0700)

Merge branch 'mk/http-backend-content-length'

Test update.

* mk/http-backend-content-length:
http-backend test: make empty CONTENT_LENGTH test more realistic

Revert "doc/Makefile: drop doc-diff worktree and tempor... Junio C Hamano Mon, 17 Sep 2018 19:46:18 +0000 (12:46 -0700)

Revert "doc/Makefile: drop doc-diff worktree and temporary files on "make clean""

This reverts commit 6f924265a0bf6efa677e9a684cebdde958e5ba06, which
started to require that we have an executable git available in order
to say "make clean", which gives us a chicken-and-egg problem.

Having to have Git installed, or be in a repository, in order to be
able to run an optional "doc-diff" tool is fine. Requiring either
in order to run "make clean" is a different story.

Reported by Jonathan Nieder <jrnieder@gmail.com>.

http-backend test: make empty CONTENT_LENGTH test more... Max Kirillov Tue, 11 Sep 2018 20:33:36 +0000 (23:33 +0300)

http-backend test: make empty CONTENT_LENGTH test more realistic

This is a test of smart HTTP, so it should use the smart HTTP endpoints
(e.g. /info/refs?service=git-receive-pack), not dumb HTTP (HEAD).

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Git 2.19 v2.19.0Junio C Hamano Mon, 10 Sep 2018 17:41:56 +0000 (10:41 -0700)

Git 2.19

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge tag 'l10n-2.19.0-rnd2' of git://github.com/git... Junio C Hamano Mon, 10 Sep 2018 17:41:11 +0000 (10:41 -0700)

Merge tag 'l10n-2.19.0-rnd2' of git://github.com/git-l10n/git-po

l10n for Git 2.19.0 round 2

* tag 'l10n-2.19.0-rnd2' of git://github.com/git-l10n/git-po:
l10n: zh_CN: for git v2.19.0 l10n round 1 to 2
l10n: bg.po: Updated Bulgarian translation (3958t)
l10n: vi.po(3958t): updated Vietnamese translation v2.19.0 round 2
l10n: es.po v2.19.0 round 2
l10n: fr.po v2.19.0 rnd 2
l10n: fr.po v2.19.0 rnd 1
l10n: fr: fix a message seen in git bisect
l10n: sv.po: Update Swedish translation (3958t0f0u)
l10n: git.pot: v2.19.0 round 2 (3 new, 5 removed)
l10n: ru.po: update Russian translation
l10n: git.pot: v2.19.0 round 1 (382 new, 30 removed)
l10n: de.po: translate 108 new messages
l10n: zh_CN: review for git 2.18.0
l10n: sv.po: Update Swedish translation(3608t0f0u)

Merge branch 'jn/submodule-core-worktree-revert'Junio C Hamano Mon, 10 Sep 2018 17:38:58 +0000 (10:38 -0700)

Merge branch 'jn/submodule-core-worktree-revert'

* jn/submodule-core-worktree-revert:
Revert "Merge branch 'sb/submodule-core-worktree'"

Merge branch 'mk/http-backend-content-length'Junio C Hamano Mon, 10 Sep 2018 17:29:16 +0000 (10:29 -0700)

Merge branch 'mk/http-backend-content-length'

The earlier attempt barfed when given a CONTENT_LENGTH that is
set to an empty string. RFC 3875 is fairly clear that in this
case we should not read any message body, but we've been reading
through to the EOF in previous versions (which did not even pay
attention to the environment variable), so keep that behaviour for
now in this late update.

* mk/http-backend-content-length:
http-backend: allow empty CONTENT_LENGTH

l10n: zh_CN: for git v2.19.0 l10n round 1 to 2Jiang Xin Tue, 21 Aug 2018 00:40:05 +0000 (08:40 +0800)

l10n: zh_CN: for git v2.19.0 l10n round 1 to 2

Translate 382 new messages (3958t0f0u) for git 2.19.0.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>

Merge branch 'master' of git://github.com/alshopov... Jiang Xin Sun, 9 Sep 2018 11:05:41 +0000 (19:05 +0800)

Merge branch 'master' of git://github.com/alshopov/git-po

* 'master' of git://github.com/alshopov/git-po:
l10n: bg.po: Updated Bulgarian translation (3958t)

l10n: bg.po: Updated Bulgarian translation (3958t)Alexander Shopov Thu, 9 Aug 2018 15:04:10 +0000 (17:04 +0200)

l10n: bg.po: Updated Bulgarian translation (3958t)

Signed-off-by: Alexander Shopov <ash@kambanaria.org>

Revert "Merge branch 'sb/submodule-core-worktree'"Jonathan Nieder Sat, 8 Sep 2018 00:09:46 +0000 (17:09 -0700)

Revert "Merge branch 'sb/submodule-core-worktree'"

This reverts commit 7e25437d35a70791b345872af202eabfb3e1a8bc, reversing
changes made to 00624d608cc69bd62801c93e74d1ea7a7ddd6598.

v2.19.0-rc0~165^2~1 (submodule: ensure core.worktree is set after
update, 2018-06-18) assumes an "absorbed" submodule layout, where the
submodule's Git directory is in the superproject's .git/modules/
directory and .git in the submodule worktree is a .git file pointing
there. In particular, it uses $GIT_DIR/modules/$name to find the
submodule to find out whether it already has core.worktree set, and it
uses connect_work_tree_and_git_dir if not, resulting in

fatal: could not open sub/.git for writing

The context behind that patch: v2.19.0-rc0~165^2~2 (submodule: unset
core.worktree if no working tree is present, 2018-06-12) unsets
core.worktree when running commands like "git checkout
--recurse-submodules" to switch to a branch without the submodule. If
a user then uses "git checkout --no-recurse-submodules" to switch back
to a branch with the submodule and runs "git submodule update", this
patch is needed to ensure that commands using the submodule directly
are aware of the path to the worktree.

It is late in the release cycle, so revert the whole 3-patch series.
We can try again later for 2.20.

Reported-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

http-backend: allow empty CONTENT_LENGTHMax Kirillov Fri, 7 Sep 2018 03:36:07 +0000 (06:36 +0300)

http-backend: allow empty CONTENT_LENGTH

According to RFC3875, empty environment variable is equivalent to unset,
and for CONTENT_LENGTH it should mean zero body to read.

However, unset CONTENT_LENGTH is also used for chunked encoding to indicate
reading until EOF. At least, the test "large fetch-pack requests can be split
across POSTs" from t5551 starts faliing, if unset or empty CONTENT_LENGTH is
treated as zero length body. So keep the existing behavior as much as possible.

Add a test for the case.

Reported-By: Jelmer Vernooij <jelmer@jelmer.uk>
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

l10n: vi.po(3958t): updated Vietnamese translation... Tran Ngoc Quan Fri, 7 Sep 2018 06:41:08 +0000 (13:41 +0700)

l10n: vi.po(3958t): updated Vietnamese translation v2.19.0 round 2

Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>

l10n: es.po v2.19.0 round 2Christopher Diaz Riveros Thu, 6 Sep 2018 09:27:56 +0000 (04:27 -0500)

l10n: es.po v2.19.0 round 2

Signed-off-by: Christopher Diaz Riveros <chrisadr@gentoo.org>

Merge branch 'fr_2.19.0_rnd1' of git://github.com/jnavi... Jiang Xin Thu, 6 Sep 2018 01:17:55 +0000 (09:17 +0800)

Merge branch 'fr_2.19.0_rnd1' of git://github.com/jnavila/git

* 'fr_2.19.0_rnd1' of git://github.com/jnavila/git:
l10n: fr.po v2.19.0 rnd 2
l10n: fr.po v2.19.0 rnd 1
l10n: fr: fix a message seen in git bisect

l10n: fr.po v2.19.0 rnd 2Jean-Noël Avila Wed, 5 Sep 2018 20:19:13 +0000 (22:19 +0200)

l10n: fr.po v2.19.0 rnd 2

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>

l10n: fr.po v2.19.0 rnd 1Jean-Noël Avila Thu, 23 Aug 2018 20:50:52 +0000 (22:50 +0200)

l10n: fr.po v2.19.0 rnd 1

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>

l10n: fr: fix a message seen in git bisectRaphaël Hertzog Wed, 4 Jul 2018 15:43:56 +0000 (17:43 +0200)

l10n: fr: fix a message seen in git bisect

"cette" can be only be used before a word (like in "cette bouteille" for
"this bottle"), but here "this" refers to the current step and we have
to use "ceci" in French.

Signed-off-by: Raphaël Hertzog <hertzog@debian.org>

doc-diff: force worktree addJeff King Thu, 30 Aug 2018 07:54:31 +0000 (03:54 -0400)

doc-diff: force worktree add

We avoid re-creating our temporary worktree if it's already
there. But we may run into a situation where the worktree
has been deleted, but an entry still exists in
$GIT_DIR/worktrees.

Older versions of git-worktree would annoyingly create a
series of duplicate entries. Recent versions now detect and
prevent this, allowing you to override with "-f". Since we
know that the worktree in question was just our temporary
workspace, it's safe for us to always pass "-f".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

l10n: sv.po: Update Swedish translation (3958t0f0u)Peter Krefting Tue, 4 Sep 2018 21:34:09 +0000 (22:34 +0100)

l10n: sv.po: Update Swedish translation (3958t0f0u)

Signed-off-by: Peter Krefting <peter@softwolves.pp.se>

Git 2.19-rc2 v2.19.0-rc2Junio C Hamano Tue, 4 Sep 2018 21:33:27 +0000 (14:33 -0700)

Git 2.19-rc2

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'es/chain-lint-more'Junio C Hamano Tue, 4 Sep 2018 21:31:40 +0000 (14:31 -0700)

Merge branch 'es/chain-lint-more'

The test linter code has learned that the end of here-doc mark
"EOF" can be quoted in a double-quote pair, not just in a
single-quote pair.

* es/chain-lint-more:
chainlint: match "quoted" here-doc tags

Merge branch 'ab/portable-more'Junio C Hamano Tue, 4 Sep 2018 21:31:40 +0000 (14:31 -0700)

Merge branch 'ab/portable-more'

Portability fix.

* ab/portable-more:
tests: fix non-portable iconv invocation
tests: fix non-portable "${var:-"str"}" construct
tests: fix and add lint for non-portable grep --file
tests: fix version-specific portability issue in Perl JSON
tests: use shorter labels in chainlint.sed for AIX sed
tests: fix comment syntax in chainlint.sed for AIX sed
tests: fix and add lint for non-portable seq
tests: fix and add lint for non-portable head -c N

Merge branch 'es/freebsd-iconv-portability'Junio C Hamano Tue, 4 Sep 2018 21:31:39 +0000 (14:31 -0700)

Merge branch 'es/freebsd-iconv-portability'

Build fix.

* es/freebsd-iconv-portability:
config.mak.uname: resolve FreeBSD iconv-related compilation warning

Merge branch 'ds/commit-graph-lockfile-fix'Junio C Hamano Tue, 4 Sep 2018 21:31:39 +0000 (14:31 -0700)

Merge branch 'ds/commit-graph-lockfile-fix'

"git merge-base" in 2.19-rc1 has performance regression when the
(experimental) commit-graph feature is in use, which has been
mitigated.

* ds/commit-graph-lockfile-fix:
commit: don't use generation numbers if not needed

Merge branch 'en/directory-renames-nothanks'Junio C Hamano Tue, 4 Sep 2018 21:31:38 +0000 (14:31 -0700)

Merge branch 'en/directory-renames-nothanks'

Recent addition of "directory rename" heuristics to the
merge-recursive backend makes the command susceptible to false
positives and false negatives. In the context of "git am -3",
which does not know about surrounding unmodified paths and thus
cannot inform the merge machinery about the full trees involved,
this risk is particularly severe. As such, the heuristic is
disabled for "git am -3" to keep the machinery "more stupid but
predictable".

* en/directory-renames-nothanks:
am: avoid directory rename detection when calling recursive merge machinery
merge-recursive: add ability to turn off directory rename detection
t3401: add another directory rename testcase for rebase and am

Merge branch 'pw/rebase-i-author-script-fix'Junio C Hamano Tue, 4 Sep 2018 21:31:38 +0000 (14:31 -0700)

Merge branch 'pw/rebase-i-author-script-fix'

Recent "git rebase -i" update started to write bogusly formatted
author-script, with a matching broken reading code. These are
fixed.

* pw/rebase-i-author-script-fix:
sequencer: fix quoting in write_author_script
sequencer: handle errors from read_author_ident()

Documentation/git.txt: clarify that GIT_TRACE=/path... SZEDER Gábor Tue, 4 Sep 2018 00:05:44 +0000 (02:05 +0200)

Documentation/git.txt: clarify that GIT_TRACE=/path appends

The current wording of the description of GIT_TRACE=/path/to/file
("... will try to write the trace messages into it") might be
misunderstood as "overwriting"; at least I interpreted it that way on
a cursory first read.

State it more explicitly that the trace messages are appended.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bisect.c: make show_list() build againNguyễn Thái Ngọc Duy Sun, 2 Sep 2018 07:42:50 +0000 (09:42 +0200)

bisect.c: make show_list() build again

This function only compiles when DEBUG_BISECT is 1, which is often not
the case. As a result there are two commits [1] [2] that break it but
the breakages went unnoticed because the code did not compile by
default. Update the function and include the new header file to make this
function build again.

In order to stop this from happening again, the function is now
compiled unconditionally but exits early unless DEBUG_BISECT is
non-zero. A smart compiler generates no extra code (not even a
function call). But even if it does not, this function does not seem
to be in a hot path that the extra cost becomes a big problem.

[1] bb408ac95d (bisect.c: use commit-slab for commit weight instead of
commit->util - 2018-05-19)

[2] cbd53a2193 (object-store: move object access functions to
object-store.h - 2018-05-15)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap: drop "loaded" flagJeff King Sat, 1 Sep 2018 07:50:57 +0000 (03:50 -0400)

pack-bitmap: drop "loaded" flag

In the early days of the bitmap code, there was a single
static bitmap_index struct that was used behind the scenes,
and any bitmap-related functions could lazily check
bitmap_git.loaded to see if they needed to read the on-disk
data.

But since 3ae5fa0768 (pack-bitmap: remove bitmap_git global
variable, 2018-06-07), the caller is responsible for the
lifetime of the bitmap_index struct, and we return it from
prepare_bitmap_git() and prepare_bitmap_walk(), both of
which load the on-disk data (or return NULL).

So outside of these functions, it's not possible to have a
bitmap_index for which the loaded flag is not true. Nor is
it possible to accidentally pass an already-loaded
bitmap_index to the loading function (which is static-local
to the file).

We can drop this unnecessary and confusing flag.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

traverse_bitmap_commit_list(): don't free resultJeff King Sat, 1 Sep 2018 07:49:48 +0000 (03:49 -0400)

traverse_bitmap_commit_list(): don't free result

Since it was introduced in fff42755ef (pack-bitmap: add
support for bitmap indexes, 2013-12-21), this function has
freed the result after traversing it. That is an artifact of
the early days of the bitmap code, when we had a single
static "struct bitmap_index". Back then, it was intended
that you would do:

prepare_bitmap_walk(&revs);
traverse_bitmap_commit_list(&revs);

Since the actual bitmap_index struct was totally behind the
scenes, it was convenient for traverse_bitmap_commit_list()
to clean it up, clearing the way for another traversal.

But since 3ae5fa0768 (pack-bitmap: remove bitmap_git global
variable, 2018-06-07), the caller explicitly manages the
bitmap_index struct itself, like this:

b = prepare_bitmap_walk(&revs);
traverse_bitmap_commit_list(b, &revs);
free_bitmap_index(b);

It no longer makes sense to auto-free the result after the
traversal. If you want to do another traversal, you'd just
create a new bitmap_index. And while nobody tries to call
traverse_bitmap_commit_list() twice, the fact that it throws
away the result might be surprising, and is better avoided.

Note that in the "old" way it was possible for two walks to
amortize the cost of opening the on-disk .bitmap file (since
it was stored in the global bitmap_index), but we lost that
in 3ae5fa0768. However, no code actually does this, so it's
not worth addressing now. The solution might involve a new:

reset_bitmap_walk(b, &revs);

call. Or we might even attach the bitmap data to its
matching packed_git struct, so that multiple
prepare_bitmap_walk() calls could use it. That can wait
until somebody actually has need of the optimization (and
until then, we'll do the correct, unsurprising thing).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5310: test delta reuse with bitmapsJeff King Sat, 1 Sep 2018 07:48:13 +0000 (03:48 -0400)

t5310: test delta reuse with bitmaps

Commit 6a1e32d532 (pack-objects: reuse on-disk deltas for
thin "have" objects, 2018-08-21) taught pack-objects a new
optimization trick. Since this wasn't meant to change
user-visible behavior, but only produce smaller packs more
quickly, testing focused on t/perf/p5311.

However, since people don't run perf tests very often, we
should make sure that the feature is exercised in the
regular test suite. This patch does so.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bitmap_has_sha1_in_uninteresting(): drop BUG checkJeff King Sat, 1 Sep 2018 07:44:48 +0000 (03:44 -0400)

bitmap_has_sha1_in_uninteresting(): drop BUG check

Commit 30cdc33fba (pack-bitmap: save "have" bitmap from
walk, 2018-08-21) introduced a new function for looking at
the "have" side of a bitmap walk. Because it only makes
sense to do so after we've finished the walk, we added an
extra safety assertion, making sure that bitmap_git->result
is non-NULL.

However, this safety is misguided. It was trying to catch
the case where we had called prepare_bitmap_walk() to give
us a "struct bitmap_index", but had not yet called
traverse_bitmap_commit_list() to walk it. But all of the
interesting computation (including setting up the result and
"have" bitmaps) happens in the first function! The latter
function only delivers the result to a callback function.

So the case we were worried about is impossible; if you get
a non-NULL result from prepare_bitmap_walk(), then its
"have" field will be fully formed.

But much worse, traverse_bitmap_commit_list() actually frees
the result field as it finishes. Which means that this
assertion is worse than useless: it's almost guaranteed to
trigger!

Our test suite didn't catch this because the function isn't
actually exercised at all. The only caller comes from
6a1e32d532 (pack-objects: reuse on-disk deltas for thin
"have" objects, 2018-08-21), and that's triggered only when
you fetch or push history that contains an object with a
base that is found deep in history. Our test suite fetches
and pushes either don't use bitmaps, or use too-small
example repositories. But any reasonably-sized real-world
push or fetch (with bitmaps) would trigger this.

This patch drops the harmful assertion and tweaks the
docstring for the function to make the precondition clear.
The tests need to be improved to exercise this new
pack-objects feature, but we'll do that in a separate
commit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

l10n: git.pot: v2.19.0 round 2 (3 new, 5 removed)Jiang Xin Tue, 4 Sep 2018 00:51:58 +0000 (08:51 +0800)

l10n: git.pot: v2.19.0 round 2 (3 new, 5 removed)

Generate po/git.pot from v2.19.0-rc1 for git v2.19.0 l10n round 2.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>

Merge branch 'master' of git://github.com/git-l10n... Jiang Xin Tue, 4 Sep 2018 00:49:54 +0000 (08:49 +0800)

Merge branch 'master' of git://github.com/git-l10n/git-po

* 'master' of git://github.com/git-l10n/git-po:
l10n: ru.po: update Russian translation
l10n: git.pot: v2.19.0 round 1 (382 new, 30 removed)
l10n: de.po: translate 108 new messages
l10n: zh_CN: review for git 2.18.0
l10n: sv.po: Update Swedish translation(3608t0f0u)

fetch: stop clobbering existing tags without --forceÆvar Arnfjörð Bjarmason Fri, 31 Aug 2018 20:10:04 +0000 (20:10 +0000)

fetch: stop clobbering existing tags without --force

Change "fetch" to treat "+" in refspecs (aka --force) to mean we
should clobber a local tag of the same name.

This changes the long-standing behavior of "fetch" added in
853a3697dc ("[PATCH] Multi-head fetch.", 2005-08-20). Before this
change, all tag fetches effectively had --force enabled. See the
git-fetch-script code in fast_forward_local() with the comment:

> Tags need not be pointing at commits so there is no way to
> guarantee "fast-forward" anyway.

That commit and the rest of the history of "fetch" shows that the
"+" (--force) part of refpecs was only conceived for branch updates,
while tags have accepted any changes from upstream unconditionally and
clobbered the local tag object. Changing this behavior has been
discussed as early as 2011[1].

The current behavior doesn't make sense to me, it easily results in
local tags accidentally being clobbered. We could namespace our tags
per-remote and not locally populate refs/tags/*, but as with my
97716d217c ("fetch: add a --prune-tags option and fetch.pruneTags
config", 2018-02-09) it's easier to work around the current
implementation than to fix the root cause.

So this change implements suggestion #1 from Jeff's 2011 E-Mail[1],
"fetch" now only clobbers the tag if either "+" is provided as part of
the refspec, or if "--force" is provided on the command-line.

This also makes it nicely symmetrical with how "tag" itself works when
creating tags. I.e. we refuse to clobber any existing tags unless
"--force" is supplied. Now we can refuse all such clobbering, whether
it would happen by clobbering a local tag with "tag", or by fetching
it from the remote with "fetch".

Ref updates outside refs/{tags,heads/* are still still not symmetrical
with how "git push" works, as discussed in the recently changed
pull-fetch-param.txt documentation. This change brings the two
divergent behaviors more into line with one another. I don't think
there's any reason "fetch" couldn't fully converge with the behavior
used by "push", but that's a topic for another change.

One of the tests added in 31b808a032 ("clone --single: limit the fetch
refspec to fetched branch", 2012-09-20) is being changed to use
--force where a clone would clobber a tag. This changes nothing about
the existing behavior of the test.

1. https://public-inbox.org/git/20111123221658.GA22313@sigill.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: document local ref updates with/without --forceÆvar Arnfjörð Bjarmason Fri, 31 Aug 2018 20:10:03 +0000 (20:10 +0000)

fetch: document local ref updates with/without --force

Refer to the new git-push(1) documentation about when ref updates are
and aren't allowed with and without --force, noting how "git-fetch"
differs from the behavior of "git-push".

Perhaps it would be better to split this all out into a new
gitrefspecs(7) man page, or present this information using tables.

In lieu of that, this is accurate, and fixes a big omission in the
existing refspec docs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

push doc: correct lies about how push refspecs workÆvar Arnfjörð Bjarmason Fri, 31 Aug 2018 20:10:02 +0000 (20:10 +0000)

push doc: correct lies about how push refspecs work

There's complex rules governing whether a push is allowed to take
place depending on whether we're pushing to refs/heads/*, refs/tags/*
or refs/not-that/*. See is_branch() in refs.c, and the various
assertions in refs/files-backend.c. (e.g. "trying to write non-commit
object %s to branch '%s'").

This documentation has never been quite correct, but went downhill
after dbfeddb12e ("push: require force for refs under refs/tags/",
2012-11-29) when we started claiming that <dst> couldn't be a tag
object, which is incorrect. After some of the logic in that patch was
changed in 256b9d70a4 ("push: fix "refs/tags/ hierarchy cannot be
updated without --force"", 2013-01-16) the docs weren't updated, and
we've had some version of documentation that confused whether <src>
was a tag or not with whether <dst> would accept either an annotated
tag object or the commit it points to.

This makes the intro somewhat more verbose & complex, perhaps we
should have a shorter description here and split the full complexity
into a dedicated section. Very few users will find themselves needing
to e.g. push blobs or trees to refs/custom-namespace/* (or blobs or
trees at all), and that could be covered separately as an advanced
topic.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

push doc: move mention of "tag <tag>" later in the... Ævar Arnfjörð Bjarmason Fri, 31 Aug 2018 20:10:01 +0000 (20:10 +0000)

push doc: move mention of "tag <tag>" later in the prose

This change will be followed-up with a subsequent change where I'll
change both sides of this mention of "tag <tag>" to be something
that's best read without interruption.

To make that change smaller, let's move this mention of "tag <tag>" to
the end of the "<refspec>..." section, it's now somewhere in the
middle.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

push doc: remove confusing mention of remote mergerÆvar Arnfjörð Bjarmason Fri, 31 Aug 2018 20:10:00 +0000 (20:10 +0000)

push doc: remove confusing mention of remote merger

Saying that "git push <remote> <src>:<dst>" won't push a merger of
<src> and <dst> to <dst> is clear from the rest of the context here,
so mentioning it is redundant, furthermore the mention of "EXAMPLES
below" isn't specific or useful.

This phrase was originally added in 149f6ddfb3 ("Docs: Expand
explanation of the use of + in git push refspecs.", 2009-02-19), as
can be seen in that change the point of the example being cited was to
show that force pushing can leave unreferenced commits on the
remote. It's enough that we explain that in its own section, it
doesn't need to be mentioned here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch tests: add a test for clobbering tag behaviorÆvar Arnfjörð Bjarmason Fri, 31 Aug 2018 20:09:59 +0000 (20:09 +0000)

fetch tests: add a test for clobbering tag behavior

The test suite only incidentally (and unintentionally) tested for the
current behavior of eager tag clobbering on "fetch". This is a
followup to 380efb65df ("push tests: assert re-pushing annotated
tags", 2018-07-31) which tests for it explicitly.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

push tests: use spaces in interpolated stringÆvar Arnfjörð Bjarmason Fri, 31 Aug 2018 20:09:58 +0000 (20:09 +0000)

push tests: use spaces in interpolated string

The quoted -m'msg' option would mean the same as -mmsg when passed
through the test_force_push_tag helper. Let's instead use a string
with spaces in it, to have a working example in case we need to pass
other whitespace-delimited arguments to git-tag.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

push tests: make use of unused $1 in test descriptionÆvar Arnfjörð Bjarmason Fri, 31 Aug 2018 20:09:57 +0000 (20:09 +0000)

push tests: make use of unused $1 in test description

Fix up a logic error in 380efb65df ("push tests: assert re-pushing
annotated tags", 2018-07-31), where the $tag_type_description variable
was assigned to but never used, unlike in the subsequently added
companion test for fetches in 2d216a7ef6 ("fetch tests: add a test for
clobbering tag behavior", 2018-04-29).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: change "branch" to "reference" in --force -h... Ævar Arnfjörð Bjarmason Fri, 31 Aug 2018 20:09:56 +0000 (20:09 +0000)

fetch: change "branch" to "reference" in --force -h output

The -h output has been referring to the --force command as forcing the
overwriting of local branches, but since "fetch" more generally
fetches all sorts of references in all refs/ namespaces, let's talk
about forcing the update of a a "reference" instead.

This wording was initially introduced in 8320199873 ("Rewrite
builtin-fetch option parsing to use parse_options().", 2007-12-04).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config.mak.uname: resolve FreeBSD iconv-related compila... Eric Sunshine Fri, 31 Aug 2018 08:33:42 +0000 (04:33 -0400)

config.mak.uname: resolve FreeBSD iconv-related compilation warning

OLD_ICONV has long been needed by FreeBSD so config.mak.uname defines
it unconditionally. However, recent versions do not need it, and its
presence results in compilation warnings. Resolve this issue by defining
OLD_ICONV only for older FreeBSD versions.

Specifically, revision r281550[1], which is part of FreeBSD 11, removed
the need for OLD_ICONV, and r282275[2] back-ported that change to 10.2.
Versions prior to 10.2 do need it.

[1] https://github.com/freebsd/freebsd/commit/b0813ee288f64f677a2cebf7815754b027a8215b
[2] https://github.com/freebsd/freebsd/commit/b709ec868adb5170d09bc5a66b18d0e0d5987ab6

[es: commit message; tweak version check to distinguish 10.x versions]

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc/Makefile: drop doc-diff worktree and temporary... Eric Sunshine Fri, 31 Aug 2018 06:33:18 +0000 (02:33 -0400)

doc/Makefile: drop doc-diff worktree and temporary files on "make clean"

doc-diff creates a temporary working tree (git-worktree) and generates a
bunch of temporary files which it does not remove since they act as a
cache to speed up subsequent runs. Although doc-diff's working tree and
generated files are not strictly build products of the Makefile (which,
itself, never runs doc-diff), as a convenience, update "make clean" to
clean up doc-diff's working tree and generated files along with other
development detritus normally removed by "make clean".

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc-diff: add --clean mode to remove temporary working... Eric Sunshine Fri, 31 Aug 2018 06:33:17 +0000 (02:33 -0400)

doc-diff: add --clean mode to remove temporary working gunk

As part of its operation, doc-diff creates a bunch of temporary
working files and holds onto them in order to speed up subsequent
invocations. These files are never deleted. Moreover, it creates a
temporary working tree (via git-wortkree) which likewise never gets
removed.

Without knowing the implementation details of the tool, a user may not
know how to clean up manually afterward. Worse, the user may find it
surprising and alarming to discover a working tree which s/he did not
create explicitly.

To address these issues, add a --clean mode which removes the
temporary working tree and deletes all generated files.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc-diff: fix non-portable 'man' invocationEric Sunshine Fri, 31 Aug 2018 06:33:16 +0000 (02:33 -0400)

doc-diff: fix non-portable 'man' invocation

doc-diff invokes 'man' with the -l option to force "local" mode,
however, neither MacOS nor FreeBSD recognize this option. On those
platforms, if the argument to 'man' contains a slash, it is
automatically interpreted as a file specification, so a "local"-like
mode is not needed. And, it turns out, 'man' which does support -l
falls back to enabling -l automatically if it can't otherwise find a
manual entry corresponding to the argument. Since doc-diff always
passes an absolute path of the nroff source file to 'man', the -l
option kicks in anyhow, despite not being specified explicitly.
Therefore, make the invocation portable to the various platforms by
simply dropping -l.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc/git-branch: remove obsolete "-l" referencesJeff King Thu, 30 Aug 2018 20:04:53 +0000 (16:04 -0400)

doc/git-branch: remove obsolete "-l" references

The previous commit switched "-l" to meaning "--list", but a
few vestiges of its prior meaning as "--create-reflog"
remained:

- the synopsis mentioned "-l" when creating a new branch;
we can drop this entirely, as it has been the default
for years

- the --list command mentions the unfortunate "-l"
confusion, but we've now fixed that

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5303: use printf to generate delta basesJeff King Thu, 30 Aug 2018 19:13:39 +0000 (15:13 -0400)

t5303: use printf to generate delta bases

The exact byte count of the delta base file is important.
The test-delta helper will feed it to patch_delta(), which
will barf if it doesn't match the size byte given in the
delta. Using "echo" may end up with unexpected line endings
on some platforms (e.g,. "\r\n" instead of just "\n").

This actually wouldn't cause the test to fail (since we
already expect test-delta to complain about these bogus
deltas), but would mean that we're not exercising the code
we think we are.

Let's use printf instead (which we already trust to give us
byte-perfect output when we generate the deltas).

While we're here, let's tighten the 5-byte result size used
in the "truncated copy parameters" test. This just needs to
have enough room to attempt to parse the bogus copy command,
meaning 2 is sufficient. Using 5 was arbitrary and just
copied from the base size; since those no longer match, it's
simply confusing. Let's use a more meaningful number.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit: don't use generation numbers if not neededDerrick Stolee Thu, 30 Aug 2018 12:58:09 +0000 (05:58 -0700)

commit: don't use generation numbers if not needed

In 3afc679b "commit: use generations in paint_down_to_common()",
the queue in paint_down_to_common() was changed to use a priority
order based on generation number before commit date. This served
two purposes:

1. When generation numbers are present, the walk guarantees
correct topological relationships, regardless of clock skew in
commit dates.

2. It enables short-circuiting the walk when the min_generation
parameter is added in d7c1ec3e "commit: add short-circuit to
paint_down_to_common()". This short-circuit helps commands
like 'git branch --contains' from needing to walk to a merge
base when we know the result is false.

The commit message for 3afc679b includes the following sentence:

This change does not affect the number of commits that are
walked during the execution of paint_down_to_common(), only
the order that those commits are inspected.

This statement is incorrect. Because it changes the order in which
the commits are inspected, it changes the order they are added to
the queue, and hence can change the number of loops before the
queue_has_nonstale() method returns true.

This change makes a concrete difference depending on the topology
of the commit graph. For instance, computing the merge-base between
consecutive versions of the Linux kernel has no effect for versions
after v4.9, but 'git merge-base v4.8 v4.9' presents a performance
regression:

v2.18.0: 0.122s
v2.19.0-rc1: 0.547s
HEAD: 0.127s

To determine that this was simply an ordering issue, I inserted
a counter within the while loop of paint_down_to_common() and
found that the loop runs 167,468 times in v2.18.0 and 635,579
times in v2.19.0-rc1.

The topology of this case can be described in a simplified way
here:

v4.9
| \
| \
v4.8 \
| \ \
| \ |
... A B
| / /
| / /
|/__/
C

Here, the "..." means "a very long line of commits". By generation
number, A and B have generation one more than C. However, A and B
have commit date higher than most of the commits reachable from
v4.8. When the walk reaches v4.8, we realize that it has PARENT1
and PARENT2 flags, so everything it can reach is marked as STALE,
including A. B has only the PARENT1 flag, so is not STALE.

When paint_down_to_common() is run using
compare_commits_by_commit_date, A and B are removed from the queue
early and C is inserted into the queue. At this point, C and the
rest of the queue entries are marked as STALE. The loop then
terminates.

When paint_down_to_common() is run using
compare_commits_by_gen_then_commit_date, B is removed from the
queue only after the many commits reachable from v4.8 are explored.
This causes the loop to run longer. The reason for this regression
is simple: the queue order is intended to not explore a commit
until everything that _could_ reach that commit is explored. From
the information gathered by the original ordering, we have no
guarantee that there is not a commit D reachable from v4.8 that
can also reach B. We gained absolute correctness in exchange for
a performance regression.

The performance regression is probably the worse option, since
these incorrect results in paint_down_to_common() are rare. The
topology required for the performance regression are less rare,
but still require multiple merge commits where the parents differ
greatly in generation number. In our example above, the commit A
is as important as the commit B to demonstrate the problem, since
otherwise the commit C will sit in the queue as non-stale just as
long in both orders.

The solution provided uses the min_generation parameter to decide
if we should use generation numbers in our ordering. When
min_generation is equal to zero, it means that the caller has no
known cutoff for the walk, so we should rely on our commit-date
heuristic as before; this is the case with merge_bases_many().
When min_generation is non-zero, then the caller knows a valuable
cutoff for the short-circuit mechanism; this is the case with
remove_redundant() and in_merge_bases_many().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

patch-delta: handle truncated copy parametersJeff King Thu, 30 Aug 2018 07:12:52 +0000 (03:12 -0400)

patch-delta: handle truncated copy parameters

When we see a delta command instructing us to copy bytes
from the base, we have to read the offset and size from the
delta stream. We do this without checking whether we're at
the end of the stream, meaning we may read past the end of
the buffer.

In practice this isn't exploitable in any interesting way
because:

1. Deltas are always in packfiles, so we have at least a
20-byte trailer that we'll end up reading.

2. The worst case is that we try to perform a nonsense
copy from the base object into the result, based on
whatever was in the pack stream next. In most cases
this will simply fail due to our bounds-checks against
the base or the result.

But even if you carefully constructed a pack stream for
which it succeeds, it wouldn't perform any delta
operation that you couldn't have simply included in a
non-broken form.

But obviously it's poor form to read past the end of the
buffer we've been given. Unfortunately there's no easy way
to do a single length check, since the number of bytes we
need depends on the number of bits set in the initial
command byte. So we'll just check each byte as we parse. We
can hide the complexity in a macro; it's ugly, but not as
ugly as writing out each individual conditional.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

patch-delta: consistently report corruptionJann Horn Thu, 30 Aug 2018 07:10:26 +0000 (03:10 -0400)

patch-delta: consistently report corruption

When applying a delta, if we see an opcode that cannot be
fulfilled (e.g., asking to write more bytes than the
destination has left), we break out of our parsing loop but
don't signal an explicit error. We rely on the sanity check
after the loop to see if we have leftover delta bytes or
didn't fill our result buffer.

This can silently ignore corruption when the delta buffer
ends with a bogus command and the destination buffer is
already full. Instead, let's jump into the error handler
directly when we see this case.

Note that the tests also cover the "bad opcode" case, which
already handles this correctly.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

patch-delta: fix oob readJann Horn Thu, 30 Aug 2018 07:09:45 +0000 (03:09 -0400)

patch-delta: fix oob read

If `cmd` is in the range [0x01,0x7f] and `cmd > top-data`, the
`memcpy(out, data, cmd)` can copy out-of-bounds data from after `delta_buf`
into `dst_buf`.

This is not an exploitable bug because triggering the bug increments the
`data` pointer beyond `top`, causing the `data != top` sanity check after
the loop to trigger and discard the destination buffer - which means that
the result of the out-of-bounds read is never used for anything.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5303: test some corrupt deltasJeff King Thu, 30 Aug 2018 07:09:32 +0000 (03:09 -0400)

t5303: test some corrupt deltas

We don't have any tests that specifically check boundary
cases in patch_delta(). It obviously gets exercised by tests
which read from packfiles, but it's hard to create packfiles
with bogus deltas.

So let's cover some obvious boundary cases:

1. commands that overflow the result buffer

a. literal content from the delta

b. copies from a base

2. commands where the source isn't large enough

a. literal content from a truncated delta

b. copies that need more bytes than the base has

3. copy commands who parameters are truncated

And indeed, we have problems with both 2a and 3. I've marked
these both as expect_failure, though note that because they
involve reading past the end of a buffer, they will
typically only be caught when run under valgrind or ASan.

There's one more test here, too, which just applies a basic
delta. Since all of the other tests expect failure and we
don't otherwise use "test-tool delta" in the test suite,
this gives a sanity check that the tool works at all.

These are based on an earlier patch by Jann Horn
<jannh@google.com>.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

test-delta: read input into a heap bufferJeff King Thu, 30 Aug 2018 07:07:52 +0000 (03:07 -0400)

test-delta: read input into a heap buffer

We currently read the input to test-delta by mmap()-ing it.
However, memory-checking tools like valgrind and ASan are
less able to detect reads/writes past the end of an mmap'd
buffer, because the OS is likely to give us extra bytes to
pad out the final page size. So instead, let's read into a
heap buffer.

As a bonus, this also makes it possible to write tests with
empty bases, as mmap() will complain about a zero-length
map.

This is based on a patch by Jann Horn <jannh@google.com>
which actually aligned the data at the end of a page, and
followed it with another page marked with mprotect(). That
would detect problems even without a tool like ASan, but it
was significantly more complex and may have introduced
portability problems. By comparison, this approach pushes
the complexity onto existing memory-checking tools.

Note that this could be done even more simply by using
strbuf_read_file(), but that would defeat the purpose:
strbufs generally overallocate (and at the very least
include a trailing NUL which we do not care about), which
would defeat most memory checkers.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>