Andrew's git - gitweb.git/log

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:33 +0000 (18:14 +0200)

attr: remove index from git_attr_set_direction()

Since the attr checking API now takes the index, there's no need to set an
index in advance with this call. Most call sites are straightforward
because they either pass the_index or NULL (which previously fell back to
the_index). There's only one suspicious call site in
unpack-trees.c where it sets a different index.

This code in unpack-trees is about to check out entries from the
new/temporary index after merging is done in it. The attributes will
be used by entry.c code to do crlf conversion if needed. Since entry.c now
respects struct checkout's istate field, and this field is correctly
set in unpack-trees.c, there should be no regression from this change.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:32 +0000 (18:14 +0200)

entry.c: use the right index instead of the_index

checkout-index.c needs an update because if checkout->istate is NULL,
ie_match_stat() will crash. Previously this was ie_match_stat(&the_index, ..)
so it would not crash, but it was not technically correct either.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:31 +0000 (18:14 +0200)

submodule.c: use the right index instead of the_index

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:30 +0000 (18:14 +0200)

pathspec.c: use the right index instead of the_index

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:29 +0000 (18:14 +0200)

unpack-trees: avoid the_index in verify_absent()

Both functions that are updated in this commit are called by
verify_absent(), which is part of the "unpack-trees" operation that is
supposed to work on any index file specified by the caller. Thanks to
Brandon [1] [2], an implicit dependency on the_index is exposed. This
commit fixes it.

In both functions, it makes sense to use src_index to check for
exclusion because it's almost unchanged and should give us the same
outcome as if running the exclude check before the unpack.

It's "almost unchanged" because we do invalidate cache-tree and
untracked cache in the source index. But this should not affect how
exclude machinery uses the index: to see if a file is tracked, and to
read a blob from the index instead of worktree if it's marked
skip-worktree (i.e. it's not available in worktree).

[1] a0bba65b10 (dir: convert is_excluded to take an index - 2017-05-05)
[2] 2c1eb10454 (dir: convert read_directory to take an index - 2017-05-05)

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:28 +0000 (18:14 +0200)

unpack-trees: convert clear_ce_flags* to avoid the_index

Prior to fba92be8f7 [1], this code implicitly (and incorrectly) assumed
the_index when running the exclude machinery. fba92be8f7 makes this
problem clearer because the unpack-trees operation is supposed to
work on whatever index the caller specifies, not specifically
the_index.

Update the code to use the "istate" argument that originally comes from
mark_new_skip_worktree(). From the call sites, both in unpack_trees(),
you can see that this function works on two separate indexes:
o->src_index and o->result. The second mark_new_skip_worktree() call has
so far incorrectly applied exclude rules on o->src_index instead of
o->result. It's unclear what the consequences of this are, but it's
definitely wrong.

[1] fba92be8f7 (dir: convert is_excluded_from_list to take an index -
2017-05-05)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:27 +0000 (18:14 +0200)

unpack-trees: don't shadow global var the_index

This function mark_new_skip_worktree() has an argument named the_index
which is also the name of a global variable. While they have different
types (the global the_index is not a pointer), mistakes can easily
happen and it's also confusing for readers. Rename the function
argument to something other than the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:26 +0000 (18:14 +0200)

unpack-trees: add a note about path invalidation

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:25 +0000 (18:14 +0200)

unpack-trees: remove 'extern' on function declaration

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:24 +0000 (18:14 +0200)

ls-files: correct index argument to get_convert_attr_ascii()

write_eolinfo() does take an istate as function argument and it should
be used instead of the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:23 +0000 (18:14 +0200)

preload-index.c: use the right index instead of the_index

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:22 +0000 (18:14 +0200)

dir.c: remove an implicit dependency on the_index in pathspec code

Make the match_pathspec API and friends take an index_state instead
of assuming the_index in dir.c. All external call sites are converted
blindly to keep the patch simple and retain current behavior.
Individual call sites may receive further updates to use the right
index instead of the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:21 +0000 (18:14 +0200)

convert.c: remove an implicit dependency on the_index

Make the convert API take an index_state instead of assuming the_index
in convert.c. All external call sites are converted blindly to keep
the patch simple and retain current behavior. Individual call sites
may receive further updates to use the right index instead of
the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:20 +0000 (18:14 +0200)

attr: remove an implicit dependency on the_index

Make the attr API take an index_state instead of assuming the_index in
attr code. All call sites are converted blindly to keep the patch
simple and retain current behavior. Individual call sites may receive
further updates to use the right index instead of the_index.

There is one ugly temporary workaround added in attr.c that needs some
more explanation.

Commit c24f3abace (apply: file commited with CRLF should roundtrip
diff and apply - 2017-08-19) forces one convert_to_git() call to NOT
read the index at all. But what do you know, we read it anyway by
falling back to the_index. When "istate" from convert_to_git is now
propagated down to read_attr_from_array() we will hit segfault
somewhere inside read_blob_data_from_index.

The right way of dealing with this is to kill "use_index" variable and
only follow "istate" but at this stage we are not ready for that:
while most git_attr_set_direction() calls just pass the_index to be
assigned to use_index, unpack-trees passes a different one which is
used by entry.c code, which has no way to know what index to use if we
delete use_index. So this has to be done later.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:19 +0000 (18:14 +0200)

cache-tree: wrap the_index based wrappers with #ifdef

This puts update_main_cache_tree() and write_cache_as_tree() in the
same group of "index compat" functions that assume the_index
implicitly, which should only be used within builtin/ or t/helper.

sequencer.c is also updated to not use these functions. As of now, no
files outside builtin/ use these functions anymore.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nguyễn Thái Ngọc Duy, Mon, 13 Aug 2018 16:14:18 +0000 (18:14 +0200)

diff.c: move read_index() code back to the caller

This code is only needed for diff-tree (since f0c6b2a2fd ([PATCH]
Optimize diff-tree -[CM] --stdin - 2005-05-27)). Let the caller do the
preparation instead and avoid read_index() in diff.c code.

read_index() should be avoided (in addition to the_index) because it
uses get_index_file() underneath to get the path $GIT_DIR/index. This
effectively pulls the_repository in and may become the only reason to
pull a 'struct repository *' in diff.c. Let's keep the dependencies as
few as possible and kick it back to diff-tree.c.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Jeff King, Fri, 10 Aug 2018 23:24:57 +0000 (19:24 -0400)

cat-file: support "unordered" output for --batch-all-objects

If you're going to access the contents of every object in a
packfile, it's generally much more efficient to do so in
pack order, rather than in hash order. That increases the
locality of access within the packfile, which in turn is
friendlier to the delta base cache, since the packfile puts
related deltas next to each other. By contrast, hash order
is effectively random, since the sha1 has no discernible
relationship to the content.

This patch introduces an "--unordered" option to cat-file
which iterates over packs in pack-order under the hood. You
can see the results when dumping all of the file content:

    $ time ./git cat-file --batch-all-objects --buffer --batch | wc -c
    6883195596

    real 0m44.491s
    user 0m42.902s
    sys 0m5.230s

    $ time ./git cat-file --unordered \
        --batch-all-objects --buffer --batch | wc -c
    6883195596

    real 0m6.075s
    user 0m4.774s
    sys 0m3.548s

Same output, different order, way faster. The same speed-up
applies even if you end up accessing the object content in a
different process, like:

    git cat-file --batch-all-objects --buffer --batch-check |
    grep blob |
    git cat-file --batch='%(objectname) %(rest)' |
    wc -c

Adding "--unordered" to the first command drops the runtime
in git.git from 24s to 3.5s.

Side note: there are actually further speedups available
for doing it all in-process now. Since we are outputting
the object content during the actual pack iteration, we
know where to find the object and could skip the extra
lookup done by oid_object_info(). This patch stops short
of that optimization since the underlying API isn't ready
for us to make those sorts of direct requests.

So if --unordered is so much better, why not make it the
default? Two reasons:

1. We've promised in the documentation that --batch-all-objects
outputs in hash order. Since cat-file is plumbing,
people may be relying on that default, and we can't
change it.

2. It's actually _slower_ for some cases. We have to
compute the pack revindex to walk in pack order. And
our de-duplication step uses an oidset, rather than a
sort-and-dedup, which can end up being more expensive.
If we're just accessing the type and size of each
object, for example, like:

    git cat-file --batch-all-objects --buffer --batch-check

my best-of-five warm cache timings go from 900ms to
1100ms using --unordered. Though it's possible in a
cold-cache or under memory pressure that we could do
better, since we'd have better locality within the
packfile.
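
The two de-duplication strategies mentioned in point 2 can be sketched
in shell. This is only an analogy, assuming made-up abbreviated object
names; awk's associative array stands in for the oidset, and nothing
here is taken from the cat-file implementation:

```shell
# Object names as they might arrive when one object is both loose and
# packed (made-up abbreviated IDs, in arrival order).
ids='c0ffee
abad1d
c0ffee
5ca1ab
abad1d'

# Sort-and-dedup: the output ends up in sorted (hash) order.
sorted=$(printf '%s\n' "$ids" | LC_ALL=C sort -u)

# Set-based dedup: remember what was seen, keep only first occurrences,
# and preserve arrival order.
unordered=$(printf '%s\n' "$ids" | awk '!seen[$0]++')

printf 'sorted:    %s\n' "$(echo $sorted)"
printf 'unordered: %s\n' "$(echo $unordered)"
```

The set-based variant has to remember every name it has seen, which is
one way such a step can cost more than a sort followed by a linear
de-dup pass.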

And one final question: why is it "--unordered" and not
"--pack-order"? The answer is again two-fold:

1. "pack order" isn't a well-defined thing across the
whole set of objects. We're hitting loose objects, as
well as objects in multiple packs, and the only
ordering we're promising is _within_ a single pack. The
rest is apparently random.

2. The point here is optimization. So we don't want to
promise any particular ordering, but only to say that
we will choose an ordering which is likely to be
efficient for accessing the object content. That leaves
the door open for further changes in the future without
having to add another compatibility option.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Jeff King, Fri, 10 Aug 2018 23:17:14 +0000 (19:17 -0400)

cat-file: rename batch_{loose,packed}_object callbacks

We're not really doing the batch-show operation in these
callbacks, but just collecting the set of objects. That
distinction will become more important in a future patch, so
let's rename them now to avoid cluttering that diff.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Jeff King, Fri, 10 Aug 2018 23:16:40 +0000 (19:16 -0400)

t1006: test cat-file --batch-all-objects with duplicates

The test for --batch-all-objects in t1006 covers a variety
of object storage situations, but one thing it doesn't cover
is that we avoid mentioning duplicate objects. We won't have
any because running "git repack -ad" will have packed them
all and deleted the loose ones.

This does work (because we sort and de-dup the output list),
but it's good to include it in our test. And doubly so for
when we add an unordered mode which has to de-dup in a
different way.

Note that we cannot just re-create one of the objects, as
Git will omit the write of an object that is already
present. However, we can create a new pack with one of the
objects, which forces the duplication.

One alternative would be to just use "git repack -a" instead
of "-ad". But then _every_ object would be duplicated as
loose and packed, and we might miss a bug that omits packed
objects (because we'd show their loose counterparts).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Jeff King, Fri, 10 Aug 2018 23:15:49 +0000 (19:15 -0400)

for_each_packed_object: support iterating in pack-order

We currently iterate over objects within a pack in .idx
order, which uses the object hashes. That means that it
is effectively random with respect to the location of the
object within the pack. If you're going to access the actual
object data, there are two reasons to move linearly through
the pack itself:

1. It improves the locality of access in the packfile. In
the cold-cache case, this may mean fewer disk seeks, or
better usage of disk cache.

2. We store related deltas together in the packfile. Which
means that the delta base cache can operate much more
efficiently if we visit all of those related deltas in
sequence, as the earlier items are likely to still be
in the cache. Whereas if we visit the objects in
random order, our cache entries are much more likely to
have been evicted by unrelated deltas in the meantime.

So in general, if you're going to access the object contents,
pack order is going to end up more efficient.

But if you're simply generating a list of object names, or
if you're going to end up sorting the result anyway, you're
better off just using the .idx order, as finding the pack
order means generating the in-memory pack-revindex.
According to the numbers in 8b8dfd5132 (pack-revindex:
radix-sort the revindex, 2013-07-11), that takes about 200ms
for linux.git, and 20ms for git.git (those numbers are a few
years old but are still a good ballpark).
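
As a rough illustration of the two orders, here is a toy model in
shell. The "index" below is a made-up list of object names and pack
offsets, and the sort by offset merely stands in for what the
pack-revindex provides; it is not how Git computes it:

```shell
# Toy stand-in for a pack .idx: "object-name offset-in-pack", sorted by
# name as the real index is. Names and offsets are made up.
idx='1a2b 900
3c4d 12
5e6f 450
7a8b 130'

# Hash (.idx) order: just walk the index as stored.
hash_order=$(printf '%s\n' "$idx" | cut -d' ' -f1)

# Pack order: sort by offset first. Building this mapping is the extra
# up-front cost that plays the role of the pack revindex here.
pack_order=$(printf '%s\n' "$idx" | sort -k2,2n | cut -d' ' -f1)

echo "hash order: $(echo $hash_order)"
echo "pack order: $(echo $pack_order)"
```

If all you want is the list of names, the first form is strictly less
work; the second pays off only when the objects are then read from the
pack in that order.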

That makes it a good optimization for some cases (we can
save tens of seconds in git.git by having good locality of
delta access, for a 20ms cost), but a bad one for others
(e.g., right now "cat-file --batch-all-objects
--batch-check='%(objectname)'" is 170ms in git.git, so adding
20ms to that is noticeable).

Hence this patch makes it an optional flag. You can't
actually do any interesting timings yet, as it's not plumbed
through to any user-facing tools like cat-file. That will
come in a later patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Jeff King, Fri, 10 Aug 2018 23:11:14 +0000 (19:11 -0400)

for_each_*_object: give more comprehensive docstrings

We already mention the local/alternate behavior of these
functions, but we can help clarify a few other behaviors:

- there's no need to mention LOCAL_ONLY specifically, since
we already reference the flags by type (and as we add
more flags, we don't want to have to mention each)

- clarify that reachability doesn't matter here; this is
all accessible objects

- what ordering/uniqueness guarantees we give

- how pack-specific flags are handled for the loose case

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Jeff King, Fri, 10 Aug 2018 23:09:44 +0000 (19:09 -0400)

for_each_*_object: take flag arguments as enum

It's not wrong to pass our flags in an "unsigned", as we
know it will be at least as large as the enum. However,
using the enum in the declaration makes it more obvious
where to find the list of flags.

While we're here, let's also drop the "extern" noise-words
from the declarations, per our modern coding style.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Jeff King, Fri, 10 Aug 2018 23:09:06 +0000 (19:09 -0400)

for_each_*_object: store flag definitions in a single location

These flags were split between cache.h and packfile.h,
because some of the flags apply only to packs. However, they
share a single numeric namespace, since both are respected
for the packed variant. Let's make sure they're defined
together so that nobody accidentally adds a new flag in one
location that duplicates the other.

While we're here, let's also put them in an enum (which
helps debugger visibility) and use "(1<<n)" rather than
counting powers of 2 manually.
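
The idea of a single numeric namespace with one bit per flag can be
shown with shell arithmetic. This is a loose analogue of the C enum,
not the actual declaration; the flag names and the has_flag helper are
hypothetical:

```shell
# Hypothetical flag names, all defined in one place; each occupies its
# own bit, so no two definitions can silently collide.
FLAG_LOCAL_ONLY=$((1 << 0))
FLAG_PROMISOR_ONLY=$((1 << 1))
FLAG_PACK_ORDER=$((1 << 2))

# Succeed when the combined flags in $1 include the flag in $2.
has_flag () {
	[ $(( $1 & $2 )) -ne 0 ]
}

flags=$((FLAG_LOCAL_ONLY | FLAG_PACK_ORDER))
echo "flags=$flags"
has_flag "$flags" "$FLAG_PACK_ORDER" && echo "pack order requested"
has_flag "$flags" "$FLAG_PROMISOR_ONLY" || echo "promisor-only not requested"
```

Writing the shift count makes the next free bit obvious, which is
harder to see in a list of literal powers of two spread over two
headers.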

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Ævar Arnfjörð Bjarmason, Mon, 13 Aug 2018 19:22:49 +0000 (19:22 +0000)

pull doc: fix a long-standing grammar error

It should be "is not an empty string" not "is not empty string". This
fixes wording originally introduced in ab9b31386b ("Documentation:
multi-head fetch.", 2005-08-24).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Ævar Arnfjörð Bjarmason, Mon, 13 Aug 2018 19:22:48 +0000 (19:22 +0000)

fetch tests: correct a comment "remove it" -> "remove them"

Correct a comment referring to the removal of just the branch to also
refer to the tag. This should have been changed in my
ca3065e7e7 ("fetch tests: add a tag to be deleted to the pruning
tests", 2018-02-09) when the tag deletion was added, but I missed it
at the time.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Eric Sunshine, Mon, 13 Aug 2018 08:47:39 +0000 (04:47 -0400)

chainlint: add test of pathological case which triggered false positive

This extract from contrib/subtree/t7900 triggered a false positive due
to three chainlint limitations:

* recognizing only a "blessed" set of here-doc tag names in a subshell
("EOF", "EOT", "INPUT_END"), of which "TXT" is not a member

* inability to recognize multi-line $(...) when the first statement of
the body is cuddled with the opening "$("

* inability to recognize multiple constructs on a single line, such as
opening a multi-line $(...) and starting a here-doc

Now that all of these shortcomings have been addressed, turn this rather
pathological bit of shell coding into a chainlint test case.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Eric Sunshine, Mon, 13 Aug 2018 08:47:38 +0000 (04:47 -0400)

chainlint: recognize multi-line quoted strings more robustly

chainlint.sed recognizes multi-line quoted strings within subshells:

    echo "abc
    def" >out &&

so it can avoid incorrectly classifying lines internal to the string as
breaking the &&-chain. To identify the first line of a multi-line
string, it checks if the line contains exactly one quote character. However, this is
fragile and can be easily fooled by a line containing multiple strings:

    echo "xyz" "abc
    def" >out &&

Make detection more robust by checking for an odd number of quotes
rather than only a single one.

(Escaped quotes are not handled, but support may be added later.)
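
The parity check can be sketched as a pair of shell functions. This is
an illustration of the idea only, not the sed implementation; the
function names are hypothetical, only double quotes are counted, and
escaped quotes are ignored just as noted above:

```shell
# Count the double-quote characters on one line.
count_dquotes () {
	printf '%s' "$1" | tr -cd '"' | wc -c | tr -d ' '
}

# Succeed when the line leaves a double-quoted string open, i.e. when it
# contains an odd number of double quotes.
opens_multiline_string () {
	n=$(count_dquotes "$1")
	[ $((n % 2)) -eq 1 ]
}

for line in 'echo "abc' 'echo "xyz" "abc' 'echo "xyz" "abc" >out &&'
do
	if opens_multiline_string "$line"
	then
		echo "open:   $line"
	else
		echo "closed: $line"
	fi
done
```

A check for exactly one quote would classify the second line as
closed, which is the misdetection described above.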

The original multi-line string recognizer rather cavalierly threw away
all but the final quote, whereas the new one is careful to retain all
quotes, so the "expected" output of a couple existing chainlint tests is
updated to account for this new behavior.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Eric Sunshine, Mon, 13 Aug 2018 08:47:37 +0000 (04:47 -0400)

chainlint: let here-doc and multi-line string commence on same line

After swallowing a here-doc, chainlint.sed assumes that no other
processing needs to be done on the line aside from checking for &&-chain
breakage; likewise, after folding a multi-line quoted string. However,
it's conceivable (even if unlikely in practice) that both a here-doc and
a multi-line quoted string might commence on the same line:

    cat <<\EOF && echo "foo
    bar"
    data
    EOF

Support this case by sending the line (after swallowing and folding)
through the normal processing sequence rather than jumping directly to
the check for broken &&-chain.

This change also allows other somewhat pathological cases to be handled,
such as closing a subshell on the same line starting a here-doc:

    (
    cat <<-\INPUT)
    data
    INPUT

or, for instance, opening a multi-line $(...) expression on the same
line starting a here-doc:

    x=$(cat <<-\END &&
    data
    END
    echo "x")

among others.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Eric Sunshine, Mon, 13 Aug 2018 08:47:36 +0000 (04:47 -0400)

chainlint: recognize multi-line $(...) when command cuddled with "$("

For multi-line $(...) expressions nested within subshells, chainlint.sed
only recognizes:

    x=$(
    echo foo &&
    ...

but it is not unlikely that test authors may also cuddle the command
with the opening "$(", so support that style, as well:

    x=$(echo foo &&
    ...

The closing ")" is already correctly recognized when cuddled or not.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Eric Sunshine, Mon, 13 Aug 2018 08:47:35 +0000 (04:47 -0400)

chainlint: match 'quoted' here-doc tags

A here-doc tag can be quoted ('EOF') or escaped (\EOF) to suppress
interpolation within the body. Although chainlint recognizes escaped
tags, it does not know about quoted tags. For completeness, teach it to
recognize quoted tags, as well.
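
For readers less familiar with the two spellings, the following small
script shows their effect on interpolation. The variable names are
only for illustration:

```shell
name=world

# Unquoted tag: the body undergoes parameter expansion.
plain=$(cat <<EOF
hello $name
EOF
)

# Quoted tag: the body is taken literally.
quoted=$(cat <<'EOF'
hello $name
EOF
)

# Escaped tag: also literal, equivalent to the quoted form.
escaped=$(cat <<\EOF
hello $name
EOF
)

printf '%s\n' "$plain" "$quoted" "$escaped"
```

Since both spellings behave the same to the shell, a linter that
swallows here-doc bodies needs to accept either one as the opening
tag.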

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Eric Sunshine, Mon, 13 Aug 2018 08:47:34 +0000 (04:47 -0400)

chainlint: match arbitrary here-docs tags rather than hard-coded names

chainlint.sed swallows top-level here-docs to avoid being fooled by
content which might look like start-of-subshell. It likewise swallows
here-docs in subshells to avoid marking content lines as breaking the
&&-chain, and to avoid being fooled by content which might look like
end-of-subshell, start-of-nested-subshell, or other specially-recognized
constructs.

At the time of implementation, it was believed that it was not possible
to support arbitrary here-doc tag names since 'sed' provides no way to
stash the opening tag name in a variable for later comparison against a
line signaling end-of-here-doc. Consequently, tag names are hard-coded,
with "EOF" being the only tag recognized at the top-level, and only
"EOF", "EOT", and "INPUT_END" being recognized within subshells. Also,
special care was taken to avoid being confused by here-docs nested
within other here-docs.

In practice, this limited number of hard-coded tag names has been "good
enough" for the 13000+ existing Git tests, despite many of those tests
using tags other than the recognized ones, since the bodies of those
here-docs do not contain content which would fool the linter.
Nevertheless, the situation is not ideal since someone writing new
tests and choosing a name not in the "blessed" set could potentially
trigger a false positive.

To address this shortcoming, upgrade chainlint.sed to handle arbitrary
here-doc tag names, both at the top-level and within subshells.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Nicholas Guriev, Mon, 13 Aug 2018 05:09:29 +0000 (08:09 +0300)

mergetool: don't suggest to continue after last file

Eliminate an unnecessary prompt to continue after a failed merge by
not calling the prompt_after_failed_merge function when only one
iteration remains.

Use positional parameters to count files in the list to make it
easier to see if we have any more paths to process from within the
loop.
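
The looping pattern might look roughly like the sketch below. It is
not the mergetool code itself: merge_one and prompt_to_continue are
hypothetical stand-ins, and the "prompt" only counts how often it
would have been shown:

```shell
# Hypothetical stand-in for the per-file merge step: fails for *.bad paths.
merge_one () {
	case "$1" in
	*.bad) return 1 ;;
	*) return 0 ;;
	esac
}

# Hypothetical stand-in for the interactive prompt; only records the call.
prompts=0
prompt_to_continue () {
	prompts=$((prompts + 1))
}

process_all () {
	# Inside the function, "$@" holds the paths still to be processed,
	# so "$#" says how many remain, including the current one.
	while [ $# -ne 0 ]
	do
		if ! merge_one "$1"
		then
			# Ask only if another path follows this one.
			if [ $# -gt 1 ]
			then
				prompt_to_continue
			fi
		fi
		shift
	done
}

process_all a.txt b.bad c.txt d.bad
echo "prompts shown: $prompts"
```

Because "shift" consumes one path per iteration, no separate counter
or lookahead is needed to tell whether the current path is the last.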

Signed-off-by: Nicholas Guriev <guriev-ns@ya.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

SZEDER Gábor, Mon, 13 Aug 2018 00:30:10 +0000 (02:30 +0200)

t5318: avoid unnecessary command substitutions

Two tests added in dade47c06c (commit-graph: add repo arg to graph
readers, 2018-07-11) prepare the contents of 'expect' files by
'echo'ing the results of command substitutions. That's unnecessary;
avoid them by directly saving the output of the commands executed in
those command substitutions.
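
A small sketch of the two forms, using a hypothetical "flaky" command
in place of the real test commands. It also shows a side effect that
the message above does not mention but that follows from how the shell
works: the exit status of a command inside "$(...)" is not seen by the
surrounding echo.

```shell
tmp=$(mktemp -d)

# Hypothetical command that prints a line and then fails.
flaky () {
	echo "partial output"
	return 1
}

# Roundabout form: the output is captured into a string and echoed back.
# The status observed afterwards is that of echo, so the failure of
# flaky goes unnoticed.
echo "$(flaky)" >"$tmp/expect_via_echo"
status_via_echo=$?

# Direct form: same file contents, and the failure is visible.
status_direct=0
flaky >"$tmp/expect_direct" || status_direct=$?

echo "via echo: $status_via_echo, direct: $status_direct"
```

Both files end up with identical contents; only the direct form lets
an &&-chain notice that the command failed.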

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

SZEDER Gábor, Mon, 13 Aug 2018 11:52:43 +0000 (13:52 +0200)

t5318: use 'test_cmp_bin' to compare commit-graph files

The commit-graph files are binary files, so they should not be
compared with 'test_cmp', because that might cause issues like
crashing[1] or an infinite loop[2] on Windows, where 'test_cmp' is a
shell function to deal with random LF-CRLF conversions[3].

Use 'test_cmp_bin' instead.
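
To illustrate why a byte-for-byte comparison is the right tool for
binary files, here is a minimal sketch built on plain cmp. The helper
name same_bytes is hypothetical and is not the test-suite function;
the file contents are made up:

```shell
tmp=$(mktemp -d)

# Two byte-identical files containing non-text bytes and a CR LF pair.
printf 'graph\001\r\n\377' >"$tmp/a"
printf 'graph\001\r\n\377' >"$tmp/b"
# A third file that differs only in its line ending.
printf 'graph\001\n\377' >"$tmp/c"

# Hypothetical helper: byte-for-byte comparison, no line-ending handling.
same_bytes () {
	cmp -s "$1" "$2"
}

same_bytes "$tmp/a" "$tmp/b" && echo "a and b are identical"
same_bytes "$tmp/a" "$tmp/c" || echo "a and c differ"
```

A comparison that tolerates LF/CRLF differences would treat the first
and third files as equal, which is not what is wanted for a binary
format.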

1 - b93e6e3663 (t5000, t5003: do not use test_cmp to compare binary
files, 2014-06-04)
2 - f9f3851b4d (t9300: use test_cmp_bin instead of test_cmp to compare
binary files, 2014-09-12)
3 - 4d715ac05c (Windows: a test_cmp that is agnostic to random LF <>
CRLF conversions, 2013-10-26)

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Johannes Schindelin, Mon, 13 Aug 2018 11:33:32 +0000 (04:33 -0700)

range-diff: use dim/bold cues to improve dual color mode

It *is* a confusing thing to look at a diff of diffs. All too easy is it
to mix up whether the -/+ markers refer to the "inner" or the "outer"
diff, i.e. whether a `+` indicates that a line was added by either the
old or the new diff (or both), or whether the new diff does something
different than the old diff.

To make things easier to process for normal developers, we introduced
the dual color mode which colors the lines according to the commit diff,
i.e. lines that are added by a commit (whether old, new, or both) are
colored in green. In non-dual color mode, the lines would be colored
according to the outer diff: if the old commit added a line, it would be
colored red (because that line addition is only present in the first
commit range that was specified on the command-line, i.e. the "old"
commit, but not in the second commit range, i.e. the "new" commit).

However, this dual color mode is still not making things clear enough,
as we are looking at two levels of diffs, and we still only pick a color
according to *one* of them (the outer diff marker is colored
differently, of course, but in particular with deep indentation, it is
easy to lose track of that outer diff marker's background color).

Therefore, let's add another dimension to the mix. Still use
green/red/normal according to the commit diffs, but now also dim the
lines that were only in the old commit, and use bold face for the lines
that are only in the new commit.

That way, it is much easier not to lose track of, say, when we are
looking at a line that was added in the previous iteration of a patch
series but the new iteration adds a slightly different version: the
obsolete change will be dimmed, the current version of the patch will be
bold.

At least this developer has a much easier time reading the range-diffs
that way.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Johannes Schindelin, Mon, 13 Aug 2018 11:33:30 +0000 (04:33 -0700)

range-diff: make --dual-color the default mode

After using this command extensively for the last two months, this
developer came to the conclusion that even if the dual color mode still
leaves a lot of room for confusion about what was actually changed, the
non-dual color mode is substantially worse in that regard.

Therefore, we really want to make the dual color mode the default.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Johannes Schindelin, Mon, 13 Aug 2018 11:33:28 +0000 (04:33 -0700)

range-diff: left-pad patch numbers

As pointed out by Elijah Newren, tbdiff has this neat little alignment
trick where it outputs the commit pairs with patch numbers that are
padded to the maximal patch number's width:

      1: cafedead =   1: acefade first patch
    [...]
    314: beefeada < 314: facecab up to PI!

Let's do the same in range-diff, too.
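
The alignment described above can be sketched in a few lines of Python. This is only an illustration of the padding idea, not the C code from the patch; the function name `format_pairs` and the tuple layout are assumptions made for the example.

```python
def format_pairs(pairs):
    """Format commit pairs with patch numbers right-aligned to a common width.

    Each pair is (old_no, old_oid, status, new_no, new_oid, subject); this
    tuple layout is hypothetical and chosen only for the illustration.
    """
    # Width of the largest patch number on either side.
    width = len(str(max(max(p[0], p[3]) for p in pairs)))
    return [
        f"{old_no:>{width}}: {old_oid} {status} {new_no:>{width}}: {new_oid} {subject}"
        for old_no, old_oid, status, new_no, new_oid, subject in pairs
    ]
```

Computing the width once from the largest number keeps every row aligned regardless of how many patches the series contains.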

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

completion: support `git range-diff` Johannes Schindelin Mon, 13 Aug 2018 11:33:27 +0000 (04:33 -0700)

completion: support `git range-diff`

Tab completion of `git range-diff` is very convenient, especially
given that the revision arguments to specify the commit ranges to
compare are typically more complex than, say, what is normally passed
to `git log`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: populate the man page Johannes Schindelin Mon, 13 Aug 2018 11:33:25 +0000 (04:33 -0700)

range-diff: populate the man page

The bulk of this patch consists of a heavily butchered version of
tbdiff's README written by Thomas Rast and Thomas Gummerer, lifted from
https://github.com/trast/tbdiff.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff --dual-color: skip white-space warnings Johannes Schindelin Mon, 13 Aug 2018 11:33:24 +0000 (04:33 -0700)

range-diff --dual-color: skip white-space warnings

When displaying a diff of diffs, it is possible that there is an outer
`+` before a context line. That happens when the context changed between
old and new commit. When that context line starts with a tab (after the
space that marks it as context line), our diff machinery spits out a
white-space error (space before tab), but in this case, that is
incorrect.

Rather than adding a specific whitespace flag that specifically ignores
the first space in the output (and might miss other problems with the
white-space warnings), let's just skip handling white-space errors in
dual color mode to begin with.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: offer to dual-color the diffs Johannes Schindelin Mon, 13 Aug 2018 11:33:22 +0000 (04:33 -0700)

range-diff: offer to dual-color the diffs

When showing what changed between old and new commits, we show a diff of
the patches. This diff is a diff between diffs, therefore there are
nested +/- signs, and it can be relatively hard to understand what is
going on.

With the --dual-color option, the preimage and the postimage are colored
like the diffs they are, and the *outer* +/- sign is inverted for
clarity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

diff: add an internal option to dual-color diffs of... Johannes Schindelin Mon, 13 Aug 2018 11:33:20 +0000 (04:33 -0700)

diff: add an internal option to dual-color diffs of diffs

When diffing diffs, it can be quite daunting to figure out what the heck
is going on, as there are nested +/- signs.

Let's make this easier by adding a flag in diff_options that allows
color-coding the outer diff sign with inverted colors, so that the
preimage and postimage are colored like the diffs they are.

Of course, this really only makes sense when the preimage and postimage
*are* diffs. So let's not expose this flag via a command-line option for
now.

This is a feature that was invented by git-tbdiff, and it will be used
by `git range-diff` in the next commit, by offering it via a new option:
`--dual-color`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

color: add the meta color GIT_COLOR_REVERSE Johannes Schindelin Mon, 13 Aug 2018 11:33:19 +0000 (04:33 -0700)

color: add the meta color GIT_COLOR_REVERSE

This "color" simply swaps background and foreground. It will be used
in the upcoming "dual color" mode of `git range-diff`, where we will
reverse colors for the -/+ markers and the fragment headers of the
"outer" diff.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: use color for the commit pairs Johannes Schindelin Mon, 13 Aug 2018 11:33:18 +0000 (04:33 -0700)

range-diff: use color for the commit pairs

Arguably the most important part of `git range-diff`'s output is the
list of commits in the two branches, together with their relationships.

For that reason, tbdiff introduced color-coding that is pretty
intuitive, especially for unchanged patches (all dim yellow, like the
first line in `git show`'s output) vs modified patches (old commit is
red, new commit is green). Let's imitate that color scheme.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: add tests Thomas Rast Mon, 13 Aug 2018 11:33:16 +0000 (04:33 -0700)

range-diff: add tests

These are essentially lifted from https://github.com/trast/tbdiff, with
light touch-ups to account for the command now being named `git
range-diff`.

Apart from renaming `tbdiff` to `range-diff`, only one test case needed
to be adjusted: 11 - 'changed message'.

The underlying reason it had to be adjusted is that diff generation is
sometimes ambiguous. In this case, a comment line and an empty line are
added, but it is ambiguous whether they were added after the existing
empty line, or whether an empty line and the comment line are added
*before* the existing empty line. And apparently xdiff picks a different
option here than Python's difflib.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: do not show "function names" in hunk headers Johannes Schindelin Mon, 13 Aug 2018 11:33:14 +0000 (04:33 -0700)

range-diff: do not show "function names" in hunk headers

We are comparing complete, formatted commit messages with patches. There
are no function names here, so stop looking for them.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: adjust the output of the commit pairs Johannes Schindelin Mon, 13 Aug 2018 11:33:13 +0000 (04:33 -0700)

range-diff: adjust the output of the commit pairs

This not only uses "dashed stand-ins" for "pairs" where one side is
missing (i.e. unmatched commits that are present only in one of the two
commit ranges), but also adds onelines for the reader's pleasure.

This change brings `git range-diff` yet another step closer to
feature parity with tbdiff: it now shows the oneline, too, and indicates
with `=` when the commits have identical diffs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: suppress the diff headers Johannes Schindelin Mon, 13 Aug 2018 11:33:11 +0000 (04:33 -0700)

range-diff: suppress the diff headers

When showing the diff between corresponding patches of the two branch
versions, we have to make up a fake filename to run the diff machinery.

That filename does not carry any meaningful information, hence tbdiff
suppresses it. So we should, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: indent the diffs just like tbdiff Johannes Schindelin Mon, 13 Aug 2018 11:33:10 +0000 (04:33 -0700)

range-diff: indent the diffs just like tbdiff

The main information in the `range-diff` view comes from the list of
matching and non-matching commits; the diffs are additional information.
Indenting them helps with the reading flow.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: right-trim commit messages Johannes Schindelin Mon, 13 Aug 2018 11:33:08 +0000 (04:33 -0700)

range-diff: right-trim commit messages

When comparing commit messages, we need to keep in mind that they are
indented by four spaces. That is, empty lines are no longer empty, but
have "trailing whitespace". When displaying them in color, that results
in those nagging red lines.

Let's just right-trim the lines in the commit message; it's not like
trailing white-space in the commit messages is important enough to care
about in `git range-diff`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: also show the diff between patches Johannes Schindelin Mon, 13 Aug 2018 11:33:07 +0000 (04:33 -0700)

range-diff: also show the diff between patches

Just like tbdiff, we now show the diff between matching patches. This is
a "diff of two diffs", so it can be a bit daunting to read for the
beginner.

An alternative would be to display an interdiff, i.e. the hypothetical
diff which is the result of first reverting the old diff and then
applying the new diff.

Especially when rebasing frequently, an interdiff is often not feasible,
though: if the old diff cannot be applied in reverse (due to a moving
upstream), an interdiff can simply not be inferred.

This commit brings `range-diff` closer to feature parity with tbdiff.

To make `git range-diff` respect e.g. color.diff.* settings, we have
to adjust git_branch_config() accordingly.

Note: while we now parse diff options such as --color, the effect is not
yet the same as in tbdiff, where also the commit pairs would be colored.
This is left for a later commit.

Note also: while tbdiff accepts the `--no-patches` option to suppress
these diffs between patches, we prefer the `-s` (or `--no-patch`) option
that is automatically supported via our use of diff_opt_parse().

And finally note: to support diff options, we have to call
`parse_options()` such that it keeps unknown options, and then loop over
those and let `diff_opt_parse()` handle them. After that loop, we have
to call `parse_options()` again, to make sure that no unknown options
are left.

Helped-by: Thomas Gummerer <t.gummerer@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: improve the order of the shown commits Johannes Schindelin Mon, 13 Aug 2018 11:33:05 +0000 (04:33 -0700)

range-diff: improve the order of the shown commits

This patch lets `git range-diff` use the same order as tbdiff.

The idea is simple: for left-to-right readers, it is natural to assume
that the `git range-diff` is performed between an older vs a newer
version of the branch. As such, the user is probably more interested in
the question "where did this come from?" rather than "where did that one
go?".

To that end, we list the commits in the order of the second commit range
("the newer version"), inserting the unmatched commits of the first
commit range as soon as all their predecessors have been shown.
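
The ordering rule above can be sketched as follows. This is a rough Python illustration rather than the C implementation; the inputs `a_match` and `b_match` (index of the matching commit in the other range, or `None`) are assumptions made for the example, and the matching is assumed to be consistent in both directions.

```python
def order_pairs(a_match, b_match):
    """Return (old_index, new_index) pairs in display order.

    a_match[i] is the index in the new range matched to old commit i, or
    None if old commit i is unmatched; b_match[j] is the reverse mapping.
    """
    shown = [False] * len(a_match)
    out = []
    i = j = 0
    while i < len(a_match) or j < len(b_match):
        # Skip old commits that were already shown as part of a pair.
        while i < len(a_match) and shown[i]:
            i += 1
        # An unmatched old commit is shown once its predecessors are out.
        if i < len(a_match) and a_match[i] is None:
            out.append((i, None))
            i += 1
            continue
        # Unmatched new commits are shown in the order of the new range.
        while j < len(b_match) and b_match[j] is None:
            out.append((None, j))
            j += 1
        # Show the next matched pair, following the new range's order.
        if j < len(b_match):
            k = b_match[j]
            out.append((k, j))
            shown[k] = True
            j += 1
    return out
```

For example, with old commits `[0, 1, 2]` where commit 1 was dropped and the other two kept, the dropped commit is listed right after commit 0 rather than at the end.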

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

range-diff: first rudimentary implementation Johannes Schindelin Mon, 13 Aug 2018 11:33:04 +0000 (04:33 -0700)

range-diff: first rudimentary implementation

At this stage, `git range-diff` can determine corresponding commits
of two related commit ranges. This makes use of the recently introduced
implementation of the linear assignment algorithm.

The core of this patch is a straight port of the ideas of tbdiff, the
apparently dormant project at https://github.com/trast/tbdiff.

The output does not at all match `tbdiff`'s output yet, as this patch
really concentrates on getting the patch matching part right.

Note: due to differences in the diff algorithm (`tbdiff` uses the Python
module `difflib`, Git uses its xdiff fork), the cost matrix calculated
by `range-diff` differs from (but is very similar to) the one calculated
by `tbdiff`. Therefore, it is possible that they find different matching
commits in corner cases (e.g. when a patch was split into two patches of
roughly equal length).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

Introduce `range-diff` to compare iterations of a topic... Johannes Schindelin Mon, 13 Aug 2018 11:33:02 +0000 (04:33 -0700)

Introduce `range-diff` to compare iterations of a topic branch

This command does not do a whole lot so far, apart from showing a usage
that is oddly similar to that of `git tbdiff`. And for a good reason:
the next commits will turn `range-diff` into a full-blown replacement
for `tbdiff`.

At this point, we ignore tbdiff's color options, as they will all be
implemented later using diff_options.

Since f318d739159 (generate-cmds.sh: export all commands to
command-list.h, 2018-05-10), every new command *requires* a man page to
build right away, so let's add a blank man page, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

linear-assignment: a function to solve least-cost assig... Johannes Schindelin Mon, 13 Aug 2018 11:33:00 +0000 (04:33 -0700)

linear-assignment: a function to solve least-cost assignment problems

The problem solved by the code introduced in this commit goes like this:
given two sets of items, and a cost matrix which says how much it
"costs" to assign any given item of the first set to any given item of
the second, assign all items (except when the sets have different sizes)
in the cheapest way.

We use the Jonker-Volgenant algorithm to solve the assignment problem to
answer questions such as: given two different versions of a topic branch
(or iterations of a patch series), what is the best pairing of
commits/patches between the different versions?
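
To make the problem statement concrete, here is a small Python sketch that solves it by brute force for equally sized sets. It is deliberately not the Jonker-Volgenant algorithm used by the commit (which scales far better); the function name `cheapest_assignment` and the sample matrix are assumptions for illustration only.

```python
from itertools import permutations


def cheapest_assignment(cost):
    """Return (assignment, total) minimizing the summed cost.

    cost[i][j] is the cost of assigning item i of the first set to item j
    of the second set. Every permutation is tried, so this is only usable
    for tiny square matrices.
    """
    n = len(cost)

    def total(perm):
        return sum(cost[i][perm[i]] for i in range(n))

    best = min(permutations(range(n)), key=total)
    return list(best), total(best)
```

In the range-diff setting, rows would be the old patches, columns the new patches, and each cell a measure of how different the two patches are.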

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

t5552: suppress upload-pack trace output Jeff King Fri, 10 Aug 2018 14:09:08 +0000 (10:09 -0400)

t5552: suppress upload-pack trace output

The t5552 test script uses GIT_TRACE_PACKET to monitor what
git-fetch sends and receives. However, because we're
accessing a local repository, the child upload-pack also
sends trace output to the same file.

On Linux, this works out OK. We open the trace file with
O_APPEND, so all writes are atomically positioned at the end
of the file. No data can be overwritten or omitted. And
since we prepare our small writes in a strbuf and write them
with a single write(), we should see each line as an atomic
unit. The order of lines between the two processes is
undefined, but the test script greps only for "fetch>" or
"fetch<" lines. So under Linux, the test results are
deterministic.
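
The write pattern described above (prepare the whole line in memory, then issue a single write on a descriptor opened with O_APPEND) might look like this in Python. It is a sketch of the pattern only, not Git's trace code; the helper name `append_line` is hypothetical.

```python
import os


def append_line(path, line):
    """Append one line to path using O_APPEND and a single write() call."""
    # O_APPEND asks the kernel to position every write at the end of the file.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        # Build the complete line first so that it goes out in one write().
        os.write(fd, line.encode() + b"\n")
    finally:
        os.close(fd)
```

Whether two processes appending this way can interleave or overwrite each other depends on the platform, which is exactly the difference between Linux and Windows that this commit works around.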

The test fails intermittently on Windows, however,
reportedly even overwriting bits of the output file (i.e.,
O_APPEND does not seem to give us an atomic position+write).

Since the test only cares about the trace output from fetch,
we can just disable the output from upload-pack. That
doesn't solve the greater question of O_APPEND/trace issues
under Windows, but it easily fixes the flakiness from this
test.

Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

gpg-interface: propagate exit status from gpg back... Junio C Hamano Thu, 9 Aug 2018 18:40:27 +0000 (11:40 -0700)

gpg-interface: propagate exit status from gpg back to the callers

When gpg-interface API unified support for signature verification
codepaths for signed tags and signed commits in mid 2015 at around
v2.6.0-rc0~114, we accidentally loosened the GPG signature
verification.

Before that change, signed commits were verified by looking for
"G"ood signature from GPG, while ignoring the exit status of "gpg
--verify" process, while signed tags were verified by simply passing
the exit status of "gpg --verify" through. The unified code we
currently have ignores the exit status of "gpg --verify" and returns
successful verification when the signature matches an unexpired key
regardless of the trust placed on the key (i.e. in addition to "G"ood
ones, we accept "U"ntrusted ones).

Make these commands signal failure with their exit status when the
underlying "gpg --verify" (or the custom command specified by the
"gpg.program" configuration variable) does so. This essentially
changes their behaviour in a backward incompatible way to reject
signatures that have been made with untrusted keys even if they
correctly verify, as that is how "gpg --verify" behaves.

Note that the code still overrides a zero exit status obtained from
"gpg" (or gpg.program) if the output does not say that the signature is
good, or that it computes correctly but was made with untrusted keys, to
catch a poorly written wrapper around "gpg" the user may give us.

We could exclude "U"ntrusted support from this fallback code, but
that would be making two backward incompatible changes in a single
commit, so let's avoid that for now. A follow-up change could do so
if desired.
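
The resulting decision rule, as described in this message, can be summarized in a short Python sketch. It is a simplified reading of the prose above, not the gpg-interface code; the function name `signature_ok` and the one-letter `result` argument are assumptions for the example.

```python
def signature_ok(exit_status, result):
    """Decide whether verification succeeded.

    exit_status is the exit code of "gpg --verify" (or gpg.program);
    result is the one-letter status parsed from its output, for example
    "G" for a good signature or "U" for good but untrusted.
    """
    # A failing exit status from gpg is now propagated as a failure.
    if exit_status != 0:
        return False
    # Even with a zero exit status, the output must report a signature
    # that is good or that at least computes correctly (untrusted key).
    return result in ("G", "U")
```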

Helped-by: Vojtech Myslivec <vojtech.myslivec@nic.cz>
Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

repack: repack promisor objects if -a or -A is set Jonathan Tan Wed, 8 Aug 2018 22:34:06 +0000 (15:34 -0700)

repack: repack promisor objects if -a or -A is set

Currently, repack does not touch promisor packfiles at all, potentially
causing the performance of repositories that have many such packfiles to
drop. Therefore, repack all promisor objects if invoked with -a or -A.

This is done by an additional invocation of pack-objects on all promisor
objects individually given, which takes care of deduplication and allows
the resulting packfiles to respect flags such as --max-pack-size.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

repack: refactor setup of pack-objects cmd Jonathan Tan Wed, 8 Aug 2018 22:34:05 +0000 (15:34 -0700)

repack: refactor setup of pack-objects cmd

A subsequent patch will teach repack to run pack-objects with some of the
same arguments and some different ones if repacking of promisor objects is
required. Refactor the setup of the pack-objects cmd so that setting up
the arguments common to both is done in a function.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

rebase --exec: make it work with --rebase-merges Johannes Schindelin Thu, 9 Aug 2018 09:41:11 +0000 (02:41 -0700)

rebase --exec: make it work with --rebase-merges

The idea of `--exec` is to append an `exec` call after each `pick`.

Since the introduction of fixup!/squash! commits, this idea was extended
to apply to "pick, possibly followed by a fixup/squash chain", i.e. an
exec would not be inserted between a `pick` and any of its corresponding
`fixup` or `squash` lines.

The current implementation uses a dirty trick to achieve that: it
assumes that there are only pick/fixup/squash commands, and then
*inserts* the `exec` lines before any `pick` but the first, and appends
a final one.

With the todo lists generated by `git rebase --rebase-merges`, this
simple implementation shows its problems: it produces the exact wrong
thing when there are `label`, `reset` and `merge` commands.

Let's change the implementation to do exactly what we want: look for
`pick` lines, skip any fixup/squash chains, and then insert the `exec`
line. Lather, rinse, repeat.

Note: we take pains to insert *before* comment lines whenever possible,
as empty commits are represented by commented-out pick lines (and we
want to insert a preceding pick's exec line *before* such a line, not
afterward).

While at it, also add `exec` lines after `merge` commands, because they
are similar in spirit to `pick` commands: they add new commits.
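
The insertion logic described above can be sketched in Python as follows. This is an illustration of the idea, not the sequencer code: the helper names are hypothetical, only the long command names are recognized, and blank lines are treated like comments.

```python
def command_of(line):
    """Return the todo command word; comments and blank lines map to '#'."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return "#"
    return stripped.split()[0]


def add_exec(todo, exec_line):
    """Insert exec_line after every pick/merge, past any fixup/squash chain."""
    positions = []      # indices in the original list where exec_line goes
    insert_at = None    # pending insertion point after a pick or merge
    for i, line in enumerate(todo):
        cmd = command_of(line)
        if insert_at is not None:
            if cmd == "#":
                # Look past comments without moving the insertion point, so
                # the exec line ends up *before* them.
                continue
            if cmd in ("fixup", "squash"):
                # The chain belongs to the pick: move the point past it.
                insert_at = i + 1
                continue
            positions.append(insert_at)
            insert_at = None
        if cmd in ("pick", "merge"):
            insert_at = i + 1
    if insert_at is not None:
        positions.append(insert_at)
    result = list(todo)
    # Insert from the back so earlier indices stay valid.
    for pos in reversed(positions):
        result.insert(pos, exec_line)
    return result
```

Because `label` and `reset` lines never set an insertion point, they get no `exec` of their own, which is the behavior the old "insert before every pick but the first" trick got wrong.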

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

sideband: highlight keywords in remote sideband output Han-Wen Nienhuys Tue, 7 Aug 2018 12:51:08 +0000 (14:51 +0200)

sideband: highlight keywords in remote sideband output

The colorization is controlled with the config setting "color.remote".

Supported keywords are "error", "warning", "hint" and "success". They
are highlighted if they appear at the start of the line, which is
common in error messages, e.g.

ERROR: commit is missing Change-Id

The Git push process itself prints lots of non-actionable messages
(e.g. bandwidth statistics, object counters for different phases of the
process). This obscures actionable error messages that servers may
send back. Highlighting keywords in the sideband draws more attention
to those messages.

The background for this change is that Gerrit does server-side
processing to create or update code reviews, and actionable error
messages (e.g. missing Change-Id) must be communicated back to the user
during the push. User research has shown that new users have trouble
seeing these messages.

The highlighting is done on the client rather than server side, so
servers don't have to grow capabilities to understand terminal escape
codes and terminal state. It is also consistent with the current state
where Git is in control of the local display (e.g. prefixing messages with
"remote: ").

The highlighting can be configured using color.remote.<KEYWORD>
configuration settings. Since the keys are matched case insensitively,
we match the keywords case insensitively too.
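
A minimal Python sketch of the client-side highlighting might look like this. The escape sequences chosen for each keyword and the word-boundary check after the keyword are assumptions made for the example, not values taken from this commit.

```python
import re

RESET = "\033[m"
# Hypothetical colors; in Git these come from color.remote.<KEYWORD>.
COLORS = {
    "error": "\033[1;31m",
    "warning": "\033[1;33m",
    "hint": "\033[33m",
    "success": "\033[1;32m",
}
# Keywords are matched only at the start of the line, case insensitively.
PATTERN = re.compile(r"^(error|warning|hint|success)\b", re.IGNORECASE)


def highlight(line):
    """Color a leading keyword in one line of sideband output."""
    m = PATTERN.match(line)
    if not m:
        return line
    keyword = m.group(1)
    return COLORS[keyword.lower()] + keyword + RESET + line[m.end():]
```

Only the keyword itself is wrapped in escape codes; the rest of the server's message is passed through unchanged.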

Finally, this solution is backwards compatible: many servers already
prefix their messages with "error", and they will benefit from this
change without requiring a server update. By contrast, a server-side
solution would likely require plumbing the TERM variable through the
git protocol, so it would require changes to both server and client.

Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

update-index: there no longer is `apply --index-info` Junio C Hamano Wed, 8 Aug 2018 21:35:18 +0000 (14:35 -0700)

update-index: there no longer is `apply --index-info`

Back when we removed `git apply --index-info` in 2007, we forgot to
adjust the documentation for update-index that reads its output.

Let's reorder the description of three formats to present the other
two formats that are still generated by git commands before this
format, and stop mentioning `git apply --index-info`.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

git-update-index.txt: reword possibly confusing example Elijah Newren Wed, 8 Aug 2018 20:28:07 +0000 (13:28 -0700)

git-update-index.txt: reword possibly confusing example

The following phrase could be interpreted multiple ways:
"To pretend you have a file with mode and sha1 at path"

In particular, I can think of two:
1. Pretend we have some new file, which happens to have a given mode
and sha1
2. Pretend one of the files we are already tracking has a different
mode and sha1 than what it really does

I think people could easily assume either case while reading, but the
example command provided doesn't actually handle the first case, which
caused some minor frustration to at least one user. Modify the example
command so that it correctly handles both cases, and re-order the
wording in a way that makes it more likely folks will assume the first
interpretation. I believe the new example shouldn't pose any obstacles
to those wanting the second interpretation (at worst, they pass an
unnecessary extra flag).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

git-config: document accidental multi-line setting... Stefan Beller Wed, 8 Aug 2018 19:50:20 +0000 (12:50 -0700)

git-config: document accidental multi-line setting in deprecated syntax

The bug was noticed when writing the previous patch; a fix for this bug
is not easy though: If we choose to ignore the case of the subsection
(and revert most of the code of the previous patch, just keeping
s/strncasecmp/strcmp/), then we'd introduce new sections using the
new syntax, such that

--------
[section.subsection]
key = value1
--------

git config section.Subsection.key value2

would result in

--------
[section.subsection]
key = value1
[section.Subsection]
key = value2
--------

which is even more confusing. A proper fix would replace the first
occurrence of 'key'. As the syntax is deprecated, let's prefer to not
spend time on fixing the behavior and just document it instead.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

config: fix case sensitive subsection names on writing Stefan Beller Wed, 8 Aug 2018 19:50:19 +0000 (12:50 -0700)

config: fix case sensitive subsection names on writing

A user reported a submodule issue regarding a section mix-up,
but it could be boiled down to the following test case:

    $ git init test && cd test
    $ git config foo."Bar".key test
    $ git config foo."bar".key test
    $ tail -n 3 .git/config
    [foo "Bar"]
            key = test
            key = test

Subsections are case sensitive and we have a test for correctly reading
them. However, we do not have a test for writing out config correctly with
case sensitive subsection names, which is why this went unnoticed in
6ae996f2acf (git_config_set: make use of the config parser's event
stream, 2018-04-09).

Unfortunately we have to make a distinction between old style configuration
that looks like

    [foo.Bar]
            key = test

and the new quoted style as seen above. The old style is documented as
case-agnostic, hence we need to keep 'strncasecmp'; although the
resulting setting for the old style config differs from the configuration.
That will be fixed in a follow up patch.
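
The distinction between the two header styles can be sketched in Python as follows. This is a simplified illustration, not the config parser: the function name `header_matches` is hypothetical, and escape sequences inside quoted subsection names are ignored.

```python
import re

NEW_STYLE = re.compile(r'\[([A-Za-z0-9-]+) "(.*)"\]')
OLD_STYLE = re.compile(r'\[([A-Za-z0-9-]+)\.(.+)\]')


def header_matches(header, section, subsection):
    """Check whether a section header line names section.subsection."""
    m = NEW_STYLE.fullmatch(header)
    if m:
        # Quoted subsections are compared case sensitively.
        return m.group(1).lower() == section.lower() and m.group(2) == subsection
    m = OLD_STYLE.fullmatch(header)
    if m:
        # The deprecated dotted style is case-agnostic throughout.
        return (m.group(1).lower() == section.lower()
                and m.group(2).lower() == subsection.lower())
    return False
```

Under this rule, writing `foo."bar".key` must not reuse an existing `[foo "Bar"]` section, which is the bug shown in the test case above.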

Reported-by: JP Sugarbroad <jpsugar@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

t7406: avoid using test_must_fail for commands other... Elijah Newren Wed, 8 Aug 2018 16:31:07 +0000 (09:31 -0700)

t7406: avoid using test_must_fail for commands other than git

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

t7406: prefer test_* helper functions to test -[feds] Elijah Newren Wed, 8 Aug 2018 16:31:06 +0000 (09:31 -0700)

t7406: prefer test_* helper functions to test -[feds]

test -e, test -s, etc. do not provide nice error messages when we hit
test failures, so use the test_* helper functions from
test-lib-functions.sh.

Also, add test_path_exists() to test-lib-functions.sh while at it, so
that we don't need to worry whether submodule/.git is a file or a
directory. It currently is a file with contents of the form
    gitdir: ../.git/modules/submodule
but it could be changed in the future to be a directory; this test
only really cares that it exists.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

t7406: avoid having git commands upstream of a pipe Elijah Newren Wed, 8 Aug 2018 16:31:05 +0000 (09:31 -0700)

t7406: avoid having git commands upstream of a pipe

When a git command is on the left side of a pipe, the pipe will swallow
its exit status, preventing us from detecting failures in said commands.
Restructure the tests to put the output in a temporary file to avoid
this problem.
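
The effect can be demonstrated with a small Python sketch that drives a POSIX shell. Here `false` stands in for a failing git command and `out` is a hypothetical temporary file name; the example assumes a `/bin/sh` without `pipefail` enabled.

```python
import subprocess
import tempfile


def exit_statuses():
    """Return the exit status of a piped and of a staged invocation."""
    with tempfile.TemporaryDirectory() as tmp:
        # The pipeline reports the status of its last command (cat), so the
        # failure of the upstream command is lost.
        piped = subprocess.run("false | cat", shell=True, cwd=tmp).returncode
        # Writing to a temporary file first keeps the failure visible.
        staged = subprocess.run("false >out && cat out", shell=True, cwd=tmp).returncode
    return piped, staged
```

The first value comes back as zero even though the upstream command failed, while the second is non-zero, which is why the tests were restructured to use a temporary file.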

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

t7406: simplify by using diff --name-only instead of... Elijah Newren Wed, 8 Aug 2018 16:31:04 +0000 (09:31 -0700)

t7406: simplify by using diff --name-only instead of diff --raw

We can get rid of some quoted tabs and make a few tests slightly easier
to read and edit by just asking for the names of the files modified,
since that's all these tests were interested in anyway.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

t7406: fix call that was failing for the wrong reason Elijah Newren Wed, 8 Aug 2018 16:31:03 +0000 (09:31 -0700)

t7406: fix call that was failing for the wrong reason

A test making use of test_must_fail was failing like this:
    fatal: ambiguous argument '|': unknown revision or path not in the working tree.
when the intent was to verify that a specific string was not found
in the output of the git diff command, i.e. that grep returned
non-zero. Fix the test to do that.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

remote-curl: remove spurious period Johannes Schindelin Wed, 8 Aug 2018 11:50:00 +0000 (04:50 -0700)

remote-curl: remove spurious period

We should not interrupt. sentences in the middle.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

git-compat-util.h: fix typo Johannes Schindelin Wed, 8 Aug 2018 11:49:58 +0000 (04:49 -0700)

git-compat-util.h: fix typo

The words "save" and "safe" are both very wonderful words, each with
their own set of meanings. Let's not confuse them with one another save
on occasion of a pun.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

git-instaweb: fix apache2 config with apache >= 2.4 Sebastian Kisela Tue, 7 Aug 2018 07:25:48 +0000 (09:25 +0200)

git-instaweb: fix apache2 config with apache >= 2.4

The generated apache2 config fails with apache >= 2.4. The error log
states:

AH00136: Server MUST relinquish startup privileges before accepting
connections. Please ensure mod_unixd or other system security
module is loaded.
AH00016: Configuration Failed

Fix this by loading the unixd module. This works with older httpd as
well, so no IfVersion conditional is needed. (Tested with httpd-2.2.15
on CentOS-6.)

Written with assistance of Todd Zullinger <tmz@pobox.com>

Signed-off-by: Sebastian Kisela <skisela@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

git-instaweb: support Fedora/Red Hat apache module... Sebastian Kisela Wed, 8 Aug 2018 08:49:18 +0000 (10:49 +0200)

git-instaweb: support Fedora/Red Hat apache module path

On Fedora-derived systems, the apache httpd package installs modules
under /usr/lib{,64}/httpd/modules, depending on whether the system is
32- or 64-bit. A symlink from /etc/httpd/modules is created which
points to the proper module path. Use it to support apache on Fedora,
CentOS, and Red Hat systems.

Written with assistance of Todd Zullinger <tmz@pobox.com> and
Junio C Hamano <gitster@pobox.com>.

Signed-off-by: Sebastian Kisela <skisela@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

sequencer: fix quoting in write_author_script Phillip Wood Tue, 7 Aug 2018 09:34:52 +0000 (10:34 +0100)

sequencer: fix quoting in write_author_script

Single quotes should be escaped as \' not \\'. The bad quoting breaks
the interactive version of 'rebase --root' (which is used when there
is no '--onto' even if the user does not specify --interactive) for
authors that contain "'" as sq_dequote() called by read_author_ident()
errors out on the bad quoting.
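
The correct quoting rule can be sketched in Python: wrap the value in single quotes and write each embedded single quote as `'\''` (close the quote, emit an escaped quote, reopen). The helper name `sq_quote` mirrors the idea only; this is not Git's implementation, which may escape further characters.

```python
import shlex


def sq_quote(value):
    """Quote value for a POSIX shell using single quotes."""
    # A single quote cannot appear inside single quotes, so close the
    # quoted string, add \' and reopen it.
    return "'" + value.replace("'", "'\\''") + "'"


def sq_dequote(quoted):
    """Reverse sq_quote() by letting a POSIX-style lexer parse the string."""
    parts = shlex.split(quoted)
    if len(parts) != 1:
        raise ValueError("not a single quoted word")
    return parts[0]
```

A round trip through both helpers returns the original author name, which is the property the broken `\\'` quoting violated.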

For other interactive rebases this only affects external scripts that
read the author script and users whose git is upgraded from the shell
version of rebase -i while rebase was stopped when the author contains
"'". This is because the parsing in read_env_script() expected the
broken quoting.

This patch includes code to handle the broken quoting when
git has been upgraded while rebase was stopped. It does this by
detecting the missing "'" at the end of the GIT_AUTHOR_DATE line to see
if it should dequote \\' as "'". Note this is only implemented for
normal picks, not for creating a new root commit (rebase will stop with
an error complaining about bad quoting in that case).

The fallback code has been manually tested by reverting both the quoting
fixes in write_author_script() and the previous fix for the missing "'"
at the end of the GIT_AUTHOR_DATE line and running
t3404-rebase-interactive.sh.

Ideally rebase and am would share the same code for reading and
writing the author script, but this commit just fixes the immediate
bug.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

sequencer: handle errors from read_author_ident() Phillip Wood Tue, 7 Aug 2018 09:34:51 +0000 (10:34 +0100)

sequencer: handle errors from read_author_ident()

Check for a NULL return value from read_author_ident() that indicates
an error. Previously the NULL author was passed to commit_tree() which
would then fall back to using the default author when creating the new
commit. This changed the date and potentially the author of the commit
which corrupted the author data compared to its expected value.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

doc hash-function-transition: pick SHA-256 as NewHash Jonathan Nieder Sat, 4 Aug 2018 08:52:47 +0000 (01:52 -0700)

doc hash-function-transition: pick SHA-256 as NewHash

From a security perspective, it seems that SHA-256, BLAKE2, SHA3-256,
K12, and so on are all believed to have similar security properties.
All are good options from a security point of view.

SHA-256 has a number of advantages:

* It has been around for a while, is widely used, and is supported by
just about every single crypto library (OpenSSL, mbedTLS, CryptoNG,
SecureTransport, etc).

* When you compare against SHA1DC, most vectorized SHA-256
implementations are indeed faster, even without acceleration.

* If we're doing signatures with OpenPGP (or even, I suppose, CMS),
we're going to be using SHA-2, so it doesn't make sense to have our
security depend on two separate algorithms when either one of them
alone could break the security when we could just depend on one.

So SHA-256 it is. Update the hash-function-transition design doc to
say so.
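
For a feel of what changing the hash function means for object names, the sketch below computes a blob's name with both SHA-1 and SHA-256 using Python's hashlib. It assumes the object name is the hash of a "blob <size>\0" header followed by the content, and that the same header is kept under SHA-256; the helper name git_object_id is hypothetical.

```python
import hashlib


def git_object_id(content, algo="sha1", obj_type="blob"):
    """Hash '<type> <size>\\0' + content with the chosen algorithm."""
    # Assumption: the header layout is identical for both hash functions.
    header = f"{obj_type} {len(content)}\0".encode()
    return hashlib.new(algo, header + content).hexdigest()


if __name__ == "__main__":
    data = b"hello\n"
    print("sha1  ", git_object_id(data, "sha1"))
    print("sha256", git_object_id(data, "sha256"))
```

The SHA-256 name is 64 hex digits rather than 40, which is one reason the transition touches storage formats and protocols as well as the hashing code.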

After this patch, there are no remaining instances of the string
"NewHash", except for an unrelated use from 2008 as a variable name in
t/t9700/test.pl.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Acked-by: Dan Shumow <danshu@microsoft.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

t: factor out FUNNYNAMES as shared lazy prereq William Chargin Mon, 6 Aug 2018 18:35:08 +0000 (11:35 -0700)

t: factor out FUNNYNAMES as shared lazy prereq

A fair number of tests need to check that the filesystem supports file
names including "funny" characters, like newline, tab, and double-quote.
Jonathan Nieder suggested that this be extracted into a lazy prereq in
the top-level `test-lib.sh`. This patch effects that change.

The FUNNYNAMES prereq now uniformly requires support for newlines, tabs,
and double-quotes in filenames. This very slightly decreases the power
of some tests, which might have run previously on a system that supports
(e.g.) newlines and tabs but not double-quotes, but now will not. This
seems to me like an acceptable tradeoff for consistency.
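
The idea of a lazy prerequisite (probe once, on first use, and reuse the answer) can be sketched outside the shell test library. The Python below is only an analogy to the FUNNYNAMES check, with a hypothetical function name; it is not how test-lib.sh implements it.

```python
import functools
import os
import tempfile


@functools.lru_cache(maxsize=None)
def funnynames_supported():
    """Probe once whether tab, newline and double-quote work in file names."""
    names = {"tab\tname", "new\nline", 'double"quote'}
    with tempfile.TemporaryDirectory() as tmp:
        try:
            for name in names:
                with open(os.path.join(tmp, name), "w"):
                    pass
            # All three names must be creatable and listed back unchanged.
            return set(os.listdir(tmp)) == names
        except OSError:
            return False


if __name__ == "__main__":
    print(funnynames_supported())  # probes the filesystem
    print(funnynames_supported())  # served from the cache
```

Requiring all three characters together is what makes the check uniform: a filesystem that supports only some of them is treated as lacking the prerequisite.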

One test (`t/t9902-completion.sh`) defined FUNNYNAMES to further require
the separators \034 through \037, the test for which was implemented
using the Bash-specific $'\034' syntax. I've elected to leave this one
as is, renaming it to FUNNIERNAMES.

After this patch, `git grep 'test_\(set\|lazy\)_prereq.*FUNNYNAMES'` has
only one result.

Signed-off-by: William Chargin <wchargin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

Makefile: add missing dependency for command-list.h Nguyễn Thái Ngọc Duy Mon, 6 Aug 2018 16:34:21 +0000 (18:34 +0200)

Makefile: add missing dependency for command-list.h

Commit 3ac68a93fd (help: add --config to list all available config -
2018-05-26) made generate-cmdlist.sh read a new input source,
config.txt, but did not add it as a Makefile dependency. As a result,
changes to config.txt do not trigger regeneration of command-list.h,
and the config list in that file becomes outdated. Correct the dependency.
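
Why a missing dependency leaves a generated file stale can be shown with a small timestamp comparison in the spirit of make. This Python sketch uses a hypothetical needs_rebuild helper and made-up timestamps; it is a simplification, not make's full algorithm.

```python
import os


def needs_rebuild(target, deps):
    """Return True if target is missing or older than any listed dependency."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    # Only files named in deps are consulted; an input that is read by the
    # generator but not listed here can change without being noticed.
    return any(os.path.getmtime(dep) > target_mtime for dep in deps)


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        paths = {}
        for name, mtime in [("listed-input", 50), ("command-list.h", 100),
                            ("config.txt", 200)]:
            paths[name] = os.path.join(tmp, name)
            with open(paths[name], "w"):
                pass
            os.utime(paths[name], (mtime, mtime))
        target = paths["command-list.h"]
        print(needs_rebuild(target, [paths["listed-input"]]))
        print(needs_rebuild(target, [paths["listed-input"], paths["config.txt"]]))
```

With config.txt absent from the dependency list the target looks up to date even though one of its real inputs is newer; listing it makes the comparison see the change.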

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

t3430: demonstrate what -r, --autosquash & --exec should do Johannes Schindelin Mon, 6 Aug 2018 09:52:52 +0000 (02:52 -0700)

t3430: demonstrate what -r, --autosquash & --exec should do

The --exec option's implementation is not really well-prepared for
--rebase-merges. Demonstrate this.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

t4150: fix broken test for am --scissors Andrei Rybak Mon, 6 Aug 2018 17:49:38 +0000 (19:49 +0200)

t4150: fix broken test for am --scissors

Tests for "git am --[no-]scissors" [1] work in the following way:

1. Create files with commit messages
2. Use these files to create expected commits
3. Generate eml file with patch from expected commits
4. Create commits using git am with these eml files
5. Compare these commits with expected

The test for "git am --scissors" is supposed to take an e-mail with a
scissors line and in-body "Subject:" header and demonstrate that the
subject line from the e-mail itself is overridden by the in-body header
and that only text below the scissors line is included in the commit
message of the commit created by the invocation of "git am --scissors".
However, the setup of the test incorrectly uses a commit without the
scissors line and without the in-body header in the commit message,
producing an eml file that is not suitable for testing "git am --scissors".
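
The behaviour the test is meant to exercise can be sketched as follows. This Python is a simplified model with hypothetical helper names: it treats a line as a scissors line when it contains a scissors mark plus at least one dash and nothing else but dashes and whitespace, which is an assumption and looser than the real check in mailinfo.c.

```python
import re

# Assumed set of scissors marks for this sketch.
_MARKS = re.compile(r">8|8<|>%|%<")


def is_scissors_line(line):
    """Simplified check: a scissors mark plus dashes and whitespace only."""
    stripped = line.strip()
    rest = _MARKS.sub("", stripped)
    return stripped != rest and "-" in rest and set(rest) <= set("- \t")


def apply_scissors(subject, body):
    """Drop text above the last scissors line; honor an in-body Subject:."""
    lines = body.splitlines()
    cut = 0
    for idx, line in enumerate(lines):
        if is_scissors_line(line):
            cut = idx + 1
    kept = lines[cut:]
    while kept and not kept[0].strip():
        kept.pop(0)
    if kept and kept[0].lower().startswith("subject:"):
        # The in-body header overrides the subject of the e-mail itself.
        subject = kept.pop(0)[len("subject:"):].strip()
    return subject, "\n".join(kept).strip()


if __name__ == "__main__":
    mail_body = "ignored chatter\n-- >8 --\nSubject: real subject\n\nreal body\n"
    print(apply_scissors("e-mail subject", mail_body))
```

An input without a scissors line passes through unchanged, which is why a test built from such an input cannot tell a working implementation from a broken one.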

This can be checked by intentionally breaking the is_scissors_line
function in mailinfo.c, for example, by changing the string ">8", which
is used by the test. With such a change the test should fail, but it
does not.

Fix the broken test by generating an eml file with a scissors line and an
in-body "Subject:" header. Since the two tests for --scissors and --no-scissors
options are there to test cutting or keeping the commit message, update
both tests to change the test file in the same way, which allows us to
generate only one eml file to be passed to git am. To clarify the
intention of the test, give files and tags more explicit names.

[1]: introduced in bf72ac17d (t4150: tests for am --[no-]scissors,
2015-07-19)

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Paul Tan <pyokagan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

pull --rebase=<type>: allow single-letter abbreviations... Johannes Schindelin Sat, 4 Aug 2018 19:23:09 +0000 (12:23 -0700)

pull --rebase=<type>: allow single-letter abbreviations for the type

Git for Windows' original 4aa8b8c8283 (Teach 'git pull' to handle
--rebase=interactive, 2011-10-21) had support for the very convenient
abbreviation

git pull --rebase=i

which was later lost when it was ported to the builtin `git pull`, and
it was not introduced before the patch eventually made it into Git as
f5eb87b98dd (pull: allow interactive rebase with --rebase=interactive,
2016-01-13).

However, it is *really* a useful shorthand for the occasional rebasing
pull on branches that do not usually want to be rebased.

So let's reintroduce this convenience, at long last.
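
A minimal sketch of single-letter abbreviation matching, in Python rather than Git's C. The list of type names is an assumption made for illustration (the commit message only shows --rebase=i for "interactive"), and parse_rebase_type is a hypothetical name.

```python
# Assumed set of non-boolean rebase types; not taken from the commit message.
REBASE_TYPES = ("interactive", "merges", "preserve")


def parse_rebase_type(value):
    """Map a full type name or its first letter to the full type name."""
    for name in REBASE_TYPES:
        if value == name or value == name[0]:
            return name
    raise ValueError(f"invalid value for --rebase: {value!r}")


if __name__ == "__main__":
    print(parse_rebase_type("i"))
    print(parse_rebase_type("interactive"))
```

Accepting only the first letter, rather than any unique prefix, keeps the mapping unambiguous as long as no two type names start with the same letter.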

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

add a script to diff rendered documentation Jeff King Mon, 6 Aug 2018 17:37:20 +0000 (13:37 -0400)

add a script to diff rendered documentation

After making a change to the documentation, it's easy to
forget to check the rendered version to make sure it was
formatted as you intended. And simply doing a diff between
the two built versions is less trivial than you might hope:

- diffing the roff or html output isn't particularly
readable; what we really care about is what the end user
will see

- you have to tweak a few build variables to avoid
spurious differences (e.g., version numbers, build
times)

Let's provide a script that builds and installs the manpages
for two commits, renders the results using "man", and diffs
the result. Since this is time-consuming, we'll also do our
best to avoid repeated work, keeping intermediate results
between runs.
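
The overall shape (render each commit once, keep the result, then diff the rendered text) can be sketched in a few lines of Python. The render callback stands in for the expensive "build, install and run man" step and is entirely hypothetical, as are the function names; the real script is a shell script.

```python
import difflib

# Rendered output per commit, kept so repeated runs skip the slow step.
_render_cache = {}


def render_cached(commit, render):
    """Render a commit's documentation at most once."""
    if commit not in _render_cache:
        _render_cache[commit] = render(commit)
    return _render_cache[commit]


def doc_diff(old, new, render):
    """Return a unified diff of the rendered text for two commits."""
    a = render_cached(old, render).splitlines()
    b = render_cached(new, render).splitlines()
    return list(difflib.unified_diff(a, b, fromfile=old, tofile=new, lineterm=""))


if __name__ == "__main__":
    pages = {"A": "NAME\nfoo - old text\n", "B": "NAME\nfoo - new text\n"}
    for line in doc_diff("A", "B", pages.__getitem__):
        print(line)
```

Diffing the rendered text rather than the roff or HTML source is the point made in the first bullet above: the output reflects what the end user will actually see.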

Some of this could probably be made a little less ugly if we
built support into Documentation/Makefile. But by relying
only on "make install-man" working, this script should work
for generating a diff between any two versions, whether they
include this script or not.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>