gitweb.git
Merge branch 'nd/pack-objects-threading-doc'Junio C Hamano Wed, 15 Aug 2018 22:08:27 +0000 (15:08 -0700)

Merge branch 'nd/pack-objects-threading-doc'

Doc fix.

* nd/pack-objects-threading-doc:
pack-objects: document about thread synchronization

Merge branch 'jn/subtree-test-fixes'Junio C Hamano Wed, 15 Aug 2018 22:08:27 +0000 (15:08 -0700)

Merge branch 'jn/subtree-test-fixes'

Test fix.

* jn/subtree-test-fixes:
subtree test: simplify preparation of expected results
subtree test: add missing && to &&-chain

Merge branch 'cb/p4-pre-submit-hook'Junio C Hamano Wed, 15 Aug 2018 22:08:27 +0000 (15:08 -0700)

Merge branch 'cb/p4-pre-submit-hook'

"git p4 submit" learns to ask its own pre-submit hook if it should
continue with submitting.

* cb/p4-pre-submit-hook:
git-p4: add the `p4-pre-submit` hook

Merge branch 'js/vscode'Junio C Hamano Wed, 15 Aug 2018 22:08:26 +0000 (15:08 -0700)

Merge branch 'js/vscode'

Add a script (in contrib/) to help users of VSCode work better with
our codebase.

* js/vscode:
vscode: let cSpell work on commit messages, too
vscode: add a dictionary for cSpell
vscode: use 8-space tabs, no trailing ws, etc for Git's source code
vscode: wrap commit messages at column 72 by default
vscode: only overwrite C/C++ settings
mingw: define WIN32 explicitly
cache.h: extract enum declaration from inside a struct declaration
vscode: hard-code a couple defines
contrib: add a script to initialize VS Code configuration

Merge branch 'bb/redecl-enum-fix'Junio C Hamano Wed, 15 Aug 2018 22:08:26 +0000 (15:08 -0700)

Merge branch 'bb/redecl-enum-fix'

Compilation fix.

* bb/redecl-enum-fix:
packfile: ensure that enum object_type is defined

Merge branch 'jk/banned-function'Junio C Hamano Wed, 15 Aug 2018 22:08:26 +0000 (15:08 -0700)

Merge branch 'jk/banned-function'

It is too easy to misuse system API functions such as strcat();
these selected functions are now forbidden in this codebase and
will cause a compilation failure.

* jk/banned-function:
banned.h: mark strncpy() as banned
banned.h: mark sprintf() as banned
banned.h: mark strcat() as banned
automatically ban strcpy()

Merge branch 'en/merge-recursive-skip-fix'Junio C Hamano Wed, 15 Aug 2018 22:08:25 +0000 (15:08 -0700)

Merge branch 'en/merge-recursive-skip-fix'

When the sparse checkout feature is in use, "git cherry-pick" and
other mergy operations lost the skip_worktree bit when a path that
is excluded from checkout requires content level merge, which is
resolved as the same as the HEAD version, without materializing the
merge result in the working tree, which made the path appear as
deleted. This has been corrected by preserving the skip_worktree
bit (and not materializing the file in the working tree).

* en/merge-recursive-skip-fix:
merge-recursive: preserve skip_worktree bit when necessary
t3507: add a testcase showing failure with sparse checkout

Merge branch 'jt/tag-following-with-proto-v2-fix'Junio C Hamano Wed, 15 Aug 2018 22:08:25 +0000 (15:08 -0700)

Merge branch 'jt/tag-following-with-proto-v2-fix'

The wire-protocol v2 relies on the client to send "ref prefixes" to
limit the bandwidth spent on the initial ref advertisement. "git
fetch $remote branch:branch" that asks tags that point into the
history leading to the "branch" automatically followed sent to
narrow prefix and broke the tag following, which has been fixed.

* jt/tag-following-with-proto-v2-fix:
fetch: send "refs/tags/" prefix upon CLI refspecs
t5702: test fetch with multiple refspecs at a time

Merge branch 'jk/size-t'Junio C Hamano Wed, 15 Aug 2018 22:08:25 +0000 (15:08 -0700)

Merge branch 'jk/size-t'

Code clean-up to use size_t/ssize_t when they are the right type.

* jk/size-t:
strbuf_humanise: use unsigned variables
pass st.st_size as hint for strbuf_readlink()
strbuf_readlink: use ssize_t
strbuf: use size_t for length in intermediate variables
reencode_string: use size_t for string lengths
reencode_string: use st_add/st_mult helpers

Merge branch 'sg/coccicheck-updates'Junio C Hamano Wed, 15 Aug 2018 22:08:25 +0000 (15:08 -0700)

Merge branch 'sg/coccicheck-updates'

Update the way we use Coccinelle to find out-of-style code that
need to be modernised.

* sg/coccicheck-updates:
coccinelle: extract dedicated make target to clean Coccinelle's results
coccinelle: put sane filenames into output patches
coccinelle: exclude sha1dc source files from static analysis
coccinelle: use $(addsuffix) in 'coccicheck' make target
coccinelle: mark the 'coccicheck' make target as .PHONY

Merge branch 'sb/histogram-less-memory'Junio C Hamano Wed, 15 Aug 2018 22:08:25 +0000 (15:08 -0700)

Merge branch 'sb/histogram-less-memory'

"git diff --histogram" had a bad memory usage pattern, which has
been rearranged to reduce the peak usage.

* sb/histogram-less-memory:
xdiff/histogram: remove tail recursion
xdiff/xhistogram: move index allocation into find_lcs
xdiff/xhistogram: factor out memory cleanup into free_index()
xdiff/xhistogram: pass arguments directly to fall_back_to_classic_diff

Merge branch 'nd/i18n'Junio C Hamano Wed, 15 Aug 2018 22:08:23 +0000 (15:08 -0700)

Merge branch 'nd/i18n'

Many more strings are prepared for l10n.

* nd/i18n: (23 commits)
transport-helper.c: mark more strings for translation
transport.c: mark more strings for translation
sha1-file.c: mark more strings for translation
sequencer.c: mark more strings for translation
replace-object.c: mark more strings for translation
refspec.c: mark more strings for translation
refs.c: mark more strings for translation
pkt-line.c: mark more strings for translation
object.c: mark more strings for translation
exec-cmd.c: mark more strings for translation
environment.c: mark more strings for translation
dir.c: mark more strings for translation
convert.c: mark more strings for translation
connect.c: mark more strings for translation
config.c: mark more strings for translation
commit-graph.c: mark more strings for translation
builtin/replace.c: mark more strings for translation
builtin/pack-objects.c: mark more strings for translation
builtin/grep.c: mark strings for translation
builtin/config.c: mark more strings for translation
...

Merge branch 'hs/gpgsm'Junio C Hamano Wed, 15 Aug 2018 22:08:23 +0000 (15:08 -0700)

Merge branch 'hs/gpgsm'

Teach "git tag -s" etc. a few configuration variables (gpg.format
that can be set to "openpgp" or "x509", and gpg.<format>.program
that is used to specify what program to use to deal with the format)
to allow x.509 certs with CMS via "gpgsm" to be used instead of
openpgp via "gnupg".

* hs/gpgsm:
gpg-interface t: extend the existing GPG tests with GPGSM
gpg-interface: introduce new signature format "x509" using gpgsm
gpg-interface: introduce new config to select per gpg format program
gpg-interface: do not hardcode the key string len anymore
gpg-interface: introduce an abstraction for multiple gpg formats
t/t7510: check the validation of the new config gpg.format
gpg-interface: add new config to select how to sign a commit

Merge branch 'bw/clone-ref-prefixes'Junio C Hamano Wed, 15 Aug 2018 22:08:23 +0000 (15:08 -0700)

Merge branch 'bw/clone-ref-prefixes'

The wire-protocol v2 relies on the client to send "ref prefixes" to
limit the bandwidth spent on the initial ref advertisement. "git
clone" when learned to speak v2 forgot to do so, which has been
corrected.

* bw/clone-ref-prefixes:
clone: send ref-prefixes when using protocol v2

Merge branch 'jk/core-use-replace-refs'Junio C Hamano Wed, 15 Aug 2018 22:08:22 +0000 (15:08 -0700)

Merge branch 'jk/core-use-replace-refs'

A new configuration variable core.usereplacerefs has been added,
primarily to help server installations that want to ignore the
replace mechanism altogether.

* jk/core-use-replace-refs:
add core.usereplacerefs config option
check_replace_refs: rename to read_replace_refs
check_replace_refs: fix outdated comment

Merge branch 'jh/json-writer'Junio C Hamano Wed, 15 Aug 2018 22:08:22 +0000 (15:08 -0700)

Merge branch 'jh/json-writer'

Preparatory code to later add json output for telemetry data.

* jh/json-writer:
json_writer: new routines to create JSON data

Merge branch 'bb/make-developer-pedantic'Junio C Hamano Wed, 15 Aug 2018 22:08:22 +0000 (15:08 -0700)

Merge branch 'bb/make-developer-pedantic'

"make DEVELOPER=1 DEVOPTS=pedantic" allows developers to compile
with -pedantic option, which may catch more problematic program
constructs and potential bugs.

* bb/make-developer-pedantic:
Makefile: add a DEVOPTS flag to get pedantic compilation

Merge branch 'es/diff-color-moved-fix'Junio C Hamano Wed, 15 Aug 2018 22:08:22 +0000 (15:08 -0700)

Merge branch 'es/diff-color-moved-fix'

One of the "diff --color-moved" mode "dimmed_zebra" that was named
in an unusual way has been deprecated and replaced by
"dimmed-zebra".

* es/diff-color-moved-fix:
diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"

Merge branch 'bw/protocol-v2'Junio C Hamano Wed, 15 Aug 2018 22:08:21 +0000 (15:08 -0700)

Merge branch 'bw/protocol-v2'

Doc update.

* bw/protocol-v2:
pack-protocol: mention and point to docs for protocol v2

Merge branch 'sg/travis-cocci-diagnose-failure'Junio C Hamano Wed, 15 Aug 2018 22:08:21 +0000 (15:08 -0700)

Merge branch 'sg/travis-cocci-diagnose-failure'

Update the way we run static analysis tool at TravisCI to make it
easier to use its findings.

* sg/travis-cocci-diagnose-failure:
travis-ci: fail if Coccinelle static analysis found something to transform
travis-ci: run Coccinelle static analysis with two parallel jobs

Merge branch 'js/t7406-recursive-submodule-update-order... Junio C Hamano Wed, 15 Aug 2018 22:08:21 +0000 (15:08 -0700)

Merge branch 'js/t7406-recursive-submodule-update-order-fix'

Test fix.

* js/t7406-recursive-submodule-update-order-fix:
t7406: avoid failures solely due to timing issues

Merge branch 'bw/fetch-pack-i18n'Junio C Hamano Wed, 15 Aug 2018 22:08:20 +0000 (15:08 -0700)

Merge branch 'bw/fetch-pack-i18n'

i18n updates.

* bw/fetch-pack-i18n:
fetch-pack: mark die strings for translation

Merge branch 'sg/fast-import-dump-refs-on-checkpoint... Junio C Hamano Wed, 15 Aug 2018 22:08:20 +0000 (15:08 -0700)

Merge branch 'sg/fast-import-dump-refs-on-checkpoint-fix'

Test update.

* sg/fast-import-dump-refs-on-checkpoint-fix:
t9300: wait for background fast-import process to die after killing it

Merge branch 'sb/trailers-docfix'Junio C Hamano Wed, 15 Aug 2018 22:08:19 +0000 (15:08 -0700)

Merge branch 'sb/trailers-docfix'

Doc update.

* sb/trailers-docfix:
Documentation/git-interpret-trailers: explain possible values

Merge branch 'jk/ui-color-always-to-auto'Junio C Hamano Wed, 15 Aug 2018 22:08:19 +0000 (15:08 -0700)

Merge branch 'jk/ui-color-always-to-auto'

Doc formatting fix.

* jk/ui-color-always-to-auto:
Documentation: fix --color option formatting

Remove forward declaration of an enumElijah Newren Wed, 15 Aug 2018 17:54:10 +0000 (10:54 -0700)

Remove forward declaration of an enum

According to http://c-faq.com/null/machexamp.html, sizeof(char*) !=
sizeof(int*) on some platforms. Since an enum could be a char or int
(or long or...), knowing the size of the enum thus is important to
knowing the size of a pointer to an enum, so we cannot just forward
declare an enum the way we can a struct. (Also, modern C++ compilers
apparently define forward declarations of an enum to either be useless
because the enum was defined, or require an explicit size specifier, or
be a compilation error.)

Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

compat/precompose_utf8.h: use more common include guard... Elijah Newren Wed, 15 Aug 2018 17:54:09 +0000 (10:54 -0700)

compat/precompose_utf8.h: use more common include guard style

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

urlmatch.h: fix include guardElijah Newren Wed, 15 Aug 2018 17:54:08 +0000 (10:54 -0700)

urlmatch.h: fix include guard

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Move definition of enum branch_track from cache.h to... Elijah Newren Wed, 15 Aug 2018 17:54:07 +0000 (10:54 -0700)

Move definition of enum branch_track from cache.h to branch.h

'branch_track' feels more closely related to branching, and it is
needed later in branch.h; rather than #include'ing cache.h in branch.h
for this small enum, just move the enum and the external declaration
for git_branch_track to branch.h.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

alloc: make allocate_alloc_state and clear_alloc_state... Elijah Newren Wed, 15 Aug 2018 17:54:06 +0000 (10:54 -0700)

alloc: make allocate_alloc_state and clear_alloc_state more consistent

Since both functions are using the same data type, they should either both
refer to it as void *, or both use the real type (struct alloc_state *).
Opt for the latter.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Add missing includes and forward declarationsElijah Newren Wed, 15 Aug 2018 17:54:05 +0000 (10:54 -0700)

Add missing includes and forward declarations

I looped over the toplevel header files, creating a temporary two-line C
program for each consisting of
#include "git-compat-util.h"
#include $HEADER
This patch is the result of manually fixing errors in compiling those
tiny programs.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

rebase -i: fix numbering in squash messagePhillip Wood Wed, 15 Aug 2018 09:41:25 +0000 (10:41 +0100)

rebase -i: fix numbering in squash message

Commit e12a7ef597 ("rebase -i: Handle "combination of <n> commits" with
GETTEXT_POISON", 2018-04-27) changed the way that individual commit
messages are labelled when squashing commits together. In doing so a
regression was introduced where the numbering of the messages is off by
one. This commit fixes that and adds a test for the numbering.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

chainlint: fix for core.autocrlf=trueJohannes Schindelin Wed, 15 Aug 2018 14:33:44 +0000 (07:33 -0700)

chainlint: fix for core.autocrlf=true

The `chainlint` target compares actual output to expected output, where
the actual output is generated from files that are specifically checked
out with LF-only line endings. So the expected output needs to be
checked out with LF-only line endings, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

partial-clone: render design doc using asciidocJonathan Nieder Tue, 14 Aug 2018 22:28:46 +0000 (15:28 -0700)

partial-clone: render design doc using asciidoc

Rendered documentation can be easier to read than raw text because
headings and emphasized phrases stand out. Add the missing markup and
Makefile rule required to render this design document using asciidoc.

Tested by running

make -C Documentation technical/partial-clone.html

and viewing the output in a browser.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

submodule: add more exhaustive up-path testingÆvar Arnfjörð Bjarmason Tue, 14 Aug 2018 18:59:06 +0000 (18:59 +0000)

submodule: add more exhaustive up-path testing

The tests added in 63e95beb08 ("submodule: port resolve_relative_url
from shell to C", 2016-04-15) didn't do a good job of testing various
up-path invocations where the up-path would bring us beyond even the
URL in question without emitting an error.

These results look nonsensical, but it's worth exhaustively testing
them before fixing any of this code, so we can see which of these
cases were changed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-submodule.sh: accept verbose flag in cmd_update... Stefan Beller Tue, 14 Aug 2018 18:22:02 +0000 (11:22 -0700)

git-submodule.sh: accept verbose flag in cmd_update to be non-quiet

In a56771a668d (builtin/pull: respect verbosity settings in submodules,
2018-01-25), we made sure to pass on both quiet and verbose flag from
builtin/pull.c to the submodule shell script. However git-submodule doesn't
understand a verbose flag, which results in a bug when invoking

git pull --recurse-submodules -v [...]

There are a few different approaches to fix this bug:

1) rewrite 'argv_push_verbosity' or its caller in builtin/pull.c to
cap opt_verbosity at 0. Then 'argv_push_verbosity' would only add
'-q' if any.

2) Have a flag in 'argv_push_verbosity' that specifies if we allow adding
-q or -v (or both).

3) Add -v to git-submodule.sh and make it a no-op

(1) seems like a maintenance burden: What if we add code after
the submodule operations or move submodule operations higher up,
then we have altered the opt_verbosity setting further down the line
in builtin/pull.c.

(2) seems like it could work reasonably well without more regressions

(3) seems easiest to implement as well as actually is a feature with the
last-one-wins rule of passing flags to Git commands.

Reported-by: Jochen Kühner
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

for_each_*_object: move declarations to object-store.hJeff King Tue, 14 Aug 2018 18:21:18 +0000 (14:21 -0400)

for_each_*_object: move declarations to object-store.h

The for_each_loose_object() and for_each_packed_object()
functions are meant to be part of a unified interface: they
use the same set of for_each_object_flags, and it's not
inconceivable that we might one day add a single
for_each_object() wrapper around them.

Let's put them together in a single file, so we can avoid
awkwardness like saying "the flags for this function are
over in cache.h". Moving the loose functions to packfile.h
is silly. Moving the packed functions to cache.h works, but
makes the "cache.h is a kitchen sink" problem worse. The
best place is the recently-created object-store.h, since
these are quite obviously related to object storage.

The for_each_*_in_objdir() functions do not use the same
flags, but they are logically part of the same interface as
for_each_loose_object(), and share callback signatures. So
we'll move those, as well, as they also make sense in
object-store.h.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cat-file: use a single strbuf for all outputJeff King Tue, 14 Aug 2018 18:20:22 +0000 (14:20 -0400)

cat-file: use a single strbuf for all output

When we're in batch mode, we end up in batch_object_write()
for each object, which allocates its own strbuf for each
call. Instead, we can provide a single "scratch" buffer that
gets reused for each output. When running:

git cat-file --batch-all-objects --batch-check='%(objectname)'

on git.git, my best-of-five time drops from:

real 0m0.171s
user 0m0.159s
sys 0m0.012s

to:

real 0m0.133s
user 0m0.121s
sys 0m0.012s

Note that we could do this just by putting the "scratch"
pointer into "struct expand_data", but I chose instead to
add an extra parameter to the callstack. That's more
verbose, but it makes it a bit more obvious what is going
on, which in turn makes it easy to see where we need to be
releasing the string in the caller (right after the loop
which uses it in each case).

Based-on-a-patch-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cat-file: split batch "buf" into two variablesJeff King Tue, 14 Aug 2018 18:18:06 +0000 (14:18 -0400)

cat-file: split batch "buf" into two variables

We use the "buf" strbuf for two things: to read incoming
lines, and as a scratch space for test-expanding the
user-provided format. Let's split this into two variables
with descriptive names, which makes their purpose and
lifetime more clear.

It will also help in a future patch when we start using the
"output" buffer for more expansions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cat-file: use oidset check-and-insertJeff King Tue, 14 Aug 2018 18:14:27 +0000 (14:14 -0400)

cat-file: use oidset check-and-insert

We don't need to check if the oidset has our object before
we insert it; that's done as part of the insertion. We can
just rely on the return value from oidset_insert(), which
saves one hash lookup per object.

This measurable speedup is tiny and within the run-to-run
noise, but the result is simpler to read, too.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5310-pack-bitmaps: fix bogus 'pack-objects to file... SZEDER Gábor Tue, 14 Aug 2018 11:47:21 +0000 (13:47 +0200)

t5310-pack-bitmaps: fix bogus 'pack-objects to file can use bitmap' test

The test 'pack-objects to file can use bitmap' added in 645c432d61
(pack-objects: use reachability bitmap index when generating
non-stdout pack, 2016-09-10) is silently buggy and doesn't check what
it's supposed to.

In 't5310-pack-bitmaps.sh', the 'list_packed_objects' helper function
does what its name implies by running:

git show-index <"$1" | cut -d' ' -f2

The test in question invokes this function like this:

list_packed_objects <packa-$packasha1.idx >packa.objects &&
list_packed_objects <packb-$packbsha1.idx >packb.objects &&
test_cmp packa.objects packb.objects

Note how these two callsites don't specify the name of the pack index
file as the function's parameter, but redirect the function's standard
input from it. This triggers an error message from the shell, as it
has no filename to redirect from in the function, but this error is
ignored, because it happens upstream of a pipe. Consequently, both
invocations produce empty 'pack{a,b}.objects' files, and the
subsequent 'test_cmp' happily finds those two empty files identical.

Fix these two 'list_packed_objects' invocations by specifying the pack
index files as parameters. Furthermore, eliminate the pipe in that
function by replacing it with an &&-chained pair of commands using an
intermediate file, so a failure of 'git show-index' or the shell
redirection will fail the test.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mingw: enable atomic O_APPENDJohannes Sixt Mon, 13 Aug 2018 19:02:50 +0000 (21:02 +0200)

mingw: enable atomic O_APPEND

The Windows CRT implements O_APPEND "manually": on write() calls, the
file pointer is set to EOF before the data is written. Clearly, this is
not atomic. And in fact, this is the root cause of failures observed in
t5552-skipping-fetch-negotiator.sh and t5503-tagfollow.sh, where
different processes write to the same trace file simultanously; it also
occurred in t5400-send-pack.sh, but there it was worked around in
71406ed4d6 ("t5400: avoid concurrent writes into a trace file",
2017-05-18).

Fortunately, Windows does support atomic O_APPEND semantics using the
file access mode FILE_APPEND_DATA. Provide an implementation that does.

This implementation is minimal in such a way that it only implements
the open modes that are actually used in the Git code base. Emulation
for other modes can be added as necessary later. To become aware of
the necessity early, the unusal error ENOSYS is reported if an
unsupported mode is encountered.

Diagnosed-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blame.c: remove implicit dependency on the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:41 +0000 (18:14 +0200)

blame.c: remove implicit dependency on the_index

Side note, since we gain access to the right repository, we can stop
rely on the_repository in this code as well.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

apply.c: remove implicit dependency on the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:40 +0000 (18:14 +0200)

apply.c: remove implicit dependency on the_index

Use apply_state->repo->index instead of the_index (in most cases,
unless we need to use a temporary index in some functions). Let the
callers (am and apply) tell us what to use, instead of always assuming
to operate on the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

apply.c: make init_apply_state() take a struct repositoryNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:39 +0000 (18:14 +0200)

apply.c: make init_apply_state() take a struct repository

We're moving away from the_index in this code. "struct index_state *"
could be added to struct apply_state. But let's aim long term and put
struct repository here instead so that we could even avoid more global
states in the future. The index will be available via
apply_state->repo->index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

apply.c: pass struct apply_state to more functionsNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:38 +0000 (18:14 +0200)

apply.c: pass struct apply_state to more functions

we're going to remove the dependency on the_index by moving 'struct
index_state *' to somewhere inside struct apply_state. Let's make sure
relevant functions have access to this struct now and reduce the diff
noise when the actual conversion happens.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

resolve-undo.c: use the right index instead of the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:37 +0000 (18:14 +0200)

resolve-undo.c: use the right index instead of the_index

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

archive-*.c: use the right repositoryNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:36 +0000 (18:14 +0200)

archive-*.c: use the right repository

With 'struct archive_args' gaining new repository pointer, we don't
have to assume the_repository in the archive backends anymore.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

archive.c: avoid access to the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:35 +0000 (18:14 +0200)

archive.c: avoid access to the_index

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

grep: use the right index instead of the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:34 +0000 (18:14 +0200)

grep: use the right index instead of the_index

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

attr: remove index from git_attr_set_direction()Nguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:33 +0000 (18:14 +0200)

attr: remove index from git_attr_set_direction()

Since attr checking API now take the index, there's no need to set an
index in advance with this call. Most call sites are straightforward
because they either pass the_index or NULL (which defaults back to
the_index previously). There's only one suspicious call site in
unpack-trees.c where it sets a different index.

This code in unpack-trees is about to check out entries from the
new/temporary index after merging is done in it. The attributes will
be used by entry.c code to do crlf conversion if needed. entry.c now
respects struct checkout's istate field, and this field is correctly
set in unpack-trees.c, there should be no regression from this change.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

entry.c: use the right index instead of the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:32 +0000 (18:14 +0200)

entry.c: use the right index instead of the_index

checkout-index.c needs update because if checkout->istate is NULL,
ie_match_stat() will crash. Previously this is ie_match_stat(&the_index, ..)
so it will not crash, but it is not technically correct either.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

submodule.c: use the right index instead of the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:31 +0000 (18:14 +0200)

submodule.c: use the right index instead of the_index

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pathspec.c: use the right index instead of the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:30 +0000 (18:14 +0200)

pathspec.c: use the right index instead of the_index

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

unpack-trees: avoid the_index in verify_absent()Nguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:29 +0000 (18:14 +0200)

unpack-trees: avoid the_index in verify_absent()

Both functions that are updated in this commit are called by
verify_absent(), which is part of the "unpack-trees" operation that is
supposed to work on any index file specified by the caller. Thanks to
Brandon [1] [2], an implicit dependency on the_index is exposed. This
commit fixes it.

In both functions, it makes sense to use src_index to check for
exclusion because it's almost unchanged and should give us the same
outcome as if running the exclude check before the unpack.

It's "almost unchanged" because we do invalidate cache-tree and
untracked cache in the source index. But this should not affect how
exclude machinery uses the index: to see if a file is tracked, and to
read a blob from the index instead of worktree if it's marked
skip-worktree (i.e. it's not available in worktree)

[1] a0bba65b10 (dir: convert is_excluded to take an index - 2017-05-05
[2] 2c1eb10454 (dir: convert read_directory to take an index - 2017-05-05)

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

unpack-trees: convert clear_ce_flags* to avoid the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:28 +0000 (18:14 +0200)

unpack-trees: convert clear_ce_flags* to avoid the_index

Prior to fba92be8f7, this code implicitly (and incorrectly) assumes
the_index when running the exclude machinery. fba92be8f7 helps show
this problem clearer because unpack-trees operation is supposed to
work on whatever index the caller specifies... not specifically
the_index.

Update the code to use "istate" argument that's originally from
mark_new_skip_worktree(). From the call sites, both in unpack_trees(),
you can see that this function works on two separate indexes:
o->src_index and o->result. The second mark_new_skip_worktree() so far
has incorecctly applied exclude rules on o->src_index instead of
o->result. It's unclear what is the consequences of this, but it's
definitely wrong.

[1] fba92be8f7 (dir: convert is_excluded_from_list to take an index -
2017-05-05)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

unpack-trees: don't shadow global var the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:27 +0000 (18:14 +0200)

unpack-trees: don't shadow global var the_index

This function mark_new_skip_worktree() has an argument named the_index
which is also the name of a global variable. While they have different
types (the global the_index is not a pointer) mistakes can easily
happen and it's also confusing for readers. Rename the function
argument to something other than the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

unpack-trees: add a note about path invalidationNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:26 +0000 (18:14 +0200)

unpack-trees: add a note about path invalidation

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

unpack-trees: remove 'extern' on function declarationNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:25 +0000 (18:14 +0200)

unpack-trees: remove 'extern' on function declaration

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ls-files: correct index argument to get_convert_attr_as... Nguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:24 +0000 (18:14 +0200)

ls-files: correct index argument to get_convert_attr_ascii()

write_eolinfo() does take an istate as function argument and it should
be used instead of the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

preload-index.c: use the right index instead of the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:23 +0000 (18:14 +0200)

preload-index.c: use the right index instead of the_index

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

dir.c: remove an implicit dependency on the_index in... Nguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:22 +0000 (18:14 +0200)

dir.c: remove an implicit dependency on the_index in pathspec code

Make the match_patchspec API and friends take an index_state instead
of assuming the_index in dir.c. All external call sites are converted
blindly to keep the patch simple and retain current behavior.
Individual call sites may receive further updates to use the right
index instead of the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

convert.c: remove an implicit dependency on the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:21 +0000 (18:14 +0200)

convert.c: remove an implicit dependency on the_index

Make the convert API take an index_state instead of assuming the_index
in convert.c. All external call sites are converted blindly to keep
the patch simple and retain current behavior. Individual call sites
may receive further updates to use the right index instead of
the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

attr: remove an implicit dependency on the_indexNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:20 +0000 (18:14 +0200)

attr: remove an implicit dependency on the_index

Make the attr API take an index_state instead of assuming the_index in
attr code. All call sites are converted blindly to keep the patch
simple and retain current behavior. Individual call sites may receive
further updates to use the right index instead of the_index.

There is one ugly temporary workaround added in attr.c that needs some
more explanation.

Commit c24f3abace (apply: file commited with CRLF should roundtrip
diff and apply - 2017-08-19) forces one convert_to_git() call to NOT
read the index at all. But what do you know, we read it anyway by
falling back to the_index. When "istate" from convert_to_git is now
propagated down to read_attr_from_array() we will hit segfault
somewhere inside read_blob_data_from_index.

The right way of dealing with this is to kill "use_index" variable and
only follow "istate" but at this stage we are not ready for that:
while most git_attr_set_direction() calls just passes the_index to be
assigned to use_index, unpack-trees passes a different one which is
used by entry.c code, which has no way to know what index to use if we
delete use_index. So this has to be done later.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cache-tree: wrap the_index based wrappers with #ifdefNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:19 +0000 (18:14 +0200)

cache-tree: wrap the_index based wrappers with #ifdef

This puts update_main_cache_tree() and write_cache_as_tree() in the
same group of "index compat" functions that assume the_index
implicitly, which should only be used within builtin/ or t/helper.

sequencer.c is also updated to not use these functions. As of now, no
files outside builtin/ use these functions anymore.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff.c: move read_index() code back to the callerNguyễn Thái Ngọc Duy Mon, 13 Aug 2018 16:14:18 +0000 (18:14 +0200)

diff.c: move read_index() code back to the caller

This code is only needed for diff-tree (since f0c6b2a2fd ([PATCH]
Optimize diff-tree -[CM] --stdin - 2005-05-27)). Let the caller do the
preparation instead and avoid read_index() in diff.c code.

read_index() should be avoided (in addition to the_index) because it
uses get_index_file() underneath to get the path $GIT_DIR/index. This
effectively pulls the_repository in and may become the only reason to
pull a 'struct repository *' in diff.c. Let's keep the dependencies as
few as possible and kick it back to diff-tree.c

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cat-file: support "unordered" output for --batch-all... Jeff King Fri, 10 Aug 2018 23:24:57 +0000 (19:24 -0400)

cat-file: support "unordered" output for --batch-all-objects

If you're going to access the contents of every object in a
packfile, it's generally much more efficient to do so in
pack order, rather than in hash order. That increases the
locality of access within the packfile, which in turn is
friendlier to the delta base cache, since the packfile puts
related deltas next to each other. By contrast, hash order
is effectively random, since the sha1 has no discernible
relationship to the content.

This patch introduces an "--unordered" option to cat-file
which iterates over packs in pack-order under the hood. You
can see the results when dumping all of the file content:

$ time ./git cat-file --batch-all-objects --buffer --batch | wc -c
6883195596

real 0m44.491s
user 0m42.902s
sys 0m5.230s

$ time ./git cat-file --unordered \
--batch-all-objects --buffer --batch | wc -c
6883195596

real 0m6.075s
user 0m4.774s
sys 0m3.548s

Same output, different order, way faster. The same speed-up
applies even if you end up accessing the object content in a
different process, like:

git cat-file --batch-all-objects --buffer --batch-check |
grep blob |
git cat-file --batch='%(objectname) %(rest)' |
wc -c

Adding "--unordered" to the first command drops the runtime
in git.git from 24s to 3.5s.

Side note: there are actually further speedups available
for doing it all in-process now. Since we are outputting
the object content during the actual pack iteration, we
know where to find the object and could skip the extra
lookup done by oid_object_info(). This patch stops short
of that optimization since the underlying API isn't ready
for us to make those sorts of direct requests.

So if --unordered is so much better, why not make it the
default? Two reasons:

1. We've promised in the documentation that --batch-all-objects
outputs in hash order. Since cat-file is plumbing,
people may be relying on that default, and we can't
change it.

2. It's actually _slower_ for some cases. We have to
compute the pack revindex to walk in pack order. And
our de-duplication step uses an oidset, rather than a
sort-and-dedup, which can end up being more expensive.
If we're just accessing the type and size of each
object, for example, like:

git cat-file --batch-all-objects --buffer --batch-check

my best-of-five warm cache timings go from 900ms to
1100ms using --unordered. Though it's possible in a
cold-cache or under memory pressure that we could do
better, since we'd have better locality within the
packfile.

And one final question: why is it "--unordered" and not
"--pack-order"? The answer is again two-fold:

1. "pack order" isn't a well-defined thing across the
whole set of objects. We're hitting loose objects, as
well as objects in multiple packs, and the only
ordering we're promising is _within_ a single pack. The
rest is apparently random.

2. The point here is optimization. So we don't want to
promise any particular ordering, but only to say that
we will choose an ordering which is likely to be
efficient for accessing the object content. That leaves
the door open for further changes in the future without
having to add another compatibility option.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cat-file: rename batch_{loose,packed}_object callbacksJeff King Fri, 10 Aug 2018 23:17:14 +0000 (19:17 -0400)

cat-file: rename batch_{loose,packed}_object callbacks

We're not really doing the batch-show operation in these
callbacks, but just collecting the set of objects. That
distinction will become more important in a future patch, so
let's rename them now to avoid cluttering that diff.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t1006: test cat-file --batch-all-objects with duplicatesJeff King Fri, 10 Aug 2018 23:16:40 +0000 (19:16 -0400)

t1006: test cat-file --batch-all-objects with duplicates

The test for --batch-all-objects in t1006 covers a variety
of object storage situations, but one thing it doesn't cover
is that we avoid mentioning duplicate objects. We won't have
any because running "git repack -ad" will have packed them
all and deleted the loose ones.

This does work (because we sort and de-dup the output list),
but it's good to include it in our test. And doubly so for
when we add an unordered mode which has to de-dup in a
different way.

Note that we cannot just re-create one of the objects, as
Git will omit the write of an object that is already
present. However, we can create a new pack with one of the
objects, which forces the duplication.

One alternative would be to just use "git repack -a" instead
of "-ad". But then _every_ object would be duplicated as
loose and packed, and we might miss a bug that omits packed
objects (because we'd show their loose counterparts).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

for_each_packed_object: support iterating in pack-orderJeff King Fri, 10 Aug 2018 23:15:49 +0000 (19:15 -0400)

for_each_packed_object: support iterating in pack-order

We currently iterate over objects within a pack in .idx
order, which uses the object hashes. That means that it
is effectively random with respect to the location of the
object within the pack. If you're going to access the actual
object data, there are two reasons to move linearly through
the pack itself:

1. It improves the locality of access in the packfile. In
the cold-cache case, this may mean fewer disk seeks, or
better usage of disk cache.

2. We store related deltas together in the packfile. Which
means that the delta base cache can operate much more
efficiently if we visit all of those related deltas in
sequence, as the earlier items are likely to still be
in the cache. Whereas if we visit the objects in
random order, our cache entries are much more likely to
have been evicted by unrelated deltas in the meantime.

So in general, if you're going to access the object contents
pack order is generally going to end up more efficient.

But if you're simply generating a list of object names, or
if you're going to end up sorting the result anyway, you're
better off just using the .idx order, as finding the pack
order means generating the in-memory pack-revindex.
According to the numbers in 8b8dfd5132 (pack-revindex:
radix-sort the revindex, 2013-07-11), that takes about 200ms
for linux.git, and 20ms for git.git (those numbers are a few
years old but are still a good ballpark).

That makes it a good optimization for some cases (we can
save tens of seconds in git.git by having good locality of
delta access, for a 20ms cost), but a bad one for others
(e.g., right now "cat-file --batch-all-objects
--batch-check="%(objectname)" is 170ms in git.git, so adding
20ms to that is noticeable).

Hence this patch makes it an optional flag. You can't
actually do any interesting timings yet, as it's not plumbed
through to any user-facing tools like cat-file. That will
come in a later patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

for_each_*_object: give more comprehensive docstringsJeff King Fri, 10 Aug 2018 23:11:14 +0000 (19:11 -0400)

for_each_*_object: give more comprehensive docstrings

We already mention the local/alternate behavior of these
functions, but we can help clarify a few other behaviors:

- there's no need to mention LOCAL_ONLY specifically, since
we already reference the flags by type (and as we add
more flags, we don't want to have to mention each)

- clarify that reachability doesn't matter here; this is
all accessible objects

- what ordering/uniqueness guarantees we give

- how pack-specific flags are handled for the loose case

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

for_each_*_object: take flag arguments as enumJeff King Fri, 10 Aug 2018 23:09:44 +0000 (19:09 -0400)

for_each_*_object: take flag arguments as enum

It's not wrong to pass our flags in an "unsigned", as we
know it will be at least as large as the enum. However,
using the enum in the declaration makes it more obvious
where to find the list of flags.

While we're here, let's also drop the "extern" noise-words
from the declarations, per our modern coding style.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

for_each_*_object: store flag definitions in a single... Jeff King Fri, 10 Aug 2018 23:09:06 +0000 (19:09 -0400)

for_each_*_object: store flag definitions in a single location

These flags were split between cache.h and packfile.h,
because some of the flags apply only to packs. However, they
share a single numeric namespace, since both are respected
for the packed variant. Let's make sure they're defined
together so that nobody accidentally adds a new flag in one
location that duplicates the other.

While we're here, let's also put them in an enum (which
helps debugger visibility) and use "(1<<n)" rather than
counting powers of 2 manually.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pull doc: fix a long-standing grammar errorÆvar Arnfjörð Bjarmason Mon, 13 Aug 2018 19:22:49 +0000 (19:22 +0000)

pull doc: fix a long-standing grammar error

It should be "is not an empty string" not "is not empty string". This
fixes wording originally introduced in ab9b31386b ("Documentation:
multi-head fetch.", 2005-08-24).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch tests: correct a comment "remove it" -> "remove... Ævar Arnfjörð Bjarmason Mon, 13 Aug 2018 19:22:48 +0000 (19:22 +0000)

fetch tests: correct a comment "remove it" -> "remove them"

Correct a comment referring to the removal of just the branch to also
refer to the tag. This should have been changed in my
ca3065e7e7 ("fetch tests: add a tag to be deleted to the pruning
tests", 2018-02-09) when the tag deletion was added, but I missed it
at the time.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

chainlint: add test of pathological case which triggere... Eric Sunshine Mon, 13 Aug 2018 08:47:39 +0000 (04:47 -0400)

chainlint: add test of pathological case which triggered false positive

This extract from contrib/subtree/t7900 triggered a false positive due
to three chainlint limitations:

* recognizing only a "blessed" set of here-doc tag names in a subshell
("EOF", "EOT", "INPUT_END"), of which "TXT" is not a member

* inability to recognize multi-line $(...) when the first statement of
the body is cuddled with the opening "$("

* inability to recognize multiple constructs on a single line, such as
opening a multi-line $(...) and starting a here-doc

Now that all of these shortcomings have been addressed, turn this rather
pathological bit of shell coding into a chainlint test case.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

chainlint: recognize multi-line quoted strings more... Eric Sunshine Mon, 13 Aug 2018 08:47:38 +0000 (04:47 -0400)

chainlint: recognize multi-line quoted strings more robustly

chainlint.sed recognizes multi-line quoted strings within subshells:

echo "abc
def" >out &&

so it can avoid incorrectly classifying lines internal to the string as
breaking the &&-chain. To identify the first line of a multi-line
string, it checks if the line contains a single quote. However, this is
fragile and can be easily fooled by a line containing multiple strings:

echo "xyz" "abc
def" >out &&

Make detection more robust by checking for an odd number of quotes
rather than only a single one.

(Escaped quotes are not handled, but support may be added later.)

The original multi-line string recognizer rather cavalierly threw away
all but the final quote, whereas the new one is careful to retain all
quotes, so the "expected" output of a couple existing chainlint tests is
updated to account for this new behavior.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

chainlint: let here-doc and multi-line string commence... Eric Sunshine Mon, 13 Aug 2018 08:47:37 +0000 (04:47 -0400)

chainlint: let here-doc and multi-line string commence on same line

After swallowing a here-doc, chainlint.sed assumes that no other
processing needs to be done on the line aside from checking for &&-chain
breakage; likewise, after folding a multi-line quoted string. However,
it's conceivable (even if unlikely in practice) that both a here-doc and
a multi-line quoted string might commence on the same line:

cat <<\EOF && echo "foo
bar"
data
EOF

Support this case by sending the line (after swallowing and folding)
through the normal processing sequence rather than jumping directly to
the check for broken &&-chain.

This change also allows other somewhat pathological cases to be handled,
such as closing a subshell on the same line starting a here-doc:

(
cat <<-\INPUT)
data
INPUT

or, for instance, opening a multi-line $(...) expression on the same
line starting a here-doc:

x=$(cat <<-\END &&
data
END
echo "x")

among others.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

chainlint: recognize multi-line $(...) when command... Eric Sunshine Mon, 13 Aug 2018 08:47:36 +0000 (04:47 -0400)

chainlint: recognize multi-line $(...) when command cuddled with "$("

For multi-line $(...) expressions nested within subshells, chainlint.sed
only recognizes:

x=$(
echo foo &&
...

but it is not unlikely that test authors may also cuddle the command
with the opening "$(", so support that style, as well:

x=$(echo foo &&
...

The closing ")" is already correctly recognized when cuddled or not.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

chainlint: match 'quoted' here-doc tagsEric Sunshine Mon, 13 Aug 2018 08:47:35 +0000 (04:47 -0400)

chainlint: match 'quoted' here-doc tags

A here-doc tag can be quoted ('EOF') or escaped (\EOF) to suppress
interpolation within the body. Although, chainlint recognizes escaped
tags, it does not know about quoted tags. For completeness, teach it to
recognize quoted tags, as well.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

chainlint: match arbitrary here-docs tags rather than... Eric Sunshine Mon, 13 Aug 2018 08:47:34 +0000 (04:47 -0400)

chainlint: match arbitrary here-docs tags rather than hard-coded names

chainlint.sed swallows top-level here-docs to avoid being fooled by
content which might look like start-of-subshell. It likewise swallows
here-docs in subshells to avoid marking content lines as breaking the
&&-chain, and to avoid being fooled by content which might look like
end-of-subshell, start-of-nested-subshell, or other specially-recognized
constructs.

At the time of implementation, it was believed that it was not possible
to support arbitrary here-doc tag names since 'sed' provides no way to
stash the opening tag name in a variable for later comparison against a
line signaling end-of-here-doc. Consequently, tag names are hard-coded,
with "EOF" being the only tag recognized at the top-level, and only
"EOF", "EOT", and "INPUT_END" being recognized within subshells. Also,
special care was taken to avoid being confused by here-docs nested
within other here-docs.

In practice, this limited number of hard-coded tag names has been "good
enough" for the 13000+ existing Git test, despite many of those tests
using tags other than the recognized ones, since the bodies of those
here-docs do not contain content which would fool the linter.
Nevertheless, the situation is not ideal since someone writing new
tests, and choosing a name not in the "blessed" set could potentially
trigger a false-positive.

To address this shortcoming, upgrade chainlint.sed to handle arbitrary
here-doc tag names, both at the top-level and within subshells.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mergetool: don't suggest to continue after last fileNicholas Guriev Mon, 13 Aug 2018 05:09:29 +0000 (08:09 +0300)

mergetool: don't suggest to continue after last file

Eliminate an unnecessary prompt to continue after failed merger, by
not calling the prompt_after_failed_merge function when only one
iteration remains.

Uses positional parameters to count files in the list to make it
easier to see if we have any more paths to process from within the
loop.

Signed-off-by: Nicholas Guriev <guriev-ns@ya.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5318: avoid unnecessary command substitutionsSZEDER Gábor Mon, 13 Aug 2018 00:30:10 +0000 (02:30 +0200)

t5318: avoid unnecessary command substitutions

Two tests added in dade47c06c (commit-graph: add repo arg to graph
readers, 2018-07-11) prepare the contents of 'expect' files by
'echo'ing the results of command substitutions. That's unncessary,
avoid them by directly saving the output of the commands executed in
those command substitutions.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5318: use 'test_cmp_bin' to compare commit-graph filesSZEDER Gábor Mon, 13 Aug 2018 11:52:43 +0000 (13:52 +0200)

t5318: use 'test_cmp_bin' to compare commit-graph files

The commit-graph files are binary files, so they should not be
compared with 'test_cmp', because that might cause issues like
crashing[1] or infinite loop[2] on Windows, where 'test_cmp' is a
shell function to deal with random LF-CRLF conversions[3].

Use 'test_cmp_bin' instead.

1 - b93e6e3663 (t5000, t5003: do not use test_cmp to compare binary
files, 2014-06-04)
2 - f9f3851b4d (t9300: use test_cmp_bin instead of test_cmp to compare
binary files, 2014-09-12)
3 - 4d715ac05c (Windows: a test_cmp that is agnostic to random LF <>
CRLF conversions, 2013-10-26)

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: use dim/bold cues to improve dual color... Johannes Schindelin Mon, 13 Aug 2018 11:33:32 +0000 (04:33 -0700)

range-diff: use dim/bold cues to improve dual color mode

It *is* a confusing thing to look at a diff of diffs. All too easy is it
to mix up whether the -/+ markers refer to the "inner" or the "outer"
diff, i.e. whether a `+` indicates that a line was added by either the
old or the new diff (or both), or whether the new diff does something
different than the old diff.

To make things easier to process for normal developers, we introduced
the dual color mode which colors the lines according to the commit diff,
i.e. lines that are added by a commit (whether old, new, or both) are
colored in green. In non-dual color mode, the lines would be colored
according to the outer diff: if the old commit added a line, it would be
colored red (because that line addition is only present in the first
commit range that was specified on the command-line, i.e. the "old"
commit, but not in the second commit range, i.e. the "new" commit).

However, this dual color mode is still not making things clear enough,
as we are looking at two levels of diffs, and we still only pick a color
according to *one* of them (the outer diff marker is colored
differently, of course, but in particular with deep indentation, it is
easy to lose track of that outer diff marker's background color).

Therefore, let's add another dimension to the mix. Still use
green/red/normal according to the commit diffs, but now also dim the
lines that were only in the old commit, and use bold face for the lines
that are only in the new commit.

That way, it is much easier not to lose track of, say, when we are
looking at a line that was added in the previous iteration of a patch
series but the new iteration adds a slightly different version: the
obsolete change will be dimmed, the current version of the patch will be
bold.

At least this developer has a much easier time reading the range-diffs
that way.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: make --dual-color the default modeJohannes Schindelin Mon, 13 Aug 2018 11:33:30 +0000 (04:33 -0700)

range-diff: make --dual-color the default mode

After using this command extensively for the last two months, this
developer came to the conclusion that even if the dual color mode still
leaves a lot of room for confusion about what was actually changed, the
non-dual color mode is substantially worse in that regard.

Therefore, we really want to make the dual color mode the default.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: left-pad patch numbersJohannes Schindelin Mon, 13 Aug 2018 11:33:28 +0000 (04:33 -0700)

range-diff: left-pad patch numbers

As pointed out by Elijah Newren, tbdiff has this neat little alignment
trick where it outputs the commit pairs with patch numbers that are
padded to the maximal patch number's width:

1: cafedead = 1: acefade first patch
[...]
314: beefeada < 314: facecab up to PI!

Let's do the same in range-diff, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

completion: support `git range-diff`Johannes Schindelin Mon, 13 Aug 2018 11:33:27 +0000 (04:33 -0700)

completion: support `git range-diff`

Tab completion of `git range-diff` is very convenient, especially
given that the revision arguments to specify the commit ranges to
compare are typically more complex than, say, what is normally passed
to `git log`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: populate the man pageJohannes Schindelin Mon, 13 Aug 2018 11:33:25 +0000 (04:33 -0700)

range-diff: populate the man page

The bulk of this patch consists of a heavily butchered version of
tbdiff's README written by Thomas Rast and Thomas Gummerer, lifted from
https://github.com/trast/tbdiff.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff --dual-color: skip white-space warningsJohannes Schindelin Mon, 13 Aug 2018 11:33:24 +0000 (04:33 -0700)

range-diff --dual-color: skip white-space warnings

When displaying a diff of diffs, it is possible that there is an outer
`+` before a context line. That happens when the context changed between
old and new commit. When that context line starts with a tab (after the
space that marks it as context line), our diff machinery spits out a
white-space error (space before tab), but in this case, that is
incorrect.

Rather than adding a specific whitespace flag that specifically ignores
the first space in the output (and might miss other problems with the
white-space warnings), let's just skip handling white-space errors in
dual color mode to begin with.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: offer to dual-color the diffsJohannes Schindelin Mon, 13 Aug 2018 11:33:22 +0000 (04:33 -0700)

range-diff: offer to dual-color the diffs

When showing what changed between old and new commits, we show a diff of
the patches. This diff is a diff between diffs, therefore there are
nested +/- signs, and it can be relatively hard to understand what is
going on.

With the --dual-color option, the preimage and the postimage are colored
like the diffs they are, and the *outer* +/- sign is inverted for
clarity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: add an internal option to dual-color diffs of... Johannes Schindelin Mon, 13 Aug 2018 11:33:20 +0000 (04:33 -0700)

diff: add an internal option to dual-color diffs of diffs

When diffing diffs, it can be quite daunting to figure out what the heck
is going on, as there are nested +/- signs.

Let's make this easier by adding a flag in diff_options that allows
color-coding the outer diff sign with inverted colors, so that the
preimage and postimage is colored like the diff it is.

Of course, this really only makes sense when the preimage and postimage
*are* diffs. So let's not expose this flag via a command-line option for
now.

This is a feature that was invented by git-tbdiff, and it will be used
by `git range-diff` in the next commit, by offering it via a new option:
`--dual-color`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

color: add the meta color GIT_COLOR_REVERSEJohannes Schindelin Mon, 13 Aug 2018 11:33:19 +0000 (04:33 -0700)

color: add the meta color GIT_COLOR_REVERSE

This "color" simply reverts background and foreground. It will be used
in the upcoming "dual color" mode of `git range-diff`, where we will
reverse colors for the -/+ markers and the fragment headers of the
"outer" diff.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: use color for the commit pairsJohannes Schindelin Mon, 13 Aug 2018 11:33:18 +0000 (04:33 -0700)

range-diff: use color for the commit pairs

Arguably the most important part of `git range-diff`'s output is the
list of commits in the two branches, together with their relationships.

For that reason, tbdiff introduced color-coding that is pretty
intuitive, especially for unchanged patches (all dim yellow, like the
first line in `git show`'s output) vs modified patches (old commit is
red, new commit is green). Let's imitate that color scheme.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: add testsThomas Rast Mon, 13 Aug 2018 11:33:16 +0000 (04:33 -0700)

range-diff: add tests

These are essentially lifted from https://github.com/trast/tbdiff, with
light touch-ups to account for the command now being named `git
range-diff`.

Apart from renaming `tbdiff` to `range-diff`, only one test case needed
to be adjusted: 11 - 'changed message'.

The underlying reason it had to be adjusted is that diff generation is
sometimes ambiguous. In this case, a comment line and an empty line are
added, but it is ambiguous whether they were added after the existing
empty line, or whether an empty line and the comment line are added
*before* the existing empty line. And apparently xdiff picks a different
option here than Python's difflib.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: do not show "function names" in hunk headersJohannes Schindelin Mon, 13 Aug 2018 11:33:14 +0000 (04:33 -0700)

range-diff: do not show "function names" in hunk headers

We are comparing complete, formatted commit messages with patches. There
are no function names here, so stop looking for them.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: adjust the output of the commit pairsJohannes Schindelin Mon, 13 Aug 2018 11:33:13 +0000 (04:33 -0700)

range-diff: adjust the output of the commit pairs

This not only uses "dashed stand-ins" for "pairs" where one side is
missing (i.e. unmatched commits that are present only in one of the two
commit ranges), but also adds onelines for the reader's pleasure.

This change brings `git range-diff` yet another step closer to
feature parity with tbdiff: it now shows the oneline, too, and indicates
with `=` when the commits have identical diffs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: suppress the diff headersJohannes Schindelin Mon, 13 Aug 2018 11:33:11 +0000 (04:33 -0700)

range-diff: suppress the diff headers

When showing the diff between corresponding patches of the two branch
versions, we have to make up a fake filename to run the diff machinery.

That filename does not carry any meaningful information, hence tbdiff
suppresses it. So we should, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: indent the diffs just like tbdiffJohannes Schindelin Mon, 13 Aug 2018 11:33:10 +0000 (04:33 -0700)

range-diff: indent the diffs just like tbdiff

The main information in the `range-diff` view comes from the list of
matching and non-matching commits, the diffs are additional information.
Indenting them helps with the reading flow.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

range-diff: right-trim commit messagesJohannes Schindelin Mon, 13 Aug 2018 11:33:08 +0000 (04:33 -0700)

range-diff: right-trim commit messages

When comparing commit messages, we need to keep in mind that they are
indented by four spaces. That is, empty lines are no longer empty, but
have "trailing whitespace". When displaying them in color, that results
in those nagging red lines.

Let's just right-trim the lines in the commit message, it's not like
trailing white-space in the commit messages are important enough to care
about in `git range-diff`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>