gitweb.git
progress: break too long progress bar linesSZEDER Gábor Fri, 12 Apr 2019 19:45:15 +0000 (21:45 +0200)

progress: break too long progress bar lines

Some of the recently added progress indicators have quite long titles,
which might be even longer when translated to some languages, and when
they are shown while operating on bigger repositories, then the
progress bar grows longer than the default 80 column terminal width.

When the progress bar exceeds the width of the terminal it gets
line-wrapped, and after that the CR at the end doesn't return to the
beginning of the progress bar, but to the first column of its last
line. Consequently, the first line of the previously shown progress
bar is not overwritten by the next, and we end up with a bunch of
truncated progress bar lines scrolling past:

$ LANG=es_ES.UTF-8 git commit-graph write
Encontrando commits para commit graph entre los objetos empaquetados: 2% (1599
Encontrando commits para commit graph entre los objetos empaquetados: 3% (1975
Encontrando commits para commit graph entre los objetos empaquetados: 4% (2633
Encontrando commits para commit graph entre los objetos empaquetados: 5% (3292
[...]

Prevent this by breaking progress bars after the title once they
exceed the width of the terminal, so the counter and optional
percentage and throughput, i.e. all changing parts, are on the last
line. Subsequent updates will from then on only refresh the changing
parts, but not the title, and it will look like this:

$ LANG=es_ES.UTF-8 ~/src/git/git commit-graph write
Encontrando commits para commit graph entre los objetos empaquetados:
100% (6584502/6584502), listo.
Calculando números de generación de commit graph: 100% (824705/824705), listo.
Escribiendo commit graph en 4 pasos: 100% (3298820/3298820), listo.

Note that the number of columns in the terminal is cached by
term_columns(), so this might not kick in when it should when a
terminal window is resized while the operation is running.
Furthermore, this change won't help if the terminal is so narrow that
the counters don't fit on one line, but I would put this in the "If it
hurts, don't do it" box.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

progress: clear previous progress update dynamicallySZEDER Gábor Fri, 12 Apr 2019 19:45:14 +0000 (21:45 +0200)

progress: clear previous progress update dynamically

When the progress bar includes throughput, its length can shorten as
the unit of display changes from KiB to MiB. To cover up remnants of
the previous progress bar when such a change of units happens we
always print three spaces at the end of the progress bar.

Alas, covering only three characters is not quite enough: when both
the total and the throughput happen to change units from KiB to MiB in
the same update, then the progress bar's length is shortened by four
characters (or maybe even more!):

Receiving objects: 25% (2901/11603), 772.01 KiB | 733.00 KiB/s
Receiving objects: 27% (3133/11603), 1.43 MiB | 1.16 MiB/s s

and a stray 's' is left behind.

So instead of hard-coding the three characters to cover, let's compare
the length of the current progress bar with the previous one, and
cover up as many characters as needed.

Sure, it would be much simpler to just print more spaces at the end of
the progress bar, but this approach is more future-proof, and it won't
print extra spaces when none are needed, notably when the progress bar
doesn't show throughput and thus never shrinks.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tests: disallow the use of abbreviated options (by... Johannes Schindelin Fri, 12 Apr 2019 09:37:24 +0000 (02:37 -0700)

tests: disallow the use of abbreviated options (by default)

Git's command-line parsers support uniquely abbreviated options, e.g.
`git init --ba` would automatically expand `--ba` to `--bare`.

This is a very convenient feature in every day life for Git users, in
particular when tab completion is not available.

However, it is not a good idea to rely on that in Git's test suite, as
something that is a unique abbreviation of a command line option today
might no longer be a unique abbreviation tomorrow.

For example, if a future contribution added a new mode
`git init --babyproofing` and a previously-introduced test case used the
fact that `git init --ba` expanded to `git init --bare`, that future
contribution would now have to touch seemingly unrelated tests just to
keep the test suite from failing.

So let's disallow abbreviated options in the test suite by default.

Note: for ease of implementation, this patch really only touches the
`parse-options` machinery: more and more hand-rolled option parsers are
converted to use that internal API, and more and more scripts are
converted to built-ins (naturally using the parse-options API, too), so
in practice this catches most issues, and is definitely the biggest bang
for the buck.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

progress: use xmalloc/xcallocJeff King Thu, 11 Apr 2019 13:49:57 +0000 (09:49 -0400)

progress: use xmalloc/xcalloc

Since the early days of Git, the progress code allocates its struct with
a bare malloc(), not xmalloc(). If the allocation fails, we just avoid
showing progress at all.

While perhaps a noble goal not to fail the whole operation because of
optional progress, in practice:

1. Any failure to allocate a few dozen bytes here means critical path
allocations are likely to fail, too.

2. These days we use a strbuf for throughput progress (and there's a
patch under discussion to do the same for non-throughput cases,
too). And that uses xmalloc() under the hood, which means we'd
still die on some allocation failures.

Let's switch to xmalloc(). That makes us consistent with the rest of Git
and makes it easier to audit for other (less careful) bare mallocs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

xdiff: use xmalloc/xreallocJeff King Thu, 11 Apr 2019 13:49:25 +0000 (09:49 -0400)

xdiff: use xmalloc/xrealloc

Most of xdiff uses a bare malloc() to allocate memory, and returns an
error when we get NULL. However, there are a few spots which don't check
the return value and may segfault, including at least xdl_merge() and
xpatience.c's find_longest_common_sequence().

Let's use xmalloc() everywhere instead, so that we get a graceful die()
for these cases, without having to do further auditing. This does mean
the existing cases which check errors will now die() instead of
returning an error up the stack. But:

- that's how the rest of Git behaves already for malloc errors

- all of the callers of xdi_diff(), etc, die upon seeing an error

So while we might one day want to fully lib-ify the diff code and make
it possible to use as part of a long-running process, we're not close to
that now. And because we're just tweaking the xdl_malloc() macro here,
we're not really moving ourselves any further away from that. We
could, for example, simplify some of the functions which handle malloc()
errors which can no longer occur. But that would probably be taking us
in the wrong direction.

This also makes our malloc handling more consistent with the rest of
Git, including enforcing GIT_ALLOC_LIMIT and trying to reclaim pack
memory when needed.

Reported-by: 王健强 <jianqiang.wang@securitygossip.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

xdiff: use git-compat-utilJeff King Thu, 11 Apr 2019 13:48:33 +0000 (09:48 -0400)

xdiff: use git-compat-util

Since the xdiff library was not originally part of Git, it does its own
system includes. Let's instead use git-compat-util, which has two
benefits:

1. It adjusts for any system-specific quirks in how or what we should
include (though xdiff's needs are light enough that this hasn't
been a problem in the past).

2. It lets us use wrapper functions like xmalloc().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

test-prio-queue: use xmallocJeff King Thu, 11 Apr 2019 13:48:14 +0000 (09:48 -0400)

test-prio-queue: use xmalloc

test-prio-queue.c doesn't check the return value of malloc, and could
segfault.

It's unlikely for this to matter in practice; it's a small allocation,
and this code isn't even installed alongside the rest of Git. But let's
use xmalloc(), which makes auditing for other accidental uses of bare
malloc() easier.

Reported-by: 王健强 <jianqiang.wang@securitygossip.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t3000 (ls-files -o): widen description to reflect curre... Kyle Meyer Wed, 10 Apr 2019 15:28:57 +0000 (11:28 -0400)

t3000 (ls-files -o): widen description to reflect current tests

Remove the mention of symlinks from the test description because
several tests that are not related to symlinks have been added since
this file was introduced long ago.

Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

untracked cache: fix off-by-oneJohannes Schindelin Wed, 10 Apr 2019 12:56:48 +0000 (05:56 -0700)

untracked cache: fix off-by-one

In f9e6c649589e (untracked cache: load from UNTR index extension,
2015-03-08), code was added to read back the untracked cache from an
index extension.

Probably in the endeavor to avoid the `calloc()` implied by
`FLEX_ALLOC_STR()` (it is hard to know why exactly, the commit message
of that commit is a bit parsimonious with information), it calls
`malloc()` manually and then `memcpy()`s the bits and pieces into place.

It allocates the size of `struct untracked_cache_dir` plus the string
length of the untracked file name, then copies the information in two
steps: first the fixed-size metadata, then the name. And here lies the
rub: it includes the trailing NUL byte in the name.

If `FLEX_ARRAY` is defined as 0, this results in a buffer overrun.

To fix this, let's just add 1, for the trailing NUL byte. Technically,
this overallocates on platforms where `FLEX_ARRAY` is 1, but it should
not matter much in reality, as `malloc()` usually overallocates anyway,
unless the size to allocate aligns exactly with some internal chunk size
(see below for more on that).

The real strange thing is that neither valgrind nor DrMemory catches
this bug. In this developer's tests, a `memcpy()` (but not a
`memset()`!) could write up to 4 bytes after the allocated memory range
before valgrind would start reporting an issue.

However, when running Git built with nedmalloc as allocator, under rare
conditions (and inconsistently at that), this bug triggered an `abort()`
because nedmalloc rounds up the size to be `malloc()`ed to a multiple of
a certain chunk size, then adds a few bytes to be used for storing some
internal state. If there is no rounding up to do (because the size is
already a multiple of that chunk size), and if the buffer is overrun as
in the code patched in this commit, the internal state is corrupted.

The scenario that triggered this here bug fix entailed a git.git
checkout with an extra copy of the source code in an untracked
subdirectory, meaning that there was an untracked subdirectory called
"thunderbird-patch-inline" whose name's length is exactly 24 bytes,
which, added to the size of above-mentioned `struct untracked_cache_dir`
that weighs in with 104 bytes on a 64-bit system, amounts to 128,
aligning perfectly with nedmalloc's chunk size.

As there is no obvious way to trigger this bug reliably, on all
platforms supported by Git, and as the bug is obvious enough, this patch
comes without a regression test.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

submodule: teach set-branch subcommandDenton Liu Fri, 8 Feb 2019 11:21:34 +0000 (03:21 -0800)

submodule: teach set-branch subcommand

This teaches git-submodule the set-branch subcommand which allows the
branch of a submodule to be set through a porcelain command without
having to manually manipulate the .gitmodules file.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation/git-show-branch: avoid literal {apostrophe}Todd Zullinger Wed, 10 Apr 2019 00:37:33 +0000 (20:37 -0400)

Documentation/git-show-branch: avoid literal {apostrophe}

The {apostrophe} was needed at the time of a521845800 ("Documentation:
remove stray backslash in show-branch discussion", 2010-08-20). All
other uses of {apostrophe} were removed in 6cf378f0cb ("docs: stop using
asciidoc no-inline-literal", 2012-04-26).

Unfortunately, the {apostrophe} is rendered literally with Asciidoctor
(at least with 1.5.5-2.0.3). Avoid this by using single-quotes.

Escaping the leading single-quote allows the content to render properly
in AsciiDoc and Asciidoctor.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The fourth batchJunio C Hamano Tue, 9 Apr 2019 17:19:09 +0000 (02:19 +0900)

The fourth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jt/submodule-fetch-errmsg'Junio C Hamano Tue, 9 Apr 2019 17:14:26 +0000 (02:14 +0900)

Merge branch 'jt/submodule-fetch-errmsg'

Error message update.

* jt/submodule-fetch-errmsg:
submodule: explain first attempt failure clearly

Merge branch 'jk/sha1dc'Junio C Hamano Tue, 9 Apr 2019 17:14:26 +0000 (02:14 +0900)

Merge branch 'jk/sha1dc'

Build update for SHA-1 with collision detection.

* jk/sha1dc:
Makefile: fix unaligned loads in sha1dc with UBSan

Merge branch 'jk/promote-ggg'Junio C Hamano Tue, 9 Apr 2019 17:14:25 +0000 (02:14 +0900)

Merge branch 'jk/promote-ggg'

Suggest GitGitGadget instead of submitGit as a way to submit
patches based on GitHub PR to us.

* jk/promote-ggg:
point pull requesters to GitGitGadget

Merge branch 'ar/t4150-remove-cruft'Junio C Hamano Tue, 9 Apr 2019 17:14:25 +0000 (02:14 +0900)

Merge branch 'ar/t4150-remove-cruft'

Test cleanup.

* ar/t4150-remove-cruft:
t4150: remove unused variable

Merge branch 'js/rebase-deprecate-preserve-merges'Junio C Hamano Tue, 9 Apr 2019 17:14:24 +0000 (02:14 +0900)

Merge branch 'js/rebase-deprecate-preserve-merges'

"git rebase --rebase-merges" replaces its old "--preserve-merges"
option; the latter is now marked as deprecated.

* js/rebase-deprecate-preserve-merges:
rebase: deprecate --preserve-merges

Merge branch 'ms/worktree-add-atomic-mkdir'Junio C Hamano Tue, 9 Apr 2019 17:14:24 +0000 (02:14 +0900)

Merge branch 'ms/worktree-add-atomic-mkdir'

"git worktree add" used to do a "find an available name with stat
and then mkdir", which is race-prone. This has been fixed by using
mkdir and reacting to EEXIST in a loop.

* ms/worktree-add-atomic-mkdir:
worktree: fix worktree add race

Merge branch 'jk/line-log-with-patch'Junio C Hamano Tue, 9 Apr 2019 17:14:23 +0000 (02:14 +0900)

Merge branch 'jk/line-log-with-patch'

"git log -L<from>,<to>:<path>" with "-s" did not suppress the patch
output as it should. This has been corrected.

* jk/line-log-with-patch:
line-log: detect unsupported formats
line-log: suppress diff output with "-s"

Merge branch 'ra/t3600-test-path-funcs'Junio C Hamano Tue, 9 Apr 2019 17:14:23 +0000 (02:14 +0900)

Merge branch 'ra/t3600-test-path-funcs'

A GSoC micro.

* ra/t3600-test-path-funcs:
t3600: use helpers to replace test -d/f/e/s <path>
t3600: modernize style
test functions: add function `test_file_not_empty`

Merge branch 'nd/rewritten-ref-is-per-worktree'Junio C Hamano Tue, 9 Apr 2019 17:14:23 +0000 (02:14 +0900)

Merge branch 'nd/rewritten-ref-is-per-worktree'

"git rebase" uses the refs/rewritten/ hierarchy to store its
intermediate states, which inherently makes the hierarchy per
worktree, but it didn't quite work well.

* nd/rewritten-ref-is-per-worktree:
Make sure refs/rewritten/ is per-worktree
files-backend.c: reduce duplication in add_per_worktree_entries_to_dir()
files-backend.c: factor out per-worktree code in loose_fill_ref_dir()

Merge branch 'jh/resize-convert-scratch-buffer'Junio C Hamano Tue, 9 Apr 2019 17:14:22 +0000 (02:14 +0900)

Merge branch 'jh/resize-convert-scratch-buffer'

When the "clean" filter can reduce the size of a huge file in the
working tree down to a small "token" (a la Git LFS), there is no
point in allocating a huge scratch area upfront, but the buffer is
sized based on the original file size. The convert mechanism now
allocates very minimum and reallocates as it receives the output
from the clean filter process.

* jh/resize-convert-scratch-buffer:
convert: avoid malloc of original file size

Merge branch 'dl/ignore-docs'Junio C Hamano Tue, 9 Apr 2019 17:14:22 +0000 (02:14 +0900)

Merge branch 'dl/ignore-docs'

Doc update.

* dl/ignore-docs:
docs: move core.excludesFile from git-add to gitignore
git-clean.txt: clarify ignore pattern files

Merge branch 'ja/dir-rename-doc-markup-fix'Junio C Hamano Tue, 9 Apr 2019 17:14:21 +0000 (02:14 +0900)

Merge branch 'ja/dir-rename-doc-markup-fix'

Doc update.

* ja/dir-rename-doc-markup-fix:
Doc: fix misleading asciidoc formating

Merge branch 'dl/reset-doc-no-wrt-abbrev'Junio C Hamano Tue, 9 Apr 2019 17:14:20 +0000 (02:14 +0900)

Merge branch 'dl/reset-doc-no-wrt-abbrev'

Doc update.

* dl/reset-doc-no-wrt-abbrev:
git-reset.txt: clarify documentation

t3301: fix false negativeJohannes Schindelin Tue, 9 Apr 2019 10:41:23 +0000 (03:41 -0700)

t3301: fix false negative

In 6956f858f6 (notes: implement helpers needed for note copying during
rewrite, 2010-03-12), we introduced a test case that verifies that the
config setting `notes.rewriteRef` can be overridden via the environment
variable `GIT_NOTES_REWRITE_REF`.

Back when it was introduced, it relied on a side effect of an earlier
test case that configured `core.noteRef` to point to `refs/notes/other`.

In 908a320363 (t3301: modernize style, 2014-11-12), this side effect was
removed.

The test case *still* passed, but for the wrong reason: we no longer
overrode the rewrite ref, but there simply was nothing to rewrite
anymore, as the overridden notes ref was "modernized" away.

Let's let that test case pass for the correct reason again.

To make sure of that, let's change the idea of the original test case:
it configured `notes.rewriteRef` to point to the actual notes ref,
forced that to be ignored and then verified that the notes were *not*
rewritten.

By turning that idea upside down (configure the `notes.rewriteRef` to
another notes ref, override it via the environment variable to force the
notes to be copied, and then verify that the notes *were* rewritten), we
make it much harder for that test case to pass for the wrong reason.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

describe doc: remove '7-char' abbreviation referencePhilip Oakley Sat, 6 Apr 2019 13:27:47 +0000 (14:27 +0100)

describe doc: remove '7-char' abbreviation reference

While the minimum is 7-char, the unambiguous length can be longer.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

rerere doc: quote `rerere.enabled`Philip Oakley Sat, 6 Apr 2019 13:27:46 +0000 (14:27 +0100)

rerere doc: quote `rerere.enabled`

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blame: default to HEAD in a bare repo when no start... SZEDER Gábor Sun, 7 Apr 2019 23:43:27 +0000 (01:43 +0200)

blame: default to HEAD in a bare repo when no start commit is given

When 'git blame' is invoked without specifying the commit to start
blaming from, it starts from the given file's state in the work tree.
However, when invoked in a bare repository without a start commit,
then there is no work tree state to start from, and it dies with the
following error message:

$ git rev-parse --is-bare-repository
true
$ git blame file.c
fatal: this operation must be run in a work tree

This is misleading, because it implies that 'git blame' doesn't work
in bare repositories at all, but it does, in fact, work just fine when
it is given a commit to start from.

We could improve the error message, of course, but let's just default
to HEAD in a bare repository instead, as most likely that is what the
user wanted anyway (if they wanted to start from an other commit, then
they would have specified that in the first place).

'git annotate' is just a thin wrapper around 'git blame', so in the
same situation it printed the same misleading error message, and this
patch fixes it, too.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ls-files: use correct format stringThomas Gummerer Sun, 7 Apr 2019 18:47:51 +0000 (19:47 +0100)

ls-files: use correct format string

struct stat_data and struct cache_time both use unsigned ints for all
their members. However the format string for 'git ls-files --debug'
currently uses %d for formatting these numbers. This means that we
potentially print these values incorrectly if they are greater than
INT_MAX.

This has been the case since the --debug option was introduced in 'git
ls-files' in 8497421715 ("ls-files: learn a debugging dump format",
2010-07-31).

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc docs: remove incorrect reference to gc.auto=0Ævar Arnfjörð Bjarmason Sun, 7 Apr 2019 19:52:17 +0000 (21:52 +0200)

gc docs: remove incorrect reference to gc.auto=0

The chance of a repository being corrupted due to a "gc" has nothing
to do with whether or not that "gc" was invoked via "gc --auto", but
whether there's other concurrent operations happening.

This is already noted earlier in the paragraph, so there's no reason
to suggest this here. The user can infer from the rest of the
documentation that "gc" will run automatically unless gc.auto=0 is
set, and we shouldn't confuse the issue by implying that "gc --auto"
is somehow more prone to produce corruption than a normal "gc".

Well, it is in the sense that a blocking "gc" would stop you from
doing anything else in *that* particular terminal window, but users
are likely to have another window, or to be worried about how
concurrent "gc" on a server might cause corruption.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc docs: clarify that "gc" doesn't throw away reference... Ævar Arnfjörð Bjarmason Sun, 7 Apr 2019 19:52:16 +0000 (21:52 +0200)

gc docs: clarify that "gc" doesn't throw away referenced objects

Amend the "NOTES" section to fix up wording that's been with us since
3ffb58be0a ("doc/git-gc: add a note about what is collected",
2008-04-23).

I can't remember when/where anymore (I think Freenode #Git), but at
some point I was having a conversation with someone who was convinced
that "gc" would prune things only referenced by e.g. refs/pull/*, and
pointed to this section as proof.

It turned out that they'd read the "branches and tags" wording here
and thought just refs/{heads,tags}/* and refs/remotes/* etc. would be
kept, which is what we enumerate explicitly.

So let's say "other refs", even though just above we say "objects that
are referenced anywhere in your repository".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc docs: note "gc --aggressive" in "fast-import"Ævar Arnfjörð Bjarmason Sun, 7 Apr 2019 19:52:15 +0000 (21:52 +0200)

gc docs: note "gc --aggressive" in "fast-import"

Amend the "PACKFILE OPTIMIZATION" section in "fast-import" to explain
that simply running "git gc --aggressive" after a "fast-import" should
properly optimize the repository. This is simpler and more effective
than the existing "repack" advice (which I'm keeping as it helps
explain things) because it e.g. also packs the newly imported refs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc docs: downplay the usefulness of --aggressiveÆvar Arnfjörð Bjarmason Sun, 7 Apr 2019 19:52:14 +0000 (21:52 +0200)

gc docs: downplay the usefulness of --aggressive

The existing "gc --aggressive" docs come just short of recommending to
users that they run it regularly. I've personally talked to many users
who've taken these docs as an advice to use this option, and have,
usually it's (mostly) a waste of time.

So let's clarify what it really does, and let the user draw their own
conclusions.

Let's also clarify the "The effects [...] are persistent" to
paraphrase a brief version of Jeff King's explanation at [1].

1. https://public-inbox.org/git/20190318235356.GK29661@sigill.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc docs: note how --aggressive impacts --window & ... Ævar Arnfjörð Bjarmason Sun, 7 Apr 2019 19:52:13 +0000 (21:52 +0200)

gc docs: note how --aggressive impacts --window & --depth

Since 07e7dbf0db (gc: default aggressive depth to 50, 2016-08-11) we
somewhat confusingly use the same depth under --aggressive as we do by
default.

As noted in that commit that makes sense, it was wrong to make more
depth the default for "aggressive", and thus save disk space at the
expense of runtime performance, which is usually the opposite of
someone who'd like "aggressive gc" wants.

But that's left us with a mostly-redundant configuration variable, so
let's clearly note in its documentation that it doesn't change the
default.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc docs: fix formatting for "gc.writeCommitGraph"Ævar Arnfjörð Bjarmason Sun, 7 Apr 2019 19:52:12 +0000 (21:52 +0200)

gc docs: fix formatting for "gc.writeCommitGraph"

Change the AsciiDoc formatting so that an example of "gc --auto" isn't
rendered as "git-gc(1) --auto", but as "git gc --auto". This is
consistent with the rest of the links and command examples in this
documentation.

The formatting I'm changing was initially introduced in
d5d5d7b641 ("gc: automatically write commit-graph files", 2018-06-27).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc docs: re-flow the "gc.*" section in "config"Ævar Arnfjörð Bjarmason Sun, 7 Apr 2019 19:52:11 +0000 (21:52 +0200)

gc docs: re-flow the "gc.*" section in "config"

Re-flow the "gc.*" section in "config". A previous commit moved this
over from the "gc" docs, but tried to keep as many of the lines
identical to benefit from diff's move detection.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc docs: include the "gc.*" section from "config" in... Ævar Arnfjörð Bjarmason Sun, 7 Apr 2019 19:52:10 +0000 (21:52 +0200)

gc docs: include the "gc.*" section from "config" in "gc"

Rather than duplicating the documentation for the various "gc" options
let's include the "gc" docs from git-config. They were mostly better
already, and now we don't have the same docs in two places with subtly
different wording.

In the cases where the git-gc(1) docs were saying something the "gc"
docs in git-config(1) didn't cover move the relevant section over to
the git-config(1) docs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: batch fetching of missing blobsJonathan Tan Fri, 5 Apr 2019 17:09:34 +0000 (10:09 -0700)

diff: batch fetching of missing blobs

When running a command like "git show" or "git diff" in a partial clone,
batch all missing blobs to be fetched as one request.

This is similar to c0c578b33c ("unpack-trees: batch fetching of missing
blobs", 2017-12-08), but for another command.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

progress: assemble percentage and counters in a strbuf... SZEDER Gábor Fri, 5 Apr 2019 00:45:37 +0000 (02:45 +0200)

progress: assemble percentage and counters in a strbuf before printing

The following patches in this series want to handle the progress bar's
title and changing parts (i.e. the counter and the optional percentage
and throughput combined) differently, and need to know the length
of the changing parts of the previously displayed progress bar.

To prepare for those changes assemble the changing parts in a separate
strbuf kept in 'struct progress' before printing.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

progress: make display_progress() return voidSZEDER Gábor Fri, 5 Apr 2019 00:45:36 +0000 (02:45 +0200)

progress: make display_progress() return void

Ever since the progress infrastructure was introduced in 96a02f8f6d
(common progress display support, 2007-04-18), display_progress() has
returned an int, telling callers whether it updated the progress bar
or not. However, this is:

- useless, because over the last dozen years there has never been a
single caller that cared about that return value.

- not quite true, because it doesn't print a progress bar when
running in the background, yet it returns 1; see 85cb8906f0
(progress: no progress in background, 2015-04-13).

The related display_throughput() function returned void already upon
its introduction in cf84d51c43 (add throughput to progress display,
2007-10-30).

Let's make display_progress() return void, too. While doing so
several return statements in display() become unnecessary, remove
them.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ci: fix AsciiDoc/Asciidoctor stderr check in the docume... SZEDER Gábor Fri, 29 Mar 2019 12:35:20 +0000 (13:35 +0100)

ci: fix AsciiDoc/Asciidoctor stderr check in the documentation build job

In 'ci/test-documentation.sh' we save the standard error of 'make
doc', and, in an attempt to make sure that neither AsciiDoc nor
Asciidoctor printed any warnings, we check the emptiness of the
resulting file with '! test -s stderr.log'. This check has never
actually worked, because in our 'ci/*' build scripts we rely on 'set
-e' aborting the build job when a command exits with error, and,
unfortunately, the combination of the two doesn't work as intended.
According to POSIX [1]:

"The -e setting shall be ignored when executing [...] a pipeline
beginning with the ! reserved word" [2]

Watch and learn:

$ echo unexpected >file
$ ( set -e; ! test -s file ; echo "should not reach this" ) ; echo $?
should not reach this
0

This is why we haven't noticed the warnings from Asciidoctor that were
fixed in the first patches of this patch series, though some of them
were already there in the build of v2.18.0-rc0 [3].

Check the emptiness of that file with 'test ! -s' instead, which works
properly with 'set -e':

$ ( set -e; test ! -s file ; echo "should not reach this" ) ; echo $?
1

Furthermore, dump the contents of that file to the log for our
convenience, so if it were to unexpectedly end up being non-empty,
then we wouldn't have to scroll through all that long build log
looking for warnings, but could see them right away near the end of
the log.

Note that we are only really interested in the standard error of
AsciiDoc and Asciidoctor, but by saving the stderr of 'make doc' we
also save any error output from the make rules. Currently there is
only one such line: we build the docs with Asciidoctor right after a
'make clean', meaning that 'make USE_ASCIIDOCTOR=1 doc' always starts
with running 'GIT-VERSION-GEN', which in turn prints the version to
stderr. A 'sed' command was supposed to remove this version line to
prevent it from triggering that (previously defunct) emptiness check,
but, unfortunately, this command doesn't work as intended, either,
because it leaves the file to be checked intact, but that defunct
emptiness check hid this issue, too... Furthermore, in the near
future there will be an other line on stderr, because commit
9a71722b4d (Doc: auto-detect changed build flags, 2019-03-17) in the
currently cooking branch 'ma/doc-diff-doc-vs-doctor-comparison' will
print "* new asciidoc flags" at the beginning of both 'make doc'
invokations.

Extend that 'sed' command to remove this line, too, wrap it in a
helper function so the output of both 'make doc' is filtered the same
way, and change its invokation to actually write the logfile to be
checked.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#set

[2] POSIX doesn't discuss the meaning of '! cmd' in case of simple
commands, but it defines that "A pipeline is a sequence of one or
more commands separated by the control operator '|'", so
apparently a simple command is considered as pipeline as well.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_02

[3] https://travis-ci.org/git/git/jobs/385932007#L1463

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ci: stick with Asciidoctor v1.5.8 for nowSZEDER Gábor Fri, 29 Mar 2019 19:52:46 +0000 (20:52 +0100)

ci: stick with Asciidoctor v1.5.8 for now

The recent release of Asciidoctor v2.0.0 broke our documentation
build job on Travis CI, where we 'gem install asciidoctor', which
always brings us the latest and (supposedly) greatest. Alas, we are
not ready for that just yet, because it removed support for DocBook
4.5, and we have been requiring that particular DocBook version to
build 'user-manual.xml' with Asciidoctor, resulting in:

ASCIIDOC user-manual.xml
asciidoctor: FAILED: missing converter for backend 'docbook45'. Processing aborted.
Use --trace for backtrace
make[1]: *** [user-manual.xml] Error 1

Unfortunately, we can't simply switch to DocBook 5 right away, as
doing so leads to validation errors from 'xmlto', and working around
those leads to yet another errors... [1]

So let's stick with Asciidoctor v1.5.8 (latest stable release before
v2.0.0) in our documentation build job on Travis CI for now, until we
figure out how to deal with the fallout from Asciidoctor v2.0.0.

[1] https://public-inbox.org/git/20190324162131.GL4047@pobox.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cocci: FLEX_ALLOC_MEM to FLEX_ALLOC_STRDenton Liu Wed, 3 Apr 2019 22:00:06 +0000 (15:00 -0700)

cocci: FLEX_ALLOC_MEM to FLEX_ALLOC_STR

Ensure that a FLEX_MALLOC_MEM that uses 'strlen' for its 'len' uses
FLEX_ALLOC_STR instead, since these are equivalent forms.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

midx.c: convert FLEX_ALLOC_MEM to FLEX_ALLOC_STRDenton Liu Wed, 3 Apr 2019 22:00:05 +0000 (15:00 -0700)

midx.c: convert FLEX_ALLOC_MEM to FLEX_ALLOC_STR

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

revision: use a prio_queue to hold rewritten parentsJeff King Thu, 4 Apr 2019 01:41:09 +0000 (21:41 -0400)

revision: use a prio_queue to hold rewritten parents

This patch fixes a quadratic list insertion in rewrite_one() when
pathspec limiting is combined with --parents. What happens is something
like this:

1. We see that some commit X touches the path, so we try to rewrite
its parents.

2. rewrite_one() loops forever, rewriting parents, until it finds a
relevant parent (or hits the root and decides there are none). The
heavy lifting is done by process_parent(), which uses
try_to_simplify_commit() to drop parents.

3. process_parent() puts any intermediate parents into the
&revs->commits list, inserting by commit date as usual.

So if commit X is recent, and then there's a large chunk of history that
doesn't touch the path, we may add a lot of commits to &revs->commits.
And insertion by commit date is O(n) in the worst case, making the whole
thing quadratic.

We tried to deal with this long ago in fce87ae538 (Fix quadratic
performance in rewrite_one., 2008-07-12). In that scheme, we cache the
oldest commit in the list; if the new commit to be added is older, we
can start our linear traversal there. This often works well in practice
because parents are older than their descendants, and thus we tend to
add older and older commits as we traverse.

But this isn't guaranteed, and in fact there's a simple case where it is
not: merges. Imagine we look at the first parent of a merge and see a
very old commit (let's say 3 years old). And on the second parent, as we
go back 3 years in history, we might have many commits. That one
first-parent commit has polluted our oldest-commit cache; it will remain
the oldest while we traverse a huge chunk of history, during which we
have to fall back to the slow, linear method of adding to the list.

Naively, one might imagine that instead of caching the oldest commit,
we'd start at the last-added one. But that just makes some cases faster
while making others slower (and indeed, while it made a real-world test
case much faster, it does quite poorly in the perf test include here).
Fundamentally, these are just heuristics; our worst case is still
quadratic, and some cases will approach that.

Instead, let's use a data structure with better worst-case performance.
Swapping out revs->commits for something else would have repercussions
all over the code base, but we can take advantage of one fact: for the
rewrite_one() case, nobody actually needs to see those commits in
revs->commits until we've finished generating the whole list.

That leaves us with two obvious options:

1. We can generate the list _unordered_, which should be O(n), and
then sort it afterwards, which would be O(n log n) total. This is
"sort-after" below.

2. We can insert the commits into a separate data structure, like a
priority queue. This is "prio-queue" below.

I expected that sort-after would be the fastest (since it saves us the
extra step of copying the items into the linked list), but surprisingly
the prio-queue seems to be a bit faster.

Here are timings for the new p0001.6 for all three techniques across a
few repositories, as compared to master:

master cache-last sort-after prio-queue
--------------------------------------------------------------------------------------------
GIT_PERF_REPO=git.git
0.52(0.50+0.02) 0.53(0.51+0.02) +1.9% 0.37(0.33+0.03) -28.8% 0.37(0.32+0.04) -28.8%

GIT_PERF_REPO=linux.git
20.81(20.74+0.07) 20.31(20.24+0.07) -2.4% 0.94(0.86+0.07) -95.5% 0.91(0.82+0.09) -95.6%

GIT_PERF_REPO=llvm-project.git
83.67(83.57+0.09) 4.23(4.15+0.08) -94.9% 3.21(3.15+0.06) -96.2% 2.98(2.91+0.07) -96.4%

A few items to note:

- the cache-list tweak does improve the bad case for llvm-project.git
that started my digging into this problem. But it performs terribly
on linux.git, barely helping at all.

- the sort-after and prio-queue techniques work well. They approach
the timing for running without --parents at all, which is what you'd
expect (see below for more data).

- prio-queue just barely outperforms sort-after. As I said, I'm not
really sure why this is the case, but it is. You can see it even
more prominently in this real-world case on llvm-project.git:

git rev-list --parents 07ef786652e7 -- llvm/test/CodeGen/Generic/bswap.ll

where prio-queue routinely outperforms sort-after by about 7%. One
guess is that the prio-queue may just be more efficient because it
uses a compact array.

There are three new perf tests:

- "rev-list --parents" gives us a baseline for running with --parents.
This isn't sped up meaningfully here, because the bad case is
triggered only with simplification. But it's good to make sure we
don't screw it up (now, or in the future).

- "rev-list -- dummy" gives us a baseline for just traversing with
pathspec limiting. This gives a lower bound for the next test (and
it's also a good thing for us to be checking in general for
regressions, since we don't seem to have any existing tests).

- "rev-list --parents -- dummy" shows off the problem (and our fix)

Here are the timings for those three on llvm-project.git, before and
after the fix:

Test master prio-queue
------------------------------------------------------------------------------
0001.3: rev-list --parents 2.24(2.12+0.12) 2.22(2.11+0.11) -0.9%
0001.5: rev-list -- dummy 2.89(2.82+0.07) 2.92(2.89+0.03) +1.0%
0001.6: rev-list --parents -- dummy 83.67(83.57+0.09) 2.98(2.91+0.07) -96.4%

Changes in the first two are basically noise, and you can see we
approach our lower bound in the final one.

Note that we can't fully get rid of the list argument from
process_parents(). Other callers do have lists, and it would be hard to
convert them. They also don't seem to have this problem (probably
because they actually remove items from the list as they loop, meaning
it doesn't grow so large in the first place). So this basically just
drops the "cache_ptr" parameter (which was used only by the one caller
we're fixing here) and replaces it with a prio_queue. Callers are free
to use either data structure, depending on what they're prepared to
handle.

Reported-by: Björn Pettersson A <bjorn.a.pettersson@ericsson.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/completion: add smerge to the mergetool complet... David Aguilar Thu, 4 Apr 2019 07:34:39 +0000 (00:34 -0700)

contrib/completion: add smerge to the mergetool completion candidates

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mergetools: add support for smerge (Sublime Merge)David Aguilar Thu, 4 Apr 2019 07:34:38 +0000 (00:34 -0700)

mergetools: add support for smerge (Sublime Merge)

Teach difftool and mergetool about the Sublime Merge "smerge" command.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blame.c: don't drop origin blobs as eagerlyDavid Kastrup Tue, 2 Apr 2019 11:56:25 +0000 (13:56 +0200)

blame.c: don't drop origin blobs as eagerly

When a parent blob already has chunks queued up for blaming, dropping
the blob at the end of one blame step will cause it to get reloaded
right away, doubling the amount of I/O and unpacking when processing a
linear history.

Keeping such parent blobs in memory seems like a reasonable optimization
that should incur additional memory pressure mostly when processing the
merges from old branches.

Signed-off-by: David Kastrup <dak@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

read-tree.txt: clarify --reset and worktree changesNguyễn Thái Ngọc Duy Mon, 1 Apr 2019 12:05:05 +0000 (19:05 +0700)

read-tree.txt: clarify --reset and worktree changes

The description of --reset stays true to the first implementation in
438195cced (git-read-tree: add "--reset" flag, 2005-06-09). That is,
--reset discards unmerged entries. Or at least true to the commit
message because I can't be sure about read-tree's behavior regarding
local changes.

But in fcc387db9b (read-tree -m -u: do not overwrite or remove untracked
working tree files., 2006-05-17), it is clear that "-m -u" tries to keep
local changes, while --reset is singled out and will keep overwriting
worktree files. It's not stated in the commit message, but it's obvious
from the patch.

I went this far back not because I had a lot of free time, but because I
did not trust my reading of unpack-trees.c code. So far I think the
related changes in history agree with my understanding of the current
code, that "--reset" loses local changes.

This behavior is not mentioned in git-read-tree.txt, even though
old-timers probably can just guess it based on the "reset" name. Update
git-read-tree.txt about this.

Side note. There's another change regarding --reset that is not
obviously about local changes, b018ff6085 (unpack-trees: fix "read-tree
-u --reset A B" with conflicted index, 2012-12-29). But I'm pretty sure
this is about the first function of --reset, to discard unmerged entries
correctly.

PS. The patch changes one more line than necessary because the first
line uses spaces instead of tab.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tests (pack-objects): use the full, unabbreviated ... Johannes Schindelin Mon, 25 Mar 2019 18:14:23 +0000 (11:14 -0700)

tests (pack-objects): use the full, unabbreviated `--revs` option

To use the singular form of a word, when the option wants the plural
form (and quietly expands it because it thinks it was abbreviated), is
an easy mistake to make, and t5317 contains almost two dozen of them.

However, using abbreviated options in tests is a bit fragile, so we will
disallow use of abbreviated options in our test suite.

In preparation for this change, let's fix
`t5317-pack-objects-filter-objects.sh`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tests (status): spell out the `--find-renames` option... Johannes Schindelin Mon, 25 Mar 2019 18:14:22 +0000 (11:14 -0700)

tests (status): spell out the `--find-renames` option in full

To avoid future ambiguities, we really want to use full option names in
the test suite. `t7525-status-rename.sh` used an abbreviated form of the
`--find-renames` option, though, so let's change that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tests (push): do not abbreviate the `--follow-tags... Johannes Schindelin Mon, 25 Mar 2019 18:14:21 +0000 (11:14 -0700)

tests (push): do not abbreviate the `--follow-tags` option

We really want to spell out the option in the full form, to avoid any
ambiguity that might be introduced by future patches.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5531: avoid using an abbreviated optionJohannes Schindelin Mon, 25 Mar 2019 18:14:20 +0000 (11:14 -0700)

t5531: avoid using an abbreviated option

It was probably just an oversight: the `--recurse-submodules` option
puts the term "submodules" in the plural form, not the singular one.

To avoid future problems in case that another option is introduced that
starts with the prefix `--recurse-submodule`, let's just fix this.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t7810: do not abbreviate `--no-exclude-standard` nor... Johannes Schindelin Mon, 25 Mar 2019 18:14:19 +0000 (11:14 -0700)

t7810: do not abbreviate `--no-exclude-standard` nor `--invert-match`

This script used abbreviated options, which is unnecessarily fragile.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tests (rebase): spell out the `--force-rebase` optionJohannes Schindelin Mon, 25 Mar 2019 18:14:19 +0000 (11:14 -0700)

tests (rebase): spell out the `--force-rebase` option

In quite a few test cases, we were sloppy and used the abbreviation
`--force`, but we really should be precise in what we want to test.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tests (rebase): spell out the `--keep-empty` optionJohannes Schindelin Mon, 25 Mar 2019 18:14:18 +0000 (11:14 -0700)

tests (rebase): spell out the `--keep-empty` option

This test wants to run `git rebase` with the `--keep-empty` option, but
it really only spelled out `--keep` and trusted Git's option parsing to
determine that this was a unique abbreviation of the real option.

However, Denton Liu contributed a patch series in
https://public-inbox.org/git/cover.1553354374.git.liu.denton@gmail.com/
that introduces a new `git rebase` option called `--keep-base`, which
makes this previously unique abbreviation non-unique.

Whether this patch series is accepted or not, it is actually a bad
practice to use abbreviated options in our test suite, because of the
issue that those unique option names are not guaranteed to stay unique
in the future.

So let's just not use abbreviated options in the test suite.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

index-pack: show progress while checking objectsSZEDER Gábor Sun, 31 Mar 2019 23:12:35 +0000 (01:12 +0200)

index-pack: show progress while checking objects

When 'git index-pack' is run by 'git clone', its check_objects()
function usually doesn't take long enough to be a concern, but I just
run into a situation where it took about a minute or so: I
inadvertently put some memory pressure on my tiny laptop while cloning
linux.git, and then there was quite a long silence between the
"Resolving deltas" and "Checking connectivity" progress bars.

Show a progress bar during the loop of check_objects() to let the user
know that something is still going on.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation/git-status: fix titles in porcelain v2... Todd Zullinger Sat, 30 Mar 2019 18:30:01 +0000 (14:30 -0400)

Documentation/git-status: fix titles in porcelain v2 section

Asciidoc uses either one-line or two-line syntax for document/section
titles[1]. The two-line form is used in git-status. Fix a few section
titles in the porcelain v2 section which were inadvertently using
markdown syntax.

[1] http://asciidoc.org/userguide.html#X17

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation/rev-list-options: wrap --date=<format... Todd Zullinger Sat, 30 Mar 2019 18:30:00 +0000 (14:30 -0400)

Documentation/rev-list-options: wrap --date=<format> block with "--"

Using "+" to continue multiple list items is more tedious and
error-prone than wrapping the entire block with "--" block markers.

When using asciidoctor, the list items after the --date=iso list items
are incorrectly formatted when using "+" continuation. Use "--" block
markers to correctly format the block.

When using asciidoc there is no change in how the content is rendered.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

test-lib: whitelist GIT_TR2_* in the environmentÆvar Arnfjörð Bjarmason Sat, 30 Mar 2019 07:51:19 +0000 (08:51 +0100)

test-lib: whitelist GIT_TR2_* in the environment

Add GIT_TR2_* to the whitelist of environment variables that we don't
clear when running the test suite.

This allows us to use the test suite to produce trace2 test data,
which is handy to e.g. write consumers that collate the trace data
itself.

One caveat here is that we produce trace output for not *just* the
tests, but also e.g. from this line in test-lib.sh:

# It appears that people try to run tests without building...
"${GIT_TEST_INSTALLED:-$GIT_BUILD_DIR}/git$X" >/dev/null
[...]

I consider this not just OK but a feature. Let's log *all* the git
commands we're going to execute, not just those within
test_expect_*().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: binary search when storing wanted-refsJonathan Tan Wed, 27 Mar 2019 21:11:10 +0000 (14:11 -0700)

fetch-pack: binary search when storing wanted-refs

In do_fetch_pack_v2(), the "sought" array is sorted by name, and it is
not subsequently reordered (within the function). Therefore,
receive_wanted_refs() can assume that "sought" is sorted, and can thus
use a binary search when storing wanted-refs retrieved from the server.

Replace the existing linear search with a binary search. This improves
performance significantly when mirror cloning a repository with more
than 1 million refs.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

interpret-trailers.txt: start the desc line with a... Nguyễn Thái Ngọc Duy Wed, 27 Mar 2019 09:16:28 +0000 (16:16 +0700)

interpret-trailers.txt: start the desc line with a capital letter

This description line is shown in 'git help -a' and all other commands
description starts with an uppercase character. This just makes that
printout a bit nicer.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sha1-file: support OBJECT_INFO_FOR_PREFETCHJonathan Tan Fri, 29 Mar 2019 21:39:27 +0000 (14:39 -0700)

sha1-file: support OBJECT_INFO_FOR_PREFETCH

Teach oid_object_info_extended() to support a new flag that inhibits
fetching of missing objects. This is equivalent to setting
fetch_is_missing to 0, calling oid_object_info_extended(), then setting
fetch_if_missing to whatever it was before. Update unpack-trees.c to use
this new flag instead of repeatedly setting fetch_if_missing.

This new flag complicates things slightly in that there are now 2 ways
to do the same thing. But this eliminates the need to repeatedly set a
global variable, and more importantly, allows prefetching to be done in
parallel (in the future); hence, this patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: respect --no-update-shallow in v2Jonathan Tan Tue, 26 Mar 2019 19:31:21 +0000 (12:31 -0700)

fetch-pack: respect --no-update-shallow in v2

In protocol v0, when sending "shallow" lines, the server distinguishes
between lines caused by the remote repo being shallow and lines caused
by client-specified depth settings. Unless "--update-shallow" is
specified, there is a difference in behavior: refs that reach the former
"shallow" lines, but not the latter, are rejected. But in v2, the server
does not, and the client treats all "shallow" lines like lines caused by
client-specified depth settings.

Full restoration of v0 functionality is not possible without protocol
change, but we can implement a heuristic: if we specify any depth
setting, treat all "shallow" lines like lines caused by client-specified
depth settings (that is, unaffected by "--no-update-shallow"), but
otherwise, treat them like lines caused by the remote repo being shallow
(that is, affected by "--no-update-shallow"). This restores most of v0
behavior, except in the case where a client fetches from a shallow
repository with depth settings.

This patch causes a test that previously failed with
GIT_TEST_PROTOCOL_VERSION=2 to pass.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: call prepare_shallow_info only if v0Jonathan Tan Tue, 26 Mar 2019 19:31:20 +0000 (12:31 -0700)

fetch-pack: call prepare_shallow_info only if v0

In fetch_pack(), be clearer that there is no shallow information before
the fetch when v2 is used - memset the struct shallow_info to 0 instead
of calling prepare_shallow_info().

This patch is in preparation for a future patch in which a v2 fetch
might call prepare_shallow_info() after shallow info has been retrieved
during the fetch, so I needed to ensure that prepare_shallow_info() is
not called before the fetch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jt/test-protocol-version' into jt/fetch... Junio C Hamano Mon, 1 Apr 2019 06:35:01 +0000 (15:35 +0900)

Merge branch 'jt/test-protocol-version' into jt/fetch-no-update-shallow-in-proto-v2

* jt/test-protocol-version:
t5552: compensate for v2 filtering ref adv.
tests: fix protocol version for overspecifications
t5700: only run with protocol version 1
t5512: compensate for v0 only sending HEAD symrefs
t5503: fix overspecification of trace expectation
tests: always test fetch of unreachable with v0
t5601: check ssh command only with protocol v0
tests: define GIT_TEST_PROTOCOL_VERSION

ci: install Asciidoctor in 'ci/install-dependencies.sh'SZEDER Gábor Fri, 29 Mar 2019 12:35:18 +0000 (13:35 +0100)

ci: install Asciidoctor in 'ci/install-dependencies.sh'

When our '.travis.yml' was split into several 'ci/*' scripts [1], the
installation of the 'asciidoctor' gem somehow ended up in
'ci/test-documentation.sh'.

Install it in 'ci/install-dependencies.sh', where we install other
dependencies of the Documentation build job as well (asciidoc,
xmlto).

[1] 657343a602 (travis-ci: move Travis CI code into dedicated scripts,
2017-09-10)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation/technical/protocol-v2.txt: fix formattingSZEDER Gábor Fri, 29 Mar 2019 12:35:17 +0000 (13:35 +0100)

Documentation/technical/protocol-v2.txt: fix formatting

Asciidoctor versions v1.5.7 or later print the following warning while
building the documentation:

ASCIIDOC technical/protocol-v2.html
asciidoctor: WARNING: protocol-v2.txt: line 38: unterminated listing block

This highlights an issue (even with older Asciidoctor versions) where
the 'Initial Client Request' header is not rendered as a header but in
monospace. I'm not sure what exactly causes this issue and why it's
an issue only with this particular header, but all headers in
'protocol-v2.txt' are written like this:

Initial Client Request
------------------------

i.e. the header itself is indented by a space, and the "underline" is
two characters longer than the header.

Dropping that indentation and making the length of the underline match
the length of the header apparently fixes this issue.

While at it, adjust all other headers 'protocol-v2.txt' as well, to
match the style we use everywhere else.

The page rendered with AsciiDoc doesn't have this formatting issue.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation/technical/api-config.txt: fix formattingSZEDER Gábor Fri, 29 Mar 2019 12:35:16 +0000 (13:35 +0100)

Documentation/technical/api-config.txt: fix formatting

Asciidoctor versions v1.5.7 or later print the following warning while
building the documentation:

ASCIIDOC technical/api-config.html
asciidoctor: WARNING: api-config.txt: line 232: unterminated listing block

This highlight an issue (even with older Asciidoctor versions) where
the length of the '----' lines surrounding a code example don't match,
and the rest of the document is rendered in monospace.

Fix this by making sure that the length of those lines match.

The page rendered with AsciiDoc doesn't have this formatting issue.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation/git-diff-tree.txt: fix formattingSZEDER Gábor Fri, 29 Mar 2019 12:35:15 +0000 (13:35 +0100)

Documentation/git-diff-tree.txt: fix formatting

Asciidoctor versions v1.5.7 or later print the following warning while
building the documentation:

ASCIIDOC git-diff-tree.xml
asciidoctor: WARNING: diff-format.txt: line 2: unterminated listing block

This highlights an issue (even with older Asciidoctor versions) where
the "Raw output format" header is not rendered as a header, and the
rest of the document is rendered in monospace. This is not caused by
'diff-format.txt' in itself, but rather by 'git-diff-tree.txt'
including 'pretty-formats.txt' and 'diff-format.txt' on subsequent
lines, while the former happens to end with monospace-formatted
example commands.

Fix this by inserting an empty line between the two include::
directives.

The page rendered with AsciiDoc doesn't have this formatting issue.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: correct '**' matching in includeIf patternsNguyễn Thái Ngọc Duy Tue, 26 Mar 2019 09:41:01 +0000 (16:41 +0700)

config: correct '**' matching in includeIf patterns

The current wildmatch() call for includeIf's gitdir pattern does not
pass the WM_PATHNAME flag. Without this flag, '*' is treated _almost_
the same as '**' (because '*' also matches slashes) with one exception:

'/**/' can match a single slash. The pattern 'foo/**/bar' matches
'foo/bar'.

But '/*/', which is essentially what wildmatch engine sees without
WM_PATHNAME, has to match two slashes (and '*' matches nothing). Which
means 'foo/*/bar' cannot match 'foo/bar'. It can only match 'foo//bar'.

The result of this is the current wildmatch() call works most of the
time until the user depends on '/**/' matching no path component. And
also '*' matches slashes while it should not, but people probably
haven't noticed this yet. The fix is straightforward.

Reported-by: Jason Karns <jason.karns@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

check-docs: fix for setups where executables have an... Johannes Schindelin Mon, 25 Mar 2019 21:41:39 +0000 (14:41 -0700)

check-docs: fix for setups where executables have an extension

On Windows, for example, executables (must) have the extension `.exe`.
Our `check-docs` target was not prepared for that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

check-docs: do not expect guide pages to correspond... Johannes Schindelin Mon, 25 Mar 2019 21:41:38 +0000 (14:41 -0700)

check-docs: do not expect guide pages to correspond to commands

When we want to see what commands are listed in `command-list.txt` but
not installed, we currently include lines that refer to guides, e.g.
`gitattributes` or `gitcli`.

Let's not include those lines, as they are not referring to commands.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

check-docs: really look at the documented commands... Johannes Schindelin Mon, 25 Mar 2019 21:41:37 +0000 (14:41 -0700)

check-docs: really look at the documented commands again

As part of the `check-docs` target, we verify that commands that are
documented are actually in the current list of commands to be built.

However, this logic broke in 5fafce0b78 (check-docs: get documented
command list from Makefile, 2012-08-08), when we tried to make the logic
safer by not looking at the files in the worktree, but at the list of
files to be generated in `Documentation/Makefile`. While this was the
right thing to do, it failed to accommodate for the fact that `make -C
Documentation/ print-man1`, unlike `ls Documentation/*.txt`, would *not*
print lines starting with the prefix `Documentation/`.

At long last, let's fix this.

Note: This went undetected due to a funny side effect of the
`ALL_PROGRAMS` variable starting with a space. That space, together with
the extra space we inserted before `$(ALL_PROGRAMS)` in the

case " $(ALL_PROGRAMS)" in
*" $$cmd ")
[...]

construct, is responsible that this case arm is used when `cmd` is empty
(which was clearly not intended to be the case).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

docs: do not document the `git remote-testgit` commandJohannes Schindelin Mon, 25 Mar 2019 21:41:37 +0000 (14:41 -0700)

docs: do not document the `git remote-testgit` command

Since 7ded055401 (build: do not install git-remote-testgit, 2013-06-07),
we do not install it. Therefore it makes no sense to document it,
either.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

docs: move gitremote-helpers into section 7Johannes Schindelin Mon, 25 Mar 2019 21:41:36 +0000 (14:41 -0700)

docs: move gitremote-helpers into section 7

It is currently in section 1, but that section is intended for
"Executable programs or shell commands".

A more appropriate place is section 7: "Miscellaneous (including macro
packages and conventions), e.g. man(7), groff(7)".

This issue should have been detected earlier by `make check-docs`, but
was missed due to a bug in that Makefile target (that we are about to
fix).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc: handle & check gc.reflogExpire configÆvar Arnfjörð Bjarmason Thu, 28 Mar 2019 16:14:34 +0000 (17:14 +0100)

gc: handle & check gc.reflogExpire config

Don't redundantly run "git reflog expire --all" when gc.reflogExpire
and gc.reflogExpireUnreachable are set to "never", and die immediately
if those configuration valuer are bad.

As an earlier "assert lack of early exit" change to the tests for "git
reflog expire" shows, an early check of gc.reflogExpire{Unreachable,}
isn't wanted in general for "git reflog expire", but it makes sense
for "gc" because:

1) Similarly to 8ab5aa4bd8 ("parseopt: handle malformed --expire
arguments more nicely", 2018-04-21) we'll now die early if the
config variables are set to invalid values.

We run "pack-refs" before "reflog expire", which can take a while,
only to then die on an invalid gc.reflogExpire{Unreachable,}
configuration.

2) Not invoking the command at all means it won't show up in trace
output, which makes what's going on more obvious when the two are
set to "never".

3) As a later change documents we lock the refs when looping over the
refs to expire, even in cases where we end up doing nothing due to
this config.

For the reasons noted in the earlier "assert lack of early exit"
change I don't think it's worth it to bend over backwards in "git
reflog expire" itself to carefully detect if we'll really do
nothing given the combination of all its possible options and skip
that locking, but that's easy to detect here in "gc" where we'll
only run "reflog expire" in a relatively simple mode.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reflog tests: assert lack of early exit with expiry... Ævar Arnfjörð Bjarmason Thu, 28 Mar 2019 16:14:33 +0000 (17:14 +0100)

reflog tests: assert lack of early exit with expiry="never"

When gc.reflogExpire and gc.reflogExpireUnreachable are set to "never"
and --stale-fix isn't in effect we *could* exit early without
pointlessly looping over all the reflogs.

However, as an earlier change to add a test for the "points nowhere"
warning shows even in such a mode we might want to print out a
warning.

So while it's conceivable to implement this, I don't think it's worth
it. It's going to be too easy to inadvertently add some flag that'll
make the expiry happen anyway, and even with "never" we'd like to see
all the lines we're going to keep.

So let's assert that we're going to loop over all the references even
when this configuration is in effect.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: improve & i18n error messagesÆvar Arnfjörð Bjarmason Mon, 25 Mar 2019 12:08:34 +0000 (13:08 +0100)

commit-graph: improve & i18n error messages

Change the error emitted when a commit-graph file is corrupt so that
we actually mention the commit-graph, e.g. change errors like:

error: improper chunk offset 0000000000385e0c

To:

error: commit-graph improper chunk offset 0000000000385e0c

As discussed in the commits leading up to this one the commit-graph
machinery is now used by common commands like "status". If the graph
was corrupt we'd often emit some error that gave no indication what
was wrong. Now some of them are still cryptic, but they'll at least
mention "commit-graph" to give the user a hint as to where to look.

While I'm at it mark some of the strings that hadn't been marked for
translation. It's clear from the commit history and the code that this
was merely forgotten at the time, and wasn't intentional.p5

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph write: don't die if the existing graph... Ævar Arnfjörð Bjarmason Mon, 25 Mar 2019 12:08:33 +0000 (13:08 +0100)

commit-graph write: don't die if the existing graph is corrupt

When the commit-graph is written we end up calling
parse_commit(). This will in turn invoke code that'll consult the
existing commit-graph about the commit, if the graph is corrupted we
die.

We thus get into a state where a failing "commit-graph verify" can't
be followed-up with a "commit-graph write" if core.commitGraph=true is
set, the graph either needs to be manually removed to proceed, or
core.commitGraph needs to be set to "false".

Change the "commit-graph write" codepath to use a new
parse_commit_no_graph() helper instead of parse_commit() to avoid
this. The latter will call repo_parse_commit_internal() with
use_commit_graph=1 as seen in 177722b344 ("commit: integrate commit
graph with commit parsing", 2018-04-10).

Not using the old graph at all slows down the writing of the new graph
by some small amount, but is a sensible way to prevent an error in the
existing commit-graph from spreading.

Just fixing the current issue would be likely to result in code that's
inadvertently broken in the future. New code might use the
commit-graph at a distance. To detect such cases introduce a
"GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD" setting used when we do our
corruption tests, and test that a "write/verify" combo works after
every one of our current test cases where we now detect commit-graph
corruption.

Some of the code changes here might be strictly unnecessary, e.g. I
was unable to find cases where the parse_commit() called from
write_graph_chunk_data() didn't exit early due to
"item->object.parsed" being true in
repo_parse_commit_internal() (before the use_commit_graph=1 has any
effect). But let's also convert those cases for good measure, we do
not have exhaustive tests for all possible types of commit-graph
corruption.

This might need to be re-visited if we learn to write the commit-graph
incrementally, but probably not. Hopefully we'll just start by finding
out what commits we have in total, then read the old graph(s) to see
what they cover, and finally write a new graph file with everything
that's missing. In that case the new graph writing code just needs to
continue to use e.g. a parse_commit() that doesn't consult the
existing commit-graphs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph verify: detect inability to read the graphÆvar Arnfjörð Bjarmason Mon, 25 Mar 2019 12:08:32 +0000 (13:08 +0100)

commit-graph verify: detect inability to read the graph

Change "commit-graph verify" to error on open() failures other than
ENOENT. As noted in the third paragraph of 283e68c72f ("commit-graph:
add 'verify' subcommand", 2018-06-27) and the test it added it's
intentional that "commit-graph verify" doesn't error out when the file
doesn't exist.

But let's not be overly promiscuous in what we accept. If we can't
read the file for other reasons, e.g. permission errors, bad file
descriptor etc. we'd like to report an error to the user.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: don't pass filename to load_commit_graph_... Ævar Arnfjörð Bjarmason Mon, 25 Mar 2019 12:08:31 +0000 (13:08 +0100)

commit-graph: don't pass filename to load_commit_graph_one_fd_st()

An earlier change implemented load_commit_graph_one_fd_st() in a way
that was bug-compatible with earlier code in terms of the "graph file
%s is too small" error message printing out the path to the
commit-graph (".git/objects/info/commit-graph").

But change that, because:

* A function that takes an already-open file descriptor also needing
the filename isn't very intuitive.

* The vast majority of errors we might emit when loading the graph
come from parse_commit_graph(), which doesn't report the
filename. Let's not do that either in this case for consistency.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: don't early exit(1) on e.g. "git status"Ævar Arnfjörð Bjarmason Mon, 25 Mar 2019 12:08:30 +0000 (13:08 +0100)

commit-graph: don't early exit(1) on e.g. "git status"

Make the commit-graph loading code work as a library that returns an
error code instead of calling exit(1) when the commit-graph is
corrupt. This means that e.g. "status" will now report commit-graph
corruption as an "error: [...]" at the top of its output, but then
proceed to work normally.

This required splitting up the load_commit_graph_one() function so
that the code that deals with open()-ing and stat()-ing the graph can
now be called independently as open_commit_graph().

This is needed because "commit-graph verify" where the graph doesn't
exist isn't an error. See the third paragraph in
283e68c72f ("commit-graph: add 'verify' subcommand",
2018-06-27). There's a bug in that logic where we conflate the
intended ENOENT with other errno values (e.g. EACCES), but this change
doesn't address that. That'll be addressed in a follow-up change.

I'm then splitting most of the logic out of load_commit_graph_one()
into load_commit_graph_one_fd_st(), which allows for providing an
existing file descriptor and stat information to the loading
code. This isn't strictly needed, but it would be redundant and
confusing to open() and stat() the file twice for some of the
codepaths, this allows for calling open_commit_graph() followed by
load_commit_graph_one_fd_st(). The "graph_file" still needs to be
passed to that function for the the "graph file %s is too small" error
message.

This leaves load_commit_graph_one() unused by everything except the
internal prepare_commit_graph_one() function, so let's mark it as
"static". If someone needs it in the future we can remove the "static"
attribute. I could also rewrite its sole remaining
user ("prepare_commit_graph_one()") to use
load_commit_graph_one_fd_st() instead, but let's leave it at this.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: fix segfault on e.g. "git status"Ævar Arnfjörð Bjarmason Mon, 25 Mar 2019 12:08:29 +0000 (13:08 +0100)

commit-graph: fix segfault on e.g. "git status"

When core.commitGraph=true is set, various common commands now consult
the commit graph. Because the commit-graph code is very trusting of
its input data, it's possibly to construct a graph that'll cause an
immediate segfault on e.g. "status" (and e.g. "log", "blame", ...). In
some other cases where git immediately exits with a cryptic error
about the graph being broken.

The root cause of this is that while the "commit-graph verify"
sub-command exhaustively verifies the graph, other users of the graph
simply trust the graph, and will e.g. deference data found at certain
offsets as pointers, causing segfaults.

This change does the bare minimum to ensure that we don't segfault in
the common fill_commit_in_graph() codepath called by
e.g. setup_revisions(), to do this instrument the "commit-graph
verify" tests to always check if "status" would subsequently
segfault. This fixes the following tests which would previously
segfault:

not ok 50 - detect low chunk count
not ok 51 - detect missing OID fanout chunk
not ok 52 - detect missing OID lookup chunk
not ok 53 - detect missing commit data chunk

Those happened because with the commit-graph enabled setup_revisions()
would eventually call fill_commit_in_graph(), where e.g.
g->chunk_commit_data is used early as an offset (and will be
0x0). With this change we get far enough to detect that the graph is
broken, and show an error instead. E.g.:

$ git status; echo $?
error: commit-graph is missing the Commit Data chunk
1

That also sucks, we should *warn* and not hard-fail "status" just
because the commit-graph is corrupt, but fixing is left to a follow-up
change.

A side-effect of changing the reporting from graph_report() to error()
is that we now have an "error: " prefix for these even for
"commit-graph verify". Pseudo-diff before/after:

$ git commit-graph verify
-commit-graph is missing the Commit Data chunk
+error: commit-graph is missing the Commit Data chunk

Changing that is OK. Various errors it emits now early on are prefixed
with "error: ", moving these over and changing the output doesn't
break anything.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fast-import: fix erroneous handling of get-mark with... Elijah Newren Wed, 20 Feb 2019 22:58:46 +0000 (14:58 -0800)

fast-import: fix erroneous handling of get-mark with empty orphan commits

When get-mark was introduced in commit 28c7b1f7b7b7 ("fast-import: add a
get-mark command", 2015-07-01), it followed the precedent of the
cat-blob command to be allowed on any line other than in the middle of a
data directive; see commit 777f80d7429b ("fast-import: Allow cat-blob
requests at arbitrary points in stream", 2010-11-28). It was useful to
allow cat-blob directives in the middle of a commit to get more data
that would be used in writing the current commit object. get-mark is
not similarly useful since fast-import can already use either object id
or mark. Further, trying to allow this command anywhere caused parsing
bugs. Fix the parsing problems by only allowing get-mark commands to
appear when other commands have completed.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fast-import: only allow cat-blob requests where it... Elijah Newren Wed, 20 Feb 2019 22:58:45 +0000 (14:58 -0800)

fast-import: only allow cat-blob requests where it makes sense

In commit 777f80d7429b ("fast-import: Allow cat-blob requests at
arbitrary points in stream", 2010-11-28), fast-import started allowing
cat-blob commands to appear on the start of any line except in the
middle of a "data" command. It could be in the middle of various
directives that were part of a tag command, or in the middle of
checkpoints or progresses (each of which allow an optional second empty
newline), or even immediately after the mark command of a blob before
the data directive appeared (raising the question of what if it used the
mark for the blob that just barely appeared in the stream that we do not
yet have the data for). None of these locations make any sense as
places to put cat-blob requests.

The purpose of this change as stated in that commit message was to

[save] frontends from having to loop over everything they want to
commit in the next commit and cat-ing the necessary objects in
advance.

However, that can be achieved by simply allowing cat-blob requests to
appear whenever a filemodify directive is allowed. Further, it avoids
setting a bad precedent for other commands to follow (e.g. get-mark); a
precedent which caused parsing problems in corner cases.

Technically, inline filemodify directives add a slight wrinkle in that
frontends might want to have cat-blob directives appear after the start
of the filemodify and before the data directive contained within it. I
think it would have been better to disallow such a case (it would be
trivial to use cat-blob before the filemodify instead), but since there
is evidence this was used, for backwards compatibility let's support
that case too.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fast-import: check most prominent commands firstElijah Newren Wed, 20 Feb 2019 22:58:44 +0000 (14:58 -0800)

fast-import: check most prominent commands first

This is not a very important change, and one that I expect to have no
performance impact whatsoever, but reading the code bothered me. The
parsing of command types in cmd_main() mostly runs in order of most
common to least common commands; sure, it's hard to say for sure what
the most common are without some type of study, but it seems fairly
clear to mark the original four ("blob", "commit", "tag", "reset") as
the most prominent. Indeed, the parsing for most other commands were
added to later in the list. However, when "ls" was added, it was stuck
near the top of the list, with no rationale for that particular
location. Move it down to later to appease my Tourette's-like internal
twitching that its former location was causing.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-fast-import.txt: fix wording about where ls command... Elijah Newren Wed, 20 Feb 2019 22:58:43 +0000 (14:58 -0800)

git-fast-import.txt: fix wording about where ls command can appear

The docs claimed `ls` commands could appear almost anywhere, but the
code told a different story. Modify the docs to match the code.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t9300: demonstrate bug with get-mark and empty orphan... Elijah Newren Wed, 20 Feb 2019 22:58:42 +0000 (14:58 -0800)

t9300: demonstrate bug with get-mark and empty orphan commits

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gitweb: make hash size independentbrian m. carlson Tue, 19 Feb 2019 00:05:26 +0000 (00:05 +0000)

gitweb: make hash size independent

Gitweb has several hard-coded 40 values throughout it to check for
values that are passed in or acquired from Git. To simplify the code,
introduce a regex variable that matches either exactly 40 or exactly 64
hex characters, and use this variable anywhere we would have previously
hard-coded a 40 in a regex.

Add some helper functions which allow us to write tighter regexes that
match exactly the number of hex characters we're expecting.

Similarly, switch the code that looks for deleted diffinfo information
to look for either 40 or 64 zeros, and update one piece of code to use
this function. Finally, when formatting a log line, allow an
abbreviated describe output to contain up to 64 characters.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Git.pm: make hash size independentbrian m. carlson Tue, 19 Feb 2019 00:05:25 +0000 (00:05 +0000)

Git.pm: make hash size independent

The cat_blob function was matching on exactly 40 hex characters. This
won't work with SHA-256, which uses 64-character hex object IDs. While
it should be fine to simply match any number of hex characters since the
output is space delimited, be extra safe by matching either exactly 40
or exactly 64 hex characters.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

read-cache: read data in a hash-independent waybrian m. carlson Tue, 19 Feb 2019 00:05:24 +0000 (00:05 +0000)

read-cache: read data in a hash-independent way

Index entries are structured with a variety of fields up front, followed
by a hash and one or two flags fields. Because the hash field is stored
in the middle of the structure, it's difficult to use one fixed-size
structure that easily allows access to the hash and flags fields.
Adjust the structure to hold the maximum amount of data that may be
needed using a member called "data" and read and write this field
independently in the various places that need to read and write the
structure.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

dir: make untracked cache extension hash size independentbrian m. carlson Tue, 19 Feb 2019 00:05:23 +0000 (00:05 +0000)

dir: make untracked cache extension hash size independent

Instead of using a struct with a flex array member to read and write the
untracked cache extension, use a shorter, fixed-length struct and add
the name and hash data explicitly.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/difftool: use parse_oid_hexbrian m. carlson Tue, 19 Feb 2019 00:05:22 +0000 (00:05 +0000)

builtin/difftool: use parse_oid_hex

Instead of using get_oid_hex and adding constants to the result, use
parse_oid_hex to make this code independent of the hash size.

Additionally, correct a typo that would cause us to print one too few
characters on error, since we will already have incremented the pointer
to point to the beginning of the object ID before we get to printing the
error message.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refspec: make hash size independentbrian m. carlson Tue, 19 Feb 2019 00:05:21 +0000 (00:05 +0000)

refspec: make hash size independent

Switch a use of GIT_SHA1_HEXSZ to use the_hash_algo.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

archive: convert struct archiver_args to object_idbrian m. carlson Tue, 19 Feb 2019 00:05:20 +0000 (00:05 +0000)

archive: convert struct archiver_args to object_id

Change the commit_sha1 member to be called "commit_oid" and change it to
be a pointer to struct object_id. Additionally, update some uses of
GIT_SHA1_HEXSZ and hard-coded values to use the_hash_algo instead.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/get-tar-commit-id: make hash size independentbrian m. carlson Tue, 19 Feb 2019 00:05:19 +0000 (00:05 +0000)

builtin/get-tar-commit-id: make hash size independent

To make this code independent of the hash size, verify that the length
of the comment is equal to that of any supported hash algorithm.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

get-tar-commit-id: parse comment recordRene Scharfe Tue, 19 Feb 2019 00:05:18 +0000 (00:05 +0000)

get-tar-commit-id: parse comment record

Parse pax comment records properly and get rid of magic numbers for
acceptable comment length. This simplifies a later change to handle
longer hashes.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

hash: add a function to lookup hash algorithm by lengthbrian m. carlson Tue, 19 Feb 2019 00:05:17 +0000 (00:05 +0000)

hash: add a function to lookup hash algorithm by length

There are some cases, such as the dumb HTTP transport and bundles, where
we can only determine the hash algorithm in use by the length of the
object IDs. Provide a function that looks up the algorithm by length.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>