Andrew's git - gitweb.git/log

config.c: refactor die_bad_number() to not call gettext() early
Ævar Arnfjörð Bjarmason, Fri, 21 Jun 2019 10:18:07 +0000 (12:18 +0200)

Prepare die_bad_number() for a change to specially handle
GIT_TEST_GETTEXT_POISON calling git_env_bool() by making
die_bad_number() not call gettext() early, which would in turn call
git_env_bool().

There's no meaningful change here yet, just a re-arrangement of the
current code to make that subsequent change easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

env--helper: new undocumented builtin wrapping git_env_*()
Ævar Arnfjörð Bjarmason, Fri, 21 Jun 2019 10:18:06 +0000 (12:18 +0200)

We have many GIT_TEST_* variables that accept a <boolean> because
they're implemented in C, and then some that take <non-empty?> because
they're implemented at least partially in shellscript.

Add a helper that wraps git_env_bool() and git_env_ulong() as the
first step in fixing this. This isn't being added as a test-tool mode
because some of these are used outside the test suite.

Part of what this tool does can be done via a trick with "git config"
added in 83d842dc8c ("tests: turn on network daemon tests by default",
2014-02-10) for test_tristate(), i.e.:

    git -c magic.variable="$1" config --bool magic.variable 2>/dev/null

But as subsequent changes will show, being able to pass along the
default value makes all the difference, and we'll be able to replace
test_tristate() itself with that.

The --type=bool option will be used by subsequent patches, but not
--type=ulong. I figured it was easy enough to add it & test for it so
I left it in so we'd have wrappers for both git_env_*() functions, and
to have a template to make it obvious how we'd add --type=int etc. if
it's needed in the future.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
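
The value of passing a default along, as the env--helper message above describes, can be sketched in a few lines. This is only an illustration in Python, not the helper's actual code: the function name env_bool, the injectable environ mapping, and the exact set of accepted spellings are assumptions modelled on Git's usual boolean values.

```python
import os

# Spellings assumed to mirror Git's boolean config values (an assumption,
# not taken from the commit message itself).
_TRUE = {"true", "yes", "on"}
_FALSE = {"false", "no", "off", ""}


def env_bool(name, default, environ=None):
    """Return the boolean value of variable `name`, or `default` if it is unset."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        # Unset variable: the caller-supplied default decides the outcome.
        return default
    text = raw.lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    try:
        # Any other value must be an integer; non-zero means true.
        return int(text) != 0
    except ValueError:
        raise ValueError(f"bad boolean value {raw!r} for {name}") from None
```

With a default available, "unset" no longer has to be treated as a third state by the caller, which is the gap the "git config" trick leaves open.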

pull: add --[no-]show-forced-updates passthrough
Derrick Stolee, Tue, 18 Jun 2019 20:25:28 +0000 (13:25 -0700)

The 'git fetch' command can avoid calculating forced updates, so
allow users of 'git pull' to provide that option. This is particularly
necessary when the advice to use '--no-show-forced-updates' is given
at the end of the command.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: warn about forced updates in branch listing
Derrick Stolee, Tue, 18 Jun 2019 20:25:27 +0000 (13:25 -0700)

The --[no-]show-forced-updates option in 'git fetch' can be confusing
for some users, especially if it is enabled via config setting and not
by argument. Add advice to warn the user that the (forced update)
messages were not listed.

Additionally, warn users when the forced update check takes longer
than ten seconds, and recommend that they disable the check. These
messages can be disabled by the advice.fetchShowForcedUpdates config
setting.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: add --[no-]show-forced-updates argument
Derrick Stolee, Tue, 18 Jun 2019 20:25:26 +0000 (13:25 -0700)

After updating a set of remote refs during a 'git fetch', we walk the
commits in the new ref value and not in the old ref value to discover
if the update was a forced update. This results in two things happening
during the command:

1. The line including the ref update has an additional "(forced-update)"
marker at the end.

2. The ref log for that remote branch includes a bit saying that update
is a forced update.

For many situations, this forced-update message happens infrequently, or
is a small bit of information among many ref updates. Many users ignore
these messages, but the calculation required here slows down their fetches
significantly. Keep in mind that they do not have the opportunity to
calculate a commit-graph file containing the newly-fetched commits, so
these comparisons can be very slow.

Add a '--[no-]show-forced-updates' option that allows a user to skip this
calculation. The only permanent result is dropping the forced-update bit
in the reflog.

Include a new fetch.showForcedUpdates config setting that allows this
behavior without including the argument in every command. The config
setting is overridden by the command-line arguments.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
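
The check that the fetch message above describes amounts to asking whether the old ref value is still reachable from the new one. A minimal sketch on a toy commit graph follows; the dictionary representation and both function names are hypothetical, and Git's real walk is considerably more involved.

```python
def is_ancestor(parents, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return False


def is_forced_update(parents, old, new):
    """An update is forced when the old value is not an ancestor of the new one."""
    return not is_ancestor(parents, old, new)


# Toy history: A <- B <- C is linear, D is a rewrite that branches off A.
history = {"B": ["A"], "C": ["B"], "D": ["A"]}
```

The walk visits every commit between the two values in the worst case, which is why skipping it can noticeably speed up fetches of long histories.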

status: ignore status.aheadbehind in porcelain formats
Jeff Hostetler, Tue, 18 Jun 2019 20:21:28 +0000 (13:21 -0700)

Teach porcelain V[12] formats to ignore the status.aheadbehind
config setting. They only respect the --[no-]ahead-behind
command line argument. This is for backwards compatibility
with existing scripts.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
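
The precedence described in the message above (command line first, then config, with porcelain formats skipping the config) might be sketched as follows. The function name and the assumption that ahead/behind is computed by default are illustrative, not taken from the patch.

```python
def effective_ahead_behind(cli_flag, config_value, porcelain):
    """Decide whether to compute ahead/behind counts.

    cli_flag:     True/False from --[no-]ahead-behind, None if not given.
    config_value: True/False from status.aheadBehind, None if unset.
    porcelain:    True for the porcelain V1/V2 formats.
    """
    if cli_flag is not None:
        # An explicit command-line argument always wins.
        return cli_flag
    if porcelain or config_value is None:
        # Porcelain formats ignore the config; assumed default is to compute.
        return True
    return config_value
```

Keeping the config out of the porcelain path means an existing script sees the same output regardless of the user's settings.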

status: warn when a/b calculation takes too long
Jeff Hostetler, Tue, 18 Jun 2019 20:21:27 +0000 (13:21 -0700)

The ahead/behind calculation in 'git status' can be slow in some
cases. Users may not realize that there are ways to avoid this
computation, especially if they are not using the information.

Add a warning that appears if this calculation takes more than
two seconds. The warning can be disabled through the new config
setting advice.statusAheadBehind.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
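
The slow-calculation warning described above follows a simple pattern: time the work and emit advice only if it exceeded a threshold and the advice is enabled. The sketch below assumes a hypothetical helper name and message wording; only the two-second threshold comes from the commit message.

```python
import time


def timed_with_advice(compute, threshold_seconds=2.0, advice_enabled=True,
                      clock=time.monotonic):
    """Run `compute` and return (result, warning), where warning may be None."""
    start = clock()
    result = compute()
    elapsed = clock() - start
    warning = None
    if advice_enabled and elapsed > threshold_seconds:
        # Hypothetical wording; the real message text is not quoted in the log.
        warning = (f"ahead/behind calculation took {elapsed:.2f} seconds; "
                   "consider --no-ahead-behind")
    return result, warning
```

The injectable clock is there only so the behaviour can be exercised without actually waiting.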

status: add status.aheadbehind setting
Jeff Hostetler, Tue, 18 Jun 2019 20:21:25 +0000 (13:21 -0700)

The --[no-]ahead-behind option was introduced in fd9b544a
(status: add --[no-]ahead-behind to status and commit for V2
format, 2018-01-09). This is a necessary change of behavior
in repos where the remote tracking branches can move very
quickly ahead of the local branches. However, users need to
remember to provide the command-line argument every time.

Add a new "status.aheadBehind" config setting to change the
default behavior of all git status formats.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

status: remove the empty line after hints
John Lin, Tue, 4 Jun 2019 14:02:21 +0000 (07:02 -0700)

Before this patch, there is inconsistency between the status
messages with hints and the ones without hints: there is an
empty line between the title and the file list if hints are
presented, but there isn't one if there are no hints.

This patch removes the inconsistency by removing the empty
lines even if hints are presented.

Signed-off-by: John Lin <johnlinp@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config tests: simplify include cycle test
Ævar Arnfjörð Bjarmason, Thu, 20 Jun 2019 21:09:08 +0000 (23:09 +0200)

Simplify an overly verbose test added in 9b25a0b52e ("config: add
include directive", 2012-02-06). The "expect" file was never used, and
by using .gitconfig it's not as intuitive to reproduce this manually
with "-d" as some other tests, since HOME needs to be set in the
environment.

Also remove the use of test_i18ngrep added in a769bfc74f ("config.c:
mark more strings for translation", 2018-07-21) in favor of overriding
the GIT_TEST_GETTEXT_POISON value.

Using the i18n test wrappers hasn't been needed since my
6cdccfce1e ("i18n: make GETTEXT_POISON a runtime option", 2018-11-08).
As a follow-up change to the yet-to-be-added t0017-env-helper.sh will
show, doing it this way can hide a regression when combined with
trace2's early config reading. That early config reading was added in
bce9db6de9 ("trace2: use system/global config for default trace2
settings", 2019-04-15).

So let's remove the testing for that potential regression here; I'll
instead add it explicitly to t0017-env-helper.sh in a follow-up
change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: update Makefile to allow for spaces in the compiler path
Jeff Hostetler, Wed, 19 Jun 2019 21:06:06 +0000 (14:06 -0700)

It is quite common that MS Visual C++ is installed into a location whose
path contains spaces, therefore we need to quote it.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: fix detect_msys_tty()
Jeff Hostetler, Wed, 19 Jun 2019 21:06:05 +0000 (14:06 -0700)

The ntstatus.h header is only available in MINGW.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: define ftello()
Jeff Hostetler, Wed, 19 Jun 2019 21:06:04 +0000 (14:06 -0700)

It is just called differently in MSVC's headers.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: do not re-declare the timespec struct
Jeff Hostetler, Wed, 19 Jun 2019 21:06:03 +0000 (14:06 -0700)

VS2015's headers already declare that struct.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: mark a variable as non-const
Jeff Hostetler, Wed, 19 Jun 2019 21:06:02 +0000 (14:06 -0700)

VS2015 complains when using a const pointer in memcpy()/free().

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: define O_ACCMODE
Philip Oakley, Wed, 19 Jun 2019 21:06:02 +0000 (14:06 -0700)

This constant is not defined in MSVC's headers.

In UCRT's fcntl.h, _O_RDONLY, _O_WRONLY and _O_RDWR are defined as 0, 1
and 2, respectively. Yes, that means that UCRT breaks with the tradition
that O_RDWR == O_RDONLY | O_WRONLY.

It is a perfectly legal way to define those constants, though, therefore
we need to take care of defining O_ACCMODE accordingly.

This is particularly important in order to keep our "open() can set
errno to EISDIR" emulation working: it tests that (flags & O_ACCMODE) is
not identical to O_RDONLY before going on to test specifically whether
the file for which open() reported EACCES is, in fact, a directory.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
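
The arithmetic behind the O_ACCMODE message above can be checked directly. In this sketch the three mode values are the ones the message quotes for UCRT; the mask definition (the OR of all three modes) and the O_CREAT value are assumptions made for illustration, since the log does not show the actual define.

```python
# Values quoted in the commit message for UCRT's fcntl.h.
O_RDONLY, O_WRONLY, O_RDWR = 0, 1, 2

# Assumed mask: the OR of all access modes, i.e. 3.
O_ACCMODE = O_RDONLY | O_WRONLY | O_RDWR

# Illustrative non-access flag, used only to show that the mask strips it.
O_CREAT = 0x0100


def wants_write_access(flags):
    """True when the access mode in `flags` is anything other than read-only."""
    return (flags & O_ACCMODE) != O_RDONLY
```

Because O_RDWR (2) is not O_RDONLY | O_WRONLY (1), a mask built from only the first two modes would miss read-write opens; masking with 3 covers all three cases.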

msvc: include sigset_t definition
Philip Oakley, Wed, 19 Jun 2019 21:06:01 +0000 (14:06 -0700)

On MSVC (VS2008) sigset_t is not defined.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: fix dependencies of compat/msvc.c
Johannes Schindelin, Wed, 19 Jun 2019 21:06:00 +0000 (14:06 -0700)

The file compat/msvc.c includes compat/mingw.c, which means that we have
to recompile compat/msvc.o if compat/mingw.c changes.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mingw: replace mingw_startup() hack
Johannes Schindelin, Wed, 19 Jun 2019 21:05:59 +0000 (14:05 -0700)

Git for Windows has special code to retrieve the command-line parameters
(and even the environment) in UTF-16 encoding, so that they can be
converted to UTF-8. This is necessary because Git for Windows wants to
use UTF-8 encoded strings throughout its code, and the main() function
does not get the parameters in that encoding.

To do that, we used the __wgetmainargs() function, which is not even a
Win32 API function, but provided by the MINGW "runtime" instead.

Obviously, this method would not work with any compiler other than GCC,
and in preparation for compiling with Visual C++, we would like to avoid
precisely that.

Lucky us, there is a much more elegant way: we can simply implement the
UTF-16 variant of `main()`: `wmain()`.

To make that work, we need to link with -municode. The command-line
parameters are passed to `wmain()` encoded in UTF-16, as desired, and
this method also works with GCC, and also with Visual C++ after
adjusting the MSVC linker flags to force it to use `wmain()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

obstack: fix compiler warning
Johannes Schindelin, Wed, 19 Jun 2019 21:05:59 +0000 (14:05 -0700)

MS Visual C suggests that the construct

    condition ? (int) i : (ptrdiff_t) d

is incorrect. Let's fix this by casting to ptrdiff_t also for the
positive arm of the conditional.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cache-tree/blame: avoid reusing the DEBUG constant
Jeff Hostetler, Wed, 19 Jun 2019 21:05:58 +0000 (14:05 -0700)

In MS Visual C, the `DEBUG` constant is set automatically whenever
compiling with debug information.

This is clearly not what was intended in `cache-tree.c` nor in
`builtin/blame.c`, so let's use a less ambiguous name there.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t0001 (mingw): do not expect a specific order of stdout/stderr
Johannes Schindelin, Wed, 19 Jun 2019 21:05:57 +0000 (14:05 -0700)

When redirecting stdout/stderr to the same file, we cannot guarantee
that stdout will come first.

In fact, in this test case, it seems that an MSVC build always prints
stderr first.

In any case, this test case does not want to verify the *order* but
the *presence* of both outputs, so let's test exactly that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
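
Testing for presence rather than order, as the t0001 message above argues, can be sketched like this. The helper name and the sample strings in the usage are hypothetical; the real test is a shell script.

```python
def has_both_outputs(combined, expected_stdout, expected_stderr):
    """Check that both expected lines appear in the combined output, in any order."""
    lines = combined.splitlines()
    return expected_stdout in lines and expected_stderr in lines
```

A comparison against a fixed expected file would instead fail whenever the two streams happen to be flushed in the opposite order.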

Mark .bat files as requiring CR/LF endings
Johannes Schindelin, Wed, 19 Jun 2019 21:05:57 +0000 (14:05 -0700)

Just like the natural line ending for Unix shell scripts consists of a
single Line Feed, the natural line ending for (DOS) Batch scripts
consists of a Carriage Return followed by a Line Feed.

It seems that both Unix shell script interpreters and the interpreter
for Batch scripts (`cmd.exe`) are keen on seeing the "right" line
endings.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mingw: fix a typo in the msysGit-specific section
Johannes Schindelin, Wed, 19 Jun 2019 21:05:56 +0000 (14:05 -0700)

The msysGit project (i.e. Git for Windows 1.x's SDK) has been safely dead for
*years* already. This is probably the reason why nobody caught this typo
until Carlo Arenas spotted a copy-edited version of it nearby.

It is probably about time to rip out the remainders of msysGit/MSys1
support, but that can safely wait a bit longer, and we can at least fix
the typo for now.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: print server version at the top in -v -v
Nguyễn Thái Ngọc Duy, Thu, 20 Jun 2019 11:59:51 +0000 (18:59 +0700)

Before the previous patch, the server version is printed after all the
"Server supports" lines. The previous one puts the version in the middle
of "Server supports" group.

Instead of moving it to the bottom, I move it to the top. Version may
stand out more at the top as we will have even more debug output after
capabilities.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: print all relevant supported capabilities with -v -v
Nguyễn Thái Ngọc Duy, Thu, 20 Jun 2019 11:59:50 +0000 (18:59 +0700)

When we check if some capability is supported, we do print something in
verbose mode. Some capabilities are not printed though (and it made me
think it's not supported; I was more used to GIT_TRACE_PACKET) so let's
print them all.

It's a bit more code. And one could argue for printing all supported
capabilities the server sends us. But I think it's still valuable this
way because we see the capabilities that the client cares about.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: move capability names out of i18n strings
Nguyễn Thái Ngọc Duy, Thu, 20 Jun 2019 11:59:49 +0000 (18:59 +0700)

This reduces the work on translators since they only have one string to
translate (and I think it's still enough context to translate). It also
makes sure no capability name is translated by accident.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blame: add a test to cover blame_coalesce()
Barret Rhoden, Thu, 20 Jun 2019 16:38:20 +0000 (12:38 -0400)

Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blame: use the fingerprint heuristic to match ignored lines
Barret Rhoden, Thu, 20 Jun 2019 16:38:19 +0000 (12:38 -0400)

This commit integrates the fuzzy fingerprint heuristic into
guess_line_blames().

We actually make two passes. The first pass uses the fuzzy algorithm to
find a match within the current diff chunk. If that fails, the second
pass searches the entire parent file for the best match.

For an example of scanning the entire parent for a match, consider:

commit-a 30) #include <sys/header_a.h>
commit-b 31) #include <header_b.h>
commit-c 32) #include <header_c.h>

Then commit X alphabetizes them:

commit-X 30) #include <header_b.h>
commit-X 31) #include <header_c.h>
commit-X 32) #include <sys/header_a.h>

If we just check the parent's chunk (i.e. the first pass), we'd get:

commit-b 30) #include <header_b.h>
commit-c 31) #include <header_c.h>
commit-X 32) #include <sys/header_a.h>

That's because commit X actually consists of two chunks: one chunk is
removing sys/header_a.h, then some context, and the second chunk is
adding sys/header_a.h.

If we scan the entire parent file, we get:

commit-b 30) #include <header_b.h>
commit-c 31) #include <header_c.h>
commit-a 32) #include <sys/header_a.h>

Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blame: add a fingerprint heuristic to match ignored lines
Michael Platings, Thu, 20 Jun 2019 16:38:18 +0000 (12:38 -0400)

This algorithm will replace the heuristic used to identify lines from
ignored commits with one that finds likely candidate lines in the
parent's version of the file. The actual replacement occurs in an
upcoming commit.

The old heuristic simply assigned lines in the target to the same line
number (plus offset) in the parent. The new function uses a
fingerprinting algorithm to detect similarity between lines.

The new heuristic is designed to accurately match changes made
mechanically by formatting tools such as clang-format and clang-tidy.
These tools make changes such as breaking up lines to fit within a
character limit or changing identifiers to fit with a naming convention.
The heuristic is not intended to match more extensive refactoring
changes and may give misleading results in such cases.

In most cases formatting tools preserve line ordering, so the heuristic
is optimised for such cases. (Some types of changes do reorder lines,
e.g. sorting, but they keep the line content identical, so the git blame -M
option can already be used to address this.) The reason that it is advantageous
to rely on ordering is due to source code repeating the same character
sequences often e.g. declaring an identifier on one line and using that
identifier on several subsequent lines. This means that lines can look
very similar to each other which presents a problem when doing fuzzy
matching. Relying on ordering gives us extra clues to point towards the
true match.

The heuristic operates on a single diff chunk change at a time. It
creates a “fingerprint” for each line on each side of the change.
Fingerprints are described in detail in the comment for `struct
fingerprint`, but essentially are a multiset of the character pairs in a
line. The heuristic first identifies the line in the target entry whose
fingerprint is most clearly matched to a line fingerprint in the parent
entry. Where fingerprints match identically, the position of the lines
is used as a tie-break. The heuristic locks in the best match, and
subtracts the fingerprint of the line in the target entry from the
fingerprint of the line in the parent entry to prevent other lines being
matched on the same parts of that line. It then repeats the process
recursively on the section of the chunk before the match, and then the
section of the chunk after the match.

Here's an example of the difference the fingerprinting makes. Consider
a file with two commits:

commit-a 1) void func_1(void *x, void *y);
commit-b 2) void func_2(void *x, void *y);

After a commit 'X', we have:

commit-X 1) void func_1(void *x,
commit-X 2) void *y);
commit-X 3) void func_2(void *x,
commit-X 4) void *y);

When we blame with commit X ignored, the old algorithm gives:

commit-a 1) void func_1(void *x,
commit-b 2) void *y);
commit-X 3) void func_2(void *x,
commit-X 4) void *y);

Here commit-b is blamed for line 2 instead of line 3. With the fingerprint
algorithm, we get:

commit-a 1) void func_1(void *x,
commit-a 2) void *y);
commit-b 3) void func_2(void *x,
commit-b 4) void *y);

Note line 2 could be matched with either commit-a or commit-b as it is
equally similar to both lines, but is matched with commit-a because its
position as a fraction of the new line range is more similar to commit-a
as a fraction of the old line range. Line 4 is also equally similar to
both lines, but as it appears after line 3 which will be matched first
it cannot be matched with an earlier line.

For many more examples, see t/t8014-blame-ignore-fuzzy.sh which contains
example parent and target files and the line numbers in the parent that
must be matched.

Signed-off-by: Michael Platings <michael@platin.gs>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
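
The core idea of the fingerprint described above, a multiset of character pairs compared by overlap, can be sketched briefly. This is a simplification under stated assumptions: whitespace is dropped and case is folded before pairing, the function names are hypothetical, and the positional tie-break, fingerprint subtraction and recursion from the message are left out.

```python
from collections import Counter


def fingerprint(line):
    """Multiset of adjacent character pairs, ignoring whitespace and case."""
    chars = [c.lower() for c in line if not c.isspace()]
    return Counter(zip(chars, chars[1:]))


def similarity(a, b):
    """Size of the multiset intersection of two fingerprints."""
    return sum((a & b).values())


def best_match(target_line, parent_lines):
    """Return (index of the most similar parent line, list of all scores)."""
    target = fingerprint(target_line)
    scores = [similarity(target, fingerprint(p)) for p in parent_lines]
    best = max(range(len(parent_lines)), key=scores.__getitem__)
    return best, scores


parent = ["void func_1(void *x, void *y);",
          "void func_2(void *x, void *y);"]
```

Here best_match("void func_2(void *x,", parent) picks the second parent line, while "void *y);" scores identically against both, which is the situation where the message's positional tie-break is needed; this sketch simply returns the first of the tied lines.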

completion: disable dwim on "git switch -d"
Nguyễn Thái Ngọc Duy, Thu, 20 Jun 2019 09:55:22 +0000 (16:55 +0700)

Even though dwim is enabled by default, it will never be done when
--detached is specified. If you force "-d --guess" you will get an error
because --guess then implies -c which cannot be used with -d. So we can
disable dwim in "switch -d". It makes the completion list in this case a
bit shorter.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

switch: allow to switch in the middle of bisect
Nguyễn Thái Ngọc Duy, Thu, 20 Jun 2019 09:55:21 +0000 (16:55 +0700)

In c45f0f525d (switch: reject if some operation is in progress,
2019-03-29), a check is added to prevent switching when some operation
is in progress. The reason is it's often not safe to do so.

This is true for merge, am, rebase, cherry-pick and revert, but not so
much for bisect because bisecting is basically jumping/switching between
a bunch of commits to pinpoint the first bad one. git-bisect suggests
the next commit to test, but it's not wrong for the user to test a
different commit because git-bisect cannot have the knowledge to know
better.

For this reason, allow switching when bisecting (*). I considered if we
should still prevent switching by default and allow it with
--ignore-in-progress. But I don't think the prevention really adds
anything much.

If the user switches away by mistake, since we print the previous HEAD
value, even if they don't know about the "-" shortcut, switching back is
still possible.

The warning will be printed on every switch while bisect is still
ongoing, not the first time you switch away from bisect's suggested
commit, so it could become a bit annoying.

(*) of course when it's safe to do so, i.e. no loss of local changes and
stuff.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t2027: use test_must_be_empty
Nguyễn Thái Ngọc Duy, Thu, 20 Jun 2019 09:55:20 +0000 (16:55 +0700)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

delta-islands: respect progress flag
Jeff King, Thu, 20 Jun 2019 08:58:32 +0000 (04:58 -0400)

The delta island code always prints "Marked %d islands", even if
progress has been suppressed with --no-progress or by sending stderr to
a non-tty.

Let's pass a progress boolean to load_delta_islands(). We already do
the same thing for the progress meter in resolve_tree_islands().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

rev-list: teach --no-object-names to enable piping
Emily Shaffer, Wed, 19 Jun 2019 20:56:56 +0000 (13:56 -0700)

Allow easier parsing by cat-file by giving rev-list an option to print
only the OID of a non-commit object without any additional information.
This is a short-term shim; later on, rev-list should be taught how to
print the types of objects it finds in a format similar to cat-file's.

Before this commit, the output from rev-list needed to be massaged
before being piped to cat-file, like so:

    git rev-list --objects HEAD | cut -f 1 -d ' ' |
        git cat-file --batch-check

This was especially unexpected when dealing with root trees, as an
invisible whitespace exists at the end of the OID:

    git rev-list --objects --filter=tree:1 --max-count=1 HEAD |
        xargs -I% echo "AA%AA"

Now, it can be piped directly, as in the added test case:

    git rev-list --objects --no-object-names HEAD | git cat-file --batch-check

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Change-Id: I489bdf0a8215532e540175188883ff7541d70e1b
Signed-off-by: Junio C Hamano <gitster@pobox.com>
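
The massaging that the rev-list message above says was needed before this option existed is equivalent to keeping only the first space-separated field of each line. A small sketch, assuming the "OID, space, optional name" line layout; the function name and the shortened OIDs in the sample are made up for illustration.

```python
def strip_object_names(rev_list_output):
    """Keep only the object ID from each 'OID [name]' line."""
    return [line.split(" ", 1)[0]
            for line in rev_list_output.splitlines() if line]


# Made-up, shortened IDs; the second line mimics a root tree, whose empty
# name leaves a trailing space after the OID.
sample = "1111aaaa\n2222bbbb \n3333cccc README\n"
```

Splitting on the first space also handles the root-tree case, where the line ends in an otherwise invisible trailing space.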

hashmap: convert sha1hash() to oidhash()
Jeff King, Thu, 20 Jun 2019 07:41:49 +0000 (03:41 -0400)

There are no callers left of sha1hash() that do not simply pass the
"hash" member of a "struct object_id". Let's get rid of the outdated
sha1-specific function and provide one that operates on the whole struct
(even though the technique, taking the first few bytes of the hash, will
remain the same).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
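
The technique mentioned above, using the first few bytes of the object ID as the hash value, works because the ID is already a well-distributed hash. A sketch, assuming four bytes and little-endian interpretation (the real function copies native-sized, native-endian bytes, so both details are assumptions):

```python
def oidhash(oid_bytes):
    """Use the leading 4 bytes of an object ID as an integer hash value."""
    return int.from_bytes(oid_bytes[:4], "little")


# A made-up 20-byte object ID for illustration.
example_oid = bytes.fromhex("01020304" + "00" * 16)
```

No further mixing is needed, so the hash costs little more than a four-byte copy.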

hash.h: move object_id definition from cache.h
Jeff King, Thu, 20 Jun 2019 07:41:45 +0000 (03:41 -0400)

Our hashmap.h helpfully defines a sha1hash() function. But it cannot
define a similar oidhash() without including all of cache.h, which
itself wants to include hashmap.h! Let's break this circular dependency
by moving the definition to hash.h, along with the remaining RAWSZ
macros, etc. That will put them with the existing git_hash_algo
definition.

One alternative would be to move oidhash() into cache.h, but it's
already quite bloated. We're better off moving things out than in.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

khash: rename oid helper functions
Jeff King, Thu, 20 Jun 2019 07:41:42 +0000 (03:41 -0400)

For use in object_id hash tables, we have oid_hash() and oid_equal().
But these are confusingly similar to the existing oideq() and the
oidhash() we plan to add to replace sha1hash().

The big difference from those functions is that rather than accepting a
const pointer to the "struct object_id", we take the arguments by value
(which is a khash internal convention). So let's make that obvious by
calling them oidhash_by_value() and oideq_by_value().

Those names are fairly horrendous to type, but we rarely need to do so;
they are passed to the khash implementation macro and then only used
internally. Callers get to use the nice kh_put_oid_map(), etc.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

khash: drop sha1-specific map types
Jeff King, Thu, 20 Jun 2019 07:41:38 +0000 (03:41 -0400)

All of the callers of khash_sha1 and khash_sha1_pos have been removed,
in favor of using maps that use "struct object_id" as their keys. Let's
drop these now-obsolete types.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap: convert khash_sha1 maps into kh_oid_map
Jeff King, Thu, 20 Jun 2019 07:41:35 +0000 (03:41 -0400)

All of the users of our khash_sha1 maps actually have a "struct
object_id". Let's use the more descriptive type.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

delta-islands: convert island_marks khash to use oids
Jeff King, Thu, 20 Jun 2019 07:41:32 +0000 (03:41 -0400)

All of the users of this map have an actual "struct object_id" rather
than a bare sha1. Let's use the more descriptive type (and get one step
closer to dropping khash_sha1 entirely).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

khash: rename kh_oid_t to kh_oid_set
Jeff King, Thu, 20 Jun 2019 07:41:28 +0000 (03:41 -0400)

khash lets us define a hash as either a map or a set (i.e., with no
"value" type). For the oid maps we define, "oid" is the set and
"oid_map" is the map. As the bug in the previous commit shows, it's easy
to pick the wrong one.

So let's make the names more distinct: "oid_set" and "oid_map".

An alternative naming scheme would be to actually name the type after
the key/value types. So e.g., "oid" _would_ be the set, since it has no
value type. And "oid_map" would become "oid_void" or similar (and
"oid_pos" becomes "oid_int"). That's better in some ways: it's more
regular, and a given map type can be more easily reused in multiple
contexts (e.g., something storing an "int" that isn't a "pos"). But it's
also slightly less descriptive.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

khash: drop broken oid_map typedef Jeff King Thu, 20 Jun 2019 07:41:25 +0000 (03:41 -0400)

khash: drop broken oid_map typedef

Commit 5a8643eff1 (khash: move oid hash table definition, 2019-02-19)
added a khash "oid_map" type to match the existing "oid" type, which is
a simple set (i.e., just keys, no values). But in setting up the
khash_oid_map typedef, it accidentally referred to "kh_oid_t", which is
the set type.

Nobody noticed the breakage because there are not yet any callers; the
type was added just as a match to the existing sha1 types (whose map
type confusingly _is_ called khash_sha1, and it has no matching set
type).

We could easily fix this with s/oid/oid_map/ in the typedef. But let's
take this a step further, and just drop the typedef entirely. These
typedefs were added by 5a8643eff1 to match the khash_sha1 typedefs. But
the actual khash-derived type names are descriptive enough; this is just
adding an extra layer of indirection. The khash names do not quite
follow our usual style (e.g., they end in "_t"), but since we end up
using other khash names (e.g., khiter_t, kh_get_oid()) anyway, just
typedef-ing the struct name is not really helping much.

And there are already many cases where we use the raw khash type names
anyway (e.g., the "set" variant defined just above us does not have such
a typedef!).

So let's drop this typedef, and the matching oid_pos one (which actually
_does_ have a user, but we can easily convert it).

We'll leave the khash_sha1 typedef around. The ultimate fate of its
callers should be conversion to kh_oid_map_t, so there's no point in
going through the noise of changing the names now.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: convert create_object() to use object_id Jeff King Thu, 20 Jun 2019 07:41:21 +0000 (03:41 -0400)

object: convert create_object() to use object_id

There are no callers left of create_object() that aren't just passing us
the "hash" member of a "struct object_id". Let's take the whole struct,
which gets us closer to removing all raw sha1 variables.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: convert internal hash_obj() to object_id Jeff King Thu, 20 Jun 2019 07:41:17 +0000 (03:41 -0400)

object: convert internal hash_obj() to object_id

Now that lookup_object() has an object_id, we can consistently pass that
around instead of a raw sha1. We still convert to a hash to pass to
sha1hash(), but the goal is for that to go away shortly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: convert lookup_object() to use object_id Jeff King Thu, 20 Jun 2019 07:41:14 +0000 (03:41 -0400)

object: convert lookup_object() to use object_id

There are no callers left of lookup_object() that aren't just passing us
the "hash" member of a "struct object_id". Let's take the whole struct,
which gets us closer to removing all raw sha1 variables. It also
matches the existing conversions of lookup_blob(), etc.

The conversions of callers were done by hand, but they're all mechanical
one-liners.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: convert lookup_unknown_object() to use object_id Jeff King Thu, 20 Jun 2019 07:41:10 +0000 (03:41 -0400)

object: convert lookup_unknown_object() to use object_id

There are no callers left of lookup_unknown_object() that aren't just
passing us the "hash" member of a "struct object_id". Let's take the
whole struct, which gets us closer to removing all raw sha1 variables.
It also matches the existing conversions of lookup_blob(), etc.

The conversions of callers were done by hand, but they're all mechanical
one-liners.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: convert locate_object_entry_hash() to... Jeff King Thu, 20 Jun 2019 07:41:07 +0000 (03:41 -0400)

pack-objects: convert locate_object_entry_hash() to object_id

There are no callers of locate_object_entry_hash() that aren't just
passing us the "hash" member of a "struct object_id". Let's take the
whole struct, which gets us closer to removing all raw sha1 variables.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: convert packlist_find() to use object_id Jeff King Thu, 20 Jun 2019 07:41:03 +0000 (03:41 -0400)

pack-objects: convert packlist_find() to use object_id

We take a raw hash pointer, but most of our callers have a "struct
object_id" already. Let's switch to taking the full struct, which will
let us continue removing uses of raw sha1 buffers.

There are two callers that do need special attention:

- in rebuild_existing_bitmaps(), we need to switch to
nth_packed_object_oid(). This incurs an extra hash copy over
pointing straight to the mmap'd sha1, but it shouldn't be measurable
compared to the rest of the operation.

- in can_reuse_delta() we already spent the effort to copy the sha1
into a "struct object_id", but now we just have to do so a little
earlier in the function (we can't easily convert that function's
callers because they may be pointing at mmap'd REF_DELTA blocks).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap-write: convert some helpers to use object_id Jeff King Thu, 20 Jun 2019 07:40:59 +0000 (03:40 -0400)

pack-bitmap-write: convert some helpers to use object_id

A few functions take raw hash pointers, but all of their callers
actually have a "struct object_id". Let's retain that extra type as long
as possible (which will let future patches extend that further, and so
on).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

upload-pack: rename a "sha1" variable to "oid" Jeff King Thu, 20 Jun 2019 07:40:54 +0000 (03:40 -0400)

upload-pack: rename a "sha1" variable to "oid"

This variable is a "struct object_id", but uses the old-style name
"sha1". Let's call it oid to match more modern code (and make it clear
that it can handle any algorithm, since it uses parse_oid_hex()
properly).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

describe: fix accidental oid/hash type-punning Jeff King Thu, 20 Jun 2019 07:40:50 +0000 (03:40 -0400)

describe: fix accidental oid/hash type-punning

The find_commit_name() function passes an object_id.hash as the key of a
hashmap. That ends up in commit_name_neq(), which then feeds it to
oideq(). Which means we should actually be passing the whole "struct object_id".

It works anyway because pointers to the two are interchangeable. And
because we're going through a layer of void pointers, the compiler
doesn't notice the type mismatch.
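
The "interchangeable pointers" point can be illustrated with a small ctypes model. This is only a sketch: the ObjectId layout below (a single 20-byte "hash" array as the first member) is an assumption made for illustration, not a copy of git's actual struct definition.

```python
import ctypes

HASH_SIZE = 20  # assumption: a SHA-1 sized buffer, for illustration only


class ObjectId(ctypes.Structure):
    # Assumed layout: the raw hash bytes are the first (and only) member.
    _fields_ = [("hash", ctypes.c_ubyte * HASH_SIZE)]


def same_address(oid):
    """Return True if the struct and its 'hash' member start at the same address."""
    struct_addr = ctypes.addressof(oid)
    member_addr = ctypes.addressof(oid.hash)
    return struct_addr == member_addr
```

Because the member sits at offset zero, a pointer to either one holds the same address, which is why the mismatch went unnoticed behind a void pointer.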

But it's worth cleaning up (especially since once we switch away from
sha1hash() on the same line, accessing the hash member will look doubly
out of place).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: only run 'gc' once when fetching multiple remotes Nguyễn Thái Ngọc Duy Wed, 19 Jun 2019 09:46:30 +0000 (16:46 +0700)

fetch: only run 'gc' once when fetching multiple remotes

In multiple remotes mode, git-fetch is launched for n-1 remotes and the
last remote is handled by the current process. Each of these processes
will in turn run 'gc' at the end.

This is not really a problem because even if several 'gc --auto' processes
run at the same time we still handle it correctly. It does show multiple
"auto packing in the background" messages though. And we may waste some
resources when gc actually runs because we still do some stuff before
checking the lock and moving it to background.

So let's try to avoid that. We should only need one 'gc' run after all
objects and references are added anyway. Add a new option --no-auto-gc
that will be used by those n-1 processes. 'gc --auto' will always run on
the main fetch process (*).

(*) even if we fetch remotes in parallel at some point in future, this
should still be fine because we should "join" all those processes
before this step.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: test verify across alternates Derrick Stolee Tue, 18 Jun 2019 18:14:36 +0000 (11:14 -0700)

commit-graph: test verify across alternates

The 'git commit-graph verify' subcommand loads a commit-graph from
a given object directory instead of using the standard method
prepare_commit_graph(). During development of load_commit_graph_chain(),
a version did not include prepare_alt_odb() as it was previously
run by prepare_commit_graph() in most cases.

Add a test that prevents that mistake from happening again.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: normalize commit-graph filenames Derrick Stolee Tue, 18 Jun 2019 18:14:36 +0000 (11:14 -0700)

commit-graph: normalize commit-graph filenames

When writing commit-graph files, we append path data to an
object directory, which may be specified by the user via the
'--object-dir' option. If the user supplies a trailing slash,
or some other alternative path format, the resulting path may
be usable for writing to the correct location. However, when
expiring graph files from the <obj-dir>/info/commit-graphs
directory during a write, we need to compare paths with exact
string matches.

Normalize the commit-graph filenames to avoid ambiguity. This
creates extra allocations, but this is a constant multiple of
the number of commit-graph files, which should be a number in
the single digits.

Further normalize the object directory in the context. Due to
a comparison between g->obj_dir and ctx->obj_dir in
split_graph_merge_strategy(), a trailing slash would prevent
any merging of layers within the same object directory. The
check is there to ensure we do not merge across alternates.
Update the tests to include a case with this trailing slash
problem.
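
The trailing-slash problem can be sketched as below. The helper names are hypothetical, and os.path.normpath is only a stand-in for whatever normalization the C code performs; the point is that exact string comparison is only reliable after both sides are normalized.

```python
import os.path


def normalize_dir(path):
    """Collapse trailing and duplicate separators (stand-in normalization)."""
    return os.path.normpath(path)


def same_object_dir(a, b):
    """Exact string comparison, applied only after normalizing both paths."""
    return normalize_dir(a) == normalize_dir(b)
```

Without normalization, ".git/objects/" and ".git/objects" compare as different strings, which is the situation that would prevent layers in the same object directory from being merged.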

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: test --split across alternate without... Derrick Stolee Tue, 18 Jun 2019 18:14:35 +0000 (11:14 -0700)

commit-graph: test --split across alternate without --split

We allow sharing commit-graph files across alternates. When we are
writing a split commit-graph, we allow adding tip graph files that
are not in the alternate, but include commits from our local repo.

However, if our alternate is not using the split commit-graph format,
its file is at .git/objects/info/commit-graph and we are trying to
write files in .git/objects/info/commit-graphs/graph-{hash}.graph.

We already have logic to ensure we do not merge across alternate
boundaries, but we also cannot have a commit-graph chain to our
alternate if it uses the old filename structure.

Create a test that verifies we create a new split commit-graph
with only one level and we do not modify the existing commit-graph
in the alternate.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: test octopus merges with --split Derrick Stolee Tue, 18 Jun 2019 18:14:34 +0000 (11:14 -0700)

commit-graph: test octopus merges with --split

Octopus merges require an extra chunk of data in the commit-graph
file format. Create a test that ensures the new --split option
continues to work with an octopus merge. Specifically, ensure
that the octopus merge has parents across layers to truly check
that our graph position logic holds up correctly.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: clean up chains after flattened write Derrick Stolee Tue, 18 Jun 2019 18:14:33 +0000 (11:14 -0700)

commit-graph: clean up chains after flattened write

If we write a commit-graph file without the split option, then
we write to $OBJDIR/info/commit-graph and start to ignore
the chains in $OBJDIR/info/commit-graphs/.

Unlink the commit-graph-chain file and expire the graph-{hash}.graph
files in $OBJDIR/info/commit-graphs/ during every write.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: verify chains with --shallow mode Derrick Stolee Tue, 18 Jun 2019 18:14:32 +0000 (11:14 -0700)

commit-graph: verify chains with --shallow mode

If we wrote a commit-graph chain, we only modified the tip file in
the chain. It is valuable to verify what we wrote, but not waste
time checking files we did not write.

Add a '--shallow' option to the 'git commit-graph verify' subcommand
and check that it does not read the base graph in a two-file chain.

Making the verify subcommand read from a chain of commit-graphs takes
some rearranging of the builtin code.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: create options for split files Derrick Stolee Tue, 18 Jun 2019 18:14:32 +0000 (11:14 -0700)

commit-graph: create options for split files

The split commit-graph feature is now fully implemented, but needs
some more run-time configurability. Allow direct callers to 'git
commit-graph write --split' to specify the values used in the
merge strategy and the expire time.

Update the documentation to specify these values.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: expire commit-graph files Derrick Stolee Tue, 18 Jun 2019 18:14:31 +0000 (11:14 -0700)

commit-graph: expire commit-graph files

As we merge commit-graph files in a commit-graph chain, we should clean
up the files that are no longer used.

This change introduces an 'expiry_window' value to the context, which is
always zero (for now). We then check the modified time of each
graph-{hash}.graph file in the $OBJDIR/info/commit-graphs folder and
unlink the files that are older than the expiry_window.

Since this is always zero, this immediately clears all unused graph
files. We will update the value to match a config setting in a future
change.
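
A rough sketch of the expiry check described above, in Python rather than the project's C. The function name, the "in_use" set and the "now" parameter are hypothetical; only the graph-{hash}.graph naming, the modified-time comparison and the zero default for the window come from the description.

```python
import os
import time


def expire_graph_files(graph_dir, in_use, expiry_window=0, now=None):
    """Unlink graph-*.graph files that are not in `in_use` and whose
    modified time is older than `expiry_window` seconds.
    Returns the names that were removed."""
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(graph_dir)):
        if not (name.startswith("graph-") and name.endswith(".graph")):
            continue  # leave the chain file and anything else alone
        if name in in_use:
            continue  # still referenced by the chain
        path = os.path.join(graph_dir, name)
        if os.stat(path).st_mtime <= now - expiry_window:
            os.unlink(path)
            removed.append(name)
    return removed
```

With expiry_window left at zero, every unreferenced graph file qualifies, matching the "immediately clears all unused graph files" behavior.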

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: allow cross-alternate chains Derrick Stolee Tue, 18 Jun 2019 18:14:30 +0000 (11:14 -0700)

commit-graph: allow cross-alternate chains

In an environment like a fork network, it is helpful to have a
commit-graph chain that spans both the base repo and the fork repo. The
fork is usually a small set of data on top of the large repo, but
sometimes the fork is much larger. For example, git-for-windows/git has
almost double the number of commits as git/git because it rebases its
commits on every major version update.

To allow cross-alternate commit-graph chains, we need a few pieces:

1. When looking for a graph-{hash}.graph file, check all alternates.

2. When merging commit-graph chains, do not merge across alternates.

3. When writing a new commit-graph chain based on a commit-graph file
in another object directory, do not allow success if the base file
has the name "commit-graph" instead of
"commit-graphs/graph-{hash}.graph".

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: merge commit-graph chains Derrick Stolee Tue, 18 Jun 2019 18:14:29 +0000 (11:14 -0700)

commit-graph: merge commit-graph chains

When searching for a commit in a commit-graph chain of G graphs with N
commits, the search takes O(G log N) time. If we always add a new tip
graph with every write, the linear G term will start to dominate and
slow the lookup process.

To keep lookups fast, but also keep most incremental writes fast, create
a strategy for merging levels of the commit-graph chain. The strategy is
detailed in the commit-graph design document, but is summarized by these
two conditions:

1. If the number of commits we are adding is more than half the number
of commits in the graph below, then merge with that graph.

2. If we are writing more than 64,000 commits into a single graph,
then merge with all lower graphs.

The numeric values in the conditions above are currently constant, but
can become config options in a future update.
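
The two conditions can be sketched as a small planning function. This is a simplified model, not the C implementation: the names are hypothetical, a chain is reduced to a list of per-layer commit counts (base first, tip last), and it ignores duplicate commits and the existence checks.

```python
SIZE_MULTIPLE = 2    # "more than half" written as new * 2 > below
MAX_COMMITS = 64000  # threshold from the second condition


def plan_split_write(chain, new_commits):
    """Return the layer sizes after writing `new_commits` commits on top
    of `chain`, merging downward while either condition holds."""
    layers = list(chain)
    size = new_commits
    while layers and (size * SIZE_MULTIPLE > layers[-1] or size > MAX_COMMITS):
        size += layers.pop()  # absorb the layer directly below
    layers.append(size)
    return layers
```

For example, adding 60 commits on top of layers of 1000 and 100 merges with the 100-commit layer only, while adding 10 commits creates a new tip and leaves the chain three layers deep.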

As we merge levels of the commit-graph chain, check that the commits
still exist in the repository. A garbage-collection operation may have
removed those commits from the object store and we do not want to
persist them in the commit-graph chain. This is a non-issue if the
'git gc' process wrote a new, single-level commit-graph file.

After we merge levels, the old graph-{hash}.graph files are no longer
referenced by the commit-graph-chain file. We will expire these files in
a future change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: add --split option to builtin Derrick Stolee Tue, 18 Jun 2019 18:14:28 +0000 (11:14 -0700)

commit-graph: add --split option to builtin

Add a new "--split" option to the 'git commit-graph write' subcommand. This
option allows the optional behavior of writing a commit-graph chain.

The current behavior will add a tip commit-graph containing any commits that
are not in the existing commit-graph or commit-graph chain. Later changes
will allow merging the chain and expiring out-dated files.

Add a new test script (t5324-split-commit-graph.sh) that demonstrates this
behavior.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: write commit-graph chains Derrick Stolee Tue, 18 Jun 2019 18:14:27 +0000 (11:14 -0700)

commit-graph: write commit-graph chains

Extend write_commit_graph() to write a commit-graph chain when given the
COMMIT_GRAPH_SPLIT flag.

This implementation is purposefully simplistic in how it creates a new
chain. The commits not already in the chain are added to a new tip
commit-graph file.

Much of the logic around writing a graph-{hash}.graph file and updating
the commit-graph-chain file is the same as the commit-graph file case.
However, there are several places where we need to do some extra logic
in the split case.

Track the list of graph filenames before and after the planned write.
This will be more important when we start merging graph files, but it
also allows us to upgrade our commit-graph file to the appropriate
graph-{hash}.graph file when we upgrade to a chain of commit-graphs.

Note that we use the eighth byte of the commit-graph header to store the
number of base graph files. This determines the length of the base
graphs chunk.
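
As an illustration of the header byte, here is a sketch that packs and reads an 8-byte header. Only the meaning of the eighth byte comes from the text above. The other fields (a 4-byte "CGPH" signature, then version, hash version and chunk count bytes) reflect my reading of the commit-graph format document, and the 20-byte hash length assumes SHA-1.

```python
import struct

# signature, version, hash version, chunk count, base graph count
HEADER = struct.Struct(">4sBBBB")
HASH_LEN = 20  # assumption: SHA-1 sized hashes


def pack_header(num_chunks, num_base_graphs):
    """Build an 8-byte header with the base graph count in the last byte."""
    return HEADER.pack(b"CGPH", 1, 1, num_chunks, num_base_graphs)


def base_graphs_chunk_len(header):
    """Length in bytes of the base graphs chunk implied by the header."""
    num_base_graphs = header[7]  # the eighth byte
    return num_base_graphs * HASH_LEN
```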

A subtle change of behavior with the new logic is that we do not write a
commit-graph if our commit list is empty. This extends to the typical
case, which is reflected in t5318-commit-graph.sh.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: rearrange chunk count logic Derrick Stolee Tue, 18 Jun 2019 18:14:27 +0000 (11:14 -0700)

commit-graph: rearrange chunk count logic

The number of chunks in a commit-graph file can change depending on
whether we need the Extra Edges Chunk. We are going to add more optional
chunks, and it will be helpful to rearrange this logic around the chunk
count before doing so.

Specifically, we need to finalize the number of chunks before writing
the commit-graph header. Further, we need to fill out the chunk lookup
table dynamically; tracking "num_chunks" as we add optional chunks
makes it easier to add more optional chunks in the future.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: add base graphs chunk Derrick Stolee Tue, 18 Jun 2019 18:14:26 +0000 (11:14 -0700)

commit-graph: add base graphs chunk

To quickly verify a commit-graph chain is valid on load, we will
read from the new "Base Graphs Chunk" of each file in the chain.
This will prevent accidentally loading incorrect data from manually
editing the commit-graph-chain file or renaming graph-{hash}.graph
files.

The commit_graph struct already had an object_id struct "oid", but
it was never initialized or used. Add a line to read the hash from
the end of the commit-graph file and into the oid member.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: load commit-graph chains Derrick Stolee Tue, 18 Jun 2019 18:14:25 +0000 (11:14 -0700)

commit-graph: load commit-graph chains

Prepare the logic for reading a chain of commit-graphs.

First, look for a file at $OBJDIR/info/commit-graph. If it exists,
then use that file and stop.

Next, look for the chain file at $OBJDIR/info/commit-graphs/commit-graph-chain.
If this file exists, then load the hash values as line-separated values in that
file and load $OBJDIR/info/commit-graphs/graph-{hash[i]}.graph for each hash[i]
in that file. The file is given in order, so the first hash corresponds to the
"base" file and the final hash corresponds to the "tip" file.
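
The lookup order above can be sketched as follows. The function name is hypothetical and the sketch only resolves which files would be loaded; it does not parse them or validate the chain.

```python
from pathlib import Path


def find_graph_files(objdir):
    """Return the graph files to load, base first and tip last."""
    info = Path(objdir) / "info"
    single = info / "commit-graph"
    if single.exists():
        return [single]  # a single-file commit-graph wins; stop here
    chain_file = info / "commit-graphs" / "commit-graph-chain"
    if not chain_file.exists():
        return []
    hashes = chain_file.read_text().split()  # one hash per line
    return [info / "commit-graphs" / f"graph-{h}.graph" for h in hashes]
```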

This implementation assumes that all of the graph-{hash}.graph files are in
the same object directory as the commit-graph-chain file. This will be updated
in a future change. This change is purposefully simple so we can isolate the
different concerns.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: rename commit_compare to oid_compare Derrick Stolee Tue, 18 Jun 2019 18:14:24 +0000 (11:14 -0700)

commit-graph: rename commit_compare to oid_compare

The helper function commit_compare() actually compares object_id
structs, not commits. A future change to commit-graph.c will need
to sort commit structs, so rename this function in advance.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: prepare for commit-graph chains Derrick Stolee Tue, 18 Jun 2019 18:14:24 +0000 (11:14 -0700)

commit-graph: prepare for commit-graph chains

To prepare for a chain of commit-graph files, augment the
commit_graph struct to point to a base commit_graph. As we load
commits from the graph, we may actually want to read from a base
file according to the graph position.

The "graph position" of a commit is given by concatenating the
lexicographic commit orders from each of the commit-graph files in
the chain. This means that we must distinguish two values:

* lexicographic index: the position within the lexicographic
order in a single commit-graph file.

* graph position: the position within the concatenated order
of multiple commit-graph files.

Given the lexicographic index of a commit in a graph, we can
compute the graph position by adding the number of commits in
the lower-level graphs. To find the lexicographic index of
a commit, we subtract the number of commits in lower-level graphs.
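
The relationship between the two values can be sketched with a toy model in which each layer of the chain is a sorted list of object ids (base first). The function names are hypothetical and the real code works on the on-disk lookup table rather than Python lists.

```python
from bisect import bisect_left


def find_graph_position(chain, oid):
    """Return the graph position of `oid`, or None if it is in no layer."""
    offset = 0  # number of commits in lower-level graphs
    for layer in chain:
        index = bisect_left(layer, oid)  # candidate lexicographic index
        if index < len(layer) and layer[index] == oid:
            return offset + index
        offset += len(layer)
    return None


def split_position(chain, position):
    """Map a graph position back to (layer number, lexicographic index)."""
    for layer_no, layer in enumerate(chain):
        if position < len(layer):
            return layer_no, position
        position -= len(layer)  # subtract the commits in this lower layer
    raise IndexError("position beyond the end of the chain")
```

Each layer costs one binary search, so a lookup over several layers grows with the number of layers as well as with the number of commits.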

While here, change insert_parent_or_die() to take a uint32_t
position, as that is the type used by its only caller and that
makes more sense with the limits in the commit-graph format.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: document commit-graph chains Derrick Stolee Tue, 18 Jun 2019 18:14:23 +0000 (11:14 -0700)

commit-graph: document commit-graph chains

Add a basic description of commit-graph chains. More details about the
feature will be added as we add functionality. This introduction gives a
high-level overview of the goals of the feature and the basic layout of
commit-graph chains.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

stash: fix show referencing stash index Thomas Gummerer Sat, 15 Jun 2019 11:26:18 +0000 (12:26 +0100)

stash: fix show referencing stash index

In the conversion of 'stash show' to C in dc7bd382b1 ("stash: convert
show to builtin", 2019-02-25), 'git stash show <n>', where n is the
index of a stash, got broken if n is not a file or a valid revision by
itself.

'stash show' accepts any flag 'git diff' accepts for changing the
output format. Internally we use 'setup_revisions()' to parse these
command line flags. Currently we pass the whole argv through to
'setup_revisions()', which includes the stash index.

As the stash index is not a valid revision or a file in the working
tree in most cases however, this 'setup_revisions()' call (and thus
the whole command) ends up failing if we use this form of 'git stash
show'.

Instead of passing the whole argv to 'setup_revisions()', only pass
the flags (and the command name) through, while excluding the stash
reference. The stash reference is parsed (and validated) in
'get_stash_info()' already.
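
The separation of flags from the stash reference can be sketched like this. It is a simplified model with a hypothetical function name; treating every argument that starts with "-" as a flag is an assumption of the sketch, not a statement about the exact test the C code applies.

```python
def split_show_args(args):
    """Split 'stash show' arguments into diff flags and stash references."""
    flags = [a for a in args if a.startswith("-")]
    stash_refs = [a for a in args if not a.startswith("-")]
    return flags, stash_refs
```

Only the first list would be handed to the revision parser; the second is resolved separately as a stash reference.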

This separate parsing also means that we currently do produce the
correct output if the command succeeds.

Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ref-filter: sort detached HEAD lines firstly Matthew DeVore Tue, 18 Jun 2019 22:29:15 +0000 (15:29 -0700)

ref-filter: sort detached HEAD lines firstly

Before this patch, "git branch" would put "(HEAD detached...)" and "(no
branch, rebasing...)" lines before all the other branches *in most
cases* except for when using Chinese-language messages. zh_CN generally
uses a full-width "(" symbol (codepoint FF08) to match the full-width
proportions of Chinese characters, and the translated strings we had did
use them. This meant that the detached HEAD line would appear after all
local refs and even after the remote refs if there were any.
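
The ordering effect is easy to reproduce. The sketch below uses Python's code point ordering as a stand-in for the comparison the branch listing uses, and the branch names and detached HEAD text are made up for the example.

```python
ASCII_LINE = "(HEAD detached at v1.0)"
FULLWIDTH_LINE = "\uff08HEAD detached at v1.0\uff09"  # starts with U+FF08


def sort_branch_lines(lines):
    """Sort display lines by plain code point order."""
    return sorted(lines)
```

The ASCII "(" (0x28) sorts before any letter, so the detached HEAD line comes first; U+FF08 sorts after every ASCII character, so the same line lands after "main" and "origin/main".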

AFAIK, it is sometimes not jarring to see the half-width parenthesis in
"full-width" text as in the CJK languages, for instance when there are
no characters preceding or following the parenthesized text fragment. By
removing the parenthesis from the localizable text, we can share strings
with wt-status.c and remove a cautionary comment to translators.

Remove the ( from the localizable portion of messages so the sorting
happens properly regardless of locale.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

wt-status.h: drop stdio.h include Jeff King Tue, 18 Jun 2019 15:54:19 +0000 (11:54 -0400)

wt-status.h: drop stdio.h include

We started including stdio.h to pick up the declaration of "FILE" in
f26a001226 (Enable wt-status output to a given FILE pointer.,
2007-09-17). But there's no need, since headers can assume that
git-compat-util.h has been included, which covers stdio.

This should just be redundant, and not hurting anything (like pulling in
includes out of order) because C files are supposed to always include
git-compat-util.h first. But it's worth cleaning up to model good
behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

verify-tag: drop signal.h include Jeff King Tue, 18 Jun 2019 15:54:09 +0000 (11:54 -0400)

verify-tag: drop signal.h include

There's no reason verify-tag.c needs to include signal.h. It's already
in git-compat-util.h, which we properly include as the first header.
And there doesn't seem to be a particular reason for this include; it's
just an artifact from the file creation in 2ae68fcb78 (Make verify-tag a
builtin., 2007-07-27).

Likewise verify-commit.c has the same issue, probably because it was
created using verify-tag as a template in d07b00b7f3 (verify-commit:
scriptable commit signature verification, 2014-06-23).

These includes are probably just redundant, and not hurting anything by
circumventing the order that git-compat-util.h tries to impose, since
we'll always have loaded git-compat-util by the time we get to these. So
this is just a cleanup, and shouldn't fix or break any platforms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

wrapper: avoid undefined behaviour in macOS Carlo Marcelo Arenas Belón Sun, 16 Jun 2019 18:40:03 +0000 (11:40 -0700)

wrapper: avoid undefined behaviour in macOS

0620b39b3b ("compat: add a mkstemps() compatibility function", 2009-05-31)
included a function based on code from libiberty which would result in
undefined behaviour in platforms where timeval's tv_usec is a 32-bit signed
type as shown by:

wrapper.c:505:31: runtime error: left shift of 594546 by 16 places cannot be represented in type '__darwin_suseconds_t' (aka 'int')

Interestingly, the version of this code from gcc never had this bug: the
code had a cast that would have prevented the issue (at least on 64-bit
platforms), but it was misapplied.

Change the cast to uint64_t so it also works on 32-bit platforms.
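
The arithmetic behind the runtime error can be checked with plain integers. This is only a model of the value ranges (Python integers do not overflow); the helper names are hypothetical and the shift amount and sample value are taken from the error message quoted above.

```python
INT32_MAX = 2**31 - 1
UINT64_MAX = 2**64 - 1


def shifted_usec(tv_usec, shift=16):
    """The mathematical result of tv_usec << shift."""
    return tv_usec << shift


def fits_int32(value):
    """True if `value` is representable in a 32-bit signed integer."""
    return -2**31 <= value <= INT32_MAX
```

Shifting 594546 left by 16 places gives a value far above INT32_MAX, so it cannot be represented in a 32-bit signed type, while it fits comfortably in 64 unsigned bits.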

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

interpret-trailers: load default config Jeff King Wed, 19 Jun 2019 03:37:28 +0000 (23:37 -0400)

interpret-trailers: load default config

The interpret-trailers program does not do the usual loading of config
via git_default_config(), and thus does not respect many of the usual
options. In particular, we will not load core.commentChar, even though
the underlying trailer code uses its value.

This can be seen in the accompanying test, where setting
core.commentChar to anything besides "#" results in a failure to treat
the comments correctly.

Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

show --continue/skip etc. consistently in synopsis Phillip Wood Mon, 17 Jun 2019 09:17:09 +0000 (10:17 +0100)

show --continue/skip etc. consistently in synopsis

Command mode options, from which the user chooses one among many, are
listed like this in the documentation:

git am (--continue | --skip | --abort | --quit)

They are listed on a single line and in parentheses, because they
are not optional.

But documentation pages for some commands deviate from this norm.
Fix the merge and rebase docs to match this style.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

use COPY_ARRAY for copying arrays René Scharfe Sat, 15 Jun 2019 18:36:35 +0000 (20:36 +0200)

use COPY_ARRAY for copying arrays

Convert calls of memcpy(3) to use COPY_ARRAY, which shortens and
simplifies the code a bit.

Patch generated by Coccinelle and contrib/coccinelle/array.cocci.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

coccinelle: use COPY_ARRAY for copying arrays René Scharfe Sat, 15 Jun 2019 18:32:58 +0000 (20:32 +0200)

coccinelle: use COPY_ARRAY for copying arrays

The current semantic patch for COPY_ARRAY transforms memcpy(3) calls on
pointers, but Coccinelle distinguishes them from arrays. It already
contains three rules to handle the options for sizeof (i.e. source,
destination and type), and handling arrays as source and destination
would require four times as many rules if we enumerated all cases.

We also don't handle array subscripts, and supporting that would
increase the number of rules by another factor of four. (An isomorphism
telling Coccinelle that "sizeof x[...]" is equivalent to "sizeof *x"
would be nice.)

Support arrays and array subscripts, but keep the number of rules down
by adding normalization steps: First turn array subscripts into
dereferences, then determine the types of expressions used with sizeof and
replace them with these types, and then convert the different possible
combinations of arrays and pointers with memcpy(3) to COPY_ARRAY.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

fsmonitor: avoid signed integer overflow / infinite... Carlo Marcelo Arenas Belón Sat, 15 Jun 2019 16:11:35 +0000 (09:11 -0700)

fsmonitor: avoid signed integer overflow / infinite loop

883e248b8a ("fsmonitor: teach git to optionally utilize a file system
monitor to speed up detecting new or changed files.", 2017-09-22) uses
an int in a loop that would wrap if index_state->cache_nr (unsigned)
is bigger than INT_MAX.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
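
The hazard described in the commit above can be sketched as follows. The struct and function names here are hypothetical and exist only for illustration. Giving the counter the same unsigned type as the entry count is one common remedy, not necessarily the exact change made by this patch.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical miniature of an index whose entry count is unsigned. */
struct sketch_index {
	unsigned int cache_nr;
	const unsigned char *flags;
};

/*
 * Returns nonzero when a plain int counter could not count up to nr
 * without passing INT_MAX, which is the situation the commit describes.
 */
static int int_counter_would_overflow(unsigned int nr)
{
	return nr > (unsigned int)INT_MAX;
}

/*
 * Count entries with a nonzero flag. The counter has the same unsigned
 * type as cache_nr, so the comparison "i < cache_nr" never mixes signed
 * and unsigned values and the counter can reach every valid index.
 */
static unsigned int count_flagged(const struct sketch_index *istate)
{
	unsigned int i, count = 0;

	for (i = 0; i < istate->cache_nr; i++)
		if (istate->flags[i])
			count++;
	return count;
}
```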

diff | tree

test-hashmap: remove 'hash' command Christian Couder Sat, 15 Jun 2019 10:07:02 +0000 (12:07 +0200)

test-hashmap: remove 'hash' command

If hashes like strhash() are updated, for example to use a different
hash algorithm, we should not have to be updating t0011 to change out
the hashes.

As long as hashmap can store and retrieve values, and performs
well, we should not care what the values of the hashes are. Let's just
focus on the externally visible behavior instead.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

oidmap: use sha1hash() instead of static hash() function Christian Couder Sat, 15 Jun 2019 10:07:01 +0000 (12:07 +0200)

oidmap: use sha1hash() instead of static hash() function

Get rid of the static hash() function in oidmap.c which is redundant
with sha1hash(). Use sha1hash() directly instead.

Let's be more consistent and not use several hash functions doing
nearly exactly the same thing.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

t: add t0016-oidmap.sh Christian Couder Sat, 15 Jun 2019 10:07:00 +0000 (12:07 +0200)

t: add t0016-oidmap.sh

Add actual tests for operations using `struct oidmap` from oidmap.{c,h}.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

t/helper: add test-oidmap.c Christian Couder Sat, 15 Jun 2019 10:06:59 +0000 (12:06 +0200)

t/helper: add test-oidmap.c

This new helper is very similar to "test-hashmap.c" and will help
test how `struct oidmap` from oidmap.{c,h} can be used.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

The second batch Junio C Hamano Mon, 17 Jun 2019 17:16:10 +0000 (10:16 -0700)

The second batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff | tree

Merge branch 'xl/record-partial-clone-origin' Junio C Hamano Mon, 17 Jun 2019 17:15:20 +0000 (10:15 -0700)

Merge branch 'xl/record-partial-clone-origin'

When creating a partial clone, the object filtering criteria are
recorded for the origin of the clone, but this incorrectly used a
hardcoded name "origin" to name that remote; it has been corrected
to honor the "--origin <name>" option.

* xl/record-partial-clone-origin:
clone: respect user supplied origin name when setting up partial clone

diff | tree

Merge branch 'pb/request-pull-verify-remote-ref' Junio C Hamano Mon, 17 Jun 2019 17:15:20 +0000 (10:15 -0700)

Merge branch 'pb/request-pull-verify-remote-ref'

"git request-pull" learned to warn when the ref we ask them to pull
from in the local repository and in the published repository are
different.

* pb/request-pull-verify-remote-ref:
request-pull: warn if the remote object is not the same as the local one
request-pull: quote regex metacharacters in local ref

diff | tree

Merge branch 'mm/p4-unshelve-windows-fix' Junio C Hamano Mon, 17 Jun 2019 17:15:19 +0000 (10:15 -0700)

Merge branch 'mm/p4-unshelve-windows-fix'

The command line to invoke a "git cat-file" command from inside
"git p4" was not properly quoted to protect a caret, and so ran a
broken command on Windows; this has been corrected.

* mm/p4-unshelve-windows-fix:
p4 unshelve: fix "Not a valid object name HEAD0" on Windows

diff | tree

Merge branch 'po/git-help-on-git-itself' Junio C Hamano Mon, 17 Jun 2019 17:15:19 +0000 (10:15 -0700)

Merge branch 'po/git-help-on-git-itself'

"git help git" was hard to discover (well, at least for some
people).

* po/git-help-on-git-itself:
Doc: git.txt: remove backticks from link and add git-scm.com/docs
git.c: show usage for accessing the git(1) help page

diff | tree

Merge branch 'es/first-contrib-tutorial' Junio C Hamano Mon, 17 Jun 2019 17:15:18 +0000 (10:15 -0700)

Merge branch 'es/first-contrib-tutorial'

A new tutorial specifically targeting aspiring git-core
developers.

* es/first-contrib-tutorial:
doc: add some nit fixes to MyFirstContribution
documentation: add anchors to MyFirstContribution
documentation: add tutorial for first contribution

diff | tree

Merge branch 'bb/unicode-12.1-reiwa' Junio C Hamano Mon, 17 Jun 2019 17:15:18 +0000 (10:15 -0700)

Merge branch 'bb/unicode-12.1-reiwa'

Update to Unicode 12.1 width table.

* bb/unicode-12.1-reiwa:
unicode: update the width tables to Unicode 12.1

diff | tree

Merge branch 'sw/git-p4-unshelve-branched-files' Junio C Hamano Mon, 17 Jun 2019 17:15:18 +0000 (10:15 -0700)

Merge branch 'sw/git-p4-unshelve-branched-files'

"git p4" update.

* sw/git-p4-unshelve-branched-files:
git-p4: allow unshelving of branched files

diff | tree

Merge branch 'js/fsmonitor-unflake' Junio C Hamano Mon, 17 Jun 2019 17:15:17 +0000 (10:15 -0700)

Merge branch 'js/fsmonitor-unflake'

The data collected by fsmonitor was not properly written back to
the on-disk index file, breaking t7519 tests occasionally, which
has been corrected.

* js/fsmonitor-unflake:
mark_fsmonitor_valid(): mark the index as changed if needed
fill_stat_cache_info(): prepare for an fsmonitor fix

diff | tree

Merge branch 'ds/topo-traversal-using-commit-graph' Junio C Hamano Mon, 17 Jun 2019 17:15:17 +0000 (10:15 -0700)

Merge branch 'ds/topo-traversal-using-commit-graph'

Prepare for use of the reachability index in the topological walker
that works on a range (A..B).

* ds/topo-traversal-using-commit-graph:
revision: keep topo-walk free of unintersting commits
revision: use generation for A..B --topo-order queries

diff | tree

Merge branch 'bl/userdiff-octave' Junio C Hamano Mon, 17 Jun 2019 17:15:17 +0000 (10:15 -0700)

Merge branch 'bl/userdiff-octave'

The patterns "git diff/grep" use to extract funcname and word
boundaries for Matlab have been extended to cover Octave, which is more
or less equivalent.

* bl/userdiff-octave:
userdiff: fix grammar and style issues
userdiff: add Octave

diff | tree

Merge branch 'ba/clone-remote-submodules' Junio C Hamano Mon, 17 Jun 2019 17:15:17 +0000 (10:15 -0700)

Merge branch 'ba/clone-remote-submodules'

"git clone --recurse-submodules" learned to set up the submodules
to ignore commit object names recorded in the superproject gitlink
and instead use the commits that happen to be at the tip of the
remote-tracking branches from the get-go, by passing the new
"--remote-submodules" option.

* ba/clone-remote-submodules:
clone: add `--remote-submodules` flag

diff | tree

Merge branch 'vv/merge-squash-with-explicit-commit' Junio C Hamano Mon, 17 Jun 2019 17:15:17 +0000 (10:15 -0700)

Merge branch 'vv/merge-squash-with-explicit-commit'

"git merge --squash" is designed to update the working tree and the
index without creating the commit, and this cannot be countermanded
by adding the "--commit" option; the command now refuses to work
when both options are given.

* vv/merge-squash-with-explicit-commit:
merge: refuse --commit with --squash

diff | tree

Merge branch 'js/bundle-verify-require-object-store' Junio C Hamano Mon, 17 Jun 2019 17:15:16 +0000 (10:15 -0700)

Merge branch 'js/bundle-verify-require-object-store'

"git bundle verify" needs to see if prerequisite objects exist in
the receiving repository, but the command did not check if we are
in a repository upfront, which has been corrected.

* js/bundle-verify-require-object-store:
bundle verify: error out if called without an object database

diff | tree

Merge branch 'js/bisect-helper-check-get-oid-return... Junio C Hamano Mon, 17 Jun 2019 17:15:16 +0000 (10:15 -0700)

Merge branch 'js/bisect-helper-check-get-oid-return-value'

Code cleanup.

* js/bisect-helper-check-get-oid-return-value:
bisect--helper: verify HEAD could be parsed before continuing