Andrew's git - gitweb.git/log

Use the right 'struct repository' instead of the_repositoryNguyễn Thái Ngọc Duy Thu, 27 Jun 2019 09:28:52 +0000 (16:28 +0700)

Use the right 'struct repository' instead of the_repository

There are a couple of places where 'struct repository' is already passed
around, but the_repository is still used. Use the right repo.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

match-trees.c: remove the_repo from shift_tree*()Nguyễn Thái Ngọc Duy Thu, 27 Jun 2019 09:28:51 +0000 (16:28 +0700)

match-trees.c: remove the_repo from shift_tree*()

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tree-walk.c: remove the_repo from get_tree_entry_follow... Nguyễn Thái Ngọc Duy Thu, 27 Jun 2019 09:28:50 +0000 (16:28 +0700)

tree-walk.c: remove the_repo from get_tree_entry_follow_symlinks()

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tree-walk.c: remove the_repo from get_tree_entry()Nguyễn Thái Ngọc Duy Thu, 27 Jun 2019 09:28:49 +0000 (16:28 +0700)

tree-walk.c: remove the_repo from get_tree_entry()

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tree-walk.c: remove the_repo from fill_tree_descriptor()Nguyễn Thái Ngọc Duy Thu, 27 Jun 2019 09:28:48 +0000 (16:28 +0700)

tree-walk.c: remove the_repo from fill_tree_descriptor()

While at there, clean up the_repo usage in builtin/merge-tree.c a tiny
bit.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sha1-file.c: remove the_repo from read_object_with_refe... Nguyễn Thái Ngọc Duy Thu, 27 Jun 2019 09:28:47 +0000 (16:28 +0700)

sha1-file.c: remove the_repo from read_object_with_reference()

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

trace2: correct typo in technical documentationCarlo Marcelo Arenas Belón Wed, 26 Jun 2019 20:03:03 +0000 (13:03 -0700)

trace2: correct typo in technical documentation

an apparent typo for the environment variable was included with 81567caf87
("trace2: update docs to describe system/global config settings",
2019-04-15), and was missed when renamed variables by e4b75d6a1d
("trace2: rename environment variables to GIT_TRACE2*", 2019-05-19)

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

status: do not report errors in sequencer/todoPhillip Wood Thu, 27 Jun 2019 14:12:46 +0000 (07:12 -0700)

status: do not report errors in sequencer/todo

commit 4a72486de9 ("fix cherry-pick/revert status after commit",
2019-04-16) used parse_insn_line() to parse the first line of the todo
list to check if it was a pick or revert. However if the todo list is
left over from an old cherry-pick or revert and references a commit that
no longer exists then parse_insn_line() prints an error message which is
confusing for users [1]. Instead parse just the command name so that the
user is alerted to the presence of stale sequencer state by status
reporting that a cherry-pick or revert is in progress.

Note that we should not be leaving stale sequencer state lying around
(or at least not as often) after commit b07d9bfd17 ("commit/reset: try
to clean up sequencer state", 2019-04-16). However the user may still
have stale state that predates that commit.

Also avoid printing an error message if for some reason the user has a
file called `sequencer` in $GIT_DIR.

[1] https://public-inbox.org/git/3bc58c33-4268-4e7c-bf1a-fe349b3cb037@www.fastmail.com/

Reported-by: Espen Antonsen <espen@inspired.no>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sequencer: factor out todo command name parsingPhillip Wood Thu, 27 Jun 2019 14:12:45 +0000 (07:12 -0700)

sequencer: factor out todo command name parsing

Factor out the code that parses the name of the command at the start of
each line in the todo file into its own function so that it can be used
in the next commit.

As I don't want to burden other callers with having to pass in a pointer
to the end of the line the test for an abbreviated command is
changed. This change should not affect the behavior. Instead of testing
`eol == bol + 1` the new code checks for the end of the line by testing
for '\n', '\r' or '\0' following the abbreviated name. To avoid reading
past the end of an empty string it also checks that there is actually a
single character abbreviation before testing if it matches. This
prevents it from matching '\0' as the abbreviated name and then trying
to read another character.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sequencer: always allow tab after command namePhillip Wood Thu, 27 Jun 2019 14:12:44 +0000 (07:12 -0700)

sequencer: always allow tab after command name

The code that parses the todo list allows an unabbreviated command name
to be followed by a space or a tab, but if the command name is
abbreviated it only allows a space after it. Fix this inconsistency.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5551: test usage of chunked encoding explicitlyJonathan Tan Wed, 5 Jun 2019 19:26:24 +0000 (12:26 -0700)

t5551: test usage of chunked encoding explicitly

When run using GIT_TEST_PROTOCOL_VERSION=2, a test in t5551 fails
because 4 POSTs (probe, ls-refs, probe, fetch) are sent instead of 2
(probe, fetch).

One way to resolve this would be to relax the condition (from "= 2" to
greater than 1, say), but upon further inspection, the test probably
shouldn't be counting the number of POSTs. This test states that large
requests are split across POSTs, but this is not correct; the main
change is that chunked transfer encoding is used, but the request is
still contained within one POST. (The test coincidentally works because
Git indeed sends 2 POSTs in the case of a large request, but that is
because, as stated above, the first POST is a probing RPC - see
post_rpc() in remote-curl.c for more information.)

Therefore, instead of counting POSTs, check that chunked transfer
encoding is used. This also has the desirable side effect of passing
with GIT_TEST_PROTOCOL_VERSION=2.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5551: use 'test_i18ngrep' to check translated outputSZEDER Gábor Mon, 24 Jun 2019 12:44:46 +0000 (14:44 +0200)

t5551: use 'test_i18ngrep' to check translated output

The two tests 'invalid Content-Type rejected' and 'server-side error
detected' in 't5551-http-fetch-smart.sh' use "plain" 'grep' to check
that 'git clone' failed with the expected error message, but the
messages they are checking are translated, and, consequently, these
tests fail when the test script is run with GIT_TEST_GETTEXT_POISON
enabled.

Use 'test_i18ngrep' instead.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

submodule foreach: fix recursion of optionsMorian Sonnet Mon, 24 Jun 2019 20:26:55 +0000 (22:26 +0200)

submodule foreach: fix recursion of options

Calling

git submodule foreach --recursive <subcommand> --<option>

leads to an error stating that the option --<option> is unknown to
submodule--helper. That is of course only, when <option> is not a valid
option for git submodule foreach.

The reason for this is, that above call is internally translated into a
call to submodule--helper:

git submodule--helper foreach --recursive \
-- <subcommand> --<option>

This call starts by executing the subcommand with its option inside the
first level submodule and continues by calling the next iteration of
the submodule foreach call

git --super-prefix <submodulepath> submodule--helper \
foreach --recursive <subcommand> --<option>

inside the first level submodule. Note that the double dash in front of
the subcommand is missing.

This problem starts to arise only recently, as the
PARSE_OPT_KEEP_UNKNOWN flag for the argument parsing of git submodule
foreach was removed in commit a282f5a906. Hence, the unknown option is
complained about now, as the argument parsing is not properly ended by
the double dash.

This commit fixes the problem by adding the double dash in front of the
subcommand during the recursion.

Signed-off-by: Morian Sonnet <moriansonnet@googlemail.com>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: ignore .dll and incremental compile outputJeff Hostetler Tue, 25 Jun 2019 14:49:42 +0000 (07:49 -0700)

msvc: ignore .dll and incremental compile output

Ignore .dll files copied into the top-level directory.
Ignore MSVC incremental compiler output files.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: avoid debug assertion windows in Debug ModeJohannes Schindelin Tue, 25 Jun 2019 14:49:42 +0000 (07:49 -0700)

msvc: avoid debug assertion windows in Debug Mode

For regular debugging, it is pretty helpful when a debug assertion in a
running application triggers a window that offers to start the debugger.

However, when running the test suite, it is not so helpful, in
particular when the debug assertions are then suppressed anyway because
we disable the invalid parameter checking (via invalidcontinue.obj, see
the comment in config.mak.uname about that object for more information).

So let's simply disable that window in Debug Mode (it is already
disabled in Release Mode).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: do not pretend to support all signalsJeff Hostetler Tue, 25 Jun 2019 14:49:41 +0000 (07:49 -0700)

msvc: do not pretend to support all signals

This special-cases various signals that are not supported on Windows,
such as SIGPIPE. These cause the UCRT to throw asserts (at least in
debug mode).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: add pragmas for common warningsPhilip Oakley Tue, 25 Jun 2019 14:49:40 +0000 (07:49 -0700)

msvc: add pragmas for common warnings

MSVC can be overzealous about some warnings. Disable them.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: add a compile-time flag to allow detailed heap... Jeff Hostetler Tue, 25 Jun 2019 14:49:40 +0000 (07:49 -0700)

msvc: add a compile-time flag to allow detailed heap debugging

MS Visual C comes with a few neat features we can use to analyze the
heap consumption (i.e. leaks, max memory, etc).

With this patch, we introduce support via the build-time flag
`USE_MSVC_CRTDBG`.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: support building Git using MS Visual C++Jeff Hostetler Tue, 25 Jun 2019 14:49:39 +0000 (07:49 -0700)

msvc: support building Git using MS Visual C++

With this patch, Git can be built using the Microsoft toolchain, via:

make MSVC=1 [DEBUG=1]

Third party libraries are built from source using the open source
"vcpkg" tool set. See https://github.com/Microsoft/vcpkg

On a first build, the vcpkg tools and the third party libraries are
automatically downloaded and built. DLLs for the third party libraries
are copied to the top-level (and t/helper) directory to facilitate
debugging. See compat/vcbuild/README.

A series of .bat files are invoked by the Makefile to find the location
of the installed version of Visual Studio and the associated compiler
tools (essentially replicating the environment setup performed by a
"Developer Command Prompt"). This should find the most recent VS2015 or
VS2017 installation. Output from these scripts are used by the Makefile
to define compiler and linker pathnames and -I and -L arguments.

The build produces .pdb files for both debug and release builds.

Note: This commit was squashed from an organic series of commits
developed between 2016 and 2018 in Git for Windows' `master` branch.
This combined commit eliminates the obsolete commits related to fetching
NuGet packages for third party libraries. It is difficult to use NuGet
packages for C/C++ sources because they may be built by earlier versions
of the MSVC compiler and have CRT version and linking issues.

Additionally, the C/C++ NuGet packages that we were using tended to not
be updated concurrently with the sources. And in the case of cURL and
OpenSSL, this could expose us to security issues.

Helped-by: Yue Lin Ho <b8732003@student.nsysu.edu.tw>
Helped-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pager: add a helper function to clear the last line... SZEDER Gábor Mon, 24 Jun 2019 18:13:16 +0000 (20:13 +0200)

pager: add a helper function to clear the last line in the terminal

There are a couple of places where we want to clear the last line on
the terminal, e.g. when a progress bar line is overwritten by a
shorter line, then the end of that progress line would remain visible,
unless we cover it up.

In 'progress.c' we did this by always appending a fixed number of
space characters to the next line (even if it was not shorter than the
previous), but as it turned out that fixed number was not quite large
enough, see the fix in 9f1fd84e15 (progress: clear previous progress
update dynamically, 2019-04-12). From then on we've been keeping
track of the length of the last displayed progress line and appending
the appropriate number of space characters to the next line, if
necessary, but, alas, this approach turned out to be error prone, see
the fix in 1aed1a5f25 (progress: avoid empty line when breaking the
progress line, 2019-05-19). The next patch in this series is about to
fix a case where we don't clear the last line, and on occasion do end
up with such garbage at the end of the line. It would be great if we
could do that without the need to deal with that without meticulously
computing the necessary number of space characters.

So add a helper function to clear the last line on the terminal using
an ANSI escape sequence, which has the advantage to clear the whole
line no matter how wide it is, even after the terminal width changed.
Such an escape sequence is not available on dumb terminals, though, so
in that case fall back to simply print a whole terminal width (as
reported by term_columns()) worth of space characters.

In 'editor.c' launch_specified_editor() already used this ANSI escape
sequence, so replace it with a call to this function.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t3404: make the 'rebase.missingCommitsCheck=ignore... SZEDER Gábor Mon, 24 Jun 2019 18:13:15 +0000 (20:13 +0200)

t3404: make the 'rebase.missingCommitsCheck=ignore' test more focused

The test 'rebase -i respects rebase.missingCommitsCheck = warn' is
mainly interested in the warning about the dropped commits, but it
checks the whole output of 'git rebase', including progress lines and
what not that are not at all relevant to 'rebase.missingCommitsCheck',
but make it necessary to update this test whenever e.g. the way we
show progress is updated (as it will happen in one of the later
patches of this series).

Modify the test to verify only the first four lines of 'git rebase's
output that contain all the important lines, notably the line
containing the "Warning:" itself and the oneline log of the dropped
commit.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t3404: modernize here doc styleSZEDER Gábor Mon, 24 Jun 2019 18:13:14 +0000 (20:13 +0200)

t3404: modernize here doc style

In 't3404-rebase-interactive.sh' the expected output of several tests
is prepared from here documents, which are outside of
'test_expect_success' blocks and have spaces around redirection
operators.

Move these here documents into the corresponding 'test_expect_success'
block and avoid spaces between filename and redition operators.
Furthermore, quote the here docs' delimiter word to prevent parameter
expansions and what not, where applicable.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: don't use git. as example gitweb URLJakub Wilk Sat, 22 Jun 2019 16:48:57 +0000 (18:48 +0200)

doc: don't use git. as example gitweb URL

git.kernel.org uses cgit, not gitweb, these days:

$ w3m -dump 'http://git.kernel.org/?p=git/git.git;a=tree;f=gitweb' | grep -w generated
generated by cgit 1.2-0.3.lf.el7 (git 2.18.0) at 2019-06-22 16:14:38 +0000

Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: simplify parsing of unit factorsRené Scharfe Sat, 22 Jun 2019 10:03:40 +0000 (12:03 +0200)

config: simplify parsing of unit factors

Just return the value of the factor or zero for unrecognized strings
instead of using an output reference and a separate return value to
indicate success. This is shorter and simpler.

It basically reverts that function to before c8deb5a146 ("Improve error
messages when int/long cannot be parsed from config", 2007-12-25), while
keeping the better messages, so restore its old name, get_unit_factor(),
as well.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: don't multiply in parse_unit_factor()René Scharfe Sat, 22 Jun 2019 10:03:36 +0000 (12:03 +0200)

config: don't multiply in parse_unit_factor()

parse_unit_factor() multiplies the number that is passed to it with the
value of a recognized unit factor (K, M or G for 2^10, 2^20 and 2^30,
respectively). All callers pass in 1 as a number, though, which allows
them to check the actual multiplication for overflow before they are
doing it themselves.

Ignore the passed in number and don't multiply, as this feature of
parse_unit_factor() is not used anymore. Rename the output parameter to
reflect that it's not about the end result anymore, but just about the
unit factor.

Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: use unsigned_mult_overflows to check for overflowsRené Scharfe Sat, 22 Jun 2019 10:03:30 +0000 (12:03 +0200)

config: use unsigned_mult_overflows to check for overflows

parse_unit_factor() checks if a K, M or G is present after a number and
multiplies it by 2^10, 2^20 or 2^30, respectively. One of its callers
checks if the result is smaller than the number alone to detect
overflows. The other one passes 1 as the number and does multiplication
and overflow check itself in a similar manner.

This works, but is inconsistent, and it would break if we added support
for a bigger unit factor. E.g. 16777217T is 2^64 + 2^40, i.e. too big
for a 64-bit number. Modulo 2^64 we get 2^40 == 1TB, which is bigger
than the raw number 16777217 == 2^24 + 1, so the overflow would go
undetected by that method.

Let both callers pass 1 and handle overflow check and multiplication
themselves. Do the check before the multiplication, using
unsigned_mult_overflows, which is simpler and can deal with larger unit
factors.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t0001: fix on case-insensitive filesystemsJohannes Schindelin Mon, 24 Jun 2019 17:40:05 +0000 (19:40 +0200)

t0001: fix on case-insensitive filesystems

On a case-insensitive filesystem, such as HFS+ or NTFS, it is possible
that the idea Bash has of the current directory differs in case from
what Git thinks it is. That's totally okay, though, and we should not
expect otherwise.

On Windows, for example, when you call

cd C:\GIT-SDK-64

in a PowerShell and there exists a directory called `C:\git-sdk-64`, the
current directory will be reported in all upper-case. Even in a Bash
that you might call from that PowerShell. Git, however, will have
normalized this via `GetFinalPathByHandle()`, and the expectation in
t0001 that the recorded gitdir will match what `pwd` says will be
violated.

Let's address this by comparing these paths in a case-insensitive
manner when `core.ignoreCase` is `true`.

Reported by Jameson Miller.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc: improve usage string in MyFirstContributionChristian Couder Sat, 22 Jun 2019 06:10:35 +0000 (08:10 +0200)

doc: improve usage string in MyFirstContribution

We implement a command called git-psuh which accept arguments, so let's
show that it accepts arguments in the doc and the usage string.

While at it, we need to prepare "a NULL-terminated array of usage strings",
not just "a NULL-terminated usage string".

Helped-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tests: mark two failing tests under FAIL_PREREQSÆvar Arnfjörð Bjarmason Thu, 20 Jun 2019 20:42:27 +0000 (22:42 +0200)

tests: mark two failing tests under FAIL_PREREQS

Fix a couple of tests that would potentially fail under
GIT_TEST_FAIL_PREREQS=true.

I missed these when annotating other tests in dfe1a17df9 ("tests: add
a special setup where prerequisites fail", 2019-05-13) because on my
system I can only reproduce this failure when I run the tests as
"root", since the tests happen to depend on whether we can fall back
on GECOS info or not. I.e. they'd usually fail to look up the ident
info anyway, but not always.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The third batchJunio C Hamano Fri, 21 Jun 2019 18:26:11 +0000 (11:26 -0700)

The third batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'mo/clang-format-for-each-update'Junio C Hamano Fri, 21 Jun 2019 18:24:12 +0000 (11:24 -0700)

Merge branch 'mo/clang-format-for-each-update'

The list of for-each like macros used by clang-format has been
updated.

* mo/clang-format-for-each-update:
clang-format: use git grep to generate the ForEachMacros list

Merge branch 'md/url-parse-harden'Junio C Hamano Fri, 21 Jun 2019 18:24:12 +0000 (11:24 -0700)

Merge branch 'md/url-parse-harden'

The URL decoding code has been updated to avoid going past the end
of the string while parsing %-<hex>-<hex> sequence.

* md/url-parse-harden:
url: do not allow %00 to represent NUL in URLs
url: do not read past end of buffer

Merge branch 'an/ignore-doc-update'Junio C Hamano Fri, 21 Jun 2019 18:24:11 +0000 (11:24 -0700)

Merge branch 'an/ignore-doc-update'

The description about slashes in gitignore patterns (used to
indicate things like "anchored to this level only" and "only
matches directories") has been revamped.

* an/ignore-doc-update:
gitignore.txt: make slash-rules more readable

Merge branch 'ab/hash-object-doc'Junio C Hamano Fri, 21 Jun 2019 18:24:11 +0000 (11:24 -0700)

Merge branch 'ab/hash-object-doc'

Doc update.

* ab/hash-object-doc:
hash-object doc: stop mentioning git-cvsimport

Merge branch 'cm/send-email-document-req-modules'Junio C Hamano Fri, 21 Jun 2019 18:24:10 +0000 (11:24 -0700)

Merge branch 'cm/send-email-document-req-modules'

A doc update.

* cm/send-email-document-req-modules:
send-email: update documentation of required Perl modules

Merge branch 'md/list-objects-filter-parse-msgfix'Junio C Hamano Fri, 21 Jun 2019 18:24:10 +0000 (11:24 -0700)

Merge branch 'md/list-objects-filter-parse-msgfix'

Make an end-user facing message localizable.

* md/list-objects-filter-parse-msgfix:
list-objects-filter-options: error is localizeable

Merge branch 'md/list-objects-filter-memfix'Junio C Hamano Fri, 21 Jun 2019 18:24:09 +0000 (11:24 -0700)

Merge branch 'md/list-objects-filter-memfix'

The filter_data used in the list-objects-filter (which manages a
lazily sparse clone repository) did not use the dynamic array API
correctly---'nr' is supposed to point at one past the last element
of the array in use. This has been corrected.

* md/list-objects-filter-memfix:
list-objects-filter: correct usage of ALLOC_GROW

Merge branch 'jt/partial-clone-missing-ref-delta-base'Junio C Hamano Fri, 21 Jun 2019 18:24:09 +0000 (11:24 -0700)

Merge branch 'jt/partial-clone-missing-ref-delta-base'

"git fetch" into a lazy clone forgot to fetch base objects that are
necessary to complete delta in a thin packfile, which has been
corrected.

* jt/partial-clone-missing-ref-delta-base:
t5616: cover case of client having delta base
t5616: use correct flag to check object is missing
index-pack: prefetch missing REF_DELTA bases
t5616: refactor packfile replacement

Merge branch 'ml/userdiff-rust'Junio C Hamano Fri, 21 Jun 2019 18:24:08 +0000 (11:24 -0700)

Merge branch 'ml/userdiff-rust'

The pattern "git diff/grep" use to extract funcname and words
boundary for Rust has been added.

* ml/userdiff-rust:
userdiff: two simplifications of patterns for rust
userdiff: add built-in pattern for rust

pull: add --[no-]show-forced-updates passthroughDerrick Stolee Tue, 18 Jun 2019 20:25:28 +0000 (13:25 -0700)

pull: add --[no-]show-forced-updates passthrough

The 'git fetch' command can avoid calculating forced updates, so
allow users of 'git pull' to provide that option. This is particularly
necessary when the advice to use '--no-show-forced-updates' is given
at the end of the command.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: warn about forced updates in branch listingDerrick Stolee Tue, 18 Jun 2019 20:25:27 +0000 (13:25 -0700)

fetch: warn about forced updates in branch listing

The --[no-]show-forced-updates option in 'git fetch' can be confusing
for some users, especially if it is enabled via config setting and not
by argument. Add advice to warn the user that the (forced update)
messages were not listed.

Additionally, warn users when the forced update check takes longer
than ten seconds, and recommend that they disable the check. These
messages can be disabled by the advice.fetchShowForcedUpdates config
setting.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: add --[no-]show-forced-updates argumentDerrick Stolee Tue, 18 Jun 2019 20:25:26 +0000 (13:25 -0700)

fetch: add --[no-]show-forced-updates argument

After updating a set of remove refs during a 'git fetch', we walk the
commits in the new ref value and not in the old ref value to discover
if the update was a forced update. This results in two things happening
during the command:

1. The line including the ref update has an additional "(forced-update)"
marker at the end.

2. The ref log for that remote branch includes a bit saying that update
is a forced update.

For many situations, this forced-update message happens infrequently, or
is a small bit of information among many ref updates. Many users ignore
these messages, but the calculation required here slows down their fetches
significantly. Keep in mind that they do not have the opportunity to
calculate a commit-graph file containing the newly-fetched commits, so
these comparisons can be very slow.

Add a '--[no-]show-forced-updates' option that allows a user to skip this
calculation. The only permanent result is dropping the forced-update bit
in the reflog.

Include a new fetch.showForcedUpdates config setting that allows this
behavior without including the argument in every command. The config
setting is overridden by the command-line arguments.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

status: ignore status.aheadbehind in porcelain formatsJeff Hostetler Tue, 18 Jun 2019 20:21:28 +0000 (13:21 -0700)

status: ignore status.aheadbehind in porcelain formats

Teach porcelain V[12] formats to ignore the status.aheadbehind
config setting. They only respect the --[no-]ahead-behind
command line argument. This is for backwards compatibility
with existing scripts.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

status: warn when a/b calculation takes too longJeff Hostetler Tue, 18 Jun 2019 20:21:27 +0000 (13:21 -0700)

status: warn when a/b calculation takes too long

The ahead/behind calculation in 'git status' can be slow in some
cases. Users may not realize that there are ways to avoid this
computation, especially if they are not using the information.

Add a warning that appears if this calculation takes more than
two seconds. The warning can be disabled through the new config
setting advice.statusAheadBehind.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

status: add status.aheadbehind settingJeff Hostetler Tue, 18 Jun 2019 20:21:25 +0000 (13:21 -0700)

status: add status.aheadbehind setting

The --[no-]ahead-behind option was introduced in fd9b544a
(status: add --[no-]ahead-behind to status and commit for V2
format, 2018-01-09). This is a necessary change of behavior
in repos where the remote tracking branches can move very
quickly ahead of the local branches. However, users need to
remember to provide the command-line argument every time.

Add a new "status.aheadBehind" config setting to change the
default behavior of all git status formats.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

status: remove the empty line after hintsJohn Lin Tue, 4 Jun 2019 14:02:21 +0000 (07:02 -0700)

status: remove the empty line after hints

Before this patch, there is inconsistency between the status
messages with hints and the ones without hints: there is an
empty line between the title and the file list if hints are
presented, but there isn't one if there are no hints.

This patch remove the inconsistency by removing the empty
lines even if hints are presented.

Signed-off-by: John Lin <johnlinp@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: update Makefile to allow for spaces in the compil... Jeff Hostetler Wed, 19 Jun 2019 21:06:06 +0000 (14:06 -0700)

msvc: update Makefile to allow for spaces in the compiler path

It is quite common that MS Visual C++ is installed into a location whose
path contains spaces, therefore we need to quote it.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: fix detect_msys_tty()Jeff Hostetler Wed, 19 Jun 2019 21:06:05 +0000 (14:06 -0700)

msvc: fix detect_msys_tty()

The ntstatus.h header is only available in MINGW.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: define ftello()Jeff Hostetler Wed, 19 Jun 2019 21:06:04 +0000 (14:06 -0700)

msvc: define ftello()

It is just called differently in MSVC's headers.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: do not re-declare the timespec structJeff Hostetler Wed, 19 Jun 2019 21:06:03 +0000 (14:06 -0700)

msvc: do not re-declare the timespec struct

VS2015's headers already declare that struct.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: mark a variable as non-constJeff Hostetler Wed, 19 Jun 2019 21:06:02 +0000 (14:06 -0700)

msvc: mark a variable as non-const

VS2015 complains when using a const pointer in memcpy()/free().

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: define O_ACCMODEPhilip Oakley Wed, 19 Jun 2019 21:06:02 +0000 (14:06 -0700)

msvc: define O_ACCMODE

This constant is not defined in MSVC's headers.

In UCRT's fcntl.h, _O_RDONLY, _O_WRONLY and _O_RDWR are defined as 0, 1
and 2, respectively. Yes, that means that UCRT breaks with the tradition
that O_RDWR == O_RDONLY | O_WRONLY.

It is a perfectly legal way to define those constants, though, therefore
we need to take care of defining O_ACCMODE accordingly.

This is particularly important in order to keep our "open() can set
errno to EISDIR" emulation working: it tests that (flags & O_ACCMODE) is
not identical to O_RDONLY before going on to test specifically whether
the file for which open() reported EACCES is, in fact, a directory.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: include sigset_t definitionPhilip Oakley Wed, 19 Jun 2019 21:06:01 +0000 (14:06 -0700)

msvc: include sigset_t definition

On MSVC (VS2008) sigset_t is not defined.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

msvc: fix dependencies of compat/msvc.cJohannes Schindelin Wed, 19 Jun 2019 21:06:00 +0000 (14:06 -0700)

msvc: fix dependencies of compat/msvc.c

The file compat/msvc.c includes compat/mingw.c, which means that we have
to recompile compat/msvc.o if compat/mingw.c changes.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mingw: replace mingw_startup() hackJohannes Schindelin Wed, 19 Jun 2019 21:05:59 +0000 (14:05 -0700)

mingw: replace mingw_startup() hack

Git for Windows has special code to retrieve the command-line parameters
(and even the environment) in UTF-16 encoding, so that they can be
converted to UTF-8. This is necessary because Git for Windows wants to
use UTF-8 encoded strings throughout its code, and the main() function
does not get the parameters in that encoding.

To do that, we used the __wgetmainargs() function, which is not even a
Win32 API function, but provided by the MINGW "runtime" instead.

Obviously, this method would not work with any compiler other than GCC,
and in preparation for compiling with Visual C++, we would like to avoid
precisely that.

Lucky us, there is a much more elegant way: we can simply implement the
UTF-16 variant of `main()`: `wmain()`.

To make that work, we need to link with -municode. The command-line
parameters are passed to `wmain()` encoded in UTF-16, as desired, and
this method also works with GCC, and also with Visual C++ after
adjusting the MSVC linker flags to force it to use `wmain()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

obstack: fix compiler warningJohannes Schindelin Wed, 19 Jun 2019 21:05:59 +0000 (14:05 -0700)

obstack: fix compiler warning

MS Visual C suggests that the construct

condition ? (int) i : (ptrdiff_t) d

is incorrect. Let's fix this by casting to ptrdiff_t also for the
positive arm of the conditional.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cache-tree/blame: avoid reusing the DEBUG constantJeff Hostetler Wed, 19 Jun 2019 21:05:58 +0000 (14:05 -0700)

cache-tree/blame: avoid reusing the DEBUG constant

In MS Visual C, the `DEBUG` constant is set automatically whenever
compiling with debug information.

This is clearly not what was intended in `cache-tree.c` nor in
`builtin/blame.c`, so let's use a less ambiguous name there.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t0001 (mingw): do not expect a specific order of stdout... Johannes Schindelin Wed, 19 Jun 2019 21:05:57 +0000 (14:05 -0700)

t0001 (mingw): do not expect a specific order of stdout/stderr

When redirecting stdout/stderr to the same file, we cannot guarantee
that stdout will come first.

In fact, in this test case, it seems that an MSVC build always prints
stderr first.

In any case, this test case does not want to verify the *order* but
the *presence* of both outputs, so let's test exactly that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Mark .bat files as requiring CR/LF endingsJohannes Schindelin Wed, 19 Jun 2019 21:05:57 +0000 (14:05 -0700)

Mark .bat files as requiring CR/LF endings

Just like the natural line ending for Unix shell scripts consist of a
single Line Feed, the natural line ending for (DOS) Batch scripts
consists of a Carriage Return followed by a Line Feed.

It seems that both Unix shell script interpreters and the interpreter
for Batch scripts (`cmd.exe`) are keen on seeing the "right" line
endings.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mingw: fix a typo in the msysGit-specific sectionJohannes Schindelin Wed, 19 Jun 2019 21:05:56 +0000 (14:05 -0700)

mingw: fix a typo in the msysGit-specific section

The msysGit project (i.e. Git for Windows 1.x' SDK) is safely dead for
*years* already. This is probably the reason why nobody caught this typo
until Carlo Arenas spotted a copy-edited version of it nearby.

It is probably about time to rip out the remainders of msysGit/MSys1
support, but that can safely wait a bit longer, and we can at least fix
the typo for now.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: print server version at the top in -v -vNguyễn Thái Ngọc Duy Thu, 20 Jun 2019 11:59:51 +0000 (18:59 +0700)

fetch-pack: print server version at the top in -v -v

Before the previous patch, the server version is printed after all the
"Server supports" lines. The previous one puts the version in the middle
of "Server supports" group.

Instead of moving it to the bottom, I move it to the top. Version may
stand out more at the top as we will have even more debug out after
capabilities.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: print all relevant supported capabilities... Nguyễn Thái Ngọc Duy Thu, 20 Jun 2019 11:59:50 +0000 (18:59 +0700)

fetch-pack: print all relevant supported capabilities with -v -v

When we check if some capability is supported, we do print something in
verbose mode. Some capabilities are not printed though (and it made me
think it's not supported; I was more used to GIT_TRACE_PACKET) so let's
print them all.

It's a bit more code. And one could argue for printing all supported
capabilities the server sends us. But I think it's still valuable this
way because we see the capabilities that the client cares about.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: move capability names out of i18n stringsNguyễn Thái Ngọc Duy Thu, 20 Jun 2019 11:59:49 +0000 (18:59 +0700)

fetch-pack: move capability names out of i18n strings

This reduces the work on translators since they only have one string to
translate (and I think it's still enough context to translate). It also
makes sure no capability name is translated by accident.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blame: add a test to cover blame_coalesce()Barret Rhoden Thu, 20 Jun 2019 16:38:20 +0000 (12:38 -0400)

blame: add a test to cover blame_coalesce()

Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blame: use the fingerprint heuristic to match ignored... Barret Rhoden Thu, 20 Jun 2019 16:38:19 +0000 (12:38 -0400)

blame: use the fingerprint heuristic to match ignored lines

This commit integrates the fuzzy fingerprint heuristic into
guess_line_blames().

We actually make two passes. The first pass uses the fuzzy algorithm to
find a match within the current diff chunk. If that fails, the second
pass searches the entire parent file for the best match.

For an example of scanning the entire parent for a match, consider:

commit-a 30) #include <sys/header_a.h>
commit-b 31) #include <header_b.h>
commit-c 32) #include <header_c.h>

Then commit X alphabetizes them:

commit-X 30) #include <header_b.h>
commit-X 31) #include <header_c.h>
commit-X 32) #include <sys/header_a.h>

If we just check the parent's chunk (i.e. the first pass), we'd get:

commit-b 30) #include <header_b.h>
commit-c 31) #include <header_c.h>
commit-X 32) #include <sys/header_a.h>

That's because commit X actually consists of two chunks: one chunk is
removing sys/header_a.h, then some context, and the second chunk is
adding sys/header_a.h.

If we scan the entire parent file, we get:

commit-b 30) #include <header_b.h>
commit-c 31) #include <header_c.h>
commit-a 32) #include <sys/header_a.h>

Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blame: add a fingerprint heuristic to match ignored... Michael Platings Thu, 20 Jun 2019 16:38:18 +0000 (12:38 -0400)

blame: add a fingerprint heuristic to match ignored lines

This algorithm will replace the heuristic used to identify lines from
ignored commits with one that finds likely candidate lines in the
parent's version of the file. The actual replacement occurs in an
upcoming commit.

The old heuristic simply assigned lines in the target to the same line
number (plus offset) in the parent. The new function uses a
fingerprinting algorithm to detect similarity between lines.

The new heuristic is designed to accurately match changes made
mechanically by formatting tools such as clang-format and clang-tidy.
These tools make changes such as breaking up lines to fit within a
character limit or changing identifiers to fit with a naming convention.
The heuristic is not intended to match more extensive refactoring
changes and may give misleading results in such cases.

In most cases formatting tools preserve line ordering, so the heuristic
is optimised for such cases. (Some types of changes do reorder lines
e.g. sorting keep the line content identical, the git blame -M option
can already be used to address this). The reason that it is advantageous
to rely on ordering is due to source code repeating the same character
sequences often e.g. declaring an identifier on one line and using that
identifier on several subsequent lines. This means that lines can look
very similar to each other which presents a problem when doing fuzzy
matching. Relying on ordering gives us extra clues to point towards the
true match.

The heuristic operates on a single diff chunk change at a time. It
creates a “fingerprint” for each line on each side of the change.
Fingerprints are described in detail in the comment for `struct
fingerprint`, but essentially are a multiset of the character pairs in a
line. The heuristic first identifies the line in the target entry whose
fingerprint is most clearly matched to a line fingerprint in the parent
entry. Where fingerprints match identically, the position of the lines
is used as a tie-break. The heuristic locks in the best match, and
subtracts the fingerprint of the line in the target entry from the
fingerprint of the line in the parent entry to prevent other lines being
matched on the same parts of that line. It then repeats the process
recursively on the section of the chunk before the match, and then the
section of the chunk after the match.

Here's an example of the difference the fingerprinting makes. Consider
a file with two commits:

commit-a 1) void func_1(void *x, void *y);
commit-b 2) void func_2(void *x, void *y);

After a commit 'X', we have:

commit-X 1) void func_1(void *x,
commit-X 2) void *y);
commit-X 3) void func_2(void *x,
commit-X 4) void *y);

When we blame-ignored with the old algorithm, we get:

commit-a 1) void func_1(void *x,
commit-b 2) void *y);
commit-X 3) void func_2(void *x,
commit-X 4) void *y);

Where commit-b is blamed for 2 instead of 3. With the fingerprint
algorithm, we get:

commit-a 1) void func_1(void *x,
commit-a 2) void *y);
commit-b 3) void func_2(void *x,
commit-b 4) void *y);

Note line 2 could be matched with either commit-a or commit-b as it is
equally similar to both lines, but is matched with commit-a because its
position as a fraction of the new line range is more similar to commit-a
as a fraction of the old line range. Line 4 is also equally similar to
both lines, but as it appears after line 3 which will be matched first
it cannot be matched with an earlier line.

For many more examples, see t/t8014-blame-ignore-fuzzy.sh which contains
example parent and target files and the line numbers in the parent that
must be matched.

Signed-off-by: Michael Platings <michael@platin.gs>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

completion: disable dwim on "git switch -d"Nguyễn Thái Ngọc Duy Thu, 20 Jun 2019 09:55:22 +0000 (16:55 +0700)

completion: disable dwim on "git switch -d"

Even though dwim is enabled by default, it will never be done when
--detached is specified. If you force "-d --guess" you will get an error
because --guess then implies -c which cannot be used with -d. So we can
disable dwim in "switch -d". It makes the completion list in this case a
bit shorter.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

switch: allow to switch in the middle of bisectNguyễn Thái Ngọc Duy Thu, 20 Jun 2019 09:55:21 +0000 (16:55 +0700)

switch: allow to switch in the middle of bisect

In c45f0f525d (switch: reject if some operation is in progress,
2019-03-29), a check is added to prevent switching when some operation
is in progress. The reason is it's often not safe to do so.

This is true for merge, am, rebase, cherry-pick and revert, but not so
much for bisect because bisecting is basically jumping/switching between
a bunch of commits to pin point the first bad one. git-bisect suggests
the next commit to test, but it's not wrong for the user to test a
different commit because git-bisect cannot have the knowledge to know
better.

For this reason, allow to switch when bisecting (*). I considered if we
should still prevent switching by default and allow it with
--ignore-in-progress. But I don't think the prevention really adds
anything much.

If the user switches away by mistake, since we print the previous HEAD
value, even if they don't know about the "-" shortcut, switching back is
still possible.

The warning will be printed on every switch while bisect is still
ongoing, not the first time you switch away from bisect's suggested
commit, so it could become a bit annoying.

(*) of course when it's safe to do so, i.e. no loss of local changes and
stuff.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t2027: use test_must_be_emptyNguyễn Thái Ngọc Duy Thu, 20 Jun 2019 09:55:20 +0000 (16:55 +0700)

t2027: use test_must_be_empty

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

delta-islands: respect progress flagJeff King Thu, 20 Jun 2019 08:58:32 +0000 (04:58 -0400)

delta-islands: respect progress flag

The delta island code always prints "Marked %d islands", even if
progress has been suppressed with --no-progress or by sending stderr to
a non-tty.

Let's pass a progress boolean to load_delta_islands(). We already do
the same thing for the progress meter in resolve_tree_islands().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

rev-list: teach --no-object-names to enable pipingEmily Shaffer Wed, 19 Jun 2019 20:56:56 +0000 (13:56 -0700)

rev-list: teach --no-object-names to enable piping

Allow easier parsing by cat-file by giving rev-list an option to print
only the OID of a non-commit object without any additional information.
This is a short-term shim; later on, rev-list should be taught how to
print the types of objects it finds in a format similar to cat-file's.

Before this commit, the output from rev-list needed to be massaged
before being piped to cat-file, like so:

git rev-list --objects HEAD | cut -f 1 -d ' ' |
git cat-file --batch-check

This was especially unexpected when dealing with root trees, as an
invisible whitespace exists at the end of the OID:

git rev-list --objects --filter=tree:1 --max-count=1 HEAD |
xargs -I% echo "AA%AA"

Now, it can be piped directly, as in the added test case:

git rev-list --objects --no-object-names HEAD | git cat-file --batch-check

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Change-Id: I489bdf0a8215532e540175188883ff7541d70e1b
Signed-off-by: Junio C Hamano <gitster@pobox.com>

hashmap: convert sha1hash() to oidhash()Jeff King Thu, 20 Jun 2019 07:41:49 +0000 (03:41 -0400)

hashmap: convert sha1hash() to oidhash()

There are no callers left of sha1hash() that do not simply pass the
"hash" member of a "struct object_id". Let's get rid of the outdated
sha1-specific function and provide one that operates on the whole struct
(even though the technique, taking the first few bytes of the hash, will
remain the same).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

hash.h: move object_id definition from cache.hJeff King Thu, 20 Jun 2019 07:41:45 +0000 (03:41 -0400)

hash.h: move object_id definition from cache.h

Our hashmap.h helpfully defines a sha1hash() function. But it cannot
define a similar oidhash() without including all of cache.h, which
itself wants to include hashmap.h! Let's break this circular dependency
by moving the definition to hash.h, along with the remaining RAWSZ
macros, etc. That will put them with the existing git_hash_algo
definition.

One alternative would be to move oidhash() into cache.h, but it's
already quite bloated. We're better off moving things out than in.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

khash: rename oid helper functionsJeff King Thu, 20 Jun 2019 07:41:42 +0000 (03:41 -0400)

khash: rename oid helper functions

For use in object_id hash tables, we have oid_hash() and oid_equal().
But these are confusingly similar to the existing oideq() and the
oidhash() we plan to add to replace sha1hash().

The big difference from those functions is that rather than accepting a
const pointer to the "struct object_id", we take the arguments by value
(which is a khash internal convention). So let's make that obvious by
calling them oidhash_by_value() and oideq_by_value().

Those names are fairly horrendous to type, but we rarely need to do so;
they are passed to the khash implementation macro and then only used
internally. Callers get to use the nice kh_put_oid_map(), etc.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

khash: drop sha1-specific map typesJeff King Thu, 20 Jun 2019 07:41:38 +0000 (03:41 -0400)

khash: drop sha1-specific map types

All of the callers of khash_sha1 and khash_sha1_pos have been removed,
in favor of using maps that use "struct object_id" as their keys. Let's
drop these now-obsolete types.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap: convert khash_sha1 maps into kh_oid_mapJeff King Thu, 20 Jun 2019 07:41:35 +0000 (03:41 -0400)

pack-bitmap: convert khash_sha1 maps into kh_oid_map

All of the users of our khash_sha1 maps actually have a "struct
object_id". Let's use the more descriptive type.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

delta-islands: convert island_marks khash to use oidsJeff King Thu, 20 Jun 2019 07:41:32 +0000 (03:41 -0400)

delta-islands: convert island_marks khash to use oids

All of the users of this map have an actual "struct object_id" rather
than a bare sha1. Let's use the more descriptive type (and get one step
closer to dropping khash_sha1 entirely).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

khash: rename kh_oid_t to kh_oid_setJeff King Thu, 20 Jun 2019 07:41:28 +0000 (03:41 -0400)

khash: rename kh_oid_t to kh_oid_set

khash lets us define a hash as either a map or a set (i.e., with no
"value" type). For the oid maps we define, "oid" is the set and
"oid_map" is the map. As the bug in the previous commit shows, it's easy
to pick the wrong one.

So let's make the names more distinct: "oid_set" and "oid_map".

An alternative naming scheme would be to actually name the type after
the key/value types. So e.g., "oid" _would_ be the set, since it has no
value type. And "oid_map" would become "oid_void" or similar (and
"oid_pos" becomes "oid_int"). That's better in some ways: it's more
regular, and a given map type can be more reasily reused in multiple
contexts (e.g., something storing an "int" that isn't a "pos"). But it's
also slightly less descriptive.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

khash: drop broken oid_map typedefJeff King Thu, 20 Jun 2019 07:41:25 +0000 (03:41 -0400)

khash: drop broken oid_map typedef

Commit 5a8643eff1 (khash: move oid hash table definition, 2019-02-19)
added a khash "oid_map" type to match the existing "oid" type, which is
a simple set (i.e., just keys, no values). But in setting up the
khash_oid_map typedef, it accidentally referred to "kh_oid_t", which is
the set type.

Nobody noticed the breakage because there are not yet any callers; the
type was added just as a match to the existing sha1 types (whose map
type confusingly _is_ called khash_sha1, and it has no matching set
type).

We could easily fix this with s/oid/oid_map/ in the typedef. But let's
take this a step further, and just drop the typedef entirely. These
typedefs were added by 5a8643eff1 to match the khash_sha1 typedefs. But
the actual khash-derived type names are descriptive enough; this is just
adding an extra layer of indirection. The khash names do not quite
follow our usual style (e.g., they end in "_t"), but since we end up
using other khash names (e.g., khiter_t, kh_get_oid()) anyway, just
typedef-ing the struct name is not really helping much.

And there are already many cases where we use the raw khash type names
anyway (e.g., the "set" variant defined just above us does not have such
a typedef!).

So let's drop this typedef, and the matching oid_pos one (which actually
_does_ have a user, but we can easily convert it).

We'll leave the khash_sha1 typedef around. The ultimate fate of its
callers should be conversion to kh_oid_map_t, so there's no point in
going through the noise of changing the names now.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: convert create_object() to use object_idJeff King Thu, 20 Jun 2019 07:41:21 +0000 (03:41 -0400)

object: convert create_object() to use object_id

There are no callers left of create_object() that aren't just passing us
the "hash" member of a "struct object_id". Let's take the whole struct,
which gets us closer to removing all raw sha1 variables.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: convert internal hash_obj() to object_idJeff King Thu, 20 Jun 2019 07:41:17 +0000 (03:41 -0400)

object: convert internal hash_obj() to object_id

Now that lookup_object() has an object_id, we can consistently pass that
around instead of a raw sha1. We still convert to a hash to pass to
sha1hash(), but the goal is for that to go away shortly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: convert lookup_object() to use object_idJeff King Thu, 20 Jun 2019 07:41:14 +0000 (03:41 -0400)

object: convert lookup_object() to use object_id

There are no callers left of lookup_object() that aren't just passing us
the "hash" member of a "struct object_id". Let's take the whole struct,
which gets us closer to removing all raw sha1 variables. It also
matches the existing conversions of lookup_blob(), etc.

The conversions of callers were done by hand, but they're all mechanical
one-liners.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: convert lookup_unknown_object() to use object_idJeff King Thu, 20 Jun 2019 07:41:10 +0000 (03:41 -0400)

object: convert lookup_unknown_object() to use object_id

There are no callers left of lookup_unknown_object() that aren't just
passing us the "hash" member of a "struct object_id". Let's take the
whole struct, which gets us closer to removing all raw sha1 variables.
It also matches the existing conversions of lookup_blob(), etc.

The conversions of callers were done by hand, but they're all mechanical
one-liners.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: convert locate_object_entry_hash() to... Jeff King Thu, 20 Jun 2019 07:41:07 +0000 (03:41 -0400)

pack-objects: convert locate_object_entry_hash() to object_id

There are no callers of locate_object_entry_hash() that aren't just
passing us the "hash" member of a "struct object_id". Let's take the
whole struct, which gets us closer to removing all raw sha1 variables.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: convert packlist_find() to use object_idJeff King Thu, 20 Jun 2019 07:41:03 +0000 (03:41 -0400)

pack-objects: convert packlist_find() to use object_id

We take a raw hash pointer, but most of our callers have a "struct
object_id" already. Let's switch to taking the full struct, which will
let us continue removing uses of raw sha1 buffers.

There are two callers that do need special attention:

- in rebuild_existing_bitmaps(), we need to switch to
nth_packed_object_oid(). This incurs an extra hash copy over
pointing straight to the mmap'd sha1, but it shouldn't be measurable
compared to the rest of the operation.

- in can_reuse_delta() we already spent the effort to copy the sha1
into a "struct object_id", but now we just have to do so a little
earlier in the function (we can't easily convert that function's
callers because they may be pointing at mmap'd REF_DELTA blocks).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap-write: convert some helpers to use object_idJeff King Thu, 20 Jun 2019 07:40:59 +0000 (03:40 -0400)

pack-bitmap-write: convert some helpers to use object_id

A few functions take raw hash pointers, but all of their callers
actually have a "struct object_id". Let's retain that extra type as long
as possible (which will let future patches extend that further, and so
on).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

upload-pack: rename a "sha1" variable to "oid"Jeff King Thu, 20 Jun 2019 07:40:54 +0000 (03:40 -0400)

upload-pack: rename a "sha1" variable to "oid"

This variable is a "struct object_id", but uses the old-style name
"sha1". Let's call it oid to match more modern code (and make it clear
that it can handle any algorithm, since it uses parse_oid_hex()
properly).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

describe: fix accidental oid/hash type-punningJeff King Thu, 20 Jun 2019 07:40:50 +0000 (03:40 -0400)

describe: fix accidental oid/hash type-punning

The find_commit_name() function passes an object_id.hash as the key of a
hashmap. That ends up in commit_name_neq(), which then feeds it to
oideq(). Which means we should actually be the whole "struct object_id".

It works anyway because pointers to the two are interchangeable. And
because we're going through a layer of void pointers, the compiler
doesn't notice the type mismatch.

But it's worth cleaning up (especially since once we switch away from
sha1hash() on the same line, accessing the hash member will look doubly
out of place).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: only run 'gc' once when fetching multiple remotesNguyễn Thái Ngọc Duy Wed, 19 Jun 2019 09:46:30 +0000 (16:46 +0700)

fetch: only run 'gc' once when fetching multiple remotes

In multiple remotes mode, git-fetch is launched for n-1 remotes and the
last remote is handled by the current process. Each of these processes
will in turn run 'gc' at the end.

This is not really a problem because even if multiple 'gc --auto' is run
at the same time we still handle it correctly. It does show multiple
"auto packing in the background" messages though. And we may waste some
resources when gc actually runs because we still do some stuff before
checking the lock and moving it to background.

So let's try to avoid that. We should only need one 'gc' run after all
objects and references are added anyway. Add a new option --no-auto-gc
that will be used by those n-1 processes. 'gc --auto' will always run on
the main fetch process (*).

(*) even if we fetch remotes in parallel at some point in future, this
should still be fine because we should "join" all those processes
before this step.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: test verify across alternatesDerrick Stolee Tue, 18 Jun 2019 18:14:36 +0000 (11:14 -0700)

commit-graph: test verify across alternates

The 'git commit-graph verify' subcommand loads a commit-graph from
a given object directory instead of using the standard method
prepare_commit_graph(). During development of load_commit_graph_chain(),
a version did not include prepare_alt_odb() as it was previously
run by prepare_commit_graph() in most cases.

Add a test that prevents that mistake from happening again.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: normalize commit-graph filenamesDerrick Stolee Tue, 18 Jun 2019 18:14:36 +0000 (11:14 -0700)

commit-graph: normalize commit-graph filenames

When writing commit-graph files, we append path data to an
object directory, which may be specified by the user via the
'--object-dir' option. If the user supplies a trailing slash,
or some other alternative path format, the resulting path may
be usable for writing to the correct location. However, when
expiring graph files from the <obj-dir>/info/commit-graphs
directory during a write, we need to compare paths with exact
string matches.

Normalize the commit-graph filenames to avoid ambiguity. This
creates extra allocations, but this is a constant multiple of
the number of commit-graph files, which should be a number in
the single digits.

Further normalize the object directory in the context. Due to
a comparison between g->obj_dir and ctx->obj_dir in
split_graph_merge_strategy(), a trailing slash would prevent
any merging of layers within the same object directory. The
check is there to ensure we do not merge across alternates.
Update the tests to include a case with this trailing slash
problem.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: test --split across alternate without... Derrick Stolee Tue, 18 Jun 2019 18:14:35 +0000 (11:14 -0700)

commit-graph: test --split across alternate without --split

We allow sharing commit-graph files across alternates. When we are
writing a split commit-graph, we allow adding tip graph files that
are not in the alternate, but include commits from our local repo.

However, if our alternate is not using the split commit-graph format,
its file is at .git/objects/info/commit-graph and we are trying to
write files in .git/objects/info/commit-graphs/graph-{hash}.graph.

We already have logic to ensure we do not merge across alternate
boundaries, but we also cannot have a commit-graph chain to our
alternate if uses the old filename structure.

Create a test that verifies we create a new split commit-graph
with only one level and we do not modify the existing commit-graph
in the alternate.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: test octopus merges with --splitDerrick Stolee Tue, 18 Jun 2019 18:14:34 +0000 (11:14 -0700)

commit-graph: test octopus merges with --split

Octopus merges require an extra chunk of data in the commit-graph
file format. Create a test that ensures the new --split option
continues to work with an octopus merge. Specifically, ensure
that the octopus merge has parents across layers to truly check
that our graph position logic holds up correctly.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: clean up chains after flattened writeDerrick Stolee Tue, 18 Jun 2019 18:14:33 +0000 (11:14 -0700)

commit-graph: clean up chains after flattened write

If we write a commit-graph file without the split option, then
we write to $OBJDIR/info/commit-graph and start to ignore
the chains in $OBJDIR/info/commit-graphs/.

Unlink the commit-graph-chain file and expire the graph-{hash}.graph
files in $OBJDIR/info/commit-graphs/ during every write.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: verify chains with --shallow modeDerrick Stolee Tue, 18 Jun 2019 18:14:32 +0000 (11:14 -0700)

commit-graph: verify chains with --shallow mode

If we wrote a commit-graph chain, we only modified the tip file in
the chain. It is valuable to verify what we wrote, but not waste
time checking files we did not write.

Add a '--shallow' option to the 'git commit-graph verify' subcommand
and check that it does not read the base graph in a two-file chain.

Making the verify subcommand read from a chain of commit-graphs takes
some rearranging of the builtin code.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: create options for split filesDerrick Stolee Tue, 18 Jun 2019 18:14:32 +0000 (11:14 -0700)

commit-graph: create options for split files

The split commit-graph feature is now fully implemented, but needs
some more run-time configurability. Allow direct callers to 'git
commit-graph write --split' to specify the values used in the
merge strategy and the expire time.

Update the documentation to specify these values.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: expire commit-graph filesDerrick Stolee Tue, 18 Jun 2019 18:14:31 +0000 (11:14 -0700)

commit-graph: expire commit-graph files

As we merge commit-graph files in a commit-graph chain, we should clean
up the files that are no longer used.

This change introduces an 'expiry_window' value to the context, which is
always zero (for now). We then check the modified time of each
graph-{hash}.graph file in the $OBJDIR/info/commit-graphs folder and
unlink the files that are older than the expiry_window.

Since this is always zero, this immediately clears all unused graph
files. We will update the value to match a config setting in a future
change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: allow cross-alternate chainsDerrick Stolee Tue, 18 Jun 2019 18:14:30 +0000 (11:14 -0700)

commit-graph: allow cross-alternate chains

In an environment like a fork network, it is helpful to have a
commit-graph chain that spans both the base repo and the fork repo. The
fork is usually a small set of data on top of the large repo, but
sometimes the fork is much larger. For example, git-for-windows/git has
almost double the number of commits as git/git because it rebases its
commits on every major version update.

To allow cross-alternate commit-graph chains, we need a few pieces:

1. When looking for a graph-{hash}.graph file, check all alternates.

2. When merging commit-graph chains, do not merge across alternates.

3. When writing a new commit-graph chain based on a commit-graph file
in another object directory, do not allow success if the base file
has of the name "commit-graph" instead of
"commit-graphs/graph-{hash}.graph".

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: merge commit-graph chainsDerrick Stolee Tue, 18 Jun 2019 18:14:29 +0000 (11:14 -0700)

commit-graph: merge commit-graph chains

When searching for a commit in a commit-graph chain of G graphs with N
commits, the search takes O(G log N) time. If we always add a new tip
graph with every write, the linear G term will start to dominate and
slow the lookup process.

To keep lookups fast, but also keep most incremental writes fast, create
a strategy for merging levels of the commit-graph chain. The strategy is
detailed in the commit-graph design document, but is summarized by these
two conditions:

1. If the number of commits we are adding is more than half the number
of commits in the graph below, then merge with that graph.

2. If we are writing more than 64,000 commits into a single graph,
then merge with all lower graphs.

The numeric values in the conditions above are currently constant, but
can become config options in a future update.

As we merge levels of the commit-graph chain, check that the commits
still exist in the repository. A garbage-collection operation may have
removed those commits from the object store and we do not want to
persist them in the commit-graph chain. This is a non-issue if the
'git gc' process wrote a new, single-level commit-graph file.

After we merge levels, the old graph-{hash}.graph files are no longer
referenced by the commit-graph-chain file. We will expire these files in
a future change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: add --split option to builtinDerrick Stolee Tue, 18 Jun 2019 18:14:28 +0000 (11:14 -0700)

commit-graph: add --split option to builtin

Add a new "--split" option to the 'git commit-graph write' subcommand. This
option allows the optional behavior of writing a commit-graph chain.

The current behavior will add a tip commit-graph containing any commits that
are not in the existing commit-graph or commit-graph chain. Later changes
will allow merging the chain and expiring out-dated files.

Add a new test script (t5324-split-commit-graph.sh) that demonstrates this
behavior.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>