an apparent typo for the environment variable was included with 81567caf87
("trace2: update docs to describe system/global config settings",
2019-04-15), and was missed when renamed variables by e4b75d6a1d
("trace2: rename environment variables to GIT_TRACE2*", 2019-05-19)
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit 4a72486de9 ("fix cherry-pick/revert status after commit",
2019-04-16) used parse_insn_line() to parse the first line of the todo
list to check if it was a pick or revert. However if the todo list is
left over from an old cherry-pick or revert and references a commit that
no longer exists then parse_insn_line() prints an error message which is
confusing for users [1]. Instead parse just the command name so that the
user is alerted to the presence of stale sequencer state by status
reporting that a cherry-pick or revert is in progress.
Note that we should not be leaving stale sequencer state lying around
(or at least not as often) after commit b07d9bfd17 ("commit/reset: try
to clean up sequencer state", 2019-04-16). However the user may still
have stale state that predates that commit.
Also avoid printing an error message if for some reason the user has a
file called `sequencer` in $GIT_DIR.
Factor out the code that parses the name of the command at the start of
each line in the todo file into its own function so that it can be used
in the next commit.
As I don't want to burden other callers with having to pass in a pointer
to the end of the line the test for an abbreviated command is
changed. This change should not affect the behavior. Instead of testing
`eol == bol + 1` the new code checks for the end of the line by testing
for '\n', '\r' or '\0' following the abbreviated name. To avoid reading
past the end of an empty string it also checks that there is actually a
single character abbreviation before testing if it matches. This
prevents it from matching '\0' as the abbreviated name and then trying
to read another character.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code that parses the todo list allows an unabbreviated command name
to be followed by a space or a tab, but if the command name is
abbreviated it only allows a space after it. Fix this inconsistency.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When run using GIT_TEST_PROTOCOL_VERSION=2, a test in t5551 fails
because 4 POSTs (probe, ls-refs, probe, fetch) are sent instead of 2
(probe, fetch).
One way to resolve this would be to relax the condition (from "= 2" to
greater than 1, say), but upon further inspection, the test probably
shouldn't be counting the number of POSTs. This test states that large
requests are split across POSTs, but this is not correct; the main
change is that chunked transfer encoding is used, but the request is
still contained within one POST. (The test coincidentally works because
Git indeed sends 2 POSTs in the case of a large request, but that is
because, as stated above, the first POST is a probing RPC - see
post_rpc() in remote-curl.c for more information.)
Therefore, instead of counting POSTs, check that chunked transfer
encoding is used. This also has the desirable side effect of passing
with GIT_TEST_PROTOCOL_VERSION=2.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Acked-by: Derrick Stolee <dstolee@microsoft.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t5551: use 'test_i18ngrep' to check translated output
The two tests 'invalid Content-Type rejected' and 'server-side error
detected' in 't5551-http-fetch-smart.sh' use "plain" 'grep' to check
that 'git clone' failed with the expected error message, but the
messages they are checking are translated, and, consequently, these
tests fail when the test script is run with GIT_TEST_GETTEXT_POISON
enabled.
Use 'test_i18ngrep' instead.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
leads to an error stating that the option --<option> is unknown to
submodule--helper. That is of course only, when <option> is not a valid
option for git submodule foreach.
The reason for this is, that above call is internally translated into a
call to submodule--helper:
This call starts by executing the subcommand with its option inside the
first level submodule and continues by calling the next iteration of
the submodule foreach call
inside the first level submodule. Note that the double dash in front of
the subcommand is missing.
This problem starts to arise only recently, as the
PARSE_OPT_KEEP_UNKNOWN flag for the argument parsing of git submodule
foreach was removed in commit a282f5a906. Hence, the unknown option is
complained about now, as the argument parsing is not properly ended by
the double dash.
This commit fixes the problem by adding the double dash in front of the
subcommand during the recursion.
Signed-off-by: Morian Sonnet <moriansonnet@googlemail.com> Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ignore .dll files copied into the top-level directory.
Ignore MSVC incremental compiler output files.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
For regular debugging, it is pretty helpful when a debug assertion in a
running application triggers a window that offers to start the debugger.
However, when running the test suite, it is not so helpful, in
particular when the debug assertions are then suppressed anyway because
we disable the invalid parameter checking (via invalidcontinue.obj, see
the comment in config.mak.uname about that object for more information).
So let's simply disable that window in Debug Mode (it is already
disabled in Release Mode).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This special-cases various signals that are not supported on Windows,
such as SIGPIPE. These cause the UCRT to throw asserts (at least in
debug mode).
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
MSVC can be overzealous about some warnings. Disable them.
Signed-off-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
msvc: add a compile-time flag to allow detailed heap debugging
MS Visual C comes with a few neat features we can use to analyze the
heap consumption (i.e. leaks, max memory, etc).
With this patch, we introduce support via the build-time flag
`USE_MSVC_CRTDBG`.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
With this patch, Git can be built using the Microsoft toolchain, via:
make MSVC=1 [DEBUG=1]
Third party libraries are built from source using the open source
"vcpkg" tool set. See https://github.com/Microsoft/vcpkg
On a first build, the vcpkg tools and the third party libraries are
automatically downloaded and built. DLLs for the third party libraries
are copied to the top-level (and t/helper) directory to facilitate
debugging. See compat/vcbuild/README.
A series of .bat files are invoked by the Makefile to find the location
of the installed version of Visual Studio and the associated compiler
tools (essentially replicating the environment setup performed by a
"Developer Command Prompt"). This should find the most recent VS2015 or
VS2017 installation. Output from these scripts are used by the Makefile
to define compiler and linker pathnames and -I and -L arguments.
The build produces .pdb files for both debug and release builds.
Note: This commit was squashed from an organic series of commits
developed between 2016 and 2018 in Git for Windows' `master` branch.
This combined commit eliminates the obsolete commits related to fetching
NuGet packages for third party libraries. It is difficult to use NuGet
packages for C/C++ sources because they may be built by earlier versions
of the MSVC compiler and have CRT version and linking issues.
Additionally, the C/C++ NuGet packages that we were using tended to not
be updated concurrently with the sources. And in the case of cURL and
OpenSSL, this could expose us to security issues.
Helped-by: Yue Lin Ho <b8732003@student.nsysu.edu.tw> Helped-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pager: add a helper function to clear the last line in the terminal
There are a couple of places where we want to clear the last line on
the terminal, e.g. when a progress bar line is overwritten by a
shorter line, then the end of that progress line would remain visible,
unless we cover it up.
In 'progress.c' we did this by always appending a fixed number of
space characters to the next line (even if it was not shorter than the
previous), but as it turned out that fixed number was not quite large
enough, see the fix in 9f1fd84e15 (progress: clear previous progress
update dynamically, 2019-04-12). From then on we've been keeping
track of the length of the last displayed progress line and appending
the appropriate number of space characters to the next line, if
necessary, but, alas, this approach turned out to be error prone, see
the fix in 1aed1a5f25 (progress: avoid empty line when breaking the
progress line, 2019-05-19). The next patch in this series is about to
fix a case where we don't clear the last line, and on occasion do end
up with such garbage at the end of the line. It would be great if we
could do that without the need to deal with that without meticulously
computing the necessary number of space characters.
So add a helper function to clear the last line on the terminal using
an ANSI escape sequence, which has the advantage to clear the whole
line no matter how wide it is, even after the terminal width changed.
Such an escape sequence is not available on dumb terminals, though, so
in that case fall back to simply print a whole terminal width (as
reported by term_columns()) worth of space characters.
In 'editor.c' launch_specified_editor() already used this ANSI escape
sequence, so replace it with a call to this function.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t3404: make the 'rebase.missingCommitsCheck=ignore' test more focused
The test 'rebase -i respects rebase.missingCommitsCheck = warn' is
mainly interested in the warning about the dropped commits, but it
checks the whole output of 'git rebase', including progress lines and
what not that are not at all relevant to 'rebase.missingCommitsCheck',
but make it necessary to update this test whenever e.g. the way we
show progress is updated (as it will happen in one of the later
patches of this series).
Modify the test to verify only the first four lines of 'git rebase's
output that contain all the important lines, notably the line
containing the "Warning:" itself and the oneline log of the dropped
commit.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 't3404-rebase-interactive.sh' the expected output of several tests
is prepared from here documents, which are outside of
'test_expect_success' blocks and have spaces around redirection
operators.
Move these here documents into the corresponding 'test_expect_success'
block and avoid spaces between filename and redition operators.
Furthermore, quote the here docs' delimiter word to prevent parameter
expansions and what not, where applicable.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just return the value of the factor or zero for unrecognized strings
instead of using an output reference and a separate return value to
indicate success. This is shorter and simpler.
It basically reverts that function to before c8deb5a146 ("Improve error
messages when int/long cannot be parsed from config", 2007-12-25), while
keeping the better messages, so restore its old name, get_unit_factor(),
as well.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_unit_factor() multiplies the number that is passed to it with the
value of a recognized unit factor (K, M or G for 2^10, 2^20 and 2^30,
respectively). All callers pass in 1 as a number, though, which allows
them to check the actual multiplication for overflow before they are
doing it themselves.
Ignore the passed in number and don't multiply, as this feature of
parse_unit_factor() is not used anymore. Rename the output parameter to
reflect that it's not about the end result anymore, but just about the
unit factor.
Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
config: use unsigned_mult_overflows to check for overflows
parse_unit_factor() checks if a K, M or G is present after a number and
multiplies it by 2^10, 2^20 or 2^30, respectively. One of its callers
checks if the result is smaller than the number alone to detect
overflows. The other one passes 1 as the number and does multiplication
and overflow check itself in a similar manner.
This works, but is inconsistent, and it would break if we added support
for a bigger unit factor. E.g. 16777217T is 2^64 + 2^40, i.e. too big
for a 64-bit number. Modulo 2^64 we get 2^40 == 1TB, which is bigger
than the raw number 16777217 == 2^24 + 1, so the overflow would go
undetected by that method.
Let both callers pass 1 and handle overflow check and multiplication
themselves. Do the check before the multiplication, using
unsigned_mult_overflows, which is simpler and can deal with larger unit
factors.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
On a case-insensitive filesystem, such as HFS+ or NTFS, it is possible
that the idea Bash has of the current directory differs in case from
what Git thinks it is. That's totally okay, though, and we should not
expect otherwise.
On Windows, for example, when you call
cd C:\GIT-SDK-64
in a PowerShell and there exists a directory called `C:\git-sdk-64`, the
current directory will be reported in all upper-case. Even in a Bash
that you might call from that PowerShell. Git, however, will have
normalized this via `GetFinalPathByHandle()`, and the expectation in
t0001 that the recorded gitdir will match what `pwd` says will be
violated.
Let's address this by comparing these paths in a case-insensitive
manner when `core.ignoreCase` is `true`.
Reported by Jameson Miller.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We implement a command called git-psuh which accept arguments, so let's
show that it accepts arguments in the doc and the usage string.
While at it, we need to prepare "a NULL-terminated array of usage strings",
not just "a NULL-terminated usage string".
Helped-by: Emily Shaffer <emilyshaffer@google.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a couple of tests that would potentially fail under
GIT_TEST_FAIL_PREREQS=true.
I missed these when annotating other tests in dfe1a17df9 ("tests: add
a special setup where prerequisites fail", 2019-05-13) because on my
system I can only reproduce this failure when I run the tests as
"root", since the tests happen to depend on whether we can fall back
on GECOS info or not. I.e. they'd usually fail to look up the ident
info anyway, but not always.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description about slashes in gitignore patterns (used to
indicate things like "anchored to this level only" and "only
matches directories") has been revamped.
* an/ignore-doc-update:
gitignore.txt: make slash-rules more readable
The filter_data used in the list-objects-filter (which manages a
lazily sparse clone repository) did not use the dynamic array API
correctly---'nr' is supposed to point at one past the last element
of the array in use. This has been corrected.
* md/list-objects-filter-memfix:
list-objects-filter: correct usage of ALLOC_GROW
"git fetch" into a lazy clone forgot to fetch base objects that are
necessary to complete delta in a thin packfile, which has been
corrected.
* jt/partial-clone-missing-ref-delta-base:
t5616: cover case of client having delta base
t5616: use correct flag to check object is missing
index-pack: prefetch missing REF_DELTA bases
t5616: refactor packfile replacement
The 'git fetch' command can avoid calculating forced updates, so
allow users of 'git pull' to provide that option. This is particularly
necessary when the advice to use '--no-show-forced-updates' is given
at the end of the command.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch: warn about forced updates in branch listing
The --[no-]show-forced-updates option in 'git fetch' can be confusing
for some users, especially if it is enabled via config setting and not
by argument. Add advice to warn the user that the (forced update)
messages were not listed.
Additionally, warn users when the forced update check takes longer
than ten seconds, and recommend that they disable the check. These
messages can be disabled by the advice.fetchShowForcedUpdates config
setting.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
After updating a set of remove refs during a 'git fetch', we walk the
commits in the new ref value and not in the old ref value to discover
if the update was a forced update. This results in two things happening
during the command:
1. The line including the ref update has an additional "(forced-update)"
marker at the end.
2. The ref log for that remote branch includes a bit saying that update
is a forced update.
For many situations, this forced-update message happens infrequently, or
is a small bit of information among many ref updates. Many users ignore
these messages, but the calculation required here slows down their fetches
significantly. Keep in mind that they do not have the opportunity to
calculate a commit-graph file containing the newly-fetched commits, so
these comparisons can be very slow.
Add a '--[no-]show-forced-updates' option that allows a user to skip this
calculation. The only permanent result is dropping the forced-update bit
in the reflog.
Include a new fetch.showForcedUpdates config setting that allows this
behavior without including the argument in every command. The config
setting is overridden by the command-line arguments.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
status: ignore status.aheadbehind in porcelain formats
Teach porcelain V[12] formats to ignore the status.aheadbehind
config setting. They only respect the --[no-]ahead-behind
command line argument. This is for backwards compatibility
with existing scripts.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ahead/behind calculation in 'git status' can be slow in some
cases. Users may not realize that there are ways to avoid this
computation, especially if they are not using the information.
Add a warning that appears if this calculation takes more than
two seconds. The warning can be disabled through the new config
setting advice.statusAheadBehind.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --[no-]ahead-behind option was introduced in fd9b544a
(status: add --[no-]ahead-behind to status and commit for V2
format, 2018-01-09). This is a necessary change of behavior
in repos where the remote tracking branches can move very
quickly ahead of the local branches. However, users need to
remember to provide the command-line argument every time.
Add a new "status.aheadBehind" config setting to change the
default behavior of all git status formats.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before this patch, there is inconsistency between the status
messages with hints and the ones without hints: there is an
empty line between the title and the file list if hints are
presented, but there isn't one if there are no hints.
This patch remove the inconsistency by removing the empty
lines even if hints are presented.
Signed-off-by: John Lin <johnlinp@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
msvc: update Makefile to allow for spaces in the compiler path
It is quite common that MS Visual C++ is installed into a location whose
path contains spaces, therefore we need to quote it.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
VS2015 complains when using a const pointer in memcpy()/free().
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In UCRT's fcntl.h, _O_RDONLY, _O_WRONLY and _O_RDWR are defined as 0, 1
and 2, respectively. Yes, that means that UCRT breaks with the tradition
that O_RDWR == O_RDONLY | O_WRONLY.
It is a perfectly legal way to define those constants, though, therefore
we need to take care of defining O_ACCMODE accordingly.
This is particularly important in order to keep our "open() can set
errno to EISDIR" emulation working: it tests that (flags & O_ACCMODE) is
not identical to O_RDONLY before going on to test specifically whether
the file for which open() reported EACCES is, in fact, a directory.
Signed-off-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows has special code to retrieve the command-line parameters
(and even the environment) in UTF-16 encoding, so that they can be
converted to UTF-8. This is necessary because Git for Windows wants to
use UTF-8 encoded strings throughout its code, and the main() function
does not get the parameters in that encoding.
To do that, we used the __wgetmainargs() function, which is not even a
Win32 API function, but provided by the MINGW "runtime" instead.
Obviously, this method would not work with any compiler other than GCC,
and in preparation for compiling with Visual C++, we would like to avoid
precisely that.
Lucky us, there is a much more elegant way: we can simply implement the
UTF-16 variant of `main()`: `wmain()`.
To make that work, we need to link with -municode. The command-line
parameters are passed to `wmain()` encoded in UTF-16, as desired, and
this method also works with GCC, and also with Visual C++ after
adjusting the MSVC linker flags to force it to use `wmain()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
cache-tree/blame: avoid reusing the DEBUG constant
In MS Visual C, the `DEBUG` constant is set automatically whenever
compiling with debug information.
This is clearly not what was intended in `cache-tree.c` nor in
`builtin/blame.c`, so let's use a less ambiguous name there.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like the natural line ending for Unix shell scripts consist of a
single Line Feed, the natural line ending for (DOS) Batch scripts
consists of a Carriage Return followed by a Line Feed.
It seems that both Unix shell script interpreters and the interpreter
for Batch scripts (`cmd.exe`) are keen on seeing the "right" line
endings.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The msysGit project (i.e. Git for Windows 1.x' SDK) is safely dead for
*years* already. This is probably the reason why nobody caught this typo
until Carlo Arenas spotted a copy-edited version of it nearby.
It is probably about time to rip out the remainders of msysGit/MSys1
support, but that can safely wait a bit longer, and we can at least fix
the typo for now.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch-pack: print server version at the top in -v -v
Before the previous patch, the server version is printed after all the
"Server supports" lines. The previous one puts the version in the middle
of "Server supports" group.
Instead of moving it to the bottom, I move it to the top. Version may
stand out more at the top as we will have even more debug out after
capabilities.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch-pack: print all relevant supported capabilities with -v -v
When we check if some capability is supported, we do print something in
verbose mode. Some capabilities are not printed though (and it made me
think it's not supported; I was more used to GIT_TRACE_PACKET) so let's
print them all.
It's a bit more code. And one could argue for printing all supported
capabilities the server sends us. But I think it's still valuable this
way because we see the capabilities that the client cares about.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch-pack: move capability names out of i18n strings
This reduces the work on translators since they only have one string to
translate (and I think it's still enough context to translate). It also
makes sure no capability name is translated by accident.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
blame: use the fingerprint heuristic to match ignored lines
This commit integrates the fuzzy fingerprint heuristic into
guess_line_blames().
We actually make two passes. The first pass uses the fuzzy algorithm to
find a match within the current diff chunk. If that fails, the second
pass searches the entire parent file for the best match.
For an example of scanning the entire parent for a match, consider:
That's because commit X actually consists of two chunks: one chunk is
removing sys/header_a.h, then some context, and the second chunk is
adding sys/header_a.h.
blame: add a fingerprint heuristic to match ignored lines
This algorithm will replace the heuristic used to identify lines from
ignored commits with one that finds likely candidate lines in the
parent's version of the file. The actual replacement occurs in an
upcoming commit.
The old heuristic simply assigned lines in the target to the same line
number (plus offset) in the parent. The new function uses a
fingerprinting algorithm to detect similarity between lines.
The new heuristic is designed to accurately match changes made
mechanically by formatting tools such as clang-format and clang-tidy.
These tools make changes such as breaking up lines to fit within a
character limit or changing identifiers to fit with a naming convention.
The heuristic is not intended to match more extensive refactoring
changes and may give misleading results in such cases.
In most cases formatting tools preserve line ordering, so the heuristic
is optimised for such cases. (Some types of changes do reorder lines
e.g. sorting keep the line content identical, the git blame -M option
can already be used to address this). The reason that it is advantageous
to rely on ordering is due to source code repeating the same character
sequences often e.g. declaring an identifier on one line and using that
identifier on several subsequent lines. This means that lines can look
very similar to each other which presents a problem when doing fuzzy
matching. Relying on ordering gives us extra clues to point towards the
true match.
The heuristic operates on a single diff chunk change at a time. It
creates a “fingerprint” for each line on each side of the change.
Fingerprints are described in detail in the comment for `struct
fingerprint`, but essentially are a multiset of the character pairs in a
line. The heuristic first identifies the line in the target entry whose
fingerprint is most clearly matched to a line fingerprint in the parent
entry. Where fingerprints match identically, the position of the lines
is used as a tie-break. The heuristic locks in the best match, and
subtracts the fingerprint of the line in the target entry from the
fingerprint of the line in the parent entry to prevent other lines being
matched on the same parts of that line. It then repeats the process
recursively on the section of the chunk before the match, and then the
section of the chunk after the match.
Here's an example of the difference the fingerprinting makes. Consider
a file with two commits:
Note line 2 could be matched with either commit-a or commit-b as it is
equally similar to both lines, but is matched with commit-a because its
position as a fraction of the new line range is more similar to commit-a
as a fraction of the old line range. Line 4 is also equally similar to
both lines, but as it appears after line 3 which will be matched first
it cannot be matched with an earlier line.
For many more examples, see t/t8014-blame-ignore-fuzzy.sh which contains
example parent and target files and the line numbers in the parent that
must be matched.
Signed-off-by: Michael Platings <michael@platin.gs> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though dwim is enabled by default, it will never be done when
--detached is specified. If you force "-d --guess" you will get an error
because --guess then implies -c which cannot be used with -d. So we can
disable dwim in "switch -d". It makes the completion list in this case a
bit shorter.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c45f0f525d (switch: reject if some operation is in progress,
2019-03-29), a check is added to prevent switching when some operation
is in progress. The reason is it's often not safe to do so.
This is true for merge, am, rebase, cherry-pick and revert, but not so
much for bisect because bisecting is basically jumping/switching between
a bunch of commits to pin point the first bad one. git-bisect suggests
the next commit to test, but it's not wrong for the user to test a
different commit because git-bisect cannot have the knowledge to know
better.
For this reason, allow to switch when bisecting (*). I considered if we
should still prevent switching by default and allow it with
--ignore-in-progress. But I don't think the prevention really adds
anything much.
If the user switches away by mistake, since we print the previous HEAD
value, even if they don't know about the "-" shortcut, switching back is
still possible.
The warning will be printed on every switch while bisect is still
ongoing, not the first time you switch away from bisect's suggested
commit, so it could become a bit annoying.
(*) of course when it's safe to do so, i.e. no loss of local changes and
stuff.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
rev-list: teach --no-object-names to enable piping
Allow easier parsing by cat-file by giving rev-list an option to print
only the OID of a non-commit object without any additional information.
This is a short-term shim; later on, rev-list should be taught how to
print the types of objects it finds in a format similar to cat-file's.
Before this commit, the output from rev-list needed to be massaged
before being piped to cat-file, like so:
There are no callers left of sha1hash() that do not simply pass the
"hash" member of a "struct object_id". Let's get rid of the outdated
sha1-specific function and provide one that operates on the whole struct
(even though the technique, taking the first few bytes of the hash, will
remain the same).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our hashmap.h helpfully defines a sha1hash() function. But it cannot
define a similar oidhash() without including all of cache.h, which
itself wants to include hashmap.h! Let's break this circular dependency
by moving the definition to hash.h, along with the remaining RAWSZ
macros, etc. That will put them with the existing git_hash_algo
definition.
One alternative would be to move oidhash() into cache.h, but it's
already quite bloated. We're better off moving things out than in.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
For use in object_id hash tables, we have oid_hash() and oid_equal().
But these are confusingly similar to the existing oideq() and the
oidhash() we plan to add to replace sha1hash().
The big difference from those functions is that rather than accepting a
const pointer to the "struct object_id", we take the arguments by value
(which is a khash internal convention). So let's make that obvious by
calling them oidhash_by_value() and oideq_by_value().
Those names are fairly horrendous to type, but we rarely need to do so;
they are passed to the khash implementation macro and then only used
internally. Callers get to use the nice kh_put_oid_map(), etc.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the callers of khash_sha1 and khash_sha1_pos have been removed,
in favor of using maps that use "struct object_id" as their keys. Let's
drop these now-obsolete types.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
delta-islands: convert island_marks khash to use oids
All of the users of this map have an actual "struct object_id" rather
than a bare sha1. Let's use the more descriptive type (and get one step
closer to dropping khash_sha1 entirely).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
khash lets us define a hash as either a map or a set (i.e., with no
"value" type). For the oid maps we define, "oid" is the set and
"oid_map" is the map. As the bug in the previous commit shows, it's easy
to pick the wrong one.
So let's make the names more distinct: "oid_set" and "oid_map".
An alternative naming scheme would be to actually name the type after
the key/value types. So e.g., "oid" _would_ be the set, since it has no
value type. And "oid_map" would become "oid_void" or similar (and
"oid_pos" becomes "oid_int"). That's better in some ways: it's more
regular, and a given map type can be more reasily reused in multiple
contexts (e.g., something storing an "int" that isn't a "pos"). But it's
also slightly less descriptive.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 5a8643eff1 (khash: move oid hash table definition, 2019-02-19)
added a khash "oid_map" type to match the existing "oid" type, which is
a simple set (i.e., just keys, no values). But in setting up the
khash_oid_map typedef, it accidentally referred to "kh_oid_t", which is
the set type.
Nobody noticed the breakage because there are not yet any callers; the
type was added just as a match to the existing sha1 types (whose map
type confusingly _is_ called khash_sha1, and it has no matching set
type).
We could easily fix this with s/oid/oid_map/ in the typedef. But let's
take this a step further, and just drop the typedef entirely. These
typedefs were added by 5a8643eff1 to match the khash_sha1 typedefs. But
the actual khash-derived type names are descriptive enough; this is just
adding an extra layer of indirection. The khash names do not quite
follow our usual style (e.g., they end in "_t"), but since we end up
using other khash names (e.g., khiter_t, kh_get_oid()) anyway, just
typedef-ing the struct name is not really helping much.
And there are already many cases where we use the raw khash type names
anyway (e.g., the "set" variant defined just above us does not have such
a typedef!).
So let's drop this typedef, and the matching oid_pos one (which actually
_does_ have a user, but we can easily convert it).
We'll leave the khash_sha1 typedef around. The ultimate fate of its
callers should be conversion to kh_oid_map_t, so there's no point in
going through the noise of changing the names now.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no callers left of create_object() that aren't just passing us
the "hash" member of a "struct object_id". Let's take the whole struct,
which gets us closer to removing all raw sha1 variables.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that lookup_object() has an object_id, we can consistently pass that
around instead of a raw sha1. We still convert to a hash to pass to
sha1hash(), but the goal is for that to go away shortly.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no callers left of lookup_object() that aren't just passing us
the "hash" member of a "struct object_id". Let's take the whole struct,
which gets us closer to removing all raw sha1 variables. It also
matches the existing conversions of lookup_blob(), etc.
The conversions of callers were done by hand, but they're all mechanical
one-liners.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object: convert lookup_unknown_object() to use object_id
There are no callers left of lookup_unknown_object() that aren't just
passing us the "hash" member of a "struct object_id". Let's take the
whole struct, which gets us closer to removing all raw sha1 variables.
It also matches the existing conversions of lookup_blob(), etc.
The conversions of callers were done by hand, but they're all mechanical
one-liners.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-objects: convert locate_object_entry_hash() to object_id
There are no callers of locate_object_entry_hash() that aren't just
passing us the "hash" member of a "struct object_id". Let's take the
whole struct, which gets us closer to removing all raw sha1 variables.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-objects: convert packlist_find() to use object_id
We take a raw hash pointer, but most of our callers have a "struct
object_id" already. Let's switch to taking the full struct, which will
let us continue removing uses of raw sha1 buffers.
There are two callers that do need special attention:
- in rebuild_existing_bitmaps(), we need to switch to
nth_packed_object_oid(). This incurs an extra hash copy over
pointing straight to the mmap'd sha1, but it shouldn't be measurable
compared to the rest of the operation.
- in can_reuse_delta() we already spent the effort to copy the sha1
into a "struct object_id", but now we just have to do so a little
earlier in the function (we can't easily convert that function's
callers because they may be pointing at mmap'd REF_DELTA blocks).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-bitmap-write: convert some helpers to use object_id
A few functions take raw hash pointers, but all of their callers
actually have a "struct object_id". Let's retain that extra type as long
as possible (which will let future patches extend that further, and so
on).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This variable is a "struct object_id", but uses the old-style name
"sha1". Let's call it oid to match more modern code (and make it clear
that it can handle any algorithm, since it uses parse_oid_hex()
properly).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The find_commit_name() function passes an object_id.hash as the key of a
hashmap. That ends up in commit_name_neq(), which then feeds it to
oideq(). Which means we should actually be the whole "struct object_id".
It works anyway because pointers to the two are interchangeable. And
because we're going through a layer of void pointers, the compiler
doesn't notice the type mismatch.
But it's worth cleaning up (especially since once we switch away from
sha1hash() on the same line, accessing the hash member will look doubly
out of place).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch: only run 'gc' once when fetching multiple remotes
In multiple remotes mode, git-fetch is launched for n-1 remotes and the
last remote is handled by the current process. Each of these processes
will in turn run 'gc' at the end.
This is not really a problem because even if multiple 'gc --auto' is run
at the same time we still handle it correctly. It does show multiple
"auto packing in the background" messages though. And we may waste some
resources when gc actually runs because we still do some stuff before
checking the lock and moving it to background.
So let's try to avoid that. We should only need one 'gc' run after all
objects and references are added anyway. Add a new option --no-auto-gc
that will be used by those n-1 processes. 'gc --auto' will always run on
the main fetch process (*).
(*) even if we fetch remotes in parallel at some point in future, this
should still be fine because we should "join" all those processes
before this step.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git commit-graph verify' subcommand loads a commit-graph from
a given object directory instead of using the standard method
prepare_commit_graph(). During development of load_commit_graph_chain(),
a version did not include prepare_alt_odb() as it was previously
run by prepare_commit_graph() in most cases.
Add a test that prevents that mistake from happening again.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing commit-graph files, we append path data to an
object directory, which may be specified by the user via the
'--object-dir' option. If the user supplies a trailing slash,
or some other alternative path format, the resulting path may
be usable for writing to the correct location. However, when
expiring graph files from the <obj-dir>/info/commit-graphs
directory during a write, we need to compare paths with exact
string matches.
Normalize the commit-graph filenames to avoid ambiguity. This
creates extra allocations, but this is a constant multiple of
the number of commit-graph files, which should be a number in
the single digits.
Further normalize the object directory in the context. Due to
a comparison between g->obj_dir and ctx->obj_dir in
split_graph_merge_strategy(), a trailing slash would prevent
any merging of layers within the same object directory. The
check is there to ensure we do not merge across alternates.
Update the tests to include a case with this trailing slash
problem.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit-graph: test --split across alternate without --split
We allow sharing commit-graph files across alternates. When we are
writing a split commit-graph, we allow adding tip graph files that
are not in the alternate, but include commits from our local repo.
However, if our alternate is not using the split commit-graph format,
its file is at .git/objects/info/commit-graph and we are trying to
write files in .git/objects/info/commit-graphs/graph-{hash}.graph.
We already have logic to ensure we do not merge across alternate
boundaries, but we also cannot have a commit-graph chain to our
alternate if uses the old filename structure.
Create a test that verifies we create a new split commit-graph
with only one level and we do not modify the existing commit-graph
in the alternate.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Octopus merges require an extra chunk of data in the commit-graph
file format. Create a test that ensures the new --split option
continues to work with an octopus merge. Specifically, ensure
that the octopus merge has parents across layers to truly check
that our graph position logic holds up correctly.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit-graph: clean up chains after flattened write
If we write a commit-graph file without the split option, then
we write to $OBJDIR/info/commit-graph and start to ignore
the chains in $OBJDIR/info/commit-graphs/.
Unlink the commit-graph-chain file and expire the graph-{hash}.graph
files in $OBJDIR/info/commit-graphs/ during every write.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we wrote a commit-graph chain, we only modified the tip file in
the chain. It is valuable to verify what we wrote, but not waste
time checking files we did not write.
Add a '--shallow' option to the 'git commit-graph verify' subcommand
and check that it does not read the base graph in a two-file chain.
Making the verify subcommand read from a chain of commit-graphs takes
some rearranging of the builtin code.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The split commit-graph feature is now fully implemented, but needs
some more run-time configurability. Allow direct callers to 'git
commit-graph write --split' to specify the values used in the
merge strategy and the expire time.
Update the documentation to specify these values.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we merge commit-graph files in a commit-graph chain, we should clean
up the files that are no longer used.
This change introduces an 'expiry_window' value to the context, which is
always zero (for now). We then check the modified time of each
graph-{hash}.graph file in the $OBJDIR/info/commit-graphs folder and
unlink the files that are older than the expiry_window.
Since this is always zero, this immediately clears all unused graph
files. We will update the value to match a config setting in a future
change.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In an environment like a fork network, it is helpful to have a
commit-graph chain that spans both the base repo and the fork repo. The
fork is usually a small set of data on top of the large repo, but
sometimes the fork is much larger. For example, git-for-windows/git has
almost double the number of commits as git/git because it rebases its
commits on every major version update.
To allow cross-alternate commit-graph chains, we need a few pieces:
1. When looking for a graph-{hash}.graph file, check all alternates.
2. When merging commit-graph chains, do not merge across alternates.
3. When writing a new commit-graph chain based on a commit-graph file
in another object directory, do not allow success if the base file
has of the name "commit-graph" instead of
"commit-graphs/graph-{hash}.graph".
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When searching for a commit in a commit-graph chain of G graphs with N
commits, the search takes O(G log N) time. If we always add a new tip
graph with every write, the linear G term will start to dominate and
slow the lookup process.
To keep lookups fast, but also keep most incremental writes fast, create
a strategy for merging levels of the commit-graph chain. The strategy is
detailed in the commit-graph design document, but is summarized by these
two conditions:
1. If the number of commits we are adding is more than half the number
of commits in the graph below, then merge with that graph.
2. If we are writing more than 64,000 commits into a single graph,
then merge with all lower graphs.
The numeric values in the conditions above are currently constant, but
can become config options in a future update.
As we merge levels of the commit-graph chain, check that the commits
still exist in the repository. A garbage-collection operation may have
removed those commits from the object store and we do not want to
persist them in the commit-graph chain. This is a non-issue if the
'git gc' process wrote a new, single-level commit-graph file.
After we merge levels, the old graph-{hash}.graph files are no longer
referenced by the commit-graph-chain file. We will expire these files in
a future change.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new "--split" option to the 'git commit-graph write' subcommand. This
option allows the optional behavior of writing a commit-graph chain.
The current behavior will add a tip commit-graph containing any commits that
are not in the existing commit-graph or commit-graph chain. Later changes
will allow merging the chain and expiring out-dated files.
Add a new test script (t5324-split-commit-graph.sh) that demonstrates this
behavior.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>