setup: fix memory leaks with `struct repository_format`
After we set up a `struct repository_format`, it owns various pieces of
allocated memory. We then either use those members, because we decide we
want to use the "candidate" repository format, or we discard the
candidate / scratch space. In the first case, we transfer ownership of
the memory to a few global variables. In the latter case, we just
silently drop the struct and end up leaking memory.
Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a
function `clear_repository_format()`, to be used on each side of
`read_repository_format()`. To have a clear and simple memory ownership,
let all users of `struct repository_format` duplicate the strings that
they take from it, rather than stealing the pointers.
Call `clear_...()` at the start of `read_...()` instead of just zeroing
the struct, since we sometimes enter the function multiple times. Thus,
it is important to initialize the struct before calling `read_...()`, so
document that. It's also important because we might not even call
`read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c.
Teach `read_...()` to clear the struct on error, so that it is reset to
a safe state, and document this. (In `setup_git_directory_gently()`, we
look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we
weren't actually supposed to do per the API. After this commit, that's
ok.)
We inherit the existing code's combining "error" and "no version found".
Both are signalled through `version == -1` and now both cause us to
clear any partial configuration we have picked up. For "extensions.*",
that's fine, since they require a positive version number. For
"core.bare" and "core.worktree", we're already verifying that we have a
non-negative version number before using them.
Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
travis: remove the hack to build the Windows job on Azure Pipelines
Since Travis did not support Windows (and now only supports very limited
Windows jobs, too limited for our use, the test suite would time out
*all* the time), we added a hack where a Travis job would trigger an
Azure Pipeline (which back then was still called VSTS Build), wait for
it to finish (or time out), and download the log (if available).
Needless to say that it was a horrible hack, necessitated by a bad
situation.
Nowadays, however, we have Azure Pipelines support, and do not need that
hack anymore. So let's retire it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
After completing a bisection, we print out the commit we found using an
internal version of diff-tree. The result is aesthetically lacking:
- it shows a raw diff, which is generally less informative for human
readers than "--stat --summary" (which we already decided was nice
for humans in format-patch's output).
- by not abbreviating hashes, the result is likely to wrap on most
people's terminals
- we don't use "-r", so if the commit touched files in a directory,
you only get to see the top-level directory mentioned
- we don't specify "--cc" or similar, so merges print nothing (not
even the commit message!)
Even though bisect might be driven by scripts, there's no reason to
consider this part of the output as machine-readable (if anything, the
initial "$hash is the first bad commit" might be parsed, but we won't
touch that here). Let's make it prettier and more informative for a
human reading the output.
While we're tweaking the options, let's also switch to using the diff
"ui" config. If we're accepting that this is human-readable output, then
we should respect the user's options for how to display it.
Note that we have to touch a few tests in t6030. These check bisection
in a corrupted repository (it's missing a subtree). They didn't fail
with the previous code, because it didn't actually recurse far enough in
the diff to find the broken tree. But now we'll see the corruption and
complain.
Adjusting the tests to expect the die() is the best fix. We still
confirm that we're able to bisect within the broken repo. And we'll
still print "$hash is the first bad commit" as usual before dying;
showing that is a reasonable outcome in a corrupt repository (and was
what might happen already, if the root tree was corrupt).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we run our internal diff-tree to show the bisected commit, we call
init_revisions(), then load config, then setup_revisions(). But that
order is wrong: we copy the configured defaults into the rev_info struct
during the init_revisions step, so our config load wasn't actually doing
anything.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
bisect: use string arguments to feed internal diff-tree
Commit e22278c0a0 (bisect: display first bad commit without forking a
new process, 2009-05-28) converted our external call to diff-tree to
an internal use of the log_tree_commit(). But rather than individually
setting options in the rev_info struct (and explaining in comments how
they map to command-line options), we can just pass the command-line
options to setup_revisions().
This is shorter, easier to change, and less likely to break if
revision.c internals change.
Note that we unconditionally set the output format to "raw". The
conditional in the original code didn't actually do anything useful,
since nobody had an opportunity to set the format to anything.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
On NetBSD, the version of msgfmt is still 0.14.4. There's no hope for
an upgrade due to some GPLv3 allergy of NetBSD's. This version chokes
on heavily decorated commented entries in po files. It's safer to get
rid of all these obsolete entries.
Makefile: allow for combining DEVELOPER=1 and CFLAGS="..."
Ever since the DEVELOPER=1 facility introduced there's been no way to
have custom CFLAGS (e.g. CFLAGS="-O0 -g -ggdb3") while still
benefiting from the set of warnings and assertions DEVELOPER=1
enables.
This is because the semantics of variables in the Makefile are such
that the user setting CFLAGS overrides anything we set, including what
we're doing in config.mak.dev[1].
So let's introduce a "DEVELOPER_CFLAGS" variable in config.mak.dev and
add it to ALL_CFLAGS. Before this the ALL_CFLAGS variable
would (basically, there's some nuance we won't go into) be set to:
$(CPPFLAGS) [$(CFLAGS) *or* $(CFLAGS) in config.mak.dev] $(BASIC_CFLAGS) $(EXTRA_CPPFLAGS)
The reason for putting DEVELOPER_CFLAGS first is to allow for
selectively overriding something DEVELOPER=1 brings in. On both GCC
and Clang later settings override earlier ones. E.g. "-Wextra
-Wno-extra" will enable no "extra" warnings, but not if those two
arguments are reversed.
Examples of things that weren't possible before, but are now:
# Use -O0 instead of -O2 for less painful debuggng
DEVELOPER=1 CFLAGS="-O0 -g"
# DEVELOPER=1 plus -Wextra, but disable some of the warnings
DEVELOPER=1 DEVOPTS="no-error extra-all" CFLAGS="-O0 -g -Wno-unused-parameter"
The reason for the patches leading up to this one re-arranged the
various *FLAGS assignments and includes is just for readability. The
Makefile supports assignments out of order, e.g.:
$ cat Makefile
X = $(A) $(B) $(C)
A = A
B = B
include c.mak
all:
@echo $(X)
$ cat c.mak
C=C
$ make
A B C
So we could have gotten away with the much smaller change of changing
"CFLAGS" in config.mak.dev to "DEVELOPER_CFLAGS" and adding that to
ALL_CFLAGS earlier in the Makefile "before" the config.mak.*
includes. But I think it's more readable to use variables "after"
they're included.
Makefile: move the setting of *FLAGS closer to "include"
Move the setting of variables like CFLAGS down past settings like
"prefix" and default programs like "TAR" to just before we do the
include from "config.mak.*".
There's no functional changes here yet, but move note that
"ALL_CFLAGS" and "ALL_LDFLAGS" are moved below the include. A
follow-up change will tweak those depending on a variable set in
config.mak.dev.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the assignment of the "STRIP" variable down to where we're
setting variables with the names of other programs.
For consistency with those use "=" for the assignment instead of
"?=". I can't imagine why this would need to be different than the
rest, and 4dc00021f7 ("Makefile: add 'strip' target", 2006-01-12)
which added it doesn't provide an explanation.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove a comment referring to a caveat that hasn't been applicable
since 18b0fc1ce1 ("Git.pm: Kill Git.xs for now", 2006-09-23).
At the time of 8d7f586f13 ("Git.pm: Support for perl/ being built by a
different compiler", 2006-06-25) some of the code in perl would be
built by a C compiler, but support for that went away a few months
later in 18b0fc1ce1 discussed above.
Since my 20d2a30f8f ("Makefile: replace perl/Makefile.PL with simple
make rules", 2017-12-10) the perl/ directory doesn't even have its own
build process.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "--no-index" is in effect (or implied by the arguments), git-diff
jumps early to a special code path to perform that diff. This means we
miss out on some settings like enabling --ext-diff and --textconv by
default.
Let's jump to the no-index path _after_ we've done more setup on
rev.diffopt. Since some of the options don't affect us (e.g., items
related to the index), let's re-order the setup into two blocks (see the
in-code comments).
Note that we also need to stop re-initializing the diffopt struct in
diff_no_index(). This should not be necessary, as it will already have
been initialized by cmd_diff() (and there are no other callers). That in
turn lets us drop the "repository" argument from diff_no_index (which
never made much sense, since the whole point is that you don't need a
repository).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
During the six months of development of the Azure Pipelines support, the
patches went through quite a few iterations of changes, and to test
those iterations, a temporary build definition was used.
In the meantime, Azure Pipelines support made it to `master`, and we now
have a regular Azure Pipeline, installed via the common GitHub App
workflow. This new pipeline has a different name (git.git instead of
test-git.git), and a new ID (11 instead of 2).
Let's adjust the badge in our README to reflect that final shape of the
Azure Pipeline.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When studying the performance of 'git push' we would like to know
how much time is spent at various parts of the command. One area
that could cause performance trouble is 'git pack-objects'.
Add trace2 regions around the three main actions taken in this
command:
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add trace2_region_enter() and trace2_region_leave() calls around the
various phases of a status scan. This gives elapsed time for each
phase in the GIT_TR2_PERF and GIT_TR2_EVENT trace target.
Also, these Trace2 calls now use s->repo rather than the_repository.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a new unified tracing facility for git. The eventual intent is to
replace the current trace_printf* and trace_performance* routines with a
unified set of git_trace2* routines.
In addition to the usual printf-style API, trace2 provides higer-level
event verbs with fixed-fields allowing structured data to be written.
This makes post-processing and analysis easier for external tools.
Trace2 defines 3 output targets. These are set using the environment
variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT". These may be
set to "1" or to an absolute pathname (just like the current GIT_TRACE).
* GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command
summary data.
* GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE.
It extends the output with columns for the command process, thread,
repo, absolute and relative elapsed times. It reports events for
child process start/stop, thread start/stop, and per-thread function
nesting.
* GIT_TR2_EVENT is a new structured format. It writes event data as a
series of JSON records.
Calls to trace2 functions log to any of the 3 output targets enabled
without the need to call different trace_printf* or trace_performance*
routines.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is git-checy-racy command, added a long time ago [1] and was never
part of the default build. Naturally after some makefile changes [2],
git-check-racy was no longer recognized as a build target. Even if it
compiles to day, it will not link after the introduction of
common-main.c [3].
Racy index has not been a problem for a long time [4]. It's time to let
this code go. I briefly consider if check-racy should be part of
test-tool. But I don't think it's worth the effort.
[1] 42f774063d (Add check program "git-check-racy" - 2006-08-15)
[2] c373991375 (Makefile: list generated object files in OBJECTS -
2010-01-26)
[3] 3f2e2297b9 (add an extra level of indirection to main() -
2016-07-01)
[4] I pretend I don't remember anything about the recent split-index's
racy problem
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote-curl: refactor reading into rpc_state's buf
Currently, whenever remote-curl reads pkt-lines from its response file
descriptor, only the payload is written to its buf, not the 4 characters
denoting the length. A future patch will require the ability to also
write those 4 characters, so in preparation for that, refactor this read
into its own function.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change an unportable invocation of "dd" with count=0, that wanted to
truncate the commit-graph file. In POSIX it is unspecified what
happens when count=0 is provided[1]. The NetBSD "dd" behavior
differs from GNU (and seemingly other BSDs), which has left this test
broken since d2b86fbaa1 ("commit-graph: fix buffer read-overflow",
2019-01-15).
Copying from /dev/null would seek/truncate to seek=$zero_pos and
stop immediately after that (without being able to copy anything),
which is the right way to truncate the file.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix widely supported but non-POSIX basic regex syntax introduced in
[1] and [2]. On GNU, NetBSD and FreeBSD the following works:
$ echo xy >f
$ grep 'xy\?' f; echo $?
xy
0
The same goes for "\+". The "?" and "+" syntax is not in the BRE
syntax, just in ERE, but on some implementations it can be invoked by
prefixing the meta-operator with "\", but not on OpenBSD:
In 7171d8c15f ("upload-pack: send symbolic ref information as
capability"), we added a symref capability to the pack protocol, but it
was never documented. Adapt the patch notes from that commit and add
them to the capabilities documentation.
While we're at it, add a disclaimer to the top of
protocol-capabilities.txt noting that the doc only applies to v0/v1 of
the wire protocol.
Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge-options.txt: correct wording of --no-commit option
The former wording implied that --no-commit would always cause the
merge operation to "pause" and allow the user to make further changes
and/or provide a special commit message for the merge commit. This
is not the case for fast-forward merges, as there is no merge commit
to create. Without a merge commit, there is no place where it makes
sense to "stop the merge and allow the user to tweak changes"; doing
that would require a full rebase of some sort.
Since users may be unaware of whether their branches have diverged or
not, modify the wording to correctly address fast-forward cases as well
and suggest using --no-ff with --no-commit if the point is to ensure
that the merge stops before completing.
Reported-by: Ulrich Windl <Ulrich.Windl@rz.uni-regensburg.de> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
mention use of "hooks.allownonascii" in "man githooks"
The default pre-commit script checks the config variable
"hooks.allownonascii" to determine whether to allow non-ASCII file
names -- mention this in "man githooks", just as the section on
"update" mentions the use of "hooks.allowunannotated".
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The resolve_ref_unsafe() function can, and sometimes will in the case
of this codepath, return the char * passed to it to the caller. In
this case we construct a strbuf, free it, and then continue using the
dst_name after that free().
The code being fixed dates back to da3efdb17b ("receive-pack: detect
aliased updates which can occur with symrefs", 2010-04-19). When it
was originally added it didn't have this bug, it was introduced when
it was subsequently modified to use strbuf in 6b01ecfe22 ("ref
namespaces: Support remote repositories via upload-pack and
receive-pack", 2011-07-08).
This is theoretically a security issue, the C standard makes no
guarantees that a value you use after free() hasn't been poked at or
changed by something else on the system, but in practice modern OSs
will have mapped the relevant page to this process, so nothing else
would have used it. We do no further allocations between the free()
and use-after-free, so we ourselves didn't corrupt or change the
value.
Jeff investigated that and found: "It probably would be an issue if
the allocation were larger. glibc at least will use mmap()/munmap()
after some cutoff[1], in which case we'd get a segfault from hitting
the unmapped page. But for small allocations, it just bumps brk() and
the memory is still available for further allocations after
free(). [...] If you had a sufficiently large refname you might be
able to trigger the bug [...]. I tried to push such a ref. I had to
manually make a packed-refs file with the long name to avoid
filesystem limits (though probably you could have a long a/b/c/ name
on ext4). But the result can't actually be pushed, because it all has
to fit into a 64k pkt-line as part of the push protocol.".
An a alternative and more succinct way of implementing this would have
been to do the strbuf_release() at the end of check_aliased_update()
and use "goto out" instead of the early "return" statements. Hopefully
this approach of using a helper instead makes it easier to follow.
1. Jeff: "Weirdly, the mmap() cutoff on my glibc system is 135168
bytes. Which is...2^17 + 2^12? 33 pages? I'm sure there's a good
reason for that, but I didn't dig into it."
Reported-by: 王健强 <jianqiang.wang@securitygossip.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds value completion for a couple more paramters. To make it
easier to maintain these hard coded lists, add a comment at the original
list/code to remind people to update git-completion.bash too.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mk/t5562-no-input-to-too-large-an-input-test:
t5562: do not depend on /dev/zero
Revert "t5562: replace /dev/zero with a pipe from generate_zero_bytes"
Some expected failures of git-http-backend leaves running its children
(receive-pack or upload-pack) which still hold opened descriptors
to act.err and with some probability they live long enough to write
there their failure messages after next test has already truncated
the files. This causes occasional failures of the test script.
Avoid the issue by using separated output and error file for each test,
apprending the test number to their name.
Reported-by: Carlo Arenas <carenas@gmail.com> Helped-by: Carlo Arenas <carenas@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Max Kirillov <max@max630.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
tests: teach the test-tool to generate NUL bytes and use it
In cc95bc2025 (t5562: replace /dev/zero with a pipe from
generate_zero_bytes, 2019-02-09), we replaced usage of /dev/zero (which
is not available on NonStop, apparently) by a Perl script snippet to
generate NUL bytes.
Sadly, it does not seem to work on NonStop, as t5562 reportedly hangs.
Worse, this also hangs in the Ubuntu 16.04 agents of the CI builds on
Azure Pipelines: for some reason, the Perl script snippet that is run
via `generate_zero_bytes` in t5562's 'CONTENT_LENGTH overflow ssite_t'
test case tries to write out an infinite amount of NUL bytes unless a
broken pipe is encountered, that snippet never encounters the broken
pipe, and keeps going until the build times out.
Oddly enough, this does not reproduce on the Windows and macOS agents,
nor in a local Ubuntu 18.04.
This developer tried for a day to figure out the exact circumstances
under which this hang happens, to no avail, the details remain a
mystery.
In the end, though, what counts is that this here change incidentally
fixes that hang (maybe also on NonStop?). Even more positively, it gets
rid of yet another unnecessary Perl invocation.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was reported [1] that NonStop platform does not have /dev/zero.
The test uses /dev/zero as a dummy input. Passing case (http-backed
failed because of too big input size) should not be reading anything
from it. If http-backend would erroneously try to read any data
returning EOF probably would be even safer than providing some
meaningless data.
Replace /dev/zero with /dev/null to avoid issues with platforms which do
not have /dev/zero.
Revert "t5562: replace /dev/zero with a pipe from generate_zero_bytes"
Revert cc95bc20 ("t5562: replace /dev/zero with a pipe from
generate_zero_bytes", 2019-02-09), as not feeding anything to the
command is a better way to test it.
l10n: de.po: consistent translation of 'root commit'
'root commit' is usually translated as 'Root-Commit'. But in one
occasion it‘s translated as 'Basis-Commit' which is the translation
for 'base commit'.
Signed-off-by: Sebastian Staudt <koraktor@gmail.com>
mingw: safe-guard a bit more against getenv() problems
Running up to v2.21.0, we fixed two bugs that were made prominent by the
Windows-specific change to retain copies of only the 30 latest getenv()
calls' returned strings, invalidating any copies of previous getenv()
calls' return values.
While this really shines a light onto bugs of the form where we hold
onto getenv()'s return values without copying them, it is also a real
problem for users.
And even if Jeff King's patches merged via 773e408881 (Merge branch
'jk/save-getenv-result', 2019-01-29) provide further work on that front,
we are far from done. Just one example: on Windows, we unset environment
variables when spawning new processes, which potentially invalidates
strings that were previously obtained via getenv(), and therefore we
have to duplicate environment values that are somehow involved in
spawning new processes (e.g. GIT_MAN_VIEWER in show_man_page()).
We do not have a chance to investigate, let address, all of those issues
in time for v2.21.0, so let's at least help Windows users by increasing
the number of getenv() calls' return values that are kept valid. The
number 64 was determined by looking at the average number of getenv()
calls per process in the entire test suite run on Windows (which is
around 40) and then adding a bit for good measure. And it is a power of
two (which would have hit yesterday's theme perfectly).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule's default behavior wasn't documented in both git-submodule.txt
and in the usage text of git-submodule. Document the default behavior
similar to how git-remote does it.
Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many of our grab_* functions, which parse the object content, take a
buf/sz pair of the object bytes. However, the functions which actually
parse the buffers (like find_wholine() and find_subpos()) never look at
"sz", and instead use functions like strchr() and strchrnul() that
assume the result is NUL-terminated.
This is OK in practice (and common for Git's parsing code), since we
always allocate an extra NUL when loading an object into memory (and
likewise, we are OK with stopping parsing if a commit or tag contains an
embedded NUL).
Let's drop these extra "sz" parameters, as they are misleading about how
the functions intend to access the buffer. We can drop from both the
functions mentioned above, which in turn lets us drop from their
callers, cascading all the way up to the top-level grab_values().
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>