Merge branch 'jk/pack-idx-corruption-safety' into maint
The code to read the pack data using the offsets stored in the pack
idx file has been made more carefully check the validity of the
data in the idx.
* jk/pack-idx-corruption-safety:
sha1_file.c: mark strings for translation
use_pack: handle signed off_t overflow
nth_packed_object_offset: bounds-check extended offset
t5313: test bounds-checks of corrupted/malicious pack/idx files
Merge branch 'js/config-set-in-non-repository' into maint
"git config section.var value" to set a value in per-repository
configuration file failed when it was run outside any repository,
but didn't say the reason correctly.
* js/config-set-in-non-repository:
git config: report when trying to modify a non-existing repo config
Merge branch 'sb/submodule-module-list-fix' into maint
A helper function "git submodule" uses since v2.7.0 to list the
modules that match the pathspec argument given to its subcommands
(e.g. "submodule add <repo> <path>") has been fixed.
Merge branch 'jk/grep-binary-workaround-in-test' into maint
Recent versions of GNU grep are pickier when their input contains
arbitrary binary data, which some of our tests uses. Rewrite the
tests to sidestep the problem.
* jk/grep-binary-workaround-in-test:
t9200: avoid grep on non-ASCII data
t8005: avoid grep on non-ASCII data
* jk/tighten-alloc: (23 commits)
compat/mingw: brown paper bag fix for 50a6c8e
ewah: convert to REALLOC_ARRAY, etc
convert ewah/bitmap code to use xmalloc
diff_populate_gitlink: use a strbuf
transport_anonymize_url: use xstrfmt
git-compat-util: drop mempcpy compat code
sequencer: simplify memory allocation of get_message
test-path-utils: fix normalize_path_copy output buffer size
fetch-pack: simplify add_sought_entry
fast-import: simplify allocation in start_packfile
write_untracked_extension: use FLEX_ALLOC helper
prepare_{git,shell}_cmd: use argv_array
use st_add and st_mult for allocation size computation
convert trivial cases to FLEX_ARRAY macros
use xmallocz to avoid size arithmetic
convert trivial cases to ALLOC_ARRAY
convert manual allocations to argv_array
argv-array: add detach function
add helpers for allocating flex-array structs
harden REALLOC_ARRAY and xcalloc against size_t overflow
...
"git merge-tree" used to mishandle "both sides added" conflict with
its own "create a fake ancestor file that has the common parts of
what both sides have added and do a 3-way merge" logic; this has
been updated to use the usual "3-way merge with an empty blob as
the fake common ancestor file" approach used in the rest of the
system.
* jk/no-diff-emit-common:
xdiff: drop XDL_EMIT_COMMON
merge-tree: drop generate_common strategy
merge-one-file: use empty blob for add/add base
Merge branch 'nd/dwim-wildcards-as-pathspecs' into maint
"git show 'HEAD:Foo[BAR]Baz'" did not interpret the argument as a
rev, i.e. the object named by the the pathname with wildcard
characters in a tree object.
Handling of errors while writing into our internal asynchronous
process has been made more robust, which reduces flakiness in our
tests.
* jk/epipe-in-async:
t5504: handle expected output from SIGPIPE death
test_must_fail: report number of unexpected signal
fetch-pack: ignore SIGPIPE in sideband demuxer
write_or_die: handle EPIPE in async threads
Many codepaths forget to check return value from git_config_set();
the function is made to die() to make sure we do not proceed when
setting a configuration variable failed.
* ps/config-error:
config: rename git_config_set_or_die to git_config_set
config: rename git_config_set to git_config_set_gently
compat: die when unable to set core.precomposeunicode
sequencer: die on config error when saving replay opts
init-db: die on config errors when initializing empty repo
clone: die on config error in cmd_clone
remote: die on config error when manipulating remotes
remote: die on config error when setting/adding branches
remote: die on config error when setting URL
submodule--helper: die on config error when cloning module
submodule: die on config error when linking modules
branch: die on config error when editing branch description
branch: die on config error when unsetting upstream
branch: report errors in tracking branch setup
config: introduce set_or_die wrappers
Traditionally, the tests that try commands that work on the
contents in the working tree were named with "worktree" in their
filenames, but with the recent addition of "git worktree"
subcommand, whose tests are also named similarly, it has become
harder to tell them apart. The traditional tests have been renamed
to use "work-tree" instead in an attempt to differentiate them.
* mg/work-tree-tests:
tests: rename work-tree tests to *work-tree*
Simplify the two callback functions that are triggered when the
child process terminates to avoid misuse of the child-process
structure that has already been cleaned up.
* sb/submodule-parallel-fetch:
run-command: do not pass child process data into callbacks
* nd/i18n-2.8.0:
trailer.c: mark strings for translation
ref-filter.c: mark strings for translation
builtin/clone.c: mark strings for translation
builtin/checkout.c: mark strings for translation
The code to read the pack data using the offsets stored in the pack
idx file has been made more carefully check the validity of the
data in the idx.
* jk/pack-idx-corruption-safety:
sha1_file.c: mark strings for translation
use_pack: handle signed off_t overflow
nth_packed_object_offset: bounds-check extended offset
t5313: test bounds-checks of corrupted/malicious pack/idx files
t5510 carefully keeps the cwd at the test root by using either subshells
or explicit cd'ing back to the root. Use a subshell for the last
subtest, too.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit d53c2c6 (mingw: fix t9700's assumption about
directory separators, 2016-01-27) uses perl's "/r" regex
modifier to do a non-destructive replacement on a string,
leaving the original unmodified and returning the result.
This feature was introduced in perl 5.14, but systems with
older perl are still common (e.g., CentOS 6.5 still has perl
5.10). Let's work around it by providing a helper function
that does the same thing using older syntax.
While we're at it, let's switch to using an alternate regex
separator, which is slightly more readable.
Reported-by: Christian Couder <christian.couder@gmail.com> Helped-by: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t0001: fix GIT_* environment variable check under --valgrind
When a test case is run without --valgrind, the wrap-for-bin.sh
helper script inserts the environment variable GIT_TEXTDOMAINDIR, but
when run with --valgrind, the variable is missing. A recently
introduced test case expects the presence of the variable, though, and
fails under --valgrind.
Rewrite the test case to strip conditially defined environment variables
from both expected and actual output.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The wording is introduced in c3f0baaca (Documentation: sync git.txt
command list and manual page title, 2007-01-18), but rebase has evolved
since then, capture the modern usage by being more generic about the
rebase command in the summary.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pthread_exit() function is not expected to return. Ever. On Windows,
we call ExitThread() whose documentation claims: "Ends the calling
thread", i.e. there is no condition in which this function simply
returns: https://msdn.microsoft.com/en-us/library/windows/desktop/ms682659
While at it, fix the return type to be void, as per
http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_exit.html
Pointed out by Jeff King, helped by Stefan Naewe, Junio Hamano &
Johannes Sixt.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
run-command: do not pass child process data into callbacks
The expected way to pass data into the callback is to pass them via
the customizable callback pointer. The error reporting in
default_{start_failure, task_finished} is not user friendly enough, that
we want to encourage using the child data for such purposes.
Furthermore the struct child data is cleaned by the run-command API,
before we access them in the callbacks, leading to use-after-free
situations.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before commit 372370f (http: use credential API to handle proxy auth...),
Environment variable "no_proxy" will take effect if the config variable
"http.proxy" is not set. So the following comamnd won't fail if not
behind a firewall.
But commit 372370f not only read git config variable "http.proxy", but
also read "http_proxy" and "https_proxy" environment variables, and set
the curl option using:
Commit 50a6c8e (use st_add and st_mult for allocation size
computation, 2016-02-22) fixed up many xmalloc call-sites
including ones in compat/mingw.c.
But I screwed up one of them, which was half-converted to
ALLOC_ARRAY, using a very early prototype of the function.
And I never caught it because I don't build on Windows.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Gcc under Mac OX 10.6 throws an internal compiler error:
CC combine-diff.o
combine-diff.c: In function ‘diff_tree_combined’:
combine-diff.c:1391: internal compiler error: Segmentation fault
while attempting to build Git at 5b442c4f (tree-diff: catch integer
overflow in combine_diff_path allocation, 2016-02-19).
As clang that ships with the version does not have the same bug,
make Git compile under Mac OS X 10.6 by using clang instead of gcc
to work this around, as it is unlikely that we will see fixed GCC
on that platform.
Later versions of Mac OSX/Xcode only provide clang, and gcc is a
wrapper to it.
Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
README has been renamed to README.md and its contents got tweaked
slightly to make it easier on the eyes.
* mm/readme-markdown:
README.md: move down historical explanation about the name
README.md: don't call git stupid in the title
README.md: move the link to git-scm.com up
README.md: add hyperlinks on filenames
README: use markdown syntax
"git config section.var value" to set a value in per-repository
configuration file failed when it was run outside any repository,
but didn't say the reason correctly.
* js/config-set-in-non-repository:
git config: report when trying to modify a non-existing repo config
Handling of errors while writing into our internal asynchronous
process has been made more robust, which reduces flakiness in our
tests.
* jk/epipe-in-async:
t5504: handle expected output from SIGPIPE death
test_must_fail: report number of unexpected signal
fetch-pack: ignore SIGPIPE in sideband demuxer
write_or_die: handle EPIPE in async threads
Across the transition at around Git version 2.0, the user used to
get a pretty loud warning when running "git push" without setting
push.default configuration variable. We no longer warn, given that
the transition is over long time ago.
* mm/push-default-warning:
push: remove "push.default is unset" warning message
When "git submodule update" did not result in fetching the commit
object in the submodule that is referenced by the superproject, the
command learned to retry another fetch, specifically asking for
that commit that may not be connected to the refs it usually
fetches.
* sb/submodule-fetch-nontip:
submodule: try harder to fetch needed sha1 by direct fetching sha1
A helper function "git submodule" uses since v2.7.0 to list the
modules that match the pathspec argument given to its subcommands
(e.g. "submodule add <repo> <path>") has been fixed.
Recent versions of GNU grep are pickier when their input contains
arbitrary binary data, which some of our tests uses. Rewrite the
tests to sidestep the problem.
* jk/grep-binary-workaround-in-test:
t9200: avoid grep on non-ASCII data
t8005: avoid grep on non-ASCII data
The "credential-cache" daemon process used to run in whatever
directory it happened to start in, but this made umount(2)ing the
filesystem that houses the repository harder; now the process
chdir()s to the directory that house its own socket on startup.
* jg/credential-cache-chdir-to-sockdir:
credential-cache--daemon: change to the socket dir on startup
credential-cache--daemon: disallow relative socket path
credential-cache--daemon: refactor check_socket_directory
Many codepaths forget to check return value from git_config_set();
the function is made to die() to make sure we do not proceed when
setting a configuration variable failed.
* ps/config-error:
config: rename git_config_set_or_die to git_config_set
config: rename git_config_set to git_config_set_gently
compat: die when unable to set core.precomposeunicode
sequencer: die on config error when saving replay opts
init-db: die on config errors when initializing empty repo
clone: die on config error in cmd_clone
remote: die on config error when manipulating remotes
remote: die on config error when setting/adding branches
remote: die on config error when setting URL
submodule--helper: die on config error when cloning module
submodule: die on config error when linking modules
branch: die on config error when editing branch description
branch: die on config error when unsetting upstream
branch: report errors in tracking branch setup
config: introduce set_or_die wrappers
Traditionally, the tests that try commands that work on the
contents in the working tree were named with "worktree" in their
filenames, but with the recent addition of "git worktree"
subcommand, whose tests are also named similarly, it has become
harder to tell them apart. The traditional tests have been renamed
to use "work-tree" instead in an attempt to differentiate them.
* mg/work-tree-tests:
tests: rename work-tree tests to *work-tree*
The configuration system has been taught to phrase where it found a
bad configuration variable in a better way in its error messages.
"git config" learnt a new "--show-origin" option to indicate where
the values come from.
* ls/config-origin:
config: add '--show-origin' option to print the origin of a config value
config: add 'origin_type' to config_source struct
rename git_config_from_buf to git_config_from_mem
t: do not hide Git's exit code in tests using 'nul_to_q'
Update various codepaths to avoid manually-counted malloc().
* jk/tighten-alloc: (22 commits)
ewah: convert to REALLOC_ARRAY, etc
convert ewah/bitmap code to use xmalloc
diff_populate_gitlink: use a strbuf
transport_anonymize_url: use xstrfmt
git-compat-util: drop mempcpy compat code
sequencer: simplify memory allocation of get_message
test-path-utils: fix normalize_path_copy output buffer size
fetch-pack: simplify add_sought_entry
fast-import: simplify allocation in start_packfile
write_untracked_extension: use FLEX_ALLOC helper
prepare_{git,shell}_cmd: use argv_array
use st_add and st_mult for allocation size computation
convert trivial cases to FLEX_ARRAY macros
use xmallocz to avoid size arithmetic
convert trivial cases to ALLOC_ARRAY
convert manual allocations to argv_array
argv-array: add detach function
add helpers for allocating flex-array structs
harden REALLOC_ARRAY and xcalloc against size_t overflow
tree-diff: catch integer overflow in combine_diff_path allocation
...
"git merge-tree" used to mishandle "both sides added" conflict with
its own "create a fake ancestor file that has the common parts of
what both sides have added and do a 3-way merge" logic; this has
been updated to use the usual "3-way merge with an empty blob as
the fake common ancestor file" approach used in the rest of the
system.
* jk/no-diff-emit-common:
xdiff: drop XDL_EMIT_COMMON
merge-tree: drop generate_common strategy
merge-one-file: use empty blob for add/add base
The internal API to interact with "remote.*" configuration
variables has been streamlined.
* tg/git-remote:
remote: use remote_is_configured() for add and rename
remote: actually check if remote exits
remote: simplify remote_is_configured()
remote: use parse_config_key
In contrast to apache 2.2, apache 2.4 does not load mod_unixd in its
default configuration (because there are choices). Thus, with the
current config, apache 2.4.10 will not be started and the httpd tests
will not run on distros with default apache config (RedHat type).
Enable mod_unixd to make the httpd tests run. This does not affect
distros negatively which have that config already in their default
(Debian type). httpd tests will run on these before and after this patch.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 8bf4bec (add "ok=sigpipe" to test_must_fail and use
it to fix flaky tests, 2015-11-27) taught t5504 to handle
"git push" racily exiting with SIGPIPE rather than failing.
However, one of the tests checks the output of the command,
as well. In the SIGPIPE case, we will not have produced any
output. If we want the test to be truly non-flaky, we have
to accept either output.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
test_must_fail: report number of unexpected signal
If a command is marked as test_must_fail but dies with a
signal, we consider that a problem and report the error to
stderr. However, we don't say _which_ signal; knowing that
can make debugging easier. Let's share as much as we know.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the other side feeds us a bogus pack, index-pack (or
unpack-objects) may die early, before consuming all of its
input. As a result, the sideband demuxer may get SIGPIPE
(racily, depending on whether our data made it into the pipe
buffer or not). If this happens and we are compiled with
pthread support, it will take down the main thread, too.
This isn't the end of the world, as the main process will
just die() anyway when it sees index-pack failed. But it
does mean we don't get a chance to say "fatal: index-pack
failed" or similar. And it also means that we racily fail
t5504, as we sometimes die() and sometimes are killed by
SIGPIPE.
So let's ignore SIGPIPE while demuxing the sideband. We are
already careful to check the return value of write(), so we
won't waste time writing to a broken pipe. The caller will
notice the error return from the async thread, though in
practice we don't even get that far, as we die() as soon as
we see that index-pack failed.
The non-sideband case is already fine; we let index-pack
read straight from the socket, so there is no SIGPIPE at
all. Technically the non-threaded async case is also OK
without this (the forked async process gets SIGPIPE), but
it's not worth distinguishing from the threaded case here.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When write_or_die() sees EPIPE, it treats it specially by
converting it into a SIGPIPE death. We obviously cannot
ignore it, as the write has failed and the caller expects us
to die. But likewise, we cannot just call die(), because
printing any message at all would be a nuisance during
normal operations.
However, this is a problem if write_or_die() is called from
a thread. Our raised signal ends up killing the whole
process, when logically we just need to kill the thread
(after all, if we are ignoring SIGPIPE, there is good reason
to think that the main thread is expecting to handle it).
Inside an async thread, the die() code already does the
right thing, because we use our custom die_async() routine,
which calls pthread_join(). So ideally we would piggy-back
on that, and simply call:
die_quietly_with_code(141);
or similar. But refactoring the die code to do this is
surprisingly non-trivial. The die_routines themselves handle
both printing and the decision of the exit code. Every one
of them would have to be modified to take new parameters for
the code, and to tell us to be quiet.
Instead, we can just teach write_or_die() to check for the
async case and handle it specially. We do have to build an
interface to abstract the async exit, but it's simple and
self-contained. If we had many call-sites that wanted to do
this die_quietly_with_code(), this approach wouldn't scale
as well, but we don't. This is the only place where do this
weird exit trick.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add DEVELOPER makefile knob to check for acknowledged warnings
We assume Git developers have a reasonably modern compiler and recommend
them to enable the DEVELOPER makefile knob to ensure their patches are
clear of all compiler warnings the Git core project cares about.
Enable the DEVELOPER makefile knob in the Travis-CI build.
Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
A v2 pack index file can specify an offset within a packfile
of up to 2^64-1 bytes. On a system with a signed 64-bit
off_t, we can represent only up to 2^63-1. This means that a
corrupted .idx file can end up with a negative offset in the
pack code. Our bounds-checking use_pack function looks for
too-large offsets, but not for ones that have wrapped around
to negative. Let's do so, which fixes an out-of-bounds
access demonstrated in t5313.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a pack .idx file has a corrupted offset for an object, we
may try to access an offset in the .idx or .pack file that
is larger than the file's size. For the .pack case, we have
use_pack() to protect us, which realizes the access is out
of bounds. But if the corrupted value asks us to look in the
.idx file's secondary 64-bit offset table, we blindly add it
to the mmap'd index data and access arbitrary memory.
We can fix this with a simple bounds-check compared to the
size we found when we opened the .idx file.
Note that there's similar code in index-pack that is
triggered only during "index-pack --verify". To support
both, we pull the bounds-check into a separate function,
which dies when it sees a corrupted file.
It would be nice if we could return an error, so that the
pack code could try to find a good copy of the object
elsewhere. Currently nth_packed_object_offset doesn't have
any way to return an error, but it could probably use "0" as
a sentinel value (since no object can start there). This is
the minimal fix, and we can improve the resilience later on
top.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t5313: test bounds-checks of corrupted/malicious pack/idx files
Our on-disk .pack and .idx files may reference other data by
offset. We should make sure that we are not fooled by
corrupt data into accessing memory outside of our mmap'd
boundaries.
This patch adds a series of tests for offsets found in .pack
and .idx files. For the most part we get this right, but
there are two tests of .idx files marked as failures: we do
not bounds-check offsets in the v2 index's extended offset
table, nor do we handle .idx offsets that overflow a signed
off_t.
With these tests, we should have good coverage of all
offsets found in these files. Note that this doesn't cover
.bitmap files, which may have similar bugs.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
push: remove "push.default is unset" warning message
The warning was important before the 2.0 transition, and remained
important for a while after, so that new users get push.default
explicitly in their configuration and do not experience inconsistent
behavior if they ever used an older version of Git.
The warning has been there since version 1.8.0 (Oct 2012), hence we can
expect the vast majority of current Git users to have been exposed to
it, and most of them have already set push.default explicitly. The
switch from 'matching' to 'simple' was planned for 2.0 (May 2014), but
actually happened only for 2.3 (Feb 2015).
Today, the warning is mostly seen by beginners, who have not set their
push.default configuration (yet). For many of them, the warning is
confusing because it talks about concepts that they have not learned and
asks them a choice that they are not able to make yet. See for example
(1260 votes for the question, 1824 for the answer as of writing)
Remove the warning completely to avoid disturbing beginners. People who
still occasionally use an older version of Git will be exposed to the
warning through this old version.
Eventually, versions of Git without the warning will be deployed enough
and tutorials will not need to advise setting push.default anymore.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
README.md: move down historical explanation about the name
The explanations about why the name was chosen are secondary compared to
the description and link to the documentation.
Some consider these explanations as good computer scientists joke, but
other see it as needlessly offensive vocabulary.
This patch preserves the historical joke, but gives it less importance
by moving it to the end of the README, and makes it clear that it is a
historical explanation, that does not necessarily reflect the state of
mind of current developers.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"the stupid content tracker" was true in the early days of Git, but
hardly applicable these days. "fast, scalable, distributed" describes
Git more accuralety.
Also, "stupid" can be seen as offensive by some people. Let's not use it
in the very first words of the README.
The new formulation is taken from the description of the Debian package.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule: try harder to fetch needed sha1 by direct fetching sha1
When reviewing a change that also updates a submodule in Gerrit, a
common review practice is to download and cherry-pick the patch
locally to test it. However when testing it locally, the 'git
submodule update' may fail fetching the correct submodule sha1 as
the corresponding commit in the submodule is not yet part of the
project history, but also just a proposed change.
If $sha1 was not part of the default fetch, we try to fetch the $sha1
directly. Some servers however do not support direct fetch by sha1,
which leads git-fetch to fail quickly. We can fail ourselves here as
the still missing sha1 would lead to a failure later in the checkout
stage anyway, so failing here is as good as we can get.
Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>