We set GIT_CONFIG_NOSYSTEM in our test scripts so that we do
not accidentally read /etc/gitconfig and have it influence
the outcome of the tests. But when running smart-http tests,
Apache will clean the environment, including this variable,
and the "server" side of our http operations will read it.
You can see this breakage by doing something like:
make
./git config --system http.getanyfile false
make test
which will cause t5561 to fail when it tests the
fallback-to-dumb operation.
We can fix this by instructing Apache to pass through the
variable. Unlike with other variables (e.g., 89c57ab3's
GIT_TRACE), we don't need to set a dummy value to prevent
warnings from Apache. test-lib.sh already makes sure that
GIT_CONFIG_NOSYSTEM is set and exported.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When this code was originally written, there wasn't much
thought given to the timing between a client asking for
"exit", the daemon signaling that the action is done (with
EOF), and the actual cleanup of the socket.
However, we need to care about this so that our test scripts
do not end up racy (e.g., by asking for an exit and checking
that the socket was cleaned up). The code that is already
there happens to behave very reasonably; let's add a comment
to make it clear that any changes should retain the same
behavior.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
send-email: ignore trailing whitespace in mailrc alias file
The regex for parsing mailrc considers everything after the
second whitespace to be the email address, up to the end of
the line. We have to include whitespace there, because you
may have multiple space-separated addresses, each with their
own internal quoting.
But if there is trailing whitespace, we include that, too.
This confuses quotewords() when we try to split the
individual addresses, and we end up storing "undef" in our
alias list. Later parts of the code then access that,
generating perl warnings.
Let's tweak our regex to throw away any trailing whitespace
on each line.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.5:
Git 2.5.5
Git 2.4.11
list-objects: pass full pathname to callbacks
list-objects: drop name_path entirely
list-objects: convert name_path to a strbuf
show_object_with_name: simplify by using path_name()
http-push: stop using name_path
tree-diff: catch integer overflow in combine_diff_path allocation
add helpers for detecting size_t overflow
* maint-2.4:
Git 2.4.11
list-objects: pass full pathname to callbacks
list-objects: drop name_path entirely
list-objects: convert name_path to a strbuf
show_object_with_name: simplify by using path_name()
http-push: stop using name_path
tree-diff: catch integer overflow in combine_diff_path allocation
add helpers for detecting size_t overflow
Merge branch 'jk/path-name-safety-2.4' into maint-2.4
Bugfix patches were backported from the 'master' front to plug heap
corruption holes, to catch integer overflow in the computation of
pathname lengths, and to get rid of the name_path API. Both of
these would have resulted in writing over an under-allocated buffer
when formulating pathnames while tree traversal.
* jk/path-name-safety-2.4:
list-objects: pass full pathname to callbacks
list-objects: drop name_path entirely
list-objects: convert name_path to a strbuf
show_object_with_name: simplify by using path_name()
http-push: stop using name_path
tree-diff: catch integer overflow in combine_diff_path allocation
add helpers for detecting size_t overflow
Translate 22 new messages came from git.pot update in f1522b2
(l10n: git.pot: v2.8.0 round 2 (21 new, 1 removed)) and a5a4168
(l10n: git.pot: Add one new message for Git 2.8.0).
* maint:
list-objects: pass full pathname to callbacks
list-objects: drop name_path entirely
list-objects: convert name_path to a strbuf
show_object_with_name: simplify by using path_name()
http-push: stop using name_path
tree-diff: catch integer overflow in combine_diff_path allocation
add helpers for detecting size_t overflow
Recent versions of GNU grep is pickier than before to decide if a
file is "binary" and refuse to give line-oriented hits when we
expect it to, unless explicitly told with "-a" option. As our
scripted Porcelains use sane_grep wrapper for line-oriented data,
even when the line may contain non-ASCII payload we took from
end-user data, use "grep -a" to implement sane_grep wrapper when
using an implementation of "grep" that takes the "-a" option.
* jc/sane-grep:
rebase-i: clarify "is this commit relevant?" test
sane_grep: pass "-a" if grep accepts it
The two alternative ways to spell "ssh://" transport have been
deprecated for a long time. The last mention of them has finally
removed from the documentation.
* cn/deprecate-ssh-git-url:
Disown ssh+git and git+ssh
git-svn: fix URL canonicalization during init w/ SVN 1.7+
URL canonicalization when full URLs are passed became broken
when using SVN::_Core::svn_dirent_canonicalize under SVN 1.7.
Ensure we canonicalize paths and URLs with appropriate functions
for each type from now on as the path/URL-agnostic
SVN::_Core::svn_path_canonicalize function is deprecated in SVN.
Reported-by: Adam Dinwoodie <adam@dinwoodie.org>
http://mid.gmane.org/20160315162344.GM29016@dinwoodie.org Signed-off-by: Eric Wong <normalperson@yhbt.net>
* jk/path-name-safety-2.7:
list-objects: pass full pathname to callbacks
list-objects: drop name_path entirely
list-objects: convert name_path to a strbuf
show_object_with_name: simplify by using path_name()
http-push: stop using name_path
tree-diff: catch integer overflow in combine_diff_path allocation
add helpers for detecting size_t overflow
t9117: test specifying full url to git svn init -T
According to the documentation, full URLs can be specified in the `-T`
argument to `git svn init`. However, the canonicalization of such
arguments squashes together consecutive "/"s, which unsurprisingly
breaks http://, svn://, etc URLs. Add a failing test case to provide
evidence of that.
On systems where Subversion provides svn_path_canonicalize but not
svn_dirent_canonicalize (Subversion 1.6 and earlier?), this test passes,
as svn_path_canonicalize doesn't mangle the consecutive "/"s.
[ew: fixed whitespace]
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org> Signed-off-by: Eric Wong <normalperson@yhbt.net>
Merge branch 'jk/path-name-safety-2.6' into jk/path-name-safety-2.7
* jk/path-name-safety-2.6:
list-objects: pass full pathname to callbacks
list-objects: drop name_path entirely
list-objects: convert name_path to a strbuf
show_object_with_name: simplify by using path_name()
http-push: stop using name_path
tree-diff: catch integer overflow in combine_diff_path allocation
add helpers for detecting size_t overflow
Merge branch 'jk/path-name-safety-2.5' into jk/path-name-safety-2.6
* jk/path-name-safety-2.5:
list-objects: pass full pathname to callbacks
list-objects: drop name_path entirely
list-objects: convert name_path to a strbuf
show_object_with_name: simplify by using path_name()
http-push: stop using name_path
tree-diff: catch integer overflow in combine_diff_path allocation
add helpers for detecting size_t overflow
Merge branch 'jk/path-name-safety-2.4' into jk/path-name-safety-2.5
* jk/path-name-safety-2.4:
list-objects: pass full pathname to callbacks
list-objects: drop name_path entirely
list-objects: convert name_path to a strbuf
show_object_with_name: simplify by using path_name()
http-push: stop using name_path
tree-diff: catch integer overflow in combine_diff_path allocation
add helpers for detecting size_t overflow
When we find a blob at "a/b/c", we currently pass this to
our show_object_fn callbacks as two components: "a/b/" and
"c". Callbacks which want the full value then call
path_name(), which concatenates the two. But this is an
inefficient interface; the path is a strbuf, and we could
simply append "c" to it temporarily, then roll back the
length, without creating a new copy.
So we could improve this by teaching the callsites of
path_name() this trick (and there are only 3). But we can
also notice that no callback actually cares about the
broken-down representation, and simply pass each callback
the full path "a/b/c" as a string. The callback code becomes
even simpler, then, as we do not have to worry about freeing
an allocated buffer, nor rolling back our modification to
the strbuf.
This is theoretically less efficient, as some callbacks
would not bother to format the final path component. But in
practice this is not measurable. Since we use the same
strbuf over and over, our work to grow it is amortized, and
we really only pay to memcpy a few bytes.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, we left name_path as a thin wrapper
around a strbuf. This patch drops it entirely. As a result,
every show_object_fn callback needs to be adjusted. However,
none of their code needs to be changed at all, because the
only use was to pass it to path_name(), which now handles
the bare strbuf.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct name_path" data is examined in only two places:
we generate it in process_tree(), and we convert it to a
single string in path_name(). Everyone else just passes it
through to those functions.
We can further note that process_tree() already keeps a
single strbuf with the leading tree path, for use with
tree_entry_interesting().
Instead of building a separate name_path linked list, let's
just use the one we already build in "base". This reduces
the amount of code (especially tricky code in path_name()
which did not check for integer overflows caused by deep
or large pathnames).
It is also more efficient in some instances. Any time we
were using tree_entry_interesting, we were building up the
strbuf anyway, so this is an immediate and obvious win
there. In cases where we were not, we trade off storing
"pathname/" in a strbuf on the heap for each level of the
path, instead of two pointers and an int on the stack (with
one pointer into the tree object). On a 64-bit system, the
latter is 20 bytes; so if path components are less than that
on average, this has lower peak memory usage. In practice
it probably doesn't matter either way; we are already
holding in memory all of the tree objects leading up to each
pathname, and for normal-depth pathnames, we are only
talking about hundreds of bytes.
This patch leaves "struct name_path" as a thin wrapper
around the strbuf, to avoid disrupting callbacks. We should
fix them, but leaving it out makes this diff easier to view.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
show_object_with_name: simplify by using path_name()
When "git rev-list" shows an object with its associated path
name, it does so by walking the name_path linked list and
printing each component (stopping at any embedded NULs or
newlines).
We'd like to eventually get rid of name_path entirely in
favor of a single buffer, and dropping this custom printing
code is part of that. As a first step, let's use path_name()
to format the list into a single buffer, and print that.
This is strictly less efficient than the original, but it's
a temporary step in the refactoring; our end game will be to
get the fully formatted name in the first place.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Performing computations on size_t variables that we feed to
xmalloc and friends can be dangerous, as an integer overflow
can cause us to allocate a much smaller chunk than we
realized.
We already have unsigned_add_overflows(), but let's add
unsigned_mult_overflows() to that. Furthermore, rather than
have each site manually check and die on overflow, we can
provide some helpers that will:
- promote the arguments to size_t, so that we know we are
doing our computation in the same size of integer that
will ultimately be fed to xmalloc
- check and die on overflow
- return the result so that computations can be done in
the parameter list of xmalloc.
These functions are a lot uglier to use than normal
arithmetic operators (you have to do "st_add(foo, bar)"
instead of "foo + bar"). To at least limit the damage, we
also provide multi-valued versions. So rather than:
st_add(st_add(a, b), st_add(c, d));
you can write:
st_add4(a, b, c, d);
This isn't nearly as elegant as a varargs function, but it's
a lot harder to get it wrong. You don't have to remember to
add a sentinel value at the end, and the compiler will
complain if you get the number of arguments wrong. This
patch adds only the numbered variants required to convert
the current code base; we can easily add more later if
needed.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The graph traversal code here passes along a name_path to
build up the pathname at which we find each blob. But we
never actually do anything with the resulting names, making
it a waste of code and memory.
This usage came in aa1dbc9 (Update http-push functionality,
2006-03-07), and originally the result was passed to
"add_object" (which stored it, but didn't really use it,
either). But we stopped using that function in 1f1e895 (Add
"named object array" concept, 2006-06-19) in favor of
storing just the objects themselves.
Moreover, the generation of the name in process_tree() is
buggy. It sticks "name" onto the end of the name_path linked
list, and then passes it down again as it recurses (instead
of "entry.path"). So it's a good thing this was unused, as
the resulting path for "a/b/c/d" would end up as "a/a/a/a".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
tree-diff: catch integer overflow in combine_diff_path allocation
A combine_diff_path struct has two "flex" members allocated
alongside the struct: a string to hold the pathname, and an
array of parent pointers. We use an "int" to compute this,
meaning we may easily overflow it if the pathname is
extremely long.
We can fix this by using size_t, and checking for overflow
with the st_add helper.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When trying to find a good spot for testing clone with submodules, I
got confused where to add a new test file. There are both tests in t560*
as well as t57* both testing the clone command. t/README claims the
second digit is to indicate the command, which is inconsistent to the
current naming structure.
Rename all t57* tests to be in t56* to follow the pattern of the digits
as laid out in t/README.
It would have been less work to rename t56* => t57* because there are less
files, but the tests in t56* look more basic and I assumed the higher the
last digits the more complicated niche details are tested, so with the patch
now it looks more in order to me.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Error messages should attempt to fit within the confines of
an 80-column terminal to avoid compatibility and accessibility
problems. Furthermore the word "directories" can be misleading
when used in the context of git refnames.
Expand the area of globs applicability for branches and tags
in git-svn. It is now possible to use globs like 'a*e', or 'release_*'.
This allows users to avoid long lines in config like:
Nobody reads this anymore, and they're not likely to; the
interesting thing is whether or not we passed
check_repository_format(), and possibly the individual
"extension" variables.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time, check_repository_format_gently would parse
the config with a single callback, and that callback would
set up a bunch of global variables. But now that we have
separate workdirs, we have to be more careful. Commit 31e26eb (setup.c: support multi-checkout repo setup,
2014-11-30) introduced a reduced callback which omits some
values like core.worktree. In the "main" callback we call
the reduced one, and then add back in the missing variables.
Now that we have split the config-parsing from the munging
of the global variables, we can do it all with a single
callback, and keep all of the "are we in a separate workdir"
logic together.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check our templates to make sure they are from a
version of git we understand (otherwise we would init a
repository we cannot ourselves run in!). But our simple
integer check has fallen behind the times. Let's use the
helpers that setup.c provides to do it right.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
setup: refactor repo format reading and verification
When we want to know if we're in a git repository of
reasonable vintage, we can call check_repository_format_gently(),
which does three things:
1. Reads the config from the .git/config file.
2. Verifies that the version info we read is sane.
3. Writes some global variables based on this.
There are a few things we could improve here.
One is that steps 1 and 3 happen together. So if the
verification in step 2 fails, we still clobber the global
variables. This is especially bad if we go on to try another
repository directory; we may end up with a state of mixed
config variables.
The second is there's no way to ask about the repository
version for anything besides the main repository we're in.
git-init wants to do this, and it's possible that we would
want to start doing so for submodules (e.g., to find out
which ref backend they're using).
We can improve both by splitting the first two steps into
separate functions. Now check_repository_format_gently()
calls out to steps 1 and 2, and does 3 only if step 2
succeeds.
Note that the public interface for read_repository_format()
and what check_repository_format_gently() needs from it are
not quite the same, leading us to have an extra
read_repository_format_1() helper. The extra needs from
check_repository_format_gently() will go away in a future
patch, and we can simplify this then to just the public
interface.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no more callers, and it's a rather confusing
interface. This could just be folded into
git_config_with_options(), but for the sake of readability,
we'll leave it as a separate (static) helper function.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
check_repository_format_gently: stop using git_config_early
There's a chicken-and-egg problem with using the regular
git_config during the repository setup process. We get
around it here by using a special interface that lets us
specify the per-repo config, and avoid calling
git_pathdup().
But this interface doesn't actually make sense. It will look
in the system and per-user config, too; we definitely would
not want to accept a core.repositoryformatversion from
there.
The git_config_from_file interface is a better match, as it
lets us look at a single file.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "shared_repository" config is loaded as part of
check_repository_format_version, but it's not quite like the
other values we check there. Something like
core.repositoryformatversion only makes sense in per-repo
config, but core.sharedrepository can be set in a per-user
config (e.g., to make all "git init" invocations shared by
default).
So it would make more sense as part of git_default_config.
Commit 457f06d (Introduce core.sharedrepository, 2005-12-22)
says:
[...]the config variable is set in the function which
checks the repository format. If this were done in
git_default_config instead, a lot of programs would need
to be modified to call git_config(git_default_config)
first.
This is still the case today, but we have one extra trick up
our sleeve. Now that we have the git_configset
infrastructure, it's not so expensive for us to ask for a
single value. So we can simply lazy-load it on demand.
This should be OK to do in general. There are some problems
with loading config before setup_git_directory() is called,
but we shouldn't be accessing the value before then (if we
were, then it would already be broken, as the variable would
not have been set by check_repository_format_version!). The
trickiest caller is git-init, but it handles the values
manually itself.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
wrap shared_repository global in get/set accessors
It would be useful to control access to the global
shared_repository, so that we can lazily load its config.
The first step to doing so is to make sure all access
goes through a set of functions.
This step is purely mechanical, and should result in no
change of behavior.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function's interface is rather enigmatic, so let's
document it further.
While we're here, let's also drop the return value. It will
always either be "0" or the function will die (consequently,
neither of its two callers bothered to check the return).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
While I was checking all the call sites of sane_grep and sane_egrep,
I noticed this one is somewhat strangely written. The lines in the
file sane_grep works on all begin with 40-hex object name, so there
is no real risk of confusing "test $(...) = ''" by finding something
that begins with a dash, but using the status from sane_grep makes
it a lot clearer what is going on.
Newer versions of GNU grep is reported to be pickier when we feed a
non-ASCII input and break some Porcelain scripts. As we know we do
not feed random binary file to our own sane_grep wrapper, allow us
to always pass "-a" by setting SANE_TEXT_GREP=-a Makefile variable
to work it around.
If two branches each move a file into different directories then
mergetool will fail because it assumes that the file being merged, and
its parent directory, are present in the worktree.
Create the merge file's parent directory to allow using the
deleted base version of the file for merge resolution when
encountering a delete/delete conflict.
The end result is that a delete/delete conflict is presented for the
user to resolve.
Reported-by: Joe Einertson <joe@kidblog.org> Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'jk/pack-idx-corruption-safety' into maint
The code to read the pack data using the offsets stored in the pack
idx file has been made more carefully check the validity of the
data in the idx.
* jk/pack-idx-corruption-safety:
sha1_file.c: mark strings for translation
use_pack: handle signed off_t overflow
nth_packed_object_offset: bounds-check extended offset
t5313: test bounds-checks of corrupted/malicious pack/idx files
Merge branch 'js/config-set-in-non-repository' into maint
"git config section.var value" to set a value in per-repository
configuration file failed when it was run outside any repository,
but didn't say the reason correctly.
* js/config-set-in-non-repository:
git config: report when trying to modify a non-existing repo config
Merge branch 'sb/submodule-module-list-fix' into maint
A helper function "git submodule" uses since v2.7.0 to list the
modules that match the pathspec argument given to its subcommands
(e.g. "submodule add <repo> <path>") has been fixed.
Merge branch 'jk/grep-binary-workaround-in-test' into maint
Recent versions of GNU grep are pickier when their input contains
arbitrary binary data, which some of our tests uses. Rewrite the
tests to sidestep the problem.
* jk/grep-binary-workaround-in-test:
t9200: avoid grep on non-ASCII data
t8005: avoid grep on non-ASCII data
* jk/tighten-alloc: (23 commits)
compat/mingw: brown paper bag fix for 50a6c8e
ewah: convert to REALLOC_ARRAY, etc
convert ewah/bitmap code to use xmalloc
diff_populate_gitlink: use a strbuf
transport_anonymize_url: use xstrfmt
git-compat-util: drop mempcpy compat code
sequencer: simplify memory allocation of get_message
test-path-utils: fix normalize_path_copy output buffer size
fetch-pack: simplify add_sought_entry
fast-import: simplify allocation in start_packfile
write_untracked_extension: use FLEX_ALLOC helper
prepare_{git,shell}_cmd: use argv_array
use st_add and st_mult for allocation size computation
convert trivial cases to FLEX_ARRAY macros
use xmallocz to avoid size arithmetic
convert trivial cases to ALLOC_ARRAY
convert manual allocations to argv_array
argv-array: add detach function
add helpers for allocating flex-array structs
harden REALLOC_ARRAY and xcalloc against size_t overflow
...
"git merge-tree" used to mishandle "both sides added" conflict with
its own "create a fake ancestor file that has the common parts of
what both sides have added and do a 3-way merge" logic; this has
been updated to use the usual "3-way merge with an empty blob as
the fake common ancestor file" approach used in the rest of the
system.
* jk/no-diff-emit-common:
xdiff: drop XDL_EMIT_COMMON
merge-tree: drop generate_common strategy
merge-one-file: use empty blob for add/add base
Merge branch 'nd/dwim-wildcards-as-pathspecs' into maint
"git show 'HEAD:Foo[BAR]Baz'" did not interpret the argument as a
rev, i.e. the object named by the the pathname with wildcard
characters in a tree object.
Handling of errors while writing into our internal asynchronous
process has been made more robust, which reduces flakiness in our
tests.
* jk/epipe-in-async:
t5504: handle expected output from SIGPIPE death
test_must_fail: report number of unexpected signal
fetch-pack: ignore SIGPIPE in sideband demuxer
write_or_die: handle EPIPE in async threads
Many codepaths forget to check return value from git_config_set();
the function is made to die() to make sure we do not proceed when
setting a configuration variable failed.
* ps/config-error:
config: rename git_config_set_or_die to git_config_set
config: rename git_config_set to git_config_set_gently
compat: die when unable to set core.precomposeunicode
sequencer: die on config error when saving replay opts
init-db: die on config errors when initializing empty repo
clone: die on config error in cmd_clone
remote: die on config error when manipulating remotes
remote: die on config error when setting/adding branches
remote: die on config error when setting URL
submodule--helper: die on config error when cloning module
submodule: die on config error when linking modules
branch: die on config error when editing branch description
branch: die on config error when unsetting upstream
branch: report errors in tracking branch setup
config: introduce set_or_die wrappers
Traditionally, the tests that try commands that work on the
contents in the working tree were named with "worktree" in their
filenames, but with the recent addition of "git worktree"
subcommand, whose tests are also named similarly, it has become
harder to tell them apart. The traditional tests have been renamed
to use "work-tree" instead in an attempt to differentiate them.
* mg/work-tree-tests:
tests: rename work-tree tests to *work-tree*