Merge branch 'sb/submodule-helper-clone-regression-fix' into sb/submodule-init
* sb/submodule-helper-clone-regression-fix:
submodule--helper, module_clone: catch fprintf failure
submodule--helper: do not borrow absolute_path() result for too long
submodule--helper, module_clone: always operate on absolute paths
submodule--helper clone: create the submodule path just once
submodule--helper: fix potential NULL-dereference
recursive submodules: test for relative paths
submodule--helper: do not borrow absolute_path() result for too long
absolute_path() is designed to allow its callers to take a brief
peek of the result (typically, to be fed to functions like
strbuf_add() and relative_path() as a parameter) without having to
worry about freeing it, but the other side of the coin of that
memory model is that the caller shouldn't rely too much on the
result living forever--there may be a helper function the caller
subsequently calls that makes its own call to absolute_path(),
invalidating the earlier result.
Use xstrdup() to make our own copy, and free(3) it when we are done.
While at it, remove an unnecessary sm_gitdir_rel variable that was
only used to as a parameter to call absolute_path() and never used
again.
submodule--helper, module_clone: always operate on absolute paths
When giving relative paths to `relative_path` to compute a relative path
from one directory to another, this may fail in `relative_path`.
Make sure both arguments to `relative_path` are always absolute.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git submodule update --init --recursive" uses full path to refer to
the true location of the repository in the "gitdir:" pointer for
nested submodules; the command used to use relative paths.
This was reported by Norio Nomura in $gmane/290280.
The root cause for that bug is in using recursive submodules as
their relative path handling was broken in ee8838d (2015-09-08,
submodule: rewrite `module_clone` shell function in C).
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose possible parallelism either via the "--jobs" CLI parameter or
the "submodule.fetchJobs" setting.
By having the variable initialized to -1, we make sure 0 can be passed
into the parallel processing machine, which will then pick as many parallel
workers as there are CPUs.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git submodule update: have a dedicated helper for cloning
This introduces a new helper function in git submodule--helper
which takes care of cloning all submodules, which we want to
parallelize eventually.
Some tests (such as empty URL, update_mode=none) are required in the
helper to make the decision for cloning. These checks have been
moved into the C function as well (no need to repeat them in the
shell script).
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
run_processes_parallel: rename parameters for the callbacks
The refs code has a similar pattern of passing around 'struct strbuf *err',
which is strictly used for error reporting. This is not the case here,
as the strbuf is used to accumulate all the output (whether it is error
or not) for the user. Rename it to 'out'.
Suggested-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
run_processes_parallel: treat output of children as byte array
We do not want the output to be interrupted by a NUL byte, so we
cannot use raw fputs. Introduce strbuf_write to avoid having long
arguments in run-command.c.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows to configure fetching and updating in parallel
without having the command line option.
This moved the responsibility to determine how many parallel processes
to start from builtin/fetch to submodule.c as we need a way to communicate
"The user did not specify the number of parallel processes in the command
line options" in the builtin fetch. The submodule code takes care of
the precedence (CLI > config > default).
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adhere to the common coding style of Git and not check explicitly
for NULL throughout the file. There are still other occurrences in the
code base but that is usually inside of conditions with side effects.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently submodule.<name>.update is only handled by git-submodule.sh.
C code will start to need to make use of that value as more of the
functionality of git-submodule.sh moves into library code in C.
Add the update field to 'struct submodule' and populate it so it can
be read as sm->update or from sm->update_command.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
run-command: do not pass child process data into callbacks
The expected way to pass data into the callback is to pass them via
the customizable callback pointer. The error reporting in
default_{start_failure, task_finished} is not user friendly enough, that
we want to encourage using the child data for such purposes.
Furthermore the struct child data is cleaned by the run-command API,
before we access them in the callbacks, leading to use-after-free
situations.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch_populated_submodules: use new parallel job processing
In a later patch we enable parallel processing of submodules, this
only adds the possibility for it. So this change should not change
any user facing behavior.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
run-command: add an asynchronous parallel child processor
This allows to run external commands in parallel with ordered output
on stderr.
If we run external commands in parallel we cannot pipe the output directly
to the our stdout/err as it would mix up. So each process's output will
flow through a pipe, which we buffer. One subprocess can be directly
piped to out stdout/err for a low latency feedback to the user.
Example:
Let's assume we have 5 submodules A,B,C,D,E and each fetch takes a
different amount of time as the different submodules vary in size, then
the output of fetches in sequential order might look like this:
time -->
output: |---A---| |-B-| |-------C-------| |-D-| |-E-|
When we schedule these submodules into maximal two parallel processes,
a schedule and sample output over time may look like this:
process 1: |---A---| |-D-| |-E-|
process 2: |-B-| |-------C-------|
output: |---A---|B|---C-------|DE
So A will be perceived as it would run normally in the single child
version. As B has finished by the time A is done, we can dump its whole
progress buffer on stderr, such that it looks like it finished in no
time. Once that is done, C is determined to be the visible child and
its progress will be reported in real time.
So this way of output is really good for human consumption, as it only
changes the timing, not the actual output.
For machine consumption the output needs to be prepared in the tasks,
by either having a prefix per line or per block to indicate whose tasks
output is displayed, because the output order may not follow the
original sequential ordering:
|----A----| |--B--| |-C-|
will be scheduled to be all parallel:
process 1: |----A----|
process 2: |--B--|
process 3: |-C-|
output: |----A----|CB
This happens because C finished before B did, so it will be queued for
output before B.
To detect when a child has finished executing, we check interleaved
with other actions (such as checking the liveliness of children or
starting new processes) whether the stderr pipe still exists. Once a
child closed its stderr stream, we assume it is terminating very soon,
and use `finish_command()` from the single external process execution
interface to collect the exit status.
By maintaining the strong assumption of stderr being open until the
very end of a child process, we can avoid other hassle such as an
implementation using `waitpid(-1)`, which is not implemented in Windows.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf: add strbuf_read_once to read without blocking
The new call will read from a file descriptor into a strbuf once. The
underlying call xread is just run once. xread only reattempts
reading in case of EINTR, which makes it suitable to use for a
nonblocking read.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
EAGAIN The file descriptor fd refers to a file other than a socket
and has been marked nonblocking (O_NONBLOCK), and the read
would block.
EAGAIN or EWOULDBLOCK
The file descriptor fd refers to a socket and has been marked
nonblocking (O_NONBLOCK), and the read would block. POSIX.1-2001
allows either error to be returned for this case, and does not
require these constants to have the same value, so a portable
application should check for both possibilities.
If we get an EAGAIN or EWOULDBLOCK the fd must have set O_NONBLOCK.
As the intent of xread is to read as much as possible either until the
fd is EOF or an actual error occurs, we can ease the feeder of the fd
by not spinning the whole time, but rather wait for it politely by not
busy waiting.
We should not care if the call to poll failed, as we're in an infinite
loop and can only get out with the correct read().
Signed-off-by: Stefan Beller <sbeller@google.com> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule.c: write "Fetching submodule <foo>" to stderr
The "Pushing submodule <foo>" progress output correctly goes to
stderr, but "Fetching submodule <foo>" is going to stdout by
mistake. Fix it to write to stderr.
Noticed while trying to implement a parallel submodule fetch. When
this particular output line went to a different file descriptor, it
was buffered separately, resulting in wrongly interleaved output if
we copied it to the terminal naively.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'sn/null-pointer-arith-in-mark-tree-uninteresting' into maint
mark_tree_uninteresting() has code to handle the case where it gets
passed a NULL pointer in its 'tree' parameter, but the function had
'object = &tree->object' assignment before checking if tree is
NULL. This gives a compiler an excuse to declare that tree will
never be NULL and apply a wrong optimization. Avoid it.
* sn/null-pointer-arith-in-mark-tree-uninteresting:
revision.c: fix possible null pointer arithmetic
Update "git subtree" (in contrib/) so that it can take whitespaces
in the pathnames, not only in the in-tree pathname but the name of
the directory that the repository is in.
* as/subtree-with-spaces:
contrib/subtree: respect spaces in a repository path
t7900-subtree: test the "space in a subdirectory name" case
Merge branch 'jk/test-lint-forbid-when-finished-in-subshell' into maint
Because "test_when_finished" in our test framework queues the
clean-up tasks to be done in a shell variable, it should not be
used inside a subshell. Add a mechanism to allow 'bash' to catch
such uses, and fix the ones that were found.
* jk/test-lint-forbid-when-finished-in-subshell:
test-lib-functions: detect test_when_finished in subshell
t7800: don't use test_config in a subshell
test-lib-functions: support "test_config -C <dir> ..."
t5801: don't use test_when_finished in a subshell
t7610: don't use test_config in a subshell
mark_tree_uninteresting() has code to handle the case where it gets
passed a NULL pointer in its 'tree' parameter, but the function had
'object = &tree->object' assignment before checking if tree is
NULL. This gives a compiler an excuse to declare that tree will
never be NULL and apply a wrong optimization. Avoid it.
* sn/null-pointer-arith-in-mark-tree-uninteresting:
revision.c: fix possible null pointer arithmetic
More transition from "unsigned char[40]" to "struct object_id".
This needed a few merge fixups, but is mostly disentangled from other
topics.
* bc/object-id:
remote: convert functions to struct object_id
Remove get_object_hash.
Convert struct object to object_id
Add several uses of get_object_hash.
object: introduce get_object_hash macro.
ref_newer: convert to use struct object_id
push_refs_with_export: convert to struct object_id
get_remote_heads: convert to struct object_id
parse_fetch: convert to use struct object_id
add_sought_entry_mem: convert to struct object_id
Convert struct ref to use object_id.
sha1_file: introduce has_object_file helper.
The necessary infrastructure to build topics using the free Travis
CI has been added. Developers forking from this topic (and enabling
Travis) can do their own builds, and we can turn on auto-builds for
git/git (including build-status for pull requests that people
open).
A build without NO_IPv6 used to use gethostbyname() when guessing
user's hostname, instead of getaddrinfo() that is used in other
codepaths in such a build.
* ep/ident-with-getaddrinfo:
ident.c: add support for IPv6
Fix some racy client/server tests by treating SIGPIPE the same as a
normal non-zero exit.
* ls/test-must-fail-sigpipe:
add "ok=sigpipe" to test_must_fail and use it to fix flaky tests
implement test_might_fail using a refactored test_must_fail
* dt/refs-backend-pre-vtable:
refs: break out ref conflict checks
files_log_ref_write: new function
initdb: make safe_create_dir public
refs: split filesystem-based refs code into a new file
refs/refs-internal.h: new header file
refname_is_safe(): improve docstring
pack_if_possible_fn(): use ref_type() instead of is_per_worktree_ref()
copy_msg(): rename to copy_reflog_msg()
verify_refname_available(): new function
verify_refname_available(): rename function
Apple's common crypto implementation of SHA1_Update() does not take
more than 4GB at a time, and we now have a compile-time workaround
for it.
* ad/sha1-update-chunked:
sha1: allow limiting the size of the data passed to SHA1_Update()
sha1: provide another level of indirection for the SHA-1 functions
Merge branch 'sg/bash-prompt-dirty-orphan' into maint
Produce correct "dirty" marker for shell prompts, even when we
are on an orphan or an unborn branch.
* sg/bash-prompt-dirty-orphan:
bash prompt: indicate dirty index even on orphan branches
bash prompt: remove a redundant 'git diff' option
bash prompt: test dirty index and worktree while on an orphan branch
t3404: fix quoting of redirect for some versions of bash
As CodingGuidelines says, some versions of bash errors out when
$variable substitution is used as the target for redirection without
being quoted (even though POSIX may not require such a quote).
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description of the sync does not mention --recursive, and the
description of --recursive says that it is only available for foreach,
update and status.
The option was introduced (82f49f294c, Teach --recursive to submodule
sync, 2012-10-26) a while ago, so let's document it, too.
Reported-by: Per Cederqvist <cederp@opera.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to prepare the working tree side of temporary directory
for the "dir-diff" feature forgot that symbolic links need not be
copied (or symlinked) to the temporary area, as the code already
special cases and overwrites them. Besides, it was wrong to try
computing the object name of the target of symbolic link, which may
not even exist or may be a directory.
* da/difftool:
difftool: ignore symbolic links in use_wt_file
Produce correct "dirty" marker for shell prompts, even when we
are on an orphan or an unborn branch.
* sg/bash-prompt-dirty-orphan:
bash prompt: indicate dirty index even on orphan branches
bash prompt: remove a redundant 'git diff' option
bash prompt: test dirty index and worktree while on an orphan branch
Make "-h" command line option work more consistently in all commands.
* rs/parseopt-short-help:
show-ref: stop using PARSE_OPT_NO_INTERNAL_HELP
grep: stop using PARSE_OPT_NO_INTERNAL_HELP
parse-options: allow -h as a short option
parse-options: inline parse_options_usage() at its only remaining caller
parse-options: deduplicate parse_options_usage() calls
Apple's common crypto implementation of SHA1_Update() does not take
more than 4GB at a time, and we now have a compile-time workaround
for it.
* ad/sha1-update-chunked:
sha1: allow limiting the size of the data passed to SHA1_Update()
sha1: provide another level of indirection for the SHA-1 functions
* ls/p4-test-timeouts:
git-p4: add trap to kill p4d on test exit
git-p4: add p4d timeout in tests
git-p4: retry kill/cleanup operations in tests with timeout
* js/test-modernize-t9300:
modernize t9300: move test preparations into test_expect_success
modernize t9300: mark here-doc words to ignore tab indentation
modernize t9300: use test_when_finished for clean-up
modernize t9300: wrap lines after &&
modernize t9300: use test_must_be_empty
modernize t9300: use test_must_fail
modernize t9300: single-quote placement and indentation
* dg/subtree-test-cleanup:
contrib/subtree: Handle '--prefix' argument with a slash appended
contrib/subtree: Make each test self-contained
contrib/subtree: Add split tests
contrib/subtree: Add merge tests
contrib/subtree: Add tests for subtree add
contrib/subtree: Add test for missing subtree
contrib/subtree: Clean and refactor test code
verify_pack: do not ignore return value of verification function
In verify_pack, a caller-supplied verification function is called.
The function returns an int. If that return value is non-zero,
verify_pack should fail.
The only caller of verify_pack is in builtin/fsck.c, whose verify_fn
returns a meaningful error code (which was then ignored). Now, fsck
might return a different error code (with more detail). This would
happen in the unlikely event that a commit or tree that is a valid git
object but not a valid instance of its type gets into a pack.
Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Jeff King <peff@peff.net>
Make error message after failing commit_lock_file() less confusing
The error message after a failing commit_lock_file() call sometimes
looks like this, causing confusion:
$ git remote add remote git@server.com/repo.git
error: could not commit config file .git/config
# Huh?!
# I didn't want to commit anything, especially not my config file!
While in the narrow context of the lockfile module using the verb
'commit' in the error message makes perfect sense, in the broader
context of git the word 'commit' already has a very specific meaning,
hence the confusion.
Reword these error messages to say "could not write" instead of "could
not commit".
While at it, include strerror in the error messages after writing the
config file or the credential store fails to provide some information
about the cause of the failure, and update the style of the error
message after writing the reflog fails to match surrounding error
messages (i.e. no '' around the pathname and no () around the error
description).
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Jeff King <peff@peff.net>
* maint:
http: treat config options sslCAPath and sslCAInfo as paths
Documentation/diff: give --word-diff-regex=. example
filter-branch: deal with object name vs. pathname ambiguity in tree-filter
check-ignore: correct documentation about output
git-p4: clean up after p4 submit failure
git-p4: work with a detached head
git-p4: add option to system() to return subshell status
git-p4: add failing test for submit from detached head
remote-http(s): support SOCKS proxies
t5813: avoid creating urls that break on cygwin
Escape Git's exec path in contrib/rerere-train.sh script
allow hooks to ignore their standard input stream
rebase-i-exec: Allow space in SHELL_PATH
Documentation: make environment variable formatting more consistent
* ld/p4-detached-head:
git-p4: work with a detached head
git-p4: add option to system() to return subshell status
git-p4: add failing test for submit from detached head
wt-status: correct and simplify check for detached HEAD
If a branch name is longer than four characters then memcmp() reads over
the end of the static string "HEAD". This causes the following test
failures with AddressSanitizer:
And if a branch is named "H", "HE", or "HEA" then the current if clause
erroneously considers it as matching "HEAD" because it only compares
up to the end of the branch name.
Fix that by doing the comparison using strcmp() and only after the
branch name is extracted. This way neither too less nor too many
characters are checked. While at it call strchrnul() to find the end
of the branch name instead of open-coding it.
Helped-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net>
Add IPv6 support by implementing name resolution with the
protocol agnostic getaddrinfo(3) API. The old gethostbyname(3)
code is still available when git is compiled with NO_IPV6.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jeff King <peff@peff.net>