write(2) can hit the same EAGAIN/EWOULDBLOCK errors as read(2),
so busy-looping on a non-blocking FD is a waste of resources.
Currently, I do not know of a way for this happen:
* the NonBlocking directive in systemd does not apply to stdin,
stdout, or stderr.
* xinetd provides no way to set the non-blocking flag at all
But theoretically, it's possible a careless C10K HTTP server
could use pipe2(..., O_NONBLOCK) to setup a pipe for
git-http-backend with only the intent to use non-blocking reads;
but accidentally leave non-blocking set on the write end passed
as stdout to git-upload-pack.
Followup-to: 1079c4be0b720 ("xread: poll on non blocking fds")
Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
run-command: do not pass child process data into callbacks
The expected way to pass data into the callback is to pass them via
the customizable callback pointer. The error reporting in
default_{start_failure, task_finished} is not user friendly enough, that
we want to encourage using the child data for such purposes.
Furthermore the struct child data is cleaned by the run-command API,
before we access them in the callbacks, leading to use-after-free
situations.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch_populated_submodules: use new parallel job processing
In a later patch we enable parallel processing of submodules, this
only adds the possibility for it. So this change should not change
any user facing behavior.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
run-command: add an asynchronous parallel child processor
This allows to run external commands in parallel with ordered output
on stderr.
If we run external commands in parallel we cannot pipe the output directly
to the our stdout/err as it would mix up. So each process's output will
flow through a pipe, which we buffer. One subprocess can be directly
piped to out stdout/err for a low latency feedback to the user.
Example:
Let's assume we have 5 submodules A,B,C,D,E and each fetch takes a
different amount of time as the different submodules vary in size, then
the output of fetches in sequential order might look like this:
time -->
output: |---A---| |-B-| |-------C-------| |-D-| |-E-|
When we schedule these submodules into maximal two parallel processes,
a schedule and sample output over time may look like this:
process 1: |---A---| |-D-| |-E-|
process 2: |-B-| |-------C-------|
output: |---A---|B|---C-------|DE
So A will be perceived as it would run normally in the single child
version. As B has finished by the time A is done, we can dump its whole
progress buffer on stderr, such that it looks like it finished in no
time. Once that is done, C is determined to be the visible child and
its progress will be reported in real time.
So this way of output is really good for human consumption, as it only
changes the timing, not the actual output.
For machine consumption the output needs to be prepared in the tasks,
by either having a prefix per line or per block to indicate whose tasks
output is displayed, because the output order may not follow the
original sequential ordering:
|----A----| |--B--| |-C-|
will be scheduled to be all parallel:
process 1: |----A----|
process 2: |--B--|
process 3: |-C-|
output: |----A----|CB
This happens because C finished before B did, so it will be queued for
output before B.
To detect when a child has finished executing, we check interleaved
with other actions (such as checking the liveliness of children or
starting new processes) whether the stderr pipe still exists. Once a
child closed its stderr stream, we assume it is terminating very soon,
and use `finish_command()` from the single external process execution
interface to collect the exit status.
By maintaining the strong assumption of stderr being open until the
very end of a child process, we can avoid other hassle such as an
implementation using `waitpid(-1)`, which is not implemented in Windows.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf: add strbuf_read_once to read without blocking
The new call will read from a file descriptor into a strbuf once. The
underlying call xread is just run once. xread only reattempts
reading in case of EINTR, which makes it suitable to use for a
nonblocking read.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
EAGAIN The file descriptor fd refers to a file other than a socket
and has been marked nonblocking (O_NONBLOCK), and the read
would block.
EAGAIN or EWOULDBLOCK
The file descriptor fd refers to a socket and has been marked
nonblocking (O_NONBLOCK), and the read would block. POSIX.1-2001
allows either error to be returned for this case, and does not
require these constants to have the same value, so a portable
application should check for both possibilities.
If we get an EAGAIN or EWOULDBLOCK the fd must have set O_NONBLOCK.
As the intent of xread is to read as much as possible either until the
fd is EOF or an actual error occurs, we can ease the feeder of the fd
by not spinning the whole time, but rather wait for it politely by not
busy waiting.
We should not care if the call to poll failed, as we're in an infinite
loop and can only get out with the correct read().
Signed-off-by: Stefan Beller <sbeller@google.com> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule.c: write "Fetching submodule <foo>" to stderr
The "Pushing submodule <foo>" progress output correctly goes to
stderr, but "Fetching submodule <foo>" is going to stdout by
mistake. Fix it to write to stderr.
Noticed while trying to implement a parallel submodule fetch. When
this particular output line went to a different file descriptor, it
was buffered separately, resulting in wrongly interleaved output if
we copied it to the terminal naively.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'sn/null-pointer-arith-in-mark-tree-uninteresting' into maint
mark_tree_uninteresting() has code to handle the case where it gets
passed a NULL pointer in its 'tree' parameter, but the function had
'object = &tree->object' assignment before checking if tree is
NULL. This gives a compiler an excuse to declare that tree will
never be NULL and apply a wrong optimization. Avoid it.
* sn/null-pointer-arith-in-mark-tree-uninteresting:
revision.c: fix possible null pointer arithmetic
Update "git subtree" (in contrib/) so that it can take whitespaces
in the pathnames, not only in the in-tree pathname but the name of
the directory that the repository is in.
* as/subtree-with-spaces:
contrib/subtree: respect spaces in a repository path
t7900-subtree: test the "space in a subdirectory name" case
Merge branch 'jk/test-lint-forbid-when-finished-in-subshell' into maint
Because "test_when_finished" in our test framework queues the
clean-up tasks to be done in a shell variable, it should not be
used inside a subshell. Add a mechanism to allow 'bash' to catch
such uses, and fix the ones that were found.
* jk/test-lint-forbid-when-finished-in-subshell:
test-lib-functions: detect test_when_finished in subshell
t7800: don't use test_config in a subshell
test-lib-functions: support "test_config -C <dir> ..."
t5801: don't use test_when_finished in a subshell
t7610: don't use test_config in a subshell
mark_tree_uninteresting() has code to handle the case where it gets
passed a NULL pointer in its 'tree' parameter, but the function had
'object = &tree->object' assignment before checking if tree is
NULL. This gives a compiler an excuse to declare that tree will
never be NULL and apply a wrong optimization. Avoid it.
* sn/null-pointer-arith-in-mark-tree-uninteresting:
revision.c: fix possible null pointer arithmetic
More transition from "unsigned char[40]" to "struct object_id".
This needed a few merge fixups, but is mostly disentangled from other
topics.
* bc/object-id:
remote: convert functions to struct object_id
Remove get_object_hash.
Convert struct object to object_id
Add several uses of get_object_hash.
object: introduce get_object_hash macro.
ref_newer: convert to use struct object_id
push_refs_with_export: convert to struct object_id
get_remote_heads: convert to struct object_id
parse_fetch: convert to use struct object_id
add_sought_entry_mem: convert to struct object_id
Convert struct ref to use object_id.
sha1_file: introduce has_object_file helper.
The necessary infrastructure to build topics using the free Travis
CI has been added. Developers forking from this topic (and enabling
Travis) can do their own builds, and we can turn on auto-builds for
git/git (including build-status for pull requests that people
open).
A build without NO_IPv6 used to use gethostbyname() when guessing
user's hostname, instead of getaddrinfo() that is used in other
codepaths in such a build.
* ep/ident-with-getaddrinfo:
ident.c: add support for IPv6
Fix some racy client/server tests by treating SIGPIPE the same as a
normal non-zero exit.
* ls/test-must-fail-sigpipe:
add "ok=sigpipe" to test_must_fail and use it to fix flaky tests
implement test_might_fail using a refactored test_must_fail
* dt/refs-backend-pre-vtable:
refs: break out ref conflict checks
files_log_ref_write: new function
initdb: make safe_create_dir public
refs: split filesystem-based refs code into a new file
refs/refs-internal.h: new header file
refname_is_safe(): improve docstring
pack_if_possible_fn(): use ref_type() instead of is_per_worktree_ref()
copy_msg(): rename to copy_reflog_msg()
verify_refname_available(): new function
verify_refname_available(): rename function
Apple's common crypto implementation of SHA1_Update() does not take
more than 4GB at a time, and we now have a compile-time workaround
for it.
* ad/sha1-update-chunked:
sha1: allow limiting the size of the data passed to SHA1_Update()
sha1: provide another level of indirection for the SHA-1 functions
Merge branch 'sg/bash-prompt-dirty-orphan' into maint
Produce correct "dirty" marker for shell prompts, even when we
are on an orphan or an unborn branch.
* sg/bash-prompt-dirty-orphan:
bash prompt: indicate dirty index even on orphan branches
bash prompt: remove a redundant 'git diff' option
bash prompt: test dirty index and worktree while on an orphan branch
t3404: fix quoting of redirect for some versions of bash
As CodingGuidelines says, some versions of bash errors out when
$variable substitution is used as the target for redirection without
being quoted (even though POSIX may not require such a quote).
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description of the sync does not mention --recursive, and the
description of --recursive says that it is only available for foreach,
update and status.
The option was introduced (82f49f294c, Teach --recursive to submodule
sync, 2012-10-26) a while ago, so let's document it, too.
Reported-by: Per Cederqvist <cederp@opera.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to prepare the working tree side of temporary directory
for the "dir-diff" feature forgot that symbolic links need not be
copied (or symlinked) to the temporary area, as the code already
special cases and overwrites them. Besides, it was wrong to try
computing the object name of the target of symbolic link, which may
not even exist or may be a directory.
* da/difftool:
difftool: ignore symbolic links in use_wt_file
Produce correct "dirty" marker for shell prompts, even when we
are on an orphan or an unborn branch.
* sg/bash-prompt-dirty-orphan:
bash prompt: indicate dirty index even on orphan branches
bash prompt: remove a redundant 'git diff' option
bash prompt: test dirty index and worktree while on an orphan branch
Make "-h" command line option work more consistently in all commands.
* rs/parseopt-short-help:
show-ref: stop using PARSE_OPT_NO_INTERNAL_HELP
grep: stop using PARSE_OPT_NO_INTERNAL_HELP
parse-options: allow -h as a short option
parse-options: inline parse_options_usage() at its only remaining caller
parse-options: deduplicate parse_options_usage() calls
Apple's common crypto implementation of SHA1_Update() does not take
more than 4GB at a time, and we now have a compile-time workaround
for it.
* ad/sha1-update-chunked:
sha1: allow limiting the size of the data passed to SHA1_Update()
sha1: provide another level of indirection for the SHA-1 functions
* ls/p4-test-timeouts:
git-p4: add trap to kill p4d on test exit
git-p4: add p4d timeout in tests
git-p4: retry kill/cleanup operations in tests with timeout
* js/test-modernize-t9300:
modernize t9300: move test preparations into test_expect_success
modernize t9300: mark here-doc words to ignore tab indentation
modernize t9300: use test_when_finished for clean-up
modernize t9300: wrap lines after &&
modernize t9300: use test_must_be_empty
modernize t9300: use test_must_fail
modernize t9300: single-quote placement and indentation
* dg/subtree-test-cleanup:
contrib/subtree: Handle '--prefix' argument with a slash appended
contrib/subtree: Make each test self-contained
contrib/subtree: Add split tests
contrib/subtree: Add merge tests
contrib/subtree: Add tests for subtree add
contrib/subtree: Add test for missing subtree
contrib/subtree: Clean and refactor test code
verify_pack: do not ignore return value of verification function
In verify_pack, a caller-supplied verification function is called.
The function returns an int. If that return value is non-zero,
verify_pack should fail.
The only caller of verify_pack is in builtin/fsck.c, whose verify_fn
returns a meaningful error code (which was then ignored). Now, fsck
might return a different error code (with more detail). This would
happen in the unlikely event that a commit or tree that is a valid git
object but not a valid instance of its type gets into a pack.
Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Jeff King <peff@peff.net>
Make error message after failing commit_lock_file() less confusing
The error message after a failing commit_lock_file() call sometimes
looks like this, causing confusion:
$ git remote add remote git@server.com/repo.git
error: could not commit config file .git/config
# Huh?!
# I didn't want to commit anything, especially not my config file!
While in the narrow context of the lockfile module using the verb
'commit' in the error message makes perfect sense, in the broader
context of git the word 'commit' already has a very specific meaning,
hence the confusion.
Reword these error messages to say "could not write" instead of "could
not commit".
While at it, include strerror in the error messages after writing the
config file or the credential store fails to provide some information
about the cause of the failure, and update the style of the error
message after writing the reflog fails to match surrounding error
messages (i.e. no '' around the pathname and no () around the error
description).
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Jeff King <peff@peff.net>
* maint:
http: treat config options sslCAPath and sslCAInfo as paths
Documentation/diff: give --word-diff-regex=. example
filter-branch: deal with object name vs. pathname ambiguity in tree-filter
check-ignore: correct documentation about output
git-p4: clean up after p4 submit failure
git-p4: work with a detached head
git-p4: add option to system() to return subshell status
git-p4: add failing test for submit from detached head
remote-http(s): support SOCKS proxies
t5813: avoid creating urls that break on cygwin
Escape Git's exec path in contrib/rerere-train.sh script
allow hooks to ignore their standard input stream
rebase-i-exec: Allow space in SHELL_PATH
Documentation: make environment variable formatting more consistent
* ld/p4-detached-head:
git-p4: work with a detached head
git-p4: add option to system() to return subshell status
git-p4: add failing test for submit from detached head
wt-status: correct and simplify check for detached HEAD
If a branch name is longer than four characters then memcmp() reads over
the end of the static string "HEAD". This causes the following test
failures with AddressSanitizer:
And if a branch is named "H", "HE", or "HEA" then the current if clause
erroneously considers it as matching "HEAD" because it only compares
up to the end of the branch name.
Fix that by doing the comparison using strcmp() and only after the
branch name is extracted. This way neither too less nor too many
characters are checked. While at it call strchrnul() to find the end
of the branch name instead of open-coding it.
Helped-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net>
Add IPv6 support by implementing name resolution with the
protocol agnostic getaddrinfo(3) API. The old gethostbyname(3)
code is still available when git is compiled with NO_IPV6.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jeff King <peff@peff.net>
add "ok=sigpipe" to test_must_fail and use it to fix flaky tests
t5516 "75 - deny fetch unreachable SHA1, allowtipsha1inwant=true" is
flaky in the following case:
1. remote upload-pack finds out "not our ref"
2. remote sends a response and closes the pipe
3. fetch-pack still tries to write commands to the remote upload-pack
4. write call in wrapper.c dies with SIGPIPE
The test is flaky because the sending fetch-pack may or may
not have finished writing its output by step (3). If it did,
then we see a closed pipe on the next read() call. If it
didn't, then we get the SIGPIPE from step (4) above. Both
are fine, but the latter fools test_must_fail.
t5504 "9 - push with transfer.fsckobjects" is flaky, too, and returns
SIGPIPE once in a while. I had to remove the final "To dst..." output
check because there is no output if the process dies with SIGPIPE.
Accept such a death-with-sigpipe also as OK when we are expecting a
failure.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Jeff King <peff@peff.net>
implement test_might_fail using a refactored test_must_fail
Add an (optional) first parameter "ok=<special case>" to test_must_fail
and return success for "<special case>". Add "success" as
"<special case>" and use it to implement "test_might_fail". This removes
redundancies in test-lib-function.sh.
You can pass multiple <special case> arguments divided by comma (e.g.
"test_must_fail ok=success,something")
Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Jeff King <peff@peff.net>
filter-branch: deal with object name vs. pathname ambiguity in tree-filter
'git filter-branch' fails complaining about an ambiguous argument, if
a tree-filter renames a path and the new pathname happens to match an
existing object name.
After the tree-filter has been applied, 'git filter-branch' looks for
changed paths by running:
If the CA path isn't found it's most likely to indicate a
misconfiguration, in which case accepting any certificate is unlikely to
be the correct thing to do.
Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Jeff King <peff@peff.net>
By default git check-ignore shows only the filenames that will be
ignored, not the pattern that causes their exclusion. Instead of moving
the partial exclude pattern precendence information to the -v option
where it belongs, link to gitignore(5) which describes this more
thoroughly.
Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net>
Commit 1b0d400 refactored the prepare_final() function so
that it could be reused in multiple places. Originally, the
loop had two outputs: a commit to stuff into sb->final, and
the name of the commit from the rev->pending array.
After the refactor, that loop is put in its own function
with a single return value: the object_array_entry from the
rev->pending array. This contains both the name and the object,
but with one important difference: the object is the
_original_ object found by the revision parser, not the
dereferenced commit. If one feeds a tag to "git blame", we
end up casting the tag object to a "struct commit", which
causes a segfault.
Instead, let's return the commit (properly casted) directly
from the function, and take the "name" as an optional
out-parameter. This does the right thing, and actually
simplifies the callers, who no longer need to cast or
dereference the object_array_entry themselves.
When "p4 submit" command fails in P4Submit.applyCommit, the
workspace is left with the changes. We already have code to revert
the changes to the workspace when the user decides to cancel
submission by aborting the editor that edits the change description,
and we should treat the "p4 submit" failure the same way.
Clean the workspace if p4_write_pipe raised SystemExit, so that the
user don't have to do it themselves.
Signed-off-by: GIRARD Etienne <egirard@murex.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Luke Diamand <luke@diamand.org> Signed-off-by: Jeff King <peff@peff.net>
bash prompt: indicate dirty index even on orphan branches
__git_ps1() doesn't indicate dirty index while on an orphan branch.
To check the dirtiness of the index, __git_ps1() runs 'git diff-index
--cached ... HEAD', which doesn't work on an orphan branch,
because HEAD doesn't point to a valid commit.
Run 'git diff ... --cached' instead, as it does the right thing both
on valid and invalid HEAD, i.e. compares the index to the existing
HEAD in the former case and to the empty tree in the latter. This
fixes the two failing tests added in the first commit of this series.
The dirtiness of the worktree is already checked with 'git diff' and
is displayed correctly even on an orphan branch.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Jeff King <peff@peff.net>