Commit 09c9957 (send-pack: avoid deadlock when pack-object
dies early, 2011-04-25) attempted to fix a hang in the
stateless rpc case by closing a file descriptor early, but
we still need that descriptor.
Basically the deadlock can happen when pack-objects fails,
and the descriptor to upstream is left open. We never send
the pack, so the upstream is left waiting for us to say
something, and we are left waiting for upstream to close the
connection.
In the non-rpc case, our descriptor points straight to the
upstream. We hand it off to run-command, which takes
ownership and closes the descriptor after pack-objects
finishes (whether it succeeds or not).
Commit 09c9957 tried to emulate that in the rpc case. That
isn't right, though. We actually have a descriptor going
back to the remote-helper, and we need to keep using it
after pack-objects is finished. Closing it early completely
breaks pushing via smart-http.
We still need to do something on error to signal the
remote-helper that we won't be sending any pack data
(otherwise we get the deadlock). In an ideal world, we
would send a special packet back that says "Sorry, there was
an error". But the remote-helper doesn't understand any such
packet, so the best we can do is close the descriptor and
let it report that we hung up unexpectedly.
Signed-off-by: Jeff King <peff@peff.net> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'js/maint-1.6.6-send-pack-stateless-rpc-deadlock-fix' into js/maint-send-pack-stateless-rpc-deadlock-fix
* js/maint-1.6.6-send-pack-stateless-rpc-deadlock-fix:
send-pack: avoid deadlock when pack-object dies early
Evil merge to adjust the way the use of pthreads in sideband-demultiplexor
was decided (earlier it was "if we are not on Windows", now it is "if we
are not using pthreads").
send-pack: avoid deadlock when pack-object dies early
Send-pack deadlocks in two ways when pack-object dies early (for example,
because there is some repo corruption).
The first deadlock happens with the smart push protocol (--stateless-rpc).
After the initial rev-exchange, the remote is waiting for the pack data
to arrive, and the sideband demuxer at the local side continues trying to
stream data from the remote repository until it gets EOF. Meanwhile,
send-pack (in function pack_objects()) has noticed that pack-objects did
not produce output and died. Back in send_pack(), it now tries to clean
up the sideband demuxer using finish_async(). The demuxer, however, waits
for the remote end to close down, the remote waits for pack data, and
the reason that it still waits is that send-pack forgot to close the
outgoing channel. Add the missing close() in pack_objects().
The second deadlock happens in a similar constellation when the sideband
demuxer runs in a forked process (rather than in a thread). Again, the
remote end waits for pack data to arrive, the sideband demuxer waits for
the remote to shut down, and send-pack (in the regular clean-up) waits for
the demuxer to terminate. This time, the send-pack parent process closes
the writable end of the outgoing channel (in start_command() that spawned
pack-objects) so that after the death of the pack-objects process all
writable ends should have been closed and the remote repo should see EOF.
This does not happen, however, because when the sideband demuxer was forked
earlier, it also inherited a writable end; it remains open and keeps the
remote repo from seeing EOF. To break this deadlock, close the writable end
in the demuxer.
Analyzed-by: Jeff King <peff@peff.net> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dying in an async procedure should only exit the thread, not the process.
Async procedures are intended as helpers that perform a very restricted
task, and the caller usually has to manage them in a larger context.
Conceptually, the async procedure is not concerned with the "bigger
picture" in whose context it is run. When it dies, it is not supposed
to destroy this "bigger picture", but rather only its own limit view
of the world. On POSIX, the async procedure is run in its own process,
and exiting this process naturally had only these limited effects.
On Windows (or when ASYNC_AS_THREAD is set), calling die() exited the
whole process, destroying the caller (the "big picture") as well.
This fixes it to exit only the thread.
Without ASYNC_AS_THREAD, one particular effect of exiting the async
procedure process is that it automatically closes file descriptors, most
notably the writable end of the pipe that the async procedure writes to.
The async API already requires that the async procedure closes the pipe
ends when it exits normally. But for calls to die() no requirements are
imposed. In the non-threaded case the pipe ends are closed implicitly
by the exiting process, but in the threaded case, the die routine must
take care of closing them.
Now t5530-upload-pack-error.sh passes on Windows.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, async procedures have always been run in threads, and the
implementation used Windows specific APIs. Rewrite the code to use pthreads.
A new configuration option is introduced so that the threaded implementation
can also be used on POSIX systems. Since this option is intended only as
playground on POSIX, but is mandatory on Windows, the option is not
documented.
One detail is that on POSIX it is necessary to set FD_CLOEXEC on the pipe
handles. On Windows, this is not needed because pipe handles are not
inherited to child processes, and the new calls to set_cloexec() are
effectively no-ops.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/maint-fix-pager:
tests: Fix race condition in t7006-pager
t7006-pager: if stdout is not a terminal, make a new one
tests: Add tests for automatic use of pager
am: Fix launching of pager
git svn: Fix launching of pager
git.1: Clarify the behavior of the --paginate option
Make 'git var GIT_PAGER' always print the configured pager
Fix 'git var' usage synopsis
Merge branch 'np/compress-loose-object-memsave' into maint
* np/compress-loose-object-memsave:
sha1_file: be paranoid when creating loose objects
sha1_file: don't malloc the whole compressed result when writing out objects
submodule summary: do not shift a non-existent positional variable
When "git submodule summary" is run without any argument, we default to
compare the state of index with the HEAD, but tried to shift out $1 that
does not exist (and worse yet, we didn't use it).
* sp/maint-push-sideband:
receive-pack: Send internal errors over side-band #2
t5401: Use a bare repository for the remote peer
receive-pack: Send hook output over side band #2
receive-pack: Wrap status reports inside side-band-64k
receive-pack: Refactor how capabilities are shown to the client
send-pack: demultiplex a sideband stream with status data
run-command: support custom fd-set in async
run-command: Allow stderr to be a caller supplied pipe
* np/fast-import-idx-v2:
fast-import: use the diff_delta() max_delta_size argument
fast-import: honor pack.indexversion and pack.packsizelimit config vars
fast-import: make default pack size unlimited
fast-import: use write_idx_file() instead of custom code
fast-import: use sha1write() for pack data
fast-import: start using struct pack_idx_entry
* jn/maint-fix-pager:
tests: Fix race condition in t7006-pager
t7006-pager: if stdout is not a terminal, make a new one
tests: Add tests for automatic use of pager
am: Fix launching of pager
git svn: Fix launching of pager
git.1: Clarify the behavior of the --paginate option
Make 'git var GIT_PAGER' always print the configured pager
Fix 'git var' usage synopsis
* jc/for-each-ref:
for-each-ref --format='%(flag)'
for-each-ref --format='%(symref) %(symref:short)'
builtin-for-each-ref.c: check if we need to peel onion while parsing the format
builtin-for-each-ref.c: comment fixes
* np/compress-loose-object-memsave:
sha1_file: be paranoid when creating loose objects
sha1_file: don't malloc the whole compressed result when writing out objects
This commit fixes a bug in processing project-specific override in
a situation when there is no project, e.g. for the projects list page.
When 'snapshot' feature had project specific config override enabled
by putting
$feature{'snapshot'}{'override'} = 1;
(or equivalent) in $GITWEB_CONFIG, and when viewing toplevel gitweb
page, which means the projects list page (to be more exact this
happens for any project-less action), gitweb would put the following
Perl warnings in error log:
gitweb.cgi: Use of uninitialized value $git_dir in concatenation (.) or string at gitweb.cgi line 2065.
fatal: error processing config file(s)
gitweb.cgi: Use of uninitialized value $git_dir in concatenation (.) or string at gitweb.cgi line 2221.
gitweb.cgi: Use of uninitialized value $git_dir in concatenation (.) or string at gitweb.cgi line 2218.
The problem is in the following fragment of code:
# path to the current git repository
our $git_dir;
$git_dir = "$projectroot/$project" if $project;
# list of supported snapshot formats
our @snapshot_fmts = gitweb_get_feature('snapshot');
@snapshot_fmts = filter_snapshot_fmts(@snapshot_fmts);
For the toplevel gitweb page, which is the list of projects, $project is not
defined, therefore neither is $git_dir. gitweb_get_feature() subroutine
calls git_get_project_config() if project specific override is turned
on... but we don't have project here.
Those errors mentioned above occur in the following fragment of code in
git_get_project_config():
# get config
if (!defined $config_file ||
$config_file ne "$git_dir/config") {
%config = git_parse_project_config('gitweb');
$config_file = "$git_dir/config";
}
git_parse_project_config() calls git_cmd() which has '--git-dir='.$git_dir
There are (at least) three possible solutions:
1. Harden gitweb_get_feature() so that it doesn't call
git_get_project_config() if $project (and therefore $git_dir) is not
defined; there is no project for project specific config.
2. Harden git_get_project_config() like you did in your fix, returning early
if $git_dir is not defined.
3. Harden git_cmd() so that it doesn't add "--git-dir=$git_dir" if $git_dir
is not defined, and change git_get_project_config() so that it doesn't
even try to access $git_dir if it is not defined.
This commit implements both 1.) and 2.), i.e. gitweb_get_feature() doesn't
call project-specific override if $git_dir is not defined (if there is no
project), and git_get_project_config() returns early if $git_dir is not
defined.
Add a test for this bug to t/t9500-gitweb-standalone-no-errors.sh test.
Reported-by: Eli Barzilay <eli@barzilay.org> Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
bisect: error out when passing bad path parameters
As reported by Mark Lodato, "git bisect", when it was started with
path parameters that match no commit was kind of working without
taking account of path parameters and was reporting something like:
Bisecting: -1 revisions left to test after this (roughly 0 steps)
It is more correct and safer to just error out in this case, before
displaying the revisions left, so this patch does just that.
Note that this bug is very old, it exists at least since v1.5.5.
And it is possible to detect that case earlier in the bisect
algorithm, but it is not clear that it would be an improvement to
error out earlier, on the contrary it may change the behavior of
"git rev-list --bisect-all" for example, which is currently correct.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove reference to GREP_COLORS from documentation
There is no longer support for external grep, as per bbc09c2 (grep: rip
out support for external grep, 2010-01-12), so remove the reference to it
from the documentation.
Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The definition of TEST_OBJS in commit daa99a91 (Makefile: make sure
test helpers are rebuilt when headers change, 2010-01-26) moved a use
of $X to before the platform-specific section where it gets defined.
There are at least two ways to fix that:
- Change the definition of TEST_OBJS to use the = delayed
evaluation operator. This way, one need not worry about $(X)
needing to be defined before TEST_OBJS is set.
- Move the definition of TEST_OBJS to below the definition of $X.
Carry out the second. The later site of definition makes the code more
readable, since now a reader only has to look down one line to see what
TEST_OBJS is meant to be used for.
Oddly enough, with or without this change the behavior of the Makefile
is the same. Since TEST_PROGRAMS is defined with delayed evaluation,
the value of
is independent of the value of $X when it is evaluated: the $X in the
pattern and the $X in $(TEST_PROGRAMS) will simply always cancel out.
Make sure $X has the expected expansion anyway to make the code and
the reader’s sanity more robust in the face of future changes.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When svn:mergeinfo contains two new parents in a specific order and
one is ancestor of the other, it is possible that git-svn discards the
wrong one. The first test case ("commit made to merged branch is
reachable from the merge") proves this.
The second test case ("merging two branches in one commit is detected
correctly") is just for completeness, since there was no test for
merging two (feature) branches to trunk in one commit.
Signed-off-by: Tuomas Suutari <tuomas.suutari@gmail.com> Acked-by: Eric Wong <normalperson@yhbt.net>
A few "svn cp" commands and commit commands were executed in incorrect
order. Therefore some of the desired commits were missing and some
were committed with wrong revision number in the commit message. This
made it hard to compare the produced git repository with the SVN
repository.
The dump file is updated too, but only the relevant parts and with
hand-edited timestamps to make history linear.
Signed-off-by: Tuomas Suutari <tuomas.suutari@gmail.com> Acked-by: Eric Wong <normalperson@yhbt.net>
Windows: redirect f[re]open("/dev/null") to f[re]open("nul")
On Windows, the equivalent of "/dev/null" is "nul". This implements
compatibility wrappers around fopen() and freopen() that check for this
particular file name.
The new tests exercise code paths where this is relevant.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git am does an automatic gc it doesn't clean up the rebase-apply
directory until after this has finished. This means that if the user
aborts the gc then future am or rebase operations will report that an
existing operation is in progress, which is undesirable and confusing.
Reported by Mark Brown <broonie@debian.org> through
http://bugs.debian.org/570966
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: mention conflict marker size argument (%L) for merge driver
23a64c9e (conflict-marker-size: new attribute, 2010-01-16) introduced the
new attribute and also pass the conflict marker size as %L to merge driver
commands. This documents the substitution.
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
move encode_in_pack_object_header() to a better place
Commit 1b22b6c897 made duplicated versions of encode_header() into a
common version called encode_in_pack_object_header(). There is however
a better location that sha1_file.c for such a function though, as
sha1_file.c contains nothing related to the creation of packs, and
it is quite populated already.
Also the comment that was moved to the header file should really remain
near the function as it covers implementation details and provides no
information about the actual function interface.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pagers that do not consume their input are dangerous: for example,
$ GIT_PAGER=: git log
$ echo $?
141
$
The only reason these tests were able to work before was that
'git log' would write to the pipe (and not fill it) before the
pager had time to terminate and close the pipe.
Fix it by using a program that consumes its input, namely wc (as
suggested by Johannes).
Reported-by: Johannes Sixt <j.sixt@viscovery.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
sha1_file: be paranoid when creating loose objects
We don't want the data being deflated and stored into loose objects
to be different from what we expect. While the deflated data is
protected by a CRC which is good enough for safe data retrieval
operations, we still want to be doubly sure that the source data used
at object creation time is still what we expected once that data has
been deflated and its CRC32 computed.
The most plausible data corruption may occur if the source file is
modified while Git is deflating and writing it out in a loose object.
Or Git itself could have a bug causing memory corruption. Or even bad
RAM could cause trouble. So it is best to make sure everything is
coherent and checksum protected from beginning to end.
To do so we compute the SHA1 of the data being deflated _after_ the
deflate operation has consumed that data, and make sure it matches
with the expected SHA1. This way we can rely on the CRC32 checked by
the inflate operation to provide a good indication that the data is still
coherent with its SHA1 hash. One pathological case we ignore is when
the data is modified before (or during) deflate call, but changed back
before it is hashed.
There is some overhead of course. Using 'git add' on a set of large files:
Before:
real 0m25.210s
user 0m23.783s
sys 0m1.408s
After:
real 0m26.537s
user 0m25.175s
sys 0m1.358s
The overhead is around 5% for full data coherency guarantee.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sp/push-sideband:
receive-pack: Send internal errors over side-band #2
t5401: Use a bare repository for the remote peer
receive-pack: Send hook output over side band #2
receive-pack: Wrap status reports inside side-band-64k
receive-pack: Refactor how capabilities are shown to the client
send-pack: demultiplex a sideband stream with status data
run-command: support custom fd-set in async
run-command: Allow stderr to be a caller supplied pipe
Using read() instead of mmap() can be 39% speed up for 1Kb files and is
1% speed up 1Mb files. For larger files, it is better to use mmap(),
because the difference between is not significant, and when there is not
enough memory, mmap() performs much better, because it avoids swapping.
Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
sha1_file: don't malloc the whole compressed result when writing out objects
There is no real advantage to malloc the whole output buffer and
deflate the data in a single pass when writing loose objects. That is
like only 1% faster while using more memory, especially with large
files where memory usage is far more. It is best to deflate and write
the data out in small chunks reusing the same memory instead.
For example, using 'git add' on a few large files averaging 40 MB ...
t7006-pager: if stdout is not a terminal, make a new one
Testing pagination requires (fake or real) access to a terminal so we
can see whether the pagination automatically kicks in, which makes it
hard to get good coverage when running tests without --verbose. There
are a number of ways to work around that:
- Replace all isatty calls with calls to a custom xisatty wrapper
that usually checks for a terminal but can be overridden for tests.
This would be workable, but it would require implementing xisatty
separately in three languages (C, shell, and perl) and making sure
that any code that is to be tested always uses the wrapper.
- Redirect stdout to /dev/tty. This would be problematic because
there might be no terminal available, and even if a terminal is
available, it might not be appropriate to spew output to it.
- Create a new pseudo-terminal on the fly and capture its output.
This patch implements the third approach.
The new test-terminal.perl helper uses IO::Pty from Expect.pm to create
a terminal and executes the program specified by its arguments with
that terminal as stdout. If the IO::Pty module is missing or not
working on a system, the test script will maintain its old behavior
(skipping most of its tests unless GIT_TEST_OPTS includes --verbose).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>