Merge branch 'jk/make-fix-dependencies' into maint
Build clean-up.
* jk/make-fix-dependencies:
Makefile: silence perl/PM.stamp recipe
Makefile: avoid timestamp updates to GIT-BUILD-OPTIONS
Makefile: drop dependency between git-instaweb and gitweb
There was a dead code that used to handle "git pull --tags" and
show special-cased error message, which was made irrelevant when
the semantics of the option changed back in Git 1.9 days.
* pt/pull-tags-error-diag:
pull: remove --tags error in no merge candidates case
Merge branch 'jk/diagnose-config-mmap-failure' into maint
The configuration reader/writer uses mmap(2) interface to access
the files; when we find a directory, it barfed with "Out of memory?".
* jk/diagnose-config-mmap-failure:
xmmap(): drop "Out of memory?"
config.c: rewrite ENODEV into EISDIR when mmap fails
config.c: avoid xmmap error messages
config.c: fix mmap leak when writing config
read-cache.c: drop PROT_WRITE from mmap of index
Merge branch 'jk/squelch-missing-link-warning-for-unreachable' into maint
Recent "git prune" traverses young unreachable objects to safekeep
old objects in the reachability chain from them, which sometimes
caused error messages that are unnecessarily alarming.
* jk/squelch-missing-link-warning-for-unreachable:
suppress errors on missing UNINTERESTING links
silence broken link warnings with revs->ignore_missing_links
add quieter versions of parse_{tree,commit}
Merge branch 'mm/rebase-i-post-rewrite-exec' into maint
"git rebase -i" fired post-rewrite hook when it shouldn't (namely,
when it was told to stop sequencing with 'exec' insn).
* mm/rebase-i-post-rewrite-exec:
t5407: use <<- to align the expected output
rebase -i: fix post-rewrite hook with failed exec command
rebase -i: demonstrate incorrect behavior of post-rewrite
The improved ARRAY_SIZE macro uses BARF_UNLESS_AN_ARRAY which expands
to a valid check for recent gcc versions and to 0 for older gcc
versions but is not defined on non-gcc builds.
Non-gcc builds need this macro to expand to 0 as well. The current outer
test (defined(__GNUC__) && (__GNUC__ >= 3)) is a strictly weaker
condition than the inner test (GIT_GNUC_PREREQ(3, 1)) so we can omit the
outer test and cause the BARF_UNLESS_AN_ARRAY macro to be defined
correctly on non-gcc builds as well as gcc builds with older versions.
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'jk/http-backend-deadlock' into maint
Communication between the HTTP server and http_backend process can
lead to a dead-lock when relaying a large ref negotiation request.
Diagnose the situation better, and mitigate it by reading such a
request first into core (to a reasonable limit).
* jk/http-backend-deadlock:
http-backend: spool ref negotiation requests to buffer
t5551: factor out tag creation
http-backend: fix die recursion with custom handler
Merge branch 'jh/filter-empty-contents' into maint
The clean/smudge interface did not work well when filtering an
empty contents (failed and then passed the empty input through).
It can be argued that a filter that produces anything but empty for
an empty input is nonsense, but if the user wants to do strange
things, then why not?
* jh/filter-empty-contents:
sha1_file: pass empty buffer to index empty file
That commit was an attempt to improve the safety of applying
a stash, because the application process may create
conflicted index entries, after which it is hard to restore
the original index state.
Unfortunately, this hurts some common workflows around "git
stash -k", like:
git add -p ;# (1) stage set of proposed changes
git stash -k ;# (2) get rid of everything else
make test ;# (3) make sure proposal is reasonable
git stash apply ;# (4) restore original working tree
If you "git commit" between steps (3) and (4), then this
just works. However, if these steps are part of a pre-commit
hook, you don't have that opportunity (you have to restore
the original state regardless of whether the tests passed or
failed).
It's possible that we could provide better tools for this
sort of workflow. In particular, even before ed178ef, it
could fail with a conflict if there were conflicting hunks
in the working tree and index (since the "stash -k" puts the
index version into the working tree, and we then attempt to
apply the differences between HEAD and the old working tree
on top of that). But the fact remains that people have been
using it happily for a while, and the safety provided by ed178ef is simply not that great. Let's revert it for now.
In the long run, people can work on improving stash for this
sort of workflow, but the safety tradeoff is not worth it in
the meantime.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
hooks/pre-auto-gc: adjust power checking for newer OS X
The output of "pmset -g batt" changed at some point from "Currently
drawing from 'AC Power'" to the slightly different "Now drawing from
'AC Power'". Starting the match from "drawing" makes the check work
in both old and new versions of OS X.
Signed-off-by: Panagiotis Astithas <pastith@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
l10n: de.po: change error message from "sagen" to "Meinten Sie"
We should not use "sagen" if someone has written something wrong.
Although it's "say" in English, we should not use it in German
and instead use our normal error message.
tcsh users who happen to have 'set noclobber' elsewhere in their
~/.tcshrc or ~/.cshrc startup files get a 'File exist' error, and
the tcsh completion file doesn't get generated/updated.
Adding a `!` in the redirect works correctly for both clobber (default)
and 'set noclobber' users.
Reviewed-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Ariel Faigon <github.2009@yendor.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fsck: report errors if reflog entries point at invalid objects
Previously, if a reflog entry's old or new SHA-1 was not resolvable to
an object, that SHA-1 was silently ignored. Instead, report such cases
as errors.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
New function, extracted from fsck_handle_reflog_ent(). The extra
is_null_sha1() test for the new reference is currently unnecessary, as
reflogs are deleted when the reference itself is deleted. But it
doesn't hurt, either.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'rs/plug-leak-in-pack-bitmaps' into maint
The code to read pack-bitmap wanted to allocate a few hundred
pointers to a structure, but by mistake allocated and leaked memory
enough to hold that many actual structures. Correct the allocation
size and also have it on stack, as it is small enough.
Various documentation mark-up fixes to make the output more
consistent in general and also make AsciiDoctor (an alternative
formatter) happier.
* jk/asciidoc-markup-fix:
doc: convert AsciiDoc {?foo} to ifdef::foo[]
doc: put example URLs and emails inside literal backticks
doc: drop backslash quoting of some curly braces
doc: convert \--option to --option
doc/add: reformat `--edit` option
doc: fix length of underlined section-title
doc: fix hanging "+"-continuation
doc: fix unquoted use of "{type}"
doc: fix misrendering due to `single quote'
Merge branch 'mh/write-refs-sooner-2.4' into maint
Multi-ref transaction support we merged a few releases ago
unnecessarily kept many file descriptors open, risking to fail with
resource exhaustion. This is for 2.4.x track.
* mh/write-refs-sooner-2.4:
ref_transaction_commit(): fix atomicity and avoid fd exhaustion
ref_transaction_commit(): remove the local flags variable
ref_transaction_commit(): inline call to write_ref_sha1()
rename_ref(): inline calls to write_ref_sha1() from this function
commit_ref_update(): new function, extracted from write_ref_sha1()
write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
t7004: rename ULIMIT test prerequisite to ULIMIT_STACK_SIZE
update-ref: test handling large transactions properly
ref_transaction_commit(): fix atomicity and avoid fd exhaustion
ref_transaction_commit(): remove the local flags variable
ref_transaction_commit(): inline call to write_ref_sha1()
rename_ref(): inline calls to write_ref_sha1() from this function
commit_ref_update(): new function, extracted from write_ref_sha1()
write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
t7004: rename ULIMIT test prerequisite to ULIMIT_STACK_SIZE
update-ref: test handling large transactions properly
The ref API did not handle cases where 'refs/heads/xyzzy/frotz' is
removed at the same time as 'refs/heads/xyzzy' is added (or vice
versa) very well.
* mh/ref-directory-file:
reflog_expire(): integrate lock_ref_sha1_basic() errors into ours
ref_transaction_commit(): delete extra "the" from error message
ref_transaction_commit(): provide better error messages
rename_ref(): integrate lock_ref_sha1_basic() errors into ours
lock_ref_sha1_basic(): improve diagnostics for ref D/F conflicts
lock_ref_sha1_basic(): report errors via a "struct strbuf *err"
verify_refname_available(): report errors via a "struct strbuf *err"
verify_refname_available(): rename function
refs: check for D/F conflicts among refs created in a transaction
ref_transaction_commit(): use a string_list for detecting duplicates
is_refname_available(): use dirname in first loop
struct nonmatching_ref_data: store a refname instead of a ref_entry
report_refname_conflict(): inline function
entry_matches(): inline function
is_refname_available(): convert local variable "dirname" to strbuf
is_refname_available(): avoid shadowing "dir" variable
is_refname_available(): revamp the comments
t1404: new tests of ref D/F conflicts within transactions
The "log --decorate" enhancement in Git 2.4 that shows the commit
at the tip of the current branch e.g. "HEAD -> master", did not
work with --decorate=full.
* mg/log-decorate-HEAD:
log: do not shorten decoration names too early
log: decorate HEAD with branch name under --decorate=full, too
There was a commented-out (instead of being marked to expect
failure) test that documented a breakage that was fixed since the
test was written; turn it into a proper test.
* sb/t1020-cleanup:
subdirectory tests: code cleanup, uncomment test
core.excludesfile (defaulting to $XDG_HOME/git/ignore) is supposed
to be overridden by repository-specific .git/info/exclude file, but
the order was swapped from the beginning. This belatedly fixes it.
* jc/gitignore-precedence:
ignore: info/exclude should trump core.excludesfile
The connection initiation code for "ssh" transport tried to absorb
differences between the stock "ssh" and Putty-supplied "plink" and
its derivatives, but the logic to tell that we are using "plink"
variants were too loose and falsely triggered when "plink" appeared
anywhere in the path (e.g. "/home/me/bin/uplink/ssh").
* bc/connect-plink:
connect: improve check for plink to reduce false positives
t5601: fix quotation error leading to skipped tests
connect: simplify SSH connection code path
Merge branch 'tb/blame-resurrect-convert-to-git' into maint
Some time ago, "git blame" (incorrectly) lost the convert_to_git()
call when synthesizing a fake "tip" commit that represents the
state in the working tree, which broke folks who record the history
with LF line ending to make their project portabile across
platforms while terminating lines in their working tree files with
CRLF for their platform.
* tb/blame-resurrect-convert-to-git:
blame: CRLF in the working tree and LF in the repo
* pt/xdg-config-path:
path.c: remove home_config_paths()
git-config: replace use of home_config_paths()
git-commit: replace use of home_config_paths()
credential-store.c: replace home_config_paths() with xdg_config_home()
dir.c: replace home_config_paths() with xdg_config_home()
attr.c: replace home_config_paths() with xdg_config_home()
path.c: implement xdg_config_home()
t0302: "unreadable" test needs POSIXPERM
t0302: test credential-store support for XDG_CONFIG_HOME
git-credential-store: support XDG_CONFIG_HOME
git-credential-store: support multiple credential files
format-patch: do not feed tags to clear_commit_marks()
"git format-patch --ignore-if-in-upstream A..B", when either A or B
is a tag, failed miserably.
This is because the code passes the tips it used for traversal to
clear_commit_marks(), after running a temporary revision traversal
to enumerate the commits on both branches to find if they have
commits that make equivalent changes. The revision traversal
machinery knows how to enumerate commits reachable starting from a
tag, but clear_commit_marks() wants to take nothing but a commit.
In the longer term, it might be a more correct fix to teach
clear_commit_marks() to do the same "committish to commit"
dereferencing that is done in the revision traversal machinery,
but for now this fix should suffice.
Reported-by: Bruce Korb <bruce.korb@gmail.com> Helped-by: Christian Couder <christian.couder@gmail.com> Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are traversing commit parents along the
UNINTERESTING side of a revision walk, we do not care if
the parent turns out to be missing. That lets us limit
traversals using unreachable and possibly incomplete
sections of history. However, we do still print error
messages about the missing commits; this patch suppresses
the error, as well.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
silence broken link warnings with revs->ignore_missing_links
We set revs->ignore_missing_links to instruct the
revision-walking machinery that we know the history graph
may be incomplete. For example, we use it when walking
unreachable but recent objects; we want to add what we can,
but it's OK if the history is incomplete.
However, we still print error messages for the missing
objects, which can be confusing. This is not an error, but
just a normal situation when transitioning from a repository
last pruned by an older git (which can leave broken segments
of history) to a more recent one (where we try to preserve
whole reachable segments).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we call parse_commit, it will complain to stderr if the
object does not exist or cannot be read. This means that we
may produce useless error messages if this situation is
expected (e.g., because the object is marked UNINTERESTING,
or because revs->ignore_missing_links is set).
We can fix this by adding a new "parse_X_gently" form that
takes a flag to suppress the messages. The existing
"parse_X" form is already gentle in the sense that it
returns an error rather than dying, and we could in theory
just add a "quiet" flag to it (with existing callers passing
"0"). But doing it this way means we do not have to disturb
existing callers.
Note also that the new flag is "quiet_on_missing", and not
just "quiet". We could add a flag to suppress _all_ errors,
but besides being a more invasive change (we would have to
pass the flag down to sub-functions, too), there is a good
reason not to: we would never want to use it. Missing a
linked object is expected in some circumstances, but it is
never expected to have a malformed commit, or to get a tree
when we wanted a commit. We should always complain about
these corruptions.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Noticed-by: Philip Oakley <philipoakley@iee.org> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If both core.bare and core.worktree are set, we complain
about the bogus config and die. Dying is good, because it
avoids commands running and doing damage in a potentially
incorrect setup. But dying _there_ is bad, because it means
that commands which do not even care about the work tree
cannot run. This can make repairing the situation harder:
[OK, expected.]
$ git status
fatal: core.bare and core.worktree do not make sense
[Hrm...]
$ git config --unset core.worktree
fatal: core.bare and core.worktree do not make sense
[Nope...]
$ git config --edit
fatal: core.bare and core.worktree do not make sense
[Gaaah.]
$ git help config
fatal: core.bare and core.worktree do not make sense
Instead, let's issue a warning about the bogus config when
we notice it (i.e., for all commands), but only die when the
command tries to use the work tree (by calling setup_work_tree).
So we now get:
$ git status
warning: core.bare and core.worktree do not make sense
fatal: unable to set up work tree using invalid config
$ git config --unset core.worktree
warning: core.bare and core.worktree do not make sense
We have to update t1510 to accomodate this; it uses
symbolic-ref to check whether the configuration works or
not, but of course that command does not use the working
tree. Instead, we switch it to use `git status`, as it
requires a work-tree, does not need any special setup, and
is read-only (so a failure will not adversely affect further
tests).
In addition, we add a new test that checks the desired
behavior (i.e., that running "git config" with the bogus
config does in fact work).
Reported-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Every time we run "make", we update perl/PM.stamp, which
contains a list of all of the perl module files (if it's
updated, we need to rebuild perl/perl.mak, since the
Makefile will not otherwise know about the new files).
This means that every time "make" is run, we see:
GEN perl/PM.stamp
in the output, even though it is not likely to have changed.
Let's make this recipe completely silent, as we do for other
auto-generated dependency files (e.g., GIT-CFLAGS).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: avoid timestamp updates to GIT-BUILD-OPTIONS
We force the GIT-BUILD-OPTIONS recipe to run every time
"make" is invoked. We must do this to catch new options
which may have come from the command-line or environment.
However, we actually update the file's timestamp each time
the recipe is run, whether anything changed or not. As a
result, any files which depend on it (for example, all of
the perl scripts, which need to know whether NO_PERL was
set) will be re-built every time.
Let's do our usual trick of writing to a tempfile, then
doing a "cmp || mv" to update the file only when something
changed.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: drop dependency between git-instaweb and gitweb
The rule for "git-instaweb" depends on "gitweb". This makes
no sense, because:
1. git-instaweb has no build-time dependency on gitweb; it
is a run-time dependency
2. gitweb is a directory that we want to recursively make
in. As a result, its recipe is marked .PHONY, which
causes "make" to rebuild git-instaweb every time it is
run.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's better to start the man page with a description of what
submodules actually are, instead of saying what they are not.
Reorder the paragraphs such that
- the first short paragraph introduces the submodule concept,
- the second paragraph highlights the usage of the submodule command,
- the third paragraph giving background information, and finally
- the fourth paragraph discusing alternatives such as subtrees and
remotes, which we don't want to be confused with.
This ordering deepens the knowledge on submodules with each paragraph.
First the basic questions like "How/what" will be answered, while the
underlying concepts will be taught at a later time.
Making sure it is not confused with subtrees and remotes is not really
enhancing knowledge of submodules itself, but rather painting the big
picture of git concepts, so you could also argue to have it as the second
paragraph. Personally I think this may confuse readers, specially
newcomers though.
Additionally to reordering the paragraphs, they have been slightly
reworded.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: include 'merge.branchdesc' for merge and config as well
'merge.branchdesc' is only mentioned in the docs of 'git fmt-merge-msg'.
The description of 'merge.log' is already duplicated between
'merge-config.txt' and 'git-fmt-merge-msg.txt'; instead of duplicating the
description of another config variable, extract the descriptions of both
of these variables from 'git-fmt-merge-msg.txt' into a separate file and
include it there and in 'merge-config.txt'.
Leave 'merge.summary' only in git-fmt-merge-msg.txt, as it is marked
as deprecated.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We show that message with die_errno(), but the OS is ought to know
why mmap(2) failed much better than we do. There is no reason for
us to say "Out of memory?" here.
config.c: rewrite ENODEV into EISDIR when mmap fails
If we try to mmap a directory, we'll get ENODEV. This
translates to "no such device" for the user, which is not
very helpful. Since we've just fstat()'d the file, we can
easily check whether the problem was a directory to give a
better message.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The config-writing code uses xmmap to map the existing
config file, which will die if the map fails. This has two
downsides:
1. The error message is not very helpful, as it lacks any
context about the file we are mapping:
$ mkdir foo
$ git config --file=foo some.key value
fatal: Out of memory? mmap failed: No such device
2. We normally do not die in this code path; instead, we'd
rather report the error and return an appropriate exit
status (which is part of the public interface
documented in git-config.1).
This patch introduces a "gentle" form of xmmap which lets us
produce our own error message. We do not want to use mmap
directly, because we would like to use the other
compatibility elements of xmmap (e.g., handling 0-length
maps portably).
The end result is:
$ git.compile config --file=foo some.key value
error: unable to mmap 'foo': No such device
$ echo $?
3
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We mmap the existing config file, but fail to unmap it if we
hit an error. The function already has a shared exit path,
so we can fix this by moving the mmap pointer to the
function scope and clearing it in the shared exit.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time, git's in-memory representation of a cache
entry actually pointed to the mmap'd on-disk data. So in 520fc24 (Allow writing to the private index file mapping.,
2005-04-26), we specified PROT_WRITE so that we could tweak
the entries while we run (in our own MAP_PRIVATE copy-on-write
version, of course).
Later, 7a51ed6 (Make on-disk index representation separate
from in-core one, 2008-01-14) stopped doing this; we copy
the data into our in-core representation, and then drop the
mmap immediately. We can therefore drop the PROT_WRITE flag.
It's probably not hurting anything as it is, but it's
potentially confusing.
Note that we could also mark the mapping as "const" to
verify that we never write to it. However, we don't
typically do that for our other maps, as it then requires
casting to munmap() it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT
The latter is a much more descriptive name (and we support
"color.diff.context" now). This also updates the name of any
local variables which were used to store the color.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rev-list --objects $old --not --all" to see if everything that
is reachable from $old is already connected to the existing refs
was very inefficient.
* jk/still-interesting:
limit_list: avoid quadratic behavior from still_interesting
"hash-object --literally" introduced in v2.2 was not prepared to
take a really long object type name.
* jc/hash-object:
write_sha1_file(): do not use a separate sha1[] array
t1007: add hash-object --literally tests
hash-object --literally: fix buffer overrun with extra-long object type
git-hash-object.txt: document --literally option
Merge branch 'jk/filter-branch-use-of-sed-on-incomplete-line' into maint
"filter-branch" corrupted commit log message that ends with an
incomplete line on platforms with some "sed" implementations that
munge such a line. Work it around by avoiding to use "sed".
* jk/filter-branch-use-of-sed-on-incomplete-line:
filter-branch: avoid passing commit message through sed
Merge branch 'jk/stash-require-clean-index' into maint
"git stash pop/apply" forgot to make sure that not just the working
tree is clean but also the index is clean. The latter is important
as a stash application can conflict and the index will be used for
conflict resolution.
* jk/stash-require-clean-index:
stash: require a clean index to apply
t3903: avoid applying onto dirty index
t3903: stop hard-coding commit sha1s
Merge branch 'jk/git-no-more-argv0-path-munging' into maint
We have prepended $GIT_EXEC_PATH and the path "git" is installed in
(typically "/usr/bin") to $PATH when invoking subprograms and hooks
for almost eternity, but the original use case the latter tried to
support was semi-bogus (i.e. install git to /opt/foo/git and run it
without having /opt/foo on $PATH), and more importantly it has
become less and less relevant as Git grew more mainstream (i.e. the
users would _want_ to have it on their $PATH). Stop prepending the
path in which "git" is installed to users' $PATH, as that would
interfere the command search order people depend on (e.g. they may
not like versions of programs that are unrelated to Git in /usr/bin
and want to override them by having different ones in /usr/local/bin
and have the latter directory earlier in their $PATH).
* jk/git-no-more-argv0-path-munging:
stop putting argv[0] dirname at front of PATH
Merge branch 'jk/http-backend-deadlock-2.3' into jk/http-backend-deadlock
* jk/http-backend-deadlock-2.3:
http-backend: spool ref negotiation requests to buffer
t5551: factor out tag creation
http-backend: fix die recursion with custom handler
Merge branch 'jk/http-backend-deadlock-2.2' into jk/http-backend-deadlock-2.3
* jk/http-backend-deadlock-2.2:
http-backend: spool ref negotiation requests to buffer
t5551: factor out tag creation
http-backend: fix die recursion with custom handler
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>