With the '0' timeout given to poll, it returns instantly without any
events on my system, causing git-daemon to consume all the CPU time. Use
-1 as the timeout so poll() only returns in case of EINTR or actually
events being available.
Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
Merge /pub/scm/git/git to recover lost side branch
Sorry for the mistake of rewinding something already pushed out.
This recovers the side branch lost by that mistake, specifically ea5a65a59916503d2a14369c46b1023384d51645 commit.
Signed-off-by: Junio C Hamano <junio@hera.kernel.org>
Be more careful tangling object chains while marking commits.
Also Johannes noticed we use parse_object to look up if we know that
object already -- we should just ask the in-core object registry with
lookup_object() for that.
The previous round to optimize fetch-pack has a small bug that
feeds SHA1^ ("parent commit") before making sure SHA1 is
actually a commit (or a tag that eventually dereferences to a
commit). Also it did not help culling the known-to-be-common
parents if the common one was a merge.
[PATCH] Do not send "want" lines for complete objects
It was all good and well to check if all remote refs are complete (local
refs or descendants thereof), but we can just as easily use the same
information to avoid sending "want" lines just for the complete objects in
the case that not all remote refs are complete (or their names differ).
Also, git-fetch-pack does not have to ask for descendants of remote refs
which are complete (for now, git-rev-list is told to ignore only the first
parent). That change also eliminates a code path where a popen()ed handle
was not pclose()ed.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
It turns out that not only did git-daemon do DWIM, but git-upload-pack
does as well. This is bad; security checks have to be performed *after*
canonicalization, not before.
Additionally, the current git-daemon can be trivially DoSed by spewing
SYNs at the target port.
This patch adds a --strict option to git-upload-pack to disable all
DWIM, a --timeout option to git-daemon and git-upload-pack, and an
--init-timeout option to git-daemon (which is typically set to a much
lower value, since the initial request should come immediately from the
client.)
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
On top of optimization by Linus not to ask refs that already match, we
can walk our refs and not issue "want" for things that are known to be
reachable from them.
Support for HTTP transfer timeouts based on transfer speed
Add configuration settings to abort HTTP requests if the transfer rate
drops below a threshold for a specified length of time. Environment
variables override config file settings.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
It turns out that not only did git-daemon do DWIM, but git-upload-pack
does as well. This is bad; security checks have to be performed *after*
canonicalization, not before.
Additionally, the current git-daemon can be trivially DoSed by spewing
SYNs at the target port.
This patch adds a --strict option to git-upload-pack to disable all
DWIM, a --timeout option to git-daemon and git-upload-pack, and an
--init-timeout option to git-daemon (which is typically set to a much
lower value, since the initial request should come immediately from the
client.)
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
Yes I said 0.99.8e was the last maintenance release for 0.99.8, but it
turns out that there was another backport necessary after git-daemon
was unleashed on kernel.org servers.
Contains the following since 0.99.8e:
H. Peter Anvin:
revised^2: git-daemon extra paranoia, and path DWIM
Johannes Schindelin:
Fix cvsimport warning when called without --no-cvs-direct
Junio C Hamano:
Do not ask for objects known to be complete.
Linus Torvalds:
git-fetch-pack: avoid unnecessary zero packing
Optimize common case of git-rev-list
On top of optimization by Linus not to ask refs that already match, we
can walk our refs and not issue "want" for things that are known to be
reachable from them.
I took a look at webgit, and it looks like at least for the "projects"
page, the most common operation ends up being basically
git-rev-list --header --parents --max-count=1 HEAD
Now, the thing is, the way "git-rev-list" works, it always keeps on
popping the parents and parsing them in order to build the list of
parents, and it turns out that even though we just want a single commit,
git-rev-list will invariably look up _three_ generations of commits.
It will parse:
- the commit we want (it obviously needs this)
- it's parent(s) as part of the "pop_most_recent_commit()" logic
- it will then pop one of the parents before it notices that it doesn't
need any more
- and as part of popping the parent, it will parse the grandparent (again
due to "pop_most_recent_commit()".
Now, I've strace'd it, and it really is pretty efficient on the whole, but
if things aren't nicely cached, and with long-latency IO, doing those two
extra objects (at a minimum - if the parent is a merge it will be more) is
just wasted time, and potentially a lot of it.
So here's a quick special-case for the trivial case of "just one commit,
and no date-limits or other special rules".
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
revised^2: git-daemon extra paranoia, and path DWIM
This patch adds some extra paranoia to the git-daemon filename test. In
particular, it now rejects pathnames containing //; it also adds a
redundant test for pathname absoluteness (belts and suspenders.)
A single / at the end of the path is still permitted, however, and the
.git and /.git append DWIM stuff is now handled in an integrated manner,
which means the resulting path will always be subjected to pathname checks.
[jc: backported to 0.99.8 maintenance branch]
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
If everything is up-to-date locally, we don't need to even ask for a
pack-file from the remote, or try to unpack it.
This is especially important for tags - since the pack-file common commit
logic is based purely on the commit history, it will never be able to find
a common tag, and will thus always end up re-fetching them.
Especially notably, if the tag points to a non-commit (eg a tagged tree),
the pack-file would be unnecessarily big, just because it cannot any most
recent common point between commits for pruning.
Short-circuiting the case where we already have that reference means that
we avoid a lot of these in the common case.
NOTE! This only matches remote ref names against the same local name,
which works well for tags, but is not as generic as it could be. If we
ever need to, we could match against _any_ local ref (if we have it, we
have it), but this "match against same name" is simpler and more
efficient, and covers the common case.
Renaming of refs is common for branch heads, but since those are always
commits, the pack-file generation can optimize that case.
In some cases we might still end up fetching pack-files unnecessarily, but
this at least avoids the re-fetching of tags over and over if you use a
regular
git fetch --tags ...
which was the main reason behind the change.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
On top of optimization by Linus not to ask refs that already match, we
can walk our refs and not issue "want" for things that are known to be
reachable from them.
I took a look at webgit, and it looks like at least for the "projects"
page, the most common operation ends up being basically
git-rev-list --header --parents --max-count=1 HEAD
Now, the thing is, the way "git-rev-list" works, it always keeps on
popping the parents and parsing them in order to build the list of
parents, and it turns out that even though we just want a single commit,
git-rev-list will invariably look up _three_ generations of commits.
It will parse:
- the commit we want (it obviously needs this)
- it's parent(s) as part of the "pop_most_recent_commit()" logic
- it will then pop one of the parents before it notices that it doesn't
need any more
- and as part of popping the parent, it will parse the grandparent (again
due to "pop_most_recent_commit()".
Now, I've strace'd it, and it really is pretty efficient on the whole, but
if things aren't nicely cached, and with long-latency IO, doing those two
extra objects (at a minimum - if the parent is a merge it will be more) is
just wasted time, and potentially a lot of it.
So here's a quick special-case for the trivial case of "just one commit,
and no date-limits or other special rules".
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
revised^2: git-daemon extra paranoia, and path DWIM
This patch adds some extra paranoia to the git-daemon filename test. In
particular, it now rejects pathnames containing //; it also adds a
redundant test for pathname absoluteness (belts and suspenders.)
A single / at the end of the path is still permitted, however, and the
.git and /.git append DWIM stuff is now handled in an integrated manner,
which means the resulting path will always be subjected to pathname checks.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
If everything is up-to-date locally, we don't need to even ask for a
pack-file from the remote, or try to unpack it.
This is especially important for tags - since the pack-file common commit
logic is based purely on the commit history, it will never be able to find
a common tag, and will thus always end up re-fetching them.
Especially notably, if the tag points to a non-commit (eg a tagged tree),
the pack-file would be unnecessarily big, just because it cannot any most
recent common point between commits for pruning.
Short-circuiting the case where we already have that reference means that
we avoid a lot of these in the common case.
NOTE! This only matches remote ref names against the same local name,
which works well for tags, but is not as generic as it could be. If we
ever need to, we could match against _any_ local ref (if we have it, we
have it), but this "match against same name" is simpler and more
efficient, and covers the common case.
Renaming of refs is common for branch heads, but since those are always
commits, the pack-file generation can optimize that case.
In some cases we might still end up fetching pack-files unnecessarily, but
this at least avoids the re-fetching of tags over and over if you use a
regular
git fetch --tags ...
which was the main reason behind the change.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
git-checkout: revert specific paths to either index or a given tree-ish.
When extra paths arguments are given, git-checkout reverts only those
paths to either the version recorded in the index or the version
recorded in the given tree-ish.
Teach git-add and git-commit to handle filenames starting with '-'.
Recent '--' fixes to "git diff" by Linus made it possible to specify
filenames that start with '-'. But in order to do that, you need to
be able to add and commit such file to begin with.
Teach git-add and git-commit to honor the same '--' convention.
This fixes the default built-in exec() of "diff" to add a "--" before the
filenames, so that if a filename starts with a "-", the diff program won't
think it's an option.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Linus Torvalds:
make checkout-index '-a' flag saner.
Junio C Hamano:
whatchanged: document -m option from git-diff-tree.
Functions to quote and unquote pathnames in C-style.
Update git-apply to use C-style quoting for funny pathnames.
Do not quote SP.
git-checkout-index: documentation updates.
Follow the "encode minimally" principle -- our tools, including
git-apply and git-status, can handle pathnames with embedded SP just
fine. The only problematic ones are TAB and LF, and we need to quote
the metacharacters introduced for quoting.
This makes it possible to add paths that have funny characters (TAB
and LF) in them, and makes adding many paths more efficient in
general.
New flag "--stdin" to update-index was initially added for different
purpose, but it turns out to be a perfect match for feeding "ls-files
--others -z" output to improve "git add".
It also adds "--verbose" flag to update-index for use with "git add"
command.
Functions to quote and unquote pathnames in C-style.
Following the list discussion, define two functions, quote_c_style and
unquote_c_style, to help adopting the proposed way for quoting funny
pathname letters for GNU patch. The rule is described in:
Currently we do not support the leading '!', but we probably should
barf upon seeing it. Rule B4. is interpreted to require always 3
octal digits in \XYZ notation.
Follow the "encode minimally" principle -- our tools, including
git-apply and git-status, can handle pathnames with embedded SP just
fine. The only problematic ones are TAB and LF, and we need to quote
the metacharacters introduced for quoting.
This will be removed when merging the second phase of Linus' "Create
object subdirectories on demand" change anyway, but the code to
recreate the empty .git/objects/??/ directory was confused.
Deb packaging claim we depend on patch, but I think we use git-apply
where it matters. When a patch does not apply with git-apply, using
GNU patch still is helpful sometimes. So demote it from "Depends" to
"Suggests".
Functions to quote and unquote pathnames in C-style.
Following the list discussion, define two functions, quote_c_style and
unquote_c_style, to help adopting the proposed way for quoting funny
pathname letters for GNU patch. The rule is described in:
Currently we do not support the leading '!', but we probably should
barf upon seeing it. Rule B4. is interpreted to require always 3
octal digits in \XYZ notation.
This patch cleans out all sparse warnings from http-fetch.c
I'm a bit uncomfortable with adding extra #ifdefs to avoid either
'mixing declaration with code' or 'unused variable' warnings, but I
figured that since those functions are already littered with #ifdefs I
might just get away with it. Comments?
[jc: I adjusted Peter's patch to address uncomfortableness issues.]
Signed-off-by: Peter Hagervall <hager@cs.umu.se> Signed-off-by: Junio C Hamano <junkio@cox.net>
whatchanged: document -m option from git-diff-tree.
The documentation for git-whatchanged is meant to describe only
the most frequently used options from git-diff-tree. Because "why
doesn't it show merges" was asked more than once, we'd better
describe '-m' option there.
Johannes Schindelin:
Teach git-status about spaces in file names also on MacOSX
t5400-send-pack relies on a working cpio
Jonas Fonseca:
git.sh: quote all paths
Junio C Hamano:
Also force LC_ALL in test scripts.
OpenBSD needs the strcasestr replacement.
git-check-ref-format: reject funny ref names.
Refuse to create funny refs in clone-pack, git-fetch and receive-pack.
Ignore funny refname sent from remote
Introduce notation "ref^{type}".
Martin Langhoff:
cvsimport: don't pass --cvs-direct if user options contradict us
Ralf Baechle:
rsh.c: typo fix
Note that "funny ref" bits are not strictly fixes but rather
backport from the "master" branch. They will prevent refs and
heads with funny names from being created. In addition, what is
in the master branch will start feeding the clients unwrapped
tag information to help Martin's findtags and possibly later
Cogito. These backported "funny ref" changes are to prevent
clients on the "maint" branch from getting confused when talking
with newer git-upload-pack and when reading from info/refs file
prepared with newer git-update-server-info.
Existing "tagname^0" notation means "dereference tag zero or more
times until you cannot dereference it anymore, and make sure it is a
commit -- otherwise barf". But tags do not necessarily reference
commit objects.
This commit introduces a bit more generalized notation, "ref^{type}".
Existing "ref^0" is a shorthand for "ref^{commit}". If the type
is empty, it just dereferences tags until it hits a non-tag object.
With this, "git-rev-parse --verify 'junio-gpg-pub^{}'" shows the blob
object name -- there is no need to manually read the tag object and
find out the object name anymore.
"git-rev-parse --verify 'HEAD^{tree}'" can be used to find out the
tree object name of the HEAD commit.
This allows the remote side (most notably, upload-pack) to show
additional information without affecting the downloader. Peek-remote
does not ignore them -- this is to make it useful for Pasky's
automatic tag following.
Refuse to create funny refs in clone-pack, git-fetch and receive-pack.
Using git-check-ref-format, make sure we do not create refs with
funny names when cloning from elsewhere (clone-pack), fast forwarding
local heads (git-fetch), or somebody pushes into us (receive-pack).
Update check_ref_format() function to reject ref names that:
* has a path component that begins with a ".", or
* has a double dots "..", or
* has ASCII control character, "~", "^", ":" or SP, anywhere, or
* ends with a "/".
Use it in 'git-checkout -b', 'git-branch', and 'git-tag' to make sure
that newly created refs are well-formed.
Show peeled onion from upload-pack and server-info.
This updates git-ls-remote to show SHA1 names of objects that are
referred by tags, in the "ref^{}" notation.
This would make git-findtags (without -t flag) almost trivial.
git-peek-remote . |
sed -ne "s:^$target "'refs/tags/\(.*\)^{}$:\1:p'
Also Pasky could do:
git-ls-remote --tags $remote |
sed -ne 's:\( refs/tags/.*\)^{}$:\1:p'
to find out what object each of the remote tags refers to, and
if he has one locally, run "git-fetch $remote tag $tagname" to
automatically catch up with the upstream tags.
Existing "tagname^0" notation means "dereference tag zero or more
times until you cannot dereference it anymore, and make sure it is a
commit -- otherwise barf". But tags do not necessarily reference
commit objects.
This commit introduces a bit more generalized notation, "ref^{type}".
Existing "ref^0" is a shorthand for "ref^{commit}". If the type
is empty, it just dereferences tags until it hits a non-tag object.
With this, "git-rev-parse --verify 'junio-gpg-pub^{}'" shows the blob
object name -- there is no need to manually read the tag object and
find out the object name anymore.
"git-rev-parse --verify 'HEAD^{tree}'" can be used to find out the
tree object name of the HEAD commit.
This allows the remote side (most notably, upload-pack) to show
additional information without affecting the downloader. Peek-remote
does not ignore them -- this is to make it useful for Pasky's
automatic tag following.
Refuse to create funny refs in clone-pack, git-fetch and receive-pack.
Using git-check-ref-format, make sure we do not create refs with
funny names when cloning from elsewhere (clone-pack), fast forwarding
local heads (git-fetch), or somebody pushes into us (receive-pack).
Update check_ref_format() function to reject ref names that:
* has a path component that begins with a ".", or
* has a double dots "..", or
* has ASCII control character, "~", "^", ":" or SP, anywhere, or
* ends with a "/".
Use it in 'git-checkout -b', 'git-branch', and 'git-tag' to make sure
that newly created refs are well-formed.
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > This patch looks bigger than it really is: The code to get the
> > default handle was refactored into a function, and is called
> > instead of curl_easy_duphandle() if that does not exist.
>
> I'd like to take Nick's config file patch first, which
> unfortunately interferes with your patch. I'd hate to ask you
> this, but could you rebase it on top of Nick's patch, [...]
No need to hate it. Here comes the rebased patch, and this time, I
actually tested it a bit.
Do our own ctype.h, just to get the sane semantics: we want
locale-independence, _and_ we want the right signed behaviour. Plus we
only use a very small subset of ctype.h anyway (isspace, isalpha,
isdigit and isalnum).
git-http-fetch: Remove size limit for objects/info/{packs,alternates}
git-http-fetch received objects/info/packs into a fixed-size buffer
and started to fail when this file became larger than the buffer.
Change it to grow the buffer dynamically, and do the same thing for
objects/info/alternates. Also add missing free() calls for these
buffers.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Junio C Hamano <junkio@cox.net>
This enhances set of revs you can give format-patch.
Originally, format-patch took either one rev, or two revs:
format-patch rev1
format-patch rev1 rev2
The first format was a short-hand for "format-patch rev1 HEAD"
(i.e. rev2==HEAD). What this meant was to find commits that are
in branch rev2 that has not been merged to branch rev1.
The above notation is still supported, but now it takes sequence
of "from1..to1 from2..to2 ...". In short, the second format has
become a short-hand for "format-patch rev1..rev2". Commits in
to1 but not in from1, to2 but not in from2, ... are formatted as
emailable patches.
With this, cherry-picking from other branch can be written as:
which is generally faster than traditional cherry-pick (which
always did 3-way merge) if patches apply cleanly, and still
falls back on 3-way merge if some of them do not.
This uses the new "--local" flag to git-pack-objects. It currently only
makes a difference together with "-a", since a normal incremental repack
won't pack any packed objects at all (whether local or remote).
Eventually, it might end up skipping any objects that aren't local to
the current object directory, but for now it only knows to skip packed
objects.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds the "--local" flag to git-pack-objects, which acts like
"--incremental", except that instead of ignoring all packed objects, it
only ignores objects that are packed and in an alternate object tree.
As a result, it effectively only does a local re-pack: any remote-packed
objects will stay in the alternate object directories.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Lacking reliable symlinks, the instructions in the tutorial did not work
in a cygwin setup. Also, a few outputs were not correct.
This patch fixes these, and adds a test case which follows the
instructions of the tutorial (except git-clone, -fetch and -push, which I
have not done yet).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
A short perl script that will walk the tag refs, tag objects, and even commit
objects in its quest to figure out whether the given SHA1 (for a commit or
tree) was ever tagged.
This version is reworked incorporating sanity, feature and style fixes from
Junio.
Object references are used in server-info.c:find_pack_info_one() to
find out which objects in the pack are heads, therefore tracking of
references cannot be disabled.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Junio C Hamano <junkio@cox.net>
clone-pack: new option --keep tells it not to explode the pack.
With new option --keep, or a configuration item clone.keeppack (we
need a better name, or start allowing dash,"clone.keep-pack"), the packed
data downloaded while cloning is saved as a pack in .git/objects/pack/
locally, with index generated for it with git-index-pack.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Junio C Hamano <junkio@cox.net>