Different versions of Mercurial have different arguments for
bookmarks.updatefromremote(), while it should be possible to call the
right function with the right arguments depending on the version, it's
safer to restore the old behavior for now.
Reported by Rodney Lorrimar.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf_branchname(): do not double-expand @{-1}~22
If you were on 'frotz' branch before you checked out your current
branch, "git merge @{-1}~22" means the same as "git merge frotz~22".
The strbuf_branchname() function, when interpret_branch_name() gives
up resolving "@{-1}~22" fully, returns "frotz" and tells the caller
that it only resolved "@{-1}" part of the input, mistakes this as a
total failure, and appends the whole thing to the result, yielding
"frotz@{-1}~22", which does not make any sense.
Inspect the return value from interpret_branch_name() a bit more
carefully. When it errored out without consuming anything, we will
get -1 and we should return the whole thing. Otherwise, we should
append the remainder (i.e. "~22" in the earlier example) to the
partially resolved name (i.e. "frotz").
The test suite adds enough number of checkout to make @{-12} in the
last test in t0100 that tried to check "we haven't flipped branches
that many times" error case succeed; raise the number to a hundred.
git-submodule.txt: Clarify 'init' and 'add' subcommands.
Describe how 'add' sets the submodule's logical name, which is used in
the configuration entry names.
Clarify that 'init' only sets up the configuration entries for
submodules that have already been added elsewhere. Describe that
<path> arguments limit the submodules that are configured.
Signed-off-by: Dale Worley <worley@ariadne.com> Acked-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
revision.c: treat A...B merge bases as if manually specified
The documentation assures users that "A...B" is defined as "A B --not
$(git merge-base --all A B)". This wasn't in fact quite true, because
the calculated merge bases were not sent to add_rev_cmdline().
The main effect of this was that although
git rev-list --ancestry-path A B --not $(git merge-base --all A B)
worked, the simpler form
git rev-list --ancestry-path A...B
failed with a "no bottom commits" error.
Other potential users of bottom commits could also be affected by this
problem, if they examine revs->cmdline_info; I came across the issue in
my proposed history traversal refinements series.
So ensure that the calculated merge bases are sent to add_rev_cmdline(),
flagged with new 'whence' enum value REV_CMD_MERGE_BASE.
Signed-off-by: Kevin Bracey <kevin@bracey.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 95b0c60 (remote-bzr: add support for bzr repos) introduced a
regression by assuming all bzr remote repos are listable, but they are
not.
If they are not listable they are basically useless, so let's assume
there is no bzr repo.
Reported-by: Thorsten Kranzkowski <dl8bcu@dl8bcu.de> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In certain situations we might end up pushing garbage revisions
(e.g. in a rebase), and the patches to deal with that haven't been
merged yet. So let's disable forced pushes by default.
We are essentially reverting back to the old v1.8.2 behavior, to
minimize the possibility of regressions, but in a way the user can
configure.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
combine-diff.c: Fix output when changes are exactly 3 lines apart
When a deletion is followed by exactly 3 (or whatever the number of
context lines) unchanged lines, followed by another change, the combined
diff output would hide the first deletion, resulting in a malformed
diff.
This happened because the 3 lines before each change are painted
interesting, but also marked as no_pre_delete to prevent showing deletes
that were previously marked as uninteresting. This behaviour was
introduced in c86fbe53 (diff -c/--cc: do not include uninteresting
deletion before leading context). However, as a side effect, this could
also mark deletes that were already interesting as no_pre_delete. This
would happen only if the delete was exactly 3 lines away from the next
change, since lines farther away would not be touched by the "paint
three lines before the change" code and lines closer would be painted
by the "merge two adjacent hunks" code instead, which does not set the
no_pre_delete flag.
This commit fixes this problem by only setting the no_pre_delete flag
for changes that were previously uninteresting.
Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a clone exists with the old organization (v1.8.2) it will prevent
the new shared bzr repository organization from working, so let's
remove this repository, which is not used any more.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
coverage: set DEFAULT_TEST_TARGET to avoid using prove
If the user sets DEFAULT_TEST_TARGET=prove in his config.mak, that
carries over into the coverage tests. Which is really bad if he also
sets GIT_PROVE_OPTS=-j<..> as that completely breaks the coverage
runs.
Instead of attempting to mess with the GIT_PROVE_OPTS, just force the
test target to 'test' so that we run under make, like we intended all
along.
Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
coverage: do not delete .gcno files before building
The coverage-compile target depends on coverage-clean, which is
supposed to remove the earlier build products that would get in the
way of the next coverage test run.
However, removing *.gcno is actively wrong. These are the files that
contain the compile-time coverage related data. They are only rebuilt
if the source is compiled. So if one ran 'make coverage' two times in
a row, the second run would remove *.gcno, but then fail to recreate
them because neither source files nor build flags have changed. (This
remained hidden for so long most likely because any other intervening
use of 'make' will change the build flags, causing a full rebuild.)
So we make an exception for *.gcno. The *.gcda are the coverage
results, written when the gcov-instrumented program is run. We still
remove those, so as to get a one-test-run view of the data; you could
probably argue the other way too.
Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git://ozlabs.org/~paulus/gitk:
gitk: On OSX, bring the gitk window to front
gitk: Add support for -G'regex' pickaxe variant
gitk: Add menu item for reverting commits
gitk: Simplify file filtering
gitk: Display the date of a tag in a human-friendly way
gitk: Improve behaviour of drop-down lists
gitk: Move hard-coded colors to .gitk
On OSX, Tcl/Tk application windows are created behind all
the applications down the stack of windows. This is very
annoying, because once a gitk window appears, it's the
downmost window and switching to it is pain.
The patch is: if we are on OSX, use osascript to
bring the current Wish process window to front.
Signed-off-by: Tair Sabirgaliev <tair.sabirgaliev@gmail.com>
Thanks-to: Stefan Haller <lists@haller-berlin.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
The set-up step to prepare a repository with 50000 tags used a
non-porable '\+' to match one-or-more.
The error was not caught because the next test that uses that
repository did not even bother to check if these expected tags were
actually cloned to the resulting repository.
Fix the sed construct to use BRE and update the "clone" test that
wanted to test cloning from such a repository with many refs to
check the resulting repository.
When we run a regular "git fetch" without arguments, we
update the tracking refs according to the configured
refspec. However, when we run "git fetch origin master" (or
"git pull origin master"), we do not look at the configured
refspecs at all, and just update FETCH_HEAD.
We miss an opportunity to update "refs/remotes/origin/master"
(or whatever the user has configured). Some users find this
confusing, because they would want to do further comparisons
against the old state of the remote master, like:
In the currnet code, they are comparing against whatever
commit happened to be in origin/master from the last time
they did a complete "git fetch". This patch will update a
ref from the RHS of a configured refspec whenever we happen
to be fetching its LHS. That makes the case above work.
The downside is that any users who really care about whether
and when their tracking branches are updated may be
surprised.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each "struct ref" has a boolean flag that is set by the
fetch code to determine whether the ref should be marked as
"not-for-merge" or not when we write it out to FETCH_HEAD.
It would be useful to turn this boolean into a tri-state,
with the third state meaning "do not bother writing it out
to FETCH_HEAD at all". That would let us add extra refs to
the set of refs to be stored (e.g., to store copies of
things we fetched) without impacting FETCH_HEAD.
This patch turns it into an enum that covers the tri-state
case, and hopefully makes the code more explicit and easier
to read.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation erroneously used the same wording for both fetch and
pull, stating that something will be merged even in git-fetch(1).
In addition, saying that "<ref> is equivalent to <ref>:" doesn't
really help anyone who still needs to read manpages. Clarify what is
actually going on.
Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t5510: start tracking-ref tests from a known state
We have three sequential tests for for whether tracking refs
are updated by various fetches and pulls; the first two
should not update the ref, and the third should. Each test
depends on the state left by the test before.
This is fragile (a failing early test will confuse later
tests), and means we cannot add more "should update" tests
after the third one.
Let's instead save the initial state before these tests, and
then reset to a known state before running each test.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitk: Display the date of a tag in a human-friendly way
By selecting a tag within gitk you can display information about it.
This information is output by using the command
'git cat-file tag <tagid>'
This outputs the *raw* information from the tag, amongst which is the
time - in seconds since the epoch. As useful as that value is, I find it
a lot easier to read and process time which it is something like:
"Mon Dec 31 14:26:11 2012 -0800"
This change will modify the display of tags in gitk like so:
@@ -1,7 +1,7 @@
object 5d417842efeafb6e109db7574196901c4e95d273
type commit
tag v1.8.1
-tagger Junio C Hamano <gitster@pobox.com> 1356992771 -0800
+tagger Junio C Hamano <gitster@pobox.com> Mon Dec 31 14:26:11 2012 -0800
Git 1.8.1
-----BEGIN PGP SIGNATURE-----
Signed-off-by: Anand Kumria <wildfire@progsoc.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
The drop-down lists used for things like the criteria for finding
commits (containing/touching paths/etc.) use a combobox if we are
using the ttk widgets. By default the combobox exports its value
as the selection when it is changed, which is unnecessary, and sometimes
the combobox wouldn't release the selection, which is annoying.
To fix this, we make these comboboxes not export their selection,
and also clear their selection whenever they are changed. This makes
them more like a simple selection of alternatives, improving the look
and feel of gitk.
Commit 664059f (transport-helper: update remote helper namespace)
updates the namespace when the push succeeds or not; we should do it
only when it succeeded.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
CodingGuidelines: Documentation/*.txt are the sources
People not familiar with AsciiDoc may not realize they are
supposed to update *.txt files and not *.html/*.1 files when
preparing patches to the project.
Signed-off-by: Dale Worley <worley@ariadne.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Git 1.8.2.3
t5004: avoid using tar for checking emptiness of archive
t5004: ignore pax global header file
mergetools/kdiff3: do not use --auto when diffing
transport-helper: trivial style cleanup
cherry-pick: picking a tag that resolves to a commit is OK
Earlier, 21246dbb9e0a (cherry-pick: make sure all input objects are
commits, 2013-04-11) tried to catch an unlikely "git cherry-pick $blob"
as an error, but broke a more important use case to cherry-pick a
tag that points at a commit.
t5004: avoid using tar for checking emptiness of archive
Test 2 of t5004 checks if a supposedly empty tar archive really
contains no files. 24676f02 (t5004: fix issue with empty archive test
and bsdtar) removed our commit hash to make it work with bsdtar, but
the test still fails on NetBSD and OpenBSD, which use their own tar
that considers a tar file containing only NULs as broken.
Here's what the different archivers do when asked to create a tar
file without entries:
So NetBSD's native tar creates an empty file, while GNU tar and bsdtar
both give us 10KB of NULs -- just like git archive with an empty tree.
Now let's see how the archivers handle these two kinds of empty tar
files:
$ tar tf zero.tar; echo $?
tar: Unexpected EOF on archive file
1
$ gtar tf zero.tar; echo $?
gtar: This does not look like a tar archive
gtar: Exiting with failure status due to previous errors
2
$ bsdtar tf zero.tar; echo $?
0
$ tar tf tenk.tar; echo $?
tar: Cannot identify format. Searching...
tar: End of archive volume 1 reached
tar: Sorry, unable to determine archive format.
1
$ gtar tf tenk.tar; echo $?
0
$ bsdtar tf tenk.tar; echo $?
0
NetBSD's tar complains about both, bsdtar happily accepts any of them
and GNU tar doesn't like zero-length archive files. So the safest
course of action is to stay with our block-of-NULs format which is
compatible with GNU tar and bsdtar, as we can't make NetBSD's native
tar happy anyway.
We can simplify our test, however, by taking tar out of the picture.
Instead of extracting the archive and checking for the non-presence of
files, check if the file has a size of 10KB and contains only NULs.
This makes t5004 pass on NetBSD and OpenBSD.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test to verify the emptiness of an archive by extracting its
contents. Don't run this test if the version of tar doesn't support
archives containing only a comment header, though.
The existing check 'tar archive of empty tree is empty' used to work
like that (minus the tar capability check) but was changed to depend
on the exact representation of empty tar files created by git archive
instead of on the behaviour of tar in order to avoid issues with
different tar versions.
The different approaches test different things: The existing one is
for empty trees, for which we know the exact expected output and thus
we can simply check it without extracting; the new one is for commits
with empty trees, whose archives include stamps and so the more
"natural" check by extraction is a better fit because it focuses on
the interesting aspect, namely the absence of any archive entries.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t5004: avoid using tar for checking emptiness of archive
Test 2 of t5004 checks if a supposedly empty tar archive really
contains no files. 24676f02 (t5004: fix issue with empty archive test
and bsdtar) removed our commit hash to make it work with bsdtar, but
the test still fails on NetBSD and OpenBSD, which use their own tar
that considers a tar file containing only NULs as broken.
Here's what the different archivers do when asked to create a tar
file without entries:
So NetBSD's native tar creates an empty file, while GNU tar and bsdtar
both give us 10KB of NULs -- just like git archive with an empty tree.
Now let's see how the archivers handle these two kinds of empty tar
files:
$ tar tf zero.tar; echo $?
tar: Unexpected EOF on archive file
1
$ gtar tf zero.tar; echo $?
gtar: This does not look like a tar archive
gtar: Exiting with failure status due to previous errors
2
$ bsdtar tf zero.tar; echo $?
0
$ tar tf tenk.tar; echo $?
tar: Cannot identify format. Searching...
tar: End of archive volume 1 reached
tar: Sorry, unable to determine archive format.
$ gtar tf tenk.tar; echo $?
0
$ bsdtar tf tenk.tar; echo $?
0
NetBSD's tar complains about both, bsdtar happily accepts any of them
and GNU tar doesn't like zero-length archive files. So the safest
course of action is to stay with our block-of-NULs format which is
compatible with GNU tar and bsdtar, as we can't make NetBSD's native
tar happy anyway.
We can simplify our test, however, by taking tar out of the picture.
Instead of extracting the archive and checking for the non-presence of
files, check if the file has a size of 10KB and contains only NULs.
This makes t5004 pass on NetBSD and OpenBSD.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Versions of tar that don't know pax headers -- like the ones in NetBSD 6
and OpenBSD 5.2 -- extract them as regular files. Explicitly ignore the
file created for our global header when checking the list of extracted
files, as this is normal and harmless fall-back behaviour. This fixes
test 3 of t5004 on these platforms.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `kdiff3 --auto` help message is, "No GUI if all conflicts are auto-
solvable." This flag was carried over from the original mergetool
commands. diff_cmd() is for two-way comparisons only so remove the
superfluous flag.
Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The SVN::Fetcher module is now able to filter for inclusion as well
as exclusion (as used by --ignore-path). Also added tests, documentation
changes and git completion script.
If you have an SVN repository with many top level directories and you
only want a git-svn clone of some of them then using --ignore-path is
difficult as it requires a very long regexp. In this case it's much
easier to filter for inclusion.
[ew: remove trailing whitespace]
Signed-off-by: Paul Walmsley <pjwhams@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
Git::SVN::*: add missing "NAME" section to perldoc
lexgrog(1) relies on the NAME section to find a manpage's subject's
name and description for easy access later using "man -k". Add the
section it expects.
Noticed using lintian.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
When svn.pushmergeinfo is set, the target branch is included in the
mergeinfo if it was previously merged into one of the source branches.
SVN does not do this.
Remove merge target branch path from resulting mergeinfo when
svn.pushmergeinfo is set to better match the behavior of SVN. Update the
svn-mergeinfo-push test.
[ew: 80 columns]
Signed-off-by: Michael Contreras <michael@inetric.com> Reported-by: Avishay Lavie <avishay.lavie@gmail.com> Acked-by: Eric Wong <normalperson@yhbt.net>
Use help.c:help_unknown_ref() instead of die() to provide a
friendlier error message before exiting, when one of the refs
specified in a merge is unknown.
Signed-off-by: Vikrant Varma <vikrant.varma94@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the user gives an unknown string to a command that expects to
get a ref, we could be more helpful than just saying "that's not a
ref" and die.
Add helper function help_unknown_ref() to take care of displaying an
error message along with a list of suggested refs the user might
have meant. An interaction with "git merge" might go like this:
$ git merge foo
merge: foo - not something we can merge
Did you mean one of these?
origin/foo
upstream/foo
Signed-off-by: Vikrant Varma <vikrant.varma94@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's possible that the previous tip goes away, we should not assume it's
always present. Fortunately we are only using it to calculate the
progress to display to the user, so only that needs to be fixed.
Also, add a test that triggers this issue.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
clone: allow cloning local paths with colons in them
Usually "foo:bar" is interpreted as an ssh url. This patch allows to
clone from such paths by putting at least one slash before the colon
(i.e. /path/to/foo:bar or just ./foo:bar).
file://foo:bar should also work, but local optimizations are off in
that case, which may be unwanted. While at there, warn the users about
--local being ignored in this case.
Reported-by: William Giokas <1007380@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-export: do not parse non-commit objects while reading marks file
We read from the marks file and keep only marked commits, but in
order to find the type of object, we are parsing the whole thing,
which is slow, specially in big repositories with lots of big files.
There's no need for that, we can query the object information with
sha1_object_info().
Before this, loading the objects of a fresh emacs import, with 260598
blobs took 14 minutes, after this patch, it takes 3 seconds.
This is the way fast-import does it. Also die if the object is not
found (like fast-import).
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
c08e4d5b5cfa (Enable minimal stat checking, 2013-01-22) advertised
the configuration variable core.checkstat in the documentation and
its log message, but the code expected core.statinfo instead.
For now, add core.checkstat, and warn people who have core.statinfo
in their configuration file that we will remove it in Git 2.0.
Noticed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-merge-tree causes a null pointer dereference when a directory
entry exists in only one or two of the three trees being compared with
no corresponding entry in the other tree(s).
When this happens, we want to handle the entry as a directory and not
attempt to mark it as a file merge. Do this by setting the entries bit
in the directory mask when the entry is missing or when it is a
directory, only performing the file comparison when we know that a file
entry exists.
Reported-by: Andreas Jacobsen <andreas@andreasjacobsen.com> Signed-off-by: John Keeping <john@keeping.me.uk> Tested-by: Andreas Jacobsen <andreas@andreasjacobsen.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fc/remote-bzr:
remote-bzr: avoid bad refs
remote-bzr: convert all unicode keys to str
remote-bzr: access branches only when needed
remote-bzr: delay peer branch usage
remote-bzr: iterate revisions properly
remote-bzr: improve progress reporting
remote-bzr: add option to specify branches
remote-bzr: add custom method to find branches
remote-bzr: improve author sanitazion
remote-bzr: add support for shared repo
remote-bzr: fix branch names
remote-bzr: add support for bzr repos
remote-bzr: use branch variable when appropriate
remote-bzr: fix partially pushed merge
remote-bzr: fixes for branch diverge
remote-bzr: add support to push merges
remote-bzr: always try to update the worktree
remote-bzr: fix order of locking in CustomTree
remote-bzr: delay blob fetching until the very end
remote-bzr: cleanup CustomTree
Versions of fast-export before v1.8.2 throws a bad 'reset' commands
because of a behavior in transport-helper that is not even needed.
We should ignore them, otherwise we will treat them as branches and
fail.
This was fixed in v1.8.2, but some people use this script in older
versions of git.
Also, check if the ref was a tag, and skip it for now.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 54bb901 (t/Makefile: fix result handling with
TEST_OUTPUT_DIRECTORY - 2013-04-26) incorrectly defined
TEST_RESULTS_DIRECTORY relative to itself, when it should be relative to
TEST_OUTPUT_DIRECTORY. Fix this.
Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If there's already a remote-helper tracking ref, we can fetch the
SHA-1 to report proper push messages (as opposed to always reporting
[new branch]).
The remote-helper currently can specify the old SHA-1 to avoid this
problem, but there's no point in forcing all remote-helpers to be aware
of git commit ids; they should be able to be agnostic of them.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'tr/remote-tighten-commandline-parsing' into maint
* tr/remote-tighten-commandline-parsing:
remote: 'show' and 'prune' can take more than one remote
remote: check for superfluous arguments in 'git remote add'
remote: add a test for extra arguments, according to docs
t5500: add test for fetching with an unknown 'shallow'
When the client sends a 'shallow' line for an object that the server does
not have, the server should just ignore it and let the client keep that
unknown shallow boundary.
Signed-off-by: Michael Heemskerk <mheemskerk@atlassian.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The lookup_object function is backed by a hash table of all
objects we have seen in the program. We manage collisions
with a linear walk over the colliding entries, checking each
with hashcmp(). The main cost of lookup is in these
hashcmp() calls; finding our item in the first slot is
cheaper than finding it in the second slot, which is cheaper
than the third, and so on.
If we assume that there is some locality to the object
lookups (e.g., if X and Y collide, and we have just looked
up X, the next lookup is more likely to be for X than for
Y), then we can improve our average lookup speed by checking
X before Y.
This patch does so by swapping a found item to the front of
the collision chain. The p0001 perf test reveals that this
does indeed exploit locality in the case of "rev-list --all
--objects":
Test origin this tree
-------------------------------------------------------------------------
0001.1: rev-list --all 0.40(0.38+0.02) 0.40(0.36+0.03) +0.0%
0001.2: rev-list --all --objects 2.24(2.17+0.05) 1.86(1.79+0.05) -17.0%
This is not surprising, as the full object traversal will
hit the same tree entries over and over (e.g., for every
commit that doesn't change "Documentation/", we will have to
look up the same sha1 just to find out that we already
processed it).
The reason why this technique works (and does not violate
any properties of the hash table) is subtle and bears some
explanation. Let's imagine we get a lookup for sha1 `X`, and
it hashes to bucket `i` in our table. That stretch of the
table may look like:
index | i-1 | i | i+1 | i+2 |
-----------------------------------
entry ... | A | B | C | X | ...
-----------------------------------
We start our probe at i, see that B does not match, nor does
C, and finally find X. There may be multiple C's in the
middle, but we know that there are no empty slots (or else
we would not find X at all).
We do not know the original index of B; it may be `i`, or it
may be less than i (e.g., if it were `i-1`, it would collide
with A and spill over into the `i` bucket). So it is
acceptable for us to move it to the right of a contiguous
stretch of entries (because we will find it from a linear
walk starting anywhere at `i` or before), but never to the
left (if we moved it to `i-1`, we would miss it when
starting our walk at `i`).
We do know the original index of X; it is `i`, so it is safe
to place it anywhere in the contiguous stretch between `i`
and where we found it (`i+2` in the this case).
This patch does a pure swap; after finding X in the
situation above, we would end with:
index | i-1 | i | i+1 | i+2 |
-----------------------------------
entry ... | A | X | C | B | ...
-----------------------------------
We could instead bump X into the `i` slot, and then shift
the whole contiguous chain down by one, resulting in:
index | i-1 | i | i+1 | i+2 |
-----------------------------------
entry ... | A | X | B | C | ...
-----------------------------------
That puts our chain in true most-recently-used order.
However, experiments show that it is not any faster (and in
fact, is slightly slower due to the extra manipulation).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hold the ref_cache instance for the main repository in a dedicated,
statically-allocated instance to avoid the need for a function call
and a linked-list traversal when it is needed.
Suggested by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack_one_ref(): use write_packed_entry() to do the writing
Change pack_refs() to work with a file descriptor instead of a FILE*
(making the file-locking code less awkward) and use
write_packed_entry() to do the writing.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change pack_one_ref() to call peel_entry() rather than using its own
code for peeling references. Aside from sharing code, this lets it
take advantage of the optimization introduced by 6c4a060d7d.
Please note that we *could* use any peeled values that happen to
already be stored in the ref_entries, which would avoid some object
lookups for references that were already packed. But doing so would
also propagate any peeling errors across runs of "git pack-refs" and
give no way to recover from such errors. And "git pack-refs" isn't
run often enough that the performance cost is a problem. So instead,
add a new option to peel_entry() to force the entry to be re-peeled,
and call it with that option set.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Function do_not_prune() was redundantly checking REF_ISSYMREF, which
was already tested at the top of pack_one_ref(), so remove that check.
And the rest was trivial, so inline the function.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack_refs() was not using any of the extra features of for_each_ref(),
so change it to use do_for_each_entry(). This also gives it access to
the ref_entry and in particular its peeled field, which will be taken
advantage of in the next commit.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-refs: merge code from pack-refs.{c,h} into refs.{c,h}
pack-refs.c doesn't contain much code, and the code it does contain is
closely related to reference handling. Moreover, there is some
duplication between pack_refs() and repack_without_ref(). Therefore,
merge pack-refs.c into refs.c and pack-refs.h into refs.h.
The code duplication will be addressed in future commits.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>