"git clone foo/bar:baz" cannot be a request to clone from a remote
over git-over-ssh specified in the scp style. Detect this case and
clone from a local repository at "foo/bar:baz".
* nd/clone-local-with-colon:
clone: allow cloning local paths with colons in them
Optimization for fast-export by avoiding unnecessarily resolving
arbitrary object name and parsing object when only presence and
type information is necessary, etc.
* fc/fast-export-persistent-marks:
fast-{import,export}: use get_sha1_hex() to read from marks file
fast-export: don't parse commits while reading marks file
fast-export: do not parse non-commit objects while reading marks file
unpack-trees: free cache_entry array members for merges
The merge functions duplicate entries as needed and they don't free
them. Release them in unpack_nondirectories, the same function
where they were allocated, after we're done.
As suggested by Felipe, use the same loop style (zero-based for loop)
for freeing as for allocating.
Improved-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff-lib, read-tree, unpack-trees: mark cache_entry array paramters const
Change the type merge_fn_t to accept the array of cache_entry pointers
as const pointers to const pointers. This documents the fact that the
merge functions don't modify the cache_entry contents or replace any of
the pointers in the array.
Only a single cast is necessary in unpack_nondirectories because adding
two const modifiers at once is not allowed in C. The cast is safe in
that it doesn't mask any modfication; call_unpack_fn only needs the
array for reading.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
unpack-trees: create working copy of merge entry in merged_entry
Duplicate the merge entry right away and work with that instead of
modifying the entry we got and duplicating it only at the end of
the function. Then mark that pointer const to document that we
don't modify the referenced cache_entry.
This change is safe because all existing merge functions call
merged_entry just before returning (or not at all), i.e. they don't
care about changes to the referenced cache_entry after the call.
unpack_nondirectories and unpack_index_entry, which call the merge
functions through call_unpack_fn, aren't interested in such changes
neither.
The change complicates merged_entry a bit because we have to free the
copy if we error out, but allows callers to pass a const pointer.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we're add it, mark the struct cache_entry pointer of add_entry
const because we only read from it and this allows callers to pass in
const pointers.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ie_match_stat and ie_modified only derefence their struct cache_entry
pointers for reading. Add const to the parameter declaration here and
do the same for the static helper function used by them, as it's the
same there as well. This allows callers to pass in const pointers.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add const for pointers that are only dereferenced for reading by the
inline functions copy_cache_entry and ce_mode_from_stat. This allows
callers to pass in const pointers.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: document the lifetime of the args passed to each_ref_fn
The lifetime of the memory pointed to by the refname and sha1
arguments to each_ref_fn was never documented, but some callers used
to assume that it was essentially permanent. In fact the API does
*not* guarantee that these objects live beyond a single callback
invocation.
In the current code, the lifetimes are bound together with the
lifetimes of the ref_caches. Since these are usually long, the
callers usually got away with their sloppiness. But even today, if a
ref_cache is invalidated the memory can be freed. And planned changes
to reference caching, needed to eliminate race conditions, will
probably need to shorten the lifetimes of these objects.
The commits leading up to this have (hopefully) fixed all of the
callers of the for_each_ref()-like functions. This commit does the
last step: documents what each_ref_fn callbacks can assume about
object lifetimes.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
exclude_existing(): set existing_refs.strdup_strings
The each_ref_fn add_existing() adds refnames to the existing_refs
list. But the lifetimes of these refnames is not guaranteed by the
refs API, so configure the string_list to make copies as it adds them.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
string_list_add_refs_by_glob(): add a comment about memory management
Since string_list_add_one_ref() adds refname to the string list, but
the lifetime of refname is limited, it is important that the
string_list passed to string_list_add_one_ref() has strdup_strings
set. Document this fact.
All current callers do the right thing.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object_array_entry: fix memory handling of the name field
Previously, the memory management of the object_array_entry::name
field was inconsistent and undocumented. object_array_entries are
ultimately created by a single function, add_object_array_with_mode(),
which has an argument "const char *name". This function used to
simply set the name field to reference the string pointed to by the
name parameter, and nobody on the object_array side ever freed the
memory. Thus, it assumed that the memory for the name field would be
managed by the caller, and that the lifetime of that string would be
at least as long as the lifetime of the object_array_entry. But
callers were inconsistent:
* Some passed pointers to constant strings or argv entries, which was
OK.
* Some passed pointers to newly-allocated memory, but didn't arrange
for the memory ever to be freed.
* Some passed the return value of sha1_to_hex(), which is a pointer to
a statically-allocated buffer that can be overwritten at any time.
* Some passed pointers to refnames that they received from a
for_each_ref()-type iteration, but the lifetimes of such refnames is
not guaranteed by the refs API.
Bring consistency to this mess by changing object_array to make its
own copy for the object_array_entry::name field and free this memory
when an object_array_entry is deleted from the array.
Many callers were passing the empty string as the name parameter, so
as a performance optimization, treat the empty string specially.
Instead of making a copy, store a pointer to a statically-allocated
empty string to object_array_entry::name. When deleting such an
entry, skip the free().
Change the callers that were already passing copies to
add_object_array_with_mode() to either skip the copy, or (if the
memory needed to be allocated anyway) freeing the memory itself.
A part of this commit effectively reverts
70d26c6e76 read_revisions_from_stdin: make copies for handle_revision_arg
because the copying introduced by that commit (which is still
necessary) is now done at a deeper level.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.c: fix ignore processing within not-ignored directories
As of 95c6f271 "dir.c: unify is_excluded and is_path_excluded APIs", the
is_excluded API no longer recurses into directories that match an ignore
pattern, and returns the directory's ignored state for all contained paths.
This is OK for normal ignore patterns, i.e. ignoring a directory affects
the entire contents recursively.
Unfortunately, this also "works" for negated ignore patterns ('!dir'), i.e.
the entire contents is "not-ignored" recursively, regardless of ignore
patterns that match the contents directly.
In prep_exclude, skip recursing into a directory only if it is really
ignored (i.e. the ignore pattern is not negated).
Signed-off-by: Karsten Blees <blees@dcon.de> Tested-by: Øystein Walle <oystwa@gmail.com> Reviewed-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Case folding is not done correctly when matching against the [:upper:]
character class and uppercased character ranges (e.g. A-Z).
Specifically, an uppercase letter fails to match against any of them
when case folding is requested because plain characters in the pattern
and the whole string are preemptively lowercased to handle the base case
fast.
That optimization is kept and ISLOWER() is used in the [:upper:] case
when case folding is requested, while matching against a character range
is retried with toupper() if the character was lowercase, as the bounds
of the range itself cannot be modified (in a case-insensitive context,
[A-_] is not equivalent to [a-_]).
Signed-off-by: Anthony Ramine <n.oxyde@gmail.com> Reviewed-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a sub-process dies with a signal, we convert the exit
code to the shell convention of 128+sig. Callers of git may
be relying on this behavior, so let's make sure it does not
break.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are some index handling subtleties in 'commit --only' that are
best tested when we have an existing index, but an unborn or empty
HEAD. These circumstances are easily produced by 'checkout --orphan',
but we did not previously have a test for it.
The main expected failure mode would be: erroneously loading the
existing index contents when building the temporary index that is used
for --only. Cf.
sha1_name: fix error message for @{<N>}, @{<date>}
Currently, when we try to resolve @{<N>} or @{<date>} when the reflog
doesn't go back far enough, we get errors like:
# on branch master
$ git show @{10000}
fatal: Log for '' only has 7 entries.
$ git show @{10000.days.ago}
warning: Log for '' only goes back to Tue, 21 May 2013 14:14:45 +0530.
...
# detached HEAD case
$ git show @{10000}
fatal: Log for '' only has 2005 entries.
$ git show master@{10000}
fatal: Log for 'master' only has 7 entries.
The empty string '' is confusing and does not convey information
about whose logs we are inspecting. Change this so that we get:
# on branch master
$ git show @{10000}
fatal: Log for 'master' only has 7 entries.
$ git show @{10000.days.ago}
warning: Log for 'master' only goes back to Tue, 21 May 2013 14:14:45 +0530.
...
# detached HEAD case
$ git show @{10000}
fatal: Log for 'HEAD' only has 2005 entries.
$ git show master@{10000}
fatal: Log for 'master' only has 7 entries.
Also one of the message strings given to die() now points into
real_ref that was not used in that fashion, so stop freeing the
underlying storage for it.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Bug-spotted-and-fixed-by: Thomas Rast Signed-off-by: Junio C Hamano <gitster@pobox.com>
On MinGW, sparse issues an "'get_st_mode_bits' not declared. Should
it be static?" warning. The MinGW and MSVC builds do not see the
declaration of this function, within git-compat-util.h, due to its
placement within an preprocessor conditional.
In order to suppress the warning, we simply move the declaration to
the top level of the header.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch was written with a different motivation. There is a problem
unique to push.default = current:
# on branch push-current-head
$ git push
# on another terminal
$ git checkout master
# return to the first terminal
# the push tried to push master!
This happens because the 'git checkout' on the second terminal races
with the 'git push' on the first terminal. Although this patch does not
solve the core problem (there is still no guarantee that 'git push' on
the first terminal will resolve HEAD before 'git checkout' changes HEAD
on the second), it works in practice.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Setting push.default to current adds the refspec "HEAD" for the
transport layer to handle. If "HEAD" doesn't resolve to a branch (and
since no refspec rhs is specified), the push fails after some time with
a cryptic error message:
$ git push
error: unable to push to unqualified destination: HEAD
The destination refspec neither matches an existing ref on the remote nor
begins with refs/, and we are unable to guess a prefix based on the source ref.
error: failed to push some refs to 'git@github.com:artagnon/git'
Fail early with a nicer error message:
$ git push
fatal: You are not currently on a branch.
To push the history leading to the current (detached HEAD)
state now, use
git push ram HEAD:<name-of-remote-branch>
Just like in the upstream and simple cases.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When TEST_OUTPUT_DIRECTORY setting is used, it was handled somewhat
inconsistently between the test framework and t/Makefile, and logic
to summarize the results looked at a wrong place.
* jk/test-output:
t/Makefile: don't define TEST_RESULTS_DIRECTORY recursively
test output: respect $TEST_OUTPUT_DIRECTORY
t/Makefile: fix result handling with TEST_OUTPUT_DIRECTORY
Update reading and updating packed-refs file, correcting corner case
bugs.
* mh/packed-refs-various: (33 commits)
refs: handle the main ref_cache specially
refs: change do_for_each_*() functions to take ref_cache arguments
pack_one_ref(): do some cheap tests before a more expensive one
pack_one_ref(): use write_packed_entry() to do the writing
pack_one_ref(): use function peel_entry()
refs: inline function do_not_prune()
pack_refs(): change to use do_for_each_entry()
refs: use same lock_file object for both ref-packing functions
pack_one_ref(): rename "path" parameter to "refname"
pack-refs: merge code from pack-refs.{c,h} into refs.{c,h}
pack-refs: rename handle_one_ref() to pack_one_ref()
refs: extract a function write_packed_entry()
repack_without_ref(): write peeled refs in the rewritten file
t3211: demonstrate loss of peeled refs if a packed ref is deleted
refs: change how packed refs are deleted
search_ref_dir(): return an index rather than a pointer
repack_without_ref(): silence errors for dangling packed refs
t3210: test for spurious error messages for dangling packed refs
refs: change the internal reference-iteration API
refs: extract a function peel_entry()
...
Enhance "check-ignore" (1.8.2 update) to work more like "check-attr"
over bidi-pipes.
* as/check-ignore:
t0008: use named pipe (FIFO) to test check-ignore streaming
Documentation: add caveats about I/O buffering for check-{attr,ignore}
check-ignore: allow incremental streaming of queries via --stdin
check-ignore: move setup into cmd_check_ignore()
check-ignore: add -n / --non-matching option
t0008: remove duplicated test fixture data
Update "git checkout foo" that DWIMs the intended "upstream" and
turns it into "git checkout -t -b foo remotes/origin/foo" to
correctly take existing remote definitions into account.
The remote "origin" may be what uniquely map its own branch to
remotes/some/where/foo but that some/where may not be "origin".
* jh/checkout-auto-tracking:
glossary: Update and rephrase the definition of a remote-tracking branch
branch.c: Validate tracking branches with refspecs instead of refs/remotes/*
t9114.2: Don't use --track option against "svn-remote"-tracking branches
t7201.24: Add refspec to keep --track working
t3200.39: tracking setup should fail if there is no matching refspec.
checkout: Use remote refspecs when DWIMming tracking branches
t2024: Show failure to use refspec when DWIMming remote branch names
t2024: Add tests verifying current DWIM behavior of 'git checkout <branch>'
We used the approxidate() parser for "--expire=<timestamp>" options
of various commands, but it is better to treat --expire=all and
--expire=now a bit more specially than using the current timestamp.
Update "git gc" and "git reflog" with a new parsing function for
expiry dates.
* jc/prune-all:
prune: introduce OPT_EXPIRY_DATE() and use it
api-parse-options.txt: document "no-" for non-boolean options
git-gc.txt, git-reflog.txt: document new expiry options
date.c: add parse_expiry_date()
"git fetch" into a shallow repository from a repository that does
not know about the shallow boundary commits (e.g. a different fork
from the repository the current shallow repository was cloned from)
did not work correctly.
* mh/fetch-into-shallow:
t5500: add test for fetching with an unknown 'shallow'
upload-pack: ignore 'shallow' lines with unknown obj-ids
Finishing touches to fc/transport-helper-error-reporting topic.
* js/transport-helper-error-reporting-fix:
git-remote-testgit: build it to run under $SHELL_PATH
git-remote-testgit: further remove some bashisms
git-remote-testgit: avoid process substitution
difftool --dir-diff: allow changing any clean working tree file
The temporary directory prepared by "difftool --dir-diff" to
show the result of a change can be modified by the user via
the tree diff program, and we try hard not to lose changes
to them after tree diff program returns to us.
However, the set of files to be copied back is computed
differently between --symlinks and --no-symlinks modes. The
former checks all paths that start out as identical to the
working tree file, while the latter checks paths that
already had a local modification in the working tree,
allowing changes made in the tree diff program to paths that
did not have any local change to be lost.
Signed-off-by: Kenichi Saita <nitoyon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
With push.default set to upstream or simple, and a detached HEAD, git
push prints the following error:
$ git push
fatal: You are not currently on a branch.
To push the history leading to the current (detached HEAD)
state now, use
git push ram HEAD:<name-of-remote-branch>
This error is not unique to upstream or simple: current cannot push with
a detached HEAD either. So, factor out the error string in preparation
for using it in current.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_sha1: warn about full or short object names that look like refs
When we get 40 hex digits, we immediately assume it's an SHA-1. This
is the right thing to do because we have no way else to specify an
object. If there is a ref with the same object name, it will be
ignored. Warn the user about this case because the ref with full
object name is likely a mistake, for example
When we are rebasing without options ('am' mode), the head rebased lives
in '$g/rebase-apply/head-name', so lets use that information so it's
reported the same way as if we were doing other rebases (-i or -m).
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
rebase: implement --[no-]autostash and rebase.autostash
This new feature allows a rebase to be executed on a dirty worktree or
index. It works by creating a temporary "dangling merge commit" out
of the worktree and index changes (via 'git stash create'), and
automatically applying it after a successful rebase or abort.
rebase stores the SHA-1 hex of the temporary merge commit, along with
the rest of the rebase state, in either
.git/{rebase-merge,rebase-apply}/autostash depending on the kind of
rebase. Since $state_dir is automatically removed at the end of a
successful rebase or abort, so is the autostash.
The advantage of this approach is that we do not affect the normal
stash's reflogs, making the autostash invisible to the end-user. This
means that you can use 'git stash' during a rebase as usual.
When the autostash application results in a conflict, we push
$state_dir/autostash onto the normal stash and remove $state_dir
ending the rebase. The user can inspect the stash, and pop or drop at
any time.
Most significantly, this feature means that a caller like pull (with
pull.rebase set to true) can easily be patched to remove the
require_clean_work_tree restriction.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-remote-mediawiki: better error message when HTTP(S) access fails
My use-case is an invalid SSL certificate. Pulling from the wiki with a
recent version of libwww-perl fails, and git-remote-mediawiki gave no
clue about the reason. Give the mediawiki API detailed error message, and
since it is not so informative, hint the user about an invalid SSL
certificate on https:// urls.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit: don't start editor if empty message is given with -m
If an empty message is specified with the option -m of git commit then
the editor is started. That's unexpected and unnecessary. Instead of
using the length of the message string for checking if the user
specified one, directly remember if the option -m was given.
Reported-by: Mislav Marohnić <mislav.marohnic@gmail.com> Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add protocol imap, imaps, ftp and smtp for credential-osxkeychain.
Signed-off-by: Xidorn Quan <quanxunzhen@gmail.com> Acked-by: John Szakmeister <john@szakmeister.net> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In diff_tree_combined we make a copy of diffopts. In
try_to_follow_renames, called via diff_tree_sha1, we free and
re-initialize diffopts->pathspec->items. Since we did not make a deep
copy of diffopts in diff_tree_combined, the original diffopts does not
get the update. By the time we return from diff_tree_combined,
rev->diffopt->pathspec->items points to an invalid memory address. We
get a segfault next time we try to access that pathspec.
Instead, along with the copy of diffopts, make a copy pathspec->items as
well.
We would also have to make a copy of pathspec->raw to keep it consistent
with pathspec->items, but nobody seems to rely on that.
Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
send-email: remove warning about unset chainreplyto
Three years and a half is probably more than enough time to give users
the opportunity to configure Git to do what they want. If they haven't
changed the configuration by now, this warning message is not going to
do anything for them anyway.
This effectively reverts commit 528fb08 (prepare send-email for smoother
change of --chain-reply-to default).
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
, which wedged a pointer to parent into the object_array_entry's name
field. The parent pointer was passed to traverse_one_object(), even
though that function *didn't use it*.
The useless code has been deleted over time. Commit
a1cdc25172 fsck: drop unused parameter from traverse_one_object()
removed the parent pointer from traverse_one_object()'s
signature. Commit
object_array_remove_duplicates(): rewrite to reduce copying
The old version copied one entry to its destination position, then
deleted any matching entries from the tail of the array. This
required the tail of the array to be copied multiple times. It didn't
affect the complexity of the algorithm because the whole tail has to
be searched through anyway. But all the copying was unnecessary.
Instead, check for the existence of an entry with the same name in the
*head* of the list before copying an entry to its final position.
This way each entry has to be copied at most one time.
Extract a helper function contains_name() to do a bit of the work.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function that allows unwanted entries in an object_array to be
removed. This encapsulation is a step towards giving object_array
ownership of its entries' name memory.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_diff(): make it obvious which cases are exclusive of each other
At first glance the OBJ_COMMIT, OBJ_TREE, and OBJ_BLOB cases look like
they might be mutually exclusive. But the OBJ_COMMIT case doesn't end
the loop iteration with "continue" like the other two cases, but
rather falls through. So use if...else if...else construct to make it
more obvious that only the last two cases are mutually exclusive.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add_rev_cmdline(): make a copy of the name argument
Instead of assuming that the memory pointed to by the name argument
will live forever, make a local copy of it before storing it in the
ref_cmdline_info.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Do not retain a reference to the refname passed to the each_ref_fn
callback get_name(), because there is no guarantee of the lifetimes of
these names. Instead, make a local copy when needed.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
prune-packed: avoid implying "1" is DRY_RUN in prune_packed_objects()
Commit b60daf0 (Make git-prune-packed a bit more chatty. - 2007-01-12)
changes the meaning of prune_packed_objects()'s argument, from "dry
run or not dry run" to a bitmap.
It however forgot to update prune_packed_objects() caller in
builtin/prune.c to use new DRY_RUN macro. It's fine (for a long time!)
but there is a risk that someday someone may change the value of
DRY_RUN to something else and builtin/prune.c suddenly breaks. Avoid
that possibility.
While at there, change "opts == VERBOSE" to "opts & VERBOSE" as there
is no obvious reason why we only be chatty when DRY_RUN is not set.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch: don't try to update unfetched tracking refs
Since commit f269048 (fetch: opportunistically update tracking refs,
2013-05-11) we update tracking refs opportunistically when fetching
remote branches. However, if there is a configured non-pattern refspec
that does not match any of the refspecs given on the command line then a
fatal error occurs.
Fix this by setting the "missing_ok" flag when calling get_fetch_map.
Test-added-by: Jeff King <peff@peff.net> Signed-off-by: John Keeping <john@keeping.me.uk> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to make sure the cloned repository is good, we run "rev-list
--objects --not --all $new_refs" on the repository. This is expensive
on large repositories. This patch attempts to mitigate the impact in
this special case.
In the "good" clone case, we only have one pack. If all of the
following are met, we can be sure that all objects reachable from the
new refs exist, which is the intention of running "rev-list ...":
- all refs point to an object in the pack
- there are no dangling pointers in any object in the pack
- no objects in the pack point to objects outside the pack
The second and third checks can be done with the help of index-pack as
a slight variation of --strict check (which introduces a new condition
for the shortcut: pack transfer must be used and the number of objects
large enough to call index-pack). The first is checked in
check_everything_connected after we get an "ok" from index-pack.
"index-pack + new checks" is still faster than the current "index-pack
+ rev-list", which is the whole point of this patch. If any of the
conditions fail, we fall back to the good old but expensive "rev-list
..". In that case it's even more expensive because we have to pay for
the new checks in index-pack. But that should only happen when the
other side is either buggy or malicious.
Cloning linux-2.6 over file://
before after
real 3m25.693s 2m53.050s
user 5m2.037s 4m42.396s
sys 0m13.750s 0m16.574s
A more realistic test with ssh:// over wireless
before after
real 11m26.629s 10m4.213s
user 5m43.196s 5m19.444s
sys 0m35.812s 0m37.630s
This shortcut is not applied to shallow clones, partly because shallow
clones should have no more objects than a usual fetch and the cost of
rev-list is acceptable, partly to avoid dealing with corner cases when
grafting is involved.
This shortcut does not apply to unpack-objects code path either
because the number of objects must be small in order to trigger that
code path.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch-pack: prepare updated shallow file before fetching the pack
index-pack --strict looks up and follows parent commits. If shallow
information is not ready by the time index-pack is run, index-pack may
be led to non-existent objects. Make fetch-pack save shallow file to
disk before invoking index-pack.
git learns new global option --shallow-file to pass on the alternate
shallow file path. Undocumented (and not even support --shallow-file=
syntax) because it's unlikely to be used again elsewhere.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>