* es/overlapping-range-set:
range_set: fix coalescing bug when range is a subset of another
t4211: fix broken test when one -L range is subset of another
"git clone -s/-l" is a filesystem level copy and does not offer any
protection against source repository being corrupt. While the
connectivity validation checks commits and trees being readable, it
made the otherwise instantaneous local modes of clone much more
expensive, without protecting blob data from bitflips.
* jk/maint-clone-shared-no-connectivity-validation:
clone: drop connectivity check for local clones
Merge branch 'mt/send-email-cc-match-fix' into maint
Logic used by git-send-email to suppress cc mishandled names like "A
U. Thor" <author@example.xz>, where the human readable part needs to
be quoted (the user input may not have the double quotes around the
name, and comparison was done between quoted and unquoted strings).
It also mishandled names that need RFC2047 quoting.
* mt/send-email-cc-match-fix:
send-email: sanitize author when writing From line
send-email: add test for duplicate utf8 name
test-send-email: test for pre-sanitized self name
t/send-email: test suppress-cc=self with non-ascii
t/send-email: add test with quoted sender
send-email: make --suppress-cc=self sanitize input
t/send-email: test suppress-cc=self on cccmd
send-email: fix suppress-cc=self on cccmd
t/send-email.sh: add test for suppress-cc=self
Pass port number as a separate argument when send-email initializes
Net::SMTP, instead of as a part of the hostname, i.e. host:port.
This allows GSSAPI codepath to match with the hostname given.
* bc/send-email-use-port-as-separate-param:
send-email: provide port separately from hostname
In addition to the choice from "rebase, merge, or checkout-detach",
allow a custom command to be used in "submodule update" to update
the working tree of submodules.
* cp/submodule-custom-update:
submodule update: allow custom command to update submodule working tree
"git format-patch" learned "--from[=whom]" option, which sets the
"From: " header to the specified person (or the person who runs the
command, if "=whom" part is missing) and move the original author
information to an in-body From: header as necessary.
* jk/format-patch-from:
teach format-patch to place other authors into in-body "From"
pretty.c: drop const-ness from pretty_print_context
The configuration variable "merge.ff" was cleary a tri-state to
choose one from "favor fast-forward when possible", "always create
a merge even when the history could fast-forward" and "do not
create any merge, only update when the history fast-forwards", but
the command line parser did not implement the usual convention of
"last one wins, and command line overrides the configuration"
correctly.
* mv/merge-ff-tristate:
merge: handle --ff/--no-ff/--ff-only as a tri-state option
Fetching between repositories with many refs employed O(n^2)
algorithm to match up the common objects, which has been corrected.
* jk/fetch-pack-many-refs:
fetch-pack: avoid quadratic behavior in rev_list_push
commit.c: make compare_commits_by_commit_date global
fetch-pack: avoid quadratic list insertion in mark_complete
An Gitweb installation that is a part of larger site can optionally
show extra links that point at the levels higher than the Gitweb
pages itself in the link hierarchy of pages.
* tf/gitweb-extra-breadcrumbs:
gitweb: allow extra breadcrumbs to prefix the trail
"log --format=" did not honor i18n.logoutputencoding configuration
and this attempts to fix it.
* as/log-output-encoding-in-user-format:
t4205 (log-pretty-formats): avoid using `sed`
t6006 (rev-list-format): add tests for "%b" and "%s" for the case i18n.commitEncoding is not set
t4205, t6006, t7102: make functions better readable
t4205 (log-pretty-formats): revert back single quotes
t4041, t4205, t6006, t7102: use iso8859-1 rather than iso-8859-1
t4205: replace .\+ with ..* in sed commands
pretty: --format output should honor logOutputEncoding
pretty: Add failing tests: --format output should honor logOutputEncoding
t4205 (log-pretty-formats): don't hardcode SHA-1 in expected outputs
t7102 (reset): don't hardcode SHA-1 in expected outputs
t6006 (rev-list-format): don't hardcode SHA-1 in expected outputs
git-clone.txt: remove the restriction on pushing from a shallow clone
The document says one cannot push from a shallow clone. But that is
not true (maybe it was at some point in the past). The client does not
stop such a push nor does it give any indication to the receiver that
this is a shallow push. If the receiver accepts it, it's in.
Since 52fed6e (receive-pack: check connectivity before concluding "git
push" - 2011-09-02), receive-pack is prepared to deal with broken
push, a shallow push can't cause any corruption. Update the document
to reflect that.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git stash save" is not just about "saving" the local changes, but
also is to restore the working tree state to that of HEAD. If you
changed a non-directory into a directory in the local change, you
may have untracked files in that directory, which have to be killed
while doing so, unless you run it with --include-untracked. Teach
the command to detect and error out before spreading the damage.
This needed a small fix to "ls-files --killed".
* pb/stash-refuse-to-kill:
git stash: avoid data loss when "git stash save" kills a directory
treat_directory(): do not declare submodules to be untracked
"git status" learned status.branch and status.short configuration
variables to use --branch and --short options by default (override
with --no-branch and --no-short options from the command line).
* jg/status-config:
status/commit: make sure --porcelain is not affected by user-facing config
commit: make it work with status.short
status: introduce status.branch to enable --branch by default
status: introduce status.short to enable --short by default
Invocations of "git checkout" used internally by "git rebase" were
counted as "checkout", and affected later "git checkout -" to the
the user to an unexpected place.
* rr/rebase-checkout-reflog:
checkout: respect GIT_REFLOG_ACTION
status: do not depend on rebase reflog messages
t/t2021-checkout-last: "checkout -" should work after a rebase finishes
wt-status: remove unused field in grab_1st_switch_cbdata
t7512: test "detached from" as well
Earlier remote.pushdefault (and per-branch branch.*.pushremote)
were introduced as an additional mechanism to choose what
repository to push into when "git push" did not say it from the
command line, to help people who push to a repository that is
different from where they fetch from. This attempts to finish that
topic by teaching the default mechanism to choose branch in the
remote repository to be updated by such a push.
The 'current', 'matching' and 'nothing' modes (specified by the
push.default configuration variable) extend to such a "triangular"
workflow naturally, but 'upstream' and 'simple' have to be updated.
. 'upstream' is about pushing back to update the branch in the
remote repository that the current branch fetches from and
integrates with, it errors out in a triangular workflow.
. 'simple' is meant to help new people by avoiding mistakes, and
will be the safe default in Git 2.0.
In a non-triangular workflow, it will continue to act as a cross
between 'upstream' and 'current' in that it pushes to the current
branch's @{upstream} only when it is set to the same name as the
current branch (e.g. your 'master' forks from the 'master' from
the central repository).
In a triangular workflow, this series tentatively defines it as
the same as 'current', but we may have to tighten it to avoid
surprises in some way.
range_set: fix coalescing bug when range is a subset of another
When coalescing ranges, sort_and_merge_range_set() unconditionally
assumes that the end of a range being folded into a preceding range
should become the end of the coalesced range. This assumption, however,
is invalid when one range is a subset of another. For example, given
ranges 1-5 and 2-3 added via range_set_append_unsafe(),
sort_and_merge_range_set() incorrectly coalesces them to range 1-3
rather than the correct union range 1-5. Fix this bug.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t4211: fix broken test when one -L range is subset of another
t4211 attempts to test multiple git-log -L ranges where one range is a
superset of the other, and falsely succeeds because its "expected"
output is incorrect.
Overlapping -L ranges handed to git-log are coalesced by
line-log.c:sort_and_merge_range_set() into a set of non-overlapping,
disjoint ranges. When one range is a subset of another,
sort_and_merge_range_set() should coalesce both ranges to the superset
range, but instead the coalesced range often is incorrectly truncated to
the end of the subset range. For example, ranges 2-8 and 3-4 are
coalesced incorrectly to 2-4.
One can observe this incorrect behavior with git-log -L using the test
repository created by t4211. The superset/subset ranges t4211 employs
are 4-$ and 8-12 (where $ represents end-of-file). The coalesced range
should be 4-$. Manually invoking git-log with the same ranges the test
employs, we see:
This last output is incorrect. 8-12 is a subset of 4-$, hence the output
of the coalesced range should be the same as the 4-$ output shown first.
In fact, the above incorrect output is the truncated bogus range 4-12:
Fix the test to correctly fail in the presence of the
sort_and_merge_range_set() coalescing bug. Do so by changing the
"expected" output to the commits mentioned in the 4-$ output above.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pull: change the description to "integrate" changes
Since git-pull learned the --rebase option it has not just been about
merging changes from a remote repository (where "merge" is in the sense
of "git merge"). Change the description to use "integrate" instead of
"merge" in order to reflect this.
Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The == operator as an alias to = is not POSIX. This doesn't actually
matter for the execution of the script, because it only runs when the
shell is bash. However, it trips up test-lint, so it's nicer to use
the standard form.
Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote.c: avoid O(m*n) behavior in match_push_refs
When pushing using a matching refspec or a pattern refspec, each ref
in the local repository must be paired with a ref advertised by the
remote server. This is accomplished by using the refspec to transform
the name of the local ref into the name it should have in the remote
repository, and then performing a linear search through the list of
remote refs to see if the remote ref was advertised by the remote
system.
Each of these lookups has O(n) complexity and makes match_push_refs()
be an O(m*n) operation, where m is the number of local refs and n is
the number of remote refs. If there are many refs 100,000+, then this
ref matching can take a significant amount of time. Let's prepare an
index of the remote refs to allow searching in O(log n) time and
reduce the complexity of match_push_refs() to O(m log n).
We prepare the index lazily so that it is only created when necessary.
So, there should be no impact when _not_ using a matching or pattern
refspec, i.e. when pushing using only explicit refspecs.
Dry-run push of a repository with 121,913 local and remote refs:
before after
real 1m40.582s 0m0.804s
user 1m39.914s 0m0.515s
sys 0m0.125s 0m0.106s
The creation of the index has overhead. So, if there are very few
local refs, then it could take longer to create the index than it
would have taken to just perform n linear lookups into the remote
ref space. Using the index should provide some improvement when
the number of local refs is roughly greater than the log of the
number of remote refs (i.e. m >= log n). The pathological case is
when there is a single local ref and very many remote refs.
Dry-run push of a repository with 121,913 remote refs and a single
local ref:
before after
real 0m0.525s 0m0.566s
user 0m0.243s 0m0.279s
sys 0m0.075s 0m0.099s
Using an index takes 41 ms longer, or roughly 7.8% longer.
Jeff King measured a no-op push of a single ref into a remote repo
with 370,000 refs:
before after
real 0m1.087s 0m1.156s
user 0m1.344s 0m1.412s
sys 0m0.288s 0m0.284s
Using an index takes 69 ms longer, or roughly 6.3% longer.
None of the measurements above required transferring any objects to
the remote repository. If the push required transferring objects and
updating the refs in the remote repository, the impact of preparing
the search index would be even smaller.
A similar operation is performed in the reverse direction when pruning
using a matching or pattern refspec. Let's avoid O(m*n) behavior in
the same way by lazily preparing an index on the local refs.
Signed-off-by: Brandon Casey <drafnel@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 0433ad1 (clone: run check_everything_connected,
2013-03-25) added the same connectivity check to clone that
we use for fetching. The intent was to provide enough safety
checks that "git clone git://..." could be counted on to
detect bit errors and other repo corruption, and not
silently propagate them to the clone.
For local clones, this turns out to be a bad idea, for two
reasons:
1. Local clones use hard linking (or even shared object
stores), and so complete far more quickly. The time
spent on the connectivity check is therefore
proportionally much more painful.
2. Local clones do not actually meet our safety guarantee
anyway. The connectivity check makes sure we have all
of the objects we claim to, but it does not check for
bit errors. We will notice bit errors in commits and
trees, but we do not load blob objects at all. Whereas
over the pack transport, we actually recompute the sha1
of each object in the incoming packfile; bit errors
change the sha1 of the object, which is then caught by
the connectivity check.
This patch drops the connectivity check in the local case.
Note that we have to revert the changes from 0433ad1 to
t5710, as we no longer notice the corruption during clone.
We could go a step further and provide a "verify even local
clones" option, but it is probably not worthwhile. You can
already spell that as "cd foo.git && git fsck && git clone ."
or as "git clone --no-local foo.git".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
With some workflows, it is more suitable to rebase on top of remote
changes when a push does not fast-forward. Change the advice messages
in git-push to suggest that a user "integrate the remote changes"
instead of "merge the remote changes" to make this slightly clearer.
Also change the suggested 'git pull' to 'git pull ...' to hint to users
that they may want to add other parameters.
Suggested-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-config(1): clarify precedence of multiple values
In order to clarify which value is used when there are multiple values
defined for a key, re-order the list of file locations so that it runs
from least specific to most specific. Then add a paragraph which simply
says that the last value will be used.
Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The path of the file to be locked is held in lock_file::filename,
which is a fixed-length buffer of length PATH_MAX. This buffer is
also (temporarily) used to hold the path of the lock file, which is
the path of the file being locked plus ".lock". Because of this, the
path of the file being locked must be less than (PATH_MAX - 5)
characters long (5 chars are needed for ".lock" and one character for
the NUL terminator).
On entry into lock_file(), the path length was only verified to be
less than PATH_MAX characters, not less than (PATH_MAX - 5)
characters.
When and if resolve_symlink() is called, then that function is
correctly told to treat the buffer as (PATH_MAX - 5) characters long.
This part is correct. However:
* If LOCK_NODEREF was specified, then resolve_symlink() is never
called.
* If resolve_symlink() is called but the path is not a symlink, then
the length check is never applied.
So it is possible for a path with length (PATH_MAX - 5 <= len <
PATH_MAX) to make it through the checks. When ".lock" is strcat()ted
to such a path, the lock_file::filename buffer is overflowed.
Fix the problem by adding a check when entering lock_file() that the
original path is less than (PATH_MAX - 5) characters.
[jc: with independent development by Peff]
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diffcore-pickaxe: simplify has_changes and contains
Halve the number of callsites of contains() to two using temporary
variables, simplifying the code. While at it, get rid of the
diff_options parameter, which became unused with 8fa4b09f.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default similarity index of 50% is documented in gitdiffcore(7)
but it is worth also mentioning it in the description of the
-M/--find-renames option.
Signed-off-by: Fraser Tweedale <frase@frase.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
For testing truncated log messages 'commit_msg' function uses `sed` to
cut a message. On various platforms `sed` behaves differently and
results of its work depend on locales installed. So, avoid using `sed`.
Use predefined expected outputs instead of calculated ones.
Signed-off-by: Alexey Shumkin <Alex.Crezoff@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t6006 (rev-list-format): add tests for "%b" and "%s" for the case i18n.commitEncoding is not set
In de6029a (pretty: Add failing tests: --format output should honor
logOutputEncoding, 2013-06-26) 'complex-subject' test was changed.
Revert it back, because that change actually removed tests for "%b"
and "%s" with i18n.commitEncoding set. Also, add two more tests for
mentioned above "%b" and "%s" to test encoding conversions with no
i18n.commitEncoding set.
Signed-off-by: Alexey Shumkin <Alex.Crezoff@gmail.com> Suggested-by: Johannes Sixt <j.sixt@viscovery.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t4205, t6006, t7102: make functions better readable
Function 'test_format' has become harder to read after its change in de6029a2 (pretty: Add failing tests: --format output should honor
logOutputEncoding, 2013-06-26). Simplify it by moving its "should we
expect it to fail?" parameter to the end.
Note, current code does not use this last parameter as far as there
are no tests expected to fail. We can keep that for future use.
Also, reformat comments.
Signed-off-by: Alexey Shumkin <Alex.Crezoff@gmail.com> Improved-by: Johannes Sixt <j.sixt@viscovery.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t4205 (log-pretty-formats): revert back single quotes
In previuos commit de6029a (pretty: Add failing tests: --format output
should honor logOutputEncoding, 2013-06-26) single quotes were replaced
with double quotes to make "$(commit_msg)" expression in heredoc to
work. The same effect can be achieved by using "EOF" as a heredoc
delimiter instead of "\EOF".
Signed-off-by: Alexey Shumkin <Alex.Crezoff@gmail.com> Suggested-by: Johannes Sixt <j.sixt@viscovery.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allows N instances of tests run in parallel, each running 1/N parts
of the test suite under Valgrind, to speed things up.
* tr/test-v-and-v-subtest-only:
perf-lib: fix start/stop of perf tests
test-lib: support running tests under valgrind in parallel
test-lib: allow prefixing a custom string before "ok N" etc.
test-lib: valgrind for only tests matching a pattern
test-lib: verbose mode for only tests matching a pattern
test-lib: self-test that --verbose works
test-lib: rearrange start/end of test_expect_* and test_skip
test-lib: refactor $GIT_SKIP_TESTS matching
test-lib: enable MALLOC_* for the actual tests
Do not use FIFOs on cygwin, they do not work. Cygwin includes
coreutils, so has mkfifo, and that command does something. However,
the resultant named pipe is known (on the Cygwin mailing list at
least) to not work correctly.
This disables PIPE for Cygwin, allowing t0008.sh to complete (all other
tests in that file work correctly).
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t4041, t4205, t6006, t7102: use iso8859-1 rather than iso-8859-1
Both "iso8859-1" and "iso-8859-1" are understood as latin-1 by
modern platforms, but the latter is not understood by older
platforms;update tests to use the former.
This is in line with 3994e8a9 (t4201: use ISO8859-1 rather than
ISO-8859-1, 2009-12-03), which did the same.
Signed-off-by: Alexey Shumkin <Alex.Crezoff@gmail.com> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitweb: allow extra breadcrumbs to prefix the trail
There are often parent pages logically above the gitweb projects
list, e.g. home pages of the organization and department that host
the gitweb server. This change allows you to include links to those
pages in gitweb's breadcrumb trail.
Signed-off-by: Tony Finch <dot@dotat.at> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the SMTP port is provided as part of the hostname to Net::SMTP, it passes
the combined string to the SASL provider; this causes GSSAPI authentication to
fail since Kerberos does not want the port information. Instead, pass the port
as a separate argument as is done for SSL connections.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fixup-builtins: retire an old transition helper script
This script was added in 36e5e70 (Start deprecating "git-command" in
favor of "git command", 2007-06-30) with the intent of aiding the
transition away from dashed forms.
It has already been used to help the transision and served its
purpose, and is no longer very useful for follow-up work, because
the majority of remaining matches it finds are false positives.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'cm/gitweb-project-list-persistent-cgi-fix' into maint
"gitweb" forgot to clear a global variable $search_regexp upon each
request, mistakenly carrying over the previous search to a new one
when used as a persistent CGI.
* cm/gitweb-project-list-persistent-cgi-fix:
gitweb: fix problem causing erroneous project list
Fix a typo ("remote remote-tracking") going back to the big cleanup
in 2010 (8b3f3f84 etc). Also, remove some more occurrences of
"tracking" and "remote tracking" in favor of "remote-tracking".
Signed-off-by: Michael Schubert <mschub@elegosoft.com> Reviewed-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
teach format-patch to place other authors into in-body "From"
Format-patch generates emails with the "From" address set to the
author of each patch. If you are going to send the emails, however,
you would want to replace the author identity with yours (if they
are not the same), and bump the author identity to an in-body
header.
Normally this is handled by git-send-email, which does the
transformation before sending out the emails. However, some
workflows may not use send-email (e.g., imap-send, or a custom
script which feeds the mbox to a non-git MUA). They could each
implement this feature themselves, but getting it right is
non-trivial (one must canonicalize the identities by reversing any
RFC2047 encoding or RFC822 quoting of the headers, which has caused
many bugs in send-email over the years).
This patch takes a different approach: it teaches format-patch a
"--from" option which handles the ident check and in-body header
while it is writing out the email. It's much simpler to do at this
level (because we haven't done any quoting yet), and any workflow
based on format-patch can easily turn it on.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pretty.c: drop const-ness from pretty_print_context
In the current code, callers are expected to fill in the
pretty_print_context, and then the pretty.c functions simply
read from it. This leaves no room for the pretty.c functions
to communicate with each other by manipulating the context
(e.g., data seen while printing the header may impact how we
print the body).
Rather than introduce a new struct to hold modifiable data,
let's just drop the const-ness of the existing context
struct.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-remote-mediawiki: un-brace file handles in binmode calls
Commit e83d36b66fc turned "print STDOUT" into "print {*STDOUT}", as
suggested by perlcritic. Unfortunately, it also changed two "binmode
STDOUT" calls the same way, which does not work and yield a "Not a GLOB
reference" error.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-config: update doc for --get with multiple values
Since commit 00b347d (git-config: do not complain about duplicate
entries, 2012-10-23), "git config --get" does not exit with an error if
there are multiple values for the specified key but instead returns the
last value. Update the documentation to reflect this.
Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the --depth option to the add and update commands of "git submodule",
which is then passed on to the clone command. This is useful when the
submodule(s) are huge and you're not really interested in anything but
the latest commit.
Tests are added and some indention adjustments were made to conform to the
rest of the testfile on "submodule update can handle symbolic links in pwd".
Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com> Acked-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule update: allow custom command to update submodule working tree
Users can set submodule.$name.update to '!command' which will cause
'command' to be run instead of checkout/merge/rebase. This allows
the user finer-grained control over how the update is done.
The primary motivation for this was interoperability with stgit;
however being able to intercept the submodule update process may
prove useful for integrating with or extending other tools.
Signed-off-by: Chris Packham <judge.packham@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge: handle --ff/--no-ff/--ff-only as a tri-state option
These three options mean "favor fast-forwarding when possible,
without creating an unnecessary merge", "never fast-forward and
always create a merge commit even when the commit being merged is a
strict descendant", and "we do not want to create any merge commit;
update only when the merged commit is a strict descendant".
They are "pick one out of these three possibilities" options, and
correspond to "merge.ff" configuration that is tri-state (yes, no
and only).
However, the implementation did not follow the usual convention for
the command line options (later one wins, and command line overrides
what is in the configuration).
Fix this by consolidating two variables (fast_forward_only and
allow_fast_forward) used in the implementation into one enum that
can take one of the three possible values.
Signed-off-by: Miklos Vajna <vmiklos@suse.cz> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch-pack: avoid quadratic behavior in rev_list_push
When we call find_common to start finding common ancestors
with the remote side of a fetch, the first thing we do is
insert the tip of each ref into our rev_list linked list. We
keep the list sorted the whole time with
commit_list_insert_by_date, which means our insertion ends
up doing O(n^2) timestamp comparisons.
We could teach rev_list_push to use an unsorted list, and
then sort it once after we have added each ref. However, in
get_rev, we process the list by popping commits off the
front and adding parents back in timestamp-sorted order. So
that procedure would still operate on the large list.
Instead, we can replace the linked list with a heap-based
priority queue, which can do O(log n) insertion, making the
whole insertion procedure O(n log n).
As a result of switching to the prio_queue struct, we fix
two minor bugs:
1. When we "pop" a commit in get_rev, and when we clear
the rev_list in find_common, we do not take care to
free the "struct commit_list", and just leak its
memory. With the prio_queue implementation, the memory
management is handled for us.
2. In get_rev, we look at the head commit of the list,
possibly push its parents onto the list, and then "pop"
the front of the list off, assuming it is the same
element that we just peeked at. This is typically going
to be the case, but would not be in the face of clock
skew: the parents are inserted by date, and could
potentially be inserted at the head of the list if they
have a timestamp newer than their descendent. In this
case, we would accidentally pop the parent, and never
process it at all.
The new implementation pulls the commit off of the
queue as we examine it, and so does not suffer from
this problem.
With this patch, a fetch of a single commit into a
repository with 50,000 refs went from:
real 0m7.984s
user 0m7.852s
sys 0m0.120s
to:
real 0m2.017s
user 0m1.884s
sys 0m0.124s
Before this patch, a larger case with 370K refs still had
not completed after tens of minutes; with this patch, it
completes in about 12 seconds.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit.c: make compare_commits_by_commit_date global
This helper function was introduced as a prio_queue
comparator to help topological sorting. However, other users
of prio_queue who want to replace commit_list_insert_by_date
will want to use it, too. So let's make it public.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch-pack: avoid quadratic list insertion in mark_complete
We insert the commit pointed to by each ref one-by-one into
the "complete" commit_list using insert_by_date. Because
each insertion is O(n), we end up with O(n^2) behavior.
This typically doesn't matter, because the number of refs is
reasonably small. And even if there are a lot of refs, they
often point to a smaller set of objects (in which case the
optimization in commit ea5f220 keeps our "n" small).
However, in pathological repositories (hundreds of thousands
of refs, each pointing to a unique commit), this quadratic
behavior can make a difference. Since we do not care about
the list order until we have finished building it, we can
simply keep it unsorted during the insertion phase, then
sort it afterwards.
On a repository like the one described above, this dropped
the time to do a no-op fetch from 2.0s to 1.7s. On normal
repositories, it probably does not matter at all, but it
does not hurt to protect ourselves from pathological cases.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>