make shallow repository deepening more network efficient
First of all, I can't find any reason why thin pack generation is
explicitly disabled when dealing with a shallow repository. The
possible delta base objects are collected from the edge commits which
are always obtained through history walking with the same shallow refs
as the client. Therefore the client is always going to have those base
objects available. So let's remove that restriction.
Then we can make shallow repository deepening much more efficient by
using the remote's unshallowed commits as edge commits to get preferred
base objects for thin pack generation. On git.git, this makes the data
transfer for the deepening of a shallow repository from depth 1 to depth 2
around 134 KB instead of 3.68 MB.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If all refs sent by the remote repo during a fetch are reachable
locally, then no further conversation is performed with the remote. This
check is skipped when the --depth argument is provided to allow the
deepening of a shallow clone whose corresponding remote repo has not
changed.
However, some additional filtering was added in commit c29727d5 to
remove those refs which are equal on both sides. If the remote repo has
not changed, then the list of refs to give the remote process becomes
empty and simply attempting to deepen a shallow repo always fails.
Let's stop being smart in that case and simply send the whole list over
when that condition is met. The remote will do the right thing anyways.
Test cases for this issue are also provided.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change mentions of "git programs" to "git commands"
Most of the docs and printouts refer to "commands" when discussing what
the end users call via the "git" top-level program. We should refer to
them as "git programs" when we discuss the fact that the commands are
implemented as separate programs, but in other contexts, it is better to
use the term "git commands" consistently.
Signed-off-by: Ori Avtalion <ori@avtalion.name>
Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When making a histogram of delta chain length in the pack, the program
collects the number of objects whose delta depth exceeds the MAX_CHAIN
limit in histogram[0], and correctly shows it as the number of items that
exceed the limit. However, it also shows the same number labeled as
"chain length = 0".
In fact, we are not showing the number of objects whose chain length is
zero, i.e. the base objects. Correct this.
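The bookkeeping issue above can be sketched as follows. This is only an
illustration, not the program's actual code: the function name, the value
of MAX_CHAIN and the input (a plain list of delta depths) are all
assumptions. Over-limit objects and base objects are tallied separately so
that neither can be reported under the other's label.

```python
MAX_CHAIN = 50  # assumed limit, chosen only for illustration


def chain_histogram(depths, max_chain=MAX_CHAIN):
    """Tally delta chain lengths.

    Returns (histogram, over_limit, base_objects), where histogram[n]
    counts objects with chain length n for 1 <= n <= max_chain.
    """
    histogram = [0] * (max_chain + 1)
    over_limit = 0    # objects whose depth exceeds max_chain
    base_objects = 0  # objects with depth 0, i.e. not deltas at all
    for depth in depths:
        if depth == 0:
            base_objects += 1
        elif depth > max_chain:
            over_limit += 1
        else:
            histogram[depth] += 1
    return histogram, over_limit, base_objects
```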
Importing the popen2 module in Python-2.6 results in the
"DeprecationWarning: The popen2 module is deprecated. Use the
subprocess module." message. The module itself isn't used in fact, so
just removing it solves the problem.
Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Configuration values are expected to be quoted when they have leading or
trailing whitespace, but inner whitespace should be kept verbatim even if
the value is not quoted. This is already documented in git-config(1), but
the code caused inner whitespace to be collapsed to a single space,
breaking, for example, clones from a path that has two consecutive spaces
in it, as future fetches would only see a single space.
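A minimal sketch of the intended rule, assuming a much simplified value
syntax (no escapes, no comments) and a hypothetical function name; it is
not the parser in git's config code. Unquoted whitespace is held back
until the next real character shows that it was inner rather than
trailing, and it is then emitted verbatim instead of being collapsed.

```python
def parse_config_value(raw):
    """Parse the text after '=' on a config line (simplified sketch)."""
    out = []
    held = []          # unquoted whitespace not yet known to be inner
    started = False    # True once the value itself has begun
    in_quotes = False
    for ch in raw:
        if not in_quotes and ch in " \t":
            if started:
                held.append(ch)  # keep it verbatim, do not collapse
            continue             # leading whitespace is simply dropped
        # Any other character proves the held whitespace was inner.
        out.extend(held)
        held.clear()
        started = True
        if ch == '"':
            in_quotes = not in_quotes
        else:
            out.append(ch)
    # Whatever is still held here was trailing whitespace; discard it.
    return "".join(out)
```

With this rule a path such as "/tmp/two  spaces/repo" keeps both of its
consecutive spaces, while leading or trailing blanks survive only when
they are quoted.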
Reported-by: John te Bokkel <tanj.tanj@gmail.com>
Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
request-pull: allow ls-remote to notice remote.$nickname.uploadpack
The location to pull from should be converted from the configured nickname
to URL in the message, but ls-remote should be fed the nickname so that
the command uses remote.$nickname.* variables, most notably "uploadpack".
Signed-off-by: Tom Grennan <tgrennan@redback.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The second and third tests of this script expected that Russian strings
are converted between ISO-8859-5 and Shift_JIS in the "blame --porcelain"
format output correctly.
Sure, many platforms may convert between such a combination, but that is
only because one of the base character sets of Shift_JIS, JIS X 0208,
defines codepoints for Russian characters (among others); I do not think
anybody uses Shift_JIS when seriously writing Russian, and it is perfectly
understandable if iconv() libraries on some platforms fail converting
between this combination, as it does not matter in reality.
This patch changes the test to verify Japanese strings are converted
correctly between EUC-JP and Shift_JIS in the same procedure. The point
of the test is not about verifying the platform's iconv() library, but to
see if "git blame" makes correct iconv() library calls when it should.
We could instead use ISO-8859-5 and KOI8-R as the combination, because
they are both meant to represent Russian, in order to make this test
meaningful on more platforms, but we already use Shift_JIS vs EUC-JP
combinations to test other programs in our test suite, so this combination
is safer from the point of view of portability. Besides, I do not
read nor write Russian; sorry ;-)
This change allows tests to pass on my (friend's) Solaris 5.11 box.
The first "grep -C1" test in t7002 does not pass on my SunOS-5.11-i86pc,
and that is not because our way to spawn external grep is broken, but
because the native grep does not understand -C<n>.
It turns out that Peff was also using this option himself because our
Makefile doesn't do that automatically. Brandon Casey uses SUNWspro
compiler without having to set this, and it turns out that the compiler
does not define preprocessor macro __unix__ which made him always use the
built-in grep, never an external one.
Let's be more explicit and say that we do not use external grep on Suns.
Make the 'show detached branch info' a routine of its own. And in the
process, avoid the object lookup that is unnecessary if the current
branch isn't detached.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git branch' looks at _all_ the refs, and verifies them. Which means that
during cold-cache situations with a slow disk (and lots of tags, for
example) it can take several very annoying seconds (7.5s according to a
report by Carlos R. Mafra).
This avoids most of it by simply doing the filtering before looking up
the commits, by using the "raw" version of for_each_ref.
Reported-by: Carlos R. Mafra <crmafra2@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
do_one_ref(): null_sha1 check is not about broken ref
f8948e2 (remote prune: warn dangling symrefs, 2009-02-08) introduced a
more dangerous variant of for_each_ref() family that skips the check for
dangling refs, but it also made another unrelated check optional by
mistake.
The check to see if a ref points at 0{40} is not about brokenness, but is
about a possible future plan to represent a deleted ref by writing 40 "0"
in a loose ref when there is a stale version of the same ref already in
.git/packed-refs, so that we can implement deletion of a ref without
having to rewrite the packed refs file excluding the ref being deleted.
This check has to live outside of the conditional.
If a patch adds a new line to the end of a file and this line ends with
one trailing whitespace character and has no newline, then
'--whitespace=fix' currently does not remove that trailing whitespace.
This patch fixes this by removing the check for trailing whitespace at
the end of the line at a hardcoded offset which does not take the
eventual absence of newline into account.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff --cc: a lost line at the beginning of the file is shown incorrectly
When combine-diff inspected the diff from one parent to the merge result,
it misinterpreted a header in the form @@ -l,k +0,0 @@.
This hunk header means that k lines were removed from the beginning of the
file, so the lost lines must be queued to the sline that represents the
first line of the merge result, but we incremented our pointer incorrectly
and ended up queuing it to the second line, which in turn made the lossage
appear _after_ the first line.
combine-diff.c: fix performance problem when folding common deleted lines
For a deleted line in a patch with the parent we are looking at, the
append_lost() function finds the same line among a run of lines that were
deleted from the same location by patches from parents we previously
checked. This is so that patches with two parents
    @@ -1,4 +1,3 @@        @@ -1,4 +1,3 @@
     one                    one
    -two                   -two
     three                  three
    -quatro                -fyra
    +four                  +four
can be coalesced into this sequence, reusing one line that describes the
removal of "two" for both parents.
    @@@ -1,4 -1,4 +1,3 @@@
      one
    --two
      three
    - quatro
     -fyra
    ++four
While reading the second patch (that removes "two" and then "fyra"), after
finding where removal of the "two" matches, we need to find existing
removal of "fyra" (if it exists) in the removal list, but the match has to
happen after all the existing matches (in this case "two"). The code used
a naïve O(n^2) algorithm to compute this by scanning the whole removal
list over and over again.
This patch remembers where the next scan should be started in the existing
removal list to avoid this.
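The idea of remembering where the next scan should start can be sketched
like this. The data layout (a list of [text, set-of-parents] pairs) and
the function name are assumptions made for illustration; the real
append_lost() works on linked lists in C. The cursor only ever moves
forward, so entries that were already matched are never rescanned.

```python
def fold_lost_lines(existing, new_lines, parent):
    """Fold the lines removed by `parent` into the existing removal list.

    existing:  list of [text, set_of_parents], in order of appearance
    new_lines: texts removed by `parent`, in order of appearance
    """
    cursor = 0  # where the next scan starts; never moves backwards
    for text in new_lines:
        for i in range(cursor, len(existing)):
            if existing[i][0] == text:
                existing[i][1].add(parent)  # reuse the existing entry
                cursor = i + 1              # later matches must come after
                break
        else:
            existing.append([text, {parent}])
            cursor = len(existing)
    return existing
```

Feeding it the second parent from the example above, that is
fold_lost_lines([["two", {0}], ["quatro", {0}]], ["two", "fyra"], 1),
marks "two" as lost from both parents and appends "fyra" after "quatro".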
checkout -f: deal with a D/F conflict entry correctly
When we switch branches with "checkout -f", unpack_trees() feeds two
cache_entries to oneway_merge() function in its src[] array argument. The
zeroth entry comes from the current index, and the first entry represents
what the merge result should be, taken from the tree recorded in the
commit we are switching to.
When we have a blob (either regular file or a symlink) in the index and in
the work tree at path "foo", and the switched-to tree has "foo/bar",
i.e. "foo" becomes a directory, src[0] is obviously that blob currently
registered at "foo". Even though we do not have anything at "foo" in the
switched-to tree, src[1] is _not_ NULL in this case.
The unpack_trees() machinery places a special marker df_conflict_entry
to signal that no blob exists at "foo", but it will become a directory
that may have something underneath it (namely "foo/bar"), so a usual 3-way
merge can notice the situation.
But oneway_merge() codepath failed to notice this and passed the special
marker directly to merged_entry(). This happens to remove the "foo" in
the end because the df_conflict_entry does not have any name (hence the
"error" message) and its addition in add_index_entry() is rejected, but it
is wrong.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
When we fall back to a standard for_each_reflog_ent() after failing to find
the nth branch switch (or if we had a short reflog) with the call to
for_each_recent_reflog_ent(), we do not need to free the memory allocated
for our strbuf's since a strbuf_reset() will be performed in
grab_nth_branch_switch() before assigning to the entry.
Plus, the strbuf_release() negates the non-zero hint we initially gave to
strbuf_init() just above these lines.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier 476cc72 (request-pull: really disable pager, 2009-06-30)
tried to use the correct environment variable to disable paging
from multiple calls to "git log" and friends, but there was one
extra call to "git log" that was not covered by the trick.
Move the setting and exporting of GIT_PAGER much earlier in the
script to cover everybody.
ff06c74 (Improve request-pull to handle non-rebased branches, 2007-05-01)
attempted to disable pager when running subcommands in this script, but
with a wrong variable. If GIT_PAGER is set, it takes precedence over
PAGER.
There are a few small, unrelated cleanups: fix some missing quotes, fix
what seemed to be an unfinished sentence, reindent a short paragraph
containing an overly long sentence, and fix a branch name that was later
referred to twice by another name.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When combining "dumb client" and human-friendly access by using the
'.git' extension to switch between the two, make sure the AliasMatch
covers the entire request. Without a full match, a request for
git-remote: fix missing .uploadpack usage for show command
Users pulling from machines with self-compiled git installs in
non-PATH locations can set the config option
remote.<name>.uploadpack to the location of git-upload-pack.
When using 'git remote show <name>', the remote HEAD check
did not use the uploadpack configuration setting, and would
not use the configured program.
In builtin-remote.c, the config setting is already loaded
with the call to remote_get(), so this patch passes that remote
along to transport_get().
Signed-off-by: Chris Frey <cdfrey@foursquare.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
attribute: whitespace set to true detects all errors known to git
That is what the documentation says, but the code pretends as if all the
known whitespace error tokens were given.
Among the whitespace error tokens, there is one kind that loosens the rule
when set: cr-at-eol. Which means that a whitespace attribute that is set
to true ignores a newly introduced CR at the end, which is inconsistent
with the documentation.
I think this is because the "whitespace" attribute is set to *.[ch] files
without specifying what kind of errors are caught. It makes git "notice
all types of errors" (as described in the documentation), but I think it
is incorrectly setting cr-at-eol, too, and hides this error.
Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test wanted to make sure that cherry-pick exits with status 1,
but the way it was placed after "git checkout master &&" meant
that it could have misjudged success if checkout barfed with the
same failure status.
* maint-1.6.2:
git-show-ref.txt: remove word and make consistent
git-svn documentation: fix typo in 'rebase vs. pull/merge' section
use xstrdup, not strdup in ll-merge.c
* maint-1.6.1:
git-show-ref.txt: remove word and make consistent
git-svn documentation: fix typo in 'rebase vs. pull/merge' section
use xstrdup, not strdup in ll-merge.c
* maint-1.6.0:
git-show-ref.txt: remove word and make consistent
git-svn documentation: fix typo in 'rebase vs. pull/merge' section
use xstrdup, not strdup in ll-merge.c
Change the minimum required libcurl version for the http.sslKey option
to 7.9.3. Previously, preprocessor macros checked for >= 7.9.2, which
is incorrect because CURLOPT_SSLKEY was introduced in 7.9.3. This now
allows git to compile with libcurl 7.9.2.
Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Cygwin, poll() reports POLLIN even for file descriptors that have
reached their end. This caused git upload-archive to be stuck in an
infinite loop, as it only looked at the POLLIN flag.
In addition to POLLIN, check if read() returned 0, which indicates
end-of-file, and keep looping only as long as at least one of the file
descriptors has input. This lets the following command finish on its
own when run in a git repository on Cygwin, instead of it getting stuck
after printing all file names:
$ git archive -v --remote . HEAD >/dev/null
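The loop condition described above can be sketched as follows. This is an
illustration in Python rather than the C code of upload-archive, and the
function name is hypothetical. It relies on select.poll(), which is not
available on every platform. The point is that the loop ends for a
descriptor when read() returns no bytes, regardless of which flags poll()
reported.

```python
import os
import select


def drain(fds):
    """Read from every fd until each reaches end-of-file.

    Returns a dict mapping each fd to the bytes read from it.
    """
    poller = select.poll()
    for fd in fds:
        poller.register(fd, select.POLLIN)
    data = {fd: b"" for fd in fds}
    open_fds = set(fds)
    while open_fds:  # keep looping only while some fd may still have input
        for fd, _events in poller.poll():
            chunk = os.read(fd, 4096)
            if chunk:
                data[fd] += chunk
            else:
                # read() returned 0 bytes: end-of-file, whatever the flags say
                poller.unregister(fd)
                open_fds.discard(fd)
    return data
```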
Reported-by: Bob Kagy <bobkagy@gmail.com>
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
you can backtrack arbitrarily from [A-Za-z_0-9]* into [A-Za-z_], thus
causing an exponential number of backtracks. Ironically it also causes
the regex not to work as intended; for example "catch" can match the
underlined part of the regex, the first repetition matching "c" and
the second matching "atch".
The replacement regex avoids this problem, because it makes sure that
at least a space/tab is eaten on each repetition. In other words,
a suffix of a repetition can never be a prefix of the next repetition.
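The ambiguity can be made concrete by counting in how many ways a string
splits into consecutive repetitions of a token. The two patterns below are
hypothetical stand-ins built from the character classes mentioned here,
not the regexes changed by this patch, and the counting function is only
an illustration of why a backtracking matcher has so many paths to try.

```python
import re

# Hypothetical stand-ins for one repetition of the old and new regex.
AMBIGUOUS_TOKEN = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")
SEPARATED_TOKEN = re.compile(r"[A-Za-z_][A-Za-z_0-9]*[ \t]+")


def count_splits(token, text):
    """Count the ways `text` can be cut into consecutive `token` matches."""
    ways = [1] + [0] * len(text)  # ways[i]: number of ways to cover text[:i]
    for end in range(1, len(text) + 1):
        for start in range(end):
            if ways[start] and token.fullmatch(text, start, end):
                ways[end] += ways[start]
    return ways[len(text)]
```

For "catch" the ambiguous token allows 16 different splits ("c" + "atch"
being one of them), and the count doubles with every extra character.
When each repetition must end in a space or tab, "unsigned long " splits
in exactly one way.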
Signed-off-by: Paolo Bonzini <bonzini@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shifting 'unsigned char' or 'unsigned short' left can result in sign
extension errors, since the C integer promotion rules mean that the
unsigned char/short will get implicitly promoted to a signed 'int' due to
the shift (or due to other operations).
This normally doesn't matter, but if you shift things up sufficiently, it
will now set the sign bit in 'int', and a subsequent cast to a bigger type
(eg 'long' or 'unsigned long') will now sign-extend the value despite the
original expression being unsigned.
One example of this would be something like
unsigned long size;
unsigned char c;
size += c << 24;
where despite all the variables being unsigned, 'c << 24' ends up being a
signed entity, and will get sign-extended when then doing the addition in
an 'unsigned long' type.
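The arithmetic can be traced with a small emulation. It assumes a 32-bit
'int' and a 64-bit 'unsigned long', and it models the usual
two's-complement outcome of the shift (strictly speaking, C leaves a shift
into the sign bit of a signed int undefined). The helper names are made up
for this sketch.

```python
INT_BITS = 32    # assumed width of C 'int'
LONG_BITS = 64   # assumed width of C 'unsigned long'


def as_signed_int(value):
    """Interpret the low INT_BITS bits as a two's-complement int."""
    value &= (1 << INT_BITS) - 1
    if value >> (INT_BITS - 1):
        return value - (1 << INT_BITS)
    return value


def as_unsigned_long(value):
    """Convert to unsigned long as C does: reduce modulo 2**LONG_BITS."""
    return value & ((1 << LONG_BITS) - 1)


def buggy_add(size, c):
    # Models "size += c << 24;" where c was promoted to a signed int.
    return as_unsigned_long(size + as_signed_int(c << 24))


def fixed_add(size, c):
    # Models "size += (unsigned long)c << 24;" (widen first, then shift).
    return as_unsigned_long(size + (as_unsigned_long(c) << 24))
```

With c = 0x80 the buggy form yields 0xffffffff80000000 while the widened
form yields 0x80000000; for c below 0x80 the two agree, which is why the
problem normally goes unnoticed.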
Since git uses 'unsigned char' pointers extensively, we actually have this
bug in a couple of places.
I may have missed some, but this is the result of looking at
which catches at least the common byte cases (shifting variables by a
variable amount, and shifting by 24 bits).
I also grepped for just 'unsigned char' variables in general, and
converted the ones that most obviously ended up getting implicitly cast
immediately anyway (eg hash_name(), encode_85()).
In addition to just avoiding 'unsigned char', this patch also tries to use
a common idiom for the delta header size thing. We had three different
variations on it: "& 0x7fUL" in one place (getting the sign extension
right), and "& ~0x80" and "& 0x7f" in two other places (not getting it
right). Apart from making them all just avoid using "unsigned char" at
all, I also unified them to then use a simple "& 0x7f".
I considered making a sparse extension which warns about doing implicit
casts from unsigned types to signed types, but it gets rather complex very
quickly, so this is just a hack.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite the gc section using unresolved and resolved instead of "not
recorded". Add plurals and missing articles. Make some sentences have
consistent tense. Try and be more active by removing "that" and
simplifying sentences.
The terms "hand-resolve" and "hand resolve" were used, so just use "hand
resolve" to be more consistent.
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: git-send-email can take rev-list arg to drive format-patch
The git-send-email docs do not mention, except in the usage lines,
the combined patch formatting/sending ability of git-send-email.
This patch expands on the possible arguments to git-send-email
and explains the meaning of the rev-list argument.
Signed-off-by: Paolo Bonzini <bonzini@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch-pack: close output channel after sideband demultiplexer terminates
fetch-pack runs the sideband demultiplexer using start_async(). This
facility requires that the asynchronously executed function closes the
output file descriptor (see Documentation/technical/api-run-command.txt).
But the sideband demultiplexer did not do that. This fixes it.
In certain error situations this could lock up a fetch operation on
Windows because the asynchronous function is run in a thread; by not
closing the output fd the reading end never got EOF and waited for more
data indefinitely. On Unix this is not a problem because the asynchronous
function is run in a separate process, which exits after the function ends
and so implicitly closes the output.
Since the pack that is sent over the wire encodes the number of objects in
the stream, during normal operation the reading end knows when the stream
ends and terminates by itself, and does not lock up.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the way in which the configure script handles --without-iconv
(and --with-iconv=no), which it used to essentially ignore.
Also fix the way the configure script determines the value of
NEEDS_LIBICONV, which would be incorrectly set to 'YesPlease' on
systems that lack iconv entirely.
Signed-off-by: Marco Nelissen <marcone@xs4all.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: refer to gitworkflows(7) from tutorial and git(1)
Add references to the gitworkflows(7) manpage added in f948dd8
(Documentation: add manpage about workflows, 2008-10-19) to both
gittutorial(1) and git(1), so that new users might actually discover
and read it.
Noticed by Randal L. Schwartz.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
daemon: Strictly parse the "extra arg" part of the command
Since 1.4.4.5 (49ba83fb67 "Add virtualization support to git-daemon")
git daemon enters an infinite loop and never terminates if a client
hides any extra arguments in the initial request line which is not
exactly "\0host=blah\0".
Since that change, a client must never insert additional extra
arguments, or attempt to use any argument other than "host=", as
any daemon will get stuck parsing the request line and will never
complete the request.
Since the client can't tell if the daemon is patched or not, it
is not possible to know if additional extra args might actually be
able to be safely requested.
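What strict parsing of the extra arguments might look like is sketched
below. It is not the code in daemon.c: the function name is hypothetical,
and rejecting anything other than "host=" (rather than skipping it) is an
assumption of this sketch. The property that matters is that every
iteration consumes at least one byte, so the loop cannot spin forever.

```python
def parse_extra_args(extra):
    """Parse the bytes that follow the first NUL of the request line.

    Returns the host name as a string, or None if no host was given.
    Raises ValueError for anything that is not "host=<name>\\0".
    """
    host = None
    pos = 0
    while pos < len(extra):
        end = extra.find(b"\0", pos)
        if end == -1:
            raise ValueError("extra argument is not NUL-terminated")
        arg = extra[pos:end]
        if not arg.startswith(b"host="):
            raise ValueError("unsupported extra argument")
        host = arg[len(b"host="):].decode("ascii")
        pos = end + 1  # always advances, so the loop terminates
    return host
```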
If we ever need to extend the git daemon protocol to support a new
feature, we may have to do something like this to the exchange:
# If both support git:// v2
#
C: 000cgit://v2
S: 0010ok host user
C: 0018host git.kernel.org
C: 0027git-upload-pack /pub/linux-2.6.git
S: ...git-upload-pack header...
# If client supports git:// v2, server does not:
#
C: 000cgit://v2
S: <EOF>
This requires the client to create two TCP connections to talk to
an older git daemon; however, all daemons since the introduction of
daemon.c will safely reject the unknown "git://v2" command request,
so the client can quite easily determine that the server only supports
an older protocol.
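The four hex digits in front of each line of the exchange above are the
usual pkt-line length prefix, which counts the payload plus the four
digits themselves. A minimal sketch of that framing follows; the function
names are made up, and the "git://v2" request itself remains the
hypothetical extension discussed in this message.

```python
def pkt_line(payload):
    """Frame `payload` (bytes) with a four-hex-digit length prefix."""
    length = len(payload) + 4  # the prefix counts itself
    if length > 0xFFFF:
        raise ValueError("payload does not fit in a four-digit length")
    return b"%04x" % length + payload


def read_pkt_line(data):
    """Split one framed line off the front of `data`.

    Returns (payload, rest).
    """
    length = int(data[:4].decode("ascii"), 16)
    if length < 4 or length > len(data):
        raise ValueError("bad pkt-line length")
    return data[4:length], data[length:]
```

For example, pkt_line(b"git://v2") gives b"000cgit://v2", matching the
first line the client sends in the exchange.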
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Set slot->local to NULL after doing a fclose() on the file it points
to. This prevents the passing of a FILE* pointer to a fclose()'d file
to ftell() in http.c::run_active_slot().
This issue was raised by Clemens Buchacher on 30th May 2009:
http://www.spinics.net/lists/git/msg104623.html
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command "git grep -w ''" dies as soon as it encounters an empty line,
reporting (wrongly) that "regexp returned nonsense". The first hunk of
this patch relaxes the sanity check that is responsible for that,
allowing matches to start at the end.
The second hunk complements it by making sure that empty matches are
rejected if -w was specified, as they are not really words.
blame: correctly handle a path that used to be a directory
When trying to see if the same path exists in the parent, we ran
"diff-tree" with pathspec set to the path we are interested in with the
parent, and expect either to have exactly one resulting filepair (either
"changed from the parent", "created when there was none") or nothing (when
there is no change from the parent).
If the path used to be a directory, however, we will also see an unbounded
number of entries that talk about the files that used to exist underneath
the directory in question. Correctly pick only the entry that describes
the path we are interested in in such a case (namely, the creation of the
path as a regular file).
Merge branch 'cb/maint-1.6.0-xdl-merge-fix' into maint
* cb/maint-1.6.0-xdl-merge-fix:
Change xdl_merge to generate output even for null merges
t6023: merge-file fails to output anything for a degenerate merge
Merge branch 'jc/maint-add-p-coalesce-fix' into maint
* jc/maint-add-p-coalesce-fix:
t3701: ensure correctly set up repository after skipped tests
Revert "git-add--interactive: remove hunk coalescing"
Splitting a hunk that adds a line at the top fails in "add -p"
If a zero-length match is encountered, break out of loop and show the rest
of the line uncoloured. Otherwise we'd be looping forever, trying to make
progress by advancing the pointer by zero characters.
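The guard can be sketched as follows, using Python's re module in place of
the C regex calls and square brackets in place of colour escapes; the
function name and its parameters are assumptions for illustration.

```python
import re


def highlight(line, pattern, start="[", end="]"):
    """Wrap every non-empty match of `pattern` in start/end markers."""
    regex = re.compile(pattern)
    out = []
    pos = 0
    while pos < len(line):
        m = regex.search(line, pos)
        if m is None or m.end() == m.start():
            # No match, or a zero-length match that would not advance
            # `pos`: stop and emit the rest of the line unmarked.
            break
        out.append(line[pos:m.start()])
        out.append(start + m.group() + end)
        pos = m.end()
    out.append(line[pos:])
    return "".join(out)
```

A pattern such as "x*" matches the empty string at the start of "abc", so
the loop stops at once and the line comes back unchanged.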
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following is an easy mistake to make for users coming from version
control systems with an "update and commit"-style workflow.
1. git pull
2. resolve conflicts
3. git pull
Step 3 overrides MERGE_HEAD, starting a new merge with a dirty index.
IOW, probably not what the user intended. Instead, refuse to merge
again if a merge is in progress.
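The check amounts to looking for MERGE_HEAD before starting. The sketch
below is an illustration in Python, not the actual merge code; the
function names and the error text are made up, and writing MERGE_HEAD
stands in for everything a real merge does.

```python
import os


def merge_in_progress(git_dir):
    """A merge is in progress if MERGE_HEAD exists in the git directory."""
    return os.path.exists(os.path.join(git_dir, "MERGE_HEAD"))


def start_merge(git_dir, commit_id):
    """Record the commit being merged, unless a merge is already under way."""
    if merge_in_progress(git_dir):
        # Hypothetical message; refuse rather than overwrite MERGE_HEAD.
        raise RuntimeError("a merge is already in progress")
    with open(os.path.join(git_dir, "MERGE_HEAD"), "w") as f:
        f.write(commit_id + "\n")
```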
Reported-by: Dave Olszewski <cxreg@pobox.com>
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
for-each-ref: Do not lookup objects when they will not be used
This makes commands such as `git for-each-ref --format='%(refname)'`,
which are used heavily by the bash_completion code, run about 6 times
faster on an uncached repository (3 s instead of 18 s on my linux-2.6
repository with several remotes).
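The optimisation can be sketched as: inspect the format string first, and
load objects only if some requested atom needs them. The code below is an
illustration, not the for-each-ref implementation. The function names and
the load_object callback are hypothetical, and treating %(refname) as the
only atom that needs no object is a simplification.

```python
import re

ATOM = re.compile(r"%\(([^)]+)\)")


def needs_object(fmt):
    """True if the format uses any atom other than %(refname)."""
    return any(atom != "refname" for atom in ATOM.findall(fmt))


def format_refs(refs, fmt, load_object):
    """Format each ref; call load_object(oid) only when it is needed.

    refs:        dict mapping ref name to object id
    load_object: callable returning a dict of extra fields for an oid
    """
    lookup = needs_object(fmt)  # decided once, before touching any object
    lines = []
    for name, oid in sorted(refs.items()):
        fields = {"refname": name}
        if lookup:
            fields.update(load_object(oid))
        lines.append(ATOM.sub(lambda m: str(fields.get(m.group(1), "")), fmt))
    return lines
```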
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>