merge-recursive: don't detect renames of empty files
Merge-recursive detects renames so that if one side modifies
"foo" and the other side moves it to "bar", the modification
is applied to "bar". However, our rename detection is based
on content analysis, it can be wrong (i.e., two files were
not intended as a rename, but just happen to have the same
or similar content).
This is quite rare if the files actually contain content,
since two unrelated files are unlikely to have exactly the
same content. However, empty files present a problem, in
that there is nothing to analyze. An uninteresting
placeholder file with zero bytes may or may not be related
to a placeholder file with another name.
The result is that adding content to an empty file may cause
confusion if the other side of a merge removed it; your
content may end up in another random placeholder file that
was added.
Let's err on the side of caution and not consider empty
files as renames. This will cause a modify/delete conflict
on the merge, which will let the user sort it out
themselves.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
teach diffcore-rename to optionally ignore empty content
Our rename detection is a heuristic, matching pairs of
removed and added files with similar or identical content.
It's unlikely to be wrong when there is actual content to
compare, and we already take care not to do inexact rename
detection when there is not enough content to produce good
results.
However, we always do exact rename detection, even when the
blob is tiny or empty. It's easy to get false positives with
an empty blob, simply because it is an obvious content to
use as a boilerplate (e.g., when telling git that an empty
directory is worth tracking via an empty .gitignore).
This patch lets callers specify whether or not they are
interested in using empty files as rename sources and
destinations. The default is "yes", keeping the original
behavior. It works by detecting the empty-blob sha1 for
rename sources and destinations.
One more flexible alternative would be to allow the caller
to specify a minimum size for a blob to be "interesting" for
rename detection. But that would catch small boilerplate
files, not large ones (e.g., if you had the GPL COPYING file
in many directories).
A better alternative would be to allow a "-rename"
gitattribute to allow boilerplate files to be marked as
such. I'll leave the complexity of that solution until such
time as somebody actually wants it. The complaints we've
seen so far revolve around empty files, so let's start with
the simple thing.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The read-cache implementation defines this static function,
but it is a generally useful concept in git. Let's give
the empty blob the same treatment as the empty tree,
providing both hex and binary forms of the sha1.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This macro already evaluates to the correct type, as it
casts the string literal to "unsigned char *" itself
(and callers who want the literal can use the _LITERAL
form).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
perl/Makefile: install Git::I18N under NO_PERL_MAKEMAKER
When I added the i18n infrastructure in v1.7.8-rc2-1-g5e9637c I forgot
to install Git::I18N also when NO_PERL_MAKEMAKER=YesPlease was
set. Change the generation of the fallback perl.mak file to do that.
Now Git/I18N.pm is installed alongside Git.pm in such a way that
anything that uses GITPERLLIB will find it.
Reported-by: Tom G. Christensen <tgc@statsbiblioteket.dk> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to validate the history connectivity between old refs and new
refs used by fetch and receive-pack, introduced in 1.7.8, was grossly
inefficient and unnecessarily tried to re-validate integrity of individual
objects. This essentially reverts that performance regression.
Merge "two fixes for fast-import's 'ls' command" from Jonathan
Andrew Sayers noticed that the svn-fe | git fast-import pipeline
mishandles a subversion history that copies the root directory to a
sub-directory (e.g. doing `svn cp . trunk` to standardise your
layout). As David Barr explained, the bug arises when the following
command is sent to git fast-import:
'ls' SP ':1' SP LF
Instead of reading back what is at the root of r1, it unconditionally
reports the path as missing.
After sleeping on it, here are two patches for 'maint'. One plugs a
memory leak. The other ensures that trying to pass an empty path to
the 'ls' command results in an error message that can help the
frontend author instead of the silently broken conversion Andrew
found.
Then we can carefully add 'ls ""' support in 1.7.11.
* commit 'refs/pull-request-tags/jn/maint-fast-import-empty-ls':
fast-import: don't allow 'ls' of path with empty components
fast-import: leakfix for 'ls' of dirty trees
* th/git-diffall:
contrib/diffall: fix cleanup trap on Windows
contrib/diffall: eliminate duplicate while loops
contrib/diffall: eliminate use of tar
contrib/diffall: create tmp dirs without mktemp
contrib/diffall: comment actual reason for 'cdup'
Git 1.7.8 introduced an object and history re-validation step after
"fetch" or "push" causes new history to be added to a receiving
repository. This is to protect a malicious server or pushing client from
corrupting the repository by taking advantage of an existing corrupt
object that is unconnected to existing history.
But this check is way over-pessimistic. During "fetch" or "receive-pack"
(the server side of "push"), unpack-objects and index-pack already
validate individual objects that are received, and the only thing we would
want to catch are corrupted objects that already happen to exist in our
repository but are not referenced from our refs. Such objects must have
been written by an earlier run of our codepaths that write out loose
objects or packfiles, and they must have done the validation of individual
objects when they did so. The only thing left to worry about is the
connectivity integrity, which can be checked with "rev-list --objects",
which is much cheaper. We have been paying the 5x to 8x runtime overhead
the --verify-objects often adds for no real gain.
Revert check_everything_connected() not to use this over-pessimistic
check.
Credit goes to Nguyễn Thái Ngọc Duy, who originally identified the
performance regression and endured multiple rounds of reviews to fix it.
Prior to this commit, the cleanup trap that removes the tmp dir
created by the script would fail on Windows. The error was silently
ignored by the script.
On Windows, a directory cannot be removed while it is the working
directory of the process (thanks to Johannes Sixt on the Git list
for this info [1]).
This commit eliminates the 'cd' into the tmp directory that caused
the error.
The 'tar' utility is not available on all platforms (some only support
'gnutar'). An earlier commit created a work-around for this problem,
but a better solution is to eliminate the use of 'tar' completely.
Signed-off-by: Tim Henigan <tim.henigan@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was unclear what a test in t0204 wanted to check; it turns out
that it was only to observe an undefined behaviour of the system,
and did not anticipate one kind of reasonable error behaviour.
* jc/maint-undefined-i18n-observation-test:
t0204: clarify the "observe undefined behaviour" test
When "git config" diagnoses an error in a configuration file and
shows the line number for the offending line, it miscounted if the
error was at the end of line.
By Martin Stenberg
* ms/maint-config-error-at-eol-linecount:
config: report errors at the EOL with correct line number
We have had these options as harmless no-op for more than 3 years without
officially deprecating them. Let's announce the deprecation and start
warning against their use, but without failing the command just not yet,
so that we can later repurpose the option if we want to in the future.
Merge branch 'tr/maint-bundle-boundary' into maint
"git bundle" did not record boundary commits correctly when there
are many of them.
By Thomas Rast
* tr/maint-bundle-boundary:
bundle: keep around names passed to add_pending_object()
t5510: ensure we stay in the toplevel test dir
t5510: refactor bundle->pack conversion
Merge branch 'jc/maint-diff-patch-header' into maint
"git diff-index" and its friends at the plumbing level showed the
"diff --git" header and nothing else for a path whose cached stat
info is dirty without actual difference when asked to produce a
patch. This was a longstanding bug that we could have fixed long
time ago.
By Junio C Hamano
* jc/maint-diff-patch-header:
diff -p: squelch "diff --git" header for stat-dirty paths
t4011: illustrate "diff-index -p" on stat-dirty paths
t4011: modernise style
Merge branch 'jn/maint-do-not-match-with-unsanitized-searchtext' into maint
"gitweb" did use quotemeta() to prepare search string when asked to
do a fixed-string project search, but did not use it by mistake and
used the user-supplied string instead.
By Jakub Narebski
* jn/maint-do-not-match-with-unsanitized-searchtext:
gitweb: Fix fixed string (non-regexp) project search
Merge branch 'jc/am-3-nonstandard-popt' into maint
The code to synthesize the fake ancestor tree used by 3-way merge
fallback in "git am" was not prepared to read a patch created with
a non-standard -p<num> value.
* jc/am-3-nonstandard-popt:
test: "am -3" can accept non-standard -p<num>
am -3: allow nonstandard -p<num> option
The --binary option to git-apply has been a no-op since 2b6eef9 (Make
apply --binary a no-op., 2006-09-06) and was deprecated in cb3a160
(git-am: ignore --binary option, 2008-08-09).
We could remove it outright, but let's be nice to people who still
have scripts saying 'git am -b' (if they exist) and tell them the
reason for the sudden failure.
Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
i18n: fix auto detection of gettext scheme for shell scripts
A new code added by ad17ea7 (add a Makefile switch to avoid gettext
translation in shell scripts, 2012-01-23) tried to optionally force
a gettext scheme to "fallthrough", but ended up forcing it to everybody.
config: report errors at the EOL with correct line number
A section in a config file with a missing "]" reports the next line
as bad, same goes to a value with a missing end quote.
This happens because the error is not detected until the end of the
line, when line number is already increased. Fix this by decreasing
line number by one for these cases.
Signed-off-by: Martin Stenberg <martin@gnutiken.se> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Updates to localized messages for zn_CN and sv locales.
via Jiang Xin
* https://github.com/git-l10n/git-po:
l10n: Improve zh_CN translation for msg "not something we can merge"
l10n: Improve zh_CN trans for msg that cannot fast-forward
l10n: Update zh_CN translation for 1.7.10-rc0
Update Swedish translation (732t0f0u).
po/sv.po: add Swedish translation
l10n: Update git.pot (1 new message)
l10n: Update zh_CN translation for 1.7.9.2
l10n: Improve commit msg for zh_CN translation
l10n: Improve zh_CN translation for msg that make empty commit when amend.
l10n: Improve zh_CN translation for empty cherry-pick msg.
l10n: Improve zh_CN translation for msg about branch deletion deny
l10n: Improve zh_CN translation for lines insertion and deletion.
Change the Exporter invocation in Git::I18N to be compatible with
5.8.0 to 5.8.2 inclusive. Before Exporter 5.57 (released with 5.8.3)
Exporter didn't export the 'import' subroutine.
Reported-by: Tom G. Christensen <tgc@statsbiblioteket.dk> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import: don't allow 'ls' of path with empty components
As the fast-import manual explains:
The value of <path> must be in canonical form. That is it must
not:
. contain an empty directory component (e.g. foo//bar is invalid),
. end with a directory separator (e.g. foo/ is invalid),
. start with a directory separator (e.g. /foo is invalid),
Unfortunately the "ls" command accepts these invalid syntaxes and
responds by declaring that the indicated path is missing. This is too
subtle and causes importers to silently misbehave; better to error out
so the operator knows what's happening.
The C, R, and M commands already error out for such paths.
Reported-by: Andrew Sayers <andrew-git@pileofstuff.org> Analysis-by: David Barr <davidbarr@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
When the chosen directory has changed since it was last written to
pack, "tree_content_get" makes a deep copy of its content to scribble
on while computing the tree name, which we forgot to free.
This leak has been present since the 'ls' command was introduced in
v1.7.5-rc0~3^2~33 (fast-import: add 'ls' command, 2010-12-02).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
t0204: clarify the "observe undefined behaviour" test
This test asks for an impossible conversion to the system by
preparing an UTF-8 translation with characters that cannot be
expressed in ISO-8859-1, and then asking the message shown in
ISO-8859-1. Even though the behaviour against such a request is
undefined, it may be interesting to see what the system does, and
the purpose of this test is to see if there are platforms that
exhibit behaviour that we haven't seen.
The original recognized two known modes of behaviour:
- the key used to query the message catalog ("TEST: Old English
Runes"), saying "I cannot do that i18n".
- impossible characters replaced with ASCII "?", saying "I punt".
but they were treated totally differently. The test simply issued
an informational message "Your system punts on this one" for the
first error mode, while it diagnosed the latter as "Your system is
good; you pass!".
It turns out that Mac OS X exhibits a third mode of error behaviour,
to spew out the raw value stored in the message catalog. The test
diagnosed this behaviour as "broken", but it is merely trying to do
its best to respond to an impossible request by saying "I punt" in a
way that is slightly different from the second one.
Update the offending test to make it clear what is (and is not)
being tested, update the code structure so that newly discovered
error mode can easily be added to it later, and reword the message
that comes from a failing case to clarify that it is not the system
that is broken when it fails, but merely that the behaviour is not
something we have seen.
configure: allow user to prevent $PATH "sanitization" on Solaris
On a Solaris 10 system with Solaris make installed as '/usr/xpg4/bin/make',
GNU make installed as '/usr/local/bin/make', and with '/usr/local/bin'
appearing in $PATH *before* '/usr/xpg4/bin', I was seeing errors like this
upon invoking "make all":
This happenes because the Git's Makefile, when running on Solaris,
automatically "sanitizes" $PATH by prepending '/usr/xpg6/bin' and
'/usr/xpg4/bin' to it in order to avoid using non-POSIX /bin/sh from
being used. In the setup described above, however, this has an
unintended consequence of forcing the use of Solaris make in recursive
make invocations -- even if the $(MAKE) macro is being correctly used in
them!
When building without using the autoconf machinery, this can be solved
by overriding $(SANE_TOOL_PATH). Teach the autoconf machinery to also
allow users of ./configure to override it from the command line with a
new --with-sane-tool-path option.
Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'log -3000 (baseline)' test accidentally still used -1000 from an
earlier version.
Noticed-by: Lawrence Holding <Lawrence.Holding@cubic.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds the 'remaining' command to the documentation of
'git rerere'. This command was added in ac49f5ca (Feb 16 2011;
Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>) but
it was never documented.
Touch up the other rerere commands to reduce noise.
First noticed by Vincent van Ravesteijn.
Signed-off-by: Phil Hord <phil.hord@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Modify verify-tag to load relevant GPG variables from the git
configuratio file. This allows git tag -v to use an alternative
GPG binary in the same way that git tag -s does.
Signed-off-by: Alex Zepeda <alex@inferiorhumanorgans.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
By Jens Lehmann (3) and Johannes Sixt (1)
* jl/maint-submodule-relative:
submodules: fix ambiguous absolute paths under Windows
submodules: refactor computation of relative gitdir path
submodules: always use a relative path from gitdir to work tree
submodules: always use a relative path to gitdir
By Vincent van Ravesteijn
* vr/branch-doc:
Documentation/git-branch: add default for --contains
Documentation/git-branch: fix a typo
Documentation/git-branch: cleanups
By Junio C Hamano (2) and Ramsay Jones (1)
* jc/pickaxe-ignore-case:
ctype.c: Fix a sparse warning
pickaxe: allow -i to search in patch case-insensitively
grep: use static trans-case table
fix deletion of .git/objects sub-directories in git-prune/repack
Both git-prune and git-repack (and thus, git-gc) try to rmdir while
holding a DIR* handle on the directory. This can leave dangling
empty directories in the .git/objects on platforms where directory
cannot be removed while they are open.
First call closedir() and then rmdir(); that is more logical ordering.
Reported-by: John Chen <john0312@gmail.com> Reported-by: Stefan Naewe <stefan.naewe@gmail.com> Signed-off-by: Karsten Blees <blees@dcon.de> Improved-and-Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
By Junio C Hamano
* jc/maint-diff-patch-header:
diff -p: squelch "diff --git" header for stat-dirty paths
t4011: illustrate "diff-index -p" on stat-dirty paths
t4011: modernise style
By Thomas Rast
* tr/maint-bundle-boundary:
bundle: keep around names passed to add_pending_object()
t5510: ensure we stay in the toplevel test dir
t5510: refactor bundle->pack conversion
By Zbigniew Jędrzejewski-Szmek (8) and Junio C Hamano (1)
* zj/diff-stat-dyncol:
: This breaks tests. Perhaps it is not worth using the decimal-width stuff
: for this series, at least initially.
diff --stat: add config option to limit graph width
diff --stat: enable limiting of the graph part
diff --stat: add a test for output with COLUMNS=40
diff --stat: use a maximum of 5/8 for the filename part
merge --stat: use the full terminal width
log --stat: use the full terminal width
show --stat: use the full terminal width
diff --stat: use the full terminal width
diff --stat: tests for long filenames and big change counts
Use $search_regexp, where regex metacharacters are quoted, for
searching projects list, rather than $searchtext, which contains
original search term.
Reported-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
OS X's sed and grep would complain with (respectively)
sed: 1: "/^-/{p;q}": extra characters at the end of q command
grep: Regular expression too big
For sed, use an explicit ; to terminate the q command.
For grep, spell the "40 hex digits" explicitly in the regex, which
should be safe as other tests already use this and we haven't got
breakage reports on OS X about them.
Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8c912ee (teach --histogram to diff, 2011-07-12) claimed histogram diff
was faster than both Myers and patience.
We have since incorporated a performance testing framework, so add a
test that compares the various diff tasks performed in a real 'log -p'
workload. This does indeed show that histogram diff slightly beats
Myers, while patience is much slower than the others.
Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the exact option strings to be typed by end users are
already set in typewriter font by using `--option`, but a few places
used '--option' to call for italics or with no quoting. Uniformly
use `--option`.
Also add a full-stop after a sentence that missed one.
Signed-off-by: Vincent van Ravesteijn <vfr@lyx.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import: zero all of 'struct tag' to silence valgrind
When running t9300, valgrind (correctly) complains about an
uninitialized value in write_crash_report:
==2971== Use of uninitialised value of size 8
==2971== at 0x4164F4: sha1_to_hex (hex.c:70)
==2971== by 0x4073E4: die_nicely (fast-import.c:468)
==2971== by 0x43284C: die (usage.c:86)
==2971== by 0x40420D: main (fast-import.c:2731)
==2971== Uninitialised value was created by a heap allocation
==2971== at 0x4C29B3D: malloc (vg_replace_malloc.c:263)
==2971== by 0x433645: xmalloc (wrapper.c:35)
==2971== by 0x405DF5: pool_alloc (fast-import.c:619)
==2971== by 0x407755: pool_calloc.constprop.14 (fast-import.c:634)
==2971== by 0x403F33: main (fast-import.c:3324)
Fix this by zeroing all of the 'struct tag'. We would only need to
zero out the 'sha1' field, but this way seems more future-proof.
Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/gitweb-hilite-regions:
gitweb: Highlight matched part of shortened project description
gitweb: Highlight matched part of project description when searching projects
gitweb: Highlight matched part of project name when searching projects
gitweb: Introduce esc_html_match_hl and esc_html_hl_regions
Make git-{pull,rebase} message without tracking information friendlier
The current message is too long and at too low a level for anybody
to understand it if they don't know about the configuration format
already.
The text about setting up a remote is superfluous and doesn't help
understand or recover from the error that has happened. Show the
usage more prominently and explain how to set up the tracking
information. If there is only one remote, that name is used instead
of the generic <remote>.
Also simplify the message we print on detached HEAD to remove
unnecessary information which is better left for the documentation.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Update draft release notes to 1.7.9.3 for the last time
http.proxy: also mention https_proxy and all_proxy
t0300: work around bug in dash 0.5.6
t5512 (ls-remote): modernize style
tests: fix spurious error when run directly with Solaris /usr/xpg4/bin/sh