unpack-trees.c: use path_excluded() in check_ok_to_remove()
This function is responsible for determining if a path that is not
tracked is ignored and allow "checkout" to overwrite it as needed.
It used excluded() without checking if higher level directory in the
path is ignored; correct it to use path_excluded() for this check.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* There are uses of lower-level interface excluded_from_list() in
the codepath for narrow-checkout hack; they are supposed to be
already checking each level as they descend, and are not touched
with this patch.
path_excluded(): update API to less cache-entry centric
It was stupid of me to make the API too much cache-entry specific;
the caller may want to check arbitrary pathname without having a
corresponding cache-entry to see if a path is ignored.
As we know a caller that does not recurse is calling us in the index
order, we can remember the last directory we found to be excluded
and see if the path we are looking at is still inside it, in which
case we can just answer that it is excluded.
ls-files -i: pay attention to exclusion of leading paths
"git ls-files --exclude=t/ -i" does not show paths in directory t/
that have been added to the index, but it should.
The excluded() API was designed for callers who walk the tree from
the top, checking each level of the directory hierarchy as it
descends if it is excluded, and not even bothering to recurse into
an excluded directory. This would allow us optimize for a common
case by not having to check if the exclude pattern "foo/" matches
when looking at "foo/bar", because the caller should have noticed
that "foo" is excluded and did not even bother to read "foo/bar"
out of opendir()/readdir() to call it.
The code for "ls-files -i" however walks the index linearly, feeding
paths without checking if the leading directory is already excluded.
Introduce a helper function path_excluded() to let this caller
properly call excluded() check for higher hierarchies as necessary.
bundle: remove stray single-quote from error message
After running rev-list --boundary to retrieve the list of boundary
commits, "git bundle create" runs its own revision walk. If in this
stage git encounters an unfamiliar option, it writes a message with an
unbalanced quotation mark:
error: unrecognized argument: --foo'
Drop the stray quote to match the "unrecognized argument: %s" message
used elsewhere and save translators some work.
This is mostly a futureproofing measure: for now, the "rev-list
--boundary" command catches most strange arguments on its own and the
above message is not seen unless you try something esoteric like "git
bundle create test.bundle --header HEAD".
Reported-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tr/cache-tree:
t0090: be prepared that 'wc -l' writes leading blanks
reset: update cache-tree data when appropriate
commit: write cache-tree data when writing index anyway
Refactor cache_tree_update idiom from commit
Test the current state of the cache-tree optimization
Add test-scrap-cache-tree
Merge branch 'cb/maint-t5541-make-server-port-portable' into maint-1.7.8
* cb/maint-t5541-make-server-port-portable:
t5541: check error message against the real port number used
remote-curl: Fix push status report when all branches fail
Merge branch 'tr/maint-bundle-boundary' into maint-1.7.8
* tr/maint-bundle-boundary:
bundle: keep around names passed to add_pending_object()
t5510: ensure we stay in the toplevel test dir
t5510: refactor bundle->pack conversion
Merge branch 'tr/maint-bundle-long-subject' into maint-1.7.8
* tr/maint-bundle-long-subject:
t5704: match tests to modern style
strbuf: improve strbuf_get*line documentation
bundle: use a strbuf to scan the log for boundary commits
bundle: put strbuf_readline_fd in strbuf.c with adjustments
Git 1.7.8 introduced an object and history re-validation step after
"fetch" or "push" causes new history to be added to a receiving
repository. This is to protect a malicious server or pushing client from
corrupting the repository by taking advantage of an existing corrupt
object that is unconnected to existing history.
But this check is way over-pessimistic. During "fetch" or "receive-pack"
(the server side of "push"), unpack-objects and index-pack already
validate individual objects that are received, and the only thing we would
want to catch are corrupted objects that already happen to exist in our
repository but are not referenced from our refs. Such objects must have
been written by an earlier run of our codepaths that write out loose
objects or packfiles, and they must have done the validation of individual
objects when they did so. The only thing left to worry about is the
connectivity integrity, which can be checked with "rev-list --objects",
which is much cheaper. We have been paying the 5x to 8x runtime overhead
the --verify-objects often adds for no real gain.
Revert check_everything_connected() not to use this over-pessimistic
check.
Credit goes to Nguyễn Thái Ngọc Duy, who originally identified the
performance regression and endured multiple rounds of reviews to fix it.
This adds the 'remaining' command to the documentation of
'git rerere'. This command was added in ac49f5ca (Feb 16 2011;
Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>) but
it was never documented.
Touch up the other rerere commands to reduce noise.
First noticed by Vincent van Ravesteijn.
Signed-off-by: Phil Hord <phil.hord@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
bundle: keep around names passed to add_pending_object()
The 'name' field passed to add_pending_object() is used to later
deduplicate in object_array_remove_duplicates().
git-bundle had a bug in this area since 18449ab (git-bundle: avoid
packing objects which are in the prerequisites, 2007-03-08): it passed
the name of each boundary object in a static buffer. In other words,
all that object_array_remove_duplicates() saw was the name of the
*last* added boundary object.
The recent switch to a strbuf in bc2fed4 (bundle: use a strbuf to scan
the log for boundary commits, 2012-02-22) made this slightly worse: we
now free the buffer at the end, so it is not even guaranteed that it
still points into addressable memory by the time object_array_remove_
duplicates looks at it. On the plus side however, it was now
detectable by valgrind.
The fix is easy: pass a copy of the string to add_pending_object.
Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: use {asterisk} in rev-list-options.txt when needed
Text between two '*' is emphasized in AsciiDoc and makes explanations in
rev-list-options.txt on glob-related options very confusing, as the
rendered text would be missing two asterisks and the text between them
would be emphasized instead.
Use '{asterisk}' where needed to make them show up as asterisks in the
rendered text.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git grep" is run with -P/--perl-regexp, it doesn't match ^ and $ at
the beginning/end of the line. This is because PCRE normally matches ^
and $ at the beginning/end of the whole text, not for each line, and "git
grep" passes a large chunk of text (possibly containing many lines) to
pcre_exec() and then splits the text into lines.
This makes "git grep -P" behave differently from "git grep -E" and also
from "grep -P" and "pcregrep":
$ cat file
a
b
$ git grep --no-index -P '^ ' file
$ git grep --no-index -E '^ ' file
file: b
$ grep -c -P '^ ' file
b
$ pcregrep -c '^ ' file
b
Reported-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-am.sh's check_patch_format function would attempt to preview
the patch to guess its format, but would go into an infinite loop
when the patch file happened to be empty. The solution: exit the
loop when "read" fails, not when the line var, "$l1" becomes empty.
Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test did not adhere to the current style on several counts:
. empty lines around the test blocks, but within the test string
. ': > file' or even just '> file' with an extra space
. inconsistent indentation
. hand-rolled commits instead of using test_commit
Fix all of them.
There's a catch to the last point: test_commit creates a tag, which the
original test did not create. We still change it to test_commit, and
explicitly delete the tags, so as to highlight that the test relies on not
having them.
Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-bundle actually spawns exactly this rev-list invocation, and does
the grepping internally.
There was a subtle bug in the latter step: it used fgets() with a
1024-byte buffer. If the user has sufficiently long subjects (e.g.,
by not adhering to the git oneline-subject convention in the first
place), the 'oneline' format can easily overflow the buffer. fgets()
then returns the rest of the line in the next call(s). If one of
these remaining parts started with '-', git-bundle would mistakenly
insert it into the bundle thinking it was a boundary commit.
Fix it by using strbuf_getwholeline() instead, which handles arbitrary
line lengths correctly.
Note that on the receiving side in parse_bundle_header() we were
already using strbuf_getwholeline_fd(), so that part is safe.
Reported-by: Jannis Pohlmann <jannis.pohlmann@codethink.co.uk> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
bundle: put strbuf_readline_fd in strbuf.c with adjustments
The comment even said that it should eventually go there. While at
it, match the calling convention and name of the function to the
strbuf_get*line family. So it now is strbuf_getwholeline_fd.
Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The imap-send code was adapted from another project, and
still contains many unused bits of code. One of these bits
contains a type "struct string_list" which bears no
resemblence to the "struct string_list" we use elsewhere in
git. This causes the compiler to complain if git's
string_list ever becomes part of cache.h.
Let's just drop the dead code.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/Makefile: Use $(sort ...) explicitly where needed
Starting from GNU Make 3.82 $(wildcard ...) no longer sorts the result
(from NEWS):
* WARNING: Backward-incompatibility!
Wildcards were not documented as returning sorted values, but the results
have been sorted up until this release.. If your makefiles require sorted
results from wildcard expansions, use the $(sort ...) function to request
it explicitly.
I usually watch test progress visually, and if tests are sorted, even
with make -j4 they go more or less incrementally by their t number. On
the other side, without sorting, tests are executed in seemingly random
order even for -j1. Let's please maintain sane tests order for perceived
prettyness.
Another note is that in GNU Make sort also works as uniq, so after sort
being removed, we might expect e.g. $(wildcard *.sh a.*) to produce
duplicates for e.g. "a.sh". From this point of view, adding sort could
be seen as hardening t/Makefile from accidentally introduced dups.
It turned out that prevous releases of GNU Make did not perform full
sort in $(wildcard), only sorting results for each pattern, that's why
explicit sort-as-uniq is relevant even for older makes.
Signed-off-by: Kirill Smelkov <kirr@navytux.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote-curl: Fix push status report when all branches fail
The protocol between transport-helper.c and remote-curl requires
remote-curl to always print a blank line after the push command
has run. If the blank line is ommitted, transport-helper kills its
container process (the git push the user started) with exit(128)
and no message indicating a problem, assuming the helper already
printed reasonable error text to the console.
However if the remote rejects all branches with "ng" commands in the
report-status reply, send-pack terminates with non-zero status, and
in turn remote-curl exited with non-zero status before outputting
the blank line after the helper status printed by send-pack. No
error messages reach the user.
This caused users to see the following from git push over HTTP
when the remote side's update hook rejected the branch:
$ git push http://... master
Counting objects: 4, done.
Delta compression using up to 6 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 301 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
$
Always print a blank line after the send-pack process terminates,
ensuring the helper status report (if it was output) will be
correctly parsed by the calling transport-helper.c. This ensures
the helper doesn't abort before the status report can be shown to
the user.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff-index: enable recursive pathspec matching in unpack_trees
The pathspec structure has a few bits of data to drive various operation
modes after we unified the pathspec matching logic in various codepaths.
For example, max_depth field is there so that "git grep" can limit the
output for files found in limited depth of tree traversal. Also in order
to show just the surface level differences in "git diff-tree", recursive
field stops us from descending into deeper level of the tree structure
when it is set to false, and this also affects pathspec matching when
we have wildcards in the pathspec.
The diff-index has always wanted the recursive behaviour, and wanted to
match pathspecs without any depth limit. But we forgot to do so when we
updated tree_entry_interesting() logic to unify the pathspec matching
logic.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
thin-pack: try harder to use preferred base objects as base
When creating a pack using objects that reside in existing packs, we try
to avoid recomputing futile delta between an object (trg) and a candidate
for its base object (src) if they are stored in the same packfile, and trg
is not recorded as a delta already. This heuristics makes sense because it
is likely that we tried to express trg as a delta based on src but it did
not produce a good delta when we created the existing pack.
As the pack heuristics prefer producing delta to remove data, and Linus's
law dictates that the size of a file grows over time, we tend to record
the newest version of the file as inflated, and older ones as delta
against it.
When creating a thin-pack to transfer recent history, it is likely that we
will try to send an object that is recorded in full, as it is newer. But
the heuristics to avoid recomputing futile delta effectively forbids us
from attempting to express such an object as a delta based on another
object. Sending an object in full is often more expensive than sending a
suboptimal delta based on other objects, and it is even more so if we
could use an object we know the receiving end already has (i.e. preferred
base object) as the delta base.
Tweak the recomputation avoidance logic, so that we do not punt on
computing delta against a preferred base object.
The effect of this change can be seen on two simulated upload-pack
workloads. The first is based on 44 reflog entries from my git.git
origin/master reflog, and represents the packs that kernel.org sent me git
updates for the past month or two. The second workload represents much
larger fetches, going from git's v1.0.0 tag to v1.1.0, then v1.1.0 to
v1.2.0, and so on.
The table below shows the average generated pack size and the average CPU
time consumed for each dataset, both before and after the patch:
dataset
| reflog | tags
---------------------------------
before | 53358 | 2750977
size after | 32398 | 2668479
change | -39% | -3%
---------------------------------
before | 0.18 | 1.12
CPU after | 0.18 | 1.15
change | +0% | +3%
This patch makes a much bigger difference for packs with a shorter slice
of history (since its effect is seen at the boundaries of the pack) though
it has some benefit even for larger packs.
Signed-off-by: Jeff King <peff@peff.net> Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function frees the individual "struct match_attr"s we
have allocated, but forgot to free the array holding their
pointers, leading to a minor memory leak (but it can add up
after checking attributes for paths in many directories).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: rerere's rr-cache auto-creation and rerere.enabled
The description of rerere.enabled left the user in the dark as to who
might create an rr-cache directory. Add a note that simply invoking
rerere does this.
In prepare_attr_stack, we pop the old elements of the stack
(which were left from a previous lookup and may or may not
be useful to us). Our loop to do so checks that we never
reach the top of the stack. However, the code immediately
afterwards will segfault if we did actually reach the top of
the stack.
Fortunately, this is not an actual bug, since we will never
pop all of the stack elements (we will always keep the root
gitattributes, as well as the builtin ones). So the extra
check in the loop condition simply clutters the code and
makes the intent less clear. Let's get rid of it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
attr: don't confuse prefixes with leading directories
When we prepare the attribute stack for a lookup on a path,
we start with the cached stack from the previous lookup
(because it is common to do several lookups in the same
directory hierarchy). So the first thing we must do in
preparing the stack is to pop any entries that point to
directories we are no longer interested in.
For example, if our stack contains gitattributes for:
foo/bar/baz
foo/bar
foo
but we want to do a lookup in "foo/bar/bleep", then we want
to pop the top element, but retain the others.
To do this we walk down the stack from the top, popping
elements that do not match our lookup directory. However,
the test do this simply checked strncmp, meaning we would
mistake "foo/bar/baz" as a leading directory of
"foo/bar/baz_plus". We must also check that the character
after our match is '/', meaning we matched the whole path
component.
There are two special cases to consider:
1. The top of our attr stack has the empty path. So we
must not check for '/', but rather special-case the
empty path, which always matches.
2. Typically when matching paths in this way, you would
also need to check for a full string match (i.e., the
character after is '\0'). We don't need to do so in
this case, though, because our path string is actually
just the directory component of the path to a file
(i.e., we know that it terminates with "/", because the
filename comes after that).
Helped-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sendemail.multiedit variable is meant to be a boolean.
However, it is not marked as such in the code, which means
we store its value literally. Thus in the do_edit function,
perl ends up coercing it to a boolean value according to
perl rules, not git rules. This works for "0", but "false",
"no", or "off" will erroneously be interpreted as true.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitweb: Fix actionless dispatch for non-existent objects
When gitweb URL does not provide action explicitly, e.g.
http://git.example.org/repo.git/branch
dispatch() tries to guess action (view to be used) based on remaining
parameters. Among others it is based on the type of requested object,
which gave problems when asking for non-existent branch or file (for
example misspelt name).
Now undefined $action from dispatch() should not result in problems.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'jn/maint-gitweb-utf8-fix' into maint
* jn/maint-gitweb-utf8-fix:
gitweb: Fix fallback mode of to_utf8 subroutine
gitweb: Output valid utf8 in git_blame_common('data')
gitweb: esc_html() site name for title in OPML
gitweb: Call to_utf8() on input string in chop_and_escape_str()
Documentation: rerere.enabled is the primary way to configure rerere
The wording seems to suggest that creating the directory is needed and the
setting of rerere.enabled is only for disabling the feature by setting it
to 'false'. But the configuration is meant to be the primary control and
setting it to 'true' will enable it; the rr-cache directory will be
created as necessary and the user does not have to create it.
Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Subsequently we assume that there is only one pack. Currently this is
true only by accident. Pass '-a -d' to repack in order to guarantee that
assumption to hold true.
The prune-packed command is now redundant since repack -d already calls
it.
Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-1.7.7:
docs: describe behavior of relative submodule URLs
Documentation: read-tree --prefix works with existing subtrees
Add MYMETA.json to perl/.gitignore
docs: describe behavior of relative submodule URLs
Since the relative submodule URLs have been introduced in f31a522a2d, they
do not conform to the rules for resolving relative URIs but rather to
those of relative directories.
Document that behavior.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fix hang in git fetch if pointed at a 0 length bundle
git-repo if interupted at the exact wrong time will generate zero
length bundles- literal empty files. git-repo is wrong here, but
git fetch shouldn't effectively spin loop if pointed at a zero
length bundle.
Signed-off-by: Brian Harring <ferringb@chromium.org> Helped-by: Johannes Sixt Helped-by: Nguyen Thai Ngoc Duy Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: read-tree --prefix works with existing subtrees
Since 34110cd4 (Make 'unpack_trees()' have a separate source and
destination index) it is no longer true that a subdirectory with
the same prefix must not exist.
Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ExtUtils::MakeMaker generates MYMETA.json in addition to MYMETA.yml
since version 6.57_07. As it suggests, it is just meta information about
the build and is cleaned up with 'make clean', so it should be ignored.
Signed-off-by: Jack Nagel <jacknagel@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/checkout-m-twoway:
t/t2023-checkout-m.sh: fix use of test_must_fail
checkout_merged(): squelch false warning from some gcc
Test 'checkout -m -- path'
checkout -m: no need to insist on having all 3 stages
Merge branch 'jn/maint-sequencer-fixes' into maint
* jn/maint-sequencer-fixes:
revert: stop creating and removing sequencer-old directory
Revert "reset: Make reset remove the sequencer state"
revert: do not remove state until sequence is finished
revert: allow single-pick in the middle of cherry-pick sequence
revert: pass around rev-list args in already-parsed form
revert: allow cherry-pick --continue to commit before resuming
revert: give --continue handling its own function