diff-index: enable recursive pathspec matching in unpack_trees
The pathspec structure has a few bits of data to drive various operation
modes after we unified the pathspec matching logic in various codepaths.
For example, max_depth field is there so that "git grep" can limit the
output for files found in limited depth of tree traversal. Also in order
to show just the surface level differences in "git diff-tree", recursive
field stops us from descending into deeper level of the tree structure
when it is set to false, and this also affects pathspec matching when
we have wildcards in the pathspec.
The diff-index has always wanted the recursive behaviour, and wanted to
match pathspecs without any depth limit. But we forgot to do so when we
updated tree_entry_interesting() logic to unify the pathspec matching
logic.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
thin-pack: try harder to use preferred base objects as base
When creating a pack using objects that reside in existing packs, we try
to avoid recomputing futile delta between an object (trg) and a candidate
for its base object (src) if they are stored in the same packfile, and trg
is not recorded as a delta already. This heuristics makes sense because it
is likely that we tried to express trg as a delta based on src but it did
not produce a good delta when we created the existing pack.
As the pack heuristics prefer producing delta to remove data, and Linus's
law dictates that the size of a file grows over time, we tend to record
the newest version of the file as inflated, and older ones as delta
against it.
When creating a thin-pack to transfer recent history, it is likely that we
will try to send an object that is recorded in full, as it is newer. But
the heuristics to avoid recomputing futile delta effectively forbids us
from attempting to express such an object as a delta based on another
object. Sending an object in full is often more expensive than sending a
suboptimal delta based on other objects, and it is even more so if we
could use an object we know the receiving end already has (i.e. preferred
base object) as the delta base.
Tweak the recomputation avoidance logic, so that we do not punt on
computing delta against a preferred base object.
The effect of this change can be seen on two simulated upload-pack
workloads. The first is based on 44 reflog entries from my git.git
origin/master reflog, and represents the packs that kernel.org sent me git
updates for the past month or two. The second workload represents much
larger fetches, going from git's v1.0.0 tag to v1.1.0, then v1.1.0 to
v1.2.0, and so on.
The table below shows the average generated pack size and the average CPU
time consumed for each dataset, both before and after the patch:
dataset
| reflog | tags
---------------------------------
before | 53358 | 2750977
size after | 32398 | 2668479
change | -39% | -3%
---------------------------------
before | 0.18 | 1.12
CPU after | 0.18 | 1.15
change | +0% | +3%
This patch makes a much bigger difference for packs with a shorter slice
of history (since its effect is seen at the boundaries of the pack) though
it has some benefit even for larger packs.
Signed-off-by: Jeff King <peff@peff.net> Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function frees the individual "struct match_attr"s we
have allocated, but forgot to free the array holding their
pointers, leading to a minor memory leak (but it can add up
after checking attributes for paths in many directories).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: rerere's rr-cache auto-creation and rerere.enabled
The description of rerere.enabled left the user in the dark as to who
might create an rr-cache directory. Add a note that simply invoking
rerere does this.
In prepare_attr_stack, we pop the old elements of the stack
(which were left from a previous lookup and may or may not
be useful to us). Our loop to do so checks that we never
reach the top of the stack. However, the code immediately
afterwards will segfault if we did actually reach the top of
the stack.
Fortunately, this is not an actual bug, since we will never
pop all of the stack elements (we will always keep the root
gitattributes, as well as the builtin ones). So the extra
check in the loop condition simply clutters the code and
makes the intent less clear. Let's get rid of it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
attr: don't confuse prefixes with leading directories
When we prepare the attribute stack for a lookup on a path,
we start with the cached stack from the previous lookup
(because it is common to do several lookups in the same
directory hierarchy). So the first thing we must do in
preparing the stack is to pop any entries that point to
directories we are no longer interested in.
For example, if our stack contains gitattributes for:
foo/bar/baz
foo/bar
foo
but we want to do a lookup in "foo/bar/bleep", then we want
to pop the top element, but retain the others.
To do this we walk down the stack from the top, popping
elements that do not match our lookup directory. However,
the test do this simply checked strncmp, meaning we would
mistake "foo/bar/baz" as a leading directory of
"foo/bar/baz_plus". We must also check that the character
after our match is '/', meaning we matched the whole path
component.
There are two special cases to consider:
1. The top of our attr stack has the empty path. So we
must not check for '/', but rather special-case the
empty path, which always matches.
2. Typically when matching paths in this way, you would
also need to check for a full string match (i.e., the
character after is '\0'). We don't need to do so in
this case, though, because our path string is actually
just the directory component of the path to a file
(i.e., we know that it terminates with "/", because the
filename comes after that).
Helped-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sendemail.multiedit variable is meant to be a boolean.
However, it is not marked as such in the code, which means
we store its value literally. Thus in the do_edit function,
perl ends up coercing it to a boolean value according to
perl rules, not git rules. This works for "0", but "false",
"no", or "off" will erroneously be interpreted as true.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'jn/maint-gitweb-utf8-fix' into maint
* jn/maint-gitweb-utf8-fix:
gitweb: Fix fallback mode of to_utf8 subroutine
gitweb: Output valid utf8 in git_blame_common('data')
gitweb: esc_html() site name for title in OPML
gitweb: Call to_utf8() on input string in chop_and_escape_str()
Documentation: rerere.enabled is the primary way to configure rerere
The wording seems to suggest that creating the directory is needed and the
setting of rerere.enabled is only for disabling the feature by setting it
to 'false'. But the configuration is meant to be the primary control and
setting it to 'true' will enable it; the rr-cache directory will be
created as necessary and the user does not have to create it.
Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Subsequently we assume that there is only one pack. Currently this is
true only by accident. Pass '-a -d' to repack in order to guarantee that
assumption to hold true.
The prune-packed command is now redundant since repack -d already calls
it.
Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-1.7.7:
docs: describe behavior of relative submodule URLs
Documentation: read-tree --prefix works with existing subtrees
Add MYMETA.json to perl/.gitignore
docs: describe behavior of relative submodule URLs
Since the relative submodule URLs have been introduced in f31a522a2d, they
do not conform to the rules for resolving relative URIs but rather to
those of relative directories.
Document that behavior.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fix hang in git fetch if pointed at a 0 length bundle
git-repo if interupted at the exact wrong time will generate zero
length bundles- literal empty files. git-repo is wrong here, but
git fetch shouldn't effectively spin loop if pointed at a zero
length bundle.
Signed-off-by: Brian Harring <ferringb@chromium.org> Helped-by: Johannes Sixt Helped-by: Nguyen Thai Ngoc Duy Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: read-tree --prefix works with existing subtrees
Since 34110cd4 (Make 'unpack_trees()' have a separate source and
destination index) it is no longer true that a subdirectory with
the same prefix must not exist.
Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ExtUtils::MakeMaker generates MYMETA.json in addition to MYMETA.yml
since version 6.57_07. As it suggests, it is just meta information about
the build and is cleaned up with 'make clean', so it should be ignored.
Signed-off-by: Jack Nagel <jacknagel@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/checkout-m-twoway:
t/t2023-checkout-m.sh: fix use of test_must_fail
checkout_merged(): squelch false warning from some gcc
Test 'checkout -m -- path'
checkout -m: no need to insist on having all 3 stages
Merge branch 'jn/maint-sequencer-fixes' into maint
* jn/maint-sequencer-fixes:
revert: stop creating and removing sequencer-old directory
Revert "reset: Make reset remove the sequencer state"
revert: do not remove state until sequence is finished
revert: allow single-pick in the middle of cherry-pick sequence
revert: pass around rev-list args in already-parsed form
revert: allow cherry-pick --continue to commit before resuming
revert: give --continue handling its own function
* jk/maint-mv:
mv: be quiet about overwriting
mv: improve overwrite warning
mv: make non-directory destination error more clear
mv: honor --verbose flag
docs: mention "-k" for both forms of "git mv"
Merge branch 'jk/fetch-no-tail-match-refs' into maint
* jk/fetch-no-tail-match-refs:
connect.c: drop path_match function
fetch-pack: match refs exactly
t5500: give fully-qualified refs to fetch-pack
drop "match" parameter from get_remote_heads
Merge branch 'jk/refresh-porcelain-output' into maint
* jk/refresh-porcelain-output:
refresh_index: make porcelain output more specific
refresh_index: rename format variables
read-cache: let refresh_cache_ent pass up changed flags
builtin/init-db.c: eliminate -Wformat warning on Solaris
On Solaris systems we'd warn about an implicit cast of mode_t when we
printed things out with the %d format. We'd get this warning under GCC
4.6.0 with Solaris headers:
builtin/init-db.c: In function ‘separate_git_dir’:
builtin/init-db.c:354:4: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘mode_t’ [-Wformat]
We've been doing this ever since v1.7.4.1-296-gb57fb80. Just work
around this by adding an explicit cast.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-sh-setup: make require_clean_work_tree part of the interface
92c62a3 (Porcelain scripts: Rewrite cryptic "needs update" error
message, 2010-10-19) refactored git's own checking to a function in
git-sh-setup. This is a very useful thing for script writers, so
document it.
Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change an invocation of test_must_fail() to be inside a
test_expect_success() as is our usual pattern. Having it outside
caused our tests to fail under prove(1) since we wouldn't print a
newline before TAP output:
e5d3de5 (gitweb: use Perl built-in utf8 function for UTF-8 decoding.,
2007-12-04) was meant to make gitweb faster by using Perl's internals
(see subsection "Messing with Perl's Internals" in Encode(3pm) manpage)
Simple benchmark confirms that (old = 00f429a, new = this version):
old new
old -- -65%
new 189% --
Unfortunately it made fallback mode of to_utf8 do not work... except
for default value 'latin1' of $fallback_encoding ('latin1' is Perl
native encoding), which is why it was not noticed for such long time.
utf8::valid(STRING) is an internal function that tests whether STRING
is in a _consistent state_ regarding UTF-8. It returns true is
well-formed UTF-8 and has the UTF-8 flag on _*or*_ if string is held
as bytes (both these states are 'consistent'). For gitweb the second
option was true, as output from git commands is opened without ':utf8'
layer.
What made it work at all for STRING in 'latin1' encoding is the fact
that utf8:decode(STRING) turns on UTF-8 flag only if source string is
valid UTF-8 and contains multi-byte UTF-8 characters... and that if
string doesn't have UTF-8 flag set it is treated as in native Perl
encoding, i.e. 'latin1' / 'iso-8859-1' (unless native encoding it is
EBCDIC ;-)). It was ':utf8' layer that actually converted 'latin1'
(no UTF-8 flag == native == 'latin1) to 'utf8'.
Let's make use of the fact that utf8:decode(STRING) returns false if
STRING is invalid as UTF-8 to check whether to enable fallback mode.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When receive-pack advertises its list of refs, it generally hides the
capabilities information after a NUL at the end of the first ref.
However, when we have an empty repository, there are no refs, and
therefore receive-pack writes a fake ref "capabilities^{}" with the
capabilities afterwards.
On the client side, git reads the result with get_remote_heads(). We pick
the capabilities from the end of the line, and then call check_ref() to
make sure the ref name is valid. We see that it isn't, and don't bother
adding it to our list of refs.
However, the call to check_ref() is enabled by passing the REF_NORMAL flag
to get_remote_heads. For the regular git transport, we pass REF_NORMAL in
get_refs_via_connect() if we are doing a push (since only receive-pack
uses this fake ref). But in remote-curl, we never use this flag, and we
accept the fake ref as a real one, passing it back from the helper to the
parent git-push.
Most of the time this bug goes unnoticed, as the fake ref won't match our
refspecs. However, if "--mirror" is used, then we see it as remote cruft
to be pruned, and try to pass along a deletion refspec for it. Of course
this refspec has bogus syntax (because of the ^{}), and the helper
complains, aborting the push.
Let's have remote-curl mirror what the builtin get_refs_via_connect() does
(at least for the case of using git protocol; we can leave the dumb
info/refs reader as it is).
This also fixes pushing with --mirror to a smart-http remote that uses
alternates. The fake ".have" refs the server gives to avoid unnecessary
network transfer has a similar bad interactions with the machinery.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
By definition, the default value of "advice.*" variables must be true and
they all control various additional help messages that are designed to aid
new users. Setting one to false is to tell Git that the user understands
the nature of the error and does not need the additional verbose help
message.
Also fix the asciidoc markup for linkgit:git-checkout[1] in the
description of the detachedHead advice by removing an excess colon.
The non-streaming version of the filter counts CRLF and LF in the whole
buffer, and returns without doing anything when they match (i.e. what is
recorded in the object store already uses CRLF). This was done to help
people who added files from the DOS world before realizing they want to go
cross platform and adding .gitattributes to tell Git that they only want
CRLF in their working tree.
The streaming version of the filter does not want to read the whole thing
before starting to work, as that defeats the whole point of streaming. So
we instead check what byte follows CR whenever we see one, and add CR
before LF only when the LF does not immediately follow CR already to keep
CRLF as is.
Reported-and-tested-by: Ralf Thielow Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitweb: Call to_utf8() on input string in chop_and_escape_str()
a) To fix the comparison with the chopped string,
otherwise we compare bytes with characters, as
chop_str() must run to_utf8() for correct operation
b) To give the title attribute correct encoding;
we need to mark strings as UTF-8 before outpur
Signed-off-by: Jürgen Kreileder <jk@blackdown.de> Acked-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Providing a single "-v" to "git push" currently does
nothing. Giving two flags ("git push -v -v") turns on the
first level of verbosity.
This is caused by a regression introduced in 8afd8dc (push:
support multiple levels of verbosity, 2010-02-24). Before
the series containing 8afd8dc, the verbosity handling for
fetching and pushing was completely separate. Commit bde873c
refactored the verbosity handling out of the fetch side, and
then 8afd8dc converted push to use the refactored code.
However, the fetch and push sides numbered and passed along
their verbosity levels differently. For both, a verbosity
level of "-1" meant "quiet", and "0" meant "default output".
But from there they differed.
For fetch, a verbosity level of "1" indicated to the "fetch"
program that it should make the status table slightly more
verbose, showing up-to-date entries. A verbosity level of
"2" meant that we should pass a verbose flag to the
transport; in the case of fetch-pack, this displays protocol
debugging information.
As a result, the refactored code in bde873c checks for
"verbosity >= 2", and only then passes it on to the
transport. From the transport code's perspective, a
verbosity of 0 or 1 both meant "0".
Push, on the other hand, does not show its own status table;
that is always handled by the transport layer or below
(originally send-pack itself, but these days it is done by
the transport code). So a verbosity level of 1 meant that we
should pass the verbose flag to send-pack, so that it knows
we want a verbose status table. However, once 8afd8dc
switched it to the refactored fetch code, a verbosity level
of 1 was now being ignored. Thus, you needed to
artificially bump the verbosity to 2 (via "-v -v") to have
any effect.
We can fix this by letting the transport code know about the
true verbosity level (i.e., let it distinguish level 0 or
1).
We then have to also make an adjustment to any transport
methods that assumed "verbose > 0" meant they could spew
lots of debugging information. Before, they could only get
"0" or "2", but now they will also receive "1". They need to
adjust their condition for turning on such spew from
"verbose > 0" to "verbose > 1".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the description and options, the fsck manpage contains
some discussion about what it does. Over time, this
discussion has become somewhat obsolete, both in content and
formatting. In particular:
1. There are many options now, so starting the discussion
with "It tests..." makes it unclear whether we are
talking about the last option, or about the tool in
general. Let's start a new "discussion" section and
make our antecedent more clear.
2. It gave an example for --unreachable using for-each-ref
to mention all of the heads, saying that it will do "a
_lot_ of verification". This is hopelessly out-of-date,
as giving no arguments will check much more (reflogs,
the index, non-head refs).
3. It goes on to mention tests "to be added" (like tree
object sorting). We now have these tests.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
lf_to_crlf_filter(): tell the caller we added "\n" when draining
This can only happen when the input size is multiple of the
buffer size of the cascade filter (16k) and ends with an LF,
but in such a case, the code forgot to tell the caller that
it added the "\n" it could not add during the last round.
If you provide a custom rename score on the command line,
like:
git log -M50 --follow foo.c
it is completely ignored, and there is no way to --follow
with a looser rename score. Instead, let's use the same
rename score that will be used for generating diffs. This is
convenient, and mirrors what we do with the break-score.
You can see an example of it being useful in git.git:
$ git log --oneline --summary --follow \
Documentation/technical/api-string-list.txt 86d4b52 string-list: Add API to remove an item from an unsorted list 1d2f80f string_list: Fix argument order for string_list_append e242148 string-list: add unsorted_string_list_lookup() 0dda1d1 Fix two leftovers from path_list->string_list c455c87 Rename path_list to string_list
create mode 100644 Documentation/technical/api-string-list.txt
$ git log --oneline --summary -M40 --follow \
Documentation/technical/api-string-list.txt 86d4b52 string-list: Add API to remove an item from an unsorted list 1d2f80f string_list: Fix argument order for string_list_append e242148 string-list: add unsorted_string_list_lookup() 0dda1d1 Fix two leftovers from path_list->string_list c455c87 Rename path_list to string_list
rename Documentation/technical/{api-path-list.txt => api-string-list.txt} (47%) 328a475 path-list documentation: document all functions and data structures 530e741 Start preparing the API documents.
create mode 100644 Documentation/technical/api-path-list.txt
You could have two separate rename scores, one for following
and one for diff. But almost nobody is going to want that,
and it would just be unnecessarily confusing. Besides which,
we re-use the diff results from try_to_follow_renames for
the actual diff output, which means having them as separate
scores is actively wrong. E.g., with the current code, you
get:
$ git log --oneline --diff-filter=R --name-status \
-M90 --follow git.spec.in 27dedf0 GIT 0.99.9j aka 1.0rc3
R084 git-core.spec.in git.spec.in f85639c Rename the RPM from "git" to "git-core"
R098 git.spec.in git-core.spec.in
The first one should not be considered a rename by the -M
score we gave, but we print it anyway, since we blindly
re-use the diff information from the follow (which uses the
default score). So this could also be considered simply a
bug-fix, as with the current code "-M" is completely ignored
when using "--follow".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
checkout_merged(): squelch false warning from some gcc
gcc 4.6.2 (there may be others) does not realize that the variable "mode"
can never be used uninitialized in this function and issues a false warning
under -Wuninitialized option.
Squelch it with an unnecessary initialization; it is not like a single
assignment matters to the performance in this codepath that writes out
to the filesystem with checkout_entry() anyway.
According to POSIX, setenv should error out with EINVAL if it's
asked to set an environment variable whose name contains an equals
sign. Implement this detail in our compatibility-fallback.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mf/curl-select-fdset:
http: drop "local" member from request struct
http.c: Rely on select instead of tracking whether data was received
http.c: Use timeout suggested by curl instead of fixed 50ms timeout
http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
* nd/misc-cleanups:
unpack_object_header_buffer(): clear the size field upon error
tree_entry_interesting: make use of local pointer "item"
tree_entry_interesting(): give meaningful names to return values
read_directory_recursive: reduce one indentation level
get_tree_entry(): do not call find_tree_entry() on an empty tree
tree-walk.c: do not leak internal structure in tree_entry_len()
* maint-1.7.7:
Git 1.7.7.5
Git 1.7.6.5
blame: don't overflow time buffer
fetch: create status table using strbuf
checkout,merge: loosen overwriting untracked file check based on info/exclude
cast variable in call to free() in builtin/diff.c and submodule.c
apply: get rid of useless x < 0 comparison on a size_t type