topology tests: teach a helper to take abbreviated timestamps
The on_committer_date helper in t/lib-t6000 is used in t6002 and
t6003 with timestamps on a single day within a single minute
(i.e. 1971-08-16 00:00) and the tests repeat this over and over.
The actual value of the timestamp, however, does not matter very
much; only their relative ordering does.
Introduce another helper to expand only the suffix of the timestamp
to a full timestamp to make the lines shorter, and use it in this
helper. Also, because all the commits in the test are made with
specific GIT_COMMITTER_DATE, stop unsetting it at the end of the
helper.
We'll be specifying the author timestamp to these test commits in a
later patch, which will be helped with this change.
Also remove a test that was commented-out from t6003; it used to
test a commit with the same parent listed twice, which was allowed
by mistake but was fixed long time ago.
Use the prio-queue data structure to implement a priority queue of
commits sorted by committer date, when handling --date-order. The
structure can also be used as a simple LIFO stack, which is a good
match for --topo-order processing.
Traditionally we used a singly linked list of commits to hold a set
of in-flight commits while traversing history. The most typical use
of the list is to add commits that are newly discovered to it, keep
the list sorted by commit timestamp, pick up the newest one from the
list, and keep digging. The cost of keeping the singly linked list
sorted is nontrivial, and this typical use pattern better matches a
priority queue.
Introduce a prio-queue structure, that can be used either as a LIFO
stack, or a priority queue. This will be used in the next patch to
hold in-flight commits during sort-in-topological-order.
Tests and the idea to make it usable for any "void *" pointers to
"things" are by Jeff King. Bugs are mine.
The primary invariant of sort_in_topological_order() is that a
parent commit is not emitted until all children of it are. When
traversing a forked history like this with "git log C E":
A----B----C
\
D----E
we ensure that A is emitted after all of B, C, D, and E are done, B
has to wait until C is done, and D has to wait until E is done.
In some applications, however, we would further want to control how
these child commits B, C, D and E on two parallel ancestry chains
are shown.
Most of the time, we would want to see C and B emitted together, and
then E and D, and finally A (i.e. the --topo-order output). The
"lifo" parameter of the sort_in_topological_order() function is used
to control this behaviour. We start the traversal by knowing two
commits, C and E. While keeping in mind that we also need to
inspect E later, we pick C first to inspect, and we notice and
record that B needs to be inspected. By structuring the "work to be
done" set as a LIFO stack, we ensure that B is inspected next,
before other in-flight commits we had known that we will need to
inspect, e.g. E.
When showing in --date-order, we would want to see commits ordered
by timestamps, i.e. show C, E, B and D in this order before showing
A, possibly mixing commits from two parallel histories together.
When "lifo" parameter is set to false, the function keeps the "work
to be done" set sorted in the date order to realize this semantics.
After inspecting C, we add B to the "work to be done" set, but the
next commit we inspect from the set is E which is newer than B.
The name "lifo", however, is too strongly tied to the way how the
function implements its behaviour, and does not describe what the
behaviour _means_.
Replace this field with an enum rev_sort_order, with two possible
values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE, and
update the existing code. The mechanical replacement rule is:
"lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
"lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"
commit-slab: introduce a macro to define a slab for new type
Introduce a header file to define a macro that can define the struct
type, initializer, accessor and cleanup functions to manage a commit
slab. Update the "indegree" topological sort facility using it.
To associate 32 flag bits with each commit, you can write:
define_commit_slab(flag32, uint32);
to declare "struct flag32" type, define an instance of it with
struct flag32 flags;
and initialize it by calling
init_flag32(&flags);
After that, a call to flag32_at() function
uint32 *fp = flag32_at(&flags, commit);
will return a pointer pointing at a uint32 for that commit. Once
you are done with these flags, clean them up with
clear_flag32(&flags);
Callers that cannot hard-code how wide the data to be associated
with the commit be at compile time can use the "_with_stride"
variant to initialize the slab.
Suppose you want to give one bit per existing ref, and paint commits
down to find which refs are descendants of each commit. Saying
Instead of using a single "slab" and keep reallocating it as we find
that we need to deal with commits with larger values of commit->index,
make a "slab" an array of many "slab_piece"s. Each access may need
two levels of indirections, but we only need to reallocate the first
level array of pointers when we have to grow the table this way.
commit: allow associating auxiliary info on-demand
The "indegree" field in the commit object is only used while sorting
a list of commits in topological order, and wasting memory otherwise.
We would prefer to shrink the size of individual commit objects,
which we may have to hold thousands of in-core. We could eject
"indegree" field out from the commit object and represent it as a
dynamic table based on the decoration infrastructure, but the
decoration is meant for sparse annotation and is not a good match.
Instead, let's try a different approach.
- Assign an integer (commit->index) to each commit we keep in-core
(reuse the space of "indegree" field for it);
- When running the topological sort, allocate an array of integers
in bulk (called "slab"), use the commit->index as an index into
this array, and store the "indegree" information there.
This does _not_ reduce the memory footprint of a commit object, but
the commit->index can be used as the index to dynamically associate
commits with other kinds of information as needed.
Merge branch 'rr/send-email-perl-critique' into maint
* rr/send-email-perl-critique:
send-email: use the three-arg form of open in recipients_cmd
send-email: drop misleading function prototype
send-email: use "return;" not "return undef;" on error codepaths
Most of these were found using Lucas De Marchi's codespell tool.
Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: distinguish between ref and offset deltas in pack-format
eb32d236 introduced the OBJ_OFS_DELTA object that uses a relative offset to
identify the base object instead of the 20-byte SHA1 reference. The pack file
documentation only mentions the SHA1 based reference in its description of the
deltified object entry.
Update the pack format documentation to clarify that the deltified object
representation refers to its base using either a relative negative offset or
the absolute SHA1 identifier.
Signed-off-by: Stefan Saasen <ssaasen@atlassian.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The advice (consider use of -u when read_directory takes too long) is
separated into 3 different status_printf_ln() calls, and which brings
trouble for translators.
Since status_vprintf() called by status_printf_ln() can handle eol in
buffer, we could simply join these lines into one paragraph.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
archive: clarify explanation of --worktree-attributes
Make it a bit clearer that --worktree-attributes is about files in the
working tree (checked out files, possibly changed) and not the current
working directory ($PWD). Link to the ATTRIBUTES section, which has
more details.
Reported-by: Amit Bakshi <ambakshi@gmail.com> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'jc/directory-attrs-regression-fix' into maint-1.8.1
A pattern "dir" (without trailing slash) in the attributes file
stopped matching a directory "dir" by mistake with an earlier change
that wanted to allow pattern "dir/" to also match.
* jc/directory-attrs-regression-fix:
t: check that a pattern without trailing slash matches a directory
dir.c::match_pathname(): pay attention to the length of string parameters
dir.c::match_pathname(): adjust patternlen when shifting pattern
dir.c::match_basename(): pay attention to the length of string parameters
attr.c::path_matches(): special case paths that end with a slash
attr.c::path_matches(): the basename is part of the pathname
* jk/peel-ref:
upload-pack: load non-tip "want" objects from disk
upload-pack: make sure "want" objects are parsed
upload-pack: drop lookup-before-parse optimization
Merge branch 'mg/gpg-interface-using-status' into maint
Verification of signed tags were not done correctly when not in C
or en/US locale.
* mg/gpg-interface-using-status:
pretty: make %GK output the signing key for signed commits
pretty: parse the gpg status lines rather than the output
gpg_interface: allow to request status return
log-tree: rely upon the check in the gpg_interface
gpg-interface: check good signature in a reliable way
Merge branch 'bc/commit-complete-lines-given-via-m-option' into maint
'git commit -m "$msg"' used to add an extra newline even when
$msg already ended with one.
* bc/commit-complete-lines-given-via-m-option:
Documentation/git-commit.txt: rework the --cleanup section
git-commit: only append a newline to -m mesg if necessary
t7502: demonstrate breakage with a commit message with trailing newlines
t/t7502: compare entire commit message with what was expected
The "--match=<pattern>" option of "git describe", when used with
"--all" to allow refs that are not annotated tags to be used as a
base of description, did not restrict the output from the command to
those that match the given pattern.
* jc/describe:
describe: --match=<pattern> must limit the refs even when used with --all
An aliased command spawned from a bare repository that does not say
it is bare with "core.bare = yes" is treated as non-bare by mistake.
* jk/alias-in-bare:
setup: suppress implicit "." work-tree for bare repos
environment: add GIT_PREFIX to local_repo_env
cache.h: drop LOCAL_REPO_ENV_SIZE
"git archive" reports a failure when asked to create an archive out
of an empty tree. It would be more intuitive to give an empty
archive back in such a case.
* jk/empty-archive:
archive: handle commits with an empty tree
test-lib: factor out $GIT_UNZIP setup
* maint-1.8.1:
Start preparing for 1.8.1.6
git-tag(1): we tag HEAD by default
Fix revision walk for commits with the same dates
t2003: work around path mangling issue on Windows
pack-refs: add fully-peeled trait
pack-refs: write peeled entry for non-tags
use parse_object_or_die instead of die("bad object")
avoid segfaults on parse_object failure
entry: fix filter lookup
t2003: modernize style
name-hash.c: fix endless loop with core.ignorecase=true
Merge branch 'ap/maint-diff-rename-avoid-overlap' into maint-1.8.1
* ap/maint-diff-rename-avoid-overlap:
tests: make sure rename pretty print works
diff: prevent pprint_rename from underrunning input
diff: Fix rename pretty-print when suffix and prefix overlap
The <commit>|<object> argument is actually not explained anywhere
(except implicitly in the description of an unannotated tag). Write a
little explanation, in particular to cover the default.
Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'ap/maint-diff-rename-avoid-overlap' into maint
* ap/maint-diff-rename-avoid-overlap:
tests: make sure rename pretty print works
diff: prevent pprint_rename from underrunning input
diff: Fix rename pretty-print when suffix and prefix overlap
Merge branch 'tb/document-status-u-tradeoff' into maint
* tb/document-status-u-tradeoff:
status: advise to consider use of -u when read_directory takes too long
git status: document trade-offs in choosing parameters to the -u option
* da/downcase-u-in-usage:
contrib/mw-to-git/t/install-wiki.sh: use a lowercase "usage:" string
contrib/examples/git-remote.perl: use a lowercase "usage:" string
tests: use a lowercase "usage:" string
git-svn: use a lowercase "usage:" string
Documentation/user-manual.txt: use a lowercase "usage:" string
templates/hooks--update.sample: use a lowercase "usage:" string
contrib/hooks/setgitperms.perl: use a lowercase "usage:" string
contrib/examples: use a lowercase "usage:" string
contrib/fast-import/import-zips.py: use spaces instead of tabs
contrib/fast-import/import-zips.py: fix broken error message
contrib/fast-import: use a lowercase "usage:" string
contrib/credential: use a lowercase "usage:" string
git-cvsimport: use a lowercase "usage:" string
git-cvsimport: use a lowercase "usage:" string
git-cvsexportcommit: use a lowercase "usage:" string
git-archimport: use a lowercase "usage:" string
git-merge-one-file: use a lowercase "usage:" string
git-relink: use a lowercase "usage:" string
git-svn: use a lowercase "usage:" string
git-sh-setup: use a lowercase "usage:" string
send-email: use the three-arg form of open in recipients_cmd
Perlcritic does not want to see the trailing pipe in the two-args
form of open(), i.e.
open my $fh, "$cmd \Q$file\E |";
If $cmd were a single-token command name, it would make a lot more
sense to use four-or-more-args form "open FILEHANDLE,MODE,CMD,ARGS"
to avoid shell from expanding metacharacters in $file, but we do
expect multi-word string in $to_cmd and $cc_cmd to be expanded by
the shell, so we cannot rewrite it to
open my $fh, "-|", $cmd, $file;
for extra safety. At least, by using this in the three-arg form:
open my $fh, "-|", "$cmd \Q$file\E";
we can silence Perlcritic, even though we do not gain much safety by
doing so.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The subroutine check_file_rev_conflict() is called from two places,
both of which expects to pass a single scalar variable and see if
that can be interpreted as a pathname or a revision name. It is
defined with a function prototype ($) to force a scalar context
while evaluating the arguments at the calling site but it does not
help the current calling sites. The only effect it has is to hurt
future calling sites that may want to build an argument list in an
array variable and call it as check_file_rev_confict(@args).
Drop the misleading prototype, as Perlcritic suggests.
While at it, rename the function to avoid new call sites unaware of
this change arising and add a comment clarifying what this function
is for.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
send-email: use "return;" not "return undef;" on error codepaths
All the callers of "ask", "extract_valid_address", and "validate_patch"
subroutines assign the return values from them to a single scalar:
$var = subr(...);
and "return undef;" in these subroutine can safely be turned into a
simpler "return;". Doing so will also future-proof a new caller that
mistakenly does this:
@foo = ask(...);
if (@foo) { ... we got an answer ... } else { ... we did not ... }
Note that we leave "return undef;" in validate_address on purpose,
even though Perlcritic may complain. The primary "return" site of
the function returns whatever is in the scalar variable $address, so
it is pointless to change only the other "return undef;" to "return".
The caller must be prepared to see an array with a single undef as
the return value from this subroutine anyway.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
After commit cbfd5e1c ("drop some obsolete "x = x" compiler warning
hacks", 21-03-2013) removed a gcc specific hack, older versions of
gcc now issue an "'contents' might be used uninitialized" warning.
In order to suppress the warning, we simply initialize the variable
to NULL in it's declaration.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit cbfd5e1c ("drop some obsolete "x = x" compiler warning hacks",
21-03-2013) removed a gcc hack that suppressed an "might be used
uninitialized" warning issued by older versions of gcc.
However, commit 3aa99df8 ('fast-import: clarify "inline" logic in
file_change_m', 21-03-2013) addresses an (almost) identical issue
(with very similar code), but includes additional code in it's
resolution. The solution used by this commit, unlike that used by
commit cbfd5e1c, also suppresses the -Wuninitialized warning on
older versions of gcc.
In order to suppress the warning (against the 'oe' symbol) in the
note_change_n() function, we adopt the same solution used by commit 3aa99df8.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
the resulting archive would contain only "foo" and ".gitattributes",
not subdir. This was broken with a recent change that intended to
allow "subdir/ export-ignore" to also exclude the directory, but
instead ended up _requiring_ the trailing slash by mistake.
A pattern "subdir" should match any path "subdir", whether it is a
directory or a non-directory. A pattern "subdir/" insists that a
path "subdir" must be a directory for it to match.
This patch adds test not just for this simple case, but also for
deeper cross-directory cases, as well as cases with wildcards.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.c::match_pathname(): pay attention to the length of string parameters
This function takes two counted strings: a <pattern, patternlen> pair
and a <pathname, pathlen> pair. But we end up feeding the result to
fnmatch, which expects NUL-terminated strings.
We can fix this by calling the fnmatch_icase_mem function, which
handles re-allocating into a NUL-terminated string if necessary.
While we're at it, we can avoid even calling fnmatch in some cases. In
addition to patternlen, we get "prefix", the size of the pattern that
contains no wildcard characters. We do a straight match of the prefix
part first, and then use fnmatch to cover the rest. But if there are
no wildcards in the pattern at all, we do not even need to call
fnmatch; we would simply be comparing two empty strings.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.c::match_pathname(): adjust patternlen when shifting pattern
If we receive a pattern that starts with "/", we shift it
forward to avoid looking at the "/" part. Since the prefix
and patternlen parameters are counts of what is in the
pattern, we must decrement them as we increment the pointer.
We remembered to handle prefix, but not patternlen. This
didn't cause any bugs, though, because the patternlen
parameter is not actually used. Since it will be used in
future patches, let's correct this oversight.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.c::match_basename(): pay attention to the length of string parameters
The function takes two counted strings (<basename, basenamelen> and
<pattern, patternlen>) as parameters, together with prefix (the
length of the prefix in pattern that is to be matched literally
without globbing against the basename) and EXC_* flags that tells it
how to match the pattern against the basename.
However, it did not pay attention to the length of these counted
strings. Update them to do the following:
* When the entire pattern is to be matched literally, the pattern
matches the basename only when the lengths of them are the same,
and they match up to that length.
* When the pattern is "*" followed by a string to be matched
literally, make sure that the basenamelen is equal or longer than
the "literal" part of the pattern, and the tail of the basename
string matches that literal part.
* Otherwise, use the new fnmatch_icase_mem helper to make
sure we only lookmake sure we use only look at the
counted part of the strings. Because these counted strings are
full strings most of the time, we check for termination
to avoid unnecessary allocation.
Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
attr.c::path_matches(): special case paths that end with a slash
The function is given a string that ends with a slash to signal that
the path is a directory to make sure that a pattern that ends with a
slash (i.e. MUSTBEDIR) can tell directories and non-directories
apart. However, the pattern itself (pat->pattern and
pat->patternlen) that came from such a MUSTBEDIR pattern is
represented as a string that ends with a slash, but patternlen does
not count that trailing slash. A MUSTBEDIR pattern "element/" is
represented as a counted string <"element/", 7> and this must match
match pathname "element/".
Because match_basename() and match_pathname() want to see pathname
"element" to match against the pattern <"element/", 7>, reduce the
length of the path to exclude the trailing slash when calling
these functions.
Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t5516: test interaction between pushURL and pushInsteadOf correctly
1c2eafb89bca (Add url.<base>.pushInsteadOf: URL rewriting for push
only, 2009-09-07) wants to make sure that a push destination read
from URL is not rewritten by pushInsteadOf because an explicit
pushURL exists; for that, a pushInsteadOf rewrite rule for the value
of remote.r.URL is set to a non-existent is set up.
We would also want to make sure that pushInsteadOf rewrite rule is
not applied to the location read from pushURL.
This way, we will make sure that
- "testrepo/" (pushURL) gets updated;
- the push does not try to update "trash2/" (the result of applying
pushInsteadOf to pushURL);
- the push does not try to update "trash3/" (the result of applying
pushInsteadOf to URL).
When calculating whether there is a d/f conflict, the calculation of
whether both sides are directories generates an incorrect references
mask because it does not use the loop index to set the correct bit.
Fix this typo.
Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>