receive-pack: treat namespace .have lines like alternates
Namely, de-duplicate them. We use the same set as the
alternates, since we call them both ".have" (i.e., there is
no value in showing one versus the other).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comment claims that we handle alternate ".have" lines
through this function, but that hasn't been the case since 85f251045 (write_head_info(): handle "extra refs" locally,
2012-01-06).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
receive-pack: use oidset to de-duplicate .have lines
If you have an alternate object store with a very large
number of refs, the peak memory usage of the sha1_array can
grow high, even if most of them are duplicates that end up
not being printed at all.
The similar for_each_alternate_ref() code-paths in
fetch-pack solve this by using flags in "struct object" to
de-duplicate (and so are relying on obj_hash at the core).
But we don't have a "struct object" at all in this case. We
could call lookup_unknown_object() to get one, but if our
goal is reducing memory footprint, it's not great:
- an unknown object is as large as the largest object type
(a commit), which is bigger than an oidset entry
- we can free the memory after our ref advertisement, but
"struct object" entries persist forever (and the
receive-pack may hang around for a long time, as the
bottleneck is often client upload bandwidth).
So let's use an oidset. Note that unlike a sha1-array it
doesn't sort the output as a side effect. However, our
output is at least stable, because for_each_alternate_ref()
will give us the sha1s in ref-sorted order.
In one particularly pathological case with an alternate that
has 60,000 unique refs out of 80 million total, this reduced
the peak heap usage of "git receive-pack . </dev/null" from
13GB to 14MB.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is similar to many of our uses of sha1-array, but it
overcomes one limitation of a sha1-array: when you are
de-duplicating a large input with relatively few unique
entries, sha1-array uses 20 bytes per non-unique entry.
Whereas this set will use memory linear in the number of
unique entries (albeit a few more than 20 bytes due to
hashmap overhead).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch-pack: cache results of for_each_alternate_ref
We may run for_each_alternate_ref() twice, once in
find_common() and once in everything_local(). This operation
can be expensive, because it involves running a sub-process
which must freshly load all of the alternate's refs from
disk.
Let's cache and reuse the results between the two calls. We
can make some optimizations based on the particular use
pattern in fetch-pack to keep our memory usage down.
The first is that we only care about the sha1s, not the refs
themselves. So it's OK to store only the sha1s, and to
suppress duplicates. The natural fit would therefore be a
sha1_array.
However, sha1_array's de-duplication happens only after it
has read and sorted all entries. It still stores each
duplicate. For an alternate with a large number of refs
pointing to the same commits, this is a needless expense.
Instead, we'd prefer to eliminate duplicates before putting
them in the cache, which implies using a hash. We can
further note that fetch-pack will call parse_object() on
each alternate sha1. We can therefore keep our cache as a
set of pointers to "struct object". That gives us a place to
put our "already seen" bit with an optimized hash lookup.
And as a bonus, the object stores the sha1 for us, so
pointer-to-object is all we need.
There are two extra optimizations I didn't do here:
- we actually store an array of pointer-to-object.
Technically we could just walk the obj_hash table
looking for entries with the ALTERNATE flag set (because
our use case doesn't care about the order here).
But that hash table may be mostly composed of
non-ALTERNATE entries, so we'd waste time walking over
them. So it would be a slight win in memory use, but a
loss in CPU.
- the items we pull out of the cache are actual "struct
object"s, but then we feed "obj->sha1" to our
sub-functions, which promptly call parse_object().
This second parse is cheap, because it starts with
lookup_object() and will bail immediately when it sees
we've already parsed the object. We could save the extra
hash lookup, but it would involve refactoring the
functions we call. It may or may not be worth the
trouble.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
for_each_alternate_ref: replace transport code with for-each-ref
The current method for getting the refs from an alternate is
to run upload-pack in the alternate and parse its output
using the normal transport code. This works and is
reasonably short, but it has a very bad memory footprint
when there are a lot of refs in the alternate. There are two
problems:
1. It reads in all of the refs before passing any back to
us. Which means that our peak memory usage has to store
every ref (including duplicates for peeled variants),
even if our callback could determine that some are not
interesting (e.g., because they point to the same sha1
as another ref).
2. It allocates a "struct ref" for each one. Among other
things, this contains 3 separate 20-byte oids, along
with the name and various pointers. That can add up,
especially if the callback is only interested in the
sha1 (which it can store in a sha1_array as just 20
bytes).
On a particularly pathological case, where the alternate had
over 80 million refs pointing to only around 60,000 unique
objects, the peak heap usage of "git clone --reference" grew
to over 25GB.
This patch instead calls git-for-each-ref in the alternate
repository, and passes each line to the callback as we read
it. That drops the peak heap of the same command to 50MB.
I considered and rejected a few alternatives.
We could read all of the refs in the alternate using our own
ref code, just as we do with submodules. However, as memory
footprint is one of the concerns here, we want to avoid
loading those refs into our own memory as a whole.
It's possible that this will be a better technique in the
future when the ref code can more easily iterate without
loading all of packed-refs into memory.
Another option is to keep calling upload-pack, and just
parse its output ourselves in a streaming fashion. Besides
for-each-ref being simpler (we get to define the format
ourselves, and don't have to deal with speaking the git
protocol), it's more flexible for possible future changes.
For instance, it might be useful for the caller to be able
to limit the set of "interesting" alternate refs. The
motivating example is one where many "forks" of a particular
repository share object storage, and the shared storage has
refs for each fork (which is why so many of the refs are
duplicates; each fork has the same tags). A plausible
future optimization would be to ask for the alternate refs
for just _one_ fork (if you had some out-of-band way of
knowing which was the most interesting or important for the
current operation).
Similarly, no callbacks actually care about the symref value
of alternate refs, and as before, this patch ignores them
entirely. However, if we wanted to add them, for-each-ref's
"%(symref)" is going to be more flexible than upload-pack,
because the latter only handles the HEAD symref due to
historical constraints.
There is one potential downside, though: unlike upload-pack,
our for-each-ref command doesn't report the peeled value of
refs. The existing code calls the alternate_ref_fn callback
twice for tags: once for the tag, and once for the peeled
value with the refname set to "ref^{}".
For the callers in fetch-pack, this doesn't matter at all.
We immediately peel each tag down to a commit either way (so
there's a slight improvement, as do not bother passing the
redundant data over the pipe). For the caller in
receive-pack, it means we will not advertise the peeled
values of tags in our alternate. However, we also don't
advertise peeled values for our _own_ tags, so this is
actually making things more consistent.
It's unclear whether receive-pack advertising peeled values
is a win or not. On one hand, giving more information to the
other side may let it omit some objects from the push. On
the other hand, for tags which both sides have, they simply
bloat the advertisement. The upload-pack advertisement of
git.git is about 30% larger than the receive-pack
advertisement due to its peeled information.
This patch omits the peeled information from
for_each_alternate_ref entirely, and leaves it up to the
caller whether they want to dig up the information.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
for_each_alternate_ref: pass name/oid instead of ref struct
Breaking down the fields in the interface makes it easier to
change the backend of for_each_alternate_ref to something
that doesn't use "struct ref" internally.
The only field that callers actually look at is the oid,
anyway. The refname is kept in the interface as a plausible
thing for future code to want.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
for_each_alternate_ref: use strbuf for path allocation
We have a string with ".../objects" pointing to the
alternate object store, and overwrite bits of it to look at
other paths in the (potential) git repository holding it.
This works because the only path we care about is "refs",
which is shorter than "objects".
Using a strbuf to hold the path lets us get rid of some
magic numbers, and makes it more obvious that the memory
operations are safe.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The real_pathdup() function will have removed extra slashes
for us already (on top of the normalize_path() done when we
created the alternate_object_database struct in the first
place).
Incidentally, this also fixes the case where the path is
just "/", which would read off the start of the array.
That doesn't seem possible to trigger in practice, though,
as link_alt_odb_entry() blindly eats trailing slashes,
including a bare "/".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
for_each_alternate_ref: handle failure from real_pathdup()
In older versions of git, if real_path() failed to resolve
the alternate object store path, we would die() with an
error. However, since 4ac9006f8 (real_path: have callers use
real_pathdup and strbuf_realpath, 2016-12-12) we use the
real_pathdup() function, which may return NULL. Since we
don't check the return value, we can segfault.
This is hard to trigger in practice, since we check that the
path is accessible before creating the alternate_object_database
struct. But it could be removed racily, or we could see a
transient filesystem error.
We could restore the original behavior by switching back to
xstrdup(real_path()). However, dying is probably not the
best option here. This whole function is best-effort
already; there might not even be a repository around the
shared objects at all. And if the alternate store has gone
away, there are no objects to show.
So let's just quietly return, as we would if we failed to
open "refs/", or if upload-pack failed to start, etc.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you run "git log --graph --name-only", the pathnames are
not indented to go along with their matching commits (unlike
all of the other diff formats). We need to output the line
prefix for each item before writing it.
The tests cover both --name-status and --name-only. The
former actually gets this right already, because it builds
on the --raw format functions. It's only --name-only which
uses its own code (and this fix mirrors the code in
diff_flush_raw()).
Note that the tests don't follow our usual style of setting
up the "expect" output inside the test block. This matches
the surrounding style, but more importantly it is easier to
read: we don't have to worry about embedded single-quotes,
and the leading indentation is more obvious.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass the match member of the first pathspec item directly to
read_directory() instead of using common_prefix() to duplicate it first,
thus avoiding memory duplication, strlen(3) and free(3).
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the document patch for f0298cf1c6 (revision walker: include a
detached HEAD in --all - 2009-01-16).
Even though that commit is about detached HEAD, as Jeff pointed out,
always adding HEAD in that case may have subtle differences with
--source or --exclude. So the document mentions nothing about the
detached-ness.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use "test_line_count" instead of "wc -l", use "git -C" instead of a
subshell, and use test_expect_code when calling difftool. Ease
debugging by capturing output into temporary files.
Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ref-filter: resurrect "strip" as a synonym to "lstrip"
We forgot that "strip" was introduced at 0571979bd6 ("tag: do not
show ambiguous tag names as "tags/foo"", 2016-01-25) as part of Git
2.8 (and 2.7.1) when we started calling this "lstrip" to make it
easier to explain the new "rstrip" operation.
We shouldn't have renamed the existing one; "lstrip" should have
been a new synonym that means the same thing as "strip". Scripts
in the wild are surely using the original form already.
The `verbose` and `expire` options of the `git worktree prune`
subcommand have wrong descriptions in that they pretend to relate to
objects. But as the git-worktree(1) correctly states, these options have
nothing to do with objects but only with worktrees. Fix the description
accordingly.
Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
p5302: create repositories for index-pack results explicitly
Before 7176a314 (index-pack: complain when --stdin is used outside of a
repo) index-pack silently created a non-existing target directory; now
the command refuses to work unless it's used against a valid repository.
That causes p5302 to fail, which relies on the former behavior. Fix it
by setting up the destinations for its performance tests using git init.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fatal: BUG: setup_git_env called without repository
Defer repository setup so that the help option processing happens before
the repository is initialized.
Add tests to ensure that the basic usage works inside and outside of a
repository.
Signed-off-by: David Aguilar <davvid@gmail.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reset: add an example of how to split a commit into two
It is often useful to break a commit into multiple parts that are more
logical separations. This can be tricky to learn how to do without the
brute-force method if re-writing code or commit messages from scratch.
Add a section to the git-reset documentation which shows an example
process for how to use git add -p and git commit -c HEAD@{1} to
interactively break a commit apart and re-use the original commit
message as a starting point when making the new commit message.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Command completion only recognizes a subset of the available options for
the various git commands. The set of recognized options needs to balance
between having all useful options and to not clutter the terminal.
This commit adds all long-options that are mentioned in the man-page
synopsis of the respective git command. Possibly dangerous options are
not included in this set, to avoid accidental data loss. The added
options are:
Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Helped-by: Johannes Sixt <j6t@kdbg.org> Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: teach remote subcommands to complete options
Git-remote needs to complete remote names, its subcommands, and options
thereof. In addition to the existing subcommand and remote name
completion, do also complete the options
Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git-replace needs to complete references and its own options. In
addition to the existing references completions, do also complete the
options --edit --graft --format= --list --delete.
Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ls-remote needs to complete remote names and its own options. In
addition to the existing remote name completions, do also complete
the options --heads, --tags, --refs, --get-url, and --symref.
Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Command completion for git-add did not recognize some long-options.
This commits adds completion for all long-options that are mentioned in
the man-page synopsis. In addition, if the user specified `--update` or
`-u`, path completion will only suggest modified tracked files.
Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Managing recorded resolutions requires command-line usage of git-rerere.
Added subcommand completion for rerere and path completion for its
subcommand forget.
Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: teach submodule subcommands to complete options
Each submodule subcommand has specific long-options. Therefore, teach
bash completion to support option completion based on the current
subcommand. All long-options that are mentioned in the man-page synopsis
are added.
Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the previous changes in this series there are only a handful of
$(__gitdir) command substitutions left in the completion script, but
there is still a bit of room for improvements:
1. The command substitution involves the forking of a subshell,
which has considerable overhead on some platforms.
2. There are a few cases, where this command substitution is
executed more than once during a single completion, which means
multiple subshells and possibly multiple 'git rev-parse'
executions. __gitdir() is invoked twice while completing refs
for e.g. 'git log', 'git rebase', 'gitk', or while completing
remote refs for 'git fetch' or 'git push'.
Both of these points can be addressed by using the
__git_find_repo_path() helper function introduced in the previous
commit:
1. __git_find_repo_path() stores the path to the repository in a
variable instead of printing it, so the command substitution
around the function can be avoided. Or rather: the command
substitution should be avoided to make the new value of the
variable set inside the function visible to the callers.
(Yes, there is now a command substitution inside
__git_find_repo_path() around each 'git rev-parse', but that's
executed only if necessary, and only once per completion, see
point 2. below.)
2. $__git_repo_path, the variable holding the path to the
repository, is declared local in the toplevel completion
functions __git_main() and __gitk_main(). Thus, once set, the
path is visible in all completion functions, including all
subsequent calls to __git_find_repo_path(), meaning that they
wouldn't have to re-discover the path to the repository.
So call __git_find_repo_path() and use $__git_repo_path instead of the
$(__gitdir) command substitution to access paths in the .git
directory. Turn tests checking __gitdir()'s repository discovery into
tests of __git_find_repo_path() such that only the tested function
changes but the expected results don't, ensuring that repo discovery
keeps working as it did before.
As __gitdir() is not used anymore in the completion script, mark it as
deprecated and direct users' attention to __git_find_repo_path() and
$__git_repo_path. Yet keep four __gitdir() tests to ensure that it
handles success and failure of __git_find_repo_path() and that it
still handles its optional remote argument, because users' custom
completion scriptlets might depend on it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: extract repository discovery from __gitdir()
To prepare for caching the path to the repository in the following
commit, extract the repository discovering part of __gitdir() into the
__git_find_repo_path() helper function, which stores the found path in
the $__git_repo_path variable instead of printing it. Make __gitdir()
a wrapper around this new function. Declare $__git_repo_path local in
the toplevel completion functions __git_main() and __gitk_main() to
ensure that it never leaks into the environment and influences
subsequent completions (though this isn't necessary right now, as
__gitdir() is still only executed in subshells, but will matter for
the following commit).
Adjust tests checking __gitdir() or any other completion function
calling __gitdir() to perform those checks in a subshell to prevent
$__git_repo_path from leaking into the test environment. Otherwise
leave the tests unchanged to demonstrate that this change doesn't
alter __gitdir()'s behavior.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: don't guard git executions with __gitdir()
Three completion functions, namely __git_index_files(), __git_heads()
and __git_tags(), first run __gitdir() and check that the path it
outputs exists, i.e. that there is a git repository, and run a git
command only if there is one.
After the previous changes in this series there are no further uses of
__gitdir()'s output in these functions besides those checks. And
those checks are unnecessary, because we can just execute those git
commands outside of a repository and let them error out. We don't
perform such a check in other places either.
Remove this check and the __gitdir() call from these functions,
sparing the fork()+exec() overhead of the command substitution and the
potential 'git rev-parse' execution.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: consolidate silencing errors from git commands
Outputting error messages during completion is bad: they disrupt the
command line, can't be deleted, and the user is forced to Ctrl-C and
start over most of the time. We already silence stderr of many git
commands in our Bash completion script, but there are still some in
there that can spew error messages when something goes wrong.
We could add the missing stderr redirections to all the remaining
places, but instead let's leverage that git commands are now executed
through the previously introduced __git() wrapper function, and
redirect standard error to /dev/null only in that function. This way
we need only one redirection to take care of errors from almost all
git commands. Redirecting standard error of the __git() wrapper
function thus became redundant, remove them.
The exceptions, i.e. the repo-independent git executions and those in
the __gitdir() function that don't go through __git() already have
their standard error silenced.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several completion functions contain the following pattern to run git
commands respecting the path to the repository specified on the
command line:
git --git-dir="$(__gitdir)" <cmd> <options>
This imposes the overhead of fork()ing a subshell for the command
substitution and potentially fork()+exec()ing 'git rev-parse' inside
__gitdir().
Now, if neither '--gitdir=<path>' nor '-C <path>' options are
specified on the command line, then those git commands are perfectly
capable to discover the repository on their own. If either one or
both of those options are specified on the command line, then, again,
the git commands could discover the repository, if we pass them all of
those options from the command line.
This means we don't have to run __gitdir() at all for git commands and
can spare its fork()+exec() overhead.
Use Bash parameter expansions to check the $__git_dir variable and
$__git_C_args array and to assemble the appropriate '--git-dir=<path>'
and '-C <path>' options if either one or both are present on the
command line. These parameter expansions are, however, rather long,
so instead of changing all git executions and make already long lines
even longer, encapsulate running git with '--git-dir=<path> -C <path>'
options into the new __git() wrapper function. Furthermore, this
wrapper function will also enable us to silence error messages from
git commands uniformly in one place in a later commit.
There's one tricky case, though: in __git_refs() local refs are listed
with 'git for-each-ref', where "local" is not necessarily the
repository we are currently in, but it might mean a remote repository
in the filesystem (e.g. listing refs for 'git fetch /some/other/repo
<TAB>'). Use one-shot variable assignment to override $__git_dir with
the path of the repository where the refs should come from. Although
one-shot variable assignments in front of shell functions are to be
avoided in our scripts in general, in the Bash completion script we
can do that safely.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git -C <path>' option(s) on the command line should be taken into
account during completion, because
- like '--git-dir=<path>', it can lead us to a different repository,
- a few git commands executed in the completion script do care about
in which directory they are executed, and
- the command for which we are providing completion might care about
in which directory it will be executed.
However, unlike '--git-dir=<path>', the '-C <path>' option can be
specified multiple times and their effect is cumulative, so we can't
just store a single '<path>' in a variable. Nor can we simply
concatenate a path from '-C <path1> -C <path2> ...', because e.g. (in
an arguably pathological corner case) a relative path might be
followed by an absolute path.
Instead, store all '-C <path>' options word by word in the
$__git_C_args array in the main git completion function, and pass this
array, if present, to 'git rev-parse --absolute-git-dir' when
discovering the repository in __gitdir(), and let it take care of
multiple options, relative paths, absolute paths and everything.
Also pass all '-C <path> options via the $__git_C_args array to those
git executions which require a worktree and for which it matters from
which directory they are executed from. There are only three such
cases:
- 'git diff-index' and 'git ls-files' in __git_ls_files_helper()
used for git-aware filename completion, and
- the 'git ls-tree' used for completing the 'ref:path' notation.
The other git commands executed in the completion script don't need
these '-C <path>' options, because __gitdir() already took those
options into account. It would not hurt them, either, but let's not
induce unnecessary code churn.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The output of 'git rev-parse --git-dir' can be either a relative or an
absolute path, depending on whether the current working directory is
at the top of the worktree or the .git directory or not, or how the
path to the repository is specified via the '--git-dir=<path>' option
or the $GIT_DIR environment variable. And if that output is a
relative path, then it is relative to the directory where any 'git
-C <path>' options might have led us.
This doesn't matter at all for regular scripts, because the git
wrapper automatically takes care of changing directories according to
the '-C <path>' options, and the scripts can then simply follow any
path returned by 'git rev-parse --git-dir', even if it's a relative
path.
Our Bash completion script, however, is unique in that it must run
directly in the user's interactive shell environment. This means that
it's not executed through the git wrapper and would have to take care
of any '-C <path> options on its own, and it can't just change
directories as it pleases. Consequently, adding support for taking
any '-C <path>' options on the command line into account during
completion turned out to be considerably more difficult, error prone
and required more subshells and git processes when it had to cope with
a relative path to the .git directory.
Help this rather special use case and teach 'git rev-parse' a new
'--absolute-git-dir' option which always outputs a canonicalized
absolute path to the .git directory, regardless of whether the path is
discovered automatically or is specified via $GIT_DIR or 'git
--git-dir=<path>'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The main completion function finds the name of the git command by
iterating through all the words on the command line in search for the
first non-option-looking word. As it is not aware of 'git -C's
mandatory path argument, if the '-C <path>' option is present, 'path'
will be the first such word and it will be mistaken for a git command.
This breaks completion in various ways:
- If 'path' happens to match one of the commands supported by the
completion script, then options of that command will be offered.
- If 'path' doesn't match a supported command and doesn't contain any
characters not allowed in Bash identifier names, then the
completion script does basically nothing and Bash in turn falls
back to filename completion for all subsequent words.
- Otherwise, if 'path' does contain such an unallowed character, then
it leads to a more or less ugly error message in the middle of the
command line. The standard '/' directory separator is such a
character, and it happens to trigger one of the uglier errors:
$ git -C some/path <TAB>sh.exe": declare: `_git_some/path': not a valid identifier
error: invalid key: alias.some/path
Fix this by skipping 'git -C's mandatory path argument while iterating
over the words on the command line. Extend the relevant test with
this case and, while at it, with cases that needed similar treatment
in the past ('--git-dir', '-c', '--work-tree' and '--namespace').
Additionally, silence the standard error of the 'declare' builtins
looking for the completion function associated with the git command
and of the 'git config' query for the aliased command. So if git ever
learns a new option with a mandatory argument in the future, then,
though the completion script will again misbehave, at least the
command line will not be utterly disrupted by those error messages.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: don't offer commands when 'git --opt' needs an argument
The main git options '--git-dir', '-c', '-C', '--worktree' and
'--namespace' require an argument, but attempting completion right
after them lists git commands.
Don't offer anything right after these options, thus let Bash fall
back to filename completion, because
- the three options '--git-dir', '-C' and '--worktree' do actually
require a path argument, and
- we don't complete the required argument of '-c' and '--namespace',
and in that case the "standard" behavior of our completion script
is to not offer anything, but fall back to filename completion.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: list short refs from a remote given as a URL
e832f5c09680 (completion: avoid ls-remote in certain scenarios,
2013-05-28) turned a 'git ls-remote <remote>' query into a 'git
for-each-ref refs/remotes/<remote>/' to improve responsiveness of
remote refs completion by avoiding potential network communication.
However, it inadvertently made impossible to complete short refs from
a remote given as a URL, e.g. 'git fetch git://server.com/repo.git
<TAB>', because there is, of course, no such thing as
'refs/remotes/git://server.com/repo.git'.
Since the previous commit we tell apart configured remotes, i.e. those
that can have a hierarchy under 'refs/remotes/', from others that
don't, including remotes given as URL, so we know when we can't use
the faster 'git for-each-ref'-based approach.
Resurrect the old, pre-e832f5c09680 'git ls-remote'-based code for the
latter case to support listing short refs from remotes given as a URL.
The code is slightly updated from the original to
- take into account the path to the repository given on the command
line (if any), and
- omit 'ORIG_HEAD' from the query, as 'git ls-remote' will never
list it anyway.
When the remote given to __git_refs() doesn't exist, then it will be
handled by this resurrected 'git ls-remote' query. This code path
doesn't list 'HEAD' unconditionally, which has the nice side effect of
fixing two more expected test failures.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: list refs from remote when remote's name matches a directory
If the remote given to __git_refs() happens to match both the name of
a configured remote and the name of a directory in the current working
directory, then that directory is assumed to be a git repository, and
listing refs from that directory will be attempted. This is wrong,
because in such a situation git commands (e.g. 'git fetch|pull|push
<remote>' whom these refs will eventually be passed to) give
precedence to the configured remote. Therefore, __git_refs() should
list refs from the configured remote as well.
Add the helper function __git_is_configured_remote() that checks
whether its argument matches the name of a configured remote. Use
this helper to decide how to handle the remote passed to __git_refs().
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: respect 'git --git-dir=<path>' when listing remote refs
In __git_refs() the git commands listing refs, both short and full,
from a given remote repository are run without giving them the path to
the git repository which might have been specified on the command line
via 'git --git-dir=<path>'. This is bad, those git commands should
access the 'refs/remotes/<remote>/' hierarchy or the remote and
credentials configuration in that specified repository.
Use the __gitdir() helper only to find the path to the .git directory
and pass the resulting path to the 'git ls-remote' and 'for-each-ref'
executions that list remote refs. While modifying that 'for-each-ref'
line, remove the superfluous disambiguating doubledash.
Don't use __gitdir() to check that the given remote is on the file
system: basically it performs only a single if statement for us at the
considerable cost of fork()ing a subshell for a command substitution.
We are better off to perform all the necessary checks of the remote in
__git_refs().
Though __git_refs() was the last remaining callsite that passed a
remote to __gitdir(), don't delete __gitdir()'s remote-handling part
yet, just in case some users' custom completion scriptlets depend on
it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: fix most spots not respecting 'git --git-dir=<path>'
The completion script already respects the path to the repository
specified on the command line most of the time, here we add the
necessary '--git-dir=$(__gitdir)' options to most of the places where
git was executed without it. The exceptions where said option is not
added are the git invocations:
- in __git_refs() which are non-trivial and will be the subject of
the following patch,
- getting the list of git commands, merge strategies and archive
formats, because these are independent from the repository and
thus don't need it, and
- the 'git rev-parse --git-dir' in __gitdir() itself.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: ensure that the repository path given on the command line exists
The __gitdir() helper function prints the path to the git repository
to its stdout or stays silent and returns with error when it can't
find a repository or when the repository given via $GIT_DIR doesn't
exist.
This is not the case, however, when the path in $__git_dir, i.e. the
path to the repository specified on the command line via 'git
--git-dir=<path>', doesn't exist: __gitdir() still outputs it as if it
were a real existing repository, making some completion functions
believe that they operate on an existing repository.
Check that the path in $__git_dir exists and return with error without
printing anything to stdout if it doesn't.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion tests: add tests for the __git_refs() helper function
Check how __git_refs() lists refs in different scenarios, i.e.
- short and full refs,
- from a local or from a remote repository,
- remote specified via path, name or URL,
- with or without a repository specified on the command line,
- non-existing remote,
- unique remote branches for 'git checkout's tracking DWIMery,
- not in a git repository, and
- interesting combinations of the above.
Seven of these tests expect failure, mostly demonstrating bugs related
to listing refs from a remote repository:
- ignoring the repository specified on the command line (2 tests),
- listing refs from the wrong place when the name of a configured
remote happens to match a directory,
- listing only 'HEAD' but no short refs from a remote given as URL,
- listing 'HEAD' even from non-existing remotes (2 tests), and
- listing 'HEAD' when not in a repository.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion tests: check __gitdir()'s output in the error cases
The __gitdir() helper function shouldn't output anything if not in a
git repository. The relevant tests only checked its error code, so
extend them to ensure that there's no output.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion tests: consolidate getting path of current working directory
Some tests of the __gitdir() helper function use the $TRASH_DIRECTORY
variable in direct path comparisons. In general this should be
avoided, because it might contain symbolic links. There happens to be
no issues with this here, however, because those tests use
$TRASH_DIRECTORY both for specifying the expected result and for
specifying input which in turn is just 'echo'ed verbatim.
Other __gitdir() tests ask for the path of the trash directory by
running $(pwd -P) in each test, sometimes even twice in a single test.
Run $(pwd) only once at the beginning of the test script to store the
path of the trash directory in a variable, and use that variable in
all __gitdir() tests.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion tests: make the $cur variable local to the test helper functions
The test helper functions test_gitcomp() and test_gitcomp_nl() leak
the $cur variable into the test environment. Since this variable has
a special role in the Bash completion script (it holds the word
currently being completed) it influences the behavior of most
completion functions and thus this leakage could interfere with
subsequent tests. Although there are no such issues in the current
tests, early versions of the new tests that will be added later in
this series suffered because of this.
It's better to play safe and declare $cur local in those test helper
functions. 'local' is bashism, of course, but the tests of the Bash
completion script are run under Bash anyway, and there are already
other variables declared local in this test script.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion tests: don't add test cruft to the test repository
While preparing commits, three tests added newly created files to the
index using 'git add .', which added not only the files in question
but leftover test cruft from previous tests like the files 'expected'
and 'actual' as well. Luckily, this had no effect on the tests'
correctness.
Add only the files we are actually interested in.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "core.logAllRefUpdates" that used to be boolean has been
enhanced to take 'always' as well, to record ref updates to refs
other than the ones that are expected to be updated (i.e. branches,
remote-tracking branches and notes).
* cw/log-updates-for-all-refs-really:
doc: add note about ignoring '--no-create-reflog'
update-ref: add test cases for bare repository
refs: add option core.logAllRefUpdates = always
config: add markup to core.logAllRefUpdates doc
"uchar [40]" to "struct object_id" conversion continues.
* rs/object-id:
checkout: convert post_checkout_hook() to struct object_id
use oidcpy() for copying hashes between instances of struct object_id
use oid_to_hex_r() for converting struct object_id hashes to hex strings
"make -C t failed" will now run only the tests that failed in the
previous run. This is usable only when prove is not use, and gives
a useless error message when run after "make clean", but otherwise
is serviceable.
* js/re-running-failed-tests:
t/Makefile: add a rule to re-run previously-failed tests
The user can specify a custom update method that is run when
"submodule update" updates an already checked out submodule. This
was ignored when checking the submodule out for the first time and
we instead always just checked out the commit that is bound to the
path in the superproject's index.
* sb/submodule-update-initial-runs-custom-script:
submodule update: run custom update script for initial populating as well
When a submodule "A", which has another submodule "B" nested within
it, is "absorbed" into the top-level superproject, the inner
submodule "B" used to be left in a strange state. The logic to
adjust the .git pointers in these submodules has been corrected.
* sb/submodule-recursive-absorb:
submodule absorbing: fix worktree/gitdir pointers recursively for non-moves
cache.h: expose the dying procedure for reading gitlinks
setup: add gentle version of resolve_git_dir
Some people feel the default set of colors used by "git log --graph"
rather limiting. A mechanism to customize the set of colors has
been introduced.
* nd/log-graph-configurable-colors:
document behavior of empty color name
color_parse_mem: allow empty color spec
log --graph: customize the graph lines with config log.graphColors
color.c: trim leading spaces in color_parse_mem()
color.c: fix color_parse_mem() with value_len == 0
* ep/commit-static-buf-cleanup:
builtin/commit.c: switch to strbuf, instead of snprintf()
builtin/commit.c: remove the PATH_MAX limitation via dynamic allocation
Asciidoctor, an alternative reimplementation of AsciiDoc, still
needs some changes to work with documents meant to be formatted
with AsciiDoc. "make USE_ASCIIDOCTOR=YesPlease" to use it out of
the box to document our pages is getting closer to reality.
* bc/use-asciidoctor-opt:
Documentation: implement linkgit macro for Asciidoctor
Makefile: add a knob to enable the use of Asciidoctor
Documentation: move dblatex arguments into variable
Documentation: add XSLT to fix DocBook for Texinfo
Documentation: sort sources for gitman.texi
Documentation: remove unneeded argument in cat-texi.perl
Documentation: modernize cat-texi.perl
Documentation: fix warning in cat-texi.perl
Names of the various hook scripts must be spelled exactly, but on
Windows, an .exe binary must be named with .exe suffix; notice
$GIT_DIR/hooks/<hookname>.exe as a valid <hookname> hook.
* js/mingw-hooks-with-exe-suffix:
mingw: allow hooks to be .exe files
After starting "git rebase -i", which first opens the user's editor
to edit the series of patches to apply, but before saving the
contents of that file, "git status" failed to show the current
state (i.e. you are in an interactive rebase session, but you have
applied no steps yet) correctly.
* js/status-pre-rebase-i:
status: be prepared for not-yet-started interactive rebase
Merge branch 'jk/execv-dashed-external' into maint
Typing ^C to pager, which usually does not kill it, killed Git and
took the pager down as a collateral damage in certain process-tree
structure. This has been fixed.
* jk/execv-dashed-external:
execv_dashed_external: wait for child on signal death
execv_dashed_external: stop exiting with negative code
execv_dashed_external: use child_process struct
Commit 55cccf4bb (color_parse_mem: allow empty color spec,
2017-02-01) clearly defined the behavior of an empty color
config variable. Let's document that, and give a hint about
why it might be useful.
It's important not to say that it makes the item uncolored,
because it doesn't. It just sets no attributes, which means
that any previous attributes continue to take effect.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commands git-branch and git-tag accept the '--create-reflog'
option, and create reflog even when core.logallrefupdates
configuration is explicitly set not to.
On the other hand, the negated form '--no-create-reflog' is accepted
as a valid option but has no effect (other than overriding an
earlier '--create-reflog' on the command line). This silent noop may
puzzle users. To communicate that this is a known limitation, add a
short note in the manuals for git-branch and git-tag.
Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
doc: add doc for git-push --recurse-submodules=only
Add documentation for the `--recurse-submodules=only` option of
git-push. The feature was added in commit 225e8bf (add option to
push only submodules).
Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the 'git_attr_set_direction()' up to be closer to the variables
that it modifies as well as a small formatting by renaming the variable
'new' to 'new_direction' so that it is more descriptive.
Update the comment about how 'direction' is used to read the state of
the world. It should be noted that callers of
'git_attr_set_direction()' should ensure that other threads are not
making calls into the attribute system until after the call to
'git_attr_set_direction()' completes. This function essentially acts as
reset button for the attribute system and should be handled with care.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Push the bare repository check into the 'read_attr()' function. This
avoids needing to have extra logic which creates an empty stack frame
when inside a bare repo as a similar bit of logic already exists in the
'read_attr()' function.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
attr: store attribute stack in attr_check structure
The last big hurdle towards a thread-safe API for the attribute system
is the reliance on a global attribute stack that is modified during each
call into the attribute system.
This patch removes this global stack and instead a stack is stored
locally in each attr_check instance. This opens up the opportunity for
future optimizations to customize the attribute stack for the attributes
that a particular attr_check struct is interested in.
One caveat with pushing the attribute stack into the attr_check
structure is that the attribute system now needs to keep track of all
active attr_check instances. Due to the direction mechanism the stack
needs to be dropped when the direction is switched. In order to ensure
correctness when the direction is changed the attribute system needs to
iterate through all active attr_check instances and drop each of their
stacks.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
attr: remove maybe-real, maybe-macro from git_attr
Whether or not a git attribute is real or a macro isn't a property of
the attribute but rather it depends on the attribute stack (which
.gitattribute files were read).
This patch removes the 'maybe_real' and 'maybe_macro' fields in a
git_attr and instead adds the 'macro' field to a attr_check_item. The
'macro' indicates (if non-NULL) that a particular attribute is a macro
for the given attribute stack. It's populated, through a quick scan of
the attribute stack, with the match_attr that corresponds to the macro's
definition. This way the attribute stack only needs to be scanned a
single time prior to attribute collection instead of each time a macro
needs to be expanded.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently there is a reliance on 'check_all_attr' which is a global
array of 'attr_check_item' items which is used to store the value of
each attribute during the collection process.
This patch eliminates this global and instead creates an array per
'attr_check' instance which is then used in the attribute collection
process. This brings the attribute system one step closer to being
thread-safe.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current implementation of the attribute dictionary uses a custom
hashtable. This modernizes the dictionary by converting it to the builtin
'hashmap' structure.
Also, in order to enable a threaded API in the future add an
accompanying mutex which must be acquired prior to accessing the
dictionary of interned attributes.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
attr: change validity check for attribute names to use positive logic
Convert 'invalid_attr_name()' to 'attr_name_valid()' and use positive
logic for the return value. In addition create a helper function that
prints out an error message when an invalid attribute name is used.
We could later update the message to exactly spell out what the
rules for a good attribute name are, etc.
Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
attr: pass struct attr_check to collect_some_attrs
The old callchain used to take an array of attr_check_item items.
Instead pass the 'attr_check' container object to 'collect_some_attrs()'
and access the fields in the data structure directly.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since nobody uses the old API, make it file-scope static, and update
the documentation to describe the new API.
Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
attr: convert git_check_attrs() callers to use the new API
The remaining callers are all simple "I have N attributes I am
interested in. I'll ask about them with various paths one by one".
After this step, no caller to git_check_attrs() remains. After
removing it, we can extend "struct attr_check" struct with data
that can be used in optimizing the query for the specific N
attributes it contains.
Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
attr: convert git_all_attrs() to use "struct attr_check"
This updates the other two ways the attribute check is done via an
array of "struct attr_check_item" elements. These two niches
appear only in "git check-attr".
* The caller does not know offhand what attributes it wants to ask
about and cannot use attr_check_initl() to prepare the
attr_check structure.
* The caller may not know what attributes it wants to ask at all,
and instead wants to learn everything that the given path has.
Such a caller can call attr_check_alloc() to allocate an empty
attr_check, and then call attr_check_append() to add attribute names
one by one.
Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>