Initialisation of the dir_struct and path_exclude_check structs was
previously done within check_ignore(). This was acceptable since
check_ignore() was only called once per check-ignore invocation;
however the next commit will convert it into an inner loop which is
called once per line of STDIN when --stdin is given. Therefore moving
the initialisation code out into cmd_check_ignore() ensures that
initialisation is still only performed once per check-ignore
invocation, and consequently that the output is identical whether
pathspecs are provided as CLI arguments or via STDIN.
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If `-n` or `--non-matching` are specified, non-matching pathnames will
also be output, in which case all fields in each output record except
for <pathname> will be empty. This can be useful when running
check-ignore as a background process, so that files can be
incrementally streamed to STDIN, and for each of these files, STDOUT
will indicate whether that file matched a pattern or not. (Without
this option, it would be impossible to tell whether the absence of
output for a given file meant that it didn't match any pattern, or
that the result simply hadn't been flushed to STDOUT yet.)
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The expected contents of STDOUT for the final --stdin tests can be
derived from the expected contents of STDOUT for the same tests when
--verbose is given, in the same way that test_expect_success_multi
derives this for earlier tests.
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Usually we do not pass an empty string to the function hash_name()
because we almost always ask for hash values for a path that is a
candidate to be added to the index. However, check-ignore (and most
likely check-attr, but I didn't check) apparently has a callchain
to ask the hash value for an empty path when it was given a "." from
the top-level directory to ask "Is the path . excluded by default?"
Make sure that hash_name() does not overrun the end of the given
pathname even when it is empty.
Remove a sweep-the-issue-under-the-rug conditional in check-ignore
that avoided to pass an empty string to the callchain while at it.
It is a valid question to ask for check-ignore if the top-level is
set to be ignored by default, even though the answer is most likely
no, if only because there is currently no way to specify such an
entry in the .gitignore file. But it is an unusual thing to ask and
it is not worth optimizing for it by special casing at the top level
of the call chain.
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
test_expect_success_multi() helper function warrants some explanation,
since at first sight it may seem like generic test framework plumbing,
but is in fact specific to testing check-ignore, and allows more
thorough testing of the various output formats without significantly
increase the size of t0008.
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
clean.c, ls-files.c: respect encapsulation of exclude_list_groups
Consumers of the dir.c traversal API should avoid assuming knowledge
of the internal implementation of exclude_list_groups. Therefore
when adding items to an exclude list, it should be accessed via the
pointer returned from add_exclude_list(), rather than by referencing
a location within dir.exclude_list_groups[EXC_CMDL].
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Brace expansion is a shell feature that's not required by POSIX and not
supported by dash nor NetBSD's sh. Explicitly list all combinations
instead. Also avoid calling touch by creating the test files with a
redirection instead, as suggested by Junio.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add.c: extract check_path_for_gitlink() from treat_gitlinks() for reuse
Extract the body of the for loop in treat_gitlinks() into a separate
check_path_for_gitlink() function so that it can be reused elsewhere.
This paves the way for a new check-ignore sub-command.
Also document treat_gitlinks().
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pathspec.c: rename newly public functions for clarity
Perform the following function renames to make it explicit that these
pathspec handling functions are for matching against the index, rather
than against a tree or the working directory.
dir.c: provide clear_directory() for reclaiming dir_struct memory
By the end of a directory traversal, a dir_struct instance will
typically contains pointers to various data structures on the heap.
clear_directory() provides a convenient way to reclaim that memory.
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
For exclude patterns read in from files, the filename is stored in the
exclude list, and the originating line number is stored in the
individual exclude (counting starting at 1).
For exclude patterns provided on the command line, a string describing
the source of the patterns is stored in the exclude list, and the
sequence number assigned to each exclude pattern is negative, with
counting starting at -1. So for example the 2nd pattern provided via
--exclude would be numbered -2. This allows any future consumers of
that data to easily distinguish between exclude patterns from files
vs. from the CLI.
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.c: use a single struct exclude_list per source of excludes
Previously each exclude_list could potentially contain patterns
from multiple sources. For example dir->exclude_list[EXC_FILE]
would typically contain patterns from .git/info/exclude and
core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain
patterns from multiple per-directory .gitignore files during
directory traversal (i.e. when dir->exclude_stack was more than
one item deep).
We split these composite exclude_lists up into three groups of
exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each
exclude_list now contains patterns from a single source. This will
allow us to cleanly track the origin of each pattern simply by adding
a src field to struct exclude_list, rather than to struct exclude,
which would make memory management of the source string tricky in the
EXC_DIRS case where its contents are dynamically generated.
Similarly, by moving the filebuf member from struct exclude_stack to
struct exclude_list, it allows us to track and subsequently free
memory buffers allocated during the parsing of all exclude files,
rather than only tracking buffers allocated for files in the EXC_DIRS
group.
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.c: rename free_excludes() to clear_exclude_list()
It is clearer to use a 'clear_' prefix for functions which empty
and deallocate the contents of a data structure without freeing
the structure itself, and a 'free_' prefix for functions which
also free the structure itself.
In a similar way to the previous commit, this extracts a new helper
function last_exclude_matching_path() which return the last
exclude_list element which matched, or NULL if no match was found.
is_path_excluded() becomes a wrapper around this, and just returns 0
or 1 depending on whether any matching exclude_list element was found.
This allows callers to find out _why_ a given path was excluded,
rather than just whether it was or not, paving the way for a new git
sub-command which allows users to test their exclude lists from the
command line.
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar way to the previous commit, this extracts a new helper
function last_exclude_matching() which returns the last exclude_list
element which matched, or NULL if no match was found. is_excluded()
becomes a wrapper around this, and just returns 0 or 1 depending on
whether any matching exclude_list element was found.
This allows callers to find out _why_ a given path was excluded,
rather than just whether it was or not, paving the way for a new git
sub-command which allows users to test their exclude lists from the
command line.
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The excluded function uses a new helper function called
last_exclude_matching_from_list() to perform the inner loop over all of
the exclude patterns. The helper just tells us whether the path is
included, excluded, or undecided.
However, it may be useful to know _which_ pattern was triggered. So
let's pass out the entire exclude match, which contains the status
information we were already passing out.
Further patches can make use of this.
This is a modified forward port of a patch from 2009 by Jeff King:
http://article.gmane.org/gmane.comp.version-control.git/108815
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.c: rename cryptic 'which' variable to more consistent name
'el' is only *slightly* less cryptic, but is already used as the
variable name for a struct exclude_list pointer in numerous other
places, so this reduces the number of cryptic variable names in use by
one :-)
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve documentation and comments regarding directory traversal API
traversal API has a few potentially confusing properties. These
comments clarify a few key aspects and will hopefully make it easier
to understand for other newcomers in the future.
Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
.gitattributes and .gitignore share the same pattern syntax but has
separate matching implementation. Over the years, ignore's
implementation accumulates more optimizations while attr's stays the
same.
This patch reuses the core matching functions that are also used by
excluded_from_list. excluded_from_list and path_matches can't be
merged due to differences in exclude and attr, for example:
* "!pattern" syntax is forbidden in .gitattributes. As an attribute
can be unset (i.e. set to a special value "false") or made back to
unspecified (i.e. not even set to "false"), "!pattern attr" is unclear
which one it means.
* we support attaching attributes to directories, but git-core
internally does not currently make use of attributes on
directories.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitignore: make pattern parsing code a separate function
This function can later be reused by attr.c. Also turn to_exclude
field into a flag.
Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "namelen" becomes zero at this stage, we have matched the fixed
part, but whether it actually matches the pattern still depends on the
pattern in "exclude". As demonstrated in t3001, path "three/a.3"
exists and it matches the "three/a.3" part in pattern "three/a.3[abc]",
but that does not mean a true match.
Don't be too optimistic and let fnmatch() do the job.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
exclude: stricten a length check in EXC_FLAG_ENDSWITH case
This block of code deals with the "basename" part only, which has the
length of "pathlen - (basename - pathname)". Stricten the length check
and remove "pathname" from the main expression to avoid confusion.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge commit 'f9f6e2c' into nd/attr-match-optim-more
* commit 'f9f6e2c':
exclude: do strcmp as much as possible before fnmatch
dir.c: get rid of the wildcard symbol set in no_wildcard()
Unindent excluded_from_list()
Merge branch 'jc/maint-checkout-fileglob-doc' into maint-1.7.11
* jc/maint-checkout-fileglob-doc:
gitcli: contrast wildcard given to shell and to git
gitcli: formatting fix
Document file-glob for "git checkout -- '*.c'"
Merge branch 'jc/apply-binary-p0' into maint-1.7.11
"git apply -p0" did not parse pathnames on "diff --git" line
correctly. This caused patches that had pathnames in no other
places to be mistakenly rejected (most notably, binary patch that
does not rename nor change mode). Textual patches, renames or mode
changes have preimage and postimage pathnames in different places in
a form that can be parsed unambiguously and did not suffer from this
problem.
* jc/apply-binary-p0:
apply: compute patch->def_name correctly under -p0
Merge branch 'jc/dotdot-is-parent-directory' into maint-1.7.11
"git log .." errored out saying it is both rev range and a path when
there is no disambiguating "--" is on the command line. Update the
command line parser to interpret ".." as a path in such a case.
* jc/dotdot-is-parent-directory:
specifying ranges: we did not mean to make ".." an empty set
Merge branch 'jc/maint-doc-checkout-b-always-takes-branch-name' into maint-1.7.11
The synopsis said "checkout [-B branch]" to make it clear the
branch name is a parameter to the option, but the heading for the
option description was "-B::", not "-B branch::", making the
documentation misleading.
* jc/maint-doc-checkout-b-always-takes-branch-name:
doc: "git checkout -b/-B/--orphan" always takes a branch name
Merge branch 'jk/maint-http-half-auth-push' into maint-1.7.11
Pushing to smart HTTP server with recent Git fails without having
the username in the URL to force authentication, if the server is
configured to allow GET anonymously, while requiring authentication
for POST.
* jk/maint-http-half-auth-push:
http: prompt for credentials on failed POST
http: factor out http error code handling
t: test http access to "half-auth" repositories
t: test basic smart-http authentication
t/lib-httpd: recognize */smart/* repos as smart-http
t/lib-httpd: only route auth/dumb to dumb repos
t5550: factor out http auth setup
t5550: put auth-required repo in auth/dumb
Merge branch 'jc/maint-protect-sh-from-ifs' into maint-1.7.11
When the user exports a non-default IFS without HT, scripts that
rely on being able to parse "ls-files -s | while read a b c..."
start to fail. Protect them from such a misconfiguration.
* jc/maint-protect-sh-from-ifs:
sh-setup: protect from exported IFS
Merge branch 'bc/receive-pack-stdout-protection' into maint-1.7.11
When "git push" triggered the automatic gc on the receiving end, a
message from "git prune" that said it was removing cruft leaked to
the standard output, breaking the communication protocol.
* bc/receive-pack-stdout-protection:
receive-pack: do not leak output from auto-gc to standard output
t/t5400: demonstrate breakage caused by informational message from prune
Merge branch 'jk/maint-null-in-trees' into maint-1.7.11
"git diff" had a confusion between taking data from a path in the
working tree and taking data from an object that happens to have
name 0{40} recorded in a tree.
* jk/maint-null-in-trees:
fsck: detect null sha1 in tree entries
do not write null sha1s to on-disk index
diff: do not use null sha1 as a sentinel value
Merge branch 'mm/die-with-dashdash-help' into maint-1.7.11
When the user gives an argument that can be taken as both a
revision name and a pathname without disambiguating with "--", we
used to give a help message "Use '--' to separate". The message
has been clarified to show where that '--' goes on the command
line.
* mm/die-with-dashdash-help:
setup: clarify error messages for file/revisions ambiguity
gitcli: contrast wildcard given to shell and to git
People who are not used to working with shell may intellectually
understand how the command line argument is massaged by the shell
but still have a hard time visualizing the difference between
letting the shell expand fileglobs and having Git see the fileglob
to use as a pathspec.
The paragraph to encourage use of "--" in scripts belongs to the
bullet point that describes the behaviour for a command line without
the explicit "--" disambiguation; it is not a supporting explanation
for the entire bulletted list, and it is wrong to make it a separate
paragraph outside the list.
Tag contents, once read, are forever cached in memory.
This makes gitk unable to notice when tag contents change.
Allow users to cause a reload of the tag contents by using
the "File->Reread references" action.
Reported-by: Tim McCormack <cortex@brainonfire.net> Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
When fed such a commit, --format='%ci' fails to parse it, and gives
back an empty string. Update the split_ident_line() to be a bit
more lenient when parsing, but make sure the caller that wants to
pick up sane value from its return value does its own validation.
Originally the "--quiet" option was parsed by the
diff-option parser into the internal QUICK option. This had
the effect of silencing diff output from the log (which was
not intended, but happened to work and people started to
use it). But it also had other odd side effects at the diff
level (for example, it would suppress the second commit in
"git show A B").
To fix this, commit 1c40c36 converted log to parse-options
and handled the "quiet" option separately, not passing it
on to the diff code. However, it simply ignored the option,
which was a regression for people using it as a synonym for
"-s". Commit 01771a8 then fixed that by interpreting the
option to add DIFF_FORMAT_NO_OUTPUT to the list of output
formats.
However, that commit did not fix it in all cases. It sets
the flag after setup_revisions is called. Naively, this
makes sense because you would expect the setup_revisions
parser to overwrite our output format flag if "-p" or
another output format flag is seen.
However, that is not how the NO_OUTPUT flag works. We
actually store it in the bit-field as just another format.
At the end of setup_revisions, we call diff_setup_done,
which post-processes the bitfield and clears any other
formats if we have set NO_OUTPUT. By setting the flag after
setup_revisions is done, diff_setup_done does not have a
chance to make this tweak, and we end up with other format
options still set.
As a result, the flag would have no effect in "git log -p
--quiet" or "git show --quiet". Fix it by setting the
format flag before the call to setup_revisions.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the smart-http GET requests go through the http_get_*
functions, which will prompt for credentials and retry if we
see an HTTP 401.
POST requests, however, do not go through any central point.
Moreover, it is difficult to retry in the general case; we
cannot assume the request body fits in memory or is even
seekable, and we don't know how much of it was consumed
during the attempt.
Most of the time, this is not a big deal; for both fetching
and pushing, we make a GET request before doing any POSTs,
so typically we figure out the credentials during the first
request, then reuse them during the POST. However, some
servers may allow a client to get the list of refs from
receive-pack without authentication, and then require
authentication when the client actually tries to POST the
pack.
This is not ideal, as the client may do a non-trivial amount
of work to generate the pack (e.g., delta-compressing
objects). However, for a long time it has been the
recommended example configuration in git-http-backend(1) for
setting up a repository with anonymous fetch and
authenticated push. This setup has always been broken
without putting a username into the URL. Prior to commit 986bbc0, it did work with a username in the URL, because git
would prompt for credentials before making any requests at
all. However, post-986bbc0, it is totally broken. Since it
has been advertised in the manpage for some time, we should
make sure it works.
Unfortunately, it is not as easy as simply calling post_rpc
again when it fails, due to the input issue mentioned above.
However, we can still make this specific case work by
retrying in two specific instances:
1. If the request is large (bigger than LARGE_PACKET_MAX),
we will first send a probe request with a single flush
packet. Since this request is static, we can freely
retry it.
2. If the request is small and we are not using gzip, then
we have the whole thing in-core, and we can freely
retry.
That means we will not retry in some instances, including:
1. If we are using gzip. However, we only do so when
calling git-upload-pack, so it does not apply to
pushes.
2. If we have a large request, the probe succeeds, but
then the real POST wants authentication. This is an
extremely unlikely configuration and not worth worrying
about.
While it might be nice to cover those instances, doing so
would be significantly more complex for very little
real-world gain. In the long run, we will be much better off
when curl learns to internally handle authentication as a
callback, and we can cleanly handle all cases that way.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of our http requests go through the http_request()
interface, which does some nice post-processing on the
results. In particular, it handles prompting for missing
credentials as well as approving and rejecting valid or
invalid credentials. Unfortunately, it only handles GET
requests. Making it handle POSTs would be quite complex, so
let's pull result handling code into its own function so
that it can be reused from the POST code paths.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some sites set up http access to repositories such that
fetching is anonymous and unauthenticated, but pushing is
authenticated. While there are multiple ways to do this, the
technique advertised in the git-http-backend manpage is to
block access to locations matching "/git-receive-pack$".
Let's emulate that advice in our test setup, which makes it
clear that this advice does not actually work.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not currently test authentication over smart-http at
all. In theory, it should work exactly as it does for dumb
http (which we do test). It does indeed work for these
simple tests, but this patch lays the groundwork for more
complex tests in future patches.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/lib-httpd: recognize */smart/* repos as smart-http
We do not currently test authentication for smart-http repos
at all. Part of the infrastructure to do this is recognizing
that auth/smart is indeed a smart-http repo.
The current apache config recognizes only "^/smart/*" as
smart-http. Let's instead treat anything with /smart/ in the
URL as smart-http. This is obviously a stupid thing to do
for a real production site, but for our test suite we know
that our repositories will not have this magic string in the
name.
Note that we will route /foo/smart/bar.git directly to
git-http-backend/bar.git; in other words, everything before
the "/smart/" is irrelevant to finding the repo on disk (but
may impact apache config, for example by triggering auth
checks).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our test apache config points all of auth/ directly to the
on-disk repositories via an Alias directive. This works fine
because everything authenticated is currently in auth/dumb,
which is a subset. However, this would conflict with a
ScriptAlias for auth/smart (which will come in future
patches), so let's narrow the Alias.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The t5550 script sets up a nice askpass helper for
simulating user input and checking what git prompted for.
Let's make it available to other http scripts by migrating
it to lib-httpd.
We can use this immediately in t5540 to make our tests more
robust (previously, we did not check at all that hitting the
password-protected repo actually involved a password).
Unfortunately, we end up failing the test because the
current code erroneously prompts twice (once for
git-remote-http, and then again when the former spawns
git-http-push).
More importantly, though, it will let us easily add
smart-http authentication tests in t5541 and t5551; we
currently do not test smart-http authentication at all.
As part of making it generic, let's always look for and
store auxiliary askpass files at the top-level trash
directory; this makes it compatible with t5540, which runs
some tests from sub-repositories. We can abstract away the
ugliness with a short helper function.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most of our tests, we put repos to be accessed by dumb
protocols in /dumb, and repos to be accessed by smart
protocols in /smart. In our test apache setup, the whole
/auth hierarchy requires authentication. However, we don't
bother to split it by smart and dumb here because we are not
currently testing smart-http authentication at all.
That will change in future patches, so let's be explicit
that we are interested in testing dumb access here. This
also happens to match what t5540 does for the push tests.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
apply: compute patch->def_name correctly under -p0
Back when "git apply" was written, we made sure that the user can
skip more than the default number of path components (i.e. 1) by
giving "-p<n>", but the logic for doing so was built around the
notion of "we skip N slashes and stop". This obviously does not
work well when running under -p0 where we do not want to skip any,
but still want to skip SP/HT that separates the pathnames of
preimage and postimage and want to reject absolute pathnames.
Stop using "stop_at_slash()", and instead introduce a new helper
"skip_tree_prefix()" with similar logic but works correctly even for
the -p0 case.
This is an ancient bug, but has been masked for a long time because
most of the patches are text and have other clues to tell us the
name of the preimage and the postimage.
Merge branch 'jc/maint-abbrev-option-cli' into maint-1.7.11
We did not document that many commands take unique prefix
abbreviations of long options (e.g. "--option" may be the only flag
that the command accepts that begin with "--opt", in which case you
can give "--opt") anywhere easy to find for new people.
* jc/maint-abbrev-option-cli:
gitcli: describe abbreviation of long options
Merge branch 'jc/maint-rev-list-topo-doc' into maint-1.7.11
It was unclear what "--topo-order" was really about in the
documentation. It is not just about "children before parent", but
also about "don't mix lineages".
Merge branch 'hv/coding-guidelines' into maint-1.7.11
In earlier days, "imitate the style in the neibouring code" was
sufficient to keep the coherent style, but over time some parts of
the codebase have drifted enough to make it ineffective.
* hv/coding-guidelines:
Documentation/CodingGuidelines: spell out more shell guidelines
Our documentation used to assume having files in .git/refs/*
directories was the only to have branches and tags, but that is not
true for quite some time.
* jc/tag-doc:
Documentation: do not mention .git/refs/* directories