The submodule code has been taught to work better with separate
work trees created via "git worktree add".
* mk/submodule-gitdir-path:
path: implement common_dir handling in git_pathdup_submodule()
submodule refactor: use strbuf_git_path_submodule() in add_submodule_odb()
"git p4" learned to reencode the pathname it uses to communicate
with the p4 depot with a new option.
* ls/p4-path-encoding:
git-p4: use replacement character for non UTF-8 characters in paths
git-p4: improve path encoding verbose output
git-p4: add config git-p4.pathEncoding
Allow a later "!/abc/def" to override an earlier "/abc" that
appears in the same .gitignore file to make it easier to express
"everything in /abc directory is ignored, except for ...".
* nd/ignore-then-not-ignore:
dir.c: don't exclude whole dir prematurely if neg pattern may match
dir.c: make last_exclude_matching_from_list() run til the end
Allocation related functions and stdio are unsafe things to call
inside a signal handler, and indeed killing the pager can cause
glibc to deadlock waiting on allocation mutex as our signal handler
tries to free() some data structures in wait_for_pager(). Reduce
these unsafe calls.
* ti/glibc-stdio-mutex-from-signal-handler:
pager: don't use unsafe functions in signal handlers
Very small number of options take a parameter that is optional
(which is not a great UI element as they can only appear at the end
of the command line). Add notice to documentation of each and
every one of them.
* mm/keyid-docs:
Documentation: explain optional arguments better
Documentation/grep: fix documentation of -O
Documentation: use 'keyid' consistently, not 'key-id'
The Makefile always runs the library archiver with hardcoded "crs"
options, which was inconvenient for exotic platforms on which
people want to use programs with totally different set of command
line options.
* jw/make-arflags-customizable:
Makefile: allow $(ARFLAGS) specified from the command line
The infrastructure to rewrite "git submodule" in C is being built
incrementally. Let's polish these early parts well enough and make
them graduate to 'next' and 'master', so that the more involved
follow-up can start cooking on a solid ground.
* sb/submodule-helper:
submodule: rewrite `module_clone` shell function in C
submodule: rewrite `module_name` shell function in C
submodule: rewrite `module_list` shell function in C
The "ref-filter" code was taught about many parts of what "tag -l"
does and then "tag -l" is being reimplemented in terms of "ref-filter".
* kn/for-each-tag:
tag.c: implement '--merged' and '--no-merged' options
tag.c: implement '--format' option
tag.c: use 'ref-filter' APIs
tag.c: use 'ref-filter' data structures
ref-filter: add option to match literal pattern
ref-filter: add support to sort by version
ref-filter: add support for %(contents:lines=X)
ref-filter: add option to filter out tags, branches and remotes
ref-filter: implement an `align` atom
ref-filter: introduce match_atom_name()
ref-filter: introduce handler function for each atom
utf8: add function to align a string into given strbuf
ref-filter: introduce ref_formatting_state and ref_formatting_stack
ref-filter: move `struct atom_value` to ref-filter.c
strtoul_ui: reject negative values
Because "test_when_finished" in our test framework queues the
clean-up tasks to be done in a shell variable, it should not be
used inside a subshell. Add a mechanism to allow 'bash' to catch
such uses, and fix the ones that were found.
* jk/test-lint-forbid-when-finished-in-subshell:
test-lib-functions: detect test_when_finished in subshell
t7800: don't use test_config in a subshell
test-lib-functions: support "test_config -C <dir> ..."
t5801: don't use test_when_finished in a subshell
t7610: don't use test_config in a subshell
Update "git subtree" (in contrib/) so that it can take whitespaces
in the pathnames, not only in the in-tree pathname but the name of
the directory that the repository is in.
* as/subtree-with-spaces:
contrib/subtree: respect spaces in a repository path
t7900-subtree: test the "space in a subdirectory name" case
The ssh transport, just like any other transport over the network,
did not clear GIT_* environment variables, but it is possible to
use SendEnv and AcceptEnv to leak them to the remote invocation of
Git, which is not a good idea at all. Explicitly clear them just
like we do for the local transport.
* jk/connect-clear-env:
git_connect: clarify conn->use_shell flag
git_connect: clear GIT_* environment for ssh
"git log --date=local" used to only show the normal (default)
format in the local timezone. The command learned to take 'local'
as an instruction to use the local timezone with other formats,
e.g. "git show --date=rfc-local".
* jk/date-local:
t6300: add tests for "-local" date formats
t6300: make UTC and local dates different
date: make "local" orthogonal to date format
date: check for "local" before anything else
t6300: add test for "raw" date format
t6300: introduce test_date() helper
fast-import: switch crash-report date to iso8601
Documentation/rev-list: don't list date formats
Documentation/git-for-each-ref: don't list date formats
Documentation/config: don't list date formats
Documentation/blame-options: don't list date formats
Move the refs used during a "git bisect" session to per-worktree
hierarchy refs/worktree/* so that independent bisect sessions can
be done in different worktrees.
* dt/refs-bisection:
refs: make refs/bisect/* per-worktree
path: optimize common dir checking
refs: clean up common_list
Users who are too busy to type three extra keystrokes to ask for
"git stash show -p" can now set stash.showPatch configuration
varible to true to always see the actual patch, not just the list
of paths affected with feel for the extent of damage via diffstat.
Correct "git p4 --detect-labels" so that it does not fail to create
a tag that points at a commit that is also being imported.
* ld/p4-import-labels:
git-p4: fix P4 label import for unprocessed commits
git-p4: do not terminate creating tag for unknown commit
git-p4: failing test for ignoring invalid p4 labels
The use of 'good/bad' in "git bisect" made it confusing to use when
hunting for a state change that is not a regression (e.g. bugfix).
The command learned 'old/new' and then allows the end user to
say e.g. "bisect start --term-old=fast --term=new=slow" to find a
performance regression.
Michael's idea to make 'good/bad' more intelligent does have
certain attractiveness ($gname/272867), and makes some of the work
on this topic a moot point.
* ad/bisect-terms:
bisect: allow setting any user-specified in 'git bisect start'
bisect: add 'git bisect terms' to view the current terms
bisect: add the terms old/new
bisect: sanity check on terms
* jc/rerere: (21 commits)
rerere: un-nest merge() further
rerere: use "struct rerere_id" instead of "char *" for conflict ID
rerere: call conflict-ids IDs
rerere: further clarify do_rerere_one_path()
rerere: further de-dent do_plain_rerere()
rerere: refactor "replay" part of do_plain_rerere()
rerere: explain the remainder
rerere: explain "rerere forget" codepath
rerere: explain the primary codepath
rerere: explain MERGE_RR management helpers
rerere: fix benign off-by-one non-bug and clarify code
rerere: explain the rerere I/O abstraction
rerere: do not leak mmfile[] for a path with multiple stage #1 entries
rerere: stop looping unnecessarily
rerere: drop want_sp parameter from is_cmarker()
rerere: report autoupdated paths only after actually updating them
rerere: write out each record of MERGE_RR in one go
rerere: lift PATH_MAX limitation
rerere: plug conflict ID leaks
rerere: handle conflicts with multiple stage #1 entries
...
Some features from "git tag -l" and "git branch -l" have been made
available to "git for-each-ref" so that eventually the unified
implementation can be shared across all three, in a follow-up
series or two.
merge-file: enforce MAX_XDIFF_SIZE on incoming files
The previous commit enforces MAX_XDIFF_SIZE at the
interfaces to xdiff: xdi_diff (which calls xdl_diff) and
ll_xdl_merge (which calls xdl_merge).
But we have another direct call to xdl_merge in
merge-file.c. If it were written today, this probably would
just use the ll_merge machinery. But it predates that code,
and uses slightly different options to xdl_merge (e.g.,
ZEALOUS_ALNUM).
We could try to abstract out an xdi_merge to match the
existing xdi_diff, but even that is difficult. Rather than
simply report error, we try to treat large files as binary,
and that distinction would happen outside of xdi_merge.
The simplest fix is to just replicate the MAX_XDIFF_SIZE
check in merge-file.c.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The xdiff code is not prepared to handle extremely large
files. It uses "int" in many places, which can overflow if
we have a very large number of lines or even bytes in our
input files. This can cause us to produce incorrect diffs,
with no indication that the output is wrong. Or worse, we
may even underallocate a buffer whose size is the result of
an overflowing addition.
We're much better off to tell the user that we cannot diff
or merge such a large file. This patch covers both cases,
but in slightly different ways:
1. For merging, we notice the large file and cleanly fall
back to a binary merge (which is effectively "we cannot
merge this").
2. For diffing, we make the binary/text distinction much
earlier, and in many different places. For this case,
we'll use the xdi_diff as our choke point, and reject
any diff there before it hits the xdiff code.
This means in most cases we'll die() immediately after.
That's not ideal, but in practice we shouldn't
generally hit this code path unless the user is trying
to do something tricky. We already consider files
larger than core.bigfilethreshold to be binary, so this
code would only kick in when that is circumvented
(either by bumping that value, or by using a
.gitattribute to mark a file as diffable).
In other words, we can avoid being "nice" here, because
there is already nice code that tries to do the right
thing. We are adding the suspenders to the nice code's
belt, so notice when it has been worked around (both to
protect the user from malicious inputs, and because it
is better to die() than generate bogus output).
The maximum size was chosen after experimenting with feeding
large files to the xdiff code. It's just under a gigabyte,
which leaves room for two obvious cases:
- a diff3 merge conflict result on files of maximum size X
could be 3*X plus the size of the markers, which would
still be only about 3G, which fits in a 32-bit int.
- some of the diff code allocates arrays of one int per
record. Even if each file consists only of blank lines,
then a file smaller than 1G will have fewer than 1G
records, and therefore the int array will fit in 4G.
Since the limit is arbitrary anyway, I chose to go under a
gigabyte, to leave a safety margin (e.g., we would not want
to overflow by allocating "(records + 1) * sizeof(int)" or
similar.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we call into xdiff to perform a diff, we generally lose
the return code completely. Typically by ignoring the return
of our xdi_diff wrapper, but sometimes we even propagate
that return value up and then ignore it later. This can
lead to us silently producing incorrect diffs (e.g., "git
log" might produce no output at all, not even a diff header,
for a content-level diff).
In practice this does not happen very often, because the
typical reason for xdiff to report failure is that it
malloc() failed (it uses straight malloc, and not our
xmalloc wrapper). But it could also happen when xdiff
triggers one our callbacks, which returns an error (e.g.,
outf() in builtin/rerere.c tries to report a write failure
in this way). And the next patch also plans to add more
failure modes.
Let's notice an error return from xdiff and react
appropriately. In most of the diff.c code, we can simply
die(), which matches the surrounding code (e.g., that is
what we do if we fail to load a file for diffing in the
first place). This is not that elegant, but we are probably
better off dying to let the user know there was a problem,
rather than simply generating bogus output.
We could also just die() directly in xdi_diff, but the
callers typically have a bit more context, and can provide a
better message (and if we do later decide to pass errors up,
we're one step closer to doing so).
There is one interesting case, which is in diff_grep(). Here
if we cannot generate the diff, there is nothing to match,
and we silently return "no hits". This is actually what the
existing code does already, but we make it a little more
explicit.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-u <exec> has never been supported, but it was mentioned since 0a2bb55 (git ls-remote: make usage string match manpage -
2008-11-11). Nobody has complained about it for seven years, it's
probably safe to say nobody cares. So let's remove "-u" in documents
instead of adding code to support it.
While at there, fix --upload-pack syntax too.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, libcurl will follow circular http redirects
forever. Let's put a cap on this so that somebody who can
trigger an automated fetch of an arbitrary repository (e.g.,
for CI) cannot convince git to loop infinitely.
The value chosen is 20, which is the same default that
Firefox uses.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, libcurl would follow redirection to any protocol
it was compiled for support with. This is desirable to allow
redirection from HTTP to HTTPS. However, it would even
successfully allow redirection from HTTP to SFTP, a protocol
that git does not otherwise support at all. Furthermore
git's new protocol-whitelisting could be bypassed by
following a redirect within the remote helper, as it was
only enforced at transport selection time.
This patch limits redirects within libcurl to HTTP, HTTPS,
FTP and FTPS. If there is a protocol-whitelist present, this
list is limited to those also allowed by the whitelist. As
redirection happens from within libcurl, it is impossible
for an HTTP redirect to a protocol implemented within
another remote helper.
When the curl version git was compiled with is too old to
support restrictions on protocol redirection, we warn the
user if GIT_ALLOW_PROTOCOL restrictions were requested. This
is a little inaccurate, as even without that variable in the
environment, we would still restrict SFTP, etc, and we do
not warn in that case. But anything else means we would
literally warn every time git accesses an http remote.
This commit includes a test, but it is not as robust as we
would hope. It redirects an http request to ftp, and checks
that curl complained about the protocol, which means that we
are relying on curl's specific error message to know what
happened. Ideally we would redirect to a working ftp server
and confirm that we can clone without protocol restrictions,
and not with them. But we do not have a portable way of
providing an ftp server, nor any other protocol that curl
supports (https is the closest, but we would have to deal
with certificates).
[jk: added test and version warning]
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Asciidoctor is stricter than AsciiDoc when deciding if underlining
is a section title or the start of preformatted text. Make the
length of the underlining match the text to ensure that it renders
correctly in all implementations.
Signed-off-by: John Keeping <john@keeping.me.uk>
[jc: squashed in git-bisect one noticed by Michael J Gruber] Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule: allow only certain protocols for submodule fetches
Some protocols (like git-remote-ext) can execute arbitrary
code found in the URL. The URLs that submodules use may come
from arbitrary sources (e.g., .gitmodules files in a remote
repository). Let's restrict submodules to fetching from a
known-good subset of protocols.
Note that we apply this restriction to all submodule
commands, whether the URL comes from .gitmodules or not.
This is more restrictive than we need to be; for example, in
the tests we run:
git submodule add ext::...
which should be trusted, as the URL comes directly from the
command line provided by the user. But doing it this way is
simpler, and makes it much less likely that we would miss a
case. And since such protocols should be an exception
(especially because nobody who clones from them will be able
to update the submodules!), it's not likely to inconvenience
anyone in practice.
Reported-by: Blake Burkhart <bburky@bburky.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
transport: add a protocol-whitelist environment variable
If we are cloning an untrusted remote repository into a
sandbox, we may also want to fetch remote submodules in
order to get the complete view as intended by the other
side. However, that opens us up to attacks where a malicious
user gets us to clone something they would not otherwise
have access to (this is not necessarily a problem by itself,
but we may then act on the cloned contents in a way that
exposes them to the attacker).
Ideally such a setup would sandbox git entirely away from
high-value items, but this is not always practical or easy
to set up (e.g., OS network controls may block multiple
protocols, and we would want to enable some but not others).
We can help this case by providing a way to restrict
particular protocols. We use a whitelist in the environment.
This is more annoying to set up than a blacklist, but
defaults to safety if the set of protocols git supports
grows). If no whitelist is specified, we continue to default
to allowing all protocols (this is an "unsafe" default, but
since the minority of users will want this sandboxing
effect, it is the only sensible one).
A note on the tests: ideally these would all be in a single
test file, but the git-daemon and httpd test infrastructure
is an all-or-nothing proposition rather than a test-by-test
prerequisite. By putting them all together, we would be
unable to test the file-local code on machines without
apache.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
notes: correct documentation of DWIMery for notes references
expand_notes_ref is used by --ref from git-notes(1) and --notes from the
git log to find the full refname of a notes reference. Previously the
documentation of these options was not clear about what sorts of
expansions would be performed. Fix the documentation to clearly and
accurately describe the behavior of the expansions.
Add a test for this expansion when using git notes get-ref in order to
prevent future patches from changing this behavior.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-p4: handle "Translation of file content failed"
A P4 repository can get into a state where it contains a file with
type UTF-16 that does not contain a valid UTF-16 BOM. If git-p4
attempts to retrieve the file then the process crashes with a
"Translation of file content failed" error.
More info here: http://answers.perforce.com/articles/KB/3117
Fix this by detecting this error and retrieving the file as binary
instead. The result in Git is the same.
Known issue: This works only if git-p4 is executed in verbose mode.
In normal mode no exceptions are thrown and git-p4 just exits.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-p4: add test case for "Translation of file content failed" error
A P4 repository can get into a state where it contains a file with
type UTF-16 that does not contain a valid UTF-16 BOM. If git-p4
attempts to retrieve the file then the process crashes with a
"Translation of file content failed" error.
More info here: http://answers.perforce.com/articles/KB/3117
Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name of some variables that are used very locally in this
function were overly long; they were making the lines harder to read
and the longer names didn't add much more information.
git-p4: use replacement character for non UTF-8 characters in paths
If non UTF-8 characters are detected in paths then replace them with
a placeholder instead of throwing a UnicodeDecodeError
exception. This restores the original (implicit) implementation that
was broken in 00a9403.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Reviewed-by: Luke Diamand <luke@diamand.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.c: don't exclude whole dir prematurely if neg pattern may match
If there is a pattern "!foo/bar", this patch makes it not exclude "foo"
right away. This gives us a chance to examine "foo" and re-include
"foo/bar".
In order for it to detect that the directory under examination should
not be excluded right away, in other words it is a parent directory of a
negative pattern, the "directory path" of the negative pattern must be
literal. Patterns like "!f?o/bar" can't stop "foo" from being excluded.
Basename matching (i.e. "no slashes in the pattern") or must-be-dir
matching (i.e. "trailing slash in the pattern") does not work well with
this. For example, if we descend in "foo" and are examining "foo/abc",
current code for "foo/" pattern will check if path "foo/abc", not "foo",
is a directory. The same problem with basename matching. These may need
big code reorg to make it work.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.c: make last_exclude_matching_from_list() run til the end
The next patch adds some post processing to the result value before it's
returned to the caller. Keep all branches reach the end of the function,
so we can do it all in one place.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
send-email: fix uninitialized var warning for $smtp_auth
On the latest version of git-send-email, I see this error just before
running SMTP auth (I didn't provide any --smtp-auth= parameter):
Use of uninitialized value $smtp_auth in pattern match (m//) at \
/home/briannorris/git/git/git-send-email.perl line 1139.
Signed-off-by: Brian Norris <computersforpeace@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the documentation of commands taking optional arguments in two
ways:
* Documents the behavior of '-O' (for grep) and '-S' (for commands
creating commits) when used without the optional argument.
* Document the syntax of these options.
For the second point, the behavior is documented in gitcli(7), but it is
easy for users to miss, and hard for the same user to understand why e.g.
"git status -u no" does not work.
Document this explicitly in the documentation of each short option having
an optional argument: they are the most error prone since there is no '='
sign between the option and its argument.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gc: save log from daemonized gc --auto and print it next time
While commit 9f673f9 (gc: config option for running --auto in
background - 2014-02-08) helps reduce some complaints about 'gc
--auto' hogging the terminal, it creates another set of problems.
The latest in this set is, as the result of daemonizing, stderr is
closed and all warnings are lost. This warning at the end of cmd_gc()
is particularly important because it tells the user how to avoid "gc
--auto" running repeatedly. Because stderr is closed, the user does
not know, naturally they complain about 'gc --auto' wasting CPU.
Daemonized gc now saves stderr to $GIT_DIR/gc.log. Following gc --auto
will not run and gc.log printed out until the user removes gc.log.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce three i18n improvements from the following commits:
* tag, update-ref: improve description of option "create-reflog"
* pull: don't mark values for option "rebase" for translation
* show-ref: place angle brackets around variables in usage string
Expanding `insteadOf` is a part of ls-remote --url and there is no way
to expand `pushInsteadOf` as well. Add a get-url subcommand to be able
to query both as well as a way to get all configured urls.
Signed-off-by: Ben Boeckel <mathstuf@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The experimental untracked-cache feature were buggy when paths with
a few levels of subdirectories are involved.
* dt/untracked-subdir:
untracked cache: fix entry invalidation
untracked-cache: fix subdirectory handling
t7063: use --force-untracked-cache to speed up a bit
untracked-cache: support sparse checkout
tag.c: implement '--merged' and '--no-merged' options
Use 'ref-filter' APIs to implement the '--merged' and '--no-merged'
options into 'tag.c'. The '--merged' option lets the user to only list
tags merged into the named commit. The '--no-merged' option lets the
user to only list tags not merged into the named commit. If no object
is provided it assumes HEAD as the object.
Add documentation and tests for the same.
Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>