Merge branch 'jk/filter-branch-use-of-sed-on-incomplete-line' into maint
"filter-branch" corrupted commit log message that ends with an
incomplete line on platforms with some "sed" implementations that
munge such a line. Work it around by avoiding to use "sed".
* jk/filter-branch-use-of-sed-on-incomplete-line:
filter-branch: avoid passing commit message through sed
Merge branch 'jk/stash-require-clean-index' into maint
"git stash pop/apply" forgot to make sure that not just the working
tree is clean but also the index is clean. The latter is important
as a stash application can conflict and the index will be used for
conflict resolution.
* jk/stash-require-clean-index:
stash: require a clean index to apply
t3903: avoid applying onto dirty index
t3903: stop hard-coding commit sha1s
Merge branch 'jk/git-no-more-argv0-path-munging' into maint
We have prepended $GIT_EXEC_PATH and the path "git" is installed in
(typically "/usr/bin") to $PATH when invoking subprograms and hooks
for almost eternity, but the original use case the latter tried to
support was semi-bogus (i.e. install git to /opt/foo/git and run it
without having /opt/foo on $PATH), and more importantly it has
become less and less relevant as Git grew more mainstream (i.e. the
users would _want_ to have it on their $PATH). Stop prepending the
path in which "git" is installed to users' $PATH, as that would
interfere the command search order people depend on (e.g. they may
not like versions of programs that are unrelated to Git in /usr/bin
and want to override them by having different ones in /usr/local/bin
and have the latter directory earlier in their $PATH).
* jk/git-no-more-argv0-path-munging:
stop putting argv[0] dirname at front of PATH
Teach the codepaths that read .gitignore and .gitattributes files
that these files encoded in UTF-8 may have UTF-8 BOM marker at the
beginning; this makes it in line with what we do for configuration
files already.
* cn/bom-in-gitignore:
attr: skip UTF8 BOM at the beginning of the input file
config: use utf8_bom[] from utf.[ch] in git_parse_source()
utf8-bom: introduce skip_utf8_bom() helper
add_excludes_from_file: clarify the bom skipping logic
dir: allow a BOM at the beginning of exclude files
Access to objects in repositories that borrow from another one on a
slow NFS server unnecessarily got more expensive due to recent code
becoming more cautious in a naive way not to lose objects to pruning.
* jk/prune-mtime:
sha1_file: only freshen packs once per run
sha1_file: freshen pack objects before loose
reachable: only mark local objects as recent
Merge branch 'jk/init-core-worktree-at-root' into maint
We avoid setting core.worktree when the repository location is the
".git" directory directly at the top level of the working tree, but
the code misdetected the case in which the working tree is at the
root level of the filesystem (which arguably is a silly thing to
do, but still valid).
* jk/init-core-worktree-at-root:
init: don't set core.worktree when initializing /.git
Merge branch 'jc/diff-no-index-d-f' into maint-2.3
The usual "git diff" when seeing a file turning into a directory
showed a patchset to remove the file and create all files in the
directory, but "git diff --no-index" simply refused to work. Also,
when asked to compare a file and a directory, imitate POSIX "diff"
and compare the file with the file with the same name in the
directory, instead of refusing to run.
* jc/diff-no-index-d-f:
diff-no-index: align D/F handling with that of normal Git
diff-no-index: DWIM "diff D F" into "diff D/F F"
When 01cec54e (daemon: deglobalize hostname information, 2015-03-07)
wrapped the global variables such as hostname inside a struct, it
forgot to convert one location that spelled "hostname" that needs to
be updated to "hi->hostname".
This was inside NO_IPV6 block, and was not caught by anybody.
completion: fix and update 'git log --decorate=' options
'git log --decorate=' understands the 'full', 'short' and 'no' options.
From these the completion script only offered 'short' and it offered
'long' instead of 'full'.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
filter-branch: avoid passing commit message through sed
On some systems (like OS X), if sed encounters input without
a trailing newline, it will silently add it. As a result,
"git filter-branch" on such systems may silently rewrite
commit messages that omit a trailing newline. Even though
this is not something we generate ourselves with "git
commit", it's better for filter-branch to preserve the
original data as closely as possible.
We're using sed here only to strip the header fields from
the commit object. We can accomplish the same thing with a
shell loop. Since shell "read" calls are slow (usually one
syscall per byte), we use "cat" once we've skipped past the
header. Depending on the size of your commit messages, this
is probably faster (you pay the cost to fork, but then read
the data in saner-sized chunks). This idea is shamelessly
stolen from Junio.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the branch to be rebased is already up to date, we
"git checkout" the branch, print an "up to date" message,
and end the rebase early. However, our checkout may print
"Switched to branch 'foo'" or "Already on 'foo'", even if
the user has asked for "--quiet".
We should avoid printing these messages at all, "--quiet" or
no. Since the rebase is a noop, this checkout can be seen as
optimizing out these other two checkout operations (that
happen in a real rebase):
1. Moving to the detached HEAD to start the rebase; we
always feed "-q" to checkout there, and instead rely on
our own custom message (which respects --quiet).
2. Finishing a rebase, where we move to the final branch.
Here we actually use update-ref rather than
git-checkout, and produce no messages.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'tb/connect-ipv6-parse-fix' into maint
An earlier update to the parser that disects a URL broke an
address, followed by a colon, followed by an empty string (instead
of the port number), e.g. ssh://example.com:/path/to/repo.
* tb/connect-ipv6-parse-fix:
connect.c: ignore extra colon after hostname
The "git push --signed" protocol extension did not limit what the
"nonce" that is a server-chosen string can contain or how long it
can be, which was unnecessarily lax. Limit both the length and the
alphabet to a reasonably small space that can still have enough
entropy.
* jc/push-cert:
push --signed: tighten what the receiving end can ask to sign
When the "git" wrapper is invoked, we prepend the baked-in
exec-path to our PATH, so that any sub-processes we exec
will all find the git-foo commands that match the wrapper
version.
If you invoke git with an absolute path, like:
/usr/bin/git foo
we also prepend "/usr/bin" to the PATH. This was added long
ago by by 231af83 (Teach the "git" command to handle some
commands internally, 2006-02-26), with the intent that
things would just work if you did something like:
cd /opt
tar xzf premade-git-package.tar.gz
alias git=/opt/git/bin/git
as we would then find all of the related external commands
in /opt/git/bin. I.e., it made git runtime-relocatable,
since at the time of 231af83, we installed all of the git
commands into $(bindir). But these days, that is not enough.
Since f28ac70 (Move all dashed-form commands to libexecdir,
2007-11-28), we do not put commands into $(bindir), and you
actually need to convert "/usr/bin" into "/usr/libexec". And
not just for finding binaries; we want to find $(sharedir),
etc, the same way. The RUNTIME_PREFIX build knob does this
the right way, by assuming a sane hierarchy rooted at
"$prefix" when we run "$prefix/bin/git", and inferring
"$prefix/libexec/git-core", etc.
So this feature (prepending the argv[0] dirname to the PATH)
is broken for providing a runtime prefix, and has been for
many years. Does it do anything for other cases?
For the "git" wrapper itself, as well as any commands
shipped by "git", the answer is no. Those are already in
git's exec-path, which is consulted first. For third-party
commands which you've dropped into the same directory, it
does include them. So if you do
cd /opt
tar xzf git-built-specifically-for-opt-git.tar.gz
cp third-party/git-foo /opt/git/bin/git-foo
alias git=/opt/git/bin/git
it does mean that we will find the third-party "git-foo",
even if you do not put /opt/git/bin into your $PATH. But
the flipside of this is that we will bump the precedence of
_other_ third-party tools that happen to be in the same
directory as git. For example, consider this setup:
1. Git is installed by the system in /usr/bin. There are
other system utilities in /usr/bin. E.g., a system
"vi".
2. The user installs tools they prefer in /usr/local/bin.
E.g., vim with a "vi" symlink. They set their PATH to
/usr/local/bin:/usr/bin to prefer their custom tools.
3. Running /usr/bin/git puts "/usr/bin" at the front of
their PATH. When git invokes the editor on behalf of
the user, they get the system vi, not their normal vim.
There are other variants of this, including overriding
system ruby and python (which is quite common using tools
like "rvm" and "virtualenv", which use relocatable
hierarchies and $PATH settings to get a consistent
environment).
Given that the main motivation for git placing the argv[0]
dirname into the PATH has been broken for years, that the
remaining cases are obscure and unlikely (and easily fixed
by the user just setting up their $PATH sanely), and that
the behavior is hurting real, reasonably common use cases,
it's not worth continuing to do so.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you have staged contents in your index and run "stash
apply", we may hit a conflict and put new entries into the
index. Recovering to your original state is difficult at
that point, because tools like "git reset --keep" will blow
away anything staged. We can make this safer by refusing to
apply when there are staged changes.
It's possible we could provide better tooling here, as "git
stash apply" should be writing only conflicts to the index
(so we know that any stage-0 entries are potentially
precious). But it is the odd duck; most "mergy" commands
will update the index for cleanly merged entries, and it is
not worth updating our tooling to support this use case
which is unlikely to be of interest (besides which, we would
still need to block a dirty index for "stash apply --index",
since that case _would_ be ambiguous).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the tests in t3903 wants to make sure that applying a
stash that touches only "file" can still happen even if there
are working tree changes to "other-file". To do so, it adds
"other-file" to the index (since otherwise it is an
untracked file, voiding the purpose of the test).
But as we are about to refactor the dirty-index handling,
and as this test does not actually care about having a dirty
index (only a dirty working tree), let's bump the tracking
of "other-file" into the setup phase, so we can have _just_
a dirty working tree here.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When testing the diff output of "git stash list", we look
for the stash's subject of "WIP on master: $sha1", even
though it's not relevant to the diff output. This makes the
test brittle to refactoring, as any changes to earlier tests
may impact the commit sha1.
Since we don't care about the commit subject here, we can
simply ask stash not to print it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"diff-highlight" (in contrib/) used to show byte-by-byte
differences, which meant that multi-byte characters can be chopped
in the middle. It learned to pay attention to character boundaries
(assuming the UTF-8 payload).
* jk/colors:
diff-highlight: do not split multibyte characters
* jk/test-annoyances:
t5551: make EXPENSIVE test cheaper
t5541: move run_with_cmdline_limit to test-lib.sh
t: pass GIT_TRACE through Apache
t: redirect stderr GIT_TRACE to descriptor 4
t: translate SIGINT to an exit
An earlier update to the parser that disects an address broke an
address, followed by a colon, followed by an empty string (instead
of the port number).
* tb/connect-ipv6-parse-fix:
connect.c: ignore extra colon after hostname
* va/fix-git-p4-tests:
t9814: guarantee only one source exists in git-p4 copy tests
git-p4: fix copy detection test
t9814: fix broken shell syntax in git-p4 rename test
The "git push --signed" protocol extension did not limit what the
"nonce" that is a server-chosen string can contain or how long it
can be, which was unnecessarily lax. Limit both the length and the
alphabet to a reasonably small space that can still have enough
entropy.
* jc/push-cert:
push --signed: tighten what the receiving end can ask to sign
Since 33d4221 (write_sha1_file: freshen existing objects,
2014-10-15), we update the mtime of existing objects that we
would have written out (had they not existed). For the
common case in which many objects are packed, we may update
the mtime on a single packfile repeatedly. This can result
in a noticeable performance problem if calling utime() is
expensive (e.g., because your storage is on NFS).
We can fix this by keeping a per-pack flag that lets us
freshen only once per program invocation.
An alternative would be to keep the packed_git.mtime flag up
to date as we freshen, and freshen only once every N
seconds. In practice, it's not worth the complexity. We are
racing against prune expiration times here, which inherently
must be set to accomodate reasonable program running times
(because they really care about the time between an object
being written and it becoming referenced, and the latter is
typically the last step a program takes).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing out an object file, we first check whether it
already exists and if so optimize out the write. Prior to 33d4221, we did this by calling has_sha1_file(), which will
check for packed objects followed by loose. Since that
commit, we check loose objects first.
For the common case of a repository whose objects are mostly
packed, this means we will make a lot of extra access()
system calls checking for loose objects. We should follow
the same packed-then-loose order that all of our other
lookups use.
Reported-by: Stefan Saasen <ssaasen@atlassian.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pruning and repacking a repository that has an
alternate object store configured, we may traverse a large
number of objects in the alternate. This serves no purpose,
and may be expensive to do. A longer explanation is below.
Commits d3038d2 and abcb865 taught prune and pack-objects
(respectively) to treat "recent" objects as tips for
reachability, so that we keep whole chunks of history. They
built on the object traversal in 660c889 (sha1_file: add
for_each iterators for loose and packed objects,
2014-10-15), which covers both local and alternate objects.
In both cases, covering alternate objects is unnecessary, as
both commands can only drop objects from the local
repository. In the case of prune, we traverse only the local
object directory. And in the case of repacking, while we may
or may not include local objects in our pack, we will never
reach into the alternate with "repack -d". The "-l" option
is only a question of whether we are migrating objects from
the alternate into our repository, or leaving them
untouched.
It is possible that we may drop an object that is depended
upon by another object in the alternate. For example,
imagine two repositories, A and B, with A pointing to B as
an alternate. Now imagine a commit that is in B which
references a tree that is only in A. Traversing from recent
objects in B might prevent A from dropping that tree. But
this case isn't worth covering. Repo B should take
responsibility for its own objects. It would never have had
the commit in the first place if it did not also have the
tree, and assuming it is using the same "keep recent chunks
of history" scheme, then it would itself keep the tree, as
well.
So checking the alternate objects is not worth doing, and
come with a significant performance impact. In both cases,
we skip any recent objects that have already been marked
SEEN (i.e., that we know are already reachable for prune, or
included in the pack for a repack). So there is a slight
waste of time in opening the alternate packs at all, only to
notice that we have already considered each object. But much
worse, the alternate repository may have a large number of
objects that are not reachable from the local repository at
all, and we end up adding them to the traversal.
We can fix this by considering only local unseen objects.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The old wording was somehow implying that <start> and <end> were not
regular expressions. Also, the common case is to use a plain function
name here so <funcname> makes sense (the fact that it is a regular
expression is documented in line-range-format.txt).
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tag 'gitgui-0.20.0' of http://repo.or.cz/r/git-gui:
git-gui: set version 0.20
git-gui: sv.po: Update Swedish translation (547t0f0u)
git-gui i18n: Updated Bulgarian translation (547t,0f,0u)
git-gui: Makes chooser set 'gitdir' to the resolved path
git-gui: Fixes chooser not accepting gitfiles
git-gui: reinstate support for Tcl 8.4
git-gui: fix problem with gui.maxfilesdisplayed
git-gui: fix verbose loading when git path contains spaces.
git-gui/gitk: Do not depend on Cygwin's "kill" command on Windows
git-gui: add configurable tab size to the diff view
git-gui: Make git-gui lib dir configurable at runime
git-gui i18n: Updated Bulgarian translation (520t,0f,0u)
L10n: vi.po (543t): Init translation for Vietnamese
git-gui: align the new recursive checkbox with the radiobuttons.
git-gui: Add a 'recursive' checkbox in the clone menu.
When commit fe8e3b7 refactored type_from_string to allow
input that was not NUL-terminated, it switched to using
strncmp instead of strcmp. But this means we check only the
first "len" bytes of the strings, and ignore any remaining
bytes in the object_type_string. We should make sure that it
is also "len" bytes, or else we would accept "comm" as
"commit", and so forth.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
config: use utf8_bom[] from utf.[ch] in git_parse_source()
Because the function reads one character at the time, unfortunately
we cannot use the easier skip_utf8_bom() helper, but at least we do
not have to duplicate the constant string this way.
With the recent change to ignore the UTF8 BOM at the beginning of
.gitignore files, we now have two codepaths that do such a skipping
(the other one is for reading the configuration files).
Introduce utf8_bom[] constant string and skip_utf8_bom() helper
and teach .gitignore code how to use it.
add_excludes_from_file: clarify the bom skipping logic
Even though the previous step shifts where the "entry" begins, we
still iterate over the original buf[], which may begin with the
UTF-8 BOM we are supposed to be skipping. At the end of the first
line, the code grabs the contents of it starting at "entry", so
there is nothing wrong per-se, but the logic looks really confused.
Instead, move the buf pointer and shrink its size, to truly
pretend that UTF-8 BOM did not exist in the input.
dir: allow a BOM at the beginning of exclude files
Some text editors like Notepad or LibreOffice write an UTF-8 BOM in
order to indicate that the file is Unicode text rather than whatever the
current locale would indicate.
If someone uses such an editor to edit a gitignore file, we are left
with those three bytes at the beginning of the file. If we do not skip
them, we will attempt to match a filename with the BOM as prefix, which
won't match the files the user is expecting.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Revert "merge: pass verbosity flag down to merge-recursive"
This reverts commit 2bf15a3330a26183adc8563dbeeacc11294b8a01, whose
intention was good, but the verbosity levels used in merge-recursive
turns out to be rather uneven. For example, a merge of two branches
with conflicting submodule updates used to report CONFLICT: output
with --quiet but no longer (which *is* desired), while the final
"Automatic merge failed; fix conflicts and then commit" message is
still shown even with --quiet (which *is* inconsistent).
Originally reported by Bryan Turner; it is too early to declare what
the concensus is, but it seems that we would need to level the
verbosity levels used in merge strategy backends before we can go
forward. In the meantime, we'd revert to the old behaviour until
that happens.
parse_date_basic(): let the system handle DST conversion
The function parses the input to compute the broken-down time in
"struct tm", and the GMT timezone offset. If the timezone offset
does not exist in the input, the broken-down time is turned into the
number of seconds since epoch both in the current timezone and in
GMT and the offset is computed as their difference.
However, we forgot to make sure tm.tm_isdst is set to -1 (i.e. let
the system figure out if DST is in effect in the current timezone
when turning the broken-down time to the number of seconds since
epoch); it is done so at the beginning of the function, but a call
to match_digit() in the function can lead to a call to gmtime_r() to
clobber the field.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Diagnosed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_date_basic(): return early when given a bogus timestamp
When the input does not have GMT timezone offset, the code computes
it by computing the local and GMT time for the given timestamp. But
there is no point doing so if the given timestamp is known to be a
bogus one.
"diff-highlight" (in contrib/) used to show byte-by-byte
differences, which meant that multi-byte characters can be chopped
in the middle. It learned to pay attention to character boundaries
(assuming the UTF-8 payload).
* jk/colors:
diff-highlight: do not split multibyte characters
Changed inaccurate count of "rough rules" from three to the more
generic 'a few'.
Signed-off-by: Julian Gindi <juliangindi@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>