Andrew's git - gitweb.git/log

revision: quit pruning diff more quickly when possibleJeff King Fri, 13 Oct 2017 15:27:45 +0000 (11:27 -0400)

revision: quit pruning diff more quickly when possible

When the revision traversal machinery is given a pathspec,
we must compute the parent-diff for each commit to determine
which ones are TREESAME. We set the QUICK diff flag to avoid
looking at more entries than we need; we really just care
whether there are any changes at all.

But there is one case where we want to know a bit more: if
--remove-empty is set, we care about finding cases where the
change consists only of added entries (in which case we may
prune the parent in try_to_simplify_commit()). To cover that
case, our file_add_remove() callback does not quit the diff
upon seeing an added entry; it keeps looking for other types
of entries.

But this means when --remove-empty is not set (and it is not
by default), we compute more of the diff than is necessary.
You can see this in a pathological case where a commit adds
a very large number of entries, and we limit based on a
broad pathspec. E.g.:

perl -e '
chomp(my $blob = `git hash-object -w --stdin </dev/null`);
for my $a (1..1000) {
for my $b (1..1000) {
print "100644 $blob\t$a/$b\n";
}
}
' | git update-index --index-info
git commit -qm add

git rev-list HEAD -- .

This case takes about 100ms now, but after this patch only
needs 6ms. That's not a huge improvement, but it's easy to
get and it protects us against even more pathological cases
(e.g., going from 1 million to 10 million files would take
ten times as long with the current code, but not increase at
all after this patch).

This is reported to minorly speed-up pathspec limiting in
real world repositories (like the 100-million-file Windows
repository), but probably won't make a noticeable difference
outside of pathological setups.

This patch actually covers the case without --remove-empty,
and the case where we see only deletions. See the in-code
comment for details.

Note that we have to add a new member to the diff_options
struct so that our callback can see the value of
revs->remove_empty_trees. This callback parameter could be
passed to the "add_remove" and "change" callbacks, but
there's not much point. They already receive the
diff_options struct, and doing it this way avoids having to
update the function signature of the other callbacks
(arguably the format_callback and output_prefix functions
could benefit from the same simplification).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

branch: split validate_new_branchname() into twoJunio C Hamano Fri, 13 Oct 2017 04:45:40 +0000 (13:45 +0900)

branch: split validate_new_branchname() into two

Checking if a proposed name is appropriate for a branch is strictly
a subset of checking if we want to allow creating or updating a
branch with such a name. The mysterious sounding 'attr_only'
parameter to validate_new_branchname() is used to switch the
function between these two roles.

Instead, split the function into two, and adjust the callers. A new
helper validate_branchname() only checks the name and reports if the
branch already exists.

This loses one NEEDSWORK from the branch API.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

branch: streamline "attr_only" handling in validate_new... Junio C Hamano Fri, 13 Oct 2017 03:57:02 +0000 (12:57 +0900)

branch: streamline "attr_only" handling in validate_new_branchname()

The function takes a parameter "attr_only" (which is a name that is
hard to reason about, which will be corrected soon) and skips some
checks when it is set. Reorganize the conditionals to make it more
obvious that some checks are never made when this parameter is set.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

pull: pass --signoff/--no-signoff to "git merge"W. Trevor King Thu, 12 Oct 2017 18:35:42 +0000 (11:35 -0700)

pull: pass --signoff/--no-signoff to "git merge"

merge can take --signoff, but without pull passing --signoff down, it
is inconvenient to use; allow 'pull' to take the option and pass it
through.

The order of options in merge-options.txt is mostly alphabetical by
long option since 7c85d274 (Documentation/merge-options.txt: order
options in alphabetical groups, 2009-10-22). The long-option bit
didn't make it into the commit message, but it's under the fold in
[1]. I've put --signoff between --log and --stat to preserve the
alphabetical order.

[1]: https://public-inbox.org/git/87iqe7zspn.fsf@jondo.cante.net/

Signed-off-by: W. Trevor King <wking@tremily.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sha1_name: minimize OID comparisons during disambiguationDerrick Stolee Thu, 12 Oct 2017 12:02:20 +0000 (08:02 -0400)

sha1_name: minimize OID comparisons during disambiguation

Minimize OID comparisons during disambiguation of packfile OIDs.

Teach git to use binary search with the full OID to find the object's
position (or insertion position, if not present) in the pack-index.
The object before and immediately after (or the one at the insertion
position) give the maximum common prefix. No subsequent linear search
is required.

Take care of which two to inspect, in case the object id exists in the
packfile.

If the input to find_unique_abbrev_r() is a partial prefix, then the
OID used for the binary search is padded with zeroes so the object will
not exist in the repo (with high probability) and the same logic
applies.

This commit completes a series of three changes to OID abbreviation
code, and the overall change can be seen using standard commands for
large repos. Below we report performance statistics for perf test 4211.6
from p4211-line-log.sh using three copies of the Linux repo:

| Packs | Loose | HEAD~3 | HEAD | Rel% |
|-------|--------|----------|----------|-------|
| 1 | 0 | 41.27 s | 38.93 s | -4.8% |
| 24 | 0 | 98.04 s | 91.35 s | -5.7% |
| 23 | 323952 | 117.78 s | 112.18 s | -4.8% |

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sha1_name: parse less while finding common prefixDerrick Stolee Thu, 12 Oct 2017 12:02:19 +0000 (08:02 -0400)

sha1_name: parse less while finding common prefix

Create get_hex_char_from_oid() to parse oids one hex character at a
time. This prevents unnecessary copying of hex characters in
extend_abbrev_len() when finding the length of a common prefix.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sha1_name: unroll len loop in find_unique_abbrev_r()Derrick Stolee Sun, 8 Oct 2017 18:49:40 +0000 (14:49 -0400)

sha1_name: unroll len loop in find_unique_abbrev_r()

Unroll the while loop inside find_unique_abbrev_r to avoid iterating
through all loose objects and packfiles multiple times when the short
name is longer than the predicted length.

Instead, inspect each object that collides with the estimated
abbreviation to find the longest common prefix.

The focus of this change is to refactor the existing method in a way
that clearly does not change the current behavior. In some cases, the
new method is slower than the previous method. Later changes will
correct all performance loss.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

p4211-line-log.sh: add log --online --raw --parents... Derrick Stolee Sun, 8 Oct 2017 18:49:39 +0000 (14:49 -0400)

p4211-line-log.sh: add log --online --raw --parents perf test

Add a new perf test for testing the performance of log while computing
OID abbreviations. Using --oneline --raw and --parents options maximizes
the number of OIDs to abbreviate while still spending some time computing
diffs.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation/merge-options.txt: describe -S/--gpg... W. Trevor King Thu, 12 Oct 2017 09:02:17 +0000 (02:02 -0700)

Documentation/merge-options.txt: describe -S/--gpg-sign for 'pull'

Pull has supported these since ea230d8 (pull: add the --gpg-sign
option, 2014-02-10). Insert in long-option alphabetical order
following 7c85d274 (Documentation/merge-options.txt: order options
in alphabetical groups, 2009-10-22).

Signed-off-by: W. Trevor King <wking@tremily.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

l10n: sv.po: Update Swedish translation (3245t0f0u)Peter Krefting Wed, 11 Oct 2017 10:35:11 +0000 (11:35 +0100)

l10n: sv.po: Update Swedish translation (3245t0f0u)

Signed-off-by: Peter Krefting <peter@softwolves.pp.se>

merge-ours: do not use cmd_*() as a subroutineJunio C Hamano Tue, 10 Oct 2017 03:04:48 +0000 (12:04 +0900)

merge-ours: do not use cmd_*() as a subroutine

The call to cmd_diff_index() "git merge-ours" makes has been working
by accident that the function did not call exit(3), and the caller
exited almost immediately after making a call, but it sets a bad
precedent for people to cut and paste.

For finding out if the index exactly matches the HEAD (or a given
tree-ish), there is index_differs_from() which is exactly written
for that purpose.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

describe: do not use cmd_*() as a subroutineJunio C Hamano Tue, 10 Oct 2017 03:42:58 +0000 (12:42 +0900)

describe: do not use cmd_*() as a subroutine

The cmd_foo() function is a moral equivalent of 'main' for a Git
subcommand 'git foo', and as such, it is allowed to do many things
that make it unsuitable to be called as a subroutine, including

- call exit(3) to terminate the process;

- allocate resource held and used throughout its lifetime, without
releasing it upon return/exit;

- rely on global variables being initialized at program startup,
and update them as needed, making another clean invocation of the
function impossible.

The call to cmd_diff_index() "git describe" makes has been working
by accident that the function did not call exit(3); it sets a bad
precedent for people to cut and paste.

We could invoke it via the run_command() interface, but the diff
family of commands have helper functions in diff-lib.c that are
meant to be usable as subroutines, and using the latter does not
make the resulting code all that longer. Use it.

Note that there is also an invocation of cmd_name_rev() at the end;
"git describe --contains" massages its command line arguments to be
suitable for "git name-rev" invocation and jumps to it, never to
regain control. This call is left as-is as an exception to the
rule. When we start to allow calling name-rev repeatedly as a
helper function, we would be able to remove this call as well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

checkout doc: clarify command line args for "checkout... Junio C Hamano Wed, 11 Oct 2017 02:21:03 +0000 (11:21 +0900)

checkout doc: clarify command line args for "checkout paths" mode

There are "git checkout [-p][<tree-ish>][--][<paths>...]" in the
SYNOPSIS section, and "git checkout [-p][<tree-ish>][--]<paths>..."
as the header for the section that explains the "check out paths
from index/tree-ish" mode. It is unclear if we require at least one
path, or it is entirely optional.

Actually, both are wrong. Without the "-p(atch)" option, you must
have <pathspec> (otherwise, with a commit that is a <tree-ish>, you
would be checking out that commit to build a new history on top of
it). With it, it is already clear that you are checking out paths,
it is optional. In other words, you cannot omit both.

The source of the confusion is that -p(atch) is described as if it
is just another "optional" part and its description is lumped
together with the non patch mode, even though the actual end user
experience is vastly different.

Let's split the entry into two, and describe the regular mode and
the patch mode separately. This allows us to make it clear that the
regular mode MUST be given at least one pathspec, that the patch
mode can be invoked with either '-p' or '--patch' but one of these
must be given, and that the pathspec is entirely optional in the
patch mode.

Also, revamp the explanation of "checkout paths" by removing
extraneous description at the beginning, that says "checking out
paths is not checking out a branch". Explaining what it is for and
when the user wants to use it upfront is the most direct way to help
the readers.

Noticed-by: Robert P J Day
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Git 2.15-rc1 v2.15.0-rc1Junio C Hamano Wed, 11 Oct 2017 05:54:04 +0000 (14:54 +0900)

Git 2.15-rc1

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ls/filter-process-delayed'Junio C Hamano Wed, 11 Oct 2017 05:52:24 +0000 (14:52 +0900)

Merge branch 'ls/filter-process-delayed'

Bugfixes to an already graduated series.

* ls/filter-process-delayed:
write_entry: untangle symlink and regular-file cases
write_entry: avoid reading blobs in CE_RETRY case
write_entry: fix leak when retrying delayed filter
entry.c: check if file exists after checkout
entry.c: update cache entry only for existing files

Merge branch 'ds/avoid-overflow-in-midpoint-computation'Junio C Hamano Wed, 11 Oct 2017 05:52:24 +0000 (14:52 +0900)

Merge branch 'ds/avoid-overflow-in-midpoint-computation'

Code clean-up.

* ds/avoid-overflow-in-midpoint-computation:
cleanup: fix possible overflow errors in binary search

Merge branch 'tb/complete-describe'Junio C Hamano Wed, 11 Oct 2017 05:52:23 +0000 (14:52 +0900)

Merge branch 'tb/complete-describe'

Docfix.

* tb/complete-describe:
completion: add --broken and --dirty to describe

Merge branch 'sb/test-cmp-expect-actual'Junio C Hamano Wed, 11 Oct 2017 05:52:23 +0000 (14:52 +0900)

Merge branch 'sb/test-cmp-expect-actual'

Test tweak.

* sb/test-cmp-expect-actual:
tests: fix diff order arguments in test_cmp

Merge branch 'jk/refs-df-conflict'Junio C Hamano Wed, 11 Oct 2017 05:52:23 +0000 (14:52 +0900)

Merge branch 'jk/refs-df-conflict'

An ancient bug that made Git misbehave with creation/renaming of
refs has been fixed.

* jk/refs-df-conflict:
refs_resolve_ref_unsafe: handle d/f conflicts for writes
t3308: create a real ref directory/file conflict

Merge branch 'rs/rs-mailmap'Junio C Hamano Wed, 11 Oct 2017 05:52:23 +0000 (14:52 +0900)

Merge branch 'rs/rs-mailmap'

* rs/rs-mailmap:
.mailmap: normalize name for René Scharfe

Merge branch 'rs/fsck-null-return-from-lookup'Junio C Hamano Wed, 11 Oct 2017 05:52:23 +0000 (14:52 +0900)

Merge branch 'rs/fsck-null-return-from-lookup'

Improve behaviour of "git fsck" upon finding a missing object.

* rs/fsck-null-return-from-lookup:
fsck: handle NULL return of lookup_blob() and lookup_tree()

Merge branch 'jk/sha1-loose-object-info-fix'Junio C Hamano Wed, 11 Oct 2017 05:52:22 +0000 (14:52 +0900)

Merge branch 'jk/sha1-loose-object-info-fix'

Leakfix and futureproofing.

* jk/sha1-loose-object-info-fix:
sha1_loose_object_info: handle errors from unpack_sha1_rest

Merge branch 'hn/string-list-doc'Junio C Hamano Wed, 11 Oct 2017 05:52:22 +0000 (14:52 +0900)

Merge branch 'hn/string-list-doc'

Docfix.

* hn/string-list-doc:
api-argv-array.txt: remove broken link to string-list API

Merge branch 'tb/show-trailers-in-ref-filter'Junio C Hamano Wed, 11 Oct 2017 05:52:22 +0000 (14:52 +0900)

Merge branch 'tb/show-trailers-in-ref-filter'

"git for-each-ref --format=..." learned a new format element,
%(trailers), to show only the commit log trailer part of the log
message.

* tb/show-trailers-in-ref-filter:
ref-filter.c: parse trailers arguments with %(contents) atom
ref-filter.c: use trailer_opts to format trailers
t6300: refactor %(trailers) tests
doc: use "`<literal>`"-style quoting for literal strings
doc: 'trailers' is the preferred way to format trailers
t4205: unfold across multiple lines

Merge branch 'jt/oidmap'Junio C Hamano Wed, 11 Oct 2017 05:52:22 +0000 (14:52 +0900)

Merge branch 'jt/oidmap'

Introduce a new "oidmap" API and rewrite oidset to use it.

* jt/oidmap:
oidmap: map with OID as key

Merge branch 'jr/hash-migration-plan-doc'Junio C Hamano Wed, 11 Oct 2017 05:52:22 +0000 (14:52 +0900)

Merge branch 'jr/hash-migration-plan-doc'

Lay out plans for weaning us off of SHA-1.

* jr/hash-migration-plan-doc:
technical doc: add a design doc for hash function transition

Merge branch 'master' of https://github.com/vnwildman/gitJiang Xin Wed, 11 Oct 2017 00:08:10 +0000 (08:08 +0800)

Merge branch 'master' of https://github.com/vnwildman/git

* 'master' of https://github.com/vnwildman/git:
l10n: vi.po(3245t): Updated Vietnamese translation for v2.15.0

for-each-ref: let upstream/push optionally report the... Johannes Schindelin Thu, 5 Oct 2017 12:19:09 +0000 (14:19 +0200)

for-each-ref: let upstream/push optionally report the remote name

There are times when e.g. scripts want to know not only the name of the
upstream branch on the remote repository, but also the name of the
remote.

This patch offers the new suffix :remotename for the upstream and for
the push atoms, allowing to show exactly that. Example:

$ cat .git/config
...
[remote "origin"]
url = https://where.do.we.come/from
fetch = refs/heads/*:refs/remote/origin/*
[remote "hello-world"]
url = https://hello.world/git
fetch = refs/heads/*:refs/remote/origin/*
pushURL = hello.world:git
push = refs/heads/*:refs/heads/*
[branch "master"]
remote = origin
pushRemote = hello-world
...

$ git for-each-ref \
--format='%(upstream) %(upstream:remotename) %(push:remotename)' \
refs/heads/master
refs/remotes/origin/master origin hello-world

The implementation chooses *not* to DWIM the push remote if no explicit
push remote was configured; The reason is that it is possible to DWIM this
by using

%(if)%(push:remotename)%(then)
%(push:remotename)
%(else)
%(upstream:remotename)
%(end)

while it would be impossible to "un-DWIM" the information in case the
caller is really only interested in explicit push remotes.

While `:remote` would be shorter, it would also be a bit more ambiguous,
and it would also shut the door e.g. for `:remoteref` (which would
obviously refer to the corresponding ref in the remote repository).

Note: the dashless, non-CamelCased form `:remotename` follows the
example of the `:trackshort` example.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

run-command: add hint when a hook is ignoredDamien Marié Fri, 6 Oct 2017 08:07:55 +0000 (08:07 +0000)

run-command: add hint when a hook is ignored

When an hook is present but the file is not set as executable then git will
ignore the hook.
For now this is silent which can be confusing.

This commit adds this warning to improve the situation:

hint: The 'pre-commit' hook was ignored because it's not set as executable.
hint: You can disable this warning with `git config advice.ignoredHook false`

To allow the old use-case of enabling/disabling hooks via the executable flag a
new setting is introduced: advice.ignoredHook.

Signed-off-by: Damien Marié <damien@dam.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

write_entry: untangle symlink and regular-file casesJeff King Mon, 9 Oct 2017 17:50:05 +0000 (13:50 -0400)

write_entry: untangle symlink and regular-file cases

The write_entry() function switches on the mode of the entry
we're going to write out. The cases for S_IFLNK and S_IFREG
are lumped together. In earlier versions of the code, this
made some sense. They have a shared preamble (which reads
the blob content), a short type-specific body, and a shared
conclusion (which writes out the file contents; always for
S_IFREG and only sometimes for S_IFLNK).

But over time this has grown to make less sense. The preamble
now has conditional bits for each type, and the S_IFREG body
has grown a lot more complicated. It's hard to follow the
logic of which code is running for which mode.

Let's give each mode its own case arm. We will still share
the conclusion code, which means we now jump to it with a
goto. Ideally we'd pull that shared code into its own
function, but it touches so much internal state in the
write_entry() function that the end result is actually
harder to follow than the goto.

While we're here, we'll touch up a few bits of whitespace to
make the beginning and endings of the cases easier to read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

write_entry: avoid reading blobs in CE_RETRY caseJeff King Mon, 9 Oct 2017 17:48:52 +0000 (13:48 -0400)

write_entry: avoid reading blobs in CE_RETRY case

When retrying a delayed filter-process request, we don't
need to send the blob to the filter a second time. However,
we read it unconditionally into a buffer, only to later
throw away that buffer. We can make this more efficient by
skipping the read in the first place when it isn't
necessary.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

write_entry: fix leak when retrying delayed filterJeff King Mon, 9 Oct 2017 17:48:24 +0000 (13:48 -0400)

write_entry: fix leak when retrying delayed filter

When write_entry() retries a delayed filter request, we
don't need to send the blob content to the filter again, and
set the pointer to NULL. But doing so means we leak the
contents we read earlier from read_blob_entry(). Let's make
sure to free it before dropping the pointer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cleanup: fix possible overflow errors in binary searchDerrick Stolee Sun, 8 Oct 2017 18:29:37 +0000 (14:29 -0400)

cleanup: fix possible overflow errors in binary search

A common mistake when writing binary search is to allow possible
integer overflow by using the simple average:

mid = (min + max) / 2;

Instead, use the overflow-safe version:

mid = min + (max - min) / 2;

This translation is safe since the operation occurs inside a loop
conditioned on "min < max". The included changes were found using
the following git grep:

git grep '/ *2;' '*.c'

Making this cleanup will prevent future review friction when a new
binary search is contructed based on existing code.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'js/rebase-i-final'Junio C Hamano Mon, 9 Oct 2017 09:59:16 +0000 (18:59 +0900)

Merge branch 'js/rebase-i-final'

* js/rebase-i-final:
i18n: add a missing space in message

i18n: add a missing space in messageJean-Noel Avila Sun, 8 Oct 2017 12:18:39 +0000 (14:18 +0200)

i18n: add a missing space in message

The message spans over 2 lines but the C conconcatenation does not add
the needed space between the two lines.

Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

l10n: vi.po(3245t): Updated Vietnamese translation... Tran Ngoc Quan Mon, 9 Oct 2017 08:13:05 +0000 (15:13 +0700)

l10n: vi.po(3245t): Updated Vietnamese translation for v2.15.0

Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>

l10n: es.po: Update translation v2.15.0 round 1Christopher Díaz Sun, 8 Oct 2017 16:30:11 +0000 (11:30 -0500)

l10n: es.po: Update translation v2.15.0 round 1

Signed-off-by: Christopher Díaz <christopher.diaz.riv@gmail.com>

Merge branch 'maint' of git://github.com/git-l10n/git-poJiang Xin Sun, 8 Oct 2017 07:21:22 +0000 (15:21 +0800)

Merge branch 'maint' of git://github.com/git-l10n/git-po

* 'maint' of git://github.com/git-l10n/git-po:
l10n: es.po: spanish added to TEAMS
l10n: es.po: initial Spanish version git 2.14.0

l10n: git.pot: v2.15.0 round 1 (68 new, 36 removed)Jiang Xin Sun, 8 Oct 2017 07:12:45 +0000 (15:12 +0800)

l10n: git.pot: v2.15.0 round 1 (68 new, 36 removed)

Generate po/git.pot from commit d35688db19 ("Prepare for -rc1",
2017-10-07) for git v2.15.0 l10n round 1.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>

fetch: add test to make sure we stay backwards compatibleHeiko Voigt Fri, 6 Oct 2017 22:30:47 +0000 (00:30 +0200)

fetch: add test to make sure we stay backwards compatible

The current implementation of submodules supports on-demand fetch if
there is no .gitmodules entry for a submodule. Let's add a test to
document this behavior.

Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

submodule: port submodule subcommand 'status' from... Prathamesh Chavan Fri, 6 Oct 2017 13:24:15 +0000 (18:54 +0530)

submodule: port submodule subcommand 'status' from shell to C

This aims to make git-submodule 'status' a built-in. Hence, the function
cmd_status() is ported from shell to C. This is done by introducing
four functions: module_status(), submodule_status_cb(),
submodule_status() and print_status().

The function module_status() acts as the front-end of the subcommand.
It parses subcommand's options and then calls the function
module_list_compute() for computing the list of submodules. Then
this functions calls for_each_listed_submodule() looping through the
list obtained.

Then for_each_listed_submodule() calls submodule_status_cb() for each of
the submodule in its list. The function submodule_status_cb() calls
submodule_status() after passing appropriate arguments to the funciton.
Function submodule_status() is responsible for generating the status
each submodule it is called for, and then calls print_status().

Finally, the function print_status() handles the printing of submodule's
status.

Function set_name_rev() is also ported from git-submodule to the
submodule--helper builtin function compute_rev_name(), which now
generates the value of the revision name as required.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

submodule--helper: introduce for_each_listed_submodule()Prathamesh Chavan Fri, 6 Oct 2017 13:24:14 +0000 (18:54 +0530)

submodule--helper: introduce for_each_listed_submodule()

Introduce function for_each_listed_submodule() and replace a loop
in module_init() with a call to it.

The new function will also be used in other parts of the
system in later patches.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Prepare for -rc1Junio C Hamano Sat, 7 Oct 2017 07:29:03 +0000 (16:29 +0900)

Prepare for -rc1

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'tb/ref-filter-empty-modifier'Junio C Hamano Sat, 7 Oct 2017 07:27:56 +0000 (16:27 +0900)

Merge branch 'tb/ref-filter-empty-modifier'

In the "--format=..." option of the "git for-each-ref" command (and
its friends, i.e. the listing mode of "git branch/tag"), "%(atom:)"
(e.g. "%(refname:)", "%(body:)" used to error out. Instead, treat
them as if the colon and an empty string that follows it were not
there.

* tb/ref-filter-empty-modifier:
ref-filter.c: pass empty-string as NULL to atom parsers

Merge branch 'ks/verify-filename-non-option-error-messa... Junio C Hamano Sat, 7 Oct 2017 07:27:55 +0000 (16:27 +0900)

Merge branch 'ks/verify-filename-non-option-error-message-tweak'

Error message tweak.

* ks/verify-filename-non-option-error-message-tweak:
setup: update error message to be more meaningful

Merge branch 'ks/branch-tweak-error-message-for-extra... Junio C Hamano Sat, 7 Oct 2017 07:27:55 +0000 (16:27 +0900)

Merge branch 'ks/branch-tweak-error-message-for-extra-args'

Error message tweak.

* ks/branch-tweak-error-message-for-extra-args:
branch: change the error messages to be more meaningful

Merge branch 'jk/ui-color-always-to-auto'Junio C Hamano Sat, 7 Oct 2017 07:27:55 +0000 (16:27 +0900)

Merge branch 'jk/ui-color-always-to-auto'

Fix regression of "git add -p" for users with "color.ui = always"
in their configuration, by merging the topic below and adjusting it
for the 'master' front.

* jk/ui-color-always-to-auto:
t7301: use test_terminal to check color
t4015: use --color with --color-moved
color: make "always" the same as "auto" in config
provide --color option for all ref-filter users
t3205: use --color instead of color.branch=always
t3203: drop "always" color test
t6006: drop "always" color config tests
t7502: use diff.noprefix for --verbose test
t7508: use test_terminal for color output
t3701: use test-terminal to collect color output
t4015: prefer --color to -c color.diff=always
test-terminal: set TERM=vt100

Merge branch 'ma/builtin-unleak'Junio C Hamano Sat, 7 Oct 2017 07:27:55 +0000 (16:27 +0900)

Merge branch 'ma/builtin-unleak'

Many variables that points at a region of memory that will live
throughout the life of the program have been marked with UNLEAK
marker to help the leak checkers concentrate on real leaks..

* ma/builtin-unleak:
builtin/: add UNLEAKs

Merge branch 'rb/compat-poll-fix'Junio C Hamano Sat, 7 Oct 2017 07:27:55 +0000 (16:27 +0900)

Merge branch 'rb/compat-poll-fix'

Backports a moral equivalent of 2015 fix to the poll emulation from
the upstream gnulib to fix occasional breakages on HPE NonStop.

* rb/compat-poll-fix:
poll.c: always set revents, even if to zero

Merge branch 'tg/memfixes'Junio C Hamano Sat, 7 Oct 2017 07:27:54 +0000 (16:27 +0900)

Merge branch 'tg/memfixes'

Fixes for a handful memory access issues identified by valgrind.

* tg/memfixes:
sub-process: use child_process.args instead of child_process.argv
http-push: fix construction of hex value from path
path.c: fix uninitialized memory access

Merge branch 'sb/branch-avoid-repeated-strbuf-release'Junio C Hamano Sat, 7 Oct 2017 07:27:54 +0000 (16:27 +0900)

Merge branch 'sb/branch-avoid-repeated-strbuf-release'

* sb/branch-avoid-repeated-strbuf-release:
branch: reset instead of release a strbuf

Merge branch 'rs/qsort-s'Junio C Hamano Sat, 7 Oct 2017 07:27:53 +0000 (16:27 +0900)

Merge branch 'rs/qsort-s'

* rs/qsort-s:
test-stringlist: avoid buffer underrun when sorting nothing

Merge branch 'jn/strbuf-doc-re-reuse'Junio C Hamano Sat, 7 Oct 2017 07:27:53 +0000 (16:27 +0900)

Merge branch 'jn/strbuf-doc-re-reuse'

* jn/strbuf-doc-re-reuse:
strbuf doc: reuse after strbuf_release is fine

Merge branch 'tb/delimit-pretty-trailers-args-with... Junio C Hamano Sat, 7 Oct 2017 07:27:52 +0000 (16:27 +0900)

Merge branch 'tb/delimit-pretty-trailers-args-with-comma'

The feature that allows --pretty='%(trailers)' to take modifiers
like "fold" and "only" used to separate these modifiers with a
comma, i.e. "%(trailers:fold:only)", but we changed our mind and
use a comma, i.e. "%(trailers:fold,only)". Fast track this change
before this new feature becomes part of any official release.

* tb/delimit-pretty-trailers-args-with-comma:
pretty.c: delimit "%(trailers)" arguments with ","

completion: add --broken and --dirty to describeThomas Braun Fri, 6 Oct 2017 18:02:47 +0000 (20:02 +0200)

completion: add --broken and --dirty to describe

When the flags for broken and dirty were implemented in
b0176ce6b5 (builtin/describe: introduce --broken flag, 2017-03-21)
and 9f67d2e827 (Teach "git describe" --dirty option, 2009-10-21)
the completion was not updated, although these flags are useful
completions. Add them.

Signed-off-by: Thomas Braun <thomas.braun@virtuell-zuhause.de>
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tests: fix diff order arguments in test_cmpStefan Beller Fri, 6 Oct 2017 19:00:06 +0000 (12:00 -0700)

tests: fix diff order arguments in test_cmp

Fix the argument order for test_cmp. When given the expected
result first the diff shows the actual output with '+' and the
expectation with '-', which is the convention for our tests.

Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs_resolve_ref_unsafe: handle d/f conflicts for writesJeff King Fri, 6 Oct 2017 14:42:17 +0000 (10:42 -0400)

refs_resolve_ref_unsafe: handle d/f conflicts for writes

If our call to refs_read_raw_ref() fails, we check errno to
see if the ref is simply missing, or if we encountered a
more serious error. If it's just missing, then in "write"
mode (i.e., when RESOLVE_REFS_READING is not set), this is
perfectly fine.

However, checking for ENOENT isn't sufficient to catch all
missing-ref cases. In the filesystem backend, we may also
see EISDIR when we try to resolve "a" and "a/b" exists.
Likewise, we may see ENOTDIR if we try to resolve "a/b" and
"a" exists. In both of those cases, we know that our
resolved ref doesn't exist, but we return an error (rather
than reporting the refname and returning a null sha1).

This has been broken for a long time, but nobody really
noticed because the next step after resolving without the
READING flag is usually to lock the ref and write it. But in
both of those cases, the write will fail with the same
errno due to the directory/file conflict.

There are two cases where we can notice this, though:

1. If we try to write "a" and there's a leftover directory
already at "a", even though there is no ref "a/b". The
actual write is smart enough to move the empty "a" out
of the way.

This is reasonably rare, if only because the writing
code has to do an independent resolution before trying
its write (because the actual update_ref() code handles
this case fine). The notes-merge code does this, and
before the fix in the prior commit t3308 erroneously
expected this case to fail.

2. When resolving symbolic refs, we typically do not use
the READING flag because we want to resolve even
symrefs that point to unborn refs. Even if those unborn
refs could not actually be written because of d/f
conflicts with existing refs.

You can see this by asking "git symbolic-ref" to report
the target of a symref pointing past a d/f conflict.

We can fix the problem by recognizing the other "missing"
errnos and treating them like ENOENT. This should be safe to
do even for callers who are then going to actually write the
ref, because the actual writing process will fail if the d/f
conflict is a real one (and t1404 checks these cases).

Arguably this should be the responsibility of the
files-backend to normalize all "missing ref" errors into
ENOENT (since something like EISDIR may not be meaningful at
all to a database backend). However other callers of
refs_read_raw_ref() may actually care about the distinction;
putting this into resolve_ref() is the minimal fix for now.

The new tests in t1401 use git-symbolic-ref, which is the
most direct way to check the resolution by itself.
Interestingly we actually had a test that setup this case
already, but we only used it to verify that the funny state
could be overwritten, not that it could be resolved.

We also add a new test in t3200, as "branch -m" was the
original motivation for looking into this. What happens is
this:

0. HEAD is pointing to branch "a"

1. The user asks to rename "a" to "a/b".

2. We create "a/b" and delete "a".

3. We then try to update any worktree HEADs that point to
the renamed ref (including the main repo HEAD). To do
that, we have to resolve each HEAD. But now our HEAD is
pointing at "a", and we get EISDIR due to the loose
"a/b". As a result, we think there is no HEAD, and we
do not update it. It now points to the bogus "a".

Interestingly this case used to work, but only accidentally.
Before 31824d180d (branch: fix branch renaming not updating
HEADs correctly, 2017-08-24), we'd update any HEAD which we
couldn't resolve. That was wrong, but it papered over the
fact that we were incorrectly failing to resolve HEAD.

So while the bug demonstrated by the git-symbolic-ref is
quite old, the regression to "branch -m" is recent.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t3308: create a real ref directory/file conflictJeff King Fri, 6 Oct 2017 14:38:30 +0000 (10:38 -0400)

t3308: create a real ref directory/file conflict

A test in t3308 wants to make sure that we don't
accidentally merge into "refs/notes/dir" when it exists as a
directory, so it does:

mkdir .git/refs/notes/dir
git -c core.notesRef=refs/notes/dir merge ...

and expects the second command to fail. But that
understimates the refs code, which is smart enough to remove
useless directories in the refs hierarchy. The test
succeeded only because of a bug which prevented resolving
refs/notes/dir for writing, even though an actual ref update
would succeed.

In preparation for fixing that bug, let's switch to creating
a real ref in refs/notes/dir, which is a more realistic
situation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

read_cache: roll back lock in `update_index_if_able()`Martin Ågren Fri, 6 Oct 2017 20:12:14 +0000 (22:12 +0200)

read_cache: roll back lock in `update_index_if_able()`

`update_index_if_able()` used to always commit the lock or roll it back.
Commit 03b866477 (read-cache: new API write_locked_index instead of
write_index/write_cache, 2014-06-13) stopped rolling it back in case a
write was not even attempted. This change in behavior is not motivated
in the commit message and appears to be accidental: the `else`-path was
removed, although that changed the behavior in case the `if` shortcuts.

Reintroduce the rollback and document this behavior. While at it, move
the documentation on this function from the function definition to the
function declaration in cache.h.

If `write_locked_index(..., COMMIT_LOCK)` fails, it will roll back the
lock for us (see the previous commit).

Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

read-cache: leave lock in right state in `write_locked_... Martin Ågren Fri, 6 Oct 2017 20:12:13 +0000 (22:12 +0200)

read-cache: leave lock in right state in `write_locked_index()`

If the original version of `write_locked_index()` returned with an
error, it didn't roll back the lockfile unless the error occured at the
very end, during closing/committing. See commit 03b866477 (read-cache:
new API write_locked_index instead of write_index/write_cache,
2014-06-13).

In commit 9f41c7a6b (read-cache: close index.lock in do_write_index,
2017-04-26), we learned to close the lock slightly earlier in the
callstack. That was mostly a side-effect of lockfiles being implemented
using temporary files, but didn't cause any real harm.

Recently, commit 076aa2cbd (tempfile: auto-allocate tempfiles on heap,
2017-09-05) introduced a subtle bug. If the temporary file is deleted
(i.e., the lockfile is rolled back), the tempfile-pointer in the `struct
lock_file` will be left dangling. Thus, an attempt to reuse the
lockfile, or even just to roll it back, will induce undefined behavior
-- most likely a crash.

Besides not crashing, we clearly want to make things consistent. The
guarantees which the lockfile-machinery itself provides is A) if we ask
to commit and it fails, roll back, and B) if we ask to close and it
fails, do _not_ roll back. Let's do the same for consistency.

Do not delete the temporary file in `do_write_index()`. One of its
callers, `write_locked_index()` will thereby avoid rolling back the
lock. The other caller, `write_shared_index()`, will delete its
temporary file anyway. Both of these callers will avoid undefined
behavior (crashing).

Teach `write_locked_index(..., COMMIT_LOCK)` to roll back the lock
before returning. If we have already succeeded and committed, it will be
a noop. Simplify the existing callers where we now have a superfluous
call to `rollback_lockfile()`. That should keep future readers from
wondering why the callers are inconsistent.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

read-cache: drop explicit `CLOSE_LOCK`-flagMartin Ågren Fri, 6 Oct 2017 20:12:12 +0000 (22:12 +0200)

read-cache: drop explicit `CLOSE_LOCK`-flag

`write_locked_index()` takes two flags: `COMMIT_LOCK` and `CLOSE_LOCK`.
At most one is allowed. But it is also possible to use no flag, i.e.,
`0`. But when `write_locked_index()` calls `do_write_index()`, the
temporary file, a.k.a. the lockfile, will be closed. So passing `0` is
effectively the same as `CLOSE_LOCK`, which seems like a bug.

We might feel tempted to restructure the code in order to close the file
later, or conditionally. It also feels a bit unfortunate that we simply
"happen" to close the lock by way of an implementation detail of
lockfiles. But note that we need to close the temporary file before
`stat`-ing it, at least on Windows. See 9f41c7a6b (read-cache: close
index.lock in do_write_index, 2017-04-26).

Drop `CLOSE_LOCK` and make it explicit that `write_locked_index()`
always closes the lock. Whether it is also committed is governed by the
remaining flag, `COMMIT_LOCK`.

This means we neither have nor suggest that we have a mode to write the
index and leave the file open. Whatever extra contents we might
eventually want to write, we should probably write it from within
`write_locked_index()` itself anyway.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

api-argv-array.txt: remove broken link to string-list APITodd Zullinger Fri, 6 Oct 2017 03:14:56 +0000 (23:14 -0400)

api-argv-array.txt: remove broken link to string-list API

In 4f665f2cf3 (string-list.h: move documentation from Documentation/api/
into header, 2017-09-26) the string-list API documentation was moved to
string-list.h. The argv-array API documentation may follow a similar
course in the future. Until then, prevent the broken link from making
it to the end-user documentation.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

entry.c: check if file exists after checkoutLars Schneider Thu, 5 Oct 2017 10:44:07 +0000 (12:44 +0200)

entry.c: check if file exists after checkout

If we are checking out a file and somebody else racily deletes our file,
then we would write garbage to the cache entry. Fix that by checking
the result of the lstat() call on that file. Print an error to the user
if the file does not exist.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bisect--helper: `is_expected_rev` & `check_expected_rev... Pranit Bauva Fri, 29 Sep 2017 06:49:40 +0000 (06:49 +0000)

bisect--helper: `is_expected_rev` & `check_expected_revs` shell function in C

Reimplement `is_expected_rev` & `check_expected_revs` shell function in
C and add a `--check-expected-revs` subcommand to `git bisect--helper` to
call it from git-bisect.sh .

Using `--check-expected-revs` subcommand is a temporary measure to port
shell functions to C so as to use the existing test suite. As more
functions are ported, this subcommand would be retired but its
implementation will be called by some other method.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t6030: explicitly test for bisection cleanupPranit Bauva Fri, 29 Sep 2017 06:49:39 +0000 (06:49 +0000)

t6030: explicitly test for bisection cleanup

Add test to explicitly check that 'git bisect reset' is working as
expected. This is already covered implicitly by the test suite.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bisect--helper: `bisect_clean_state` shell function... Pranit Bauva Fri, 29 Sep 2017 06:49:39 +0000 (06:49 +0000)

bisect--helper: `bisect_clean_state` shell function in C

Reimplement `bisect_clean_state` shell function in C and add a
`bisect-clean-state` subcommand to `git bisect--helper` to call it from
git-bisect.sh .

Using `--bisect-clean-state` subcommand is a measure to port shell
function to C so as to use the existing test suite. As more functions
are ported, this subcommand will be retired but its implementation will
be called by bisect_reset() and bisect_start().

Also introduce a function `mark_for_removal` to store the refs which
need to be removed while iterating through the refs.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bisect--helper: `write_terms` shell function in CPranit Bauva Fri, 29 Sep 2017 06:49:39 +0000 (06:49 +0000)

bisect--helper: `write_terms` shell function in C

Reimplement the `write_terms` shell function in C and add a `write-terms`
subcommand to `git bisect--helper` to call it from git-bisect.sh . Also
remove the subcommand `--check-term-format` as it can now be called from
inside the function write_terms() C implementation.

Also `|| exit` is added when calling write-terms subcommand from
git-bisect.sh so as to exit whenever there is an error.

Using `--write-terms` subcommand is a temporary measure to port shell
function to C so as to use the existing test suite. As more functions
are ported, this subcommand will be retired and its implementation will
be called by some other method.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bisect--helper: rewrite `check_term_format` shell funct... Pranit Bauva Fri, 29 Sep 2017 06:49:39 +0000 (06:49 +0000)

bisect--helper: rewrite `check_term_format` shell function in C

Reimplement the `check_term_format` shell function in C and add
a `--check-term-format` subcommand to `git bisect--helper` to call it
from git-bisect.sh

Using `--check-term-format` subcommand is a temporary measure to port
shell function to C so as to use the existing test suite. As more
functions are ported, this subcommand will be retired and its
implementation will be called by some other method/subcommand. For
eg. In conversion of write_terms() of git-bisect.sh, the subcommand will
be removed and instead check_term_format() will be called in its C
implementation while a new subcommand will be introduced for write_terms().

Helped-by: Johannes Schindelein <Johannes.Schindelein@gmx.de>
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bisect--helper: use OPT_CMDMODE instead of OPT_BOOLPranit Bauva Fri, 29 Sep 2017 06:49:39 +0000 (06:49 +0000)

bisect--helper: use OPT_CMDMODE instead of OPT_BOOL

`--next-all` is meant to be used as a subcommand to support multiple
"operation mode" though the current implementation does not contain any
other subcommand along side with `--next-all` but further commits will
include some more subcommands.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sha1_loose_object_info: handle errors from unpack_sha1_restJeff King Thu, 5 Oct 2017 05:59:52 +0000 (01:59 -0400)

sha1_loose_object_info: handle errors from unpack_sha1_rest

When a caller of sha1_object_info_extended() sets the
"contentp" field in object_info, we call unpack_sha1_rest()
but do not check whether it signaled an error.

This causes two problems:

1. We pass back NULL to the caller via the contentp field,
but the function returns "0" for success. A caller
might reasonably expect after a successful return that
it can access contentp without a NULL check and
segfault.

As it happens, this is impossible to trigger in the
current code. There is exactly one caller which uses
contentp, read_object(). And the only thing it does
after a successful call is to return the content
pointer to its caller, using NULL as a sentinel for
errors. So in effect it converts the success code from
sha1_object_info_extended() back into an error!

But this is still worth addressing avoid problems for
future users of "contentp".

2. Callers of unpack_sha1_rest() are expected to close the
zlib stream themselves on error. Which means that we're
leaking the stream.

The problem in (1) comes from from c84a1f3ed4 (sha1_file:
refactor read_object, 2017-06-21), which added the contentp
field. Before that, we called unpack_sha1_rest() via
unpack_sha1_file(), which directly used the NULL to signal
an error.

But note that the leak in (2) is actually older than that.
The original unpack_sha1_file() directly returned the result
of unpack_sha1_rest() to its caller, when it should have
been closing the zlib stream itself on error.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

.mailmap: normalize name for René ScharfeRené Scharfe Thu, 5 Oct 2017 19:41:31 +0000 (21:41 +0200)

.mailmap: normalize name for René Scharfe

Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reported-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fsck: handle NULL return of lookup_blob() and lookup_tree()René Scharfe Thu, 5 Oct 2017 19:41:26 +0000 (21:41 +0200)

fsck: handle NULL return of lookup_blob() and lookup_tree()

lookup_blob() and lookup_tree() can return NULL if they find an object
of an unexpected type. Accessing the object member is undefined in that
case. Cast the result to a struct object pointer instead; we can do
that because object is the first member of all object types. This trick
is already used in other places in the code.

An error message is already shown by object_as_type(), which is called
by the lookup functions. The walk callback functions are expected to
handle NULL object pointers passed to them, but put_object_name() needs
a valid object, so avoid calling it without one.

Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cache.h: document `write_locked_index()`Martin Ågren Thu, 5 Oct 2017 20:32:11 +0000 (22:32 +0200)

cache.h: document `write_locked_index()`

The next patches will tweak the behavior of this function. Document it
in order to establish a basis for those patches.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

apply: remove `newfd` from `struct apply_state`Martin Ågren Thu, 5 Oct 2017 20:32:10 +0000 (22:32 +0200)

apply: remove `newfd` from `struct apply_state`

Similar to a previous patch, we do not need to use `newfd` to signal
that we have a lockfile to clean up. We can just unconditionally call
`rollback_lock_file`. If we do not hold the lock, it will be a no-op.

Where we check `newfd` to decide whether we need to take the lock, we
can instead use `is_lock_file_locked()`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

apply: move lockfile into `apply_state`Martin Ågren Thu, 5 Oct 2017 20:32:09 +0000 (22:32 +0200)

apply: move lockfile into `apply_state`

We have two users of `struct apply_state` and the related functionality
in apply.c. Each user sets up its `apply_state` by handing over a
pointer to its static `lock_file`. (Before 076aa2cbd (tempfile:
auto-allocate tempfiles on heap, 2017-09-05), we could never free
lockfiles, so making them static was a reasonable approach.)

Other than that, they never directly access their `lock_file`s, which
are instead handled by the functionality in apply.c.

To make life easier for the caller and to make it less tempting for a
future caller to mess with the lock, make apply.c fully responsible for
setting up the `lock_file`. As mentioned above, it is now safe to free a
`lock_file`, so we can make the `struct apply_state` contain an actual
`struct lock_file` instead of a pointer to one.

The user in builtin/apply.c is rather simple. For builtin/am.c, we might
worry that the lock state is actually meant to be inherited across
calls. But the lock is only taken as `apply_all_patches()` executes, and
code inspection shows that it will always be released.

Alternatively, we can observe that the lock itself is never queried
directly. When we decide whether we should lock, we check a related
variable `newfd`. That variable is not inherited, so from the point of
view of apply.c, the state machine really is reset with each call to
`init_apply_state()`. (It would be a bug if `newfd` and the lock status
were not in sync. The duplication of information in `newfd` and the lock
will be addressed in the next patch.)

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cache-tree: simplify locking logicMartin Ågren Thu, 5 Oct 2017 20:32:08 +0000 (22:32 +0200)

cache-tree: simplify locking logic

After we have taken the lock using `LOCK_DIE_ON_ERROR`, we know that
`newfd` is non-negative. So when we check for exactly that property
before calling `write_locked_index()`, the outcome is guaranteed.

If we write and commit successfully, we set `newfd = -1`, so that we can
later avoid calling `rollback_lock_file` on an already-committed lock.
But we might just as well unconditionally call `rollback_lock_file()` --
it will be a no-op if we have already committed.

All in all, we use `newfd` as a bool and the only benefit we get from it
is that we can avoid calling a no-op. Remove `newfd` so that we have one
variable less to reason about.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

checkout-index: simplify locking logicMartin Ågren Thu, 5 Oct 2017 20:32:07 +0000 (22:32 +0200)

checkout-index: simplify locking logic

`newfd` starts out negative. If we then take the lock, `newfd` will
become non-negative. We later check for exactly that property before
calling `write_locked_index()`. That is, we are simply using `newfd` as
a boolean to keep track of whether we took the lock or not. (We always
use `newfd` and `lock_file` together, so they really are mirroring each
other.)

Drop `newfd` and check with `is_lock_file_locked()` instead. While at
it, move the `static struct lock_file` into `cmd_checkout_index()` and
make it non-static. It is only used in this function, and after
076aa2cbd (tempfile: auto-allocate tempfiles on heap, 2017-09-05), we
can have lockfiles on the stack.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tempfile: fix documentation on `delete_tempfile()`Martin Ågren Thu, 5 Oct 2017 20:32:06 +0000 (22:32 +0200)

tempfile: fix documentation on `delete_tempfile()`

The function has always been documented as returning 0 or -1. It is in
fact `void`. Correct that. As part of the rearrangements we lose the
mention that `delete_tempfile()` might set `errno`. Because there is
no return value, the user can't really know whether it did anyway.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

lockfile: fix documentation on `close_lock_file_gently()`Martin Ågren Thu, 5 Oct 2017 20:32:05 +0000 (22:32 +0200)

lockfile: fix documentation on `close_lock_file_gently()`

Commit 83a3069a3 (lockfile: do not rollback lock on failed close,
2017-09-05) forgot to update the documentation by the function definition
to reflect that the lock is not rolled back in case closing fails.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

treewide: prefer lockfiles on the stackMartin Ågren Thu, 5 Oct 2017 20:32:04 +0000 (22:32 +0200)

treewide: prefer lockfiles on the stack

There is no longer any need to allocate and leak a `struct lock_file`.
The previous patch addressed an instance where we needed a minor tweak
alongside the trivial changes.

Deal with the remaining instances where we allocate and leak a struct
within a single function. Change them to have the `struct lock_file` on
the stack instead.

These instances were identified by running `git grep "^\s*struct
lock_file\s*\*"`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sha1_file: do not leak `lock_file`Martin Ågren Thu, 5 Oct 2017 20:32:03 +0000 (22:32 +0200)

sha1_file: do not leak `lock_file`

There is no longer any need to allocate and leak a `struct lock_file`.
Initialize it on the stack instead.

Before this patch, we set `lock = NULL` to signal that we have already
rolled back, and that we should not do any more work. We need to take
another approach now that we cannot assign NULL. We could, e.g., use
`is_lock_file_locked()`. But we already have another variable that we
could use instead, `found`. Its scope is only too small.

Bump `found` to the scope of the whole function and rearrange the "roll
back or write?"-checks to a straightforward if-else on `found`. This
also future-proves the code by making it obvious that we intend to take
exactly one of these paths.

Improved-by: Jeff King <peff@peff.net>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

entry.c: update cache entry only for existing filesLars Schneider Thu, 5 Oct 2017 10:44:06 +0000 (12:44 +0200)

entry.c: update cache entry only for existing files

In 2841e8f ("convert: add "status=delayed" to filter process protocol",
2017-06-30) we taught the filter process protocol to delay responses.

That means an external filter might answer in the first write_entry()
call on a file that requires filtering "I got your request, but I
can't answer right now. Ask again later!". As Git got no answer, we do
not write anything to the filesystem. Consequently, the lstat() call in
the finish block of the function writes garbage to the cache entry.
The garbage is eventually overwritten when the filter answers with
the final file content in a subsequent write_entry() call.

Fix the brief time window of garbage in the cache entry by adding a
special finish block that does nothing for delayed responses. The cache
entry is written properly in a subsequent write_entry() call where
the filter responds with the final file content.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Git 2.15-rc0 v2.15.0-rc0Junio C Hamano Thu, 5 Oct 2017 04:49:07 +0000 (13:49 +0900)

Git 2.15-rc0

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ar/request-pull-phrasofix'Junio C Hamano Thu, 5 Oct 2017 04:48:21 +0000 (13:48 +0900)

Merge branch 'ar/request-pull-phrasofix'

Spell the name of our system as "Git" in the output from
request-pull script.

* ar/request-pull-phrasofix:
request-pull: capitalise "Git" to make it a proper noun

Merge branch 'rs/run-command-use-alloc-array'Junio C Hamano Thu, 5 Oct 2017 04:48:20 +0000 (13:48 +0900)

Merge branch 'rs/run-command-use-alloc-array'

Code clean-up.

* rs/run-command-use-alloc-array:
run-command: use ALLOC_ARRAY

Merge branch 'sb/git-clang-format'Junio C Hamano Thu, 5 Oct 2017 04:48:20 +0000 (13:48 +0900)

Merge branch 'sb/git-clang-format'

Add comment to clarify that the style file is meant to be used with
clang-5 and the rules are still work in progress.

* sb/git-clang-format:
clang-format: add a comment about the meaning/status of the

Merge branch 'rs/use-free-and-null'Junio C Hamano Thu, 5 Oct 2017 04:48:20 +0000 (13:48 +0900)

Merge branch 'rs/use-free-and-null'

Code clean-up.

* rs/use-free-and-null:
repository: use FREE_AND_NULL

Merge branch 'rs/tag-null-pointer-arith-fix'Junio C Hamano Thu, 5 Oct 2017 04:48:19 +0000 (13:48 +0900)

Merge branch 'rs/tag-null-pointer-arith-fix'

Code clean-up.

* rs/tag-null-pointer-arith-fix:
tag: avoid NULL pointer arithmetic

Merge branch 'rs/cocci-de-paren-call-params'Junio C Hamano Thu, 5 Oct 2017 04:48:19 +0000 (13:48 +0900)

Merge branch 'rs/cocci-de-paren-call-params'

Code clean-up.

* rs/cocci-de-paren-call-params:
coccinelle: remove parentheses that become unnecessary

Merge branch 'rs/cleanup-strbuf-users'Junio C Hamano Thu, 5 Oct 2017 04:48:19 +0000 (13:48 +0900)

Merge branch 'rs/cleanup-strbuf-users'

Code clean-up.

* rs/cleanup-strbuf-users:
graph: use strbuf_addchars() to add spaces
use strbuf_addstr() for adding strings to strbufs
path: use strbuf_add_real_path()

Merge branch 'rs/resolve-ref-optional-result'Junio C Hamano Thu, 5 Oct 2017 04:48:19 +0000 (13:48 +0900)

Merge branch 'rs/resolve-ref-optional-result'

Code clean-up.

* rs/resolve-ref-optional-result:
refs: pass NULL to resolve_refdup() if hash is not needed
refs: pass NULL to refs_resolve_refdup() if hash is not needed

Merge branch 'er/fast-import-dump-refs-on-checkpoint'Junio C Hamano Thu, 5 Oct 2017 04:48:19 +0000 (13:48 +0900)

Merge branch 'er/fast-import-dump-refs-on-checkpoint'

The checkpoint command "git fast-import" did not flush updates to
refs and marks unless at least one object was created since the
last checkpoint, which has been corrected, as these things can
happen without any new object getting created.

* er/fast-import-dump-refs-on-checkpoint:
fast-import: checkpoint: dump branches/tags/marks even if object_count==0

ref-filter.c: pass empty-string as NULL to atom parsersTaylor Blau Mon, 2 Oct 2017 16:10:34 +0000 (09:10 -0700)

ref-filter.c: pass empty-string as NULL to atom parsers

Peff points out that different atom parsers handle the empty
"sub-argument" list differently. An example of this is the format
"%(refname:)".

Since callers often use `string_list_split` (which splits the empty
string with any delimiter as a 1-ary string_list containing the empty
string), this makes handling empty sub-argument strings non-ergonomic.

Let's fix this by declaring that atom parser implementations must
not care about distinguishing between the empty string "%(refname:)"
and no sub-arguments "%(refname)". Current code aborts, either with
"unrecognised arg" (e.g. "refname:") or "does not take args"
(e.g. "body:") as an error message.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fsmonitor: preserve utf8 filenames in fsmonitor-watchma... Ben Peart Wed, 4 Oct 2017 12:33:39 +0000 (08:33 -0400)

fsmonitor: preserve utf8 filenames in fsmonitor-watchman log

Update the test fsmonitor-watchman integration script to properly
preserve utf8 filenames when outputting the .git/watchman-output.out log
file.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fsmonitor: read entirety of watchman outputAlex Vandiver Wed, 4 Oct 2017 06:27:46 +0000 (23:27 -0700)

fsmonitor: read entirety of watchman output

In Perl, setting $/ sets the string that is used as the "record
separator," which sets the boundary that the `<>` construct reads to.
Setting `local $/ = 0666;` evaluates the octal, getting 438, and
stringifies it. Thus, the later read from `<CHLD_OUT>` stops as soon
as it encounters the string "438" in the watchman output, yielding
invalid JSON; repositories containing filenames with SHA1 hashes are
able to trip this easily.

Set `$/` to undefined, thus slurping all output from watchman. Also
close STDIN which is provided to watchman, to better guarantee that we
cannot deadlock with watchman while both attempting to read.

Signed-off-by: Alex Vandiver <alexmv@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

strbuf doc: reuse after strbuf_release is fineJonathan Nieder Wed, 4 Oct 2017 02:39:54 +0000 (19:39 -0700)

strbuf doc: reuse after strbuf_release is fine

strbuf_release leaves the strbuf in a valid, initialized state, so
there is no need to call strbuf_init after it.

Moreover, this is not likely to change in the future: strbuf_release
leaving the strbuf in a valid state has been easy to maintain and has
been very helpful for Git's robustness and simplicity (e.g.,
preventing use-after-free vulnerabilities).

Document the semantics so the next generation of Git developers can
become familiar with them without reading the implementation. It is
still not advisable to call strbuf_release too often because it is
wasteful, so add a note pointing to strbuf_reset for that.

The same semantics apply to strbuf_detach. Add a similar note to its
docstring to make that clear.

Improved-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

branch: reset instead of release a strbufStefan Beller Tue, 3 Oct 2017 22:17:40 +0000 (15:17 -0700)

branch: reset instead of release a strbuf

Our documentation advises to not re-use a strbuf, after strbuf_release
has been called on it. Use the proper reset instead.

Currently 'strbuf_release' releases and re-initializes the strbuf, so it
is safe, but slow. 'strbuf_reset' only resets the internal length variable,
such that this could also be accounted for as a micro-optimization.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sub-process: use child_process.args instead of child_pr... Johannes Sixt Tue, 3 Oct 2017 20:24:57 +0000 (22:24 +0200)

sub-process: use child_process.args instead of child_process.argv

Currently the argv is only allocated on the stack, and then assigned to
process->argv. When the start_subprocess function goes out of scope,
the local argv variable is eliminated from the stack, but the pointer is
still kept around in process->argv.

Much later when we try to access the same process->argv in
finish_command, this leads us to access a memory location that no longer
contains what we want. As argv0 is only used for printing errors, this
is not easily noticed in normal git operations. However when running
t0021-conversion.sh through valgrind, valgrind rightfully complains:

==21024== Invalid read of size 8
==21024== at 0x2ACF64: finish_command (run-command.c:869)
==21024== by 0x2D6B18: subprocess_exit_handler (sub-process.c:72)
==21024== by 0x2AB41E: cleanup_children (run-command.c:45)
==21024== by 0x2AB526: cleanup_children_on_exit (run-command.c:81)
==21024== by 0x54AD487: __run_exit_handlers (in /usr/lib/libc-2.26.so)
==21024== by 0x54AD4D9: exit (in /usr/lib/libc-2.26.so)
==21024== by 0x11A9EF: handle_builtin (git.c:550)
==21024== by 0x11ABCC: run_argv (git.c:602)
==21024== by 0x11AD8E: cmd_main (git.c:679)
==21024== by 0x1BF125: main (common-main.c:43)
==21024== Address 0x1ffeffec00 is on thread 1's stack
==21024== 1504 bytes below stack pointer
==21024==

These days, the child_process structure has its own args array, and
the standard way to set up its argv[] is to use that one, instead of
assigning to process->argv to point at an array that is outside.
Use that facility automatically fixes this issue.

Reported-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

http-push: fix construction of hex value from pathThomas Gummerer Tue, 3 Oct 2017 19:57:12 +0000 (20:57 +0100)

http-push: fix construction of hex value from path

The get_oid_hex_from_objpath takes care of creating a oid from a
pathname. It does this by memcpy'ing the first two bytes of the path to
the "hex" string, then skipping the '/', and then copying the rest of the
path to the "hex" string. Currently it fails to increase the pointer to
the hex string, so the second memcpy invocation just mashes over what
was copied in the first one, and leaves the last two bytes in the string
uninitialized.

This breaks valgrind in t5540, although the test passes without
valgrind:

==5490== Use of uninitialised value of size 8
==5490== at 0x13C6B5: hexval (cache.h:1238)
==5490== by 0x13C6DB: hex2chr (cache.h:1247)
==5490== by 0x13C734: get_sha1_hex (hex.c:42)
==5490== by 0x13C78E: get_oid_hex (hex.c:53)
==5490== by 0x118BDA: get_oid_hex_from_objpath (http-push.c:1023)
==5490== by 0x118C92: process_ls_object (http-push.c:1038)
==5490== by 0x118E5B: handle_remote_ls_ctx (http-push.c:1077)
==5490== by 0x118227: xml_end_tag (http-push.c:815)
==5490== by 0x50C1448: ??? (in /usr/lib/libexpat.so.1.6.6)
==5490== by 0x50C221B: ??? (in /usr/lib/libexpat.so.1.6.6)
==5490== by 0x50BFBF2: ??? (in /usr/lib/libexpat.so.1.6.6)
==5490== by 0x50C0B24: ??? (in /usr/lib/libexpat.so.1.6.6)
==5490== Uninitialised value was created by a stack allocation
==5490== at 0x118B63: get_oid_hex_from_objpath (http-push.c:1012)
==5490==

Fix this by correctly incrementing the pointer to the "hex" variable, so
the first two bytes are left untouched by the memcpy call, and the last
two bytes are correctly initialized.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

path.c: fix uninitialized memory accessJeff King Tue, 3 Oct 2017 23:30:40 +0000 (19:30 -0400)

path.c: fix uninitialized memory access

In cleanup_path we're passing in a char array, run a memcmp on it, and
run through it without ever checking if something is in the array in the
first place. This can lead us to access uninitialized memory, for
example in t5541-http-push-smart.sh test 7, when run under valgrind:

==4423== Conditional jump or move depends on uninitialised value(s)
==4423== at 0x242FA9: cleanup_path (path.c:35)
==4423== by 0x242FA9: mkpath (path.c:456)
==4423== by 0x256CC7: refname_match (refs.c:364)
==4423== by 0x26C181: count_refspec_match (remote.c:1015)
==4423== by 0x26C181: match_explicit_lhs (remote.c:1126)
==4423== by 0x26C181: check_push_refs (remote.c:1409)
==4423== by 0x2ABB4D: transport_push (transport.c:870)
==4423== by 0x186703: push_with_options (push.c:332)
==4423== by 0x18746D: do_push (push.c:409)
==4423== by 0x18746D: cmd_push (push.c:566)
==4423== by 0x1183E0: run_builtin (git.c:352)
==4423== by 0x11973E: handle_builtin (git.c:539)
==4423== by 0x11973E: run_argv (git.c:593)
==4423== by 0x11973E: main (git.c:698)
==4423== Uninitialised value was created by a heap allocation
==4423== at 0x4C2CD8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4423== by 0x4C2F195: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4423== by 0x2C196B: xrealloc (wrapper.c:137)
==4423== by 0x29A30B: strbuf_grow (strbuf.c:66)
==4423== by 0x29A30B: strbuf_vaddf (strbuf.c:277)
==4423== by 0x242F9F: mkpath (path.c:454)
==4423== by 0x256CC7: refname_match (refs.c:364)
==4423== by 0x26C181: count_refspec_match (remote.c:1015)
==4423== by 0x26C181: match_explicit_lhs (remote.c:1126)
==4423== by 0x26C181: check_push_refs (remote.c:1409)
==4423== by 0x2ABB4D: transport_push (transport.c:870)
==4423== by 0x186703: push_with_options (push.c:332)
==4423== by 0x18746D: do_push (push.c:409)
==4423== by 0x18746D: cmd_push (push.c:566)
==4423== by 0x1183E0: run_builtin (git.c:352)
==4423== by 0x11973E: handle_builtin (git.c:539)
==4423== by 0x11973E: run_argv (git.c:593)
==4423== by 0x11973E: main (git.c:698)
==4423==

Avoid this by using skip_prefix(), which knows not to go beyond the
end of the string.

Reported-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>