gitweb.git
Merge branch 'jk/test-lib-drop-pid-from-results' Junio C Hamano Mon, 12 Sep 2016 22:34:33 +0000 (15:34 -0700)

Merge branch 'jk/test-lib-drop-pid-from-results'

The test framework left the number of tests and success/failure
count in the t/test-results directory, keyed by the name of the
test script plus the process ID. The latter however turned out not
to serve any useful purpose. The process ID part of the filename
has been removed.

* jk/test-lib-drop-pid-from-results:
test-lib: drop PID from test-results/*.count

Merge branch 'jc/am-read-author-file' Junio C Hamano Mon, 12 Sep 2016 22:34:32 +0000 (15:34 -0700)

Merge branch 'jc/am-read-author-file'

Extract a small helper out of the function that reads the
author-script file that "git am" uses internally.

* jc/am-read-author-file:
am: refactor read_author_script()

Merge branch 'jk/diff-submodule-diff-inline' Junio C Hamano Mon, 12 Sep 2016 22:34:31 +0000 (15:34 -0700)

Merge branch 'jk/diff-submodule-diff-inline'

The "git diff --submodule={short,log}" mechanism has been enhanced
to allow "--submodule=diff" to show the patch between the submodule
commits bound to the superproject.

* jk/diff-submodule-diff-inline:
diff: teach diff to display submodule difference with an inline diff
submodule: refactor show_submodule_summary with helper function
submodule: convert show_submodule_summary to use struct object_id *
allow do_submodule_path to work even if submodule isn't checked out
diff: prepare for additional submodule formats
graph: add support for --line-prefix on all graph-aware output
diff.c: remove output_prefix_length field
cache: add empty_tree_oid object and helper function

Sync with maint Junio C Hamano Fri, 9 Sep 2016 05:00:53 +0000 (22:00 -0700)

Sync with maint

* maint:
Prepare for 2.9.4

Start the 2.11 cycle Junio C Hamano Fri, 9 Sep 2016 05:00:35 +0000 (22:00 -0700)

Start the 2.11 cycle

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'bh/diff-highlight-graph' Junio C Hamano Fri, 9 Sep 2016 04:49:52 +0000 (21:49 -0700)

Merge branch 'bh/diff-highlight-graph'

"diff-highlight" script (in contrib/) learned to work better with
"git log -p --graph" output.

* bh/diff-highlight-graph:
diff-highlight: avoid highlighting combined diffs
diff-highlight: add multi-byte tests
diff-highlight: ignore test cruft
diff-highlight: add support for --graph output
diff-highlight: add failing test for handling --graph output
diff-highlight: add some tests

Merge branch 'hv/doc-commit-reference-style' Junio C Hamano Fri, 9 Sep 2016 04:49:51 +0000 (21:49 -0700)

Merge branch 'hv/doc-commit-reference-style'

A small doc update.

* hv/doc-commit-reference-style:
SubmittingPatches: use gitk's "Copy commit summary" format

Merge branch 'sb/submodule-clone-rr' Junio C Hamano Fri, 9 Sep 2016 04:49:50 +0000 (21:49 -0700)

Merge branch 'sb/submodule-clone-rr'

"git clone --recurse-submodules --reference $path $URL" is a way to
reduce network transfer cost by borrowing objects in an existing
$path repository when cloning the superproject from $URL; it
learned to also peek into $path for presence of corresponding
repositories of submodules and borrow objects from there when able.

* sb/submodule-clone-rr:
clone: recursive and reference option triggers submodule alternates
clone: implement optional references
clone: clarify option_reference as required
clone: factor out checking for an alternate path
submodule--helper update-clone: allow multiple references
submodule--helper module-clone: allow multiple references
t7408: merge short tests, factor out testing method
t7408: modernize style

Merge branch 'jh/status-v2-porcelain' Junio C Hamano Fri, 9 Sep 2016 04:49:50 +0000 (21:49 -0700)

Merge branch 'jh/status-v2-porcelain'

Enhance "git status --porcelain" output by collecting more data on
the state of the index and the working tree files, which may
further be used to teach git-prompt (in contrib/) to make fewer
calls to git.

* jh/status-v2-porcelain:
status: unit tests for --porcelain=v2
test-lib-functions.sh: add lf_to_nul helper
git-status.txt: describe --porcelain=v2 format
status: print branch info with --porcelain=v2 --branch
status: print per-file porcelain v2 status data
status: collect per-file data for --porcelain=v2
status: support --porcelain[=<version>]
status: cleanup API to wt_status_print
status: rename long-format print routines

Merge branch 'po/range-doc' Junio C Hamano Fri, 9 Sep 2016 04:49:49 +0000 (21:49 -0700)

Merge branch 'po/range-doc'

Clarify various ways to specify the "revision ranges" in the
documentation.

* po/range-doc:
doc: revisions: sort examples and fix alignment of the unchanged
doc: revisions: show revision expansion in examples
doc: revisions - clarify reachability examples
doc: revisions - define `reachable`
doc: gitrevisions - clarify 'latter case' is revision walk
doc: gitrevisions - use 'reachable' in page description
doc: revisions: single vs multi-parent notation comparison
doc: revisions: extra clarification of <rev>^! notation effects
doc: revisions: give headings for the two and three dot notations
doc: show the actual left, right, and boundary marks
doc: revisions - name the left and right sides
doc: use 'symmetric difference' consistently

Merge branch 'rt/help-unknown' Junio C Hamano Fri, 9 Sep 2016 04:49:48 +0000 (21:49 -0700)

Merge branch 'rt/help-unknown'

"git nosuchcommand --help" said "No manual entry for gitnosuchcommand",
which was not intuitive, given that "git nosuchcommand" said "git:
'nosuchcommand' is not a git command".

* rt/help-unknown:
help: make option --help open man pages only for Git commands
help: introduce option --exclude-guides

Merge branch 'cc/receive-pack-limit' Junio C Hamano Fri, 9 Sep 2016 04:49:47 +0000 (21:49 -0700)

Merge branch 'cc/receive-pack-limit'

An incoming "git push" that attempts to push too many bytes can now
be rejected by setting a new configuration variable at the receiving
end.

* cc/receive-pack-limit:
receive-pack: allow a maximum input size to be specified
unpack-objects: add --max-input-size=<size> option
index-pack: add --max-input-size=<size> option

Merge branch 'jk/format-patch-number-singleton-patch... Junio C Hamano Fri, 9 Sep 2016 04:49:47 +0000 (21:49 -0700)

Merge branch 'jk/format-patch-number-singleton-patch-with-cover'

"git format-patch --cover-letter HEAD^" to format a single patch
with a separate cover letter now numbers the output as [PATCH 0/1]
and [PATCH 1/1] by default.
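
The numbering rule described above can be sketched as follows. This is only an illustration, not the format-patch implementation: the function name `subject_prefixes` and its parameters are hypothetical, and the sketch ignores details such as zero-padding in longer series.

```python
def subject_prefixes(num_patches, cover_letter=False, prefix="PATCH"):
    """Return the bracketed subject prefixes for a patch series (sketch)."""
    # Assumed rule: number the series when it has more than one patch,
    # or when a cover letter accompanies even a single patch.
    numbered = num_patches > 1 or cover_letter
    if not numbered:
        return [f"[{prefix}]"] * num_patches
    # A cover letter takes slot 0; the patches themselves start at 1.
    start = 0 if cover_letter else 1
    return [f"[{prefix} {i}/{num_patches}]"
            for i in range(start, num_patches + 1)]


print(subject_prefixes(1, cover_letter=True))
```

With a cover letter, the single patch is numbered so that the 0/1 letter and the 1/1 patch visibly belong together.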

* jk/format-patch-number-singleton-patch-with-cover:
format-patch: show 0/1 and 1/1 for singleton patch with cover letter

Merge branch 'jk/delta-base-cache' Junio C Hamano Fri, 9 Sep 2016 04:49:46 +0000 (21:49 -0700)

Merge branch 'jk/delta-base-cache'

The delta-base-cache mechanism has been a key to the performance in
a repository with a tightly packed packfile, but it did not scale
well even with a larger value of core.deltaBaseCacheLimit.

* jk/delta-base-cache:
t/perf: add basic perf tests for delta base cache
delta_base_cache: use hashmap.h
delta_base_cache: drop special treatment of blobs
delta_base_cache: use list.h for LRU
release_delta_base_cache: reuse existing detach function
clear_delta_base_cache_entry: use a more descriptive name
cache_or_unpack_entry: drop keep_cache parameter

Start maintenance track for 2.10.x series Junio C Hamano Fri, 9 Sep 2016 04:39:38 +0000 (21:39 -0700)

Start maintenance track for 2.10.x series

Prepare for 2.9.4 Junio C Hamano Fri, 9 Sep 2016 04:37:59 +0000 (21:37 -0700)

Prepare for 2.9.4

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'hv/doc-commit-reference-style' into maint Junio C Hamano Fri, 9 Sep 2016 04:36:03 +0000 (21:36 -0700)

Merge branch 'hv/doc-commit-reference-style' into maint

A small doc update.

* hv/doc-commit-reference-style:
SubmittingPatches: use gitk's "Copy commit summary" format
SubmittingPatches: document how to reference previous commits

Merge branch 'sg/reflog-past-root' into maint Junio C Hamano Fri, 9 Sep 2016 04:36:02 +0000 (21:36 -0700)

Merge branch 'sg/reflog-past-root' into maint

A small test clean-up for a topic introduced in v2.9.1 and later.

* sg/reflog-past-root:
t1410: remove superfluous 'git reflog' from the 'walk past root' test

Merge branch 'rs/mailinfo-lib' into maint Junio C Hamano Fri, 9 Sep 2016 04:36:01 +0000 (21:36 -0700)

Merge branch 'rs/mailinfo-lib' into maint

Small code clean-up.

* rs/mailinfo-lib:
mailinfo: recycle strbuf in check_header()

Merge branch 'jk/tighten-alloc' into maint Junio C Hamano Fri, 9 Sep 2016 04:36:00 +0000 (21:36 -0700)

Merge branch 'jk/tighten-alloc' into maint

Small code and comment clean-up.

* jk/tighten-alloc:
receive-pack: use FLEX_ALLOC_MEM in queue_command()
correct FLEXPTR_* example in comment

Merge branch 'rs/use-strbuf-add-unique-abbrev' into... Junio C Hamano Fri, 9 Sep 2016 04:35:59 +0000 (21:35 -0700)

Merge branch 'rs/use-strbuf-add-unique-abbrev' into maint

A small code clean-up.

* rs/use-strbuf-add-unique-abbrev:
use strbuf_add_unique_abbrev() for adding short hashes

Merge branch 'rs/merge-recursive-string-list-init'... Junio C Hamano Fri, 9 Sep 2016 04:35:59 +0000 (21:35 -0700)

Merge branch 'rs/merge-recursive-string-list-init' into maint

A small code clean-up.

* rs/merge-recursive-string-list-init:
merge-recursive: use STRING_LIST_INIT_NODUP

Merge branch 'rs/merge-add-strategies-simplification... Junio C Hamano Fri, 9 Sep 2016 04:35:58 +0000 (21:35 -0700)

Merge branch 'rs/merge-add-strategies-simplification' into maint

A small code clean-up.

* rs/merge-add-strategies-simplification:
merge: use string_list_split() in add_strategies()

Merge branch 'ls/packet-line-protocol-doc-fix' into... Junio C Hamano Fri, 9 Sep 2016 04:35:57 +0000 (21:35 -0700)

Merge branch 'ls/packet-line-protocol-doc-fix' into maint

Correct an age-old calco (is that a typo-like word for calc)
in the documentation.

* ls/packet-line-protocol-doc-fix:
pack-protocol: fix maximum pkt-line size

Merge branch 'bw/mingw-avoid-inheriting-fd-to-lockfile... Junio C Hamano Fri, 9 Sep 2016 04:35:56 +0000 (21:35 -0700)

Merge branch 'bw/mingw-avoid-inheriting-fd-to-lockfile' into maint

The tempfile (hence its user lockfile) API lets the caller open
a file descriptor to a temporary file, write into it and then
finalize it by first closing the filehandle and then either
removing or renaming the temporary file. When the process spawns a
subprocess after obtaining the file descriptor, and if the
subprocess has not exited when the attempt to remove or rename is
made, the last step fails on Windows, because the subprocess has
the file descriptor still open. Open tempfile with O_CLOEXEC flag
to avoid this (on Windows, this is mapped to O_NOINHERIT).
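
As an illustration of the idea (not Git's C code), the sketch below opens a temporary file with a close-on-exec style flag so that a spawned subprocess does not keep the descriptor open. The helper name `open_tempfile_no_inherit` and the file name are hypothetical. Python already makes new descriptors non-inheritable by default, so the explicit flags are here only to mirror the description above.

```python
import os
import tempfile


def open_tempfile_no_inherit(path):
    """Create `path` so that child processes do not inherit the descriptor."""
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    # O_CLOEXEC is defined on POSIX builds, O_NOINHERIT on Windows builds;
    # getattr() keeps the sketch portable across both.
    flags |= getattr(os, "O_CLOEXEC", 0)
    flags |= getattr(os, "O_NOINHERIT", 0)
    return os.open(path, flags, 0o600)


with tempfile.TemporaryDirectory() as tmpdir:
    fd = open_tempfile_no_inherit(os.path.join(tmpdir, "index.lock"))
    inheritable = os.get_inheritable(fd)
    os.close(fd)

print(inheritable)
```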

* bw/mingw-avoid-inheriting-fd-to-lockfile:
mingw: ensure temporary file handles are not inherited by child processes
t6026-merge-attr: child processes must not inherit index.lock handles

Merge branch 'dg/document-git-c-in-git-config-doc'... Junio C Hamano Fri, 9 Sep 2016 04:35:56 +0000 (21:35 -0700)

Merge branch 'dg/document-git-c-in-git-config-doc' into maint

The "git -c var[=val] cmd" facility to append a configuration
variable definition at the end of the search order was described in
git(1) manual page, but not in git-config(1), which is a more likely
place for people to look when they ask "can I make a one-shot
override, and if so how?"

* dg/document-git-c-in-git-config-doc:
doc: mention `git -c` in git-config(1)

Merge branch 'js/no-html-bypass-on-windows' into maint Junio C Hamano Fri, 9 Sep 2016 04:35:55 +0000 (21:35 -0700)

Merge branch 'js/no-html-bypass-on-windows' into maint

On Windows, help.browser configuration variable used to be ignored,
which has been corrected.

* js/no-html-bypass-on-windows:
Revert "display HTML in default browser using Windows' shell API"

Merge branch 'jk/difftool-command-not-found' into maint Junio C Hamano Fri, 9 Sep 2016 04:35:54 +0000 (21:35 -0700)

Merge branch 'jk/difftool-command-not-found' into maint

"git difftool" by default ignores the error exit from the backend
commands it spawns, because often they signal that they found
differences by exiting with a non-zero status code just like "diff"
does; the exit status codes 126 and above however are special in
that they are used to signal that the command is not executable,
does not exist, or was killed by a signal. "git difftool" has been
taught to notice these exit status codes.
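
The distinction can be sketched like this. It is a simplified illustration, not the difftool code: the function names are hypothetical, and `trust_exit_code` merely stands in for an opt-in setting that makes ordinary non-zero statuses count as failures.

```python
def is_fatal_exit_status(status):
    """Statuses 126 and above mean the tool could not run or was killed."""
    return status >= 126


def should_abort(status, trust_exit_code=False):
    """Decide whether a backend's exit status should stop the run."""
    if is_fatal_exit_status(status):
        # Not executable, not found, or killed by a signal: always honored.
        return True
    # Ordinary non-zero statuses often just mean "differences found",
    # so they only count when the caller opted in.
    return trust_exit_code and status != 0


print(should_abort(1), should_abort(127))
```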

* jk/difftool-command-not-found:
difftool: always honor fatal error exit codes

Merge branch 'sb/checkout-explit-detach-no-advice'... Junio C Hamano Fri, 9 Sep 2016 04:35:54 +0000 (21:35 -0700)

Merge branch 'sb/checkout-explit-detach-no-advice' into maint

"git checkout --detach <branch>" used to give the same advice
message as the one issued when "git checkout <tag>" (or anything
that is not a branch name) is given, but asking with "--detach" is
an explicit enough sign that the user knows what is going on. The
advice message has been squelched in this case.

* sb/checkout-explit-detach-no-advice:
checkout: do not mention detach advice for explicit --detach option

Merge branch 'rs/pull-signed-tag' into maint Junio C Hamano Fri, 9 Sep 2016 04:35:54 +0000 (21:35 -0700)

Merge branch 'rs/pull-signed-tag' into maint

When "git merge-recursive" works on history with many criss-cross
merges in "verbose" mode, the names the command assigns to the
virtual merge bases could have overwritten each other by unintended
reuse of the same piece of memory.

* rs/pull-signed-tag:
commit: use FLEX_ARRAY in struct merge_remote_desc
merge-recursive: fix verbose output for multiple base trees
commit: factor out set_merge_remote_desc()
commit: use xstrdup() in get_merge_parent()

Merge branch 'js/test-lint-pathname' into maint Junio C Hamano Fri, 9 Sep 2016 04:35:54 +0000 (21:35 -0700)

Merge branch 'js/test-lint-pathname' into maint

The "t/" hierarchy is prone to get an unusual pathname; "make test"
has been taught to make sure they do not contain paths that cannot
be checked out on Windows (and the mechanism can be reusable to
catch pathnames that are not portable to other platforms as need
arises).

* js/test-lint-pathname:
t/Makefile: ensure that paths are valid on platforms we care

Merge branch 'js/mv-dir-to-new-directory' into maint Junio C Hamano Fri, 9 Sep 2016 04:35:54 +0000 (21:35 -0700)

Merge branch 'js/mv-dir-to-new-directory' into maint

"git mv dir non-existing-dir/" did not work in some environments
the same way as existing mainstream platforms. The code now moves
"dir" to "non-existing-dir", without relying on rename("A", "B/")
to strip the trailing '/'.

* js/mv-dir-to-new-directory:
git mv: do not keep slash in `git mv dir non-existing-dir/`

Merge branch 'js/import-tars-hardlinks' into maint Junio C Hamano Fri, 9 Sep 2016 04:35:54 +0000 (21:35 -0700)

Merge branch 'js/import-tars-hardlinks' into maint

"import-tars" fast-import script (in contrib/) used to ignore a
hardlink target and replaced it with an empty file, which has been
corrected to record the same blob as the other file the hardlink is
shared with.

* js/import-tars-hardlinks:
import-tars: support hard links

Merge branch 'ms/document-pack-window-memory-is-per... Junio C Hamano Fri, 9 Sep 2016 04:35:53 +0000 (21:35 -0700)

Merge branch 'ms/document-pack-window-memory-is-per-thread' into maint

* ms/document-pack-window-memory-is-per-thread:
document git-repack interaction of pack.threads and pack.windowMemory

Merge branch 'jk/push-force-with-lease-creation' into... Junio C Hamano Fri, 9 Sep 2016 04:35:53 +0000 (21:35 -0700)

Merge branch 'jk/push-force-with-lease-creation' into maint

"git push --force-with-lease" already had enough logic to allow
ensuring that such a push results in creation of a ref (i.e. the
receiving end did not have another push from sideways that would be
discarded by our force-pushing), but didn't expose this possibility
to the users. It does so now.

* jk/push-force-with-lease-creation:
t5533: make it pass on case-sensitive filesystems
push: allow pushing new branches with --force-with-lease
push: add shorthand for --force-with-lease branch creation
Documentation/git-push: fix placeholder formatting

Merge branch 'jk/reflog-date' into maint Junio C Hamano Fri, 9 Sep 2016 04:35:52 +0000 (21:35 -0700)

Merge branch 'jk/reflog-date' into maint

The reflog output format is documented better, and a new format
--date=unix to report the seconds-since-epoch (without timezone)
has been added.
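
To illustrate what "seconds-since-epoch (without timezone)" means, the sketch below converts one commit date from this page, written in two different zones, to a Unix timestamp. The helper name `to_unix` is hypothetical, and the example uses Python's standard library rather than Git's own date code.

```python
from email.utils import parsedate_to_datetime


def to_unix(date_string):
    """Convert an RFC 2822 style date to seconds since the epoch."""
    return int(parsedate_to_datetime(date_string).timestamp())


# The same instant, first in UTC and then in the committer's zone.
utc = to_unix("Mon, 12 Sep 2016 22:34:33 +0000")
local = to_unix("Mon, 12 Sep 2016 15:34:33 -0700")

print(utc == local)
```

Both spellings collapse to a single integer, which is why the format carries no timezone.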

* jk/reflog-date:
date: clarify --date=raw description
date: add "unix" format
date: document and test "raw-local" mode
doc/pretty-formats: explain shortening of %gd
doc/pretty-formats: describe index/time formats for %gd
doc/rev-list-options: explain "-g" output formats
doc/rev-list-options: clarify "commit@{Nth}" for "-g" option

Merge branch 'jc/renormalize-merge-kill-safer-crlf... Junio C Hamano Fri, 9 Sep 2016 04:35:51 +0000 (21:35 -0700)

Merge branch 'jc/renormalize-merge-kill-safer-crlf' into maint

"git merge" with renormalization did not work well with
merge-recursive, due to "safer crlf" conversion kicking in when it
shouldn't.

* jc/renormalize-merge-kill-safer-crlf:
merge: avoid "safer crlf" during recording of merge results
convert: unify the "auto" handling of CRLF

Merge branch 'jk/common-main' into maint Junio C Hamano Fri, 9 Sep 2016 04:35:50 +0000 (21:35 -0700)

Merge branch 'jk/common-main' into maint

There are certain house-keeping tasks that need to be performed at
the very beginning of any Git program, and programs that are not
built-in commands had to do them exactly the same way as "git"
potty does. It was easy to make mistakes in one-off standalone
programs (like test helpers). A common "main()" function that
calls cmd_main() of the individual program has been introduced to
make it harder to make mistakes.

* jk/common-main:
mingw: declare main()'s argv as const
common-main: call git_setup_gettext()
common-main: call restore_sigpipe_to_default()
common-main: call sanitize_stdfds()
common-main: call git_extract_argv0_path()
add an extra level of indirection to main()

Git 2.10 v2.10.0 Junio C Hamano Fri, 2 Sep 2016 16:05:47 +0000 (09:05 -0700)

Git 2.10

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge tag 'l10n-2.10.0-rnd2.2' of git://github.com... Junio C Hamano Fri, 2 Sep 2016 15:48:14 +0000 (08:48 -0700)

Merge tag 'l10n-2.10.0-rnd2.2' of git://github.com/git-l10n/git-po

l10n-2.10.0-rnd2.2

* tag 'l10n-2.10.0-rnd2.2' of git://github.com/git-l10n/git-po:
l10n: Updated Vietnamese translation for v2.10.0-rc2 (2757t)

Merge branch 'master' of https://github.com/vnwildman/git Jiang Xin Fri, 2 Sep 2016 13:29:48 +0000 (21:29 +0800)

Merge branch 'master' of https://github.com/vnwildman/git

* 'master' of https://github.com/vnwildman/git:
l10n: Updated Vietnamese translation for v2.10.0-rc2 (2757t)

diff: teach diff to display submodule difference with... Jacob Keller Wed, 31 Aug 2016 23:27:25 +0000 (16:27 -0700)

diff: teach diff to display submodule difference with an inline diff

Teach git-diff and friends a new format for displaying the difference
of a submodule. The new format is an inline diff of the contents of the
submodule between the commit range of the update. This allows the user
to see the actual code change caused by a submodule update.

Add tests for the new format and option.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

submodule: refactor show_submodule_summary with helper... Jacob Keller Wed, 31 Aug 2016 23:27:24 +0000 (16:27 -0700)

submodule: refactor show_submodule_summary with helper function

A future patch is going to add a new submodule diff format which
displays an inline diff of the submodule changes. To make this easier,
and to ensure that both submodule diff formats use the same initial
header, factor out show_submodule_header() function which will print the
current submodule header line, and then leave the show_submodule_summary
function to lookup and print the submodule log format.

This does create one format change in that "(revision walker failed)"
will now be displayed on its own line rather than as part of the message
because we no longer perform this step directly in the header display
flow. However, this is a rare case, as most causes of the failure will be
due to a missing commit, which we already check for and avoid earlier.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

submodule: convert show_submodule_summary to use struct... Jacob Keller Wed, 31 Aug 2016 23:27:23 +0000 (16:27 -0700)

submodule: convert show_submodule_summary to use struct object_id *

Since we're going to be changing this function in a future patch, let's
go ahead and convert this to use object_id now.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

allow do_submodule_path to work even if submodule isn... Jacob Keller Wed, 31 Aug 2016 23:27:22 +0000 (16:27 -0700)

allow do_submodule_path to work even if submodule isn't checked out

Currently, do_submodule_path will attempt locating the .git directory by
using read_gitfile on <path>/.git. If this fails it just assumes the
<path>/.git is actually a git directory.

This is good because it allows for handling submodules which were cloned
in a regular manner first before being added to the superproject.

Unfortunately this fails if the <path> is not actually checked out any
longer, such as by removing the directory.

Fix this by checking if the directory we found is actually a gitdir. In
the case it is not, attempt to lookup the submodule configuration and
find the name of where it is stored in the .git/modules/ directory of
the superproject.

If we can't locate the submodule configuration, this might occur because
for example a submodule gitlink was added but the corresponding
.gitmodules file was not properly updated. A die() here would not be
pleasant to the users of submodule diff formats, so instead, modify
do_submodule_path() to return an error code:

- git_pathdup_submodule() returns NULL when we fail to find a path.
- strbuf_git_path_submodule() propagates the error code to the caller.

Modify the callers of these functions to check the error code and fail
properly. This ensures we don't attempt to use a bad path that doesn't
match the corresponding submodule.

Because this change fixes add_submodule_odb() to work even if the
submodule is not checked out, update the wording of the submodule log
diff format to correctly display that the submodule is "not initialized"
instead of "not checked out"

Add tests to ensure this change works as expected.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: prepare for additional submodule formats Jacob Keller Wed, 31 Aug 2016 23:27:21 +0000 (16:27 -0700)

diff: prepare for additional submodule formats

A future patch will add a new format for displaying the difference of
a submodule. Make it easier by changing how we store the current
selected format. Replace the DIFF_OPT flag with an enumeration, as each
format will be mutually exclusive.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

graph: add support for --line-prefix on all graph-aware... Jacob Keller Wed, 31 Aug 2016 23:27:20 +0000 (16:27 -0700)

graph: add support for --line-prefix on all graph-aware output

Add an extension to git-diff and git-log (and any other graph-aware
displayable output) such that "--line-prefix=<string>" will print the
additional line-prefix on every line of output.

To make this work, we have to fix a few bugs in the graph API that force
graph_show_commit_msg to be used only when you have a valid graph.
Additionally, we extend the default_diff_output_prefix handler to work
even when no graph is enabled.

This is somewhat of a hack on top of the graph API, but I think it
should be acceptable here.

This will be used by a future extension of submodule display which
displays the submodule diff as the actual diff between the pre and post
commit in the submodule project.

Add some tests for both git-log and git-diff to ensure that the prefix
is honored correctly.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff.c: remove output_prefix_length field Junio C Hamano Wed, 31 Aug 2016 23:27:19 +0000 (16:27 -0700)

diff.c: remove output_prefix_length field

"diff/log --stat" has a logic that determines the display columns
available for the diffstat part of the output and apportions it for
pathnames and diffstat graph automatically.

5e71a84a (Add output_prefix_length to diff_options, 2012-04-16)
added the output_prefix_length field to diff_options structure to
allow this logic to subtract the display columns used for the
history graph part from the total "terminal width"; this matters
when the "git log --graph -p" option is in use.

The field must be set to the number of display columns needed to
show the output from the output_prefix() callback, which is error
prone. As there is only one user of the field, and the user has the
actual value of the prefix string, let's get rid of the field and
have the user count the display width itself.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

cache: add empty_tree_oid object and helper function Jacob Keller Wed, 31 Aug 2016 23:27:18 +0000 (16:27 -0700)

cache: add empty_tree_oid object and helper function

Similar to is_null_oid() and is_empty_blob_sha1(), add an
empty_tree_oid along with helper function is_empty_tree_oid(). For
completeness, also add an "is_empty_tree_sha1()",
"is_empty_blob_sha1()", "is_empty_tree_oid()" and "is_empty_blob_oid()"
helpers.

To ensure we only get one singleton, implement EMPTY_BLOB_SHA1_BIN as
simply getting the hash of empty_blob_oid structure.
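
For readers unfamiliar with these well-known object names, the sketch below derives the empty-tree and empty-blob SHA-1 values from Git's "type, size, NUL, payload" object hashing. The Python names (`git_object_sha1`, `EMPTY_TREE`, `is_empty_tree`) are illustrative stand-ins, not the C helpers added by this commit.

```python
import hashlib


def git_object_sha1(obj_type, payload=b""):
    """Hash an object the way Git does: '<type> <size>\\0' + payload."""
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


# One constant per empty object, computed once and compared against.
EMPTY_TREE = git_object_sha1("tree")
EMPTY_BLOB = git_object_sha1("blob")


def is_empty_tree(hex_oid):
    return hex_oid == EMPTY_TREE


def is_empty_blob(hex_oid):
    return hex_oid == EMPTY_BLOB


print(EMPTY_TREE)
print(EMPTY_BLOB)
```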

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

A few more fixes before the final 2.10 Junio C Hamano Wed, 31 Aug 2016 17:21:05 +0000 (10:21 -0700)

A few more fixes before the final 2.10

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge tag 'l10n-2.10.0-rnd2' of git://github.com/git... Junio C Hamano Wed, 31 Aug 2016 17:04:14 +0000 (10:04 -0700)

Merge tag 'l10n-2.10.0-rnd2' of git://github.com/git-l10n/git-po

l10n-2.10.0-rnd2

* tag 'l10n-2.10.0-rnd2' of git://github.com/git-l10n/git-po:
l10n: zh_CN: for git v2.10.0 l10n round 2
l10n: ca.po: update translation
l10n: fr.po v2.10.0-rc2
l10n: sv.po: Update Swedish translation (2757t0f0u)
l10n: git.pot: v2.10.0 round 2 (12 new, 44 removed)
l10n: Updated Vietnamese translation for v2.10.0 (2789t)
l10n: pt_PT: update Portuguese translation
l10n: pt_PT: merge git.pot
l10n: ko.po: Update Korean translation
l10n: git.pot: v2.10.0 round 1 (248 new, 56 removed)

Merge branch 'ls/packet-line-protocol-doc-fix' Junio C Hamano Wed, 31 Aug 2016 17:03:51 +0000 (10:03 -0700)

Merge branch 'ls/packet-line-protocol-doc-fix'

Correct an age-old calco (is that a typo-like word for calc)
in the documentation.

* ls/packet-line-protocol-doc-fix:
pack-protocol: fix maximum pkt-line size

Merge branch 'mh/blame-worktree' Junio C Hamano Wed, 31 Aug 2016 17:03:50 +0000 (10:03 -0700)

Merge branch 'mh/blame-worktree'

* mh/blame-worktree:
blame: fix segfault on untracked files

Merge branch 'kw/patch-ids-optim' Junio C Hamano Wed, 31 Aug 2016 17:03:49 +0000 (10:03 -0700)

Merge branch 'kw/patch-ids-optim'

* kw/patch-ids-optim:
p3400: make test script executable

diff-highlight: avoid highlighting combined diffs Jeff King Wed, 31 Aug 2016 05:05:38 +0000 (01:05 -0400)

diff-highlight: avoid highlighting combined diffs

The algorithm in diff-highlight only understands how to look
at two sides of a diff; it cannot correctly handle combined
diffs with multiple preimages. Often highlighting does not
trigger at all for these diffs because the line counts do
not match up. E.g., if we see:

- ours
-theirs
++resolved

we would not bother highlighting; it naively looks like a
single line went away, and then a separate hunk added
another single line.

But of course there are exceptions. E.g., if the other side
deleted the line, we might see:

- ours
++resolved

which looks like we dropped " ours" and added "+resolved".
This is only a small highlighting glitch (we highlight the
space and the "+" along with the content), but it's also the
tip of the iceberg. Even if we learned to find the true
content here (by noticing we are in a 3-way combined diff
and marking _two_ characters from the front of the line as
uninteresting), there are other more complicated cases where
we really do need to handle a 3-way hunk.

Let's just punt for now; we can recognize combined diffs by
the presence of extra "@" symbols in the hunk header, and
treat them as non-diff content.
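
The detection described above can be sketched as follows. The real diff-highlight script is written in Perl and also has to cope with color codes and graph prefixes; this simplified Python version, with the hypothetical name `is_combined_hunk_header`, only shows the "extra @ symbols" test on a plain line.

```python
import re

# A two-way hunk header starts with exactly "@@ "; a combined diff with
# N parents starts with N+1 "@" characters, e.g. "@@@ " for a 3-way diff.
HUNK_HEADER = re.compile(r"^(@{2,}) ")


def is_combined_hunk_header(line):
    match = HUNK_HEADER.match(line)
    return bool(match) and len(match.group(1)) > 2


print(is_combined_hunk_header("@@ -1,3 +1,3 @@"))
print(is_combined_hunk_header("@@@ -1,2 -1,2 +1,3 @@@"))
```

A highlighter using this check would pass the lines of such a hunk through untouched instead of pairing removed and added lines.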

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff-highlight: add multi-byte tests Jeff King Wed, 31 Aug 2016 05:03:10 +0000 (01:03 -0400)

diff-highlight: add multi-byte tests

Now that we have a test suite for diff highlight, we can
show off the improvements from 8d00662 (diff-highlight: do
not split multibyte characters, 2015-04-03).

While we're at it, we can also add another case that
_doesn't_ work: combining code points are treated as their
own unit, which means that we may stick colors between them
and the character they are modifying (with the result that
the color is not shown in an xterm, though it's possible
that other terminals err the other way, and show the color
but not the accent). There's no fix here, but let's
document it as a failure.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff-highlight: ignore test cruft Jeff King Wed, 31 Aug 2016 05:02:53 +0000 (01:02 -0400)

diff-highlight: ignore test cruft

These are the same as in the normal t/.gitignore, with the
exception of ".prove", as our Makefile does not support it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

help: make option --help open man pages only for Git... Ralf Thielow Fri, 26 Aug 2016 17:58:36 +0000 (19:58 +0200)

help: make option --help open man pages only for Git commands

If option --help is passed to a Git command, we try to open
the man page of that command. However, we do it for both commands
and concepts. Make sure it is an actual command.

As a result, "git <concept> --help" no longer works, while
"git help <concept>" still works.

Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

help: introduce option --exclude-guides Ralf Thielow Fri, 26 Aug 2016 17:58:35 +0000 (19:58 +0200)

help: introduce option --exclude-guides

Introduce option --exclude-guides to the help command. With this option
being passed, "git help" will open man pages only for actual commands.

Since we know it is a command, we can use function help_unknown_command
to give the user advice on typos.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

am: refactor read_author_script() Junio C Hamano Tue, 30 Aug 2016 19:36:42 +0000 (12:36 -0700)

am: refactor read_author_script()

By splitting the part that reads from a file and the part that
parses the variable definitions from the contents, make the latter
more reusable in the future.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

test-lib: drop PID from test-results/*.count Jeff King Tue, 30 Aug 2016 08:43:57 +0000 (04:43 -0400)

test-lib: drop PID from test-results/*.count

Each test run generates a "count" file in t/test-results
that stores the number of successful, failed, etc., tests.
If you run "t1234-foo.sh", that file is named
"t/test-results/t1234-foo-$$.count".

The addition of the PID there serves no purpose, and
makes analysis of the count files harder.

The presence of the PID dates back to 2d84e9f (Modify
test-lib.sh to output stats to t/test-results/*,
2008-06-08), but no reasoning is given there. Looking at the
current code, we can see that other files we write to
test-results (like *.exit and *.out) do _not_ have the PID
included. So the presence of the PID does not meaningfully
allow one to store the results from multiple runs anyway.

Moreover, anybody wishing to read the *.count files to
aggregate results has to deal with the presence of multiple
files for a given test (and figure out which one is the most
recent based on their timestamps!). The only consumer of
these files is the aggregate.sh script, which arguably gets
this wrong. If a test is run multiple times, its counts will
appear multiple times in the total (I say arguably only
because the desired semantics aren't documented anywhere,
but I have trouble seeing how this behavior could be
useful).

So let's just drop the PID, which fixes aggregate.sh, and
will make new features based around the count files easier
to write.

Note that since the count-file may already exist (when
re-running a test), we also switch the "cat" from appending
to truncating. The use of append here was pointless in the
first place, as we expected to always write to a unique file.
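
To make the double counting concrete, here is a minimal sketch in
Python (not the real aggregate.sh, which is a shell script). It
assumes each count file holds "<key> <number>" lines; the helper
names and the sample numbers are made up for illustration.

```python
import os
import tempfile
from collections import Counter


def aggregate_counts(results_dir):
    """Sum the "<key> <number>" lines of every *.count file in results_dir."""
    totals = Counter()
    for name in sorted(os.listdir(results_dir)):
        if not name.endswith(".count"):
            continue
        with open(os.path.join(results_dir, name)) as fh:
            for line in fh:
                parts = line.split()
                if len(parts) != 2:
                    continue  # skip anything that is not "<key> <number>"
                key, value = parts
                totals[key] += int(value)
    return totals


def write_count(results_dir, name, total, success, failed):
    # Mode "w" truncates, mirroring the switch from appending to truncating.
    with open(os.path.join(results_dir, name), "w") as fh:
        fh.write(f"total {total}\nsuccess {success}\nfailed {failed}\n")


with tempfile.TemporaryDirectory() as old, tempfile.TemporaryDirectory() as new:
    # Old naming: two runs of the same script leave two files behind.
    write_count(old, "t1234-foo-1111.count", 5, 4, 1)
    write_count(old, "t1234-foo-2222.count", 5, 5, 0)
    # New naming: the second run overwrites the results of the first.
    write_count(new, "t1234-foo.count", 5, 4, 1)
    write_count(new, "t1234-foo.count", 5, 5, 0)
    old_totals = aggregate_counts(old)
    new_totals = aggregate_counts(new)

print(dict(old_totals))
print(dict(new_totals))
```

With the PID in the name, the five tests of the script are counted
twice and the stale failure from the first run stays in the total;
without it, only the most recent run is counted.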

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-protocol: fix maximum pkt-line size Lars Schneider Mon, 29 Aug 2016 17:55:09 +0000 (19:55 +0200)

pack-protocol: fix maximum pkt-line size

According to LARGE_PACKET_MAX in pkt-line.h the maximal length of a
pkt-line packet is 65520 bytes. The pkt-line header takes 4 bytes and
therefore the pkt-line data component must not exceed 65516 bytes.
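
The arithmetic can be sketched as follows. This is an illustration
in Python rather than git's C code; HEADER_LEN, DATA_MAX and
pkt_line() are names chosen for the example, and the framing assumed
is four hex digits giving the total length, header included.

```python
LARGE_PACKET_MAX = 65520                  # maximal total packet length
HEADER_LEN = 4                            # four hex digits of length
DATA_MAX = LARGE_PACKET_MAX - HEADER_LEN  # 65516 bytes of payload


def pkt_line(data: bytes) -> bytes:
    """Frame data as one pkt-line: hex length (header included), then data."""
    if len(data) > DATA_MAX:
        raise ValueError(f"pkt-line data exceeds {DATA_MAX} bytes")
    return b"%04x" % (len(data) + HEADER_LEN) + data


print(pkt_line(b"hello\n"))
print(pkt_line(b"x" * DATA_MAX)[:4])
```

A payload of exactly 65516 bytes yields a length header of 65520;
one more byte is rejected.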

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

l10n: zh_CN: for git v2.10.0 l10n round 2 Jiang Xin Sun, 28 Aug 2016 02:18:12 +0000 (10:18 +0800)

l10n: zh_CN: for git v2.10.0 l10n round 2

Update 215 translations (2757t0f0u) for git v2.10.0-rc2.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>

p3400: make test script executable René Scharfe Sun, 28 Aug 2016 12:39:27 +0000 (14:39 +0200)

p3400: make test script executable

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff-highlight: add support for --graph output Brian Henderson Mon, 29 Aug 2016 17:33:47 +0000 (10:33 -0700)

diff-highlight: add support for --graph output

Signed-off-by: Brian Henderson <henderson.bj@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff-highlight: add failing test for handling --graph... Brian Henderson Mon, 29 Aug 2016 17:33:46 +0000 (10:33 -0700)

diff-highlight: add failing test for handling --graph output

Signed-off-by: Brian Henderson <henderson.bj@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff-highlight: add some tests Brian Henderson Mon, 29 Aug 2016 17:33:45 +0000 (10:33 -0700)

diff-highlight: add some tests

Signed-off-by: Brian Henderson <henderson.bj@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blame: fix segfault on untracked files Thomas Gummerer Sat, 27 Aug 2016 20:01:50 +0000 (21:01 +0100)

blame: fix segfault on untracked files

Since 3b75ee9 ("blame: allow to blame paths freshly added to the index",
2016-07-16) git blame also looks at the index to determine if there is a
file that was freshly added to the index.

cache_name_pos returns -pos - 1 in case no match is found, or
if the name matches, but the entry has a stage other than 0. As git
blame should work for unmerged files, it uses strcmp to determine
whether the name of the returned position matches, in which case the
file exists, but is merely unmerged, or if the file actually doesn't
exist in the index.

If the repository is empty, or if the file would lexicographically be
sorted as the last file in the repository, -cache_name_pos - 1 is
outside of the length of the active_cache array, causing git blame to
segfault. Guard against that, and die() normally to restore the old
behaviour.
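
The failure mode can be modelled with a short Python sketch. It is
not the C code from blame; the index is assumed to be a sorted list
of (name, stage) pairs, and name_pos() and index_has_path() are
hypothetical helpers standing in for cache_name_pos() and the check
built on top of it.

```python
import bisect


def name_pos(entries, name):
    """Return the position of (name, 0), or -pos - 1 if it is absent."""
    pos = bisect.bisect_left(entries, (name, 0))
    if pos < len(entries) and entries[pos] == (name, 0):
        return pos
    return -pos - 1


def index_has_path(entries, name):
    """True if name is in the index at any stage (merged or unmerged)."""
    pos = name_pos(entries, name)
    if pos >= 0:
        return True
    pos = -pos - 1  # where (name, 0) would have been inserted
    # The guard: pos can equal len(entries) when the index is empty or
    # the name sorts after every entry, so check before looking there.
    return pos < len(entries) and entries[pos][0] == name


index = [("a", 0), ("b", 1), ("b", 2), ("b", 3)]
print(index_has_path(index, "b"))  # unmerged entry, found by the name check
print(index_has_path(index, "z"))  # sorts last, insertion point == len(index)
print(index_has_path([], "f"))     # empty index
```

Without the "pos < len(entries)" test, the last two calls would read
one element past the end of the list, which is the Python analogue
of the out-of-bounds access described above.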

Reported-by: Simon Ruderich <simon@ruderich.org>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

l10n: ca.po: update translation Alex Henrie Sun, 28 Aug 2016 16:32:56 +0000 (10:32 -0600)

l10n: ca.po: update translation

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>

l10n: fr.po v2.10.0-rc2 Jean-Noel Avila Sat, 20 Aug 2016 14:20:17 +0000 (16:20 +0200)

l10n: fr.po v2.10.0-rc2

Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>

l10n: Updated Vietnamese translation for v2.10.0-rc2... Tran Ngoc Quan Sun, 28 Aug 2016 00:23:30 +0000 (07:23 +0700)

l10n: Updated Vietnamese translation for v2.10.0-rc2 (2757t)

Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>

l10n: sv.po: Update Swedish translation (2757t0f0u) Peter Krefting Fri, 26 Aug 2016 13:27:24 +0000 (14:27 +0100)

l10n: sv.po: Update Swedish translation (2757t0f0u)

Signed-off-by: Peter Krefting <peter@softwolves.pp.se>

Merge branch 'master' of https://github.com/vnwildman/git Jiang Xin Sat, 27 Aug 2016 15:36:16 +0000 (23:36 +0800)

Merge branch 'master' of https://github.com/vnwildman/git

* 'master' of https://github.com/vnwildman/git:
l10n: Updated Vietnamese translation for v2.10.0 (2789t)

l10n: git.pot: v2.10.0 round 2 (12 new, 44 removed) Jiang Xin Sat, 27 Aug 2016 15:23:26 +0000 (23:23 +0800)

l10n: git.pot: v2.10.0 round 2 (12 new, 44 removed)

Generate po/git.pot from v2.10.0-rc2 for git v2.10.0 l10n round 2.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>

Merge branch 'master' of git://github.com/git-l10n... Jiang Xin Sat, 27 Aug 2016 15:14:27 +0000 (23:14 +0800)

Merge branch 'master' of git://github.com/git-l10n/git-po

* 'master' of git://github.com/git-l10n/git-po:
l10n: pt_PT: update Portuguese translation
l10n: pt_PT: merge git.pot
l10n: ko.po: Update Korean translation
l10n: git.pot: v2.10.0 round 1 (248 new, 56 removed)

l10n: Updated Vietnamese translation for v2.10.0 (2789t) Tran Ngoc Quan Sat, 27 Aug 2016 02:15:28 +0000 (09:15 +0700)

l10n: Updated Vietnamese translation for v2.10.0 (2789t)

Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>

SubmittingPatches: use gitk's "Copy commit summary... Beat Bolli Fri, 26 Aug 2016 16:59:01 +0000 (18:59 +0200)

SubmittingPatches: use gitk's "Copy commit summary" format

Update the suggestion in 175d38ca ("SubmittingPatches: document how
to reference previous commits", 2016-07-28) on the format to refer
to a commit to match what gitk has been giving since last year with
its "Copy commit summary" command; also mention this as one of the
ways to obtain a commit reference in this format.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Git 2.10-rc2 v2.10.0-rc2 Junio C Hamano Fri, 26 Aug 2016 20:59:20 +0000 (13:59 -0700)

Git 2.10-rc2

Signed-off-by: Junio C Hamano <gitster@pobox.com>

gitattributes: Document the unified "auto" handling Torsten Bögershausen Fri, 26 Aug 2016 20:18:48 +0000 (22:18 +0200)

gitattributes: Document the unified "auto" handling

Update the documentation about text=auto:
text=auto now follows the core.autocrlf handling when files are not
normalized in the repository.

For a cross-platform project, recommend using attributes for
line-ending conversions.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'js/no-html-bypass-on-windows' into rt... Junio C Hamano Fri, 26 Aug 2016 18:29:07 +0000 (11:29 -0700)

Merge branch 'js/no-html-bypass-on-windows' into rt/help-unknown

* js/no-html-bypass-on-windows:
Revert "display HTML in default browser using Windows' shell API"

Prepare for 2.10.0-rc2 Junio C Hamano Thu, 25 Aug 2016 20:56:51 +0000 (13:56 -0700)

Prepare for 2.10.0-rc2

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ja/i18n' Junio C Hamano Thu, 25 Aug 2016 20:55:07 +0000 (13:55 -0700)

Merge branch 'ja/i18n'

The recent i18n patch we added during this cycle did a bit too much
refactoring of the messages to avoid word-legos; the repetition has
been reduced to help translators.

* ja/i18n:
i18n: simplify numeric error reporting
i18n: fix git rebase interactive commit messages
i18n: fix typos for translation

Merge branch 'bw/mingw-avoid-inheriting-fd-to-lockfile' Junio C Hamano Thu, 25 Aug 2016 20:55:07 +0000 (13:55 -0700)

Merge branch 'bw/mingw-avoid-inheriting-fd-to-lockfile'

The tempfile (hence its user lockfile) API lets the caller open
a file descriptor to a temporary file, write into it and then
finalize it by first closing the filehandle and then either
removing or renaming the temporary file. When the process spawns a
subprocess after obtaining the file descriptor, and if the
subprocess has not exited when the attempt to remove or rename is
made, the last step fails on Windows, because the subprocess has
the file descriptor still open. Open tempfile with O_CLOEXEC flag
to avoid this (on Windows, this is mapped to O_NOINHERIT).

* bw/mingw-avoid-inheriting-fd-to-lockfile:
mingw: ensure temporary file handles are not inherited by child processes
t6026-merge-attr: child processes must not inherit index.lock handles

Merge branch 'dg/document-git-c-in-git-config-doc' Junio C Hamano Thu, 25 Aug 2016 20:55:07 +0000 (13:55 -0700)

Merge branch 'dg/document-git-c-in-git-config-doc'

The "git -c var[=val] cmd" facility to append a configuration
variable definition at the end of the search order was described in
the git(1) manual page, but not in git-config(1), which is the more
likely place for people to look when they ask "can I make a one-shot
override, and if so how?"

* dg/document-git-c-in-git-config-doc:
doc: mention `git -c` in git-config(1)

Merge branch 'js/no-html-bypass-on-windows' Junio C Hamano Thu, 25 Aug 2016 20:55:06 +0000 (13:55 -0700)

Merge branch 'js/no-html-bypass-on-windows'

On Windows, the help.browser configuration variable used to be
ignored, which has been corrected.

* js/no-html-bypass-on-windows:
Revert "display HTML in default browser using Windows' shell API"

Merge branch 'hv/doc-commit-reference-style' Junio C Hamano Thu, 25 Aug 2016 20:55:05 +0000 (13:55 -0700)

Merge branch 'hv/doc-commit-reference-style'

A small doc update.

* hv/doc-commit-reference-style:
SubmittingPatches: document how to reference previous commits

git ls-files: text=auto eol=lf is supported in Git... Torsten Bögershausen Thu, 25 Aug 2016 15:52:57 +0000 (17:52 +0200)

git ls-files: text=auto eol=lf is supported in Git 2.10

The man page for `git ls-files --eol` describes the combinations
of text attributes "text=auto eol=lf" and "text=auto eol=crlf" as not
yet supported, though they may be in the future.

Now they are supported.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

l10n: pt_PT: update Portuguese translation Vasco Almeida Mon, 22 Aug 2016 16:29:35 +0000 (16:29 +0000)

l10n: pt_PT: update Portuguese translation

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>

l10n: pt_PT: merge git.pot Vasco Almeida Tue, 16 Aug 2016 12:06:44 +0000 (12:06 +0000)

l10n: pt_PT: merge git.pot

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>

receive-pack: allow a maximum input size to be specified Jeff King Wed, 24 Aug 2016 18:41:57 +0000 (20:41 +0200)

receive-pack: allow a maximum input size to be specified

Receive-pack feeds its input to either index-pack or
unpack-objects, which will happily accept as many bytes as
a sender is willing to provide. Let's allow an arbitrary
cutoff point where we will stop writing bytes to disk.

Cleaning up what has already been written to disk is a
related problem that is not addressed by this patch.
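
The cutoff can be pictured with the sketch below. It is a Python
model, not the code in index-pack or unpack-objects;
copy_with_limit() and InputTooLarge are names invented for the
example, and treating a limit of 0 as "no limit" is an assumption.

```python
import io


class InputTooLarge(Exception):
    """Raised when the sender provides more bytes than allowed."""


def copy_with_limit(src, dst, max_input_size, chunk_size=8192):
    """Copy src to dst, giving up once max_input_size bytes are exceeded.

    A max_input_size of 0 means "no limit" (an assumption of this sketch).
    Returns the number of bytes consumed.
    """
    consumed = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return consumed
        consumed += len(chunk)
        if max_input_size and consumed > max_input_size:
            raise InputTooLarge(f"input exceeds {max_input_size} bytes")
        dst.write(chunk)


out = io.BytesIO()
try:
    copy_with_limit(io.BytesIO(b"0123456789"), out, max_input_size=6, chunk_size=4)
except InputTooLarge as err:
    print("aborted:", err)
print(len(out.getvalue()), "bytes were already written")
```

Note that the chunks accepted before the limit was hit stay in the
output, which is the cleanup problem the message above leaves open.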

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

unpack-objects: add --max-input-size=<size> option Christian Couder Wed, 24 Aug 2016 18:41:56 +0000 (20:41 +0200)

unpack-objects: add --max-input-size=<size> option

When receiving a pack-file, it can be useful to abort
`git unpack-objects` if the pack-file is too big.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

index-pack: add --max-input-size=<size> option Jeff King Wed, 24 Aug 2016 18:41:55 +0000 (20:41 +0200)

index-pack: add --max-input-size=<size> option

When receiving a pack-file, it can be useful to abort
`git index-pack` if the pack-file is too big.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

i18n: simplify numeric error reporting Jean-Noel Avila Sun, 21 Aug 2016 14:50:39 +0000 (16:50 +0200)

i18n: simplify numeric error reporting

Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

i18n: fix git rebase interactive commit messages Jean-Noel Avila Sun, 21 Aug 2016 14:50:38 +0000 (16:50 +0200)

i18n: fix git rebase interactive commit messages

For proper i18n, the logic cannot embed English-specific processing.

Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

i18n: fix typos for translation Jean-Noel Avila Sun, 21 Aug 2016 14:50:37 +0000 (16:50 +0200)

i18n: fix typos for translation

Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

format-patch: show 0/1 and 1/1 for singleton patch... Jacob Keller Tue, 23 Aug 2016 22:45:50 +0000 (15:45 -0700)

format-patch: show 0/1 and 1/1 for singleton patch with cover letter

Change the default behavior of git-format-patch to generate a numbered
sequence of 0/1 and 1/1 when generating both a cover-letter and a single
patch. This standardizes the cover letter to have 0/N which helps
distinguish the cover letter from the patch itself. Since the behavior
is easily changed via configuration as well as the use of -n and -N this
should be acceptable default behavior.
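
As a rough model of the new default (a sketch, not the code in
builtin/log.c): subject_prefixes() below is a hypothetical helper,
"numbered" stands for the effect of -n (True), -N (False) or neither
(None), and details such as padding of the numbers or a custom
subject prefix are ignored.

```python
def subject_prefixes(num_patches, cover_letter, numbered=None):
    """Return the subject prefixes for a series, cover letter first.

    numbered: True models -n, False models -N, None picks the default.
    """
    if numbered is None:
        # New default: a cover letter forces numbering even for one patch.
        numbered = num_patches > 1 or cover_letter
    first = 0 if cover_letter else 1
    prefixes = []
    for i in range(first, num_patches + 1):
        prefixes.append(f"[PATCH {i}/{num_patches}]" if numbered else "[PATCH]")
    return prefixes


print(subject_prefixes(1, cover_letter=True))
print(subject_prefixes(1, cover_letter=False))
print(subject_prefixes(1, cover_letter=True, numbered=False))
```

The first call gives the 0/1 and 1/1 pair described above; the other
two show that a lone patch without a cover letter stays unnumbered
and that -N still turns numbering off.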

Add tests for the new default behavior.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t/perf: add basic perf tests for delta base cache Jeff King Mon, 22 Aug 2016 22:01:10 +0000 (18:01 -0400)

t/perf: add basic perf tests for delta base cache

This just shows off the improvements done by the last few
patches, and gives us a baseline for noticing regressions in
the future. Here are the results with linux.git as the perf
"large repo":

Test               origin               HEAD
-------------------------------------------------------------------
0003.1: log --raw  43.41(40.36+2.69)    33.86(30.96+2.41) -22.0%
0003.2: log -S     313.61(309.74+3.78)  298.75(295.58+3.00) -4.7%

(for a large repo, the "log -S" improvements are greater if
you bump the delta base cache limit, but I think it makes
sense to test the "stock" behavior, since that is what most
people will see).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

delta_base_cache: use hashmap.h Jeff King Mon, 22 Aug 2016 22:00:07 +0000 (18:00 -0400)

delta_base_cache: use hashmap.h

The fundamental data structure of the delta base cache is a
hash table mapping pairs of "(packfile, offset)" into
structs containing the actual object data. The hash table
implementation dates back to e5e0161 (Implement a simple
delta_base cache, 2007-03-17), and uses a fixed-size table.
The current size is a hard-coded 256 entries.

Because we need to be able to remove objects from the hash
table, entry lookup does not do any kind of probing to
handle collisions. Colliding items simply replace whatever
is in their slot. As a result, we have fewer usable slots
than even the 256 we allocate. At half full, each new item
has a 50% chance of displacing another one. Or another way
to think about it: every item has a 1/256 chance of being
ejected due to hash collision, without regard to our LRU
strategy.
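
A minimal Python model of such a table may help. It is not the C
code: DirectMappedCache and its toy hash function are made up for
the example, and keys are (pack id, offset) pairs of integers so
that the collision is reproducible.

```python
class DirectMappedCache:
    """Fixed-size table with no probing: one key per slot, newest wins."""

    def __init__(self, slots=256):
        self.slots = [None] * slots

    def _index(self, key):
        pack_id, offset = key
        # Toy hash for the sketch; the real hash function differs.
        return (pack_id * 31 + offset) % len(self.slots)

    def put(self, key, value):
        """Store value and return the key it displaced, if any."""
        i = self._index(key)
        old = self.slots[i]
        self.slots[i] = (key, value)
        if old is not None and old[0] != key:
            return old[0]
        return None

    def get(self, key):
        entry = self.slots[self._index(key)]
        if entry is not None and entry[0] == key:
            return entry[1]
        return None


cache = DirectMappedCache()
cache.put((0, 1), "tree A")
displaced = cache.put((0, 257), "blob B")  # 257 % 256 == 1, same slot
print(displaced, cache.get((0, 1)), cache.get((0, 257)))
```

The second put silently ejects the first entry even though 254 slots
are still free, which is the "without regard to our LRU strategy"
effect described above.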

So it would be interesting to see the effect of increasing
the cache size on the runtime for some common operations. As
with the previous patch, we'll measure "git log --raw" for
tree-only operations, and "git log -Sfoo --raw" for
operations that touch trees and blobs. All times are
wall-clock best-of-3, done against fully packed repos with
--depth=50, and the default core.deltaBaseCacheLimit of
96MB.

Here are timings for various values of MAX_DELTA_CACHE
against git.git (the asterisk marks the minimum time for
each operation):

MAX_DELTA_CACHE  log-raw     log-S
---------------  ---------   ---------
256              0m02.227s   0m12.821s
512              0m02.143s   0m10.602s
1024             0m02.127s   0m08.642s
2048             0m02.148s   0m07.123s
4096             0m02.194s   0m06.448s*
8192             0m02.239s   0m06.504s
16384            0m02.144s*  0m06.502s
32768            0m02.202s   0m06.622s
65536            0m02.230s   0m06.677s

The log-raw case isn't changed much at all here (probably
because our trees just aren't that big in the first place,
or possibly because we have so _few_ trees in git.git that
the 256-entry cache is enough). But once we start putting
blobs in the cache, too, we see a big improvement (almost
50%). The curve levels off around 4096, which means that we
can hold about that many entries before hitting the 96MB
memory limit (or possibly that the workload is small enough
that there is simply no more work to be optimized out by
caching more).

(As a side note, I initially timed my existing git.git pack,
which was a base of --aggressive combined with some pulls on
top. So it had quite a few deeper delta chains. The
256-cache case was more like 15s, and it still dropped to
~6.5s in the same way).

Here are the timings for linux.git:

MAX_DELTA_CACHE  log-raw     log-S
---------------  ---------   ---------
256              0m41.661s   5m12.410s
512              0m39.547s   5m07.920s
1024             0m37.054s   4m54.666s
2048             0m35.871s   4m41.194s*
4096             0m34.646s   4m51.648s
8192             0m33.881s   4m55.342s
16384            0m35.190s   5m00.122s
32768            0m35.060s   4m58.851s
65536            0m33.311s*  4m51.420s

As we grow we see a nice 20% speedup in the tree traversal,
and more modest 10% in the log-S. This is probably an
indication that we are bound less by the number of entries,
and more by the memory limit (more on that below). What is
interesting is that the numbers bounce around a bit;
increasing the number of entries isn't always a strict
improvement.

Partially this is due to noise in the measurement. But it
may also be an indication that our LRU ejection scheme is
not optimal. The smaller cache sizes introduce some
randomness into the ejection (due to collisions), which may
sometimes work in our favor (and sometimes not!).

So what is the optimal setting of MAX_DELTA_CACHE? The
"bouncing" in the linux.git log-S numbers notwithstanding,
it mostly seems like bigger is better. And even if we were
to try to find a "sweet spot", these are just two
repositories, that are not necessarily representative. The
shape of history, the size of trees and blobs, the memory
limit configuration, etc, all will affect the outcome.

Rather than trying to find the "right" number, another
strategy is to just switch to a hash table that can actually
store collisions: namely our hashmap.h implementation.

Here are numbers for that compared to the "best" we saw from
adjusting MAX_DELTA_CACHE:

      |       log-raw        |        log-S
      | best       hashmap   | best       hashmap
      | ---------  --------- | ---------  ---------
git   | 0m02.144s  0m02.144s | 0m06.448s  0m06.688s
linux | 0m33.311s  0m33.092s | 4m41.194s  4m57.172s

We can see the results are similar in most cases, which is
what we'd expect. We're not ejecting due to collisions at
all, so this is purely representing the LRU. So really, we'd
expect this to model most closely the larger values of the
static MAX_DELTA_CACHE limit. And that does seem to be
what's happening, including the "bounce" in the linux log-S
case.

So while the value for that case _isn't_ as good as the
optimal one measured above (which was 2048 entries), given
the bouncing I'm hesitant to suggest that 2048 is any kind
of optimum (not even for linux.git, let alone as a general
rule). The generic hashmap has the appeal that it drops the
number of tweakable numbers by one, which means we can focus
on tuning other elements, like the LRU strategy or the
core.deltaBaseCacheLimit setting.

And indeed, if we bump the cache limit to 1G (which is
probably silly for general use, but maybe something people
with big workstations would want to do), the linux.git log-S
time drops to 3m32s. That's something you really _can't_ do
easily with the static hash table, because the number of
entries needs to grow in proportion to the memory limit (so
2048 is almost certainly not going to be the right value
there).

This patch takes that direction, and drops the static hash
table entirely in favor of using the hashmap.h API.
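
The shape of the resulting cache can be sketched in Python, where a
dict plays the role of a hash table that stores collisions. This is
a model under assumptions, not the hashmap.h-based C code: the class
and method names are invented, and refreshing an entry's LRU
position on lookup is a choice made for the sketch.

```python
from collections import OrderedDict


class DeltaBaseCache:
    """Entries keyed by (packfile, offset); eviction is by bytes, oldest first."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.used = 0
        self.entries = OrderedDict()  # least recently used entry comes first

    def get(self, packfile, offset):
        key = (packfile, offset)
        data = self.entries.get(key)
        if data is not None:
            self.entries.move_to_end(key)  # mark as most recently used
        return data

    def add(self, packfile, offset, data):
        key = (packfile, offset)
        old = self.entries.pop(key, None)
        if old is not None:
            self.used -= len(old)
        self.entries[key] = data
        self.used += len(data)
        # No fixed entry count: only the memory limit triggers eviction.
        while self.used > self.limit and self.entries:
            _, evicted = self.entries.popitem(last=False)
            self.used -= len(evicted)


cache = DeltaBaseCache(limit_bytes=10)
cache.add("pack-1", 0, b"aaaa")
cache.add("pack-1", 4, b"bbbb")
cache.get("pack-1", 0)           # touch the first entry
cache.add("pack-1", 8, b"cccc")  # 12 bytes > 10, so the oldest entry goes
print(sorted(cache.entries), cache.used)
```

The number of entries is no longer a tunable; raising limit_bytes is
enough to let the cache hold more objects.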

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

delta_base_cache: drop special treatment of blobs Jeff King Mon, 22 Aug 2016 21:59:56 +0000 (17:59 -0400)

delta_base_cache: drop special treatment of blobs

When the delta base cache runs out of allowed memory, it has
to drop entries. It does so by walking an LRU list, dropping
objects until we are under the memory limit. But we actually
walk the list twice: once to drop blobs, and then again to
drop other objects (which are generally trees). This comes
from 18bdec1 (Limit the size of the new delta_base_cache,
2007-03-19).

This performs poorly as the number of entries grows, because
any time dropping blobs does not satisfy the limit, we have
to walk the _entire_ list, trees included, looking for blobs
to drop, before starting to drop any trees.
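
The cost of the double walk is easy to see in a small model. The
Python below is a sketch, not the C code: the LRU is a plain list of
(type, size) pairs, oldest first, and both function names are made
up. A single pass that ignores the object type is included for
comparison.

```python
def evict_blobs_first(lru, used, limit):
    """Two walks: drop blobs first, then anything. Returns (lru, used, visited)."""
    visited = 0
    survivors = []
    for kind, size in lru:
        if used > limit:
            visited += 1  # still over the limit, still looking for blobs
            if kind == "blob":
                used -= size
                continue
        survivors.append((kind, size))
    remaining = []
    for kind, size in survivors:
        if used > limit:
            visited += 1
            used -= size
            continue
        remaining.append((kind, size))
    return remaining, used, visited


def evict_oldest_first(lru, used, limit):
    """One walk: drop from the head until under the limit, whatever the type."""
    visited = 0
    remaining = []
    for kind, size in lru:
        if used > limit:
            visited += 1
            used -= size
            continue
        remaining.append((kind, size))
    return remaining, used, visited


trees_only = [("tree", 1)] * 1000  # what a tree-only workload caches
_, used_a, visited_a = evict_blobs_first(trees_only, used=1000, limit=999)
_, used_b, visited_b = evict_oldest_first(trees_only, used=1000, limit=999)
print(visited_a, visited_b)
```

Both variants end up dropping a single tree, but the blobs-first
walk touches every entry in the list before it gets there.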

It's not generally a problem now, as the cache is limited to
only 256 entries. But as we could benefit from increasing
that in a future patch, it's worth looking at how it
performs as the cache size grows. And the answer is "not
well".

The table below shows times for various operations with
different values of MAX_DELTA_CACHE (which is not a run-time
knob; I recompiled with -DMAX_DELTA_CACHE=$n for each).

I chose "git log --raw" ("log-raw" in the table) because it
will access all of the trees, but no blobs at all (so in a
sense it is a worst case for this problem, because we will
always walk over the entire list of trees once before
realizing there are no blobs to drop). This is also
representative of other tree-only operations like "rev-list
--objects" and "git log -- <path>".

I also timed "git log -Sfoo --raw" ("log-S" in the table).
It similarly accesses all of the trees, but also the blobs
for each commit. It's representative of "git log -p", though
it emphasizes the cost of blob access more, as "-S" is
cheaper than computing an actual blob diff.

All timings are best-of-3 wall-clock times (though they all
were CPU bound, so the user CPU times are similar). The
repositories were fully packed with --depth=50, and the
default core.deltaBaseCacheLimit of 96M was in effect. The
current value of MAX_DELTA_CACHE is 256, so I started there
and worked up by factors of 2.

First, here are values for git.git (the asterisk signals the
fastest run for each operation):

MAX_DELTA_CACHE  log-raw     log-S
---------------  ---------   ---------
256              0m02.212s   0m12.634s
512              0m02.136s*  0m10.614s
1024             0m02.156s   0m08.614s
2048             0m02.208s   0m07.062s
4096             0m02.190s   0m06.484s*
8192             0m02.176s   0m07.635s
16384            0m02.913s   0m19.845s
32768            0m03.617s   1m05.507s
65536            0m04.031s   1m18.488s

You can see that for the tree-only log-raw case, we don't
actually benefit that much as the cache grows (all the
differences up through 8192 are basically just noise; this
is probably because we don't actually have that many
distinct trees in git.git). For log-S, we get a definite
speed improvement as the cache grows at first, but the
improvements are lost as the cache grows further and the
linear LRU management starts to dominate.

Here's the same thing run against linux.git:

MAX_DELTA_CACHE  log-raw     log-S
---------------  ---------   ----------
256              0m40.987s   5m13.216s
512              0m37.949s   5m03.243s
1024             0m35.977s   4m50.580s
2048             0m33.855s   4m39.818s*
4096             0m32.913s   4m47.299s
8192             0m32.176s*  5m14.650s
16384            0m32.185s   6m31.625s
32768            0m38.056s   9m31.136s
65536            1m30.518s   17m38.549s

The pattern is similar, though the effect in log-raw is more
pronounced here. The times dip down in the middle, and then
go back up as we keep growing.

So we know there's a problem. What's the solution?

The obvious one is to improve the data structure to avoid
walking over tree entries during the looking-for-blobs
traversal. We can do this by keeping _two_ LRU lists: one
for blobs, and one for other objects. We drop items from the
blob LRU first, and then from the tree LRU (if necessary).

Here's git.git using that strategy:

MAX_DELTA_CACHE  log-raw     log-S
---------------  ---------   ----------
256              0m02.264s   0m12.830s
512              0m02.201s   0m10.771s
1024             0m02.181s   0m08.593s
2048             0m02.205s   0m07.116s
4096             0m02.158s   0m06.537s*
8192             0m02.213s   0m07.246s
16384            0m02.155s*  0m10.975s
32768            0m02.159s   0m16.047s
65536            0m02.181s   0m16.992s

The upswing on log-raw is gone completely. But log-S still
has it (albeit much better than without this strategy).
Let's see what linux.git shows:

MAX_DELTA_CACHE  log-raw     log-S
---------------  ---------   ---------
256              0m42.519s   5m14.654s
512              0m39.106s   5m04.708s
1024             0m36.802s   4m51.454s
2048             0m34.685s   4m39.378s*
4096             0m33.663s   4m44.047s
8192             0m33.157s   4m50.644s
16384            0m33.090s*  4m49.648s
32768            0m33.458s   4m53.371s
65536            0m33.563s   5m04.580s

The results are similar. The tree-only case again performs
well (not surprising; we're literally just dropping the one
useless walk, and not otherwise changing the cache eviction
strategy at all). But the log-S case again does a bit worse
as the cache grows (though possibly that's within the noise,
which is much larger for this case).

Perhaps this is an indication that the "remove blobs first"
strategy is not actually optimal. The intent of it is to
avoid blowing out the tree cache when we see large blobs,
but it also means we'll throw away useful, recent blobs in
favor of older trees.

Let's run the same numbers without caring about object type
at all (i.e., one LRU list, and always evicting whatever is
at the head, regardless of type).

Here's git.git:

MAX_DELTA_CACHE  log-raw     log-S
---------------  ---------   ---------
256              0m02.227s   0m12.821s
512              0m02.143s   0m10.602s
1024             0m02.127s   0m08.642s
2048             0m02.148s   0m07.123s
4096             0m02.194s   0m06.448s*
8192             0m02.239s   0m06.504s
16384            0m02.144s*  0m06.502s
32768            0m02.202s   0m06.622s
65536            0m02.230s   0m06.677s

Much smoother; there's no dramatic upswing as we increase
the cache size (some remains, though it's small enough that
it's mostly run-to-run noise. E.g., in the log-raw case,
note how 8192 is 50-100ms higher than its neighbors). Note
also that we stop getting any real benefit for log-S after
about 4096 entries; that number will depend on the size of
the repository, the size of the blob entries, and the memory
limit of the cache.

Let's see what linux.git shows for the same strategy:

MAX_DELTA_CACHE  log-raw     log-S
---------------  ---------   ---------
256              0m41.661s   5m12.410s
512              0m39.547s   5m07.920s
1024             0m37.054s   4m54.666s
2048             0m35.871s   4m41.194s*
4096             0m34.646s   4m51.648s
8192             0m33.881s   4m55.342s
16384            0m35.190s   5m00.122s
32768            0m35.060s   4m58.851s
65536            0m33.311s*  4m51.420s

It's similarly good. As with the "separate blob LRU"
strategy, there's a lot of noise on the log-S run here. But
it's certainly not any worse, is possibly a bit better, and
the improvement over "separate blob LRU" on the git.git case
is dramatic.

So it seems like a clear winner, and that's what this patch
implements.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

delta_base_cache: use list.h for LRU Jeff King Mon, 22 Aug 2016 21:59:42 +0000 (17:59 -0400)

delta_base_cache: use list.h for LRU

We keep an LRU list of entries for when we need to drop
something from an over-full cache. The list is implemented
as a circular doubly-linked list, which is exactly what
list.h provides. We can save a few lines by using the list.h
macros and functions. More importantly, this makes the code
easier to follow, as the reader sees explicit concepts like
"list_add_tail()" instead of pointer manipulation.

As a bonus, the list_entry() macro lets us place the lru
pointers anywhere inside the delta_base_cache_entry struct
(as opposed to just casting the pointer, which requires it
at the front of the struct). This will be useful in later
patches when we need to place other items at the front of
the struct (e.g., our hashmap implementation requires this).
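
For readers unfamiliar with this kind of list, here is a small
Python sketch of a circular doubly-linked list with a sentinel head.
It only mirrors the idea: the function names are borrowed from the
list.h style, the "owner" reference stands in for what list_entry()
computes in C, and CacheEntry is a made-up stand-in for the cache
entry struct.

```python
class ListHead:
    """A list node; an empty list is a head whose links point to itself."""

    def __init__(self, owner=None):
        self.prev = self
        self.next = self
        self.owner = owner  # stands in for list_entry() in this sketch


def list_add_tail(node, head):
    """Link node just before head, i.e. at the tail of the list."""
    node.prev = head.prev
    node.next = head
    head.prev.next = node
    head.prev = node


def list_del(node):
    """Unlink node from whatever list it is on."""
    node.prev.next = node.next
    node.next.prev = node.prev


def list_entries(head):
    """Yield the owners of all nodes, from the oldest to the newest."""
    node = head.next
    while node is not head:
        yield node.owner
        node = node.next


class CacheEntry:
    def __init__(self, key):
        self.key = key             # other members may come first
        self.lru = ListHead(self)  # the list node is embedded in the entry


lru = ListHead()
a, b, c = CacheEntry("a"), CacheEntry("b"), CacheEntry("c")
for entry in (a, b, c):
    list_add_tail(entry.lru, lru)

# Using an entry again moves it to the tail, the most recent position.
list_del(a.lru)
list_add_tail(a.lru, lru)
print([entry.key for entry in list_entries(lru)])
```

The entry at the head of the list (here "b") is the one to drop
first when the cache is over-full.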

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>