gitweb.git
commit: allow parse_commit* to handle any repo Stefan Beller Wed, 14 Nov 2018 00:12:50 +0000 (16:12 -0800)

commit: allow parse_commit* to handle any repo

Just like the previous commit, parse_commit and friends are used a lot
and are found in new patches, so we cannot change their signature easily.

Re-introduce these functions prefixed with 'repo_' that take a repository
argument and keep the originals as shallow macros.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: parse_object to honor its repository argument Stefan Beller Wed, 14 Nov 2018 00:12:49 +0000 (16:12 -0800)

object: parse_object to honor its repository argument

In 8e4b0b6047 (object.c: allow parse_object to handle
arbitrary repositories, 2018-06-28), we forgot to pass the
repository down to the read_object_file.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-store: prepare has_{sha1, object}_file to handle... Stefan Beller Wed, 14 Nov 2018 00:12:48 +0000 (16:12 -0800)

object-store: prepare has_{sha1, object}_file to handle any repo

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-store: prepare read_object_file to deal with... Stefan Beller Wed, 14 Nov 2018 00:12:47 +0000 (16:12 -0800)

object-store: prepare read_object_file to deal with any repo

As read_object_file is a widely used function (which is also regularly used
in new code in flight between master..pu), changing its signature is hard,
as other series in flight rely on the original signature. It would
burden the maintainer if we just changed the signature.

Introduce repo_read_object_file which takes the repository argument, and
hide the original read_object_file as a macro behind
NO_THE_REPOSITORY_COMPATIBILITY_MACROS, similar to
e675765235 (diff.c: remove implicit dependency on the_index, 2018-09-21).

Add a coccinelle patch to convert existing callers, but do not apply
the resulting patch to keep the diff of this patch small.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
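
The compatibility-macro pattern described in this commit message can be
sketched as follows. This is only an illustration under stated assumptions:
struct repository is reduced to a single field, and repo_object_dir() and
object_dir() are hypothetical names, not functions from Git. Only the guard
macro NO_THE_REPOSITORY_COMPATIBILITY_MACROS and the 'repo_' prefix come
from the message above.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for Git's repository handle. */
struct repository {
	const char *gitdir;
};

static struct repository default_repo = { ".git" };
static struct repository *the_repository = &default_repo;

/* New-style function (hypothetical): the repository is an explicit argument. */
static const char *repo_object_dir(struct repository *r)
{
	return r ? r->gitdir : NULL;
}

/*
 * Old-style name kept as a thin macro so that existing callers and patches
 * in flight still compile. Defining NO_THE_REPOSITORY_COMPATIBILITY_MACROS
 * before this point hides the old name.
 */
#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
#define object_dir() repo_object_dir(the_repository)
#endif
```

Old callers keep writing object_dir(), while converted callers pass any
repository to repo_object_dir().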

object-store: allow read_object_file_extended to read... Stefan Beller Wed, 14 Nov 2018 00:12:46 +0000 (16:12 -0800)

object-store: allow read_object_file_extended to read from any repo

read_object_file_extended is not widely used, so migrate it all at once.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

packfile: allow has_packed_and_bad to handle arbitrary... Stefan Beller Tue, 16 Oct 2018 23:35:33 +0000 (16:35 -0700)

packfile: allow has_packed_and_bad to handle arbitrary repositories

has_packed_and_bad is not widely used, so just migrate it all at once.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sha1_file: allow read_object to read objects in arbitra... Stefan Beller Tue, 16 Oct 2018 23:35:32 +0000 (16:35 -0700)

sha1_file: allow read_object to read objects in arbitrary repositories

Allow read_object (a file local function in sha1_file) to
handle arbitrary repositories by passing the repository down
to oid_object_info_extended.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Fifth batch for 2.20 Junio C Hamano Fri, 19 Oct 2018 04:52:51 +0000 (13:52 +0900)

Fifth batch for 2.20

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jt/cache-tree-allow-missing-object-in... Junio C Hamano Fri, 19 Oct 2018 04:34:08 +0000 (13:34 +0900)

Merge branch 'jt/cache-tree-allow-missing-object-in-partial-clone'

In a partial clone that will lazily be hydrated from the
originating repository, we generally want to avoid "does this
object exist (locally)?" on objects that we deliberately omitted
when we created the clone. The cache-tree codepath (which is used
to write a tree object out of the index) however insisted that the
object exists, even for paths that are outside of the partial
checkout area. The code has been updated to avoid such a check.

* jt/cache-tree-allow-missing-object-in-partial-clone:
cache-tree: skip some blob checks in partial clone

Merge branch 'tb/filter-alternate-refs' Junio C Hamano Fri, 19 Oct 2018 04:34:08 +0000 (13:34 +0900)

Merge branch 'tb/filter-alternate-refs'

When pushing into a repository that borrows its objects from an
alternate object store, "git receive-pack" that responds to the
push request on the other side lists the tips of refs in the
alternate to reduce the number of objects transferred. This
sometimes is detrimental when the number of refs in the alternate
is absurdly large, in which case the bandwidth saved in potentially
fewer objects transferred is wasted in excessively large ref
advertisement. The alternate refs that are advertised are now
configurable with a pair of configuration variables.

* tb/filter-alternate-refs:
transport.c: introduce core.alternateRefsPrefixes
transport.c: introduce core.alternateRefsCommand
transport.c: extract 'fill_alternate_refs_command'
transport: drop refnames from for_each_alternate_ref

Merge branch 'jt/avoid-ls-refs' Junio C Hamano Fri, 19 Oct 2018 04:34:07 +0000 (13:34 +0900)

Merge branch 'jt/avoid-ls-refs'

Over some transports, fetching objects with an exact commit object
name can be done without first seeing the ref advertisements. The
code has been optimized to exploit this.

* jt/avoid-ls-refs:
fetch: do not list refs if fetching only hashes
transport: list refs before fetch if necessary
transport: do not list refs if possible
transport: allow skipping of ref listing

Merge branch 'ds/commit-graph-leakfix' Junio C Hamano Fri, 19 Oct 2018 04:34:07 +0000 (13:34 +0900)

Merge branch 'ds/commit-graph-leakfix'

Code clean-up.

* ds/commit-graph-leakfix:
commit-graph: reduce initial oid allocation
builtin/commit-graph.c: UNLEAK variables
commit-graph: clean up leaked memory during write

Merge branch 'jt/non-blob-lazy-fetch' Junio C Hamano Fri, 19 Oct 2018 04:34:07 +0000 (13:34 +0900)

Merge branch 'jt/non-blob-lazy-fetch'

A partial clone that is configured to lazily fetch missing objects
will on-demand issue a "git fetch" request to the originating
repository to fill not-yet-obtained objects. The request has been
optimized for requesting a tree object (and not the leaf blob
objects contained in it) by telling the originating repository that
no blobs are needed.

* jt/non-blob-lazy-fetch:
fetch-pack: exclude blobs when lazy-fetching trees
fetch-pack: avoid object flags if no_dependents

Merge branch 'pw/diff-color-moved-ws-fix' Junio C Hamano Fri, 19 Oct 2018 04:34:06 +0000 (13:34 +0900)

Merge branch 'pw/diff-color-moved-ws-fix'

Various fixes to "diff --color-moved-ws".

* pw/diff-color-moved-ws-fix:
diff --color-moved: fix a memory leak
diff --color-moved-ws: fix another memory leak
diff --color-moved-ws: fix a memory leak
diff --color-moved-ws: fix out of bounds string access
diff --color-moved-ws: fix double free crash

Merge branch 'rs/oidset-on-khash' Junio C Hamano Fri, 19 Oct 2018 04:34:06 +0000 (13:34 +0900)

Merge branch 'rs/oidset-on-khash'

The oidset API was built on top of the oidmap API which in turn is
on the hashmap API. Replace the implementation to build on top of
the khash API and gain performance.

* rs/oidset-on-khash:
oidset: uninline oidset_init()
oidset: use khash
khash: factor out kh_release_*
fetch-pack: load tip_oids eagerly iff needed
fetch-pack: factor out is_unmatched_ref()

Merge branch 'rs/grep-no-recursive' Junio C Hamano Fri, 19 Oct 2018 04:34:05 +0000 (13:34 +0900)

Merge branch 'rs/grep-no-recursive'

Unlike "grep", "git grep" by default recurses to the whole tree.
The command learned the "git grep --recursive" option, so that "git
grep --no-recursive" can serve as a synonym to setting the
max-depth to 0.

* rs/grep-no-recursive:
grep: add -r/--[no-]recursive

Merge branch 'nd/help-commands-verbose-by-default' Junio C Hamano Fri, 19 Oct 2018 04:34:05 +0000 (13:34 +0900)

Merge branch 'nd/help-commands-verbose-by-default'

"git help -a" and "git help -av" give different pieces of
information, and generally the "verbose" version is friendlier
to new users. "git help -a" by default now uses the more
verbose output (with "--no-verbose", you can go back to the
original). Also "git help -av" now lists aliases and external
commands, which it previously did not.

* nd/help-commands-verbose-by-default:
help -a: improve and make --verbose default

Merge branch 'jc/how-to-document-api' Junio C Hamano Fri, 19 Oct 2018 04:34:05 +0000 (13:34 +0900)

Merge branch 'jc/how-to-document-api'

Doc update.

* jc/how-to-document-api:
CodingGuidelines: document the API in *.h files

Merge branch 'sm/show-superproject-while-conflicted' Junio C Hamano Fri, 19 Oct 2018 04:34:04 +0000 (13:34 +0900)

Merge branch 'sm/show-superproject-while-conflicted'

A corner-case bugfix.

* sm/show-superproject-while-conflicted:
rev-parse: --show-superproject-working-tree should work during a merge

Merge branch 'jt/fetch-tips-in-partial-clone' Junio C Hamano Fri, 19 Oct 2018 04:34:04 +0000 (13:34 +0900)

Merge branch 'jt/fetch-tips-in-partial-clone'

"git fetch $repo $object" in a partial clone did not correctly
fetch the asked-for object that is referenced by an object in
promisor packfile, which has been fixed.

* jt/fetch-tips-in-partial-clone:
fetch: in partial clone, check presence of targets
connected: document connectivity in partial clones

Merge branch 'nd/status-refresh-progress' Junio C Hamano Fri, 19 Oct 2018 04:34:03 +0000 (13:34 +0900)

Merge branch 'nd/status-refresh-progress'

"git status" learns to show progress bar when refreshing the index
takes a long time.

* nd/status-refresh-progress:
status: show progress bar if refreshing the index takes too long

Merge branch 'bp/read-cache-parallel' Junio C Hamano Fri, 19 Oct 2018 04:34:03 +0000 (13:34 +0900)

Merge branch 'bp/read-cache-parallel'

A new extension to the index file has been introduced, which allows
the file to be read in parallel.

* bp/read-cache-parallel:
read-cache: load cache entries on worker threads
ieot: add Index Entry Offset Table (IEOT) extension
read-cache: load cache extensions on a worker thread
config: add new index.threads config setting
eoie: add End of Index Entry (EOIE) extension
read-cache: clean up casting and byte decoding
read-cache.c: optimize reading index format v4

Merge branch 'bp/rename-test-env-var' Junio C Hamano Fri, 19 Oct 2018 04:34:03 +0000 (13:34 +0900)

Merge branch 'bp/rename-test-env-var'

Some environment variables that control the runtime options of Git
used during tests are getting renamed for consistency.

* bp/rename-test-env-var:
t0000: do not get self-test disrupted by environment warnings
preload-index: update GIT_FORCE_PRELOAD_TEST support
read-cache: update TEST_GIT_INDEX_VERSION support
fsmonitor: update GIT_TEST_FSMONITOR support
preload-index: use git_env_bool() not getenv() for customization
t/README: correct spelling of "uncommon"

Merge branch 'ss/wt-status-committable' Junio C Hamano Fri, 19 Oct 2018 04:34:02 +0000 (13:34 +0900)

Merge branch 'ss/wt-status-committable'

Code clean-up in the internal machinery used by "git status" and
"git commit --dry-run".

* ss/wt-status-committable:
roll wt_status_state into wt_status and populate in the collect phase
wt-status.c: set the committable flag in the collect phase
t7501: add test of "commit --dry-run --short"
wt-status: rename commitable to committable
wt-status.c: move has_unmerged earlier in the file

Merge branch 'nd/the-index' Junio C Hamano Fri, 19 Oct 2018 04:34:02 +0000 (13:34 +0900)

Merge branch 'nd/the-index'

Various codepaths in the core-ish part learn to work on an
arbitrary in-core index structure, not necessarily the default
instance "the_index".

* nd/the-index: (23 commits)
revision.c: reduce implicit dependency the_repository
revision.c: remove implicit dependency on the_index
ws.c: remove implicit dependency on the_index
tree-diff.c: remove implicit dependency on the_index
submodule.c: remove implicit dependency on the_index
line-range.c: remove implicit dependency on the_index
userdiff.c: remove implicit dependency on the_index
rerere.c: remove implicit dependency on the_index
sha1-file.c: remove implicit dependency on the_index
patch-ids.c: remove implicit dependency on the_index
merge.c: remove implicit dependency on the_index
merge-blobs.c: remove implicit dependency on the_index
ll-merge.c: remove implicit dependency on the_index
diff-lib.c: remove implicit dependency on the_index
read-cache.c: remove implicit dependency on the_index
diff.c: remove implicit dependency on the_index
grep.c: remove implicit dependency on the_index
diff.c: remove the_index dependency in textconv() functions
blame.c: rename "repo" argument to "r"
combine-diff.c: remove implicit dependency on the_index
...

Merge branch 'nd/complete-fetch-multiple-args' Junio C Hamano Fri, 19 Oct 2018 04:34:01 +0000 (13:34 +0900)

Merge branch 'nd/complete-fetch-multiple-args'

Teach bash completion that "git fetch --multiple" only takes remote
names as arguments and no refspecs.

* nd/complete-fetch-multiple-args:
completion: support "git fetch --multiple"

Fourth batch for 2.20 Junio C Hamano Tue, 16 Oct 2018 07:21:17 +0000 (16:21 +0900)

Fourth batch for 2.20

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'sf/complete-stash-list' Junio C Hamano Tue, 16 Oct 2018 07:16:09 +0000 (16:16 +0900)

Merge branch 'sf/complete-stash-list'

The completion script (in contrib/) learned to complete a handful of
options that the "git stash list" command takes.

* sf/complete-stash-list:
git-completion.bash: add completion for stash list

Merge branch 'mw/doc-typofixes' Junio C Hamano Tue, 16 Oct 2018 07:16:09 +0000 (16:16 +0900)

Merge branch 'mw/doc-typofixes'

Typofixes.

* mw/doc-typofixes:
docs: typo: s/isimilar/similar/
docs: graph: remove unnecessary `graph_update()' call
docs: typo: s/go/to/

Merge branch 'js/mingw-wants-vista-or-above' Junio C Hamano Tue, 16 Oct 2018 07:16:08 +0000 (16:16 +0900)

Merge branch 'js/mingw-wants-vista-or-above'

The minimum version of Windows supported by the Windows port of Git is
now set to Vista.

* js/mingw-wants-vista-or-above:
mingw: bump the minimum Windows version to Vista
mingw: set _WIN32_WINNT explicitly for Git for Windows
compat/poll: prepare for targeting Windows Vista

Merge branch 'rs/sequencer-oidset-insert-avoids-dups' Junio C Hamano Tue, 16 Oct 2018 07:16:08 +0000 (16:16 +0900)

Merge branch 'rs/sequencer-oidset-insert-avoids-dups'

Code clean-up.

* rs/sequencer-oidset-insert-avoids-dups:
sequencer: use return value of oidset_insert()

Merge branch 'jk/oideq-hasheq-cleanup' Junio C Hamano Tue, 16 Oct 2018 07:16:07 +0000 (16:16 +0900)

Merge branch 'jk/oideq-hasheq-cleanup'

Code clean-up.

* jk/oideq-hasheq-cleanup:
more oideq/hasheq conversions

Merge branch 'ma/mailing-list-address-in-git-help' Junio C Hamano Tue, 16 Oct 2018 07:16:07 +0000 (16:16 +0900)

Merge branch 'ma/mailing-list-address-in-git-help'

Doc update.

* ma/mailing-list-address-in-git-help:
git doc: direct bug reporters to mailing list archive

Merge branch 'nd/packobjectshook-doc-fix' Junio C Hamano Tue, 16 Oct 2018 07:16:07 +0000 (16:16 +0900)

Merge branch 'nd/packobjectshook-doc-fix'

Doc update.

* nd/packobjectshook-doc-fix:
config.txt: correct the note about uploadpack.packObjectsHook

Merge branch 'rt/rebase-typofix' Junio C Hamano Tue, 16 Oct 2018 07:16:06 +0000 (16:16 +0900)

Merge branch 'rt/rebase-typofix'

Typofix.

* rt/rebase-typofix:
git-rebase.sh: fix typos in error messages

Merge branch 'ma/t1400-undebug-test' Junio C Hamano Tue, 16 Oct 2018 07:16:06 +0000 (16:16 +0900)

Merge branch 'ma/t1400-undebug-test'

Test fix.

* ma/t1400-undebug-test:
t1400: drop debug `echo` to actually execute `test`

Merge branch 'ma/commit-graph-docs' Junio C Hamano Tue, 16 Oct 2018 07:16:05 +0000 (16:16 +0900)

Merge branch 'ma/commit-graph-docs'

Doc update.

* ma/commit-graph-docs:
Doc: refer to the "commit-graph file" with dash
git-commit-graph.txt: refer to "*commit*-graph file"
git-commit-graph.txt: typeset more in monospace
git-commit-graph.txt: fix bullet lists

Merge branch 'dz/credential-doc-url-matching-rules' Junio C Hamano Tue, 16 Oct 2018 07:16:05 +0000 (16:16 +0900)

Merge branch 'dz/credential-doc-url-matching-rules'

Doc update.

* dz/credential-doc-url-matching-rules:
doc: clarify gitcredentials path component matching

Merge branch 'en/status-multiple-renames-to-the-same... Junio C Hamano Tue, 16 Oct 2018 07:16:05 +0000 (16:16 +0900)

Merge branch 'en/status-multiple-renames-to-the-same-target-fix'

The code in "git status" sometimes hit an assertion failure. This
was caused by a structure that was reused without cleaning the data
used for the first run, which has been corrected.

* en/status-multiple-renames-to-the-same-target-fix:
commit: fix erroneous BUG, 'multiple renames on the same target? how?'

Merge branch 'ds/reachable-final-cleanup' Junio C Hamano Tue, 16 Oct 2018 07:16:04 +0000 (16:16 +0900)

Merge branch 'ds/reachable-final-cleanup'

Code already in 'master' is further cleaned-up by this patch.

* ds/reachable-final-cleanup:
commit-reach: cleanups in can_all_from_reach...

Merge branch 'jk/check-everything-connected-is-long... Junio C Hamano Tue, 16 Oct 2018 07:16:04 +0000 (16:16 +0900)

Merge branch 'jk/check-everything-connected-is-long-gone'

Comment fix.

* jk/check-everything-connected-is-long-gone:
receive-pack: update comment with check_everything_connected

Merge branch 'jn/gc-auto' Junio C Hamano Tue, 16 Oct 2018 07:16:02 +0000 (16:16 +0900)

Merge branch 'jn/gc-auto'

"gc --auto" ended up calling exit(-1) upon error, which has been
corrected to use exit(1). Also the error reporting behaviour when
daemonized has been updated to exit with zero status when stopping
due to a previously discovered error (which implies there is no
point running gc to improve the situation); we used to exit with
failure in such a case.

* jn/gc-auto:
gc: do not return error for prior errors in daemonized mode

Merge branch 'jn/gc-auto-prep' Junio C Hamano Tue, 16 Oct 2018 07:16:02 +0000 (16:16 +0900)

Merge branch 'jn/gc-auto-prep'

Code clean-up.

* jn/gc-auto-prep:
gc: exit with status 128 on failure
gc: improve handling of errors reading gc.log

Merge branch 'md/test-cleanup' Junio C Hamano Tue, 16 Oct 2018 07:16:01 +0000 (16:16 +0900)

Merge branch 'md/test-cleanup'

Various test scripts have been updated for style and also correct
handling of exit status of various commands.

* md/test-cleanup:
tests: order arguments to git-rev-list properly
t9109: don't swallow Git errors upstream of pipes
tests: don't swallow Git errors upstream of pipes
t/*: fix ordering of expected/observed arguments
tests: standardize pipe placement
Documentation: add shell guidelines
t/README: reformat Do, Don't, Keep in mind lists

Merge branch 'fe/doc-updates' Junio C Hamano Tue, 16 Oct 2018 07:16:01 +0000 (16:16 +0900)

Merge branch 'fe/doc-updates'

Doc updates.

* fe/doc-updates:
git-describe.1: clarify that "human readable" is also git-readable
git-column.1: clarify initial description, provide examples
git-archimport.1: specify what kind of Arch we're talking about

Merge branch 'jn/mailmap-update' Junio C Hamano Tue, 16 Oct 2018 07:16:01 +0000 (16:16 +0900)

Merge branch 'jn/mailmap-update'

The mailmap file update.

* jn/mailmap-update:
mailmap: consistently normalize brian m. carlson's name

Merge branch 'tg/t5551-with-curl-7.61.1' Junio C Hamano Tue, 16 Oct 2018 07:16:00 +0000 (16:16 +0900)

Merge branch 'tg/t5551-with-curl-7.61.1'

Test update.

* tg/t5551-with-curl-7.61.1:
t5551: compare sorted cookies files
t5551: move setup code inside test_expect blocks

Merge branch 'en/merge-cleanup' Junio C Hamano Tue, 16 Oct 2018 07:16:00 +0000 (16:16 +0900)

Merge branch 'en/merge-cleanup'

Code clean-up.

* en/merge-cleanup:
merge-recursive: rename merge_file_1() and merge_content()
merge-recursive: remove final remaining caller of merge_file_one()
merge-recursive: avoid wrapper function when unnecessary and wasteful
merge-recursive: set paths correctly when three-way merging content

Merge branch 'rj/header-check' Junio C Hamano Tue, 16 Oct 2018 07:16:00 +0000 (16:16 +0900)

Merge branch 'rj/header-check'

Header files clean-up.

* rj/header-check:
delta-islands.h: add missing forward declarations (hdr-check)
midx.h: add missing forward declarations (hdr-check)
refs/refs-internal.h: add missing declarations (hdr-check)
refs/packed-backend.h: add missing declaration (hdr-check)
refs/ref-cache.h: add missing declarations (hdr-check)
ewah/ewok_rlw.h: add missing include (hdr-check)
json-writer.h: add missing include (hdr-check)
Makefile: add a hdr-check target

Merge branch 'ma/config-doc-update' Junio C Hamano Tue, 16 Oct 2018 07:16:00 +0000 (16:16 +0900)

Merge branch 'ma/config-doc-update'

Doc update.

* ma/config-doc-update:
git-config.txt: fix 'see: above' note
Doc: use `--type=bool` instead of `--bool`

Merge branch 'jk/delta-islands-with-bitmap-reuse-delta... Junio C Hamano Tue, 16 Oct 2018 07:16:00 +0000 (16:16 +0900)

Merge branch 'jk/delta-islands-with-bitmap-reuse-delta-fix'

Fix interactions between two recent topics.

* jk/delta-islands-with-bitmap-reuse-delta-fix:
pack-objects: handle island check for "external" delta base

Merge branch 'tq/refs-internal-comment-fix' Junio C Hamano Tue, 16 Oct 2018 07:15:59 +0000 (16:15 +0900)

Merge branch 'tq/refs-internal-comment-fix'

Fix for a typo in sample code in a comment.

* tq/refs-internal-comment-fix:
refs: docstring typo

Merge branch 'ts/alias-of-alias' Junio C Hamano Tue, 16 Oct 2018 07:15:59 +0000 (16:15 +0900)

Merge branch 'ts/alias-of-alias'

An alias that expands to another alias has so far been forbidden,
but now it is allowed to create such an alias.

* ts/alias-of-alias:
t0014: introduce an alias testing suite
alias: show the call history when an alias is looping
alias: add support for aliases of an alias
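
Resolving an alias that expands to another alias, and reporting the call
history when the chain loops, could look roughly like the sketch below. The
alias table, MAX_ALIAS_DEPTH, and the function names are hypothetical, and
each alias is simplified to expand to a single word; this is not the code
from the topic itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias table entry; a real setup would read the config. */
struct alias {
	const char *name;
	const char *expansion;
};

#define MAX_ALIAS_DEPTH 16

static const char *lookup_alias(const struct alias *table, size_t n,
				const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (!strcmp(table[i].name, name))
			return table[i].expansion;
	return NULL;
}

/*
 * Follow alias -> alias chains until a non-alias command is reached.
 * Returns the final command, or NULL when a name repeats (a loop) or the
 * chain exceeds MAX_ALIAS_DEPTH. The names visited so far are left in
 * history[] with their count in *depth, so a caller can print them.
 */
static const char *resolve_alias(const struct alias *table, size_t n,
				 const char *cmd,
				 const char *history[MAX_ALIAS_DEPTH],
				 size_t *depth)
{
	*depth = 0;
	for (;;) {
		const char *next = lookup_alias(table, n, cmd);

		if (!next)
			return cmd; /* not an alias: this is the real command */
		for (size_t i = 0; i < *depth; i++)
			if (!strcmp(history[i], cmd))
				return NULL; /* expanded before: a loop */
		if (*depth == MAX_ALIAS_DEPTH)
			return NULL;
		history[(*depth)++] = cmd;
		cmd = next;
	}
}
```

On a NULL return, the caller can walk history[0..*depth-1] to show which
aliases were expanded before the loop was detected.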

Merge branch 'ds/commit-graph-with-grafts' Junio C Hamano Tue, 16 Oct 2018 07:15:59 +0000 (16:15 +0900)

Merge branch 'ds/commit-graph-with-grafts'

The recently introduced commit-graph auxiliary data is incompatible
with mechanisms such as replace & grafts that "break" the immutable
nature of the object reference relationship. Disable optimizations
based on its use (and updating existing commit-graph) when these
incompatible features are in use in the repository.

* ds/commit-graph-with-grafts:
commit-graph: close_commit_graph before shallow walk
commit-graph: not compatible with uninitialized repo
commit-graph: not compatible with grafts
commit-graph: not compatible with replace objects
test-repository: properly init repo
commit-graph: update design document
refs.c: upgrade for_each_replace_ref to be a each_repo_ref_fn callback
refs.c: migrate internal ref iteration to pass thru repository argument

Merge branch 'ab/commit-graph-progress' Junio C Hamano Tue, 16 Oct 2018 07:15:58 +0000 (16:15 +0900)

Merge branch 'ab/commit-graph-progress'

Generation of (experimental) commit-graph files has so far been
fairly silent, even though it takes a noticeable amount of time in a
meaningfully large repository. The users will now see progress
output.

* ab/commit-graph-progress:
gc: fix regression in 7b0f229222 impacting --quiet
commit-graph verify: add progress output
commit-graph write: add progress output

read-cache: load cache entries on worker threads Ben Peart Wed, 10 Oct 2018 15:59:38 +0000 (11:59 -0400)

read-cache: load cache entries on worker threads

This patch helps address the CPU cost of loading the index by utilizing
the Index Entry Offset Table (IEOT) to divide loading and conversion of
the cache entries across multiple threads in parallel.

I used p0002-read-cache.sh to generate some performance data:

Test w/100,000 files reduced the time by 32.24%
Test w/1,000,000 files reduced the time by -4.77%

Note that in the 1,000,000 files case, multi-threading the cache entry parsing
does not yield a performance win. This is because the cost of parsing the
index extensions in this repo far outweighs the cost of loading the cache
entries.

The high cost of parsing the index extensions is driven by the cache tree
and the untracked cache extensions. As this is currently the longest pole,
any reduction in this time will reduce the overall index load times so is
worth further investigation in another patch series.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
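
Dividing the blocks listed in the IEOT across worker threads can be
illustrated with plain index arithmetic. The even-split policy, struct
block_range, and blocks_for_thread() are assumptions made for this sketch;
the commit message does not say how the work is actually partitioned.

```c
#include <stddef.h>

struct block_range {
	size_t first; /* index of the first IEOT block for this thread */
	size_t count; /* number of consecutive blocks to load */
};

/*
 * Divide nr_blocks blocks among nr_threads workers as evenly as possible:
 * the first (nr_blocks % nr_threads) workers get one extra block.
 * An out-of-range thread number yields an empty range.
 */
static struct block_range blocks_for_thread(size_t nr_blocks,
					    size_t nr_threads,
					    size_t thread)
{
	struct block_range r = { 0, 0 };
	size_t base, extra;

	if (!nr_threads || thread >= nr_threads)
		return r;
	base = nr_blocks / nr_threads;
	extra = nr_blocks % nr_threads;
	r.first = thread * base + (thread < extra ? thread : extra);
	r.count = base + (thread < extra ? 1 : 0);
	return r;
}
```

Because every block starts at an offset recorded in the table, each worker
can begin parsing at its first block without waiting for the others.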

ieot: add Index Entry Offset Table (IEOT) extension Ben Peart Wed, 10 Oct 2018 15:59:37 +0000 (11:59 -0400)

ieot: add Index Entry Offset Table (IEOT) extension

This patch enables addressing the CPU cost of loading the index by adding
additional data to the index that will allow us to efficiently multi-
thread the loading and conversion of cache entries.

It accomplishes this by adding an (optional) index extension that is a
table of offsets to blocks of cache entries in the index file. To make
this work for V4 indexes, when writing the cache entries, it periodically
"resets" the prefix-compression by encoding the current entry as if the
path name for the previous entry is completely different and saves the
offset of that entry in the IEOT. Basically, with V4 indexes, it
generates offsets into blocks of prefix-compressed entries.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
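
The prefix-compression "reset" described above can be illustrated with a
small decoder. This sketch assumes that each V4 entry records how many bytes
to strip from the end of the previous path plus a suffix to append;
decode_path() is a hypothetical helper, not a function from read-cache.c.

```c
#include <stddef.h>
#include <string.h>

/*
 * Rebuild a path from the previous one: drop `strip` bytes from the end
 * of prev, then append suffix. Returns 0 on success, -1 if strip is
 * larger than prev or the result does not fit in out.
 */
static int decode_path(const char *prev, size_t strip, const char *suffix,
		       char *out, size_t outsz)
{
	size_t prev_len = strlen(prev);
	size_t suffix_len = strlen(suffix);
	size_t keep;

	if (strip > prev_len)
		return -1;
	keep = prev_len - strip;
	if (keep + suffix_len + 1 > outsz)
		return -1;
	memmove(out, prev, keep);
	memcpy(out + keep, suffix, suffix_len + 1); /* copies the NUL too */
	return 0;
}
```

An entry written after a reset is encoded against an empty previous path
(prev = "", strip = 0), so a thread that starts at that offset needs no
state from earlier entries.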

read-cache: load cache extensions on a worker thread Ben Peart Wed, 10 Oct 2018 15:59:36 +0000 (11:59 -0400)

read-cache: load cache extensions on a worker thread

This patch helps address the CPU cost of loading the index by loading
the cache extensions on a worker thread in parallel with loading the cache
entries.

In some cases, loading the extensions takes longer than loading the
cache entries so this patch utilizes the new EOIE to start the thread to
load the extensions before loading all the cache entries in parallel.

This is possible because the current extensions don't access the cache
entries in the index_state structure, so it is fine that not all of the
entries exist yet.

The CACHE_EXT_TREE, CACHE_EXT_RESOLVE_UNDO, and CACHE_EXT_UNTRACKED
extensions don't even get a pointer to the index so don't have access to the
cache entries.

CACHE_EXT_LINK only uses the index_state to initialize the split index.
CACHE_EXT_FSMONITOR only uses the index_state to save the fsmonitor last
update and dirty flags.

I used p0002-read-cache.sh to generate some performance data:

Test w/100,000 files reduced the time by 0.53%
Test w/1,000,000 files reduced the time by 27.78%

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: add new index.threads config setting Ben Peart Wed, 10 Oct 2018 15:59:35 +0000 (11:59 -0400)

config: add new index.threads config setting

Add support for a new index.threads config setting which will be used to
control the threading code in do_read_index(). A value of 0 will tell the
index code to automatically determine the correct number of threads to use.
A value of 1 will make the code single threaded. A value greater than 1
will set the maximum number of threads to use.

For testing purposes, this setting can be overridden by setting the
GIT_TEST_INDEX_THREADS=<n> environment variable to a value greater than 0.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
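
The meaning of the index.threads values can be summarized in a small helper.
This is a sketch only: resolve_index_threads() is a hypothetical name, and
using the CPU count for "auto" and treating negative values as
single-threaded are assumptions, since the message does not say how the
automatic value is chosen.

```c
/*
 * configured:   value of index.threads (0 = auto, 1 = single-threaded,
 *               greater than 1 = maximum number of threads)
 * env_override: value of GIT_TEST_INDEX_THREADS, or 0 when it is unset
 * online_cpus:  number of CPUs available (assumed basis for "auto")
 */
static int resolve_index_threads(int configured, int env_override,
				 int online_cpus)
{
	/* The test-only environment variable wins when it is greater than 0. */
	int n = env_override > 0 ? env_override : configured;

	if (online_cpus < 1)
		online_cpus = 1;
	if (n == 0)
		return online_cpus; /* auto: assumed to follow the CPU count */
	if (n < 0)
		return 1; /* assumption: treat nonsense values as single-threaded */
	return n;
}
```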

eoie: add End of Index Entry (EOIE) extension Ben Peart Wed, 10 Oct 2018 15:59:34 +0000 (11:59 -0400)

eoie: add End of Index Entry (EOIE) extension

The End of Index Entry (EOIE) is used to locate the end of the variable
length index entries and the beginning of the extensions. Code can take
advantage of this to quickly locate the index extensions without having
to parse through all of the index entries.

The EOIE extension is always written out to the index file including to
the shared index when using the split index feature. Because it is always
written out, the SHA checksums in t/t1700-split-index.sh were updated
to reflect its inclusion.

It is written as an optional extension to ensure compatibility with other
git implementations that do not yet support it. It is always written out
to ensure it is available as often as possible to speed up index operations.

Because it must be able to be loaded before the variable length cache
entries and other index extensions, this extension must be written last.
The signature for this extension is { 'E', 'O', 'I', 'E' }.

The extension consists of:

- 32-bit offset to the end of the index entries

- 160-bit SHA-1 over the extension types and their sizes (but not
their contents). E.g. if we have "TREE" extension that is N-bytes
long, "REUC" extension that is M-bytes long, followed by "EOIE",
then the hash would be:

SHA-1("TREE" + <binary representation of N> +
"REUC" + <binary representation of M>)

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
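
The input to the SHA-1 in the EOIE extension can be assembled as in the
sketch below. It only builds the byte string that would be hashed; the hash
itself is left out. Encoding each size as a 32-bit big-endian integer is an
assumption about what "binary representation" means here, and both function
names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in network (big-endian) byte order. */
static void put_be32_sketch(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

/*
 * Append one extension header (4-byte signature followed by its size) to
 * the buffer that would be fed to SHA-1. Returns the new length, or 0 if
 * the 8 bytes do not fit within cap.
 */
static size_t append_ext_header(unsigned char *buf, size_t len, size_t cap,
				const char sig[4], uint32_t size)
{
	if (cap < len || cap - len < 8)
		return 0;
	memcpy(buf + len, sig, 4);
	put_be32_sketch(buf + len + 4, size);
	return len + 8;
}
```

For the example in the message, a writer would call append_ext_header()
once for "TREE" with N and once for "REUC" with M, then hash the resulting
16 bytes.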

read-cache: clean up casting and byte decoding Ben Peart Wed, 10 Oct 2018 15:59:33 +0000 (11:59 -0400)

read-cache: clean up casting and byte decoding

This patch does a clean up pass to minimize the casting required to work
with the memory mapped index (mmap).

It also makes the decoding of network byte order more consistent by using
get_be32() where possible.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
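
Decoding network byte order without pointer casts can be shown with a
byte-wise reader. get_be32_sketch() below is a stand-in written for this
illustration; it mirrors the idea behind get_be32() but is not copied from
Git.

```c
#include <stdint.h>

/*
 * Decode a big-endian 32-bit value one byte at a time. This works at any
 * alignment and on any host byte order, so the caller does not need to
 * cast a pointer into the mapped buffer to an integer type.
 */
static uint32_t get_be32_sketch(const void *ptr)
{
	const unsigned char *p = ptr;

	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) |
	       (uint32_t)p[3];
}
```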

Third batch for 2.20 Junio C Hamano Wed, 10 Oct 2018 03:38:03 +0000 (12:38 +0900)

Third batch for 2.20

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ab/fsck-skiplist' Junio C Hamano Wed, 10 Oct 2018 03:37:16 +0000 (12:37 +0900)

Merge branch 'ab/fsck-skiplist'

Update fsck.skipList implementation and documentation.

* ab/fsck-skiplist:
fsck: support comments & empty lines in skipList
fsck: use oidset instead of oid_array for skipList
fsck: use strbuf_getline() to read skiplist file
fsck: add a performance test for skipList
fsck: add a performance test
fsck: document that skipList input must be unabbreviated
fsck: document and test commented & empty line skipList input
fsck: document and test sorted skipList input
fsck tests: add a test for no skipList input
fsck tests: setup of bogus commit object
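
Handling comments, empty lines, and unabbreviated object names in a skipList
file might be sketched as below. The '#' comment character, the trimming of
surrounding blanks, and parse_skiplist_line() itself are assumptions for
this illustration; only the requirement that names be unabbreviated comes
from the commit titles above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define HEX_OID_LEN 40 /* unabbreviated SHA-1 object name in hex */

/*
 * Classify one line of a skipList file.
 * Returns 1 and copies the object name into oid_hex for a valid entry,
 * 0 for a blank or comment-only line, and -1 for a malformed entry.
 */
static int parse_skiplist_line(const char *line, char oid_hex[HEX_OID_LEN + 1])
{
	size_t len;

	line += strspn(line, " \t");     /* skip leading blanks */
	len = strcspn(line, "#\r\n");    /* cut at a comment or line end */
	while (len && isspace((unsigned char)line[len - 1]))
		len--;                   /* drop blanks before the comment */
	if (!len)
		return 0;
	if (len != HEX_OID_LEN)
		return -1;               /* abbreviated names are rejected */
	for (size_t i = 0; i < len; i++)
		if (!isxdigit((unsigned char)line[i]))
			return -1;
	memcpy(oid_hex, line, len);
	oid_hex[len] = '\0';
	return 1;
}
```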

Merge branch 'ds/multi-pack-verify' Junio C Hamano Wed, 10 Oct 2018 03:37:16 +0000 (12:37 +0900)

Merge branch 'ds/multi-pack-verify'

"git multi-pack-index" learned to detect corruption in the .midx
file it uses, and this feature has been integrated into "git fsck".

* ds/multi-pack-verify:
fsck: verify multi-pack-index
multi-pack-index: report progress during 'verify'
multi-pack-index: verify object offsets
multi-pack-index: fix 32-bit vs 64-bit size check
multi-pack-index: verify oid lookup order
multi-pack-index: verify oid fanout order
multi-pack-index: verify missing pack
multi-pack-index: verify packname order
multi-pack-index: verify corrupt chunk lookup table
multi-pack-index: verify bad header
multi-pack-index: add 'verify' verb
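
One of the checks listed above, verifying the oid fanout order, can be
pictured as follows. The sketch assumes a 256-entry fanout table of
cumulative object counts, as used in pack index files, so a valid table
never decreases; fanout_is_ordered() is a hypothetical helper and not the
code from this topic.

```c
#include <stddef.h>
#include <stdint.h>

#define FANOUT_ENTRIES 256

/*
 * Return 1 if the fanout table is non-decreasing, 0 otherwise. Entry i is
 * assumed to hold the number of objects whose first byte is <= i, so any
 * decrease indicates corruption.
 */
static int fanout_is_ordered(const uint32_t fanout[FANOUT_ENTRIES])
{
	for (size_t i = 1; i < FANOUT_ENTRIES; i++)
		if (fanout[i - 1] > fanout[i])
			return 0;
	return 1;
}
```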

Merge branch 'bc/hash-independent-tests' Junio C Hamano Wed, 10 Oct 2018 03:37:16 +0000 (12:37 +0900)

Merge branch 'bc/hash-independent-tests'

Various tests have been updated to make it easier to swap the
hash function used for object identification.

* bc/hash-independent-tests:
t5318: use test_oid for HASH_LEN
t1407: make hash size independent
t1406: make hash-size independent
t1405: make hash size independent
t1400: switch hard-coded object ID to variable
t1006: make hash size independent
t0064: make hash size independent
t0002: abstract away SHA-1 specific constants
t0000: update tests for SHA-256
t0000: use hash translation table
t: add test functions to translate hash-related values

Merge branch 'nd/test-tool' Junio C Hamano Wed, 10 Oct 2018 03:37:16 +0000 (12:37 +0900)

Merge branch 'nd/test-tool'

Test helper binaries clean-up.

* nd/test-tool:
Makefile: add a hint about TEST_BUILTINS_OBJS
t/helper: merge test-dump-fsmonitor into test-tool
t/helper: merge test-parse-options into test-tool
t/helper: merge test-pkt-line into test-tool
t/helper: merge test-dump-untracked-cache into test-tool
t/helper: keep test-tool command list sorted

Merge branch 'nd/config-split'Junio C Hamano Wed, 10 Oct 2018 03:37:15 +0000 (12:37 +0900)

Merge branch 'nd/config-split'

Split Documentation/config.txt for easier maintenance.

* nd/config-split:
config.txt: move submodule part out to a separate file
config.txt: move sequence.editor out of "core" part
config.txt: move sendemail part out to a separate file
config.txt: move receive part out to a separate file
config.txt: move push part out to a separate file
config.txt: move pull part out to a separate file
config.txt: move gui part out to a separate file
config.txt: move gitcvs part out to a separate file
config.txt: move format part out to a separate file
config.txt: move fetch part out to a separate file
config.txt: follow camelCase naming

cache-tree: skip some blob checks in partial cloneJonathan Tan Tue, 9 Oct 2018 18:40:37 +0000 (11:40 -0700)

cache-tree: skip some blob checks in partial clone

In a partial clone, whenever a sparse checkout occurs, the existence of
all blobs in the index is verified, whether they are included or
excluded by the .git/info/sparse-checkout specification. This
significantly degrades performance because a lazy fetch occurs whenever
the existence of a missing blob is checked.

This is because cache_tree_update() checks the existence of all objects
in the index, whether or not CE_SKIP_WORKTREE is set on them. Teach
cache_tree_update() to skip checking CE_SKIP_WORKTREE objects when the
repository is a partial clone. This improves performance for sparse
checkout and also other operations that use cache_tree_update().

Instead of completely removing the check, an argument could be made that
the check should instead be replaced by a check that the blob is
promised, but for performance reasons, I decided not to do this.
If the user needs to verify the repository, it can be done using fsck
(which will notify if a tree points to a missing and non-promised blob,
whether the blob is included or excluded by the sparse-checkout
specification).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Declare that the next one will be named 2.20Junio C Hamano Wed, 10 Oct 2018 00:20:03 +0000 (09:20 +0900)

Declare that the next one will be named 2.20

Signed-off-by: Junio C Hamano <gitster@pobox.com>

transport.c: introduce core.alternateRefsPrefixesTaylor Blau Mon, 8 Oct 2018 18:09:30 +0000 (11:09 -0700)

transport.c: introduce core.alternateRefsPrefixes

The recently-introduced "core.alternateRefsCommand" allows callers to
specify with high flexibility the tips that they wish to advertise from
alternates. This flexibility comes at the cost of some inconvenience
when the caller only wishes to limit the advertisement to one or more
prefixes.

For example, to advertise only tags, a caller using
'core.alternateRefsCommand' would have to do:

$ git config core.alternateRefsCommand ' \
f() { git -C "$1" for-each-ref \
refs/tags --format="%(objectname)"; }; f "$@"'

The above is cumbersome to write, so let's introduce a
"core.alternateRefsPrefixes" to address this common case. Instead, the
caller can run:

$ git config core.alternateRefsPrefixes 'refs/tags'

Which will behave identically to the longer example using
"core.alternateRefsCommand".

Since the value of "core.alternateRefsPrefixes" is appended to 'git
for-each-ref' and then executed, include a "--" before taking the
configured value to avoid misinterpreting arguments as flags to 'git
for-each-ref'.

In the case that the caller wishes to specify multiple prefixes, they
may separate them by whitespace. If "core.alternateRefsCommand" is set,
it will take precedence over "core.alternateRefsPrefixes".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
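
As a rough illustration of the core.alternateRefsPrefixes commit above, the sketch below prints the 'git for-each-ref' command line that a whitespace-separated value would expand to, with "--" placed before the configured prefixes. The helper name alternate_refs_argv is hypothetical and the exact arguments Git passes internally are an assumption; nothing is executed, the words are only printed.

```shell
# Hypothetical helper: print, one word per line, the for-each-ref
# command line implied by a core.alternateRefsPrefixes value.
alternate_refs_argv () {
	prefixes=$1
	# "--" comes before the configured value, so a value that looks
	# like an option cannot be taken as a flag.
	printf '%s\n' git for-each-ref '--format=%(objectname)' --
	# Deliberately unquoted: the value is split on whitespace.
	for p in $prefixes
	do
		printf '%s\n' "$p"
	done
}

alternate_refs_argv 'refs/heads refs/tags'
```

With two prefixes in the value, both end up after the "--" as separate arguments.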

transport.c: introduce core.alternateRefsCommandTaylor Blau Mon, 8 Oct 2018 18:09:28 +0000 (11:09 -0700)

transport.c: introduce core.alternateRefsCommand

When in a repository containing one or more alternates, Git would
sometimes like to list references from those alternates. For example,
'git receive-pack' lists the "tips" pointed to by references in those
alternates as special ".have" references.

Listing ".have" references is designed to make pushing changes from
upstream to a fork a lightweight operation, by advertising to the pusher
that the fork already has the objects (via its alternate). Thus, the
client can avoid sending them.

However, when the alternate (upstream, in the previous example) has a
pathologically large number of references, the initial advertisement is
too expensive. In fact, it can dominate any such optimization where the
pusher avoids sending certain objects.

Introduce "core.alternateRefsCommand" in order to provide a facility to
limit or filter alternate references. This can be used, for example, to
filter out references the alternate does not wish to send (for space
concerns, or otherwise) during the initial advertisement.

Let the repository that has alternates configure this command to avoid
trusting the alternate to provide us a safe command to run in the shell.
To find the alternate, pass its absolute path as the first argument.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

transport.c: extract 'fill_alternate_refs_command'Taylor Blau Mon, 8 Oct 2018 18:09:26 +0000 (11:09 -0700)

transport.c: extract 'fill_alternate_refs_command'

To list alternate references, 'read_alternate_refs' creates a child
process running 'git for-each-ref' in the alternate's Git directory.

Prepare to run other commands besides 'git for-each-ref' by introducing
and moving the relevant code from 'read_alternate_refs' to
'fill_alternate_refs_command'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

transport: drop refnames from for_each_alternate_refJeff King Mon, 8 Oct 2018 18:09:23 +0000 (11:09 -0700)

transport: drop refnames from for_each_alternate_ref

None of the current callers use the refname parameter we pass to their
callbacks. In theory somebody _could_ do so, but it's actually quite
weird if you think about it: it's a ref in somebody else's repository.
So the name has no meaning locally, and in fact there may be duplicates
if there are multiple alternates.

The users of this interface really only care about seeing some ref tips,
since that promises that the alternate has the full commit graph
reachable from there. So let's keep the information we pass back to the
bare minimum.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

docs: typo: s/isimilar/similar/Michael Witten Sat, 6 Oct 2018 04:20:22 +0000 (04:20 +0000)

docs: typo: s/isimilar/similar/

Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

docs: graph: remove unnecessary `graph_update()' callMichael Witten Sat, 6 Oct 2018 04:20:16 +0000 (04:20 +0000)

docs: graph: remove unnecessary `graph_update()' call

The sample code calls `get_revision()' followed by `graph_update()',
but the documentation and source code indicate that `get_revision()'
already calls `graph_update()' for you.

Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

docs: typo: s/go/to/Michael Witten Sat, 6 Oct 2018 04:20:09 +0000 (04:20 +0000)

docs: typo: s/go/to/

Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-completion.bash: add completion for stash listSteven Fernandez Thu, 27 Sep 2018 19:59:00 +0000 (20:59 +0100)

git-completion.bash: add completion for stash list

Since stash list accepts git-log options, add the following useful
options that make sense in the context of the `git stash list` command:

--name-status --oneline --patch-with-stat

Signed-off-by: Steven Fernandez <steve@lonetwin.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: do not list refs if fetching only hashesJonathan Tan Thu, 27 Sep 2018 19:24:07 +0000 (12:24 -0700)

fetch: do not list refs if fetching only hashes

If only hash literals are given on a "git fetch" command-line, tag
following is not requested, and the fetch is done using protocol v2, a
list of refs is not required from the remote. Therefore, optimize by
invoking transport_get_remote_refs() only if we need the refs.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

transport: list refs before fetch if necessaryJonathan Tan Thu, 27 Sep 2018 19:24:06 +0000 (12:24 -0700)

transport: list refs before fetch if necessary

The built-in bundle transport and the transport helper interface do not
work when transport_fetch_refs() is called immediately after transport
creation. This will be needed in a subsequent patch, so fix this.

Evidence: fetch_refs_from_bundle() relies on data->header being
initialized in get_refs_from_bundle(), and fetch() in transport-helper.c
relies on either data->fetch or data->import being set by get_helper(),
but neither transport_helper_init() nor fetch() calls get_helper().

Up until the introduction of the partial clone feature, this has not
been a problem, because transport_fetch_refs() is always called after
transport_get_remote_refs(). With the introduction of the partial clone
feature, which involves calling transport_fetch_refs() (to fetch objects
by their OIDs) without transport_get_remote_refs(), this is still not a
problem, but only coincidentally - we do not support partially cloning a
bundle, and as for cloning using a transport-helper-using protocol, it
so happens that before transport_fetch_refs() is called, fetch_refs() in
fetch-object.c calls transport_set_option(), which means that the
aforementioned get_helper() is invoked through set_helper_option() in
transport-helper.c.

This could be fixed by fixing the transports themselves, but it doesn't
seem like a good idea to me to open up previously untested code paths;
also, there may be transport helpers in the wild that assume that "list"
is always called before "fetch". Instead, fix this by having
transport_fetch_refs() call transport_get_remote_refs() to ensure that
the latter is always called at least once, unless the transport
explicitly states that it supports fetching without listing refs.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

transport: do not list refs if possibleJonathan Tan Thu, 27 Sep 2018 19:24:05 +0000 (12:24 -0700)

transport: do not list refs if possible

When all refs to be fetched are exact OIDs, it is possible to perform a
fetch without requiring the remote to list refs if protocol v2 is used.
Teach Git to do this.

This currently has an effect only for lazy fetches done from partial
clones. The change necessary to likewise optimize "git fetch <remote>
<sha-1>" will be done in a subsequent patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

transport: allow skipping of ref listingJonathan Tan Thu, 27 Sep 2018 19:24:04 +0000 (12:24 -0700)

transport: allow skipping of ref listing

The get_refs_via_connect() function both performs the handshake
(including determining the protocol version) and obtains the list of
remote refs. However, the fetch protocol v2 supports fetching objects
without the listing of refs, so make it possible for the user to skip
the listing by creating a new handshake() function. This will be used in
a subsequent commit.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

tests: order arguments to git-rev-list properlyMatthew DeVore Fri, 5 Oct 2018 21:54:07 +0000 (14:54 -0700)

tests: order arguments to git-rev-list properly

It is a common mistake to put positional arguments before flags when
invoking git-rev-list. Order the positional arguments last.

This patch skips git-rev-list invocations which include the --not flag,
since the ordering of flags and positional arguments affects the
behavior. This patch also skips invocations of git-rev-list that occur
in command substitution in which the exit code is discarded, since
fixing those properly will require a more involved cleanup.

Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t9109: don't swallow Git errors upstream of pipesMatthew DeVore Fri, 5 Oct 2018 21:54:06 +0000 (14:54 -0700)

t9109: don't swallow Git errors upstream of pipes

'git ... | foo' will mask any errors or crashes in git, so split up such
pipes in this file.

One testcase uses several separate pipe sequences in a row which are
awkward to split up. Wrap the split-up pipe in a function so the
awkwardness is not repeated. Also change that testcase's surrounding
quotes from double to single to avoid premature string interpolation.

Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
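
The idea in the t9109 commit above, wrapping a split-up pipe in a function so the awkwardness is written only once, might look roughly like this. The names producer, consumer and produce_then_consume are hypothetical stand-ins (producer plays the role of the git command); this is not the code from the test file.

```shell
# Hypothetical stand-ins: "producer" plays the role of a git command,
# "consumer" the role of the downstream filter.
producer () {
	printf '%s\n' beta alpha
}
consumer () {
	sort
}

# Run the two halves separately, via a scratch file given as $1, so
# that a failure of the producer stops the chain instead of being
# hidden behind the consumer's exit code.
produce_then_consume () {
	producer >"$1" &&
	consumer <"$1"
}

tmp=$(mktemp)
sorted=$(produce_then_consume "$tmp")
rm -f "$tmp"
echo "$sorted"
```

Each call site then uses the wrapper in place of the original pipe.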

tests: don't swallow Git errors upstream of pipesMatthew DeVore Fri, 5 Oct 2018 21:54:05 +0000 (14:54 -0700)

tests: don't swallow Git errors upstream of pipes

Some pipes in tests lose the exit code of git processes, which can mask
unexpected behavior like crashes. Split these pipes up so that git
commands are only at the end of pipes rather than the beginning or
middle.

The violations fixed in this patch were found in the process of fixing
pipe placement in a prior patch.

Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
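
To see why the commit above moves git commands away from the beginning or middle of pipes, consider this small sketch. failing_cmd is a hypothetical stand-in for a git process that fails after printing some output, and the shell is assumed to run without the pipefail option.

```shell
# Hypothetical stand-in for a git command that fails after printing
# part of its output.
failing_cmd () {
	echo partial
	return 1
}

# As a pipe: the pipeline's status is that of its last command (cat),
# so the upstream failure goes unnoticed.
failing_cmd | cat >/dev/null
piped_status=$?

# Split up: the failing command's own status is visible.
tmp=$(mktemp)
if failing_cmd >"$tmp"
then
	split_status=0
else
	split_status=$?
fi
rm -f "$tmp"

echo "piped: $piped_status, split: $split_status"
```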

t/*: fix ordering of expected/observed argumentsMatthew DeVore Fri, 5 Oct 2018 21:54:04 +0000 (14:54 -0700)

t/*: fix ordering of expected/observed arguments

Fix various places where the ordering was obviously wrong, meaning it
was easy to find with grep.

Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
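
The ordering fixed by the commit above matters because of how a diff reads. In the sketch below, compare_files is a hypothetical stand-in for the test suite's comparison helper, assumed to take the expected file first and the observed file second.

```shell
# Hypothetical stand-in for the comparison helper: the first argument
# is the expected file, the second the observed one.
compare_files () {
	diff -u "$1" "$2"
}

dir=$(mktemp -d)
echo "hello" >"$dir/expect"
echo "hullo" >"$dir/actual"

# With the expected file first, "-" lines show what was wanted and
# "+" lines show what was actually produced.
report=$(compare_files "$dir/expect" "$dir/actual") || :
echo "$report"
rm -rf "$dir"
```

Swapping the two arguments would flip the signs and make a failure report misleading.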

tests: standardize pipe placementMatthew DeVore Fri, 5 Oct 2018 21:54:03 +0000 (14:54 -0700)

tests: standardize pipe placement

Instead of using a line-continuation and pipe on the second line, take
advantage of the shell's implicit line continuation after a pipe
character. So for example, instead of

some long line \
| next line

use

some long line |
next line

And add a blank line before and after the pipe where it aids readability
(it usually does).

This better matches the coding style documented in
Documentation/CodingGuidelines and used in shell scripts elsewhere in
the tree.

Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation: add shell guidelinesMatthew DeVore Fri, 5 Oct 2018 21:54:02 +0000 (14:54 -0700)

Documentation: add shell guidelines

Add the following guideline to Documentation/CodingGuidelines:

Break overlong lines after "&&", "||", and "|", not before
them; that way the command can continue to subsequent lines
without backslash at the end.

And the following to t/README (since it is specific to writing tests):

Pipes and $(git ...) should be avoided when they swallow exit
codes of Git processes

Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
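
A small sketch of the line-breaking guideline quoted in the commit above, using ordinary utilities in place of git commands. The variable name and the data are made up for illustration.

```shell
# Break after "|" and "&&": the shell keeps reading the next line,
# so no trailing backslash is needed.
first=$(printf '%s\n' cherry apple banana |
	sort |
	head -n 1) &&
echo "first item: $first"
```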

t/README: reformat Do, Don't, Keep in mind listsMatthew DeVore Fri, 5 Oct 2018 21:54:01 +0000 (14:54 -0700)

t/README: reformat Do, Don't, Keep in mind lists

The list of Don'ts for test writing has grown large such that it is hard
to see at a glance which section an item is in. In other words, if I
ignore a little bit of surrounding context, the "don'ts" look like
"do's."

To make the list more readable, prefix "Don't" in front of every first
sentence in the items.

Also, the "Keep in mind" list is out of place and awkward, because it
was a very short "list" beneath two very long ones, and it seemed easy
to miss under the list of "don'ts," and it only had one item. So move
this item to the list of "do's" and phrase as "Remember..."

Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: reduce initial oid allocationDerrick Stolee Wed, 3 Oct 2018 17:12:19 +0000 (10:12 -0700)

commit-graph: reduce initial oid allocation

While writing a commit-graph file, we store the full list of
commits in a flat list. We use this list for sorting and ensuring
we are closed under reachability.

The initial allocation assumed that (at most) one in four objects
is a commit. This is a dramatic over-count for many repos,
especially large ones. Since we grow the list dynamically, reduce
this count by a factor of eight. We still set it to a minimum of
1024 before allocating.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
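
The arithmetic described in the commit above can be sketched as follows. The helper name initial_oid_alloc is hypothetical, and the divisor 32 is inferred from the message (one in four, reduced by a further factor of eight) rather than taken from the source code.

```shell
# Hypothetical helper reproducing the arithmetic in the message:
# "one in four" is count / 4; a further factor of eight gives
# count / 32, with a floor of 1024.
initial_oid_alloc () {
	alloc=$(( $1 / 32 ))
	if [ "$alloc" -lt 1024 ]
	then
		alloc=1024
	fi
	echo "$alloc"
}

initial_oid_alloc 1000000   # prints 31250
initial_oid_alloc 100       # prints 1024 (the floor applies)
```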

builtin/commit-graph.c: UNLEAK variablesMartin Ågren Wed, 3 Oct 2018 17:12:17 +0000 (10:12 -0700)

builtin/commit-graph.c: UNLEAK variables

`graph_verify()`, `graph_read()` and `graph_write()` do the hard work of
`cmd_commit_graph()`. As soon as these return, so does
`cmd_commit_graph()`.

`strbuf_getline()` may allocate memory in the strbuf, yet return EOF.
We need to release the strbuf or UNLEAK it. Go for the latter since we
are close to returning from `graph_write()`.

`graph_write()` also fails to free the strings in the string list. They
have been added to the list with `strdup_strings` set to 0. We could
flip `strdup_strings` before clearing the list, which is our usual hack
in situations like this. But since we are about to exit, let's just
UNLEAK the whole string list instead.

UNLEAK `graph` in `graph_verify`. While at it, and for consistency,
UNLEAK in `graph_read()` as well, and remove an unnecessary UNLEAK just
before dying.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: clean up leaked memory during writeDerrick Stolee Wed, 3 Oct 2018 17:12:15 +0000 (10:12 -0700)

commit-graph: clean up leaked memory during write

The write_commit_graph() method in commit-graph.c leaks some lists
and strings during execution. In addition, a list of strings is
leaked in write_commit_graph_reachable(). Clean these up so our
memory checking is cleaner.

Further, if we use a list of pack-files to find the commits, we
can leak the packed_git structs after scanning them for commits.

Running the following commands demonstrates the leak before and
the fix after:

* valgrind --leak-check=full ./git commit-graph write --reachable
* valgrind --leak-check=full ./git commit-graph write --stdin-packs

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff --color-moved: fix a memory leakPhillip Wood Thu, 4 Oct 2018 10:07:45 +0000 (11:07 +0100)

diff --color-moved: fix a memory leak

Free the hashmap items as well as the hashmap itself. This was found
with asan.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff --color-moved-ws: fix another memory leakPhillip Wood Thu, 4 Oct 2018 10:07:44 +0000 (11:07 +0100)

diff --color-moved-ws: fix another memory leak

This is obvious in retrospect. It was found with asan.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff --color-moved-ws: fix a memory leakPhillip Wood Thu, 4 Oct 2018 10:07:43 +0000 (11:07 +0100)

diff --color-moved-ws: fix a memory leak

Don't duplicate the indentation string if we're not going to use it.
This was found with asan.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff --color-moved-ws: fix out of bounds string accessPhillip Wood Thu, 4 Oct 2018 10:07:42 +0000 (11:07 +0100)

diff --color-moved-ws: fix out of bounds string access

When adjusting the start of the string to take account of the change
in indentation the code was not checking that the string being
adjusted was in fact longer than the indentation change. This was
detected by asan.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff --color-moved-ws: fix double free crashPhillip Wood Thu, 4 Oct 2018 10:07:41 +0000 (11:07 +0100)

diff --color-moved-ws: fix double free crash

Running

git diff --color-moved-ws=allow-indentation-change v2.18.0 v2.19.0

results in a crash due to a double free. This happens when two
potential moved blocks start with consecutive lines. As
pmb_advance_or_null_multi_match() advances it copies the ws_delta from
the last matching line to the next. When the first of our consecutive
lines is advanced its ws_delta will be copied to the second,
overwriting the ws_delta of the block containing the second line. Then
when the second line is advanced it will copy the new ws_delta to the
line below it and so on. Eventually one of these blocks will stop
matching and the ws_delta will be freed. From then on the other block
is in a use-after-free state and when it stops matching it will try to
free the ws_delta that has already been freed by the other block.

The solution is to store the ws_delta in the array of potential moved
blocks rather than with the lines. This means that it no longer needs
to be copied around and one block cannot overwrite the ws_delta of
another. Additionally it saves some malloc/free calls as we don't keep
allocating and freeing ws_deltas.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

oidset: uninline oidset_init()René Scharfe Thu, 4 Oct 2018 15:14:37 +0000 (17:14 +0200)

oidset: uninline oidset_init()

There is no need to inline oidset_init(), as it's typically only called
twice in the lifetime of an oidset (once at the beginning and once at
the end by oidset_clear()) and kh_resize_* is quite big, so move its
definition to oidset.c. Document it while we're at it.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

oidset: use khashRené Scharfe Thu, 4 Oct 2018 15:13:06 +0000 (17:13 +0200)

oidset: use khash

Reimplement oidset using khash.h in order to reduce its memory footprint
and make it faster.

Performance of a command that mainly checks for duplicate objects using
an oidset, with master and Clang 6.0.1:

$ cmd="./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'"

$ /usr/bin/time $cmd >/dev/null
0.22user 0.03system 0:00.25elapsed 99%CPU (0avgtext+0avgdata 48484maxresident)k
0inputs+0outputs (0major+11204minor)pagefaults 0swaps

$ hyperfine "$cmd"
Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'

Time (mean ± σ): 250.0 ms ± 6.0 ms [User: 225.9 ms, System: 23.6 ms]

Range (min … max): 242.0 ms … 261.1 ms

And with this patch:

$ /usr/bin/time $cmd >/dev/null
0.14user 0.00system 0:00.15elapsed 100%CPU (0avgtext+0avgdata 41396maxresident)k
0inputs+0outputs (0major+8318minor)pagefaults 0swaps

$ hyperfine "$cmd"
Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'

Time (mean ± σ): 151.9 ms ± 4.9 ms [User: 130.5 ms, System: 21.2 ms]

Range (min … max): 148.2 ms … 170.4 ms

Initial-patch-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

khash: factor out kh_release_*René Scharfe Thu, 4 Oct 2018 15:10:54 +0000 (17:10 +0200)

khash: factor out kh_release_*

Add a function for releasing the khash-internal allocations, but not the
khash structure itself. It can be used with on-stack khash structs.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: load tip_oids eagerly iff neededRené Scharfe Thu, 4 Oct 2018 15:09:39 +0000 (17:09 +0200)

fetch-pack: load tip_oids eagerly iff needed

tip_oids_contain() lazily loads refs into an oidset at its first call.
It abuses the internal (sub)member .map.tablesize of that oidset to
check if it has done that already.

Determine if the oidset needs to be populated upfront and then do that
instead. This duplicates a loop, but simplifies the existing one by
separating concerns between the two.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>