mingw: verify that paths are not mistaken for remote nicknames
This added test case simply verifies that users will not be bothered
with bogus complaints à la
warning: unable to access '.git/remotes/D:\repo': Invalid argument
when fetching from a Windows path (in this case, D:\repo).
[j6t: mark the new test as test_expect_failure]
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fopen() returns NULL, it could be because the given path does not
exist, but it could also be some other errors and the caller has to
check. Add a wrapper so we don't have to repeat the same error check
everywhere.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In many places, Git warns about an inaccessible file after a fopen()
failed. To discern these cases from other cases where we want to warn
about inaccessible files, introduce a new helper specifically to test
whether fopen() failed because the current user lacks the permission to
open file in question.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD
This variable is added [1] with the assumption that on a sane system,
fopen(<dir>, "r") should return NULL. Linux and FreeBSD do not meet this
expectation while at least Windows and AIX do. Let's make sure they
behave the same way.
I only tested one version on Linux (4.7.0 with glibc 2.22) and
FreeBSD (11.0) but since GNU/kFreeBSD is fbsd kernel with gnu userspace,
I'm pretty sure it shares the same problem.
[1] cba22528fa (Add compat/fopen.c which returns NULL on attempt to open
directory - 2008-02-08)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
- provides error details
- explains error on reading, or writing, or whatever operation
- has l10n support
- prints file name in the error
Some of these are missing in the places that are replaced with xfopen(),
which is a clear win. In some other places, it's just less code (not as
clearly a win as the previous case but still is).
The only slight regresssion is in remote-testsvn, where we don't report
the file class (marks files) in the error messages anymore. But since
this is a _test_ svn remote transport, I'm not too concerned.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If git is built with the FREAD_READS_DIRECTORIES build variable set, this
would cause sparse to issue a 'not declared, should it be static?' warning
on Linux. This is a result of the method employed by 'compat/fopen.c' to
suppress the (possible) redefinition of the (system) fopen macro, which
also removes the extern declaration of the git_fopen function.
In order to suppress the warning, introduce a new macro to suppress the
definition (or possibly the re-definition) of the fopen symbol as a macro
override. This new macro (SUPPRESS_FOPEN_REDEFINITION) is only defined in
'compat/fopen.c', just prior to the inclusion of the 'git-compat-util.h'
header file.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Completion for "git checkout <branch>" that auto-creates the branch
out of a remote tracking branch can now be disabled, as this
completion often gets in the way when completing to checkout an
existing local branch that happens to share the same prefix with
bunch of remote tracking branches.
rebase -i: reread the todo list if `exec` touched it
In the scripted version of the interactive rebase, there was no internal
representation of the todo list; it was re-read before every command.
That allowed the hack that an `exec` command could append (or even
completely rewrite) the todo list.
This hack was broken by the partial conversion of the interactive rebase
to C, and this patch reinstates it.
We also add a small test to verify that this fix does not regress in the
future.
Signed-off-by: Stephen Hicks <sdh@google.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Linux32 build was not build with our strict compiler settings (e.g.
warnings as errors). Fix this by passing the DEVELOPER environment
variable to the docker container.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the $STATUS variable contains a "%" character then printf will
interpret that as invalid format string. Fix this by formatting $STATUS
as string.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internals of the refs API around the cached refs has been
streamlined.
* mh/separate-ref-cache:
do_for_each_entry_in_dir(): delete function
files_pack_refs(): use reference iteration
commit_packed_refs(): use reference iteration
cache_ref_iterator_begin(): make function smarter
get_loose_ref_cache(): new function
get_loose_ref_dir(): function renamed from get_loose_refs()
do_for_each_entry_in_dir(): eliminate `offset` argument
refs: handle "refs/bisect/" in `loose_fill_ref_dir()`
ref-cache: use a callback function to fill the cache
refs: record the ref_store in ref_cache, not ref_dir
ref-cache: introduce a new type, ref_cache
refs: split `ref_cache` code into separate files
ref-cache: rename `remove_entry()` to `remove_entry_from_dir()`
ref-cache: rename `find_ref()` to `find_ref_entry()`
ref-cache: rename `add_ref()` to `add_ref_entry()`
refs_verify_refname_available(): use function in more places
refs_verify_refname_available(): implement once for all backends
refs_ref_iterator_begin(): new function
refs_read_raw_ref(): new function
get_ref_dir(): don't call read_loose_refs() for "refs/bisect"
Allow to lock a worktree immediately after it's created. This helps
prevent a race between "git worktree add; git worktree lock" and
"git worktree prune".
While handy, "git_path()" is a dangerous function to use as a
callsite that uses it safely one day can be broken by changes
to other code that calls it. Reduction of its use continues.
* jk/war-on-git-path:
am: drop "dir" parameter from am_state_init
replace strbuf_addstr(git_path()) with git_path_buf()
replace xstrdup(git_path(...)) with git_pathdup(...)
use git_path_* helper functions
branch: add edit_description() helper
bisect: add git_path_bisect_terms helper
"git checkout" that handles a lot of paths has been optimized by
reducing the number of unnecessary checks of paths in the
has_dir_name() function.
* jh/add-index-entry-optim:
read-cache: speed up has_dir_name (part 2)
read-cache: speed up has_dir_name (part 1)
read-cache: speed up add_index_entry during checkout
p0006-read-tree-checkout: perf test to time read-tree
read-cache: add strcmp_offset function
The recently introduced conditional inclusion of configuration did
not work well when early-config mechanism was involved.
* nd/conditional-config-in-early-config:
config: correct file reading order in read_early_config()
config: handle conditional include when $GIT_DIR is not set up
config: prepare to pass more info in git_config_with_options()
Having a git command on the upstream side of a pipe in a test
script will hide the exit status from the command, which may cause
us to fail to notice a breakage; rewrite tests in a script to avoid
this issue.
* pc/t2027-git-to-pipe-cleanup:
t2027: avoid using pipes
* gb/rebase-signoff:
rebase: pass --[no-]signoff option to git am
builtin/am: fold am_signoff() into am_append_signoff()
builtin/am: honor --signoff also when --rebasing
prio_queue_reverse: don't swap elements with themselves
Our array-reverse algorithm does the usual "walk from both
ends, swapping elements". We can quit when the two indices
are equal, since:
1. Swapping an element with itself is a noop.
2. If i and j are equal, then in the next iteration i is
guaranteed to be bigge than j, and we will exit the
loop.
So exiting the loop on equality is slightly more efficient.
And more importantly, the new SWAP() macro does not expect
to handle noop swaps; it will call memcpy() with the same src
and dst pointers in this case. It's unclear whether that
causes a problem on any platforms by violating the
"overlapping memory" constraint of memcpy, but it does cause
valgrind to complain.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gethostname(2) may not NUL terminate the buffer if hostname does
not fit; unfortunately there is no easy way to see if our buffer
was too small, but at least this will make sure we will not end up
using garbage past the end of the buffer.
* dt/xgethostname-nul-termination:
xgethostname: handle long hostnames
use HOST_NAME_MAX to size buffers for gethostname(2)
* rs/misc-cppcheck-fixes:
server-info: avoid calling fclose(3) twice in update_info_file()
files_for_each_reflog_ent_reverse(): close stream and free strbuf on error
am: close stream on error, but not stdin
"git fetch-pack" was not prepared to accept ERR packet that the
upload-pack can send with a human-readable error message. It
showed the packet contents with ERR prefix, so there was no data
loss, but it was redundant to say "ERR" in an error message.
* jt/fetch-pack-error-reporting:
fetch-pack: show clearer error message upon ERR
* jk/quarantine-received-objects:
refs: reject ref updates while GIT_QUARANTINE_PATH is set
receive-pack: document user-visible quarantine effects
receive-pack: drop tmp_objdir_env from run_update_hook
The index file has a trailing SHA-1 checksum to detect file
corruption, and historically we checked it every time the index
file is used. Omit the validation during normal use, and instead
verify only in "git fsck".
In a 2- and 3-way merge of trees, more than one source trees often
end up sharing an identical subtree; optimize by not reading the
same tree multiple times in such a case.
* jh/unpack-trees-micro-optim:
unpack-trees: avoid duplicate ODB lookups during checkout
The string-list API used a custom reallocation strategy that was
very inefficient, instead of using the usual ALLOC_GROW() macro,
which has been fixed.
* jh/string-list-micro-optim:
string-list: use ALLOC_GROW macro when reallocing string_list
$GIT_DIR may in some cases be normalized with all symlinks resolved
while "gitdir" path expansion in the pattern does not receive the
same treatment, leading to incorrect mismatch. This has been fixed.
* nd/conditional-config-include:
config: resolve symlinks in conditional include's patterns
path.c: and an option to call real_path() in expand_user_path()
Allow the http.postbuffer configuration variable to be set to a
size that can be expressed in size_t, which can be larger than
ulong on some platforms.
* dt/http-postbuffer-can-be-large:
http.postbuffer: allow full range of ssize_t values
t/perf: correctly align non-ASCII descriptions in output
Change the test descriptions from being treated as binary blobs by
perl to being treated as UTF-8. This ensures that e.g. a test
description like "æ" is counted as 1 character, not 2.
I have WIP performance tests for non-ASCII grep patterns on another
topic that are affected by this.
Now instead of:
$ ./run p0000-perf-lib-sanity.sh
[...]
0000.4: export a weird var 0.00(0.00+0.00)
0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00)
0000.7: important variables available in subshells 0.00(0.00+0.00)
[...]
We emit:
[...]
0000.4: export a weird var 0.00(0.00+0.00)
0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00)
0000.7: important variables available in subshells 0.00(0.00+0.00)
[...]
Fixes code originally added in 342e9ef2d9 ("Introduce a performance
testing framework", 2012-02-17).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we complete branch names for "git checkout", we also
complete remote branch names that could trigger the DWIM
behavior. Depending on your workflow and project, this can
be either convenient or annoying.
For instance, my clone of gitster.git contains 74 local
"jk/*" branches, but origin contains another 147. When I
want to checkout a local branch but can't quite remember the
name, tab completion shows me 251 entries. And worse, for a
topic that has been picked up for pu, the upstream branch
name is likely to be similar to mine, leading to a high
probability that I pick the wrong one and accidentally
create a new branch.
This patch adds a way for the user to tell the completion
code not to include DWIM suggestions for checkout. This can
already be done by typing:
git checkout --no-guess jk/<TAB>
but that's rather cumbersome. The downside, of course, is
that you no longer get completion support when you _do_ want
to invoke the DWIM behavior. But depending on your workflow,
that may not be a big loss (for instance, in git.git I am
much more likely to want to detach, so I'd type "git
checkout origin/jk/<TAB>" anyway).
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: expand "push --delete <remote> <ref>" for refs on that <remote>
Change the completion of "push --delete <remote> <ref>" to complete
refs on that <remote>, not all refs.
Before this cloning git.git and doing "git push --delete origin
p<TAB>" will complete nothing, since a fresh clone of git.git will
have no "pu" branch, whereas origin/p<TAB> will uselessly complete
origin/pu, but fully qualified references aren't accepted by
"--delete".
Now p<TAB> will complete as "pu". The completion of giving --delete
later, e.g. "git push origin --delete p<TAB>" remains unchanged, this
is a bug, but is a general existing limitation of the bash completion,
and not how git-push is documented, so I'm not fixing that case, but
adding a failing TODO test for it.
The testing code was supplied by SZEDER Gábor in
<20170421122832.24617-1-szeder.dev@gmail.com> with minor setup
modifications on my part.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com> Test-code-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original NIST press release linked here is no longer
available. But it was just a one-page summary of a larger
planning report; we can link to the report and point people
to the executive summary, which contains the same
information.
Ideally we'd cite it with a DOI, but I couldn't dig one up
for this particular document. I found many URLs pointing to
this report, but they all end up redirecting to this one
(and it looks somewhat official).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-archimport has an option to register archives at
mirrors.sourcecontrol.net. The sourcecontrol.net domain
still exists, but that hostname no longer exists.
That means this feature is presumably broken. I'll leave the
examination and modification of that to people who might
actually use archimport. But in the meantime, let's wrap the
reference in the documentation in backticks, which will
avoid turning it into a broken link (and thus polluting
linkchecker results).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The slides for the Linux-mentoring presentation are no
longer available. Let's point to the wayback version of the
page, which works.
Note that the referenced diagram is also available on page
15 of [1]. We could link to that instead, but it's not clear
from the URL scheme ("uploads") whether it's going to stick
around forever.
Many sites these days unconditionally redirect http requests
to their https equivalents. Let's make our links https in
the first place to save the client a redirect.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we see an error from split_cmdline(), we exit the
function without freeing the copy of the command string we
made.
This was sort-of introduced by 22e5ae5c8 (connect.c: handle
errors from split_cmdline, 2017-04-10). The leak existed
before that, but before that commit fixed the bug, we could
never trigger this else clause in the first place.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only caller of this function passes in a static buffer
returned from git_path(). This looks dangerous at first
glance, but turns out to be OK because the first thing we do
is xstrdup() the result.
Let's turn this into a git_pathdup(). That's slightly more
efficient (no extra copy), and makes it easier to audit for
dangerous git_path() invocations.
Since there's only a single caller, let's just set this
default path inside the init function. That makes the memory
ownership clear.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
replace strbuf_addstr(git_path()) with git_path_buf()
Writing directly into the strbuf avoids a useless copy of
the data, and dropping calls to git_path() makes it easier
to audit for dangerous calls.
Note that git_path() does an implicit strbuf_reset(), but in
each of these cases we were either already doing that reset,
or writing into a fresh strbuf anyway.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
replace xstrdup(git_path(...)) with git_pathdup(...)
It's more efficient to use git_pathdup(), as it skips an
extra copy of the path. And by removing some calls to
git_path(), it makes it easier to audit for dangerous uses.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Long ago we added functions like git_path_merge_msg() to
replace the more dangerous git_path("MERGE_MSG"). Over time
some new calls to the latter have crept it. Let's convert
them to use the safer form.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than have a variable with a short name that is fed to
git_path(), let's add a helper function that returns the
full path. This avoids the dangerous git_path() function.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This avoids using the dangerous git_path(). Right now
there's only one call site (because the writing half is
still part of the shell script), but it may come in handy in
the future as more of bisect is written in C. It also
matches how we access the other BISECT_* files.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
read-cache: avoid using git_path() in freshen_shared_index()
When performing an interactive rebase in split-index mode,
the commit message that one should rework when squashing commits
can contain some garbage instead of the usual concatenation of
both of the commit messages.
The code uses git_path() to compute the shared index filename, and
passes it to check_and_freshen_file() as its argument; there is no
guarantee that the rotating pathname buffer passed as argument will
stay valid during the life of this call. Make our own copy before
calling the function and pass the copy as its argument to avoid this
risky pattern.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
As explained in the document. This option has an advantage over the
command sequence "git worktree add && git worktree lock": there will be
no gap that somebody can accidentally "prune" the new worktree (or soon,
explicitly "worktree remove" it).
"worktree add" does keep a lock on while it's preparing the worktree.
If --lock is specified, this lock remains after the worktree is created.
Suggested-by: David Taylor <David.Taylor@dell.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jh/memihash-opt:
p0004: make perf test executable
t3008: skip lazy-init test on a single-core box
test-online-cpus: helper to return cpu count
name-hash: fix buffer overrun
"git p4" used "name-rev HEAD" when it wants to learn what branch is
checked out; it should use "symbolic-ref HEAD".
* ld/p4-current-branch-fix:
git-p4: don't use name-rev to get current branch
git-p4: add read_pipe_text() internal function
git-p4: add failing test for name-rev rather than symbolic-ref
Clean up fallouts from recent tightening of the set-up sequence,
where Git barfs when repository information is accessed without
first ensuring that it was started in a repository.
* jk/no-looking-at-dotgit-outside-repo:
test-read-cache: setup git dir
has_sha1_file: don't bother if we are not in a repository
The "submodule" specific field in the ref_store structure is
replaced with a more generic "gitdir" that can later be used also
when dealing with ref_store that represents the set of refs visible
from the other worktrees.
* nd/files-backend-git-dir: (28 commits)
refs.h: add a note about sorting order of for_each_ref_*
t1406: new tests for submodule ref store
t1405: some basic tests on main ref store
t/helper: add test-ref-store to test ref-store functions
refs: delete pack_refs() in favor of refs_pack_refs()
files-backend: avoid ref api targeting main ref store
refs: new transaction related ref-store api
refs: add new ref-store api
refs: rename get_ref_store() to get_submodule_ref_store() and make it public
files-backend: replace submodule_allowed check in files_downcast()
refs: move submodule code out of files-backend.c
path.c: move some code out of strbuf_git_path_submodule()
refs.c: make get_main_ref_store() public and use it
refs.c: kill register_ref_store(), add register_submodule_ref_store()
refs.c: flatten get_ref_store() a bit
refs: rename lookup_ref_store() to lookup_submodule_ref_store()
refs.c: introduce get_main_ref_store()
files-backend: remove the use of git_path()
files-backend: add and use files_ref_path()
files-backend: add and use files_reflog_path()
...
If a patch e-mail had its first paragraph after an in-body header
indented (even after a blank line after the in-body header line),
the indented line was mistook as a continuation of the in-body
header. This has been fixed.
"git push --recurse-submodules --push-option=<string>" learned to
propagate the push option recursively down to pushes in submodules.
* bw/push-options-recursively-to-submodules:
push: propagate remote and refspec with --recurse-submodules
submodule--helper: add push-check subcommand
remote: expose parse_push_refspec function
push: propagate push-options with --recurse-submodules
push: unmark a local variable as static
Conversion from unsigned char [40] to struct object_id continues.
* bc/object-id:
Documentation: update and rename api-sha1-array.txt
Rename sha1_array to oid_array
Convert sha1_array_for_each_unique and for_each_abbrev to object_id
Convert sha1_array_lookup to take struct object_id
Convert remaining callers of sha1_array_lookup to object_id
Make sha1_array_append take a struct object_id *
sha1-array: convert internal storage for struct sha1_array to object_id
builtin/pull: convert to struct object_id
submodule: convert check_for_new_submodule_commits to object_id
sha1_name: convert disambiguate_hint_fn to take object_id
sha1_name: convert struct disambiguate_state to object_id
test-sha1-array: convert most code to struct object_id
parse-options-cb: convert sha1_array_append caller to struct object_id
fsck: convert init_skiplist to struct object_id
builtin/receive-pack: convert portions to struct object_id
builtin/pull: convert portions to struct object_id
builtin/diff: convert to struct object_id
Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ
Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ
Define new hash-size constants for allocating memory
The output from "git status --short" has been extended to show
various kinds of dirtyness in submodules differently; instead of to
"M" for modified, 'm' and '?' can be shown to signal changes only
to the working tree of the submodule but not the commit that is
checked out.
* sb/submodule-short-status:
submodule.c: correctly handle nested submodules in is_submodule_modified
short status: improve reporting for submodule changes
submodule.c: stricter checking for submodules in is_submodule_modified
submodule.c: port is_submodule_modified to use porcelain 2
submodule.c: convert is_submodule_modified to use strbuf_getwholeline
submodule.c: factor out early loop termination in is_submodule_modified
submodule.c: use argv_array in is_submodule_modified
Teach has_dir_name() to see if the path of the new item
is greater than the last path in the index array before
attempting to search for it.
has_dir_name() is looking for file/directory collisions
in the index and has to consider each sub-directory
prefix in turn. This can cause multiple binary searches
for each path.
During operations like checkout, merge_working_tree()
populates the new index in sorted order, so we expect
to be able to append in many cases.
This commit is part 2 of 2. This commit handles the
additional possible short-cuts as we look at each
sub-directory prefix.
The net-net gains for add_index_entry_with_check() and
both had_dir_name() commits are best seen for very
large repos.
Here are results for an INFLATED version of linux.git
with 1M files.
$ GIT_PERF_REPO=/mnt/test/linux_inflated.git/ ./run upstream/base HEAD ./p0006-read-tree-checkout.sh
Test upstream/base HEAD
0006.2: read-tree br_base br_ballast (1043893) 3.79(3.63+0.15) 2.68(2.52+0.15) -29.3%
0006.3: switch between br_base br_ballast (1043893) 7.55(6.58+0.44) 6.03(4.60+0.43) -20.1%
0006.4: switch between br_ballast br_ballast_plus_1 (1043893) 10.84(9.26+0.59) 8.44(7.06+0.65) -22.1%
0006.5: switch between aliases (1043893) 10.93(9.39+0.58) 10.24(7.04+0.63) -6.3%
Here are results for a synthetic repo with 4.2M files.
$ GIT_PERF_REPO=~/work/gfw/t/perf/repos/gen-many-files-10.4.3.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test HEAD~3 HEAD
0006.2: read-tree br_base br_ballast (4194305) 29.96(19.26+10.50) 23.76(13.42+10.12) -20.7%
0006.3: switch between br_base br_ballast (4194305) 56.95(36.08+16.83) 45.54(25.94+15.68) -20.0%
0006.4: switch between br_ballast br_ballast_plus_1 (4194305) 90.94(51.50+31.52) 78.22(39.39+30.70) -14.0%
0006.5: switch between aliases (4194305) 93.72(51.63+34.09) 77.94(39.00+30.88) -16.8%
Results for medium repos (like linux.git) are mixed and have
more variance (probably do to disk IO unrelated to this test.
$ GIT_PERF_REPO=/mnt/test/linux.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test HEAD~3 HEAD
0006.2: read-tree br_base br_ballast (57994) 0.25(0.21+0.03) 0.20(0.17+0.02) -20.0%
0006.3: switch between br_base br_ballast (57994) 10.67(6.06+2.92) 10.51(5.94+2.91) -1.5%
0006.4: switch between br_ballast br_ballast_plus_1 (57994) 0.59(0.47+0.16) 0.52(0.40+0.13) -11.9%
0006.5: switch between aliases (57994) 0.59(0.44+0.17) 0.51(0.38+0.14) -13.6%
$ GIT_PERF_REPO=/mnt/test/linux.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test HEAD~3 HEAD
0006.2: read-tree br_base br_ballast (57994) 0.24(0.21+0.02) 0.21(0.18+0.02) -12.5%
0006.3: switch between br_base br_ballast (57994) 10.42(5.98+2.91) 10.66(5.86+3.09) +2.3%
0006.4: switch between br_ballast br_ballast_plus_1 (57994) 0.59(0.49+0.13) 0.53(0.37+0.16) -10.2%
0006.5: switch between aliases (57994) 0.59(0.43+0.17) 0.50(0.37+0.14) -15.3%
Results for smaller repos (like git.git) are not significant.
$ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test HEAD~3 HEAD
0006.2: read-tree br_base br_ballast (3043) 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0%
0006.3: switch between br_base br_ballast (3043) 0.31(0.17+0.11) 0.29(0.19+0.08) -6.5%
0006.4: switch between br_ballast br_ballast_plus_1 (3043) 0.03(0.02+0.00) 0.03(0.02+0.00) +0.0%
0006.5: switch between aliases (3043) 0.03(0.02+0.00) 0.03(0.02+0.00) +0.0%
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach has_dir_name() to see if the path of the new item
is greater than the last path in the index array before
attempting to search for it.
has_dir_name() is looking for file/directory collisions
in the index and has to consider each sub-directory
prefix in turn. This can cause multiple binary searches
for each path.
During operations like checkout, merge_working_tree()
populates the new index in sorted order, so we expect
to be able to append in many cases.
This commit is part 1 of 2. This commit handles the top
of has_dir_name() and the trivial optimization.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
read-cache: speed up add_index_entry during checkout
Teach add_index_entry_with_check() to see if the path
of the new item is greater than the last path in the
index array before attempting to search for it.
During checkout, merge_working_tree() populates the new
index in sorted order, so this change will save a binary
lookups per file. This preserves the original behavior
but simply checks the last element before starting the
search.
This helps performance on very large repositories.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>