gitweb: Git config keys are case insensitive, make config search too
"git config -z -l" that gitweb uses in git_parse_project_config() to
populate %config hash returns section and key names of config
variables in lowercase (they are case insensitive). When checking
%config in git_get_project_config() we have to take it into account.
Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new process's error output may be redirected elsewhere, but if
the exec fails, output should still go to the parent's stderr. This
has already been done for the die_routine. Do the same for
error_routine.
Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: clarify the invalidated tree entry format
When the entry_count is -1, the tree is invalidated and therefore has
not associated hash (or object name). Explicitly state that the next
entry starts after the newline.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the --quiet flag "git submodule update" and "git submodule add"
didn't behave as the documentation stated. They printed progress output
from the clone, even though they should only print error messages.
Fix that by passing the -q flag to git clone in module_clone() when the
GIT_QUIET variable is set. Two tests in t7400 have been modified to test
that behavior.
Reported-by: Daniel Holtmann-Rice <flyingtabmow@gmail.com> Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitweb: Introduce common system-wide settings for convenience
Because of backward compatibility we cannot change gitweb to always
use /etc/gitweb.conf (i.e. even if gitweb_config.perl exists). For
common system-wide settings we therefore need separate configuration
file: /etc/gitweb-common.conf.
Long description:
gitweb currently obtains configuration from the following sources:
If per-instance configuration file exists, then system-wide
configuration is _not used at all_. This is quite untypical and
suprising behavior.
Moreover it is different from way git itself treats /etc/git.conf. It
reads in stuff from /etc/git.conf and then local repos can change or
override things as needed. In fact this is quite beneficial, because
it gives site admins a simple and easy way to give an automatic hint
to a repo about things the admin would like.
On the other hand changing current behavior may lead to the situation,
where something in /etc/gitweb.conf may interfere with unintended
interaction in the local repository. One solution would be to
_require_ to do explicit include; with read_config_file() it is now
easy, as described in gitweb/README (description introduced in this
commit).
But as J.H. noticed we cannot ask people to modify their per-instance
gitweb config file to include system-wide settings, nor we can require
them to do this.
Therefore, as proposed by Junio, for gitweb to have centralized config
elements while retaining backwards compatibility, introduce separate
common system-wide configuration file, by default /etc/gitweb-common.conf
Noticed-by: Drew Northup <drew.northup@maine.edu> Helped-by: John 'Warthog9' Hawley <warthog9@kernel.org> Inspired-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
tests: print failed test numbers at the end of the test run
On modern multi-core processors "make test" is often run in multiple jobs.
If one of them fails the test run does stop, but the concurrently running
tests finish their run. It is rather easy to find out which test failed by
doing a "ls -d t/trash*". But that only works when you don't use the "-i"
option to "make test" because you want to get an overview of all failing
tests. In that case all thrash directories are deleted end and the
information which tests failed is lost.
If one or more tests failed, print a list of them before the test summary:
failed test(s): t1000 t6500
fixed 0
success 7638
failed 3
broken 49
total 7723
This makes it possible to just run the test suite with -i and collect all
failed test scripts at the end for further examination.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test to check that git ls-tree sets non-zero exit code on error.
Expected to fail at this commit, fixed by subsequent commit.
Additional tests of adhoc or uncategorised nature should be added to this
file.
Improved-by: Jens Lehmann <Jens.Lehmann@web.de> Improved-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/tag-contains-ab:
Revert clock-skew based attempt to optimize tag --contains traversal
git skew: a tool to find how big a clock skew exists in the history
default core.clockskew variable to one day
limit "contains" traversals based on commit timestamp
tag: speed up --contains calculation
Kirill Smelkov noticed that post-1.7.6 "git checkout"
started leaking tons of memory. The streaming_write_entry
function properly calls close_istream(), but that function
did not actually free() the allocated git_istream struct.
The git_istream struct is totally opaque to calling code,
and must be heap-allocated by open_istream. Therefore it's
not appropriate for callers to have to free it.
This patch makes close_istream() into "close and de-allocate
all associated resources". We could add a new "free_istream"
call, but there's not much point in letting callers inspect
the istream after close. And this patch's semantics make us
match fopen/fclose, which is well-known and understood.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/gitweb-search:
gitweb: Make git_search_* subroutines render whole pages
gitweb: Clean up code in git_search_* subroutines
gitweb: Split body of git_search into subroutines
gitweb: Check permissions first in git_search
* jl/submodule-add-relurl-wo-upstream:
submodule add: clean up duplicated code
submodule add: allow relative repository path even when no url is set
submodule add: test failure when url is not configured in superproject
* maint:
doc/fast-import: clarify notemodify command
Documentation: minor grammatical fix in rev-list-options.txt
Documentation: git-filter-branch honors replacement refs
remote-curl: Add a format check to parsing of info/refs
git-config: Remove extra whitespaces
The "notemodify" fast-import command was introduced in commit a8dd2e7
(fast-import: Add support for importing commit notes, 2009-10-09)
The commit log has slightly different description than the added
documentation. The latter is somewhat confusing. "notemodify" is a
subcommand of "commit" command used to add a note for some commit.
Does this note annotate the commit produced by the "commit" command
or a commit given by it's committish parameter? Which notes tree
does it write notes to?
The exact meaning could be deduced with old description and some
notes machinery knowledge. But let's make it more obvious. This
command is used in a context like "commit refs/notes/test" to
add or rewrite an annotation for a committish parameter. So the
advised way to add notes in a fast-import stream is:
1) import some commits (optional)
2) prepare a "commit" to the notes tree:
2.1) choose notes ref, committer, log message, etc.
2.2) create annotations with "notemodify", where each can refer to
a commit being annotated via a branch name, import mark reference,
sha1 and other expressions specified in the Documentation.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Acked-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reset command creates its reflog entry from argv.
However, it does so after having run parse_options, which
means the only thing left in argv is any non-option
arguments. Thus you would end up with confusing reflog
entries like:
However, we must also consider that some scripts may set
GIT_REFLOG_ACTION before calling reset, and we need to show
their reflog action (with our text appended). For example:
rebase -i (squash): updating HEAD
On top of that, we also set the ORIG_HEAD reflog action
(even though it doesn't generally exist). In that case, the
reset argument is somewhat meaningless, as it has nothing to
do with what's in ORIG_HEAD.
This patch changes the reset reflog code to show:
$GIT_REFLOG_ACTION: updating {HEAD,ORIG_HEAD}
as before, but only if GIT_REFLOG_ACTION is set. Otherwise,
show:
reset: moving to $rev
for HEAD, and:
reset: updating ORIG_HEAD
for ORIG_HEAD (this is still somewhat superfluous, since we
are in the ORIG_HEAD reflog, obviously, but at least we now
mention which command was used to update it).
While we're at it, we can clean up the code a bit:
- Use strbufs to make the message.
- Use the "rev" parameter instead of showing all options.
This makes more sense, since it is the only thing
impacting the writing of the ref.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote-curl: Add a format check to parsing of info/refs
When parsing info/refs, no checks were applied that the file was in
the requried format. Since the file is read from a remote webserver,
this isn't guarenteed to be true. Add a check that the file at least
only contains lines that consist of 40 characters followed by a tab
and then the ref name.
Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a $(uname_S) case for Minix with the correct options.
Minix's linker needs all libraries specified explicitly.
Add NEEDS_SSL_WITH_CURL to add -lssl when using -lcurl.
Add NEEDS_IDN_WITH_CURL to add -lidn when using -lcurl.
When NEEDS_SSL_WITH_CURL is defined and NEEDS_CRYPTO_WITH_SSL
is defined, add -lcrypt to CURL_LIBCURL.
Change OPENSSL_LINK to OPENSSL_LIBSSL in the
NEEDS_CRYPTO_WITH_SSL conditional in the libopenssl
section. Libraries go in OPENSSL_LIBSSL, OPENSSL_LINK
is for linker flags.
Signed-off-by: Thomas Cort <tcort@minix3.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test4012.png test vector file that was originally used for t4012 to
check operations on binary files was later reused in other tests, making
it no longer consistent to name it after a specific test. Rename it to more
generic "test-binary-1.png".
While at it, rename test9200b to "test-binary-2.png" (even though it is
only used by t9200).
Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
transport-helper: implement marks location as capability
Now that the gitdir location is exported as an environment variable
this can be implemented elegantly without requiring any explicit
flushes nor an ad-hoc exchange of values.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
transport-helper: Use capname for refspec capability too
Previously the refspec capability could not be listed as
required or their parsing would break.
Most likely the reason the second hunk wasn't caught is because the
series that added 'refspec' as capability, and the one that added
required capabilities were done in parallel.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently the helper must somehow guess how many import statements to
read before it starts outputting its fast-export stream. This is
because the remote helper infrastructure runs fast-import only once,
so the helper is forced to output one stream for all import commands
it will receive. The only reason this worked in the past was because
only one ref was imported at a time.
Change the semantics of the import statement such that it matches
that of the push statement. That is, the import statement is followed
by a series of import statements that are terminated by a '\n'.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
transport-helper: use the new done feature where possible
In other words, use fast-export --use-done-feature to add a 'done'
command at the end of streams passed to remote helpers' "import"
commands, and teach the remote helpers implementing "export" to use
the 'done' command in turn when producing their streams.
The trailing \n in the protocol signals the helper that the
connection is about to close, allowing it to do whatever cleanup
neccesary.
Previously, the connection would already be closed by the
time the trailing \n was to be written. Now that the remote-helper
protocol uses the new done command in its fast-import streams, this
is no longer the case and we can safely write the trailing \n.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
transport-helper: check status code of finish_command
Previously the status code of all helpers were ignored, allowing
errors that occur to go unnoticed if the error text output by the
helper is not noticed.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If fast-export is being used to generate a fast-import stream that
will be used afterwards it is desirable to indicate the end of the
stream with the new 'done' command.
Add a flag that causes fast-export to end with 'done'.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a 'done' command that causes fast-import to stop reading from the
stream and exit.
If the new --done command line flag was passed on the command line
(or a "feature done" declaration included at the start of the stream),
make the 'done' command mandatory. So "git fast-import --done"'s
input format will be prefix-free, making errors easier to detect when
they show up as early termination at some convenient time of the
upstream of a pipe writing to fast-import.
Another possible application of the 'done' command would to be allow a
fast-import stream that is only a small part of a larger encapsulating
stream to be easily parsed, leaving the file offset after the "done\n"
so the other application can pick up from there. This patch does not
teach fast-import to do that --- fast-import still uses buffered input
(stdio).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If fast-export did not complete successfully the error handling code
itself would error out.
This was broken in commit 23b093ee0 (Brandon Casey, Wed Jun 9 2010,
Remove python 2.5'isms). Revert that commit an introduce our own copy
of check_call in util.py instead.
Tested by changing 'if retcode' to 'if not retcode' temporarily.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-remote-testgit: only push for non-local repositories
Trying to push for local repositories will fail since there is no
local checkout in .git/info/... to push from as that is only used for
non-local repositories (local repositories are pushed to directly).
This went unnoticed because the transport helper infrastructure does
not check the return value of the helper.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This went unnoticed because the transport helper infrastructore did
not check the return value of the helper, nor did the helper print
anything before exiting.
While at it also make sure that the stream doesn't end unexpectedly.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote-helpers: export GIT_DIR variable to helpers
The gitdir capability is recognized by git and can be used to tell
the helper where the .git directory is. But it is not mentioned in
the documentation and considered worse than if gitdir was passed
via GIT_DIR environment variable.
Remove support for the gitdir capability and export GIT_DIR instead.
Teach testgit to use env instead of the now-removed gitdir command.
[sr: fixed up documentation]
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
transport-helper: don't feed bogus refs to export push
When we want to push to a remote helper that has the
"export" capability, we collect all of the refs we want to
push and then feed them to fast-export.
However, the list of refs is actually a list of remote refs,
not local refs. The mapped local refs are included via the
peer_ref pointer. So when we add an argument to our
fast-export command line, we must be sure to use the local
peer_ref name (and if there is no local name, it is because
we are not actually sending that ref, or we may not even
have the ref at all).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Upon receiving an "import" command, the testgit remote
helper would ignore the ref asked for by git and generate a
fast-export stream based on HEAD. Instead, we should
actually give git the ref it asked for.
This requires adding a new parameter to the export_repo
method in the remote-helpers python library, which may be
used by code outside of git.git. We use a default parameter
so that callers without the new parameter will get the same
behavior as before.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t5800: document some non-functional parts of remote helpers
These are all things one might expect to work in a helper
that is capable of handling multiple branches (which our
testgit helper in theory should be able to do, as it is
backed by git). All of these bugs are specific to the
import/export codepaths, so they don't affect helpers like
git-remote-curl that use fetch/push commands.
The first and fourth tests are about fetching and pushing
new refs, and demonstrate bugs in the git_remote_helpers
library (so they would be most likely to impact helpers for
other VCSs which import/export git).
The second test is about importing multiple refs; it
demonstrates a bug in git-remote-testgit, which is mostly
for exercising the test code. Therefore it probably doesn't
affect anyone in practice.
The third test demonstrates a bug in git's side of the
helper code when the upstream has added refs that we do not
have locally. This could impact git users who use remote
helpers to access foreign VCSs.
All of those bugs have fixes later in this series.
The fifth test is the most complex, and does not have a fix
in this series. It tests pushing a ref via the export
mechanism to a new name on the remote side (i.e.,
"git push $remote old:new").
The problem is that we push all of the work of generating
the export stream onto fast-export, but we have no way of
communicating to fast-export that this name mapping is
happening. So we tell fast-export to generate a stream with
the commits for "old", but we can't tell it to label them
all as "new".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are a little hard to read, and I'm about to add more
just like them. Plus the failure output is nicer if we use
test_cmp than a comparison with "test".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/index-pack:
verify-pack: use index-pack --verify
index-pack: show histogram when emulating "verify-pack -v"
index-pack: start learning to emulate "verify-pack -v"
index-pack: a miniscule refactor
index-pack --verify: read anomalous offsets from v2 idx file
write_idx_file: need_large_offset() helper function
index-pack: --verify
write_idx_file: introduce a struct to hold idx customization options
index-pack: group the delta-base array entries also by type
* jn/mime-type-with-params:
gitweb: Serve */*+xml 'blob_plain' as text/plain with $prevent_xss
gitweb: Serve text/* 'blob_plain' as text/plain with $prevent_xss
* jk/clone-cmdline-config:
clone: accept config options on the command line
config: make git_config_parse_parameter a public function
remote: use new OPT_STRING_LIST
parse-options: add OPT_STRING_LIST helper
* jk/maint-config-param:
config: use strbuf_split_str instead of a temporary strbuf
strbuf: allow strbuf_split to work on non-strbufs
config: avoid segfault when parsing command-line config
config: die on error in command-line config
fix "git -c" parsing of values with equals signs
strbuf_split: add a max parameter
* jc/zlib-wrap:
zlib: allow feeding more than 4GB in one go
zlib: zlib can only process 4GB at a time
zlib: wrap deflateBound() too
zlib: wrap deflate side of the API
zlib: wrap inflateInit2 used to accept only for gzip format
zlib: wrap remaining calls to direct inflate/inflateEnd
zlib wrapper: refactor error message formatter
* ak/gcc46-profile-feedback:
Add explanation of the profile feedback build to the README
Add profile feedback build to git
Add option to disable NORETURN
IPv6 hosts are often unreachable on the primarily IPv4 Internet and
therefore we shouldn't print an error if there are still other hosts we
can try to connect() to. This helps "git fetch --quiet" stay quiet.
Signed-off-by: Dave Zarzycki <zarzycki@apple.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description for 'git rebase --abort' currently says:
Restore the original branch and abort the rebase operation.
The "restore" can be misinterpreted to imply that the original branch
was somehow in a broken state during the rebase operation. It is also
not completely clear what "the original branch" is --- is it the
branch that was checked out before the rebase operation was called or
is the the branch that is being rebased (it is the latter)? Although
both issues are made clear in the DESCRIPTION section, let us also
make the entry in the OPTIONS secion more clear.
Also remove the term "rebasing process" from the usage text, since the
user already knows that the text is about "git rebase".
Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff-lib: refactor run_diff_index() and do_diff_cache()
The latter is meant to be an API for internal callers that want to inspect
the resulting diff-queue, while the former is an implementation of "git
diff-index" command. Extract the common logic into a single helper
function and make them thin wrappers around it.
Since 34110cd (Make 'unpack_trees()' have a separate source and
destination index, 2008-03-06), we can run unpack_trees() without munging
the index at all, but do_diff_cache() tried ever so carefully to work
around the old behaviour of the function.
We can just tell unpack_trees() not to touch the original index and there
is no need to clean-up whatever the previous round has done.
reset [<commit>] paths...: do not mishandle unmerged paths
Because "diff --cached HEAD" showed an incorrect blob object name on the
LHS of the diff, we ended up updating the index entry with bogus value,
not what we read from the tree.
* bc/submodule-foreach-stdin-fix-1.7.4:
git-submodule.sh: preserve stdin for the command spawned by foreach
t/t7407: demonstrate that the command called by 'submodule foreach' loses stdin
* nk/ref-doc:
glossary: clarify description of HEAD
glossary: update description of head and ref
glossary: update description of "tag"
git.txt: de-emphasize the implementation detail of a ref
check-ref-format doc: de-emphasize the implementation detail of a ref
git-remote.txt: avoid sounding as if loose refs are the only ones in the world
git-remote.txt: fix wrong remote refspec
* rj/config-cygwin:
config.c: Make git_config() work correctly when called recursively
t1301-*.sh: Fix the 'forced modes' test on cygwin
help.c: Fix detection of custom merge strategy on cygwin
* fg/submodule-keep-updating:
git-submodule.sh: clarify the "should we die now" logic
submodule update: continue when a checkout fails
git-sh-setup: add die_with_status
* an/shallow-doc:
Document the underlying protocol used by shallow repositories and --depth commands.
Fix documentation of fetch-pack that implies that the client can disconnect after sending wants.
10c4c88 (Allow add_path() to add non-existent directories to the path,
2008-07-21) introduced get_pwd_cwd() function in order to favor $PWD when
getenv("PWD") and getcwd() refer to the same directory but are different
strings (e.g. the former gives a nicer looking name via a symbolic link to
an uglier looking automounted path). The function tried to determine if
two directories are the same by running stat(2) on both and comparing
ino/dev fields.
Unfortunately, stat() does not fill any ino or dev fields in msysgit. But
there is a telltale: both ino and dev are 0 when they are not filled
correctly, so let's be extra cautious.
This happens to fix a bug in "get-receive-pack working_directory/" when
the GIT_DIR would not be set correctly due to absolute_path(".")
returning the wrong value.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ref namespaces: Support remote repositories via upload-pack and receive-pack
Change upload-pack and receive-pack to use the namespace-prefixed refs
when working with the repository, and use the unprefixed refs when
talking to the client, maintaining the masquerade. This allows
clone, pull, fetch, and push to work with a suitably configured
GIT_NAMESPACE.
receive-pack advertises refs outside the current namespace as .have refs
(as it currently does for refs in alternates), so that the client can
use them to minimize data transfer but will otherwise ignore them.
With appropriate configuration, this also allows http-backend to expose
namespaces as multiple repositories with different paths. This only
requires setting GIT_NAMESPACE, which http-backend passes through to
upload-pack and receive-pack.
Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Jamey Sharp <jamey@minilop.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This optimizes the "recency order" (see pack-heuristics.txt in
Documentation/technical/ directory) used to order objects within a
packfile in three ways:
- Commits at the tip of tags are written together, in the hope that
revision traversal done in incremental fetch (which starts by
putting them in a revision queue marked as UNINTERESTING) will see a
better locality of these objects;
- In the original recency order, trees and blobs are intermixed. Write
trees together before blobs, in the hope that this will improve
locality when running pathspec-limited revision traversal, i.e.
"git log paths...";
- When writing blob objects out, write the whole family of blobs that use
the same delta base object together, by starting from the root of the
delta chain, and writing its immediate children in a width-first
manner, in the hope that this will again improve locality when reading
blobs that belong to the same path, which are likely to be deltified
against each other.
I tried various workloads in the Linux kernel repositories (HEAD at v3.0-rc6-71-g4dd1b49) packed with v1.7.6 and with this patch, counting how
large seeks are needed between adjacent accesses to objects in the pack,
and the result looks promising. The history has 2072052 objects, weighing
some 490MiB.
95% of the pack accesses look at data that is no further than 260kB
from the previous location we accessed. The patch does not change the
order of commit objects very much, and the result is very similar.
* Pathspec-limited log.
$ git log drivers/net >/dev/null
The path is touched by 26551 commits and merges (among 254656 total).
With the current pack heuristics, more than 30% of accesses have to
seek further than 300MB; the updated pack heuristics ensures that less
than 0.1% of accesses have to seek further than 135MB. This is largely
due to the fact that the updated heuristics does not mix blobs and
trees together.
The long-tail part of the result looks worse with the patch, but
the change helps majority of the access. 99.04% of the accesses
need less than 2MiB of seeking, compared to 77.09% with the current
packing heuristics.
As the index-pack command already walks objects in the delta chain
order, writing the blobs out in the delta chain order seems to
drastically improve the locality of access.
Note that a half-a-gigabyte packfile comfortably fits in the buffer cache,
and you would unlikely to see much performance difference on a modern and
reasonably beefy machine with enough memory and local disks. Benchmarking
with cold cache (or over NFS) would be interesting.
When executing an external shell script like `git foo` with a bad
shebang, e.g. "#!/usr/bin/not/existing", execvp returns 127 (ENOENT).
Since help_unknown_cmd proposes the use of all external commands similar
to the name of the "unknown" command, it suggests the just failed command
again. Stop it and give some advice to the user.
Helped-by: Jeff King <peff@peff.net> Signed-off-by: Michael Schubert <mschub@elegosoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a workload other than "git log" (without pathspec nor any option that
causes us to inspect trees and blobs), the recency pack order is said to
cause the access jump around quite a bit. Add a hook to allow us observe
how bad it is.
"git config core.logpackaccess /var/tmp/pal.txt" will give you the log
in the specified file.
* commit 'v1.7.6': (3211 commits)
Git 1.7.6
completion: replace core.abbrevguard to core.abbrev
Git 1.7.6-rc3
Documentation: git diff --check respects core.whitespace
gitweb: 'pickaxe' and 'grep' features requires 'search' to be enabled
t7810: avoid unportable use of "echo"
plug a few coverity-spotted leaks
builtin/gc.c: add missing newline in message
tests: link shell libraries into valgrind directory
t/Makefile: pass test opts to valgrind target properly
sh-i18n--envsubst.c: do not #include getopt.h
Fix typo: existant->existent
Git 1.7.6-rc2
gitweb: do not misparse nonnumeric content tag files that contain a digit
Git 1.7.6-rc1
fetch: do not leak a refspec
t3703: skip more tests using colons in file names on Windows
gitweb: Fix usability of $prevent_xss
gitweb: Move "Requirements" up in gitweb/INSTALL
gitweb: Describe CSSMIN and JSMIN in gitweb/INSTALL
...
* commit 'v1.7.0': (4188 commits)
Git 1.7.0
Fix typo in 1.6.6.2 release notes
Re-fix check-ref-format documentation mark-up
archive documentation: attributes are taken from the tree by default
Documentation: minor fixes to RelNotes-1.7.0
bash: support 'git am's new '--continue' option
filter-branch: Fix error message for --prune-empty --commit-filter
am: switch --resolved to --continue
Update draft release notes to 1.7.0 one more time
Git 1.6.6.2
t8003: check exit code of command and error message separately
check-ref-format documentation: fix enumeration mark-up
Documentation: quote braces in {upstream} notation
t3902: Protect against OS X normalization
blame: prevent a segv when -L given start > EOF
git-push: document all the status flags used in the output
Fix parsing of imap.preformattedHTML and imap.sslverify
git-add documentation: Fix shell quoting example
Revert "pack-objects: fix pack generation when using pack_size_limit"
archive: simplify archive format guessing
...
* commit 'v1.6.0': (2063 commits)
GIT 1.6.0
git-p4: chdir now properly sets PWD environment variable in msysGit
Improve error output of git-rebase
t9300: replace '!' with test_must_fail
Git.pm: Make File::Spec and File::Temp requirement lazy
Documentation: document the pager.* configuration setting
git-stash: improve synopsis in help and manual page
Makefile: building git in cygwin 1.7.0
git-am: ignore --binary option
bash-completion: Add non-command git help files to bash-completion
Fix t3700 on filesystems which do not support question marks in names
Utilise our new p4_read_pipe and p4_write_pipe wrappers
Add p4 read_pipe and write_pipe wrappers
bash completion: Add '--merge' long option for 'git log'
bash completion: Add completion for 'git mergetool'
git format-patch documentation: clarify what --cover-letter does
bash completion: 'git apply' should use 'fix' not 'strip'
t5304-prune: adjust file mtime based on system time rather than file mtime
test-parse-options: use appropriate cast in length_callback
Fix escaping of glob special characters in pathspecs
...
As resolve_ref() returns a static buffer that is local to the function,
the caller needs to be sure that it will not have any other calls to the
function before it uses the returned value, or store it away with a
strdup(). The code used old.path to record which branch it used to be on,
so that it can say between which branches the switch took place in the
reflog, but sometimes it failed to do so.
The SYNOPSIS sections of most commands that span several lines already
use [verse] to retain line breaks. Most commands that don't span
several lines seem not to use [verse]. In the HTML output, [verse]
does not only preserve line breaks, but also makes the section
indented, which causes a slight inconsistency between commands that
use [verse] and those that don't. Use [verse] in all SYNOPSIS sections
for consistency.
Also remove the blank lines from git-fetch.txt and git-rebase.txt to
align with the other man pages. In the case of git-rebase.txt, which
already uses [verse], the blank line makes the [verse] not apply to
the last line, so removing the blank line also makes the formatting
within the document more consistent.
While at it, add single quotes to 'git cvsimport' for consistency with
other commands.
Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>