* maint:
fix index-pack with packs >4GB containing deltas on 32-bit machines
git-hash-object should honor config variables
gitweb: correct month in date display for atom feeds
strchrnul() was introduced in glibc in April 1999 and included in
glibc-2.1. Checking for that version means the majority of all git
users would get to use the optimized version in glibc. Of the
remaining few some might get to use a slightly slower version
than necessary but probably not slower than what we have today.
Unfortunately, __GLIBC_PREREQ() macro was not available in glibc 2.1.1
which was short lived but already supported strchrnul(). Odd minority
users of that library needs to live with our compatibility inline version.
Rediffed-against-next-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fix index-pack with packs >4GB containing deltas on 32-bit machines
This probably hasn't been properly tested before. Here's a script to
create a 8GB repo with the necessary characteristics (copy the
test-genrandom executable from the Git build tree to /tmp first):
-----
#!/bin/bash
git init
git config core.compression 0
# create big objects with no deltas
for i in $(seq -w 1 2 63)
do
echo $i
/tmp/test-genrandom $i 268435456 > file_$i
git add file_$i
rm file_$i
echo "file_$i -delta" >> .gitattributes
done
# create "deltifiable" objects in between big objects
for i in $(seq -w 2 2 64)
do
echo "$i $i $i" >> grow
cp grow file_$i
git add file_$i
rm file_$i
done
rm grow
# create a pack with them
git commit -q -m "commit of big objects interlaced with small deltas"
git repack -a -d
-----
Then clone this repo over the Git protocol.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
As Jeff King remarked, format strings with duplicate placeholders can
be slow to expand, because each instance is calculated anew.
This patch makes use of the fact that format_commit_message() and its
helper functions only ever add stuff to the end of the strbuf. For
certain expensive placeholders, store the offset and length of their
expansion with the strbuf at the first occurrence. Later they
expansion result can simply be copied from there -- no malloc() or
strdup() required.
These certain placeholders are the abbreviated commit, tree and
parent hashes, as the search for a unique abbreviated hash is quite
costly. Here are the times for next (best of three runs):
$ time git log --pretty=format:%h >/dev/null
real 0m0.611s
user 0m0.404s
sys 0m0.204s
$ time git log --pretty=format:%h%h%h%h >/dev/null
real 0m1.206s
user 0m0.744s
sys 0m0.452s
And here those with this patch (and the previous two); the speedup
of the single placeholder case is just noise:
$ time git log --pretty=format:%h >/dev/null
real 0m0.608s
user 0m0.416s
sys 0m0.192s
$ time git log --pretty=format:%h%h%h%h >/dev/null
real 0m0.639s
user 0m0.488s
sys 0m0.140s
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
As Jeff King pointed out, some placeholder expansions are related to
each other: the steps to calculate one go most of the way towards
calculating the other, too.
This patch makes format_commit_message() parse the commit message
only once, remembering the position of each item. This speeds up
handling of format strings containing multiple placeholders from the
set %s, %a*, %c*, %e, %b.
Here are the timings for the git version in next. The first one is
to estimate the overhead of the caching, the second one is taken
from http://svn.tue.mpg.de/tentakel/trunk/tentakel/Makefile as an
example of a format string found in the wild. The times are the
fastest of three consecutive runs in each case:
$ time git log --pretty=format:%e >/dev/null
real 0m0.381s
user 0m0.340s
sys 0m0.024s
$ time git log --pretty=format:"* %cd %cn%n%n%s%n%b" >/dev/null
real 0m0.623s
user 0m0.556s
sys 0m0.052s
And here the times with this patch:
$ time git log --pretty=format:%e >/dev/null
real 0m0.385s
user 0m0.332s
sys 0m0.040s
$ time git log --pretty=format:"* %cd %cn%n%n%s%n%b" >/dev/null
real 0m0.563s
user 0m0.504s
sys 0m0.048s
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-lib.sh: move error line after error() declaration
This patch removes a spurious "command not found" error
and actually makes the "Test script did not set test_description."
string follow the command line option "--no-color".
Signed-off-by: Michele Ballabio <barra_cuda@katamail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we were listing objects too then the objects were buffered in an
array only reachable from a stack allocated structure. When this
function returns that array would be leaked as nobody would have
a reference to it anymore.
Historically this hasn't been a problem as the primary user of
traverse_commit_list() (the noble git-rev-list) would terminate
as soon as the function was finished, thus allowing the operating
system to cleanup memory. However we have been leaking this data
in git-pack-objects ever since that program learned how to run the
revision listing internally, rather than relying on reading object
names from git-rev-list.
To better facilitate reuse of traverse_commit_list during other
builtin tools (such as git-fetch) we shouldn't leak temporary memory
like this and instead we need to clean up properly after ourselves.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-add: make the entry stat-clean after re-adding the same contents
Earlier in commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe
(add_file_to_index: skip rehashing if the cached stat already
matches), add_file_to_index() were taught not to re-add the path
if it already matches the index.
The change meant well, but was not executed quite right. It
used ie_modified() to see if the file on the work tree is really
different from the index, and skipped adding the contents if the
function says "not modified".
This was wrong. There are three possible comparison results
between the index and the file in the work tree:
- with lstat(2) we _know_ they are different. E.g. if the
length or the owner in the cached stat information is
different from the length we just obtained from lstat(2), we
can tell the file is modified without looking at the actual
contents.
- with lstat(2) we _know_ they are the same. The same length,
the same owner, the same everything (but this has a twist, as
described below).
- we cannot tell from lstat(2) information alone and need to go
to the filesystem to actually compare.
The last case arises from what we call 'racy git' situation,
that can be caused with this sequence:
$ echo hello >file
$ git add file
$ echo aeiou >file ;# the same length
If the second "echo" is done within the same filesystem
timestamp granularity as the first "echo", then the timestamp
recorded by "git add" and the timestamp we get from lstat(2)
will be the same, and we can mistakenly say the file is not
modified. The path is called 'racily clean'. We need to
reliably detect racily clean paths are in fact modified.
To solve this problem, when we write out the index, we mark the
index entry that has the same timestamp as the index file itself
(that is the time from the point of view of the filesystem) to
tell any later code that does the lstat(2) comparison not to
trust the cached stat info, and ie_modified() then actually goes
to the filesystem to compare the contents for such a path.
That's all good, but it should not be used for this "git add"
optimization, as the goal of "git add" is to actually update the
path in the index and make it stat-clean. With the false
optimization, we did _not_ cause any data loss (after all, what
we failed to do was only to update the cached stat information),
but it made the following sequence leave the file stat dirty:
$ echo hello >file
$ git add file
$ echo hello >file ;# the same contents
$ git add file
The solution is not to use ie_modified() which goes to the
filesystem to see if it is really clean, but instead use
ie_match_stat() with "assume racily clean paths are dirty"
option, to force re-adding of such a path.
There was another problem with "git add -u". The codepath
shares the same issue when adding the paths that are found to be
modified, but in addition, it asked "git diff-files" machinery
run_diff_files() function (which is "git diff-files") to list
the paths that are modified. But "git diff-files" machinery
uses the same ie_modified() call so that it does not report
racily clean _and_ actually clean paths as modified, which is
not what we want.
The patch allows the callers of run_diff_files() to pass the
same "assume racily clean paths are dirty" option, and makes
"git-add -u" codepath to use that option, to discover and re-add
racily clean _and_ actually clean paths.
We could further optimize on top of this patch to differentiate
the case where the path really needs re-adding (i.e. the content
of the racily clean entry was indeed different) and the case
where only the cached stat information needs to be refreshed
(i.e. the racily clean entry was actually clean), but I do not
think it is worth it.
This makes it possible to mark commands that are deprecated in the
command list of the primary manual page git(7), and uses it to
mark "git lost-found" and "git tar-tree" as deprecated.
The interactive version of rebase does all the operations on a detached
HEAD, so that after a successful rebase, <branch>@{1} is the pre-rebase
state. The reflogs of "HEAD" still show all the actions in detail.
This teaches the non-interactive version to do the same.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of the --pretty=format placeholders expansions are expensive to
calculate. This is made worse by the current code's use of
interpolate(), which requires _all_ placeholders are to be prepared
up front.
One way to speed this up is to check which placeholders are present
in the format string and to prepare only the expansions that are
needed. That still leaves the allocation overhead of interpolate().
Another way is to use a callback based approach together with the
strbuf library to keep allocations to a minimum and avoid string
copies. That's what this patch does. It introduces a new strbuf
function, strbuf_expand().
The function takes a format string, list of placeholder strings,
a user supplied function 'fn', and an opaque pointer 'context'
to tell 'fn' what thingy to operate on.
The function 'fn' is expected to accept a strbuf, a parsed
placeholder string and the 'context' pointer, and append the
interpolated value for the 'context' thingy, according to the
format specified by the placeholder.
Thanks to Pierre Habouzit for his suggestion to use strchrnul() and
the code surrounding its callsite. And thanks to Junio for most of
this commit message. :)
Here my measurements of most of Paul Mackerras' test cases that
highlighted the performance problem (best of three runs):
(master)
$ time git log --pretty=oneline >/dev/null
real 0m0.390s
user 0m0.340s
sys 0m0.040s
(master)
$ time git log --pretty=raw >/dev/null
real 0m0.434s
user 0m0.408s
sys 0m0.016s
(master)
$ time git log --pretty="format:%H {%P} %ct" >/dev/null
As suggested by Pierre Habouzit, add strchrnul(). It's a useful GNU
extension and can simplify string parser code. There are several
places in git that can be converted to strchrnul(); as a trivial
example, this patch introduces its usage to builtin-fetch--tool.c.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-cvsimport: fix handling of user name when it is not set in CVSROOT
The cvs programs do not default to "anonymous" as the user name, but use the
currently logged in user. This patch more closely matches the cvs behavior.
Signed-off-by: Gordon Hopper <g.hopper@computer.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Start preparing for 1.5.3.6
git-send-email: Change the prompt for the subject of the initial message.
SubmittingPatches: improve the 'Patch:' section of the checklist
instaweb: Minor cleanups and fixes for potential problems
stop t1400 hiding errors in tests
Makefile: add missing dependency on wt-status.h
refresh_index_quietly(): express "optional" nature of index writing better
Fix sed string regex escaping in module_name.
Avoid a few unportable, needlessly nested "...`...".
git-mailsplit: with maildirs not only process cur/, but also new/
SubmittingPatches: improve the 'Patch:' section of the checklist
There were 2 items "send patch to..." but having different set of
addresses to send patch to. Merge them together and move the resulting
item to the end of checklist.
Signed-off-by: Sergei Organov <osv@javad.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refresh_index_quietly(): express "optional" nature of index writing better
The point of the part of the code this patch touches is that if
we modified the active_cache, we try to write it out and make it
the index file for later users to use by calling
"commit_locked_index", but we do not really care about the
failure from this sequence because it is done purely as an
optimization.
The original code called three functions primarily for their
side effects but as condition of an if statement, which is
admittedly a bad style.
Incidentally, it squelches an "empty if body" warning from gcc.
When escaping a string to be used as a sed regex, it is important
to only escape active characters. Escaping other characters is
undefined according to POSIX, and in practice leads to issues with
extensions such as GNU sed's \+.
Signed-off-by: Ralf Wildenhues <Ralf.Wildenhues@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the same spirit of prettifying Git's output display for mere mortals,
here's a simple extension to the progress API allowing for a final
message to be provided when terminating a progress line, and use it for
the display of the number of objects needed to complete a thin pack,
saving yet one more line of screen display.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
hooks--update: decline deleting tags or branches by default, add config options
Decline deleting tags or branches through git push <remote> :<ref> by
default, support config options hooks.allowdeletetag, hooks.allowdeletebranch
to override this per repository.
Before this patch the update hook interpreted deleting a tag, no matter if
annotated or not, through git push <remote> :<tag> as unannotated tag, and
declined it by default, but with an unappropriate error message:
$ git push origin :atag
deleting 'refs/tags/atag'
*** The un-annotated tag, atag, is not allowed in this repository
*** Use 'git tag [ -a | -s ]' for tags you want to propagate.
ng refs/tags/atag hook declined
error: hooks/update exited with error code 1
error: hook declined to update refs/tags/atag
error: failed to push to 'monolith:/git/qm/test-repo'
Signed-off-by: Gerrit Pape <pape@smarden.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
hooks--update: fix test for properly set up project description file
The update hook template intends to abort if the project description file
hasn't been adjusted or is empty. This patch fixes the check for 'being
adjusted'.
Signed-off-by: Gerrit Pape <pape@smarden.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-mailsplit: with maildirs not only process cur/, but also new/
When saving patches to a maildir with e.g. mutt, the files are put into
the new/ subdirectory of the maildir, not cur/. This makes git-am state
"Nothing to do.". This patch lets git-mailsplit additional check new/
after reading cur/.
This was reported by Joey Hess through
http://bugs.debian.org/447396
Signed-off-by: Gerrit Pape <pape@smarden.org> Acked-by: Jeff King <peff@peff.net> Acked-by: Alex Riesen <raa.lkml@gmail.com> Acked-by: Fernando J. Pereda <ferdy@gentoo.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'automagic parseopt' support corrupted non option parameters
that had IFS characters in them. The worst case is when it had
a non option parameter like this:
* mh/work-tree:
Make git-blame fail when working tree is needed and we're not in one
Don't always require working tree for git-rm
Use setup_work_tree() in builtin-ls-files.c
Refactor working tree setup
Avoid "test -o" as it is only XSI not POSIX, and not portable.
Avoid exit(3) in test programs in favor of return, to accommodate
for newer Autoconf not providing a declaration for exit.
Improve accuracy of check for presence of deflateBound.
ZLIB_VERNUM isn't defined in some zlib versions, so this patch does a proper
linking test in autoconf to see whether deflateBound exists in zlib. Also,
setting NO_DEFLATE_BOUND will also work for folk not using autoconf.
Signed-off-by: David Symonds <dsymonds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even if our code is quite a good documentation for our coding style,
some people seem to prefer a document describing it.
The part about the shell scripts is clearly just copied from one of
Junio's helpful mails, and some parts were added from comments by
Junio, Andreas Ericsson and Robin Rosenberg.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The minimum delay of 1/2 sec between successive throughput updates might
not have been elapsed when display_throughput() is called for the last
time, potentially making the display of total transferred bytes not
right when progress is said to be done.
Let's force an update of the throughput display as well when the
progress is complete. As a side effect, the total transferred will
always be displayed even if the actual transfer rate doesn't have time
to kickin.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
--text follows this line--
These commands currently lack OPTIONS_SPEC; allow people to
easily list with "git grep 'OPTIONS_SPEC=$'" what they can help
improving.
Update git-sh-setup(1) to allow transparent use of git-rev-parse --parseopt
If you set OPTIONS_SPEC, git-sh-setups uses git-rev-parse --parseopt
automatically.
It also diverts usage to re-exec $0 with the -h option as parse-options.c
will catch that.
If you need git-rev-parse --parseopt to keep the `--` the user may have
passed to your command, set OPTIONS_KEEPDASHDASH to a non empty value
in your script.
Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows to do git rm --cached -r directory, instead of
git ls-files -z directory | git update-index --remove -z --stdin.
This can be particularly useful for git-filter-branch users.
Signed-off-by: Mike Hommey <mh@glandium.org> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Mike Hommey <mh@glandium.org> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a setup_work_tree() that can be used from any command requiring
a working tree conditionally.
Signed-off-by: Mike Hommey <mh@glandium.org> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests check whether git-tag properly sends a comment into the
editor, and whether it reuses previous annotation when overwriting
an existing tag.
Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
upload-pack: Use finish_{command,async}() instead of waitpid().
upload-pack spawns two processes, rev-list and pack-objects, and carefully
monitors their status so that it can report failure to the remote end.
This change removes the complicated procedures on the grounds of the
following observations:
- If everything is OK, rev-list closes its output pipe end, upon which
pack-objects (which reads from the pipe) sees EOF and terminates itself,
closing its output (and error) pipes. upload-pack reads from both until
it sees EOF in both. It collects the exit codes of the child processes
(which indicate success) and terminates successfully.
- If rev-list sees an error, it closes its output and terminates with
failure. pack-objects sees EOF in its input and terminates successfully.
Again upload-pack reads its inputs until EOF. When it now collects
the exit codes of its child processes, it notices the failure of rev-list
and signals failure to the remote end.
- If pack-objects sees an error, it terminates with failure. Since this
breaks the pipe to rev-list, rev-list is killed with SIGPIPE.
upload-pack reads its input until EOF, then collects the exit codes of
the child processes, notices their failures, and signals failure to the
remote end.
- If upload-pack itself dies unexpectedly, pack-objects is killed with
SIGPIPE, and subsequently also rev-list.
The upshot of this is that precise monitoring of child processes is not
required because both terminate if either one of them dies unexpectedly.
This allows us to use finish_command() and finish_async() instead of
an explicit waitpid(2) call.
The change is smaller than it looks because most of it only reduces the
indentation of a large part of the inner loop.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Remove a couple of duplicated include
grep with unmerged index
git-daemon: fix remote port number in log entry
git-svn: t9114: verify merge commit message in test
git-svn: fix dcommit clobbering when committing a series of diffs
git-commit.sh: Fix usage checks regarding paths given when they do not make sense
The checks that looked for paths given to git-commit in addition to
--all or --interactive expected only 3 values, while the case statement
actually provides 4, so the check was never triggered.
We called flush_grep() every time we saw an unmerged entry in
the index. If we happen to find an unmerged entry before we saw
more than two paths, we incorrectly declared that the user had
too many non-paths options in front.