gitweb.git
repack: simplify handling of --write-bitmap-index
Jeff King, Tue, 10 Jun 2014 20:19:38 +0000 (16:19 -0400)

We previously needed to pass --no-write-bitmap-index
explicitly to pack-objects to override its reading of
pack.writebitmaps from the config. Now that it no longer
does so, we can assume that bitmaps are off by default, and
only turn them on when necessary. This also lets us avoid a
confusing tri-state flag for write_bitmaps.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: stop respecting pack.writebitmaps
Jeff King, Tue, 10 Jun 2014 20:19:13 +0000 (16:19 -0400)

The handling of the pack.writebitmaps config option
originally happened in pack-objects, which is quite
low-level. It would make more sense for drivers of
pack-objects to read the config, and then manipulate
pack-objects with command-line options.

Recently, repack learned to do so, making the low-level read
of pack.writebitmaps redundant here. Other callers, like
upload-pack, would not generally want to write bitmaps
anyway.

This could be considered a regression for somebody who is
driving pack-objects themselves outside of repack and
expects the config option to be used. However, such users
seem rather unlikely given how new the bitmap code is (and
the fact that they would basically be reimplementing repack
in the first place).

Note that we do not do anything with pack.writeBitmapHashCache
here. That option is not about "do we write bitmaps", but
rather "when we are writing bitmaps, how do we do it?". You
would want that to kick in anytime you decide to write them,
similar to how pack.indexVersion is used.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repack: s/write_bitmap/&s/ in code
Jeff King, Tue, 10 Jun 2014 20:10:07 +0000 (16:10 -0400)

The config name is "writeBitmaps", so the internal variable
missing the plural is unnecessarily confusing to write.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repack: respect pack.writebitmaps
Jeff King, Tue, 10 Jun 2014 20:09:23 +0000 (16:09 -0400)

The config option to turn on bitmaps is read all the way
down in the plumbing of pack-objects. This makes it hard for
other options in the porcelain of repack to make decisions
based on the bitmap setting. For example,
repack.packKeptObjects tries to kick in by default only when
bitmaps are turned on. But it can't do so reliably because
it doesn't yet know whether we are using bitmaps.

This patch teaches repack to respect pack.writebitmaps. It
means we pass a redundant command-line flag to pack-objects,
but that's OK; it shouldn't affect the outcome.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repack: do not accidentally pack kept objects by default
Jeff King, Tue, 10 Jun 2014 20:08:38 +0000 (16:08 -0400)

Commit ee34a2b (repack: add `repack.packKeptObjects` config
var, 2014-03-03) added a flag which could duplicate kept
objects, but did not mean to turn it on by default. Instead,
the option is tied by default to the decision to write
bitmaps, like:

    if (pack_kept_objects < 0)
        pack_kept_objects = write_bitmap;

after which we expect pack_kept_objects to be a boolean 0 or
1. However, that assignment neglects that write_bitmap is
_also_ a tri-state with "-1" as the default, and with
neither option given, we accidentally turn the option on.
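
To make the failure mode concrete, here is a minimal sketch that contrasts the plain assignment with one way of collapsing the tri-state into a boolean. The function names are hypothetical and exist only for this illustration; they are not taken from repack's code.

```c
/*
 * Hypothetical helper: resolve the default for pack_kept_objects from
 * a write_bitmap flag that may itself still be -1 ("not specified").
 */
int resolve_pack_kept_objects(int pack_kept_objects, int write_bitmap)
{
	if (pack_kept_objects < 0)
		/* Compare instead of copying, so -1 becomes 0, not "true". */
		pack_kept_objects = write_bitmap > 0;
	return pack_kept_objects;
}

/* The plain assignment quoted above, kept for comparison. */
int resolve_pack_kept_objects_buggy(int pack_kept_objects, int write_bitmap)
{
	if (pack_kept_objects < 0)
		pack_kept_objects = write_bitmap;
	return pack_kept_objects;
}
```

With both flags left at -1, the plain assignment yields -1, which any later `if (pack_kept_objects)` test treats as true; the comparison yields 0.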

This patch is the minimal fix to restore the desired
behavior for the default state. Further patches will fix the
more complicated cases.

Note the update to t7700. It failed to turn on bitmaps,
meaning we were actually confirming the wrong behavior!

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repack: add `repack.packKeptObjects` config var
Jeff King, Mon, 3 Mar 2014 20:04:20 +0000 (15:04 -0500)

The git-repack command always passes `--honor-pack-keep`
to pack-objects. This has traditionally been a good thing,
as we do not want to duplicate those objects in a new pack,
and we are not going to delete the old pack.

However, when bitmaps are in use, it is important for a full
repack to include all reachable objects, even if they may be
duplicated in a .keep pack. Otherwise, we cannot generate
the bitmaps, as the on-disk format requires the set of
objects in the pack to be fully closed.

Even if the repository does not generally have .keep files,
a simultaneous push could cause a race condition in which a
.keep file exists at the moment of a repack. The repack may
try to include those objects in one of two situations:

1. The pushed .keep pack contains objects that were
already in the repository (e.g., blobs due to a revert of
an old commit).

2. Receive-pack updates the refs, making the objects
reachable, but before it removes the .keep file, the
repack runs.

In either case, we may prefer to duplicate some objects in
the new, full pack, and let the next repack (after the .keep
file is cleaned up) take care of removing them.

This patch introduces both a command-line and config option
to disable the `--honor-pack-keep` option. By default, it
is triggered when pack.writeBitmaps (or `--write-bitmap-index`)
is turned on, but specifying it explicitly can override the
behavior (e.g., in cases where you prefer .keep files to
bitmaps, but only when they are present).

Note that this option just disables the pack-objects
behavior. We still leave packs with a .keep in place, as we
do not necessarily know that we have duplicated all of their
objects.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ewah: unconditionally ntohll ewah data
Jeff King, Wed, 12 Feb 2014 16:48:28 +0000 (11:48 -0500)

Commit a201c20 tried to optimize out a loop like:

    for (i = 0; i < len; i++)
        data[i] = ntohll(data[i]);

in the big-endian case, because we know that ntohll is a
noop, and we do not need to pay the cost of the loop at all.
However, it mistakenly assumed that __BYTE_ORDER was always
defined, whereas it may not be on systems which do not
define it by default, and where we did not need to define it
to set up the ntohll macro. This includes OS X and Windows.

We could muck with the ordering in compat/bswap.h to make
sure it is defined unconditionally, but it is simpler
still to just execute the loop unconditionally. That avoids
the application code knowing anything about these magic
macros, and lets it depend only on having ntohll defined.

And since the resulting loop looks like (on a big-endian
system):

    for (i = 0; i < len; i++)
        data[i] = data[i];

any decent compiler can probably optimize it out.
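
As a rough illustration of why the application code need not know the byte order, the sketch below runs the loop unconditionally over a conversion that is written without any byte-order macros. `sketch_ntohll` and `words_to_host` are hypothetical names for this example; they are not the macro from compat/bswap.h.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for ntohll: assemble the host value from the
 * big-endian bytes as they sit in memory. On a big-endian host this
 * is the identity.
 */
uint64_t sketch_ntohll(uint64_t net)
{
	const unsigned char *p = (const unsigned char *)&net;
	uint64_t host = 0;
	int i;

	for (i = 0; i < 8; i++)
		host = (host << 8) | p[i];
	return host;
}

/* Convert every word in place, with no endianness check around the loop. */
void words_to_host(uint64_t *data, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		data[i] = sketch_ntohll(data[i]);
}
```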

Original report and analysis by Brian Gernhardt.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ewah: support platforms that require aligned reads
Vicent Marti, Thu, 23 Jan 2014 21:27:52 +0000 (16:27 -0500)

The caller may hand us an unaligned buffer (e.g., because it
is an mmap of a file with many ewah bitmaps). On some
platforms (like SPARC) this can cause a bus error. We can
fix it with a combination of get_be32 and moving the data
into an aligned buffer (which we would do anyway, but we can
move it before fixing the endianness).

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

read-cache: use get_be32 instead of hand-rolled ntoh_l
Jeff King, Thu, 23 Jan 2014 21:26:42 +0000 (16:26 -0500)

Commit d60c49c (read-cache.c: allow unaligned mapping of the
index file, 2012-04-03) introduced helpers to access
unaligned data. However, we already have get_be32, which has
a few advantages:

1. It's already written, so we avoid duplication.

2. It's probably faster, since it does the endian
conversion and the alignment fix at the same time.

3. The get_be32 code is well-tested, having been in
block-sha1 for a long time. By contrast, our custom
helpers were probably almost never used, since the user
needed to manually define a macro to enable them.

We have to add a get_be16 implementation to the existing
get_be32, but that is very simple to do.
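
For readers unfamiliar with the helpers, the following sketch shows the portable, byte-at-a-time strategy for reading big-endian values from memory that may not be aligned. The `sketch_` names are hypothetical; the real helpers also choose faster strategies on platforms that tolerate unaligned loads, which this example does not attempt.

```c
#include <stdint.h>

/* Read a 32-bit big-endian value one byte at a time; alignment is irrelevant. */
uint32_t sketch_get_be32(const void *ptr)
{
	const unsigned char *p = ptr;

	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) |
	       (uint32_t)p[3];
}

/* The 16-bit variant follows the same pattern with two bytes. */
uint16_t sketch_get_be16(const void *ptr)
{
	const unsigned char *p = ptr;

	return (uint16_t)((p[0] << 8) | p[1]);
}
```

Because each access is a single byte load, the same code is safe on platforms that raise a bus error for misaligned word reads.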

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

block-sha1: factor out get_be and put_be wrappers
Jeff King, Thu, 23 Jan 2014 21:23:09 +0000 (16:23 -0500)

The BLK_SHA1 code has optimized wrappers for doing endian
conversions on memory that may not be aligned. Let's pull
them out so that we can use them elsewhere, especially the
time-tested list of platforms that prefer each strategy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

do not discard revindex when re-preparing packfiles
Jeff King, Wed, 15 Jan 2014 11:17:48 +0000 (06:17 -0500)

When an object lookup fails, we re-read the objects/pack
directory to pick up any new packfiles that may have been
created since our last read. We also discard any pack
revindex structs we've allocated.

The discarding is a problem for the pack-bitmap code, which keeps
a pointer to the revindex for the bitmapped pack. After the
discard, the pointer is invalid, and we may read free()d
memory.

Other revindex users do not keep a bare pointer to the
revindex; instead, they always access it through
revindex_for_pack(), which lazily builds the revindex. So
one solution is to teach the pack-bitmap code a similar
trick. It would be slightly less efficient, but probably not
all that noticeable.

However, it turns out this discarding is not actually
necessary. When we call reprepare_packed_git, we do not
throw away our old pack list. We keep the existing entries,
and only add in new ones. So there is no safety problem; we
will still have the pack struct that matches each revindex.
The packfile itself may go away, of course, but we are
already prepared to handle that, and it may happen outside
of reprepare_packed_git anyway.

Throwing away the revindex may save some RAM if the pack
never gets reused (about 12 bytes per object). But it also
wastes some CPU time (to regenerate the index) if the pack
does get reused. It's hard to say which is more valuable,
but in either case, it happens very rarely (only when we
race with a simultaneous repack). Just leaving the revindex
in place is simple and safe both for current and future
code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap: implement optional name_hash cache
Vicent Marti, Sat, 21 Dec 2013 14:00:45 +0000 (09:00 -0500)

When we use pack bitmaps rather than walking the object
graph, we end up with the list of objects to include in the
packfile, but we do not know the path at which any tree or
blob objects would be found.

In a recently packed repository, this is fine. A fetch would
use the paths only as a heuristic in the delta compression
phase, and a fully packed repository should not need to do
much delta compression.

As time passes, though, we may acquire more objects on top
of our large bitmapped pack. If clients fetch frequently,
then they never even look at the bitmapped history, and all
works as usual. However, a client who has not fetched since
the last bitmap repack will have "have" tips in the
bitmapped history, but "want" newer objects.

The bitmaps themselves degrade gracefully in this
circumstance. We manually walk the more recent bits of
history, and then use bitmaps when we hit them.

But we would also like to perform delta compression between
the newer objects and the bitmapped objects (both to delta
against what we know the user already has, but also between
"new" and "old" objects that the user is fetching). The lack
of pathnames makes our delta heuristics much less effective.

This patch adds an optional cache of the 32-bit name_hash
values to the end of the bitmap file. If present, a reader
can use it to match bitmapped and non-bitmapped names during
delta compression.
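
To give a feel for what a cached value looks like, here is a sketch of a 32-bit path hash in the style pack-objects uses for its delta heuristics. The function name is hypothetical and the exact arithmetic is an assumption made for illustration; this patch only stores the values and does not define how they are computed.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; the arithmetic is an assumed, illustrative hash. */
uint32_t sketch_name_hash(const char *name)
{
	uint32_t c, hash = 0;

	if (!name)
		return 0;
	/*
	 * Later characters land in the high bits and earlier ones are
	 * shifted down, so paths with similar endings (for example the
	 * same file extension) get numerically close values.
	 */
	while ((c = (unsigned char)*name++) != 0) {
		if (isspace(c))
			continue;
		hash = (hash >> 2) + (c << 24);
	}
	return hash;
}
```

Sorting delta candidates by such a value groups objects that probably live at similar paths, which is the information a bitmap-only traversal otherwise lacks.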

Here are perf results for p5310:

Test                      origin/master       HEAD^                      HEAD
-------------------------------------------------------------------------------------------------
5310.2: repack to disk    36.81(37.82+1.43)   47.70(48.74+1.41) +29.6%   47.75(48.70+1.51) +29.7%
5310.3: simulated clone   30.78(29.70+2.14)   1.08(0.97+0.10) -96.5%     1.07(0.94+0.12) -96.5%
5310.4: simulated fetch   3.16(6.10+0.08)     3.54(10.65+0.06) +12.0%    1.70(3.07+0.06) -46.2%
5310.6: partial bitmap    36.76(43.19+1.81)   6.71(11.25+0.76) -81.7%    4.08(6.26+0.46) -88.9%

You can see that the time spent on an incremental fetch goes
down, as our delta heuristics are able to do their work.
And we save time on the partial bitmap clone for the same
reason.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t/perf: add tests for pack bitmaps
Jeff King, Sat, 21 Dec 2013 14:00:42 +0000 (09:00 -0500)

This adds a few basic perf tests for the pack bitmap code to
show off its improvements. The tests are:

1. How long does it take to do a repack (it gets slower
with bitmaps, since we have to do extra work)?

2. How long does it take to do a clone (it gets faster
with bitmaps)?

3. How does a small fetch perform when we've just
repacked?

4. How does a clone perform when we haven't repacked since
a week of pushes?

Here are results against linux.git:

Test                      origin/master       this tree
-----------------------------------------------------------------------
5310.2: repack to disk    33.64(32.64+2.04)   67.67(66.75+1.84) +101.2%
5310.3: simulated clone   30.49(29.47+2.05)   1.20(1.10+0.10) -96.1%
5310.4: simulated fetch   3.49(6.79+0.06)     5.57(22.35+0.07) +59.6%
5310.6: partial bitmap    36.70(43.87+1.81)   8.18(21.92+0.73) -77.7%

You can see that we do take longer to repack, but we do way
better for further clones. A small fetch performs a bit
worse, as we spend way more time on delta compression (note
the heavy user CPU time, as we have 8 threads) due to the
lack of name hashes for the bitmapped objects.

The final test shows how the bitmaps degrade over time
between packs. There's still a significant speedup over the
non-bitmap case, but we don't do quite as well (we have to
spend time accessing the "new" objects the old fashioned
way, including delta compression).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: add basic bitmap functionality tests
Jeff King, Sat, 21 Dec 2013 14:00:38 +0000 (09:00 -0500)

Now that we can read and write bitmaps, we can exercise them
with some basic functionality tests. These tests aren't
particularly useful for seeing the benefit, as the test
repo is too small for it to make a difference. However, we
can at least check that using bitmaps does not break anything.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

count-objects: recognize .bitmap in garbage-checking
Nguyễn Thái Ngọc Duy, Sat, 21 Dec 2013 14:00:34 +0000 (09:00 -0500)

Count-objects will report any "garbage" files in the packs
directory, including files whose extensions it does not
know (case 1), and files whose matching ".pack" file is
missing (case 2). Without having learned about ".bitmap"
files, the current code reports all such files as garbage
(case 1), even if their pack exists. Instead, they should be
treated as case 2.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repack: consider bitmaps when performing repacks
Vicent Marti, Sat, 21 Dec 2013 14:00:31 +0000 (09:00 -0500)

Since `pack-objects` will write a `.bitmap` file next to the `.pack` and
`.idx` files, this commit teaches `git-repack` to consider the new
bitmap indexes (if they exist) when performing repack operations.

This implies moving old bitmap indexes out of the way if we are
repacking a repository that already has them, and moving the newly
generated bitmap indexes into the `objects/pack` directory, next to
their corresponding packfiles.

Since `git repack` is now capable of handling these `.bitmap` files,
a normal `git gc` run on a repository that has `pack.writebitmaps` set
to true in its config file will generate bitmap indexes as part of the
garbage collection process.

Alternatively, `git repack` can be called with the `-b` switch to
explicitly generate bitmap indexes if you are experimenting
and don't want them on all the time.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repack: handle optional files created by pack-objects
Jeff King, Sat, 21 Dec 2013 14:00:27 +0000 (09:00 -0500)

We ask pack-objects to pack to a set of temporary files, and
then rename them into place. Some files that pack-objects
creates may be optional (like a .bitmap file), in which case
we would not want to call rename(). We already call stat()
and make the chmod optional if the file cannot be accessed.
We could simply skip the rename step in this case, but that
would be a minor regression in noticing problems with
non-optional files (like the .pack and .idx files).

Instead, we can now annotate extensions as optional, and
skip them if they don't exist (and otherwise rely on
rename() to barf).
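
A minimal sketch of the idea follows. The struct layout, the table contents, and the `sketch_` names are assumptions made for this example and may not match repack's actual definitions.

```c
#include <stddef.h>

#define SKETCH_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

struct sketch_ext {
	const char *name;
	unsigned optional : 1;
};

/* Illustrative table only; the real list of extensions lives in repack. */
const struct sketch_ext sketch_exts[] = {
	{ ".pack", 0 },
	{ ".idx", 0 },
	{ ".bitmap", 1 },
};

/* Derived from the array itself, so no magic number appears in loops. */
const size_t sketch_exts_nr = SKETCH_ARRAY_SIZE(sketch_exts);

/*
 * Return 1 if the rename should be attempted for this extension and 0
 * if it should be skipped. `exists` is whatever a prior stat()-style
 * check reported for the temporary file.
 */
int sketch_should_rename(const struct sketch_ext *ext, int exists)
{
	if (!exists && ext->optional)
		return 0;
	/* Required files are always attempted, so rename() reports problems. */
	return 1;
}
```

A missing required file still reaches rename(), which fails loudly, while a missing optional file is silently skipped.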

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repack: turn exts array into array-of-struct
Jeff King, Sat, 21 Dec 2013 14:00:23 +0000 (09:00 -0500)

This is slightly more verbose, but will let us annotate the
extensions with further options in future commits.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repack: stop using magic number for ARRAY_SIZE(exts)
Jeff King, Sat, 21 Dec 2013 14:00:19 +0000 (09:00 -0500)

We have a static array of extensions, but hardcode the size
of the array in our loops. Let's pull out this magic number,
which will make it easier to change.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: implement bitmap writing
Vicent Marti, Sat, 21 Dec 2013 14:00:16 +0000 (09:00 -0500)

This commit further extends the functionality of `pack-objects` by allowing
it to write out a `.bitmap` index next to any written packs, together
with the `.idx` index that currently gets written.

If bitmap writing is enabled for a given repository (either by calling
`pack-objects` with the `--write-bitmap-index` flag or by having
`pack.writebitmaps` set to `true` in the config) and pack-objects is
writing a packfile that would normally be indexed (i.e. not piping to
stdout), we will attempt to write the corresponding bitmap index for the
packfile.

Bitmap index writing happens after the packfile and its index have been
successfully written to disk (`finish_tmp_packfile`). The process is
performed in several steps:

1. `bitmap_writer_set_checksum`: this call stores the partial
checksum for the packfile being written; the checksum will be
written in the resulting bitmap index to verify its integrity

2. `bitmap_writer_build_type_index`: this call uses the array of
`struct object_entry` that has just been sorted when writing out
the actual packfile index to disk to generate 4 type-index bitmaps
(one for each object type).

These bitmaps have their nth bit set if the given object is of
the bitmap's type. E.g. the nth bit of the Commits bitmap will be
1 if the nth object in the packfile index is a commit.

This is a very cheap operation because the bitmap writing code has
access to the metadata stored in the `struct object_entry` array,
and hence the real type for each object in the packfile.

3. `bitmap_writer_reuse_bitmaps`: if a bitmap index already exists
for one of the packfiles we're trying to repack, this call
will efficiently rebuild the existing bitmaps so they can be
reused on the new index. All the existing bitmaps will be stored
in a `reuse` hash table, and the commit selection phase will
prioritize these when selecting, as they can be written directly
to the new index without having to perform a revision walk to
fill the bitmap. This can greatly speed up the repack of a
repository that already has bitmaps.

4. `bitmap_writer_select_commits`: if bitmap writing is enabled for
a given `pack-objects` run, the sequence of commits generated
during the Counting Objects phase will be stored in an array.

We then use that array to build up the list of selected commits.
Writing a bitmap in the index for each object in the repository
would be cost-prohibitive, so we use a simple heuristic to pick
the commits that will be indexed with bitmaps.

The current heuristics are a simplified version of JGit's
original implementation. We select a higher density of commits
depending on their age: the 100 most recent commits are always
selected, after that we pick 1 commit of each 100, and the gap
increases as the commits grow older. On top of that, we make sure
that every single branch that has not been merged (all the tips
that would be required from a clone) gets their own bitmap, and
when selecting commits between a gap, we tend to prioritize the
commit with the most parents.

Do note that there is no right/wrong way to perform commit
selection; different selection algorithms will result in
different commits being selected, but there's no such thing as
"missing a commit". The bitmap walker algorithm implemented in
`prepare_bitmap_walk` is able to adapt to missing bitmaps by
performing manual walks that complete the bitmap: the ideal
selection algorithm, however, would select the commits that are
more likely to be used as roots for a walk in the future (e.g.
the tips of each branch, and so on) to ensure a bitmap for them
is always available.

5. `bitmap_writer_build`: this is the computationally expensive part
of bitmap generation. Based on the list of commits that were
selected in the previous step, we perform several incremental
walks to generate the bitmap for each commit.

The walks begin from the oldest commit, and are built up
incrementally for each branch. E.g. consider this dag where A, B,
C, D, E, F are the selected commits, and a, b, c, e are a chunk
of simplified history that will not receive bitmaps.

    A---a---B--b--C--c--D
             \
              E--e--F

We start by building the bitmap for A, using A as the root for a
revision walk and marking all the objects that are reachable
until the walk is over. Once this bitmap is stored, we reuse the
bitmap walker to perform the walk for B, assuming that once we
reach A again, the walk will be terminated because A has already
been SEEN on the previous walk.

This process is repeated for C, and D, but when we try to
generate the bitmaps for E, we can reuse neither the current walk
nor the bitmap we have generated so far.

What we do now is reset the walk, clear the bitmap, and
perform the walk from scratch using E as the origin. This
new walk, however, does not need to be completed. Once we
hit B, we can look up the bitmap we have already stored
for that commit and OR it with the existing bitmap we've composed
so far, allowing us to limit the walk early.

After all the bitmaps have been generated, another iteration
through the list of commits is performed to find the best XOR
offsets for compression before writing them to disk. Because of
the incremental nature of these bitmaps, XORing one of them with
its predecessor results in a minimal "bitmap delta" most of the
time. We can write this delta to the on-disk bitmap index, and
then re-compose the original bitmaps by XORing them again when
loaded.

This is a phase very similar to pack-objects' `find_delta` (using
bitmaps instead of objects, of course), except the heuristics
have been greatly simplified: we only check the 10 bitmaps before
any given one to find the best-compressing one. This gives good
results in practice, because there is locality in the ordering of
the objects (and therefore bitmaps) in the packfile.

6. `bitmap_writer_finish`: the last step in the process is
serializing to disk all the bitmap data that has been generated
in the two previous steps.

The bitmap is written to a tmp file and then moved atomically to
its final destination, using the same process as
`pack-write.c:write_idx_file`.
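
The XOR-offset search in step 5 can be sketched as below. This is a toy under stated assumptions: bitmaps are plain uncompressed word arrays of a fixed size, the number of set bits stands in for the compressed size of a delta, and every `sketch_` name is hypothetical rather than part of the bitmap writer.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WORDS 2 /* toy size: room for 128 objects */

/* dst = a XOR b; XORing the result with b again restores a. */
void sketch_xor(uint64_t *dst, const uint64_t *a, const uint64_t *b)
{
	size_t i;

	for (i = 0; i < SKETCH_WORDS; i++)
		dst[i] = a[i] ^ b[i];
}

/* Number of set bits, used here as a crude proxy for compressed size. */
unsigned sketch_weight(const uint64_t *bm)
{
	unsigned n = 0;
	size_t i;

	for (i = 0; i < SKETCH_WORDS; i++) {
		uint64_t w = bm[i];

		for (; w; w &= w - 1)
			n++;
	}
	return n;
}

/*
 * Among up to `window` bitmaps preceding index `cur`, return the index
 * whose XOR with bitmaps[cur] has the fewest set bits, or -1 if storing
 * the bitmap as-is is no worse.
 */
int sketch_best_xor_base(uint64_t bitmaps[][SKETCH_WORDS], int cur, int window)
{
	unsigned best = sketch_weight(bitmaps[cur]);
	int best_idx = -1;
	int i;

	for (i = cur - 1; i >= 0 && i >= cur - window; i--) {
		uint64_t delta[SKETCH_WORDS];
		unsigned w;

		sketch_xor(delta, bitmaps[cur], bitmaps[i]);
		w = sketch_weight(delta);
		if (w < best) {
			best = w;
			best_idx = i;
		}
	}
	return best_idx;
}
```

Since each bitmap in a branch is a superset of its predecessor's, the XOR against a nearby bitmap usually has very few set bits, which is why a small window is enough.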

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

rev-list: add bitmap mode to speed up object lists
Vicent Marti, Sat, 21 Dec 2013 14:00:12 +0000 (09:00 -0500)

The bitmap reachability index used to speed up the counting objects
phase during `pack-objects` can also be used to optimize a normal
rev-list if the only things required are the SHA1s of the objects during
the list (i.e., not the path names at which trees and blobs were found).

Calling `git rev-list --objects --use-bitmap-index [committish]` will
perform an object iteration based on a bitmap result instead of actually
walking the object graph.

These are some example timings for `torvalds/linux` (warm cache,
best-of-five):

$ time git rev-list --objects master > /dev/null

real 0m34.191s
user 0m33.904s
sys 0m0.268s

$ time git rev-list --objects --use-bitmap-index master > /dev/null

real 0m1.041s
user 0m0.976s
sys 0m0.064s

Likewise, using `git rev-list --count --use-bitmap-index` will speed up
the counting operation by building the resulting bitmap and performing a
fast popcount (number of bits set on the bitmap) on the result.
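
The counting step amounts to something like the following sketch. It assumes the result is available as a plain array of 64-bit words with one bit per commit, whereas the on-disk bitmaps are compressed; the `sketch_` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Count set bits in one word by clearing the lowest set bit each round. */
unsigned sketch_popcount64(uint64_t w)
{
	unsigned n = 0;

	while (w) {
		w &= w - 1;
		n++;
	}
	return n;
}

/* With one bit per commit in the result, the total is the commit count. */
size_t sketch_bitmap_count(const uint64_t *words, size_t nr_words)
{
	size_t total = 0, i;

	for (i = 0; i < nr_words; i++)
		total += sketch_popcount64(words[i]);
	return total;
}
```

No commit objects are parsed in this step, which is where the difference in the timings comes from.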

Here are some sample timings of different ways to count commits in
`torvalds/linux`:

$ time git rev-list master | wc -l
399882

real 0m6.524s
user 0m6.060s
sys 0m3.284s

$ time git rev-list --count master
399882

real 0m4.318s
user 0m4.236s
sys 0m0.076s

$ time git rev-list --use-bitmap-index --count master
399882

real 0m0.217s
user 0m0.176s
sys 0m0.040s

This also respects negative refs, so you can use it to count
a slice of history:

$ time git rev-list --count v3.0..master
144843

real 0m1.971s
user 0m1.932s
sys 0m0.036s

$ time git rev-list --use-bitmap-index --count v3.0..master
real 0m0.280s
user 0m0.220s
sys 0m0.056s

Though note that the closer the endpoints, the less it helps. In the
traversal case, we have fewer commits to cross, so we take less time.
But the bitmap time is dominated by generating the pack revindex, which
is constant with respect to the refs given.

Note that you cannot yet get a fast --left-right count of a symmetric
difference (e.g., "--count --left-right master...topic"). The slow part
of that walk actually happens during the merge-base determination when
we parse "master...topic". Even though a count does not actually need to
know the real merge base (it only needs to take the symmetric difference
of the bitmaps), the revision code would require some refactoring to
handle this case.

Additionally, a `--test-bitmap` flag has been added that will perform
the same rev-list manually (i.e. using a normal revwalk) and using
bitmaps, and verify that the results are the same. This can be used to
exercise the bitmap code, and also to verify that the contents of the
.bitmap file are sane.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: use bitmaps when packing objects
Vicent Marti, Sat, 21 Dec 2013 14:00:09 +0000 (09:00 -0500)

In this patch, we use the bitmap API to perform the `Counting Objects`
phase in pack-objects, rather than a traditional walk through the object
graph. For a reasonably-packed large repo, the time to fetch and clone
is often dominated by the full-object revision walk during the Counting
Objects phase. Using bitmaps can reduce the CPU time required on the
server (and therefore start sending the actual pack data with less
delay).

For bitmaps to be used, the following must be true:

1. We must be packing to stdout (as a normal `pack-objects` from
`upload-pack` would do).

2. There must be a .bitmap index containing at least one of the
"have" objects that the client is asking for.

3. Bitmaps must be enabled (they are enabled by default, but can be
disabled by setting `pack.usebitmaps` to false, or by using
`--no-use-bitmap-index` on the command-line).

If any of these is not true, we fall back to doing a normal walk of the
object graph.

Here are some sample timings from a full pack of `torvalds/linux` (i.e.
something very similar to what would be generated for a clone of the
repository) that show the speedup produced by various
methods:

[existing graph traversal]
$ time git pack-objects --all --stdout --no-use-bitmap-index \
</dev/null >/dev/null
Counting objects: 3237103, done.
Compressing objects: 100% (508752/508752), done.
Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)

real 0m44.111s
user 0m42.396s
sys 0m3.544s

[bitmaps only, without partial pack reuse; note that
pack reuse is automatic, so timing this required a
patch to disable it]
$ time git pack-objects --all --stdout </dev/null >/dev/null
Counting objects: 3237103, done.
Compressing objects: 100% (508752/508752), done.
Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)

real 0m5.413s
user 0m5.604s
sys 0m1.804s

[bitmaps with pack reuse (what you get with this patch)]
$ time git pack-objects --all --stdout </dev/null >/dev/null
Reusing existing pack: 3237103, done.
Total 3237103 (delta 0), reused 0 (delta 0)

real 0m1.636s
user 0m1.460s
sys 0m0.172s

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: split add_object_entry
Jeff King, Sat, 21 Dec 2013 14:00:06 +0000 (09:00 -0500)

This function actually does three things:

1. Check whether we've already added the object to our
packing list.

2. Check whether the object meets our criteria for adding.

3. Actually add the object to our packing list.

It's a little hard to see these three phases, because they
happen linearly in the rather long function. Instead, this
patch breaks them up into three separate helper functions.

The result is a little easier to follow, though it
unfortunately suffers from some optimization
interdependencies between the stages (e.g., during step 3 we
use the packing list index from step 1 and the packfile
information from step 2).

More importantly, though, the various parts can be
composed differently, as they will be in the next patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap: add support for bitmap indexes
Vicent Marti, Sat, 21 Dec 2013 14:00:01 +0000 (09:00 -0500)

A bitmap index is a `.bitmap` file that can be found inside
`$GIT_DIR/objects/pack/`, next to its corresponding packfile, and
contains precalculated reachability information for selected commits.
The full specification of the format for these bitmap indexes can be found
in `Documentation/technical/bitmap-format.txt`.

For a given commit SHA1, if it happens to be available in the bitmap
index, its bitmap will represent every single object that is reachable
from the commit itself. The nth bit in the bitmap is the nth object in
the packfile; if it's set to 1, the object is reachable.

By using the bitmaps available in the index, this commit implements
several new functions:

- `prepare_bitmap_git`
- `prepare_bitmap_walk`
- `traverse_bitmap_commit_list`
- `reuse_partial_packfile_from_bitmap`

The `prepare_bitmap_walk` function tries to build a bitmap of all the
objects that can be reached from the commit roots of a given `rev_info`
struct by using the following algorithm:

- If all the interesting commits for a revision walk are available in
the index, the resulting reachability bitmap is the bitwise OR of all
the individual bitmaps.

- When the full set of WANTs is not available in the index, we perform a
partial revision walk using the commits that don't have bitmaps as
roots, and limiting the revision walk as soon as we reach a commit that
has a corresponding bitmap. The earlier OR'ed bitmap with all the
indexed commits can now be completed as this walk progresses, so the end
result is the full reachability list.

- For revision walks with a HAVEs set (a set of commits that are deemed
uninteresting), first we perform the same method as for the WANTs, but
using our HAVEs as roots, in order to obtain a full reachability bitmap
of all the uninteresting commits. This bitmap then can be used to:

a) limit the subsequent walk when building the WANTs bitmap
b) find the final set of interesting commits by performing an
AND-NOT of the WANTs and the HAVEs.
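
The set arithmetic described above can be sketched with Python integers standing in for bitmaps. This is a toy model under stated assumptions, not the code from this patch: the `bitmaps` table, the `parents` map, the `position` map and both helper names are hypothetical, and only commits are tracked (real bitmaps also cover trees and blobs).

```python
# Hypothetical toy model: bit n of a bitmap stands for object n in the pack.

def reachable(roots, bitmaps, parents, position, stop=0):
    """OR together stored bitmaps; walk by hand only where none exists.

    `stop` is a bitmap of objects already known to be uninteresting;
    the manual walk does not descend past them.
    """
    result = 0
    pending = list(roots)
    while pending:
        commit = pending.pop()
        bit = 1 << position[commit]
        if result & bit or stop & bit:
            continue  # already covered, or limited by the HAVEs bitmap
        if commit in bitmaps:
            result |= bitmaps[commit]  # indexed commit: reuse its bitmap
        else:
            result |= bit  # unindexed commit: mark it and keep walking
            pending.extend(parents[commit])
    return result


def wants_minus_haves(wants, haves, bitmaps, parents, position):
    have_bits = reachable(haves, bitmaps, parents, position)
    want_bits = reachable(wants, bitmaps, parents, position, stop=have_bits)
    return want_bits & ~have_bits  # AND-NOT of the WANTs and the HAVEs
```

For a linear history `a <- b <- c <- d` where only `b` has a stored bitmap, asking for `d` with `b` as a HAVE walks `d` and `c` by hand and stops as soon as it reaches `b`.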

If `prepare_bitmap_walk` runs successfully, the resulting bitmap is
stored and the equivalent of a `traverse_commit_list` call can be
performed by using `traverse_bitmap_commit_list`; the bitmap version
of this call yields the objects straight from the packfile index
(without having to look them up or parse them) and hence is several
orders of magnitude faster.

As an extra optimization, when `prepare_bitmap_walk` succeeds, the
`reuse_partial_packfile_from_bitmap` call can be attempted: it will find
the number of objects at the beginning of the on-disk packfile that can
be reused as-is, and return an offset into the packfile. The source
packfile can then be loaded and the bytes up to `offset` can be written
directly to the result without having to consider the entries inside the
packfile individually.

If the `prepare_bitmap_walk` call fails (e.g. because no bitmap files
are available), the `rev_info` struct is left untouched, and can be used
to perform a manual rev-walk using `traverse_commit_list`.

Hence, this new set of functions is a generic API that allows callers to
perform the equivalent of

git rev-list --objects [roots...] [^uninteresting...]

for any set of commits, even if they don't have specific bitmaps
generated for them.

In further patches, we'll use this bitmap traversal optimization to
speed up the `pack-objects` and `rev-list` commands.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

documentation: add documentation for the bitmap formatVicent Marti Thu, 14 Nov 2013 12:44:02 +0000 (07:44 -0500)

documentation: add documentation for the bitmap format

This is the technical documentation for the JGit-compatible Bitmap v1
on-disk format.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ewah: compressed bitmap implementationVicent Marti Thu, 14 Nov 2013 12:43:51 +0000 (07:43 -0500)

ewah: compressed bitmap implementation

EWAH is a word-aligned compressed variant of a bitset (i.e. a data
structure that acts as a 0-indexed boolean array for many entries).

It uses a 64-bit run-length encoding (RLE) compression scheme,
trading some compression for better processing speed.

The goal of this word-aligned implementation is not to achieve
the best compression, but rather to improve query processing time.
As it stands right now, this EWAH implementation will always be more
efficient storage-wise than its uncompressed alternative.
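
To make the idea of word-aligned run-length encoding concrete, here is a deliberately simplified Python sketch. It is not the EWAH on-disk layout: the tuple-based chunks and the function names are assumptions made for illustration. It only shows the principle of collapsing runs of all-zero or all-one 64-bit words while keeping mixed words as literals.

```python
WORD_BITS = 64
ALL_ONES = (1 << WORD_BITS) - 1


def compress(words):
    """Collapse runs of all-zero or all-one 64-bit words; keep others literal."""
    out = []
    for word in words:
        if word in (0, ALL_ONES):
            if out and out[-1][0] == "run" and out[-1][1] == word:
                # Extend the current run instead of storing another word.
                out[-1] = ("run", word, out[-1][2] + 1)
            else:
                out.append(("run", word, 1))
        else:
            out.append(("literal", word))
    return out


def decompress(chunks):
    """Expand the chunks produced by compress() back into a word list."""
    words = []
    for chunk in chunks:
        if chunk[0] == "run":
            words.extend([chunk[1]] * chunk[2])
        else:
            words.append(chunk[1])
    return words
```

Because every chunk covers whole words, operations such as OR can proceed chunk by chunk without unpacking individual bits, which is the speed-for-compression trade mentioned above.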

EWAH arrays will be used as the on-disk format to store reachability
bitmaps for all objects in a repository while keeping reasonable sizes,
in the same way that JGit does.

This EWAH implementation is a mostly straightforward port of the
original `javaewah` library that JGit currently uses. The library is
self-contained and has been embedded whole (4 files) inside the `ewah`
folder to ease redistribution.

The library is re-licensed under the GPLv2 with the permission of Daniel
Lemire, the original author. The source code for the C version can
be found on GitHub:

https://github.com/vmg/libewok

The original Java implementation can also be found on GitHub:

https://github.com/lemire/javaewah

[jc: stripped debug-only code per Peff's $gmane/239768]

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

compat: add endianness helpersVicent Marti Thu, 14 Nov 2013 12:43:36 +0000 (07:43 -0500)

compat: add endianness helpers

The POSIX standard doesn't currently define a `ntohll`/`htonll`
function pair to perform network-to-host and host-to-network
swaps of 64-bit data. These 64-bit swaps are necessary for the on-disk
storage of EWAH bitmaps if they are not in native byte order.
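
For readers unfamiliar with the operation, the following Python sketch shows what a 64-bit network-order conversion does. The function names are made up for this example and do not correspond to the C helpers added by the patch.

```python
import struct


def write_word_be(value):
    """Serialize an unsigned 64-bit word in network (big-endian) byte order."""
    return struct.pack(">Q", value)


def read_word_be(data):
    """Read back an unsigned 64-bit word stored in network byte order."""
    return struct.unpack(">Q", data)[0]


def swap64(value):
    """Unconditional 64-bit byte swap, as needed on little-endian hosts."""
    return int.from_bytes(value.to_bytes(8, "little"), "big")
```

On a big-endian host the conversion is a no-op; on a little-endian host it amounts to the byte swap shown in `swap64`.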

Many thanks to Ramsay Jones <ramsay@ramsay1.demon.co.uk> and
Torsten Bögershausen <tboegi@web.de> for cygwin/mingw/msvc
portability fixes.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sha1_file: export `git_open_noatime`Vicent Marti Thu, 24 Oct 2013 18:01:47 +0000 (14:01 -0400)

sha1_file: export `git_open_noatime`

The `git_open_noatime` helper can be of general interest for other
consumers of git's different on-disk formats.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

revision: allow setting custom limiter functionVicent Marti Thu, 24 Oct 2013 18:01:41 +0000 (14:01 -0400)

revision: allow setting custom limiter function

This commit enables users of `struct rev_info` to perform custom limiting
during a revision walk (i.e. `get_revision`).

If the field `include_check` has been set to a callback, this callback
will be issued once for each commit before it is added to the "pending"
list of the revwalk. If the include check returns 0, the commit will be
marked as added but won't be pushed to the pending list, effectively
limiting the walk.
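
The effect of such a limiter can be sketched in Python. This is a toy traversal, not git's revision machinery: the `walk` function, the `parents` map and the way `include_check` is passed in are assumptions for illustration only.

```python
def walk(start, parents, include_check=None):
    """Yield commits reachable from `start`, honouring an optional limiter."""
    seen = set(start)
    pending = list(start)
    while pending:
        commit = pending.pop()
        yield commit
        for parent in parents[commit]:
            if parent in seen:
                continue
            seen.add(parent)  # marked as added either way
            if include_check is not None and not include_check(parent):
                continue  # limiter said no: never pushed to the pending list
            pending.append(parent)
```

With a linear history `a <- b <- c <- d`, a check that rejects `b` stops the walk after `c`, and `a` is never visited because it is only reachable through `b`.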

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: factor out name_hashVicent Marti Thu, 24 Oct 2013 18:01:29 +0000 (14:01 -0400)

pack-objects: factor out name_hash

As the pack-objects system grows beyond the single
pack-objects.c file, more parts (like the soon-to-exist
bitmap code) will need to compute hashes for matching
deltas. Factor out name_hash to make it available to other
files.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: refactor the packing listVicent Marti Thu, 24 Oct 2013 18:01:06 +0000 (14:01 -0400)

pack-objects: refactor the packing list

The hash table that stores the packing list for a given `pack-objects`
run was tightly coupled to the pack-objects code.

In this commit, we refactor the hash table and the underlying storage
array into a `packing_data` struct. The functionality for accessing and
adding entries to the packing list is hence accessible from other parts
of Git besides the `pack-objects` builtin.

This refactoring is a requirement for further patches in this series
that will require accessing the commit packing list from outside of
`pack-objects`.

The hash table implementation has been minimally altered: we now
use table sizes which are always a power of two, to ensure a uniform
index distribution in the array.
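
One practical consequence of power-of-two sizes is that a slot can be chosen with a bit mask. The Python sketch below illustrates that property only; the function names are hypothetical and the actual sizing policy of `packing_data` is not shown here.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    size = 1
    while size < n:
        size <<= 1
    return size


def slot_for(key_hash, table_size):
    # With a power-of-two size, masking gives the same slot as
    # `key_hash % table_size`, without a division.
    return key_hash & (table_size - 1)
```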

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

revindex: export new APIsVicent Marti Thu, 24 Oct 2013 18:00:36 +0000 (14:00 -0400)

revindex: export new APIs

Allow users to efficiently look up consecutive entries that are expected
to be found on the same revindex by exporting `find_revindex_position`:
this function takes a pointer to the revindex itself, instead of looking up
the proper revindex for a given packfile on each call.
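
The benefit of passing the revindex directly can be illustrated with a small Python sketch. It assumes, for the sake of the example, that a revindex is a list of pack offsets sorted in ascending order; `find_position` is a hypothetical stand-in, not the exported C function.

```python
from bisect import bisect_left


def find_position(offsets, offset):
    """Return the index of `offset` in the sorted list `offsets`, or -1."""
    i = bisect_left(offsets, offset)
    if i < len(offsets) and offsets[i] == offset:
        return i
    return -1
```

A caller that resolves many offsets from the same pack fetches the list once and reuses it, for example `[find_position(offsets, o) for o in wanted]`, instead of locating the right revindex again on every lookup.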

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sha1write: make buffer const-correctJeff King Thu, 24 Oct 2013 17:59:49 +0000 (13:59 -0400)

sha1write: make buffer const-correct

We are passed a "void *" and write it out without ever
touching it; let's indicate that by using "const".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Update draft release notes to 1.8.5Junio C Hamano Wed, 23 Oct 2013 20:37:27 +0000 (13:37 -0700)

Update draft release notes to 1.8.5

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Sync with 'maint'Junio C Hamano Wed, 23 Oct 2013 20:36:57 +0000 (13:36 -0700)

Sync with 'maint'

Almost 1.8.4.2 ;-)Junio C Hamano Wed, 23 Oct 2013 20:34:39 +0000 (13:34 -0700)

Almost 1.8.4.2 ;-)

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jc/ls-files-killed-optim' into maintJunio C Hamano Wed, 23 Oct 2013 20:33:08 +0000 (13:33 -0700)

Merge branch 'jc/ls-files-killed-optim' into maint

"git ls-files -k" needs to crawl only the part of the working tree
that may overlap the paths in the index to find killed files, but
shared code with the logic to find all the untracked files, which
made it unnecessarily inefficient.

* jc/ls-files-killed-optim:
dir.c::test_one_path(): work around directory_exists_in_index_icase() breakage
t3010: update to demonstrate "ls-files -k" optimization pitfalls
ls-files -k: a directory only can be killed if the index has a non-directory
dir.c: use the cache_* macro to access the current index

Merge branch 'jh/checkout-auto-tracking' into maintJunio C Hamano Wed, 23 Oct 2013 20:32:50 +0000 (13:32 -0700)

Merge branch 'jh/checkout-auto-tracking' into maint

"git branch --track" had a minor regression in v1.8.3.2 and later
that made it impossible to base your local work on anything but a
local branch of the upstream repository you are tracking from.

* jh/checkout-auto-tracking:
t3200: fix failure on case-insensitive filesystems
branch.c: Relax unnecessary requirement on upstream's remote ref name
t3200: Add test demonstrating minor regression in 41c21f2
Refer to branch.<name>.remote/merge when documenting --track
t3200: Minor fix when preparing for tracking failure
t2024: Fix &&-chaining and a couple of typos

Merge branch 'nd/fetch-into-shallow' into maintJunio C Hamano Wed, 23 Oct 2013 20:32:17 +0000 (13:32 -0700)

Merge branch 'nd/fetch-into-shallow' into maint

When there is insufficient overlap between old and new history
during a "git fetch" into a shallow repository, objects that the
sending side knows the receiving end has were unnecessarily sent.

* nd/fetch-into-shallow:
Add testcase for needless objects during a shallow fetch
list-objects: mark more commits as edges in mark_edges_uninteresting
list-objects: reduce one argument in mark_edges_uninteresting
upload-pack: delegate rev walking in shallow fetch to pack-objects
shallow: add setup_temporary_shallow()
shallow: only add shallow graft points to new shallow file
move setup_alternate_shallow and write_shallow_commits to shallow.c

Merge branch 'bc/gnome-keyring'Junio C Hamano Wed, 23 Oct 2013 20:21:50 +0000 (13:21 -0700)

Merge branch 'bc/gnome-keyring'

Cleanups and tweaks for credential handling to work with ancient versions
of the gnome-keyring library that are still in use.

* bc/gnome-keyring:
contrib/git-credential-gnome-keyring.c: support really ancient gnome-keyring
contrib/git-credential-gnome-keyring.c: support ancient gnome-keyring
contrib/git-credential-gnome-keyring.c: report failure to store password
contrib/git-credential-gnome-keyring.c: use glib messaging functions
contrib/git-credential-gnome-keyring.c: use glib memory allocation functions
contrib/git-credential-gnome-keyring.c: use secure memory for reading passwords
contrib/git-credential-gnome-keyring.c: use secure memory functions for passwds
contrib/git-credential-gnome-keyring.c: use gnome helpers in keyring_object()
contrib/git-credential-gnome-keyring.c: set Gnome application name
contrib/git-credential-gnome-keyring.c: ensure buffer is non-empty before accessing
contrib/git-credential-gnome-keyring.c: strlen() returns size_t, not ssize_t
contrib/git-credential-gnome-keyring.c: exit non-zero when called incorrectly
contrib/git-credential-gnome-keyring.c: add static where applicable
contrib/git-credential-gnome-keyring.c: *style* use "if ()" not "if()" etc.
contrib/git-credential-gnome-keyring.c: remove unused die() function
contrib/git-credential-gnome-keyring.c: remove unnecessary pre-declarations

Merge branch 'po/dot-url'Junio C Hamano Wed, 23 Oct 2013 20:21:48 +0000 (13:21 -0700)

Merge branch 'po/dot-url'

Explain how '.' can be used to refer to the "current repository"
in the documentation.

* po/dot-url:
doc/cli: make "dot repository" an independent bullet point
config doc: update dot-repository notes
doc: command line interface (cli) dot-repository dwimmery

Merge branch 'jc/prompt-upstream'Junio C Hamano Wed, 23 Oct 2013 20:21:45 +0000 (13:21 -0700)

Merge branch 'jc/prompt-upstream'

An enhancement to the GIT_PS1_SHOWUPSTREAM facility.

* jc/prompt-upstream:
git-prompt.sh: optionally show upstream branch name

Merge branch 'hu/cherry-pick-previous-branch'Junio C Hamano Wed, 23 Oct 2013 20:21:35 +0000 (13:21 -0700)

Merge branch 'hu/cherry-pick-previous-branch'

"git cherry-pick" without further options would segfault.

Could use a follow-up to handle '-' after argv[1] better.

* hu/cherry-pick-previous-branch:
cherry-pick: handle "-" after parsing options

Merge branch 'mg/more-textconv'Junio C Hamano Wed, 23 Oct 2013 20:21:30 +0000 (13:21 -0700)

Merge branch 'mg/more-textconv'

Make "git grep" and "git show" pay attention to --textconv when
dealing with blob objects.

* mg/more-textconv:
grep: honor --textconv for the case rev:path
grep: allow to use textconv filters
t7008: demonstrate behavior of grep with textconv
cat-file: do not die on --textconv without textconv filters
show: honor --textconv for blobs
diff_opt: track whether flags have been set explicitly
t4030: demonstrate behavior of show with textconv

Merge branch 'jc/pack-objects'Junio C Hamano Wed, 23 Oct 2013 20:21:26 +0000 (13:21 -0700)

Merge branch 'jc/pack-objects'

* jc/pack-objects:
pack-objects: shrink struct object_entry

Update draft release notes to 1.8.5Junio C Hamano Fri, 18 Oct 2013 20:53:05 +0000 (13:53 -0700)

Update draft release notes to 1.8.5

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'maint'Junio C Hamano Fri, 18 Oct 2013 20:53:48 +0000 (13:53 -0700)

Merge branch 'maint'

* maint:
git-merge: document the -S option

Merge branch 'jc/reflog-doc'Junio C Hamano Fri, 18 Oct 2013 20:50:12 +0000 (13:50 -0700)

Merge branch 'jc/reflog-doc'

Document rules to use GIT_REFLOG_ACTION variable in the scripted
Porcelain. git-rebase--interactive locally violates them, but it
is a leaf user that does not call out to or dot-source other
scripts, so it does not urgently need to be fixed.

* jc/reflog-doc:
setup_reflog_action: document the rules for using GIT_REFLOG_ACTION

Merge branch 'sb/repack-in-c'Junio C Hamano Fri, 18 Oct 2013 20:49:56 +0000 (13:49 -0700)

Merge branch 'sb/repack-in-c'

Rewrite "git repack" in C.

* sb/repack-in-c:
repack: improve warnings about failure of renaming and removing files
repack: retain the return value of pack-objects
repack: rewrite the shell script in C

Merge branch 'jk/clone-progress-to-stderr'Junio C Hamano Fri, 18 Oct 2013 20:49:51 +0000 (13:49 -0700)

Merge branch 'jk/clone-progress-to-stderr'

Some progress and diagnostic messages from "git clone" were
incorrectly sent to the standard output stream, not to the standard
error stream.

* jk/clone-progress-to-stderr:
clone: always set transport options
clone: treat "checking connectivity" like other progress
clone: send diagnostic messages to stderr

Merge git://github.com/git-l10n/git-poJunio C Hamano Fri, 18 Oct 2013 20:49:00 +0000 (13:49 -0700)

Merge git://github.com/git-l10n/git-po

* git://github.com/git-l10n/git-po:
l10n: fr.po: 2135/2135 messages translated

git-merge: document the -S optionNicolas Vigier Mon, 14 Oct 2013 23:41:05 +0000 (01:41 +0200)

git-merge: document the -S option

The option to gpg sign a merge commit is available but was not
documented. Use wording from the git-commit(1) manpage.

Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

l10n: fr.po: 2135/2135 messages translatedJean-Noel Avila Wed, 21 Aug 2013 19:49:43 +0000 (21:49 +0200)

l10n: fr.po: 2135/2135 messages translated

Signed-off-by: Sebastien Helleu <flashcode@flashtux.org>
Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>

Update draft release notes to 1.8.5Junio C Hamano Thu, 17 Oct 2013 22:57:12 +0000 (15:57 -0700)

Update draft release notes to 1.8.5

Merge branch 'jk/format-patch-from'Junio C Hamano Thu, 17 Oct 2013 22:55:18 +0000 (15:55 -0700)

Merge branch 'jk/format-patch-from'

"format-patch --from=<whom>" forgot to omit unnecessary in-body
from line, i.e. when <whom> is the same as the real author.

* jk/format-patch-from:
format-patch: print in-body "From" only when needed

Merge branch 'es/name-hash-no-trailing-slash-in-dirs'Junio C Hamano Thu, 17 Oct 2013 22:55:15 +0000 (15:55 -0700)

Merge branch 'es/name-hash-no-trailing-slash-in-dirs'

Clean up the internals of the name-hash mechanism used to work
around case insensitivity on some filesystems to cleanly fix a
long-standing API glitch where the caller of cache_name_exists()
that asks about a directory with a counted string was required to
have '/' at one location past the end of the string.

* es/name-hash-no-trailing-slash-in-dirs:
dir: revert work-around for retired dangerous behavior
name-hash: stop storing trailing '/' on paths in index_state.dir_hash
employ new explicit "exists in index?" API
name-hash: refactor polymorphic index_name_exists()

Merge branch 'jk/trailing-slash-in-pathspec'Junio C Hamano Thu, 17 Oct 2013 22:55:13 +0000 (15:55 -0700)

Merge branch 'jk/trailing-slash-in-pathspec'

Code refactoring.

* jk/trailing-slash-in-pathspec:
reset: handle submodule with trailing slash
rm: re-use parse_pathspec's trailing-slash removal

Merge branch 'lc/filter-branch-too-many-refs'Junio C Hamano Thu, 17 Oct 2013 22:55:12 +0000 (15:55 -0700)

Merge branch 'lc/filter-branch-too-many-refs'

"git filter-branch" in a repository with many refs blew limit of
command line length.

* lc/filter-branch-too-many-refs:
Allow git-filter-branch to process large repositories with lots of branches.

Merge branch 'jc/checkout-detach-doc'Junio C Hamano Thu, 17 Oct 2013 22:55:08 +0000 (15:55 -0700)

Merge branch 'jc/checkout-detach-doc'

"git checkout [--detach] <commit>" was listed poorly in the
synopsis section of its documentation.

* jc/checkout-detach-doc:
checkout: update synopsis and documentation on detaching HEAD

Sync with maintJunio C Hamano Thu, 17 Oct 2013 22:54:28 +0000 (15:54 -0700)

Sync with maint

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Start preparing for 1.8.4.2Junio C Hamano Thu, 17 Oct 2013 22:50:45 +0000 (15:50 -0700)

Start preparing for 1.8.4.2

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jk/upload-pack-keepalive' into maintJunio C Hamano Thu, 17 Oct 2013 22:46:01 +0000 (15:46 -0700)

Merge branch 'jk/upload-pack-keepalive' into maint

* jk/upload-pack-keepalive:
upload-pack: bump keepalive default to 5 seconds
upload-pack: send keepalive packets during pack computation

Merge branch 'bc/http-backend-allow-405' into maintJunio C Hamano Thu, 17 Oct 2013 22:46:00 +0000 (15:46 -0700)

Merge branch 'bc/http-backend-allow-405' into maint

* bc/http-backend-allow-405:
http-backend: provide Allow header for 405

Merge branch 'jc/cvsserver-perm-bit-fix' into maintJunio C Hamano Thu, 17 Oct 2013 22:45:57 +0000 (15:45 -0700)

Merge branch 'jc/cvsserver-perm-bit-fix' into maint

* jc/cvsserver-perm-bit-fix:
cvsserver: pick up the right mode bits

Merge branch 'js/add-i-mingw' into maintJunio C Hamano Thu, 17 Oct 2013 22:45:56 +0000 (15:45 -0700)

Merge branch 'js/add-i-mingw' into maint

* js/add-i-mingw:
add--interactive: fix external command invocation on Windows

Merge branch 'nd/git-dir-pointing-at-gitfile' into... Junio C Hamano Thu, 17 Oct 2013 22:45:55 +0000 (15:45 -0700)

Merge branch 'nd/git-dir-pointing-at-gitfile' into maint

* nd/git-dir-pointing-at-gitfile:
Make setup_git_env() resolve .git file when $GIT_DIR is not specified

Merge branch 'jk/has-sha1-file-retry-packed' into maintJunio C Hamano Thu, 17 Oct 2013 22:45:54 +0000 (15:45 -0700)

Merge branch 'jk/has-sha1-file-retry-packed' into maint

* jk/has-sha1-file-retry-packed:
has_sha1_file: re-check pack directory before giving up

Merge branch 'ap/commit-author-mailmap' into maintJunio C Hamano Thu, 17 Oct 2013 22:45:51 +0000 (15:45 -0700)

Merge branch 'ap/commit-author-mailmap' into maint

* ap/commit-author-mailmap:
commit: search author pattern against mailmap

Merge branch 'es/rebase-i-no-abbrev' into maintJunio C Hamano Thu, 17 Oct 2013 22:45:50 +0000 (15:45 -0700)

Merge branch 'es/rebase-i-no-abbrev' into maint

* es/rebase-i-no-abbrev:
rebase -i: fix short SHA-1 collision
t3404: rebase -i: demonstrate short SHA-1 collision
t3404: make tests more self-contained

Conflicts:
t/t3404-rebase-interactive.sh

Merge branch 'rt/rebase-p-no-merge-summary' into maintJunio C Hamano Thu, 17 Oct 2013 22:45:45 +0000 (15:45 -0700)

Merge branch 'rt/rebase-p-no-merge-summary' into maint

* rt/rebase-p-no-merge-summary:
rebase --preserve-merges: ignore "merge.log" config

Merge branch 'es/rebase-i-respect-core-commentchar... Junio C Hamano Thu, 17 Oct 2013 22:45:24 +0000 (15:45 -0700)

Merge branch 'es/rebase-i-respect-core-commentchar' into maint

* es/rebase-i-respect-core-commentchar:
rebase -i: fix cases ignoring core.commentchar

t4254: modernize testsSZEDER Gábor Wed, 16 Oct 2013 12:27:16 +0000 (14:27 +0200)

t4254: modernize tests

- Don't start tests with 'test $? = 0' to catch preparation done
outside the test_expect_success block.

- Move writing the bogus patch and the expected output into the
appropriate test_expect_success blocks.

- Use the test_must_fail helper instead of manually checking for
non-zero exit code.

- Use the debug-friendly test_path_is_file helper instead of 'test -f'.

- No space after '>'.

Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Update draft release notes to 1.8.5Junio C Hamano Wed, 16 Oct 2013 19:27:45 +0000 (12:27 -0700)

Update draft release notes to 1.8.5

List notable topics that graduated during Jonathan's interim
maintainership.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge git://git.bogomips.org/git-svnJunio C Hamano Wed, 16 Oct 2013 17:45:58 +0000 (10:45 -0700)

Merge git://git.bogomips.org/git-svn

* git://git.bogomips.org/git-svn:
git-svn: Warn about changing default for --prefix in Git v2.0
Documentation/git-svn: Promote the use of --prefix in docs + examples
git-svn.txt: elaborate on rev_map files
git-svn.txt: replace .git with $GIT_DIR
git-svn.txt: reword description of gc command
git-svn.txt: fix AsciiDoc formatting error
git-svn: fix signed commit parsing

contrib/git-credential-gnome-keyring.c: support really... Brandon Casey Mon, 23 Sep 2013 18:49:17 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: support really ancient gnome-keyring

The gnome-keyring lib (0.4) distributed with RHEL 4.X is really ancient
and does not provide most of the synchronous functions that even ancient
releases do. Thankfully, we're only using one function that is missing.
Let's emulate gnome_keyring_item_delete_sync() by calling the asynchronous
function and then triggering the event loop processing until our
callback is called.
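
The general pattern of turning an asynchronous call into a synchronous one can be sketched in Python. Everything here is hypothetical: `ToyLoop`, `delete_async` and `delete_sync` are invented for the illustration and are not gnome-keyring or glib APIs.

```python
from collections import deque


class ToyLoop:
    """A tiny stand-in for an event loop: a queue of pending callables."""

    def __init__(self):
        self.queue = deque()

    def call_soon(self, fn):
        self.queue.append(fn)

    def run_one(self):
        self.queue.popleft()()


def delete_async(loop, store, key, callback):
    """Schedule the deletion; report success through `callback` later."""
    def work():
        callback(store.pop(key, None) is not None)
    loop.call_soon(work)


def delete_sync(loop, store, key):
    """Emulate a blocking call: start the async one, spin until it reports."""
    result = []
    delete_async(loop, store, key, result.append)
    while not result:
        loop.run_one()  # keep processing events until our callback fires
    return result[0]
```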

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: support ancient... Brandon Casey Mon, 23 Sep 2013 18:49:16 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: support ancient gnome-keyring

The gnome-keyring lib distributed with RHEL 5.X is ancient and does
not provide a few of the functions/defines that more recent versions
do, but mostly the API is the same. Let's provide the missing bits
via macro definitions and function implementation.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: report failure... Brandon Casey Mon, 23 Sep 2013 18:49:15 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: report failure to store password

Produce an error message when we fail to store a password to the keyring.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: use glib messag... Brandon Casey Mon, 23 Sep 2013 18:49:14 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: use glib messaging functions

Rather than roll our own, let's use the messaging functions provided
by glib.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: use glib memory... Brandon Casey Mon, 23 Sep 2013 18:49:13 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: use glib memory allocation functions

Rather than roll our own, let's use the memory allocation/free routines
provided by glib.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: use secure... Brandon Casey Mon, 23 Sep 2013 18:49:12 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: use secure memory for reading passwords

gnome-keyring provides functions to allocate non-pageable memory (if
possible). Let's use them to allocate memory that may be used to hold
secure data read from the keyring.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: use secure... Brandon Casey Mon, 23 Sep 2013 18:49:11 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: use secure memory functions for passwds

gnome-keyring provides functions for allocating non-pageable memory (if
possible) intended to be used for storing passwords. Let's use them.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: use gnome helpe... Brandon Casey Mon, 23 Sep 2013 18:49:10 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: use gnome helpers in keyring_object()

Rather than carefully allocating memory for sprintf() to write into,
let's make use of the glib helper function g_strdup_printf(), which
makes things a lot easier and less error-prone.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: set Gnome appli... Brandon Casey Mon, 23 Sep 2013 18:49:09 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: set Gnome application name

Since this is a Gnome application, let's set the application name to
something reasonable. This will be displayed in Gnome dialog boxes
e.g. the one that prompts for the user's keyring password.

We add an include statement for glib.h and add the glib-2.0 cflags and
libs to the compilation arguments, but both of these are really noops
since glib is already a dependency of gnome-keyring.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: ensure buffer... Brandon Casey Mon, 23 Sep 2013 18:49:08 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: ensure buffer is non-empty before accessing

Ensure buffer length is non-zero before attempting to access the last
element.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: strlen() return... Brandon Casey Mon, 23 Sep 2013 18:49:07 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: strlen() returns size_t, not ssize_t

Also, initialization is not necessary since it is assigned before it is
used.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: exit non-zero... Brandon Casey Mon, 23 Sep 2013 18:49:06 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: exit non-zero when called incorrectly

If the correct arguments were not specified, this program should exit
non-zero. Let's do so.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: add static... Brandon Casey Mon, 23 Sep 2013 18:49:05 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: add static where applicable

Mark global variable and functions as static.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

contrib/git-credential-gnome-keyring.c: *style* use... Brandon Casey Mon, 23 Sep 2013 18:49:04 +0000 (11:49 -0700)

contrib/git-credential-gnome-keyring.c: *style* use "if ()" not "if()" etc.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'maint'Junio C Hamano Tue, 15 Oct 2013 23:15:00 +0000 (16:15 -0700)

Merge branch 'maint'

* maint:
git-prune-packed.txt: fix reference to GIT_OBJECT_DIRECTORY
clone --branch: refuse to clone if upstream repo is empty

git-prune-packed.txt: fix reference to GIT_OBJECT_DIRECTORYSteffen Prohaska Mon, 23 Sep 2013 19:19:19 +0000 (21:19 +0200)

git-prune-packed.txt: fix reference to GIT_OBJECT_DIRECTORY

git-prune-packed operates on GIT_OBJECT_DIRECTORY, not
GIT_OBJECT_DIR.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git.txt: fix asciidoc syntax of --*-pathspecsSteffen Prohaska Mon, 23 Sep 2013 18:54:35 +0000 (20:54 +0200)

git.txt: fix asciidoc syntax of --*-pathspecs

Labeled lists require a double colon.

[jc] I eyeballed the output from

git grep '[^:]:$' Documentation/\*.txt

and the patch fixes all breakages of this kind.
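
To show what that pattern catches, the same regular expression can be tried in Python. The sample lines are invented for the illustration; only the pattern comes from the message above.

```python
import re

# Same expression as in the `git grep` call: a non-colon character
# followed by a single colon at the end of the line.
SINGLE_COLON = re.compile(r"[^:]:$")


def is_suspect(line):
    """True for lines that end in exactly one colon."""
    return SINGLE_COLON.search(line) is not None
```

A label written as `--some-option:` is flagged, while the correct labeled-list form `--some-option::` is not, because the character before the final colon is itself a colon.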

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

doc/cli: make "dot repository" an independent bullet... Philip Oakley Tue, 15 Oct 2013 21:57:42 +0000 (14:57 -0700)

doc/cli: make "dot repository" an independent bullet point

The way to spell the current repository with a '.' dot is
independent from how the pathspec allows globs expanded by Git.

Make them two separate bullet items in the enumeration.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mergetool--lib: Fix typo in the merge/difftool helpStefan Saasen Fri, 4 Oct 2013 14:34:53 +0000 (07:34 -0700)

mergetool--lib: Fix typo in the merge/difftool help

The help text for the `tool` flag should mention:

--tool=<tool>

instead of:

--tool-<tool>

Signed-off-by: Stefan Saasen <ssaasen@atlassian.com>
Reviewed-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

sparse: suppress some "using sizeof on a function"... Ramsay Jones Sun, 6 Oct 2013 20:52:21 +0000 (21:52 +0100)

sparse: suppress some "using sizeof on a function" warnings

Sparse issues a "using sizeof on a function" warning for each
call to curl_easy_setopt() which sets an option that takes a
function pointer parameter. (currently 12 such warnings over 4
files.)

The warnings relate to the use of the "typecheck-gcc.h" header
file which adds a layer of type-checking macros to the curl
function invocations (for gcc >= 4.3 and !__cplusplus). As part
of the type-checking layer, 'sizeof' is applied to the function
parameter of curl_easy_setopt(). Note that, in the context of
sizeof, the function to function pointer conversion is not
performed and that sizeof(f) != sizeof(&f).

A simple solution, therefore, would be to replace the function
name in each such call to curl_easy_setopt() with an explicit
function pointer expression (i.e. replace f with &f).
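
As a minimal sketch of the distinction drawn above (not code from the
patch): on_data and data_cb are hypothetical names standing in for a
curl callback and its pointer type. ISO C does not allow sizeof on a
function designator (gcc accepts it as an extension), so the sketch
only takes sizeof of the explicit pointer expression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback, standing in for one passed to curl_easy_setopt(). */
static size_t on_data(char *buf, size_t size, size_t nmemb, void *userdata)
{
	(void)buf;
	(void)userdata;
	return size * nmemb;
}

/* Pointer type matching the hypothetical callback above. */
typedef size_t (*data_cb)(char *, size_t, size_t, void *);

/* sizeof on the explicit pointer expression is well defined. */
static size_t size_of_explicit_pointer(void)
{
	assert(sizeof(&on_data) == sizeof(data_cb));
	return sizeof(&on_data);
}

/*
 * Outside of sizeof (and unary &), the function name is converted to
 * the same pointer value, which is why f and &f are interchangeable
 * as call arguments.
 */
static int name_and_address_agree(void)
{
	data_cb implicit = on_data;
	data_cb with_amp = &on_data;

	return implicit == with_amp;
}
```

Both spellings pass the same pointer to a callee; only an operand of
sizeof sees the difference, which is what the type-checking macros
trip over.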

However, the "typecheck-gcc.h" header file is only conditionally
included, in addition to the gcc and C++ checks mentioned above,
depending on the CURL_DISABLE_TYPECHECK preprocessor variable.

In order to suppress the warnings, we use target-specific variable
assignments to add -DCURL_DISABLE_TYPECHECK to SPARSE_FLAGS for
each file affected (http-push.c, http.c, http-walker.c and
remote-curl.c).

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

format-patch doc: Thunderbird wraps lines unless mailne... Ramsay Jones Sun, 6 Oct 2013 20:51:31 +0000 (21:51 +0100)

format-patch doc: Thunderbird wraps lines unless mailnews.wraplength=0

The Thunderbird section of the 'MUA-specific hints' contains three
different approaches to setting up the mail client to leave patch
emails unmolested. The second approach (configuration) has a step
missing when configuring the composition window not to wrap. In
particular, the "mailnews.wraplength" configuration variable needs
to be set to zero. Update the documentation to add the missing
setting.

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

Merge branch 'rj/highlight-test-hang' Jonathan Nieder Mon, 14 Oct 2013 23:19:31 +0000 (16:19 -0700)

Merge branch 'rj/highlight-test-hang'

* rj/highlight-test-hang:
gitweb test: fix highlight test hang on Linux Mint

gitweb test: fix highlight test hang on Linux Mint Ramsay Jones Sun, 6 Oct 2013 20:50:46 +0000 (21:50 +0100)

gitweb test: fix highlight test hang on Linux Mint

Linux Mint has an implementation of the highlight command (unrelated
to the one from http://www.andre-simon.de) that works as a simple
filter. The script uses 'sed' to add terminal colour escape codes
around text matching a regular expression. When t9500-*.sh attempts
to run "highlight --version", the script simply hangs waiting for
input. (See https://bugs.launchpad.net/linuxmint/+bug/815005).

The tool required by gitweb can be installed from the 'highlight'
package. Unfortunately, given the default $PATH, this leads to the
tool having lower precedence than the script.

In order to avoid hanging the test, add '</dev/null' to the command
line of the highlight invocation. Also, since the 'highlight' tool
required by gitweb produces '--version' output (and the script does
not), saving the command output allows a simple check for the wrong
'highlight'.

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

wrapper.c: only define gitmkstemps if needed Ramsay Jones Sun, 6 Oct 2013 20:50:00 +0000 (21:50 +0100)

wrapper.c: only define gitmkstemps if needed

When the NO_MKSTEMPS build variable is not set, the gitmkstemps
function is dead code. Use a preprocessor conditional to only include
the definition when needed.

Noticed by sparse. ("'gitmkstemps' was not declared. Should it be
static?")
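
The shape of such a guard can be sketched as follows. This is an
illustration only, not the text of wrapper.c: fallback_mkstemps and
fallback_is_compiled are hypothetical names, and the body of the
fallback is a placeholder.

```c
#include <assert.h>
#include <stddef.h>

#ifdef NO_MKSTEMPS
/* Hypothetical fallback, compiled only when the build asks for it. */
int fallback_mkstemps(char *pattern, int suffix_len);

int fallback_mkstemps(char *pattern, int suffix_len)
{
	assert(pattern != NULL);
	(void)suffix_len;
	return -1; /* placeholder: a real fallback would create the file */
}
#endif

/* Reports whether the fallback above was compiled into this build. */
static int fallback_is_compiled(void)
{
#ifdef NO_MKSTEMPS
	return 1;
#else
	return 0;
#endif
}
```

Built without -DNO_MKSTEMPS, the translation unit contains no
definition of the fallback at all, so there is no undeclared
non-static function left for sparse to report.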

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

refs.c: spell NULL pointer as NULL Ramsay Jones Sun, 6 Oct 2013 20:49:18 +0000 (21:49 +0100)

refs.c: spell NULL pointer as NULL

A call to update_ref_lock() passes '0' to the 'int *type_p' parameter.
Noticed by sparse. ("Using plain integer as NULL pointer")
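
A small sketch of the style point, using a hypothetical lookup()
function with an optional 'int *type_p' out-parameter (it is not
update_ref_lock(), whose signature is not shown here):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function with an optional out-parameter. */
static int lookup(const char *name, int *type_p)
{
	assert(name != NULL);
	if (type_p)
		*type_p = 1; /* report a made-up type when the caller asks */
	return 1;
}

/* Caller that does not want the type: spell the null pointer as NULL. */
static int lookup_without_type(void)
{
	/* lookup("refs/heads/master", 0) also compiles, but reads as an integer */
	return lookup("refs/heads/master", NULL);
}

/* Caller that does want the type passes the address of an int. */
static int lookup_with_type(void)
{
	int type = 0;

	lookup("refs/heads/master", &type);
	return type;
}
```

Both spellings of the null pointer behave identically; NULL only
makes the pointer-ness of the argument visible at the call site.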

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

config.c: mark file-local function static Ramsay Jones Sun, 6 Oct 2013 20:48:29 +0000 (21:48 +0100)

config.c: mark file-local function static

Commit 7192777 refactors git_parse_ulong, which is public, into a more
generic function. But since we kept the git_parse_ulong wrapper, only
that part needs to be public; nobody outside the file calls the
lower-level git_parse_unsigned.

Noticed with sparse. ("'git_parse_unsigned' was not declared. Should
it be static?")
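
The arrangement described above (a public wrapper over a file-local
generic helper) might look roughly like this. parse_unsigned and
parse_ulong are hypothetical simplifications, not the functions in
config.c; in particular, this sketch accepts plain decimal digits
only.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Public interface; in a real project this prototype lives in a header. */
int parse_ulong(const char *value, unsigned long *ret);

/* File-local generic helper: 'static' keeps it out of the link namespace. */
static int parse_unsigned(const char *value, unsigned long max,
			  unsigned long *ret)
{
	char *end;
	unsigned long val;

	assert(ret != NULL);
	if (!value || !*value || *value == '-')
		return 0; /* reject missing, empty, or negative input */
	errno = 0;
	val = strtoul(value, &end, 10);
	if (errno == ERANGE || *end || val > max)
		return 0; /* reject overflow, trailing junk, or out of range */
	*ret = val;
	return 1;
}

/* Public wrapper: the only entry point other files can call. */
int parse_ulong(const char *value, unsigned long *ret)
{
	return parse_unsigned(value, ULONG_MAX, ret);
}
```

Because only the wrapper has external linkage, other files cannot
come to depend on the helper, and sparse has no undeclared global
to flag.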

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Explained-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>