gitweb.git
block alloc: add validations around cache_entry lifecycle Jameson Miller Mon, 2 Jul 2018 19:49:39 +0000 (19:49 +0000)

block alloc: add validations around cache_entry lifecycle

Add an option (controlled by an environment variable) to perform extra
validations on mem_pool allocated cache entries. When set:

1) Invalidate cache_entry memory when discarding cache_entry.

2) When discarding index_state struct, verify that all cache_entries
were allocated from the expected mem_pool.

3) When discarding mem_pools, invalidate mem_pool memory.

This should provide extra checks that mem_pools and their allocated
cache_entries are being used as expected.
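
As a rough illustration of item 1, here is a minimal sketch. The environment variable name DEMO_VALIDATE_ENTRIES, the demo_entry type, and both functions are hypothetical and are not taken from this commit. With validation on, the bytes of a pool-owned entry are overwritten with a poison value on discard, so a later read through a stale pointer sees obvious garbage.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type; stands in for a pool-allocated cache_entry. */
struct demo_entry {
	unsigned int mode;
	char name[32];
};

/*
 * Returns 1 when the (hypothetical) DEMO_VALIDATE_ENTRIES variable is
 * set to a non-empty value other than "0", and 0 otherwise.
 */
static int demo_should_validate(void)
{
	const char *v = getenv("DEMO_VALIDATE_ENTRIES");
	return v && *v && strcmp(v, "0") != 0;
}

/*
 * Discard an entry that lives inside a pool. The pool owns the memory,
 * so nothing is freed here; with validation on, the bytes are poisoned
 * instead.
 */
static void demo_discard_pool_entry(struct demo_entry *e, int validate)
{
	if (validate)
		memset(e, 0xCD, sizeof(*e));
}
```

A caller would typically evaluate demo_should_validate() once and pass the result along, so the environment is not queried on every discard.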

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

block alloc: allocate cache entries from mem_pool Jameson Miller Mon, 2 Jul 2018 19:49:37 +0000 (19:49 +0000)

block alloc: allocate cache entries from mem_pool

When reading large indexes from disk, a significant portion of the time
is spent in malloc() calls. This can be mitigated by allocating a
large block of memory and managing it ourselves via memory pools.

This change moves the cache entry allocation to be on top of memory
pools.

Design:

The index_state struct will gain a notion of an associated memory_pool
from which cache_entries will be allocated. When reading in the
index from disk, we have information on the number of entries and
their size, which can guide us in deciding how large our initial
memory allocation should be. When an index is discarded, the
associated memory_pool will be discarded as well - so the lifetime of
a cache_entry is tied to the lifetime of the index_state that it was
allocated for.

In the case of a split index, the following rules apply. First,
some terminology is defined:

Terminology:
- 'the_index': represents the logical view of the index

- 'split_index': represents the "base" cache entries. Read from the
split index file.

'the_index' can reference a single split_index, as well as
cache_entries from the split_index. `the_index` will be discarded
before the `split_index` is. This means that when we are allocating
cache_entries in the presence of a split index, we need to allocate
the entries from the `split_index`'s memory pool. This allows us to
follow the pattern that `the_index` can reference cache_entries from
the `split_index`, and that the cache_entries will not be freed while
they are still being referenced.

Managing transient cache_entry structs:
Cache entries are usually allocated for an index, but this is not always
the case. Cache entries are sometimes allocated because this is the
type that the existing checkout_entry function works with. Because of
this, the existing code needs to handle cache entries associated with an
index / memory pool, and those that only exist transiently. Several
strategies were contemplated around how to handle this:

Chosen approach:
An extra field was added to the cache_entry type to track whether the
cache_entry was allocated from a memory pool or not. This is currently
an int field, as there are no more available bits in the existing
ce_flags bit field. If / when more bits are needed, this new field can
be turned into a proper bit field.

Alternatives:

1) Do not include any information about how the cache_entry was
allocated. Calling code would be responsible for tracking whether the
cache_entry needed to be freed or not.
Pro: No extra memory overhead to track this state
Con: Extra complexity in callers to handle this correctly.

The extra complexity and burden to not regress this behavior in the
future was more than we wanted.

2) cache_entry would gain knowledge about which mem_pool allocated it
Pro: Could (potentially) do extra logic to know when a mem_pool no
longer had references to any cache_entry
Con: cache_entry would grow heavier by a pointer, instead of int

We didn't see a tangible benefit to this approach

3) Do not add any extra information to a cache_entry, but when freeing a
cache entry, check if the memory exists in a region managed by existing
mem_pools.
Pro: No extra memory overhead to track state
Con: Extra computation is performed when freeing cache entries

We decided tracking and iterating over known memory pool regions was
less desirable than adding an extra field to track this state.
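
The chosen approach can be sketched roughly as follows. All names here (demo_entry, demo_pool, pool_allocated, and the three functions) are hypothetical simplifications, not the code from this commit: each entry carries an int flag recording where it came from, and a single discard function consults that flag.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for cache_entry. */
struct demo_entry {
	unsigned int flags;
	int pool_allocated;	/* 1 if the memory belongs to a pool */
};

/* Toy "pool": a fixed array that hands out slots in order. */
struct demo_pool {
	struct demo_entry slots[8];
	size_t used;
};

/* Allocate an entry from the pool; returns NULL when the pool is full. */
static struct demo_entry *demo_make_entry(struct demo_pool *pool)
{
	struct demo_entry *e;

	if (pool->used == sizeof(pool->slots) / sizeof(pool->slots[0]))
		return NULL;
	e = &pool->slots[pool->used++];
	e->flags = 0;
	e->pool_allocated = 1;
	return e;
}

/* Allocate a transient entry straight from the heap. */
static struct demo_entry *demo_make_transient_entry(void)
{
	/* calloc zeroes the struct, so pool_allocated starts out as 0 */
	return calloc(1, sizeof(struct demo_entry));
}

/*
 * Single discard function: returns 1 if the memory was freed here,
 * 0 if it was left for the owning pool to release.
 */
static int demo_discard_entry(struct demo_entry *e)
{
	if (!e || e->pool_allocated)
		return 0;
	free(e);
	return 1;
}
```

The cost is one int per entry; in exchange, callers do not need to remember how an entry was allocated.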

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mem-pool: fill out functionality Jameson Miller Mon, 2 Jul 2018 19:49:35 +0000 (19:49 +0000)

mem-pool: fill out functionality

Add functions for:

- combining two memory pools

- determining if a memory address is within the range managed by a
memory pool

These functions will be used by future commits.
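
A minimal sketch of what these two operations could look like, assuming a pool is a singly linked list of contiguous blocks. The demo_* types and functions are hypothetical and do not reproduce the actual mem-pool code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block: one contiguous region owned by a pool. */
struct demo_block {
	struct demo_block *next;
	char *start;	/* first byte of the region */
	char *end;	/* one past the last byte */
};

/* Hypothetical pool: a singly linked list of blocks. */
struct demo_pool {
	struct demo_block *head;
};

/* Return 1 if ptr falls inside any block of the pool, else 0. */
static int demo_pool_contains(const struct demo_pool *pool, const void *ptr)
{
	uintptr_t p = (uintptr_t)ptr;
	const struct demo_block *b;

	for (b = pool->head; b; b = b->next)
		if (p >= (uintptr_t)b->start && p < (uintptr_t)b->end)
			return 1;
	return 0;
}

/* Move every block of src onto the end of dst, leaving src empty. */
static void demo_pool_combine(struct demo_pool *dst, struct demo_pool *src)
{
	struct demo_block **tail = &dst->head;

	while (*tail)
		tail = &(*tail)->next;
	*tail = src->head;
	src->head = NULL;
}
```

After combining, an address that used to belong to src is reported as contained in dst, and src can be discarded without touching the moved blocks.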

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mem-pool: add life cycle management functions Jameson Miller Mon, 2 Jul 2018 19:49:34 +0000 (19:49 +0000)

mem-pool: add life cycle management functions

Add initialization and discard functions to mem_pool type. As the
memory allocated by mem_pool can now be freed, we also track the large
allocations.

If there are existing mp_blocks in the mem_pool's linked list of
mp_blocks, then the mp_block for a large allocation is inserted
behind the head block. This is because only the head mp_block is considered
when searching for available space. This results in the following
desirable properties:

1) The mp_block allocated for the large request will not be included
in the search for available space in future requests; the large
mp_block is sized for the specific request and does not contain any
spare space.

2) The head mp_block will not be bumped from consideration for future
memory requests just because a request for a large chunk of memory
came in.

These changes are in preparation for a future commit that will
create and discard memory pools.
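
The placement rule described above can be sketched as follows. This is a simplified illustration under stated assumptions: the demo_* names and the DEMO_BLOCK_SIZE threshold are hypothetical, alignment is ignored for brevity, and the code is not the actual mem-pool implementation.

```c
#include <stdlib.h>

#define DEMO_BLOCK_SIZE 64	/* hypothetical default block size */

struct demo_block {
	struct demo_block *next;
	size_t size;	/* usable bytes in data[] */
	size_t used;	/* bytes already handed out */
	char data[];	/* flexible array member */
};

struct demo_pool {
	struct demo_block *head;
};

static struct demo_block *demo_new_block(size_t size)
{
	struct demo_block *b = malloc(sizeof(*b) + size);

	if (!b)
		return NULL;
	b->next = NULL;
	b->size = size;
	b->used = 0;
	return b;
}

/* Hand out len bytes (alignment is ignored in this sketch). */
static void *demo_pool_alloc(struct demo_pool *pool, size_t len)
{
	struct demo_block *b = pool->head;

	/* Only the head block is ever searched for free space. */
	if (b && b->size - b->used >= len) {
		void *p = b->data + b->used;
		b->used += len;
		return p;
	}

	if (pool->head && len >= DEMO_BLOCK_SIZE / 2) {
		/*
		 * Large request: use an exact-size block and link it
		 * behind the head, so the head keeps serving later
		 * small requests.
		 */
		b = demo_new_block(len);
		if (!b)
			return NULL;
		b->used = len;
		b->next = pool->head->next;
		pool->head->next = b;
		return b->data;
	}

	/* Otherwise start a fresh head block. */
	b = demo_new_block(len > DEMO_BLOCK_SIZE ? len : DEMO_BLOCK_SIZE);
	if (!b)
		return NULL;
	b->used = len;
	b->next = pool->head;
	pool->head = b;
	return b->data;
}

/* Free every block, large ones included, and reset the pool. */
static void demo_pool_discard(struct demo_pool *pool)
{
	struct demo_block *b = pool->head;

	while (b) {
		struct demo_block *next = b->next;
		free(b);
		b = next;
	}
	pool->head = NULL;
}
```

Because the large block sits on the same list as the regular blocks, the discard function releases it along with everything else, which is the sense in which large allocations are now tracked.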

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mem-pool: only search head block for available space Jameson Miller Mon, 2 Jul 2018 19:49:33 +0000 (19:49 +0000)

mem-pool: only search head block for available space

Instead of searching all memory blocks for available space to fulfill
a memory request, only search the head block. If the head block does
not have space, assume that the previous blocks would most likely not be
able to fulfill the request either. This could potentially lead to more
memory fragmentation, but also avoids searching memory blocks that
probably will not be able to fulfill the request.

This pattern will benefit consumers that are able to generate a good
estimate for how much memory will be needed, or if they are performing
fixed sized allocations, so that once a block is exhausted it will
never be able to fulfill a future request.

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

block alloc: add lifecycle APIs for cache_entry structs Jameson Miller Mon, 2 Jul 2018 19:49:31 +0000 (19:49 +0000)

block alloc: add lifecycle APIs for cache_entry structs

It has been observed that the time spent loading an index with a large
number of entries is partly dominated by malloc() calls. This change
is in preparation for using memory pools to reduce the number of
malloc() calls made to allocate cache entries when loading an index.

Add an API to allocate and discard cache entries, abstracting the
details of managing the memory backing the cache entries. This commit
does not actually change how memory is managed - this will be done in a
later commit in the series.

This change makes the distinction between cache entries that are
associated with an index and cache entries that are not associated with
an index. A main use of cache entries is with an index, and we can
optimize the memory management around this. We still have other cases
where a cache entry is not persisted with an index, and so we need to
handle the "transient" use case as well.

To keep the cognitive overhead of managing the cache entries low, there
will only be a single discard function. This means there must be enough
information kept with the cache entry so that we know how to discard
it.

A summary of the main functions in the API is:

make_cache_entry: create cache entry for use in an index. Uses specified
parameters to populate cache_entry fields.

make_empty_cache_entry: Create an empty cache entry for use in an index.
Returns cache entry with empty fields.

make_transient_cache_entry: create cache entry that is not used in an
index. Uses specified parameters to populate
cache_entry fields.

make_empty_transient_cache_entry: create cache entry that is not used in
an index. Returns cache entry with
empty fields.

discard_cache_entry: A single function that knows how to discard a cache
entry regardless of how it was allocated.

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

read-cache: teach make_cache_entry to take object_id Jameson Miller Mon, 2 Jul 2018 19:49:30 +0000 (19:49 +0000)

read-cache: teach make_cache_entry to take object_id

Teach make_cache_entry function to take object_id instead of a SHA-1.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

read-cache: teach refresh_cache_entry to take istate Jameson Miller Mon, 2 Jul 2018 19:49:29 +0000 (19:49 +0000)

read-cache: teach refresh_cache_entry to take istate

Refactor refresh_cache_entry() to work on a specific index, instead of
implicitly using the_index. This is in preparation for making the
make_cache_entry function apply to a specific index.

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fsck: check skiplist for object in fsck_blob() Ramsay Jones Wed, 27 Jun 2018 18:39:53 +0000 (19:39 +0100)

fsck: check skiplist for object in fsck_blob()

Since commit ed8b10f631 ("fsck: check .gitmodules content", 2018-05-02),
fsck will issue an error message for '.gitmodules' content that cannot
be parsed correctly. This is the case, even when the corresponding blob
object has been included on the skiplist. For example, using the cgit
repository, we see the following:

$ git fsck
Checking object directories: 100% (256/256), done.
error: bad config line 5 in blob .gitmodules
error in blob 51dd1eff1edc663674df9ab85d2786a40f7ae3a5: gitmodulesParse: could not parse gitmodules blob
Checking objects: 100% (6626/6626), done.
$

$ git config fsck.skiplist '.git/skip'
$ echo 51dd1eff1edc663674df9ab85d2786a40f7ae3a5 >.git/skip
$

$ git fsck
Checking object directories: 100% (256/256), done.
error: bad config line 5 in blob .gitmodules
Checking objects: 100% (6626/6626), done.
$

Note that the error message issued by the config parser is still
present, despite adding the object-id of the blob to the skiplist.

One solution would be to provide a means of suppressing the messages
issued by the config parser. However, given that (logically) we are
asking fsck to ignore this object, a simpler approach is to just not
call the config parser if the object is to be skipped. Add a check to
the 'fsck_blob()' processing function, to determine if the object is
on the skiplist and, if so, exit the function early.
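
The early-exit idea can be sketched as below. The skiplist representation, the stand-in parser, and every demo_* name are hypothetical and do not reflect the real fsck code; the point is only that the skiplist check happens before the parser gets a chance to run and print anything.

```c
#include <string.h>

/* Hypothetical skiplist: an array of hex object ids. */
struct demo_skiplist {
	const char **ids;
	size_t nr;
};

/* Return 1 if id appears in the list, else 0. */
static int demo_on_skiplist(const struct demo_skiplist *list, const char *id)
{
	size_t i;

	for (i = 0; i < list->nr; i++)
		if (!strcmp(list->ids[i], id))
			return 1;
	return 0;
}

/* Counts how often the stand-in config parser is invoked. */
static int demo_parse_calls;

/* Stand-in parser: pretends any content containing '?' is malformed. */
static int demo_parse_gitmodules(const char *content)
{
	demo_parse_calls++;
	return strchr(content, '?') ? -1 : 0;
}

/* Returns 0 if the blob is fine or skipped, -1 on a parse error. */
static int demo_check_blob(const struct demo_skiplist *list,
			   const char *id, const char *content)
{
	if (demo_on_skiplist(list, id))
		return 0;	/* leave before the parser can emit anything */
	return demo_parse_gitmodules(content);
}
```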

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fsck: silence stderr when parsing .gitmodules Jeff King Thu, 28 Jun 2018 22:06:04 +0000 (18:06 -0400)

fsck: silence stderr when parsing .gitmodules

If there's a parsing error we'll already report it via the
usual fsck report() function (or not, if the user has asked
to skip this object or warning type). The error message from
the config parser just adds confusion. Let's suppress it.

Note that we didn't test this case at all, so I've added
coverage in t7415. We may end up toning down or removing
this fsck check in the future. So take this test as checking
what happens now with a focus on stderr, and not any
ironclad guarantee that we must detect and report parse
failures in the future.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: add options parameter to git_config_from_mem Jeff King Thu, 28 Jun 2018 22:05:24 +0000 (18:05 -0400)

config: add options parameter to git_config_from_mem

The underlying config parser knows how to handle a
config_options struct, but git_config_from_mem() always
passes NULL. Let's allow our callers to specify the options
struct.

We could add a "_with_options" variant, but since there are
only a handful of callers, let's just update them to pass
NULL.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: add CONFIG_ERROR_SILENT handler Jeff King Thu, 28 Jun 2018 22:05:09 +0000 (18:05 -0400)

config: add CONFIG_ERROR_SILENT handler

We can currently die() or error(), but there's not yet any
way for callers to ask us just to quietly return an error.
Let's give them one.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: turn die_on_error into caller-facing enum Jeff King Thu, 28 Jun 2018 22:05:00 +0000 (18:05 -0400)

config: turn die_on_error into caller-facing enum

The config code has a die_on_error flag, which lets us emit
an error() instead of dying when we see a bogus config file.
But there's no way for a caller of the config code to set
this: it's auto-set based on whether we're reading a file or
a blob.

Instead, let's add it to the config_options struct. When
it's not set (or we have no options) we'll continue to fall
back to the existing file/blob behavior.
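
A rough sketch of the fallback logic, using hypothetical demo_* names rather than the real config code. Two things are assumptions made for illustration: that the source-based default is "die" for files and "report an error" for blobs, and that a "silent" value (the subject of a separate commit in this series) sits in the same enum.

```c
#include <stddef.h>

/* Illustrative names; not the identifiers used by the config code. */
enum demo_error_action {
	DEMO_ERROR_UNSET = 0,	/* fall back to the source-based default */
	DEMO_ERROR_DIE,
	DEMO_ERROR_ERROR,
	DEMO_ERROR_SILENT
};

enum demo_source {
	DEMO_SOURCE_FILE,
	DEMO_SOURCE_BLOB
};

struct demo_config_options {
	enum demo_error_action error_action;
};

/* Decide what to do on a parse error; opts may be NULL. */
static enum demo_error_action
demo_resolve_action(const struct demo_config_options *opts,
		    enum demo_source src)
{
	if (opts && opts->error_action != DEMO_ERROR_UNSET)
		return opts->error_action;
	/* Assumed default: die for files, report an error for blobs. */
	return src == DEMO_SOURCE_FILE ? DEMO_ERROR_DIE : DEMO_ERROR_ERROR;
}
```

Making the zero value mean "unset" lets a zero-initialized options struct behave exactly like passing no options at all.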

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit.c: allow lookup_commit_reference to handle arbit... Stefan Beller Fri, 29 Jun 2018 01:22:22 +0000 (18:22 -0700)

commit.c: allow lookup_commit_reference to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit.c: allow lookup_commit_reference_gently to handl... Stefan Beller Fri, 29 Jun 2018 01:22:21 +0000 (18:22 -0700)

commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tag.c: allow deref_tag to handle arbitrary repositories Stefan Beller Fri, 29 Jun 2018 01:22:20 +0000 (18:22 -0700)

tag.c: allow deref_tag to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object.c: allow parse_object to handle arbitrary reposi... Stefan Beller Fri, 29 Jun 2018 01:22:19 +0000 (18:22 -0700)

object.c: allow parse_object to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object.c: allow parse_object_buffer to handle arbitrary... Stefan Beller Fri, 29 Jun 2018 01:22:18 +0000 (18:22 -0700)

object.c: allow parse_object_buffer to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit.c: allow get_cached_commit_buffer to handle... Stefan Beller Fri, 29 Jun 2018 01:22:17 +0000 (18:22 -0700)

commit.c: allow get_cached_commit_buffer to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit.c: allow set_commit_buffer to handle arbitrary... Stefan Beller Fri, 29 Jun 2018 01:22:16 +0000 (18:22 -0700)

commit.c: allow set_commit_buffer to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit.c: migrate the commit buffer to the parsed objec... Stefan Beller Fri, 29 Jun 2018 01:22:15 +0000 (18:22 -0700)

commit.c: migrate the commit buffer to the parsed object store

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-slabs: remove realloc counter outside of slab... Stefan Beller Fri, 29 Jun 2018 01:22:14 +0000 (18:22 -0700)

commit-slabs: remove realloc counter outside of slab struct

The realloc counter is declared outside the struct for the given slabname,
which makes it harder for a follow up patch to move the declaration of the
struct around as then the counter variable would need special treatment.

As the reallocation counter is currently unused we can just remove it.
If we ever need to count the reallocations again, we can reintroduce
the counter as part of 'struct slabname' in commit-slab-decl.h.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit.c: allow parse_commit_buffer to handle arbitrary... Stefan Beller Fri, 29 Jun 2018 01:22:13 +0000 (18:22 -0700)

commit.c: allow parse_commit_buffer to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tag: allow parse_tag_buffer to handle arbitrary reposit... Stefan Beller Fri, 29 Jun 2018 01:22:12 +0000 (18:22 -0700)

tag: allow parse_tag_buffer to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tag: allow lookup_tag to handle arbitrary repositories Stefan Beller Fri, 29 Jun 2018 01:22:11 +0000 (18:22 -0700)

tag: allow lookup_tag to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit: allow lookup_commit to handle arbitrary reposit... Stefan Beller Fri, 29 Jun 2018 01:22:10 +0000 (18:22 -0700)

commit: allow lookup_commit to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tree: allow lookup_tree to handle arbitrary repositories Stefan Beller Fri, 29 Jun 2018 01:22:09 +0000 (18:22 -0700)

tree: allow lookup_tree to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blob: allow lookup_blob to handle arbitrary repositories Stefan Beller Fri, 29 Jun 2018 01:22:08 +0000 (18:22 -0700)

blob: allow lookup_blob to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: allow lookup_object to handle arbitrary reposit... Stefan Beller Fri, 29 Jun 2018 01:22:07 +0000 (18:22 -0700)

object: allow lookup_object to handle arbitrary repositories

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: allow object_as_type to handle arbitrary reposi... Stefan Beller Fri, 29 Jun 2018 01:22:06 +0000 (18:22 -0700)

object: allow object_as_type to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tag: add repository argument to deref_tag Stefan Beller Fri, 29 Jun 2018 01:22:05 +0000 (18:22 -0700)

tag: add repository argument to deref_tag

Add a repository argument to allow the callers of deref_tag
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
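
The macro technique mentioned here can be illustrated with token pasting. The sketch below uses hypothetical names (demo_repository, the_repo, demo_lookup) and is not the macro from this series: the wrapper pastes the spelling of its first argument onto the function name, so only the literal token for the default repository resolves to a function that exists.

```c
struct demo_repository {
	int id;
};

static struct demo_repository demo_default_repo = { 1 };
static struct demo_repository *the_repo = &demo_default_repo;

/* The implementation still only works on the default repository. */
static int demo_lookup_the_repo(int key)
{
	return the_repo->id * 100 + key;
}

/*
 * Paste the argument's spelling onto the function name. Writing
 * demo_lookup(the_repo, 7) expands to demo_lookup_the_repo(7);
 * demo_lookup(other, 7) would expand to demo_lookup_other(7), which
 * does not exist, so the build fails instead of silently using the
 * wrong repository.
 */
#define demo_lookup(r, key) demo_lookup_##r(key)
```

Once the implementation really handles arbitrary repositories, the macro can be dropped and the function can take the repository as an ordinary parameter.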

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tag: add repository argument to parse_tag_buffer Stefan Beller Fri, 29 Jun 2018 01:22:04 +0000 (18:22 -0700)

tag: add repository argument to parse_tag_buffer

Add a repository argument to allow the callers of parse_tag_buffer
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tag: add repository argument to lookup_tag Stefan Beller Fri, 29 Jun 2018 01:22:03 +0000 (18:22 -0700)

tag: add repository argument to lookup_tag

Add a repository argument to allow the callers of lookup_tag
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit: add repository argument to get_cached_commit_buffer Stefan Beller Fri, 29 Jun 2018 01:22:02 +0000 (18:22 -0700)

commit: add repository argument to get_cached_commit_buffer

Add a repository argument to allow callers of get_cached_commit_buffer to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit: add repository argument to set_commit_buffer Stefan Beller Fri, 29 Jun 2018 01:22:01 +0000 (18:22 -0700)

commit: add repository argument to set_commit_buffer

Add a repository argument to allow callers of set_commit_buffer to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit: add repository argument to parse_commit_buffer Stefan Beller Fri, 29 Jun 2018 01:22:00 +0000 (18:22 -0700)

commit: add repository argument to parse_commit_buffer

Add a repository argument to allow the callers of parse_commit_buffer
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit: add repository argument to lookup_commit Stefan Beller Fri, 29 Jun 2018 01:21:59 +0000 (18:21 -0700)

commit: add repository argument to lookup_commit

Add a repository argument to allow callers of lookup_commit to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit: add repository argument to lookup_commit_reference Stefan Beller Fri, 29 Jun 2018 01:21:58 +0000 (18:21 -0700)

commit: add repository argument to lookup_commit_reference

Add a repository argument to allow callers of lookup_commit_reference
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit: add repository argument to lookup_commit_refere... Stefan Beller Fri, 29 Jun 2018 01:21:57 +0000 (18:21 -0700)

commit: add repository argument to lookup_commit_reference_gently

Add a repository argument to allow callers of
lookup_commit_reference_gently to be more specific about which
repository to handle. This is a small mechanical change; it doesn't
change the implementation to handle repositories other than
the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

tree: add repository argument to lookup_tree Stefan Beller Fri, 29 Jun 2018 01:21:56 +0000 (18:21 -0700)

tree: add repository argument to lookup_tree

Add a repository argument to allow the callers of lookup_tree
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

blob: add repository argument to lookup_blob Stefan Beller Fri, 29 Jun 2018 01:21:55 +0000 (18:21 -0700)

blob: add repository argument to lookup_blob

Add a repository argument to allow the callers of lookup_blob
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: add repository argument to object_as_type Stefan Beller Fri, 29 Jun 2018 01:21:54 +0000 (18:21 -0700)

object: add repository argument to object_as_type

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: add repository argument to parse_object_buffer Stefan Beller Fri, 29 Jun 2018 01:21:53 +0000 (18:21 -0700)

object: add repository argument to parse_object_buffer

Add a repository argument to allow the callers of parse_object_buffer
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: add repository argument to lookup_object Stefan Beller Fri, 29 Jun 2018 01:21:52 +0000 (18:21 -0700)

object: add repository argument to lookup_object

Add a repository argument to allow callers of lookup_object to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object: add repository argument to parse_object Stefan Beller Fri, 29 Jun 2018 01:21:51 +0000 (18:21 -0700)

object: add repository argument to parse_object

Add a repository argument to allow the callers of parse_object
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'sb/object-store-grafts' into sb/object... Junio C Hamano Fri, 29 Jun 2018 17:24:33 +0000 (10:24 -0700)

Merge branch 'sb/object-store-grafts' into sb/object-store-lookup

* sb/object-store-grafts:
commit: allow lookup_commit_graft to handle arbitrary repositories
commit: allow prepare_commit_graft to handle arbitrary repositories
shallow: migrate shallow information into the object parser
path.c: migrate global git_path_* to take a repository argument
cache: convert get_graft_file to handle arbitrary repositories
commit: convert read_graft_file to handle arbitrary repositories
commit: convert register_commit_graft to handle arbitrary repositories
commit: convert commit_graft_pos() to handle arbitrary repositories
shallow: add repository argument to is_repository_shallow
shallow: add repository argument to check_shallow_file_for_update
shallow: add repository argument to register_shallow
shallow: add repository argument to set_alternate_shallow_file
commit: add repository argument to lookup_commit_graft
commit: add repository argument to prepare_commit_graft
commit: add repository argument to read_graft_file
commit: add repository argument to register_commit_graft
commit: add repository argument to commit_graft_pos
object: move grafts to object parser
object-store: move object access functions to object-store.h

.mailmap: merge different spellings of names Stefan Beller Fri, 29 Jun 2018 02:10:48 +0000 (19:10 -0700)

.mailmap: merge different spellings of names

This is a continuation of 94b410bba86 (.mailmap: Map email
addresses to names, 2013-07-12), merging names that are
spelled differently but have the same author email to the
same person.

Most spellings differed in accents or the order of names.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Makefile: fix the "built from commit" code Johannes Schindelin Wed, 27 Jun 2018 19:35:23 +0000 (21:35 +0200)

Makefile: fix the "built from commit" code

In ed32b788c06 (version --build-options: report commit, too, if
possible, 2017-12-15), we introduced code to let `git version
--build-options` report the current commit from which the binaries were
built, if any.

To prevent erroneous commits from being reported (e.g. when unpacking
Git's source code from a .tar.gz file into a subdirectory of a different
Git project, as e.g. git_osx_installer does), we painstakingly set
GIT_CEILING_DIRECTORIES when trying to determine the current commit.

Except that we got the quoting wrong, and that variable therefore does
not have the desired effect.

The issue is that the $(shell) is resolved before the output is stuffed
into the command-line with -DGIT_BUILT_FROM_COMMIT, and therefore is
*not* inside quotes. And thus backslashing the quotes is wrong, as the
quote gets literally inserted into the CEILING_DIRECTORIES variable.

Let's fix that quoting, and while at it, also suppress the unhelpful
message

fatal: not a git repository (or any of the parent directories): .git

that gets printed to stderr if no current commit could be determined,
and might scare the occasional developer who simply tries to build Git
from scratch.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5407: fix test to cover intended arguments Elijah Newren Thu, 7 Jun 2018 05:05:50 +0000 (22:05 -0700)

t5407: fix test to cover intended arguments

Test 8 in t5407 appears to be an accidental exact duplicate of test 5;
the test code is identical and has identical repo state, but the test
description is different and suggests that rebase -m followed by rebase
--skip was what was actually supposed to be tested. Modify the test to
include the -m option.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

apply: fix grammar error in commentElijah Newren Thu, 7 Jun 2018 05:05:25 +0000 (22:05 -0700)

apply: fix grammar error in comment

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Second batch for 2.19 cycleJunio C Hamano Thu, 28 Jun 2018 19:55:47 +0000 (12:55 -0700)

Second batch for 2.19 cycle

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'sb/fix-fetching-moved-submodules'Junio C Hamano Thu, 28 Jun 2018 19:53:34 +0000 (12:53 -0700)

Merge branch 'sb/fix-fetching-moved-submodules'

The code to try seeing if a fetch is necessary in a submodule
during a fetch with --recurse-submodules got confused when the path
to the submodule was changed in the range of commits in the
superproject, sometimes showing "(null)". This has been corrected.

* sb/fix-fetching-moved-submodules:
t5526: test recursive submodules when fetching moved submodules
submodule: fix NULL correctness in renamed broken submodules

Merge branch 'tz/cred-netrc-cleanup'Junio C Hamano Thu, 28 Jun 2018 19:53:33 +0000 (12:53 -0700)

Merge branch 'tz/cred-netrc-cleanup'

Build and test procedure for netrc credential helper (in contrib/)
has been updated.

* tz/cred-netrc-cleanup:
git-credential-netrc: make "all" default target of Makefile
git-credential-netrc: fix exit status when tests fail
git-credential-netrc: use in-tree Git.pm for tests
git-credential-netrc: minor whitespace cleanup in test script

Merge branch 'jc/clean-after-sanity-tests'Junio C Hamano Thu, 28 Jun 2018 19:53:33 +0000 (12:53 -0700)

Merge branch 'jc/clean-after-sanity-tests'

test cleanup.

* jc/clean-after-sanity-tests:
tests: clean after SANITY tests

Merge branch 'nd/completion-negation'Junio C Hamano Thu, 28 Jun 2018 19:53:32 +0000 (12:53 -0700)

Merge branch 'nd/completion-negation'

Continuing with the idea to programmatically enumerate various
pieces of data required for command line completion, the codebase
has been taught to enumerate options prefixed with "--no-" to
negate them.

* nd/completion-negation:
completion: collapse extra --no-.. options
completion: suppress some -no- options
parse-options: option to let --git-completion-helper show negative form

Merge branch 'pw/add-p-recount'Junio C Hamano Thu, 28 Jun 2018 19:53:32 +0000 (12:53 -0700)

Merge branch 'pw/add-p-recount'

When user edits the patch in "git add -p" and the user's editor is
set to strip trailing whitespaces indiscriminately, an empty line
that is unchanged in the patch would become completely empty
(instead of a line with a sole SP on it). The code introduced in
Git 2.17 timeframe failed to parse such a patch, but now it learned
to notice the situation and cope with it.
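
The recounting idea described above can be sketched as follows. This is a minimal illustration in Python, not the code git itself uses; the function name `count_hunk_lines` is hypothetical. It assumes the body of one hunk is given as a list of lines, and it treats a completely empty line as a context line whose leading space was stripped by the editor.

```python
def count_hunk_lines(body_lines):
    """Return (old_count, new_count) for the body lines of a single hunk.

    A completely empty line is counted as context, on the assumption that
    an editor removed the lone space from an unchanged empty line.
    """
    old = new = 0
    for line in body_lines:
        if line.startswith("-"):
            old += 1              # line present only in the old version
        elif line.startswith("+"):
            new += 1              # line present only in the new version
        elif line.startswith(" ") or line == "":
            old += 1              # context line, present on both sides
            new += 1
        # anything else (e.g. "\ No newline at end of file") is not counted
    return old, new
```

For example, `count_hunk_lines([" a", "", "-b", "+c", "+d"])` counts the empty second line as context rather than rejecting the hunk.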

* pw/add-p-recount:
add -p: fix counting empty context lines in edited patches

Merge branch 'jk/fetch-all-peeled-fix'Junio C Hamano Thu, 28 Jun 2018 19:53:32 +0000 (12:53 -0700)

Merge branch 'jk/fetch-all-peeled-fix'

"git fetch-pack --all" used to unnecessarily fail upon seeing an
annotated tag that points at an object other than a commit.

* jk/fetch-all-peeled-fix:
fetch-pack: test explicitly that --all can fetch tag references pointing to non-commits
fetch-pack: don't try to fetch peel values with --all

Merge branch 'ms/send-pack-honor-config'Junio C Hamano Thu, 28 Jun 2018 19:53:30 +0000 (12:53 -0700)

Merge branch 'ms/send-pack-honor-config'

"git send-pack --signed" (hence "git push --signed" over the http
transport) did not read user ident from the config mechanism to
determine whom to sign the push certificate as, which has been
corrected.

* ms/send-pack-honor-config:
builtin/send-pack: populate the default configs

Merge branch 'jh/partial-clone'Junio C Hamano Thu, 28 Jun 2018 19:53:30 +0000 (12:53 -0700)

Merge branch 'jh/partial-clone'

The recent addition of "partial clone" experimental feature kicked
in when it shouldn't, namely, when there is no partial-clone filter
defined even if extensions.partialclone is set.

* jh/partial-clone:
list-objects: check if filter is NULL before using

Merge branch 'sg/gpg-tests-fix'Junio C Hamano Thu, 28 Jun 2018 19:53:29 +0000 (12:53 -0700)

Merge branch 'sg/gpg-tests-fix'

Some flaky tests have been fixed.

* sg/gpg-tests-fix:
tests: make forging GPG signed commits and tags more robust
t7510-signed-commit: use 'test_must_fail'

Merge branch 'as/safecrlf-quiet-fix'Junio C Hamano Thu, 28 Jun 2018 19:53:29 +0000 (12:53 -0700)

Merge branch 'as/safecrlf-quiet-fix'

Fix for 2.17-era regression around `core.safecrlf`.

* as/safecrlf-quiet-fix:
config.c: fix regression for core.safecrlf false

Merge branch 'ab/refspec-init-fix'Junio C Hamano Thu, 28 Jun 2018 19:53:29 +0000 (12:53 -0700)

Merge branch 'ab/refspec-init-fix'

Make refspec parsing codepath more robust.

* ab/refspec-init-fix:
refspec: initalize `refspec_item` in `valid_fetch_refspec()`
refspec: add back a refspec_item_init() function
refspec: s/refspec_item_init/&_or_die/g

Documentation: declare "core.ignoreCase" as internal... Marc Strapetz Thu, 28 Jun 2018 11:21:57 +0000 (13:21 +0200)

Documentation: declare "core.ignoreCase" as internal variable

The current description of "core.ignoreCase" reads like an option which
is intended to be changed by the user while it's actually expected to
be set by Git on initialization only. Subsequently, Git relies on the
proper configuration of this variable, as noted by Bryan Turner [1]:

Git on a case-insensitive filesystem (APFS, HFS+, FAT32, exFAT,
vFAT, NTFS, etc.) is not designed to be run with anything other
than core.ignoreCase=true.

[1] https://marc.info/?l=git&m=152998665813997&w=2
mid:CAGyf7-GeE8jRGPkME9rHKPtHEQ6P1+ebpMMWAtMh01uO3bfy8w@mail.gmail.com

Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: fix documentation inconsistenciesDerrick Stolee Thu, 28 Jun 2018 12:52:45 +0000 (12:52 +0000)

commit-graph: fix documentation inconsistencies

The commit-graph feature shipped in Git 2.18 has some inconsistencies in
the constants used by the implementation and specified by the format
document.

The commit data chunk uses the key "CDAT" in the file format, but was
previously documented to say "CGET".

The commit data chunk stores commit parents using two 32-bit fields that
typically store the integer position of the parent in the list of commit
ids within the commit-graph file. When a parent does not exist, we had
documented the value 0xffffffff, but implemented the value 0x70000000.
This swap is easy to correct in the documentation, but unfortunately
reduces the number of commits that we can store in the commit-graph.
Update that estimate, too.
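
To make the effect on the estimate concrete, here is a small sketch in Python (not the C implementation). It uses only the sentinel value named above; the names `GRAPH_PARENT_NONE` and `decode_parent` are chosen for illustration, and the sketch assumes that every real parent position must be strictly below the sentinel. Other special encodings, such as those needed for octopus merges, are deliberately left out.

```python
# Value that, per the message above, the implementation uses for "no parent".
GRAPH_PARENT_NONE = 0x70000000

def decode_parent(field):
    """Return the parent's position, or None if the field marks 'no parent'.

    Assumption for this sketch: ordinary positions are strictly below the
    sentinel; anything above it is outside what the sketch handles.
    """
    if field == GRAPH_PARENT_NONE:
        return None
    if field > GRAPH_PARENT_NONE:
        raise ValueError("value above the sentinel is not handled by this sketch")
    return field

# Under that assumption, the number of distinct positions is bounded by the
# sentinel itself: 0x70000000 == 7 * 2**28 == 1879048192.
MAX_POSITIONS = GRAPH_PARENT_NONE
```

With 0xffffffff as the sentinel the same bound would have been a little over four billion, which is why the smaller implemented value lowers the estimate.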

Reported-by: Grant Welch <gwelch925@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: implement ref-in-wantBrandon Williams Wed, 27 Jun 2018 22:30:23 +0000 (15:30 -0700)

fetch-pack: implement ref-in-want

Implement ref-in-want on the client side so that when a server supports
the "ref-in-want" feature, a client will send "want-ref" lines for each
reference the client wants to fetch. This feature allows clients to
tolerate inconsistencies that exist when a remote repository's refs
change during the course of negotiation.

This allows a client to request a particular ref without
specifying the OID of the ref. This means that instead of hitting an
error when a ref no longer points at the OID it did at the beginning of
negotiation, negotiation can continue and the value of that ref will be
sent at the termination of negotiation, just before a packfile is sent.
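
A rough sketch of what the client does with those values, written in Python for illustration only. It assumes the server reports each wanted ref as a line of the form "<oid> <refname>", which is my reading of the protocol documentation referenced below rather than something stated in this message; the function name `apply_wanted_refs` is hypothetical.

```python
def apply_wanted_refs(requested, wanted_ref_lines):
    """Map each requested ref name to the OID the server reported for it.

    requested:        iterable of ref names sent as "want-ref" lines
    wanted_ref_lines: lines assumed to look like "<oid> <refname>"
    """
    requested = set(requested)
    resolved = {}
    for line in wanted_ref_lines:
        oid, _, name = line.rstrip("\n").partition(" ")
        if name not in requested:
            # the server answered for a ref the client never asked about
            raise ValueError("unexpected wanted ref: " + name)
        resolved[name] = oid
    return resolved
```

The point of the design is that the client learns the OID only at the end, so it does not matter whether the ref moved while negotiation was in progress.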

More information on the ref-in-want feature can be found in
Documentation/technical/protocol-v2.txt.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: put shallow info in output parameterBrandon Williams Wed, 27 Jun 2018 22:30:22 +0000 (15:30 -0700)

fetch-pack: put shallow info in output parameter

Expand the transport fetch method signature, by adding an output
parameter, to allow transports to return information about the refs they
have fetched. Then communicate shallow status information through this
mechanism instead of by modifying the input list of refs.

This does require clients to sometimes generate the ref map twice: once
from the list of refs provided by the remote (as is currently done) and
potentially once from the new list of refs that the fetch mechanism
provides.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: refactor to make function args narrowerBrandon Williams Wed, 27 Jun 2018 22:30:21 +0000 (15:30 -0700)

fetch: refactor to make function args narrower

Refactor find_non_local_tags and get_ref_map to only take the
information they need instead of the entire transport struct. Besides
improving code clarity, this also improves their flexibility, allowing
for a different set of refs to be used instead of relying on the ones
stored in the transport struct.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: refactor fetch_refs into two functionsBrandon Williams Wed, 27 Jun 2018 22:30:20 +0000 (15:30 -0700)

fetch: refactor fetch_refs into two functions

Refactor the fetch_refs function into a function that does the fetching
of refs and another function that stores them. This is in preparation
for allowing additional processing of the fetched refs before updating
the local ref store.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch: refactor the population of peer ref OIDsBrandon Williams Wed, 27 Jun 2018 22:30:19 +0000 (15:30 -0700)

fetch: refactor the population of peer ref OIDs

Populate peer ref OIDs in get_ref_map instead of do_fetch. Besides
tightening scopes of variables in the code, this also prepares for
get_ref_map being able to be called multiple times within do_fetch.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

upload-pack: test negotiation with changing repositoryBrandon Williams Wed, 27 Jun 2018 22:30:18 +0000 (15:30 -0700)

upload-pack: test negotiation with changing repository

Add tests to check the behavior of fetching from a repository which
changes between rounds of negotiation (for example, when different
servers in a load-balancing arrangement participate in the same stateless
RPC negotiation). This forms a baseline of comparison to the ref-in-want
functionality (which will be introduced to the client in subsequent
commits), and ensures that subsequent commits do not change existing
behavior.

As part of this effort, a mechanism to substitute strings in a single
HTTP response is added.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

upload-pack: implement ref-in-wantBrandon Williams Wed, 27 Jun 2018 22:30:17 +0000 (15:30 -0700)

upload-pack: implement ref-in-want

Currently, while performing packfile negotiation, clients are only
allowed to specify their desired objects using object ids. This makes
negotiation vulnerable to failure when an object ceases to exist during
negotiation, which may happen if, for example, the desired repository is
provided by multiple Git servers in a load-balancing arrangement and
there exists replication delay.

In order to eliminate this vulnerability, implement the ref-in-want
feature for the 'fetch' command in protocol version 2. This feature
enables the 'fetch' command to support requests in the form of ref names
through a new "want-ref <ref>" parameter. At the conclusion of
negotiation, the server will send a list of all of the wanted references
(as provided by "want-ref" lines) in addition to the generated packfile.
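
As an illustration of what such a request might look like on the wire, the following Python sketch frames a few lines in pkt-line format. Only the "want-ref <ref>" parameter comes from this message; the "command=fetch" line, the delimiter and flush packets, and the "done" line reflect my understanding of protocol version 2, and a real request carries further capability and argument lines that are omitted here. The function names are hypothetical.

```python
FLUSH_PKT = b"0000"   # ends the request
DELIM_PKT = b"0001"   # separates the command section from its arguments

def pkt_line(payload: bytes) -> bytes:
    """Prefix payload with its total length (payload + 4) as 4 hex digits."""
    length = len(payload) + 4
    if length > 0xFFFF:
        raise ValueError("payload too long for a single pkt-line")
    return b"%04x" % length + payload

def build_fetch_request(ref_names):
    """Build a simplified 'fetch' request that asks for refs by name."""
    parts = [pkt_line(b"command=fetch\n"), DELIM_PKT]
    for name in ref_names:
        parts.append(pkt_line(b"want-ref " + name.encode("utf-8") + b"\n"))
    parts.append(pkt_line(b"done\n"))
    parts.append(FLUSH_PKT)
    return b"".join(parts)
```

Because the request names the ref rather than an object id, a server that is slightly behind or ahead of its peers can still answer it.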

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-rebase--merge: modernize "git-$cmd" to "git $cmd"Elijah Newren Wed, 27 Jun 2018 07:46:00 +0000 (00:46 -0700)

git-rebase--merge: modernize "git-$cmd" to "git $cmd"

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Fix use of strategy options with interactive rebasesElijah Newren Wed, 27 Jun 2018 15:48:04 +0000 (08:48 -0700)

Fix use of strategy options with interactive rebases

git-rebase.sh wrote strategy options to .git/rebase/merge/strategy_opts
in the following format:
'--ours'  '--renormalize'
Note the double spaces.

git-rebase--interactive uses sequencer.c to parse that file, and
sequencer.c used split_cmdline() to get the individual strategy options.
After splitting, sequencer.c prefixed each "option" with a double dash,
so, concatenating all its options would result in:
-- --ours -- --renormalize

So, when it ended up calling try_merge_strategy(), that in turn would run
git merge-$strategy -- --ours -- --renormalize $merge_base -- $head $remote

instead of the expected/desired
git merge-$strategy --ours --renormalize $merge_base -- $head $remote

Remove the extra spaces so that when it goes through split_cmdline() we end
up with the desired command line.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t3418: add testcase showing problems with rebase -i... Elijah Newren Wed, 27 Jun 2018 15:48:03 +0000 (08:48 -0700)

t3418: add testcase showing problems with rebase -i and strategy options

We are not passing the same args to merge strategies when we are doing an
--interactive rebase as we do with a --merge rebase. The merge strategy
should not need to be aware of which type of rebase is in effect. Add a
testcase which checks for the appropriate args.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

dir.c: fix typos in core.excludesfile commentTodd Zullinger Wed, 27 Jun 2018 04:46:52 +0000 (00:46 -0400)

dir.c: fix typos in core.excludesfile comment

Make it easier to find references to core.excludesfile and the default
$XDG_CONFIG_HOME/git/ignore path.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gitignore.txt: clarify default core.excludesfile pathTodd Zullinger Wed, 27 Jun 2018 04:46:51 +0000 (00:46 -0400)

gitignore.txt: clarify default core.excludesfile path

The default core.excludesfile path is $XDG_CONFIG_HOME/git/ignore.
$HOME/.config/git/ignore is used if XDG_CONFIG_HOME is empty or unset,
as described later in the document.
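
The lookup rule can be sketched in a few lines of Python. This is an illustration of the rule as stated above, not git's own code; the function name `default_excludes_file` and the `env` parameter are hypothetical.

```python
import os
import posixpath

def default_excludes_file(env=None):
    """Return the default core.excludesfile path for the given environment.

    Uses $XDG_CONFIG_HOME/git/ignore, falling back to
    $HOME/.config/git/ignore when XDG_CONFIG_HOME is empty or unset.
    """
    env = os.environ if env is None else env
    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:  # set and non-empty
        return posixpath.join(xdg, "git", "ignore")
    home = env.get("HOME", "")
    return posixpath.join(home, ".config", "git", "ignore")
```

Passing a plain dict as `env` makes it easy to see both branches without touching the real environment.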

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-rebase: make --allow-empty-message the defaultElijah Newren Wed, 27 Jun 2018 07:23:19 +0000 (00:23 -0700)

git-rebase: make --allow-empty-message the default

rebase backends currently behave differently with empty commit messages,
largely as a side-effect of the different underlying commands on which
they are based. am-based rebases apply commits with an empty commit
message without stopping or requiring the user to specify an extra flag.
(It is interesting to note that am-based rebases are the default rebase
type, and no one has ever requested a --no-allow-empty-message flag to
change this behavior.) merge-based and interactive-based rebases (which
are ultimately based on git-commit), will currently halt on any such
commits and require the user to manually specify what to do with the
commit and continue.

One possible rationale for the difference in behavior is that the purpose
of an "am" based rebase is solely to transplant an existing history, while
an "interactive" rebase is one whose purpose is to polish a series before
making it publishable. Thus, stopping and asking for confirmation for a
possible problem is more appropriate in the latter case. However, there
are two problems with this rationale:

1) merge-based rebases are also non-interactive and there are multiple
types of rebases that use the interactive machinery but are not
explicitly interactive (e.g. when either --rebase-merges or
--keep-empty are specified without --interactive). These rebases are
also used solely to transplant an existing history, and thus also
should default to --allow-empty-message.

2) this rationale only says that the user is more accepting of stopping
in the case of an explicitly interactive rebase, not that stopping
for this particular reason actually makes sense. Exploring whether
it makes sense requires backing up and analyzing the underlying
commands...

If git-commit did not error out on empty commit messages by default, accidental
creation of commits with empty messages would be a very common occurrence
(this check has caught me many times). Further, nearly all such empty
commit messages would be considered an accidental error (as evidenced by a
huge amount of documentation across version control systems and in various
blog posts explaining how important commit messages are). A simple check
for what would otherwise be a common error thus made a lot of sense, and
git-commit gained an --allow-empty-message flag for special case
overrides. This has made commits with empty messages very rare.

There are two sources for commits with empty messages for rebase (and
cherry-pick): (a) commits created in git where the user previously
specified --allow-empty-message to git-commit, and (b) commits imported
into git from other version control systems. In case (a), the user has
already explicitly specified that there is something special about this
commit that makes them not want to specify a commit message; forcing them
to re-specify with every cherry-pick or rebase seems more likely to be
infuriating than helpful. In case (b), the commit is highly unlikely to
have been authored by the person who has imported the history and is doing
the rebase or cherry-pick, and thus the user is unlikely to be the
appropriate person to write a commit message for it. Stopping and
expecting the user to modify the commit before proceeding thus seems
counter-productive.

Further, note that while empty commit messages were a common error case for
git-commit to deal with, it is a rare case for rebase (or cherry-pick).
The fact that it is rare raises the question of why it would be worth
checking and stopping on this particular condition and not others. For
example, why doesn't an interactive rebase automatically stop if the
commit message's first line is 2000 columns long, or is missing a blank
line after the first line, or has every line indented with five spaces, or
any number of other myriad problems?

Finally, note that if a user doing an interactive rebase does have the
necessary knowledge to add a message for any such commit and wants to do
so, it is rather simple for them to change the appropriate line from
'pick' to 'reword'. The fact that the subject is empty in the todo list
that the user edits should even serve as a way to notify them.

As far as I can tell, the fact that merge-based and interactive-based
rebases stop on commits with empty commit messages is solely a by-product
of having been based on git-commit. It went without notice for a long
time precisely because such cases are rare. The rareness of this
situation made it difficult to reason about, so when folks did eventually
notice this behavior, they assumed it was there for a good reason and just
added an --allow-empty-message flag. In my opinion, stopping on such
messages is not desirable in any of these cases, even the (explicitly)
interactive case.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t3401: add directory rename testcases for rebase and amElijah Newren Wed, 27 Jun 2018 07:23:18 +0000 (00:23 -0700)

t3401: add directory rename testcases for rebase and am

Add a simple directory rename testcase, in conjunction with each of the
types of rebases:
git-rebase--interactive
git-rebase--am
git-rebase--merge
and also use the same testcase for
git am --3way

This demonstrates a difference in behavior between the different rebase
backends in regards to directory rename detection.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-rebase.txt: document behavioral differences between... Elijah Newren Wed, 27 Jun 2018 07:23:17 +0000 (00:23 -0700)

git-rebase.txt: document behavioral differences between modes

There are a variety of aspects that are common to all rebases regardless
of which backend is in use; however, the behavior for these different
aspects varies in ways that could surprise users. (In fact, it's not
clear -- to me at least -- that these differences were even desirable or
intentional.) Document these differences.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

directory-rename-detection.txt: technical docs on abili... Elijah Newren Wed, 27 Jun 2018 07:23:16 +0000 (00:23 -0700)

directory-rename-detection.txt: technical docs on abilities and limitations

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-rebase.txt: address confusion between --no-ff vs... Elijah Newren Wed, 27 Jun 2018 07:23:15 +0000 (00:23 -0700)

git-rebase.txt: address confusion between --no-ff vs --force-rebase

rebase was taught the --force-rebase option in commit b2f82e05de ("Teach
rebase to rebase even if upstream is up to date", 2009-02-13). This flag
worked for the am and merge backends, but wasn't a valid option for the
interactive backend.

rebase was taught the --no-ff option for interactive rebases in commit
b499549401cb ("Teach rebase the --no-ff option.", 2010-03-24), to do the
exact same thing as --force-rebase does for non-interactive rebases. This
commit explicitly documented the fact that --force-rebase was incompatible
with --interactive, though it made --no-ff a synonym for --force-rebase
for non-interactive rebases. The choice of a new option was based on the
fact that "force rebase" didn't sound like an appropriate term for the
interactive machinery.

In commit 6bb4e485cff8 ("rebase: align variable names", 2011-02-06), the
separate parsing of command line options in the different rebase scripts
was removed, and whether by accident or because the author noticed that
these options did the same thing, the options became synonyms and both
were accepted by all three rebase types.

In commit 2d26d533a012 ("Documentation/git-rebase.txt: -f forces a rebase
that would otherwise be a no-op", 2014-08-12), which reworded the
description of the --force-rebase option, the (no-longer correct) sentence
stating that --force-rebase was incompatible with --interactive was
finally removed.

Finally, as explained at
https://public-inbox.org/git/98279912-0f52-969d-44a6-22242039387f@xiplink.com

In the original discussion around this option [1], at one point I
proposed teaching rebase--interactive to respect --force-rebase
instead of adding a new option [2]. Ultimately --no-ff was chosen as
the better user interface design [3], because an interactive rebase
can't be "forced" to run.

We have accepted both --no-ff and --force-rebase as full synonyms for all
three rebase types for over seven years. Documenting them differently
and in ways that suggest they might not be quite synonyms simply leads to
confusion. Adjust the documentation to match reality.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-rebase: error out when incompatible options passedElijah Newren Wed, 27 Jun 2018 07:23:14 +0000 (00:23 -0700)

git-rebase: error out when incompatible options passed

git rebase has three different types: am, merge, and interactive, all of
which are implemented in terms of separate scripts. am builds on git-am,
merge builds on git-merge-recursive, and interactive builds on
git-cherry-pick. We make use of features in those lower-level commands in
the different rebase types, but those features don't exist in all of the
lower level commands so we have a range of incompatibilities. Previously,
we just accepted nearly any argument and silently ignored whichever ones
weren't implemented for the type of rebase specified. Change this so the
incompatibilities are documented, included in the testsuite, and tested
for at runtime with an appropriate error message shown.

Some exceptions I left out:

* --merge and --interactive are technically incompatible since they are
supposed to run different underlying scripts, but with a few small
changes, --interactive can do everything that --merge can. In fact,
I'll shortly be sending another patch to remove git-rebase--merge and
reimplement it on top of git-rebase--interactive.

* One could argue that --interactive and --quiet are incompatible since
--interactive doesn't implement a --quiet mode (perhaps since
cherry-pick itself does not implement one). However, the interactive
mode is more quiet than the other modes in general with progress
messages, so one could argue that it's already quiet.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t3422: new testcases for checking when incompatible... Elijah Newren Wed, 27 Jun 2018 07:23:13 +0000 (00:23 -0700)

t3422: new testcases for checking when incompatible options passed

git rebase is split into three types: am, merge, and interactive. Various
options imply different types, and which mode we are using determines which
sub-script (git-rebase--$type) is executed to finish the work. Not all
options work with all types, so add tests for combinations where we expect
to receive an error rather than having options be silently ignored.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: update design documentDerrick Stolee Wed, 27 Jun 2018 13:24:47 +0000 (09:24 -0400)

commit-graph: update design document

The commit-graph feature is now integrated with 'fsck' and 'gc',
so remove those items from the "Future Work" section of the
commit-graph design document.

Also remove the section on lazy-loading trees, as that was completed
in an earlier patch series.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc: automatically write commit-graph filesDerrick Stolee Wed, 27 Jun 2018 13:24:46 +0000 (09:24 -0400)

gc: automatically write commit-graph files

The commit-graph file is a very helpful feature for speeding up git
operations. In order to make it more useful, make it possible to
write the commit-graph file during standard garbage collection
operations.

Add a 'gc.commitGraph' config setting that triggers writing a
commit-graph file after any non-trivial 'git gc' command. Defaults to
false while the commit-graph feature matures. We specifically do not
want to have this on by default until the commit-graph feature is fully
integrated with history-modifying features like shallow clones.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: add '--reachable' optionDerrick Stolee Wed, 27 Jun 2018 13:24:45 +0000 (09:24 -0400)

commit-graph: add '--reachable' option

When writing commit-graph files, it can be convenient to ask for all
reachable commits (starting at the ref set) in the resulting file. This
is particularly helpful when writing to stdin is complicated, such as a
future integration with 'git gc'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: use string-list API for inputDerrick Stolee Wed, 27 Jun 2018 13:24:44 +0000 (09:24 -0400)

commit-graph: use string-list API for input

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fsck: verify commit-graphDerrick Stolee Wed, 27 Jun 2018 13:24:43 +0000 (09:24 -0400)

fsck: verify commit-graph

If core.commitGraph is true, verify the contents of the commit-graph
during 'git fsck' using the 'git commit-graph verify' subcommand. Run
this check on all alternates, as well.

We use a new process for two reasons:

1. The subcommand decouples the details of loading and verifying a
commit-graph file from the other fsck details.

2. The commit-graph verification requires the commits to be loaded
in a specific order to guarantee we parse from the commit-graph
file for some objects and from the object database for others.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: verify contents match checksumDerrick Stolee Wed, 27 Jun 2018 13:24:42 +0000 (09:24 -0400)

commit-graph: verify contents match checksum

The commit-graph file ends with a SHA1 hash of the previous contents. If
a commit-graph file has errors but the checksum hash is correct, then we
know that the problem is a bug in Git and not simply file corruption
after-the-fact.

Compute the checksum right away so it is the first error that appears,
and make the message translatable since this error can be "corrected" by
a user by simply deleting the file and recomputing. The rest of the
errors are useful only to developers.

Be sure to continue checking the rest of the file data if the checksum
is wrong. This is important for our tests, as we break the checksum as
we modify bytes of the commit-graph file.
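
The trailing-hash check itself is simple. A minimal Python sketch, assuming the whole file is available as a byte string and that the checksum occupies the final 20 bytes; the function name `checksum_matches` is hypothetical and this is not the C code from the patch.

```python
import hashlib

HASH_LEN = 20  # length of a SHA-1 digest in bytes

def checksum_matches(data: bytes) -> bool:
    """Check that the last 20 bytes are the SHA-1 of everything before them."""
    if len(data) < HASH_LEN:
        return False
    body, trailer = data[:-HASH_LEN], data[-HASH_LEN:]
    return hashlib.sha1(body).digest() == trailer
```

A caller following the approach above would report a mismatch first and then carry on with the remaining checks instead of returning early.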

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: test for corrupted octopus edgeDerrick Stolee Wed, 27 Jun 2018 13:24:41 +0000 (09:24 -0400)

commit-graph: test for corrupted octopus edge

The commit-graph file has an extra chunk to store the parent int-ids for
parents beyond the first parent for octopus merges. Our test repo has a
single octopus merge that we can manipulate to demonstrate the 'verify'
subcommand detects incorrect values in that chunk.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: verify commit dateDerrick Stolee Wed, 27 Jun 2018 13:24:40 +0000 (09:24 -0400)

commit-graph: verify commit date

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: verify generation numberDerrick Stolee Wed, 27 Jun 2018 13:24:39 +0000 (09:24 -0400)

commit-graph: verify generation number

While iterating through the commit parents, perform the generation
number calculation and compare against the value stored in the
commit-graph.

The tests demonstrate that having a different set of parents affects
the generation number calculation, and this value propagates to
descendants. Hence, we drop the single-line condition on the output.

Since Git will ship with the commit-graph feature without generation
numbers, we need to accept commit-graphs with all generation numbers
equal to zero. In this case, ignore the generation number calculation.

However, verify that we never have a mix of zero and non-zero
generation numbers. Create a test that sets one commit to generation
zero and all following commits report a failure as they have non-zero
generation in a file that contains generation number zero.
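
The rules above can be sketched as follows, in Python rather than C. The sketch assumes the usual definition of a generation number (one more than the largest generation among the parents, so a commit without parents has generation 1) and a hypothetical input format: a dict, in file order, mapping each commit name to its stored generation and its list of parent names.

```python
def check_generations(commits):
    """Return a list of error strings for the stored generation numbers.

    commits: dict of name -> (stored_generation, [parent names]),
             iterated in the order the commits appear in the file.
    """
    errors = []
    seen_zero = seen_nonzero = False
    for name, (stored, parents) in commits.items():
        if stored == 0:
            if seen_nonzero:
                errors.append(name + ": generation zero, but non-zero elsewhere")
            seen_zero = True
        else:
            if seen_zero:
                errors.append(name + ": non-zero generation, but zero elsewhere")
            seen_nonzero = True
        if seen_zero:
            continue  # generations were not computed; skip the calculation
        expected = 1 + max((commits[p][0] for p in parents), default=0)
        if stored != expected:
            errors.append("%s: generation %d, expected %d" % (name, stored, expected))
    return errors
```

A file whose generations are all zero passes, while a single zero among non-zero values produces an error for that commit and for each non-zero commit that follows it.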

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: verify parent listDerrick Stolee Wed, 27 Jun 2018 13:24:38 +0000 (09:24 -0400)

commit-graph: verify parent list

The commit-graph file stores parents in a two-column portion of the
commit data chunk. If there is only one parent, then the second column
stores 0xFFFFFFFF to indicate no second parent.

The 'verify' subcommand checks the parent list for the commit loaded
from the commit-graph and the one parsed from the object database. Test
these checks for corrupt parents, too many parents, and wrong parents.

Add a boundary check to insert_parent_or_die() for when the parent
position value is out of range.
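
A minimal sketch of the two-column decoding and the boundary check, in
Python rather than Git's C. The sentinel is passed in as parent_none
instead of being hard-coded, octopus merges are ignored, and the names
decode_parents and parent_none are hypothetical.

```python
def decode_parents(first, second, num_commits, parent_none):
    """Turn the two parent columns into a list of parent positions.

    first, second: values read from the two parent columns
    num_commits:   number of commits in the file, used for the range check
    parent_none:   sentinel value meaning "no parent in this column"
    """
    parents = []
    for pos in (first, second):
        if pos == parent_none:
            continue
        if pos >= num_commits:
            # Mirrors the idea of the boundary check: refuse out-of-range positions.
            raise ValueError(f"parent position {pos} is out of range")
        parents.append(pos)
    return parents
```

The decoded list can then be compared with the parents parsed from the
object database to detect too many parents or wrong parents.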

The octopus merge will be tested in a later commit.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: verify root tree OIDsDerrick Stolee Wed, 27 Jun 2018 13:24:37 +0000 (09:24 -0400)

commit-graph: verify root tree OIDs

The 'verify' subcommand must compare the commit content parsed from the
commit-graph against the content in the object database. Use
lookup_commit() and parse_commit_in_graph_one() to parse the commits
from the graph and compare against a commit that is loaded separately
and parsed directly from the object database.

Add checks for the root tree OID.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: verify objects existDerrick Stolee Wed, 27 Jun 2018 13:24:36 +0000 (09:24 -0400)

commit-graph: verify objects exist

In the 'verify' subcommand, load commits directly from the object
database to ensure they exist. Parse them without consulting the
commit-graph.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: verify corrupt OID fanout and lookupDerrick Stolee Wed, 27 Jun 2018 13:24:35 +0000 (09:24 -0400)

commit-graph: verify corrupt OID fanout and lookup

In the commit-graph file, the OID fanout chunk provides an index into
the OID lookup. The 'verify' subcommand should find incorrect values
in the fanout.

Similarly, the 'verify' subcommand should find out-of-order values in
the OID lookup.
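
Both checks can be sketched as follows, in Python rather than Git's C.
The sketch assumes the fanout has 256 entries, where entry b counts the
OIDs whose first byte is at most b, and that the lookup must be strictly
ascending. The function name and messages are hypothetical.

```python
def verify_fanout_and_lookup(fanout, oids):
    """Return error strings for a 256-entry fanout and a list of OID bytes."""
    errors = []

    # The lookup must be strictly ascending.
    for i in range(1, len(oids)):
        if oids[i - 1] >= oids[i]:
            errors.append(f"OID lookup out of order at position {i}")

    # Recompute the cumulative counts and compare them with the stored fanout.
    counts = [0] * 256
    for oid in oids:
        counts[oid[0]] += 1
    running = 0
    for b in range(256):
        running += counts[b]
        if fanout[b] != running:
            errors.append(f"fanout[{b}] = {fanout[b]}, expected {running}")
    return errors
```

Changing one fanout entry or swapping two adjacent OIDs is enough to
make this sketch report an error.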

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: verify required chunks are presentDerrick Stolee Wed, 27 Jun 2018 13:24:34 +0000 (09:24 -0400)

commit-graph: verify required chunks are present

The commit-graph file requires the following three chunks:

* OID Fanout
* OID Lookup
* Commit Data

If any of these are missing, then the 'verify' subcommand should
report a failure. This includes cases where the chunk IDs are
malformed or the chunk count is truncated.
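
A minimal sketch of the presence check, in Python rather than Git's C.
It assumes the chunk IDs have already been read from the file's chunk
table as four-byte strings. The function name is hypothetical.

```python
# Four-byte chunk IDs of the three required chunks.
REQUIRED_CHUNKS = {
    b"OIDF": "OID Fanout",
    b"OIDL": "OID Lookup",
    b"CDAT": "Commit Data",
}


def missing_required_chunks(chunk_ids):
    """Return the names of required chunks absent from chunk_ids."""
    present = set(chunk_ids)
    return [name for cid, name in REQUIRED_CHUNKS.items() if cid not in present]
```

A malformed ID simply fails to match, so the corresponding chunk is
reported as missing. A truncated chunk count has the same effect,
because the later IDs are never read.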

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: verify catches corrupt signatureDerrick Stolee Wed, 27 Jun 2018 13:24:33 +0000 (09:24 -0400)

commit-graph: verify catches corrupt signature

This is the first of several commits that add a test to check that
'git commit-graph verify' catches corruption in the commit-graph
file. The first test checks that the command catches an error in
the file signature. This check already exists in the commit-graph
reading code.

Add a helper method 'corrupt_graph_and_verify' to the test script
t5318-commit-graph.sh. This helper corrupts the commit-graph file
at a certain location, runs 'git commit-graph verify', and reports
the output to the 'err' file. This data is filtered to remove the
lines added by 'test_must_fail' when the test is run verbosely.
Then, the output is checked to contain a specific error message.
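
The idea behind the helper can be sketched in Python, though the real
helper is a shell function in the test script. This sketch works on an
in-memory byte string instead of a file, and its verify() only checks
the four signature bytes. The function names and the error wording are
hypothetical.

```python
SIGNATURE = b"CGPH"  # signature bytes at the start of a commit-graph file


def corrupt(data, pos, value):
    """Return a copy of data with the byte at pos overwritten by value."""
    buf = bytearray(data)
    buf[pos] = value
    return bytes(buf)


def verify(data):
    """Toy stand-in for 'git commit-graph verify': checks only the signature."""
    errors = []
    if data[:4] != SIGNATURE:
        errors.append("graph signature does not match")  # hypothetical wording
    return errors


def corrupt_and_verify(data, pos, value, expected):
    """Corrupt one byte, run verify(), and report whether expected was seen."""
    errors = verify(corrupt(data, pos, value))
    return any(expected in e for e in errors)
```

Each later test in the series follows the same pattern with a different
byte position and a different expected message.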

Most messages from 'git commit-graph verify' will not be marked
for translation. There will be one exception: the message that
reports an invalid checksum will be marked for translation, as that
is the only message that is intended for a typical user.

Helped-by: Szeder Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: add 'verify' subcommandDerrick Stolee Wed, 27 Jun 2018 13:24:32 +0000 (09:24 -0400)

commit-graph: add 'verify' subcommand

If the commit-graph file becomes corrupt, we need a way to verify
that its contents match the object database. In the manner of
'git fsck' we will implement a 'git commit-graph verify' subcommand
to report all issues with the file.

Add the 'verify' subcommand to the 'commit-graph' builtin and its
documentation. The subcommand is currently a no-op except for
loading the commit-graph into memory, which may trigger run-time
errors that would be caught by normal use. Add a simple test that
ensures the command returns a zero error code.

If no commit-graph file exists, this is an acceptable state. Do
not report any errors.

Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit-graph: load a root tree from specific graphDerrick Stolee Wed, 27 Jun 2018 13:24:31 +0000 (09:24 -0400)

commit-graph: load a root tree from specific graph

When lazy-loading a tree for a commit, it will be important to select
the tree from a specific struct commit_graph. Create a new method that
specifies the commit-graph file and use that in
get_commit_tree_in_graph().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>