Merge branch 'np/log-graph-octopus-fix' into maint
"git log --graph" showing an octopus merge sometimes miscounted the
number of display columns it is consuming to show the merge and its
parent commits, which has been corrected.
* np/log-graph-octopus-fix:
log: fix coloring of certain octopus merge shapes
The codepath to support the experimental split-index mode had
remaining "racily clean" issues fixed.
* sg/split-index-racefix:
split-index: BUG() when cache entry refers to non-existing shared entry
split-index: smudge and add racily clean cache entries to split index
split-index: don't compare cached data of entries already marked for split index
split-index: count the number of deleted entries
t1700-split-index: date back files to avoid racy situations
split-index: add tests to demonstrate the racy split index problem
t1700-split-index: document why FSMONITOR is disabled in this test script
A partial clone that is configured to lazily fetch missing objects
will on-demand issue a "git fetch" request to the originating
repository to fill not-yet-obtained objects. The request has been
optimized for requesting a tree object (and not the leaf blob
objects contained in it) by telling the originating repository that
no blobs are needed.
* jt/non-blob-lazy-fetch:
fetch-pack: exclude blobs when lazy-fetching trees
fetch-pack: avoid object flags if no_dependents
Merge branch 'en/status-multiple-renames-to-the-same-target-fix' into maint
The code in "git status" sometimes hit an assertion failure. This
was caused by a structure that was reused without cleaning the data
used for the first run, which has been corrected.
* en/status-multiple-renames-to-the-same-target-fix:
commit: fix erroneous BUG, 'multiple renames on the same target? how?'
Merge branch 'ds/commit-graph-with-grafts' into maint
The recently introduced commit-graph auxiliary data is incompatible
with mechanisms such as replace & grafts that "breaks" immutable
nature of the object reference relationship. Disable optimizations
based on its use (and updating existing commit-graph) when these
incompatible features are in use in the repository.
* ds/commit-graph-with-grafts:
commit-graph: close_commit_graph before shallow walk
commit-graph: not compatible with uninitialized repo
commit-graph: not compatible with grafts
commit-graph: not compatible with replace objects
test-repository: properly init repo
commit-graph: update design document
refs.c: upgrade for_each_replace_ref to be a each_repo_ref_fn callback
refs.c: migrate internal ref iteration to pass thru repository argument
"git add ':(attr:foo)'" is not supported and is supposed to be
rejected while the command line arguments are parsed, but we fail
to reject such a command line upfront.
* nd/attr-pathspec-fix:
add: do not accept pathspec magic 'attr'
Merge branch 'bp/mv-submodules-with-fsmonitor' into maint
When fsmonitor is in use, after operation on submodules updates
.gitmodules, we lost track of the fact that we did so and relied on
stale fsmonitor data.
* bp/mv-submodules-with-fsmonitor:
git-mv: allow submodules and fsmonitor to work together
Merge branch 'js/rebase-i-autosquash-fix' into maint
"git rebase -i" did not clear the state files correctly when a run
of "squash/fixup" is aborted and then the user manually amended the
commit instead, which has been corrected.
* js/rebase-i-autosquash-fix:
rebase -i: be careful to wrap up fixup/squash chains
rebase -i --autosquash: demonstrate a problem skipping the last squash
"git interpret-trailers" and its underlying machinery had a buggy
code that attempted to ignore patch text after commit log message,
which triggered in various codepaths that will always get the log
message alone and never get such an input.
* jk/trailer-fixes:
append_signoff: use size_t for string offsets
sequencer: ignore "---" divider when parsing trailers
pretty, ref-filter: format %(trailers) with no_divider option
interpret-trailers: allow suppressing "---" divider
interpret-trailers: tighten check for "---" patch boundary
trailer: pass process_trailer_opts to trailer_info_get()
trailer: use size_t for iterating trailer list
trailer: use size_t for string offsets
The way .git/index and .git/sharedindex* files were initially
created gave these files different perm bits until they were
adjusted for shared repository settings. This was made consistent.
* cc/shared-index-permbits:
read-cache: make the split index obey umask settings
Recently added check for case smashing filesystems did not
correctly utilize the cached stat information, leading to false
breakage detected by our test suite, which has been corrected.
* nd/clone-case-smashing-warning:
clone: fix colliding file detection on APFS
As the warning message shown by existing versions of Git for
unknown index extensions is a bit too alarming, two new extensions
are held back and not written by default for the upcoming release.
* jn/eoie-ieot:
index: make index.threads=true enable ieot and eoie
ieot: default to not writing IEOT section
eoie: default to not writing EOIE section
Recently, built-in "rebase" tightened the error checking for a few
options that are passed to underlying "am", but we forgot to make
the matching change to the scripted version, which has been
corrected.
* js/rebase-am-options-fix:
legacy-rebase: backport -C<n> and --whitespace=<option> checks
index: make index.threads=true enable ieot and eoie
If a user explicitly sets
[index]
threads = true
to read the index using multiple threads, ensure that index writes
include the offset table by default to make that possible. This
ensures that the user's intent of turning on threading is respected.
In other words, permit the following configurations:
- index.threads and index.recordOffsetTable unspecified: do not write
the offset table yet (to avoid alarming the user with "ignoring IEOT
extension" messages when an older version of Git accesses the
repository) but do make use of multiple threads to read the index if
the supporting offset table is present.
This can also be requested explicitly by setting index.threads=true,
0, or >1 and index.recordOffsetTable=false.
- index.threads=false or 1: do not write the offset table, and do not
make use of the offset table.
One can set index.recordOffsetTable=false as well, to be more
explicit.
- index.threads=true, 0, or >1 and index.recordOffsetTable unspecified:
write the offset table and make use of threads at read time.
This can also be requested by setting index.threads=true, 0, >1, or
unspecified and index.recordOffsetTable=true.
Fortunately the complication is temporary: once most Git installations
have upgraded to a version with support for the IEOT and EOIE
extensions, we can flip the defaults for index.recordEndOfIndexEntries
and index.recordOffsetTable to true and eliminate the settings.
Helped-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b878579ae7 (clone: report duplicate entries on case-insensitive
filesystems - 2018-08-17) adds a warning to user when cloning a repo
with case-sensitive file names on a case-insensitive file system. The
"find duplicate file" check was doing by comparing inode number (and
only fall back to fspathcmp() when inode is known to be unreliable
because fspathcmp() can't cover all case folding cases).
The inode check is very simple, and wrong. It compares between a
32-bit number (sd_ino) and potentially a 64-bit number (st_ino). When
an inode is larger than 2^32 (which seems to be the case for APFS), it
will be truncated and stored in sd_ino, but comparing with itself will
fail.
As a result, instead of showing a pair of files that have the same
name, we show just one file (marked before the beginning of the
loop). We fail to find the original one.
but this is no longer a reliable test, there are 4G possible inodes
that can match sd_ino because we only match the lower 32 bits instead
of full 64 bits.
There are two options to go. Either we ignore inode and go with
fspathcmp() on Apple platform. This means we can't do accurate inode
check on HFS anymore, or even on APFS when inode numbers are still
below 2^32.
Or we just to to reduce the odds of matching a wrong file by checking
more attributes, counting mostly on st_size because st_xtime is likely
the same. This patch goes with this direction, hoping that false
positive chances are too small to be seen in practice.
While at there, enable the test on Cygwin (verified working by Ramsay
Jones)
(*) this is also already done inside match_stat_data()
Reported-by: Carlo Arenas <carenas@gmail.com> Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-objects: fix off-by-one in delta-island tree-depth computation
When delta-islands are in use, we need to record the deepest path at
which we find each tree and blob. Our loop to do so counts slashes, so
"foo" is depth 0, "foo/bar" is depth 1, and so on.
However, this neglects root trees, which are represented by the empty
string. Those also have depth 0, but are at a layer above "foo". Thus,
"foo" should be 1, "foo/bar" at 2, and so on. We use this depth to
topo-sort the trees in resolve_tree_islands(). As a result, we may fail
to visit a root tree before the sub-trees it contains, and therefore not
correctly pass down the island marks.
That in turn could lead to missing some delta opportunities (objects are
in the same island, but we didn't realize it) or creating unwanted
cross-island deltas (one object is in an island another isn't, but we
don't realize). In practice, it seems to have only a small effect. Some
experiments on the real-world git/git fork network at GitHub showed an
improvement of only 0.14% in the resulting clone size.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 108f530385 (pack-objects: move tree_depth into 'struct
packing_data', 2018-08-16) started maintaining a tree_depth array that
matches the "objects" array. We extend the array when:
1. The objects array is extended, in which case we use realloc to
extend the tree_depth array.
2. A caller asks to store a tree_depth for object N, and this is the
first such request; we create the array from scratch and store the
value for N.
In the latter case, though, we use regular xmalloc(), and the depth
values for any objects besides N is undefined. This happens to not
trigger a bug with the current code, but the reasons are quite subtle:
- we never ask about the depth for any object with index i < N. This is
because we store the depth immediately for all trees and blobs. So
any such "i" must be a non-tree, and therefore we will never need to
care about its depth (in fact, we really only care about the depth of
trees).
- there are no objects at this point with index i > N, because we
always fill in the depth for a tree immediately after its object
entry is created (we may still allocate uninitialized depth entries,
but they'll be initialized by packlist_alloc() when it initializes
the entry in the "objects" array).
So it works, but only by chance. To be defensive, let's zero the array,
which matches the "unset" values which would be handed out by
oe_tree_depth() already.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 108f530385 (pack-objects: move tree_depth into 'struct
packing_data', 2018-08-16) dynamically manages a tree_depth array in
packing_data that maintains one of these invariants:
1. tree_depth is NULL (i.e., the requested options don't require us to
track tree depths)
2. tree_depth is non-NULL and has as many entries as the "objects"
array
We maintain (2) by:
a. When the objects array grows, grow tree_depth to the same size
(unless it's NULL, in which case we can leave it).
b. When a caller asks to set a depth via oe_set_tree_depth(), if
tree_depth is NULL we allocate it.
But in (b), we use the number of stored objects, _not_ the allocated
size of the objects array. So we can run into a situation like this:
1. packlist_alloc() needs to store the Nth object, so it grows the
objects array to M, where M > N.
2. oe_set_tree_depth() wants to store a depth, so it allocates an
array of length N. Now we've violated our invariant.
3. packlist_alloc() needs to store the N+1th object. But it _doesn't_
grow the objects array, since N <= M still holds. We try to assign
to tree_depth[N+1], which is out of bounds.
That doesn't happen in our test scripts, because the repositories they
use are so small, but it's easy to trigger by running:
echo HEAD | git pack-objects --revs --delta-islands --stdout >/dev/null
in any reasonably-sized repo (like git.git).
We can fix it by always growing the array to match pack->nr_alloc, not
pack->nr_objects. Likewise for the "layer" array from fe0ac2fb7f
(pack-objects: move 'layer' into 'struct packing_data', 2018-08-16),
which has the same bug.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was a simple copy/paste error, and an obvious one at that: if we
cannot fill the tree descriptor, we should show an error message about
*that* tree, not another one.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with EOIE, popular versions of Git do not support the new IEOT
extension yet. When accessing a Git repository written by a more
modern version of Git, they correctly ignore the unrecognized section,
but in the process they loudly warn
ignoring IEOT extension
resulting in confusion for users. Introduce the index extension more
gently by not writing it yet in this first version with support for
it. Soon, once sufficiently many users are running a modern version
of Git, we can flip the default so users benefit from this index
extension by default.
Introduce a '[index] recordOffsetTable' configuration variable to
control whether the new index extension is written.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 3b1d9e04 (eoie: add End of Index Entry (EOIE) extension,
2018-10-10) Git defaults to writing the new EOIE section when writing
out an index file. Usually that is a good thing because it improves
threaded performance, but when a Git repository is shared with older
versions of Git, it produces a confusing warning:
$ git status
ignoring EOIE extension
HEAD detached at 371ed0defa
nothing to commit, working tree clean
Let's introduce the new index extension more gently. First we'll roll
out the new version of Git that understands it, and then once
sufficiently many users are using such a version, we can flip the
default to writing it by default.
Introduce a '[index] recordEndOfIndexEntries' configuration variable
to allow interested users to benefit from this index extension early.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
legacy-rebase: backport -C<n> and --whitespace=<option> checks
Since 04519d720114 (rebase: validate -C<n> and --whitespace=<mode>
parameters early, 2018-11-14), the built-in rebase validates the -C and
--whitespace arguments early. As this commit also introduced a
regression test for this, and as a later commit introduced the
GIT_TEST_REBASE_USE_BUILTIN mode to run tests, we now have a
"regression" in the scripted version of `git rebase` on our hands.
Backport the validation to fix this.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit-graph: split up close_reachable() progress output
Amend the progress output added in 7b0f229222 ("commit-graph write:
add progress output", 2018-09-17) so that the total numbers it reports
aren't higher than the total number of commits anymore. See [1] for a
bug report pointing that out.
When I added this I wasn't intending to provide an accurate count, but
just have some progress output to show the user the command wasn't
hanging[2]. But since we are showing numbers, let's make them
accurate. The progress descriptions were suggested by Derrick Stolee
in [3].
As noted in [2] we are unlikely to show anything except the "Expanding
reachable..." message even on fairly large repositories such as
linux.git. On a test repository I have with north of 7 million commits
all of these are displayed. Two of them don't show up for long, but as
noted in [5] future-proofing this for if the loops become more
expensive in the future makes sense.
test-lib-functions: make 'test_cmp_rev' more informative on failure
The 'test_cmp_rev' helper is merely a wrapper around 'test_cmp'
checking the output of two 'git rev-parse' commands, which means that
its output on failure is not particularly informative, as it's
basically two OIDs with a bit of extra clutter of the diff header, but
without any indication of which two revisions have caused the failure:
It also pollutes the test repo with these two intermediate files,
though that doesn't seem to cause any complications in our current
tests (meaning that I couldn't find any tests that have to work around
the presence of these files by explicitly removing or ignoring them).
Enhance 'test_cmp_rev' to provide a more useful output on failure with
less clutter:
Doing so is more convenient when storing the OIDs outputted by 'git
rev-parse' in a local variable each, which, as a bonus, won't pollute
the repository with intermediate files.
While at it, also ensure that 'test_cmp_rev' is invoked with the right
number of parameters, namely two.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
tests: send "bug in the test script" errors to the script's stderr
Some of the functions in our test library check that they were invoked
properly with conditions like this:
test "$#" = 2 ||
error "bug in the test script: not 2 parameters to test-expect-success"
If this particular condition is triggered, then 'error' will abort the
whole test script with a bold red error message [1] right away.
However, under certain circumstances the test script will be aborted
completely silently, namely if:
- a similar condition in a test helper function like
'test_line_count' is triggered,
- which is invoked from the test script's "main" shell [2],
- and the test script is run manually (i.e. './t1234-foo.sh' as
opposed to 'make t1234-foo.sh' or 'make test') [3]
- and without the '--verbose' option,
because the error message is printed from within 'test_eval_', where
standard output is redirected either to /dev/null or to a log file.
The only indication that something is wrong is that not all tests in
the script are executed and at the end of the test script's output
there is no "# passed all N tests" message, which are subtle and can
easily go unnoticed, as I had to experience myself.
Send these "bug in the test script" error messages directly to the
test scripts standard error and thus to the terminal, so those bugs
will be much harder to overlook. Instead of updating all ~20 such
'error' calls with a redirection, let's add a BUG() function to
'test-lib.sh', wrapping an 'error' call with the proper redirection
and also including the common prefix of those error messages, and
convert all those call sites [4] to use this new BUG() function
instead.
[1] That particular error message from 'test_expect_success' is
printed in color only when running with or without '--verbose';
with '--tee' or '--verbose-log' the error is printed without
color, but it is printed to the terminal nonetheless.
[2] If such a condition is triggered in a subshell of a test, then
'error' won't be able to abort the whole test script, but only the
subshell, which in turn causes the test to fail in the usual way,
indicating loudly and clearly that something is wrong.
[3] Well, 'error' aborts the test script the same way when run
manually or by 'make' or 'prove', but both 'make' and 'prove' pay
attention to the test script's exit status, and even a silently
aborted test script would then trigger those tools' usual
noticable error messages.
[4] Strictly speaking, not all those 'error' calls need that
redirection to send their output to the terminal, see e.g.
'test_expect_success' in the opening example, but I think it's
better to be consistent.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
A coding convention around the Coccinelle semantic patches to have
two classes to ease code migration process has been proposed and
its support has been added to the Makefile.
Update the "test installed Git" mode of our test suite to work better.
* js/test-git-installed:
tests: explicitly use `git.exe` on Windows
tests: do not require Git to be built when testing an installed Git
t/lib-gettext: test installed git-sh-i18n if GIT_TEST_INSTALLED is set
tests: respect GIT_TEST_INSTALLED when initializing repositories
tests: fix GIT_TEST_INSTALLED's PATH to include t/helper/
The xcurl_off_t() helper function is used to cast size_t to
curl_off_t, but some compilers gave warnings against the code to
ensure the casting is done without wraparound, when size_t is
narrower than curl_off_t. This warning has been squelched.
* tb/xcurl-off-t:
remote-curl.c: xcurl_off_t is not portable (on 32 bit platfoms)
"git push" used to check ambiguities between object-names and
refnames while processing the list of refs' old and new values,
which was unnecessary (as it knew that it is feeding raw object
names). This has been optimized out.
Our testing framework uses a special i18n "poisoned localization"
feature to find messages that ought to stay constant but are
incorrectly marked to be translated. This feature has been made
into a runtime option (it used to be a compile-time option).
* ab/dynamic-gettext-poison:
Makefile: ease dynamic-gettext-poison transition
i18n: make GETTEXT_POISON a runtime option
This lets us use :(attr) with "git grep <tree-ish>" or "git log".
:(attr) requires another round of checking before we can declare that
a path is matched. This is done after path matching since we have lots
of optimization to take a shortcut when things don't match.
Note that if :(attr) is present, we can't return
all_entries_interesting / all_entries_not_interesting anymore because
we can't be certain about that. Not until match_pathspec_attrs() can
tell us "yes all these paths satisfy :(attr)".
Second note. Even though we walk a specific tree, we use attributes
from _worktree_ (or falling back to the index), not from .gitattributes
files on that tree. This by itself is not necessarily wrong, but the
user just have to be aware of this.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function will be reused for matching attributes in pathspec when
walking trees (currently it's used for matching pathspec when walking
a list). pathspec.c would be a more neutral place for this.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
tree-walk.c: make tree_entry_interesting() take an index
In order to support :(attr) when matching pathspec on a tree,
tree_entry_interesting() needs to take an index (because
git_check_attr() needs it). This is the preparation step for it. This
also makes it clearer what index we fall back to when looking up
attributes during an unpack-trees operation: the source index.
This also fixes revs->pruning.repo initialization that should have
been done in 2abf350385 (revision.c: remove implicit dependency on
the_index - 2018-09-21). Without it, skip_uninteresting() will
dereference a NULL pointer through this call chain
tree.c: make read_tree*() take 'struct repository *'
These functions call tree_entry_interesting() which will soon require
a 'struct index_state *' to be passed in. Instead of just changing the
function signature to take an index, update to take a repo instead
because these functions do need object database access.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
read-cache: make the split index obey umask settings
Make the split index write out its .git/sharedindex_* files with the
same permissions as .git/index. This only changes the behavior when
core.sharedRepository isn't set, i.e. the user's umask settings will
be respected.
This hasn't been the case ever since the split index was originally
implemented in c18b80a0e8 ("update-index: new options to
enable/disable split index mode", 2014-06-13). A mkstemp()-like
function has always been used to create it. First mkstemp() itself,
and then later our own mkstemp()-like in f6ecc62dbf ("write_shared_index(): use tempfile module", 2015-08-10)
A related bug was fixed in df801f3f9f ("read-cache: use shared perms
when writing shared index", 2017-06-25). Since then the split index
has respected core.sharedRepository.
However, using that setting should not be required simply to make git
obey the user's umask setting. It's intended for the use-case of
overriding whatever that umask is set to. This fixes cases where the
user has e.g. set his umask to 022 on a shared server in anticipation
of other user's needing to run "status", "log" etc. in his repository.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git stash" command insists on having a usable user identity to
the same degree as the "git commit-tree" and "git commit" commands
do, because it uses the same codepath that creates commit objects
as these commands.
It is not strictly necesary to do so. Check if we will barf before
creating commit objects and then supply fake identity to please the
machinery that creates commits.
Add test to document that stash executes correctly both with and
without valid ident.
This is not that much of usability improvement, as the users who run
"git stash" would eventually want to record their changes that are
temporarily stored in the stashes in a more permanent history by
committing, and they must do "git config user.{name,email}" at that
point anyway, so arguably this change is only delaying a step that
is necessary to work in the repository.
Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Slavica Djukic <slawica92@hotmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git bundle" aborts due to an empty commit ranges
(i.e. resulting in an empty pack), it left a file descriptor to an
lockfile open, which resulted in leftover lockfile on Windows where
you cannot remove a file with an open file descriptor. This has
been corrected.
* jk/close-duped-fd-before-unlock-for-bundle:
bundle: dup() output descriptor closer to point-of-use
The recently merged "rebase in C" has an escape hatch to use the
scripted version when necessary, but it hasn't been documented,
which has been corrected.
* ab/rebase-in-c-escape-hatch:
tests: add a special setup where rebase.useBuiltin is off
rebase doc: document rebase.useBuiltin
The way "git rebase" parses and forwards the command line options
meant for underlying "git am" has been revamped, which fixed for
options with parameters that were not passed correctly.
* js/rebase-am-options:
rebase: validate -C<n> and --whitespace=<mode> parameters early
rebase: really just passthru the `git am` options
"git ls-remote --sort=<thing>" can feed an object that is not yet
available into the comparison machinery and segfault, which has
been corrected to check such a request upfront and reject it.
* sg/ref-filter-wo-repository:
ref-filter: don't look for objects when outside of a repository
Bugfix for the recently graduated "git rebase --rebase-merges".
* js/rebase-r-and-merge-head:
status: rebase and merge can be in progress at the same time
built-in rebase --skip/--abort: clean up stale .git/<name> files
rebase -i: include MERGE_HEAD into files to clean up
rebase -r: do not write MERGE_HEAD unless needed
rebase -r: demonstrate bug with conflicting merges
When editing a patch in a "git add -i" session, a hunk could be
made to no-op. The "git apply" program used to reject a patch with
such a no-op hunk to catch user mistakes, but it is now updated to
explicitly allow a no-op hunk in an edited patch.
"git rebase --autostash" did not correctly re-attach the HEAD at times.
* js/rebase-autostash-detach-fix:
built-in rebase --autostash: leave the current branch alone if possible
built-in rebase: demonstrate regression with --autostash
The "--no-patch" option, which can be used to get a high-level
overview without the actual line-by-line patch difference shown, of
the "range-diff" command was earlier broken, which has been
corrected.
* ab/range-diff-no-patch:
range-diff: make diff option behavior (e.g. --stat) consistent
range-diff: fix regression in passing along diff options
range-diff doc: add a section about output stability
Various functions have been audited for "-Wunused-parameter" warnings
and bugs in them got fixed.
* jk/unused-parameter-fixes:
midx: double-check large object write loop
assert NOARG/NONEG behavior of parse-options callbacks
parse-options: drop OPT_DATE()
apply: return -1 from option callback instead of calling exit(1)
cat-file: report an error on multiple --batch options
tag: mark "--message" option with NONEG
show-branch: mark --reflog option as NONEG
format-patch: mark "--no-numbered" option with NONEG
status: mark --find-renames option with NONEG
cat-file: mark batch options with NONEG
pack-objects: mark index-version option as NONEG
ls-files: mark exclude options as NONEG
am: handle --no-patch-format option
apply: mark include/exclude options as NONEG
fast-export: add a --show-original-ids option to show original names
Knowing the original names (hashes) of commits can sometimes enable
post-filtering that would otherwise be difficult or impossible. In
particular, the desire to rewrite commit messages which refer to other
prior commits (on top of whatever other filtering is being done) is
very difficult without knowing the original names of each commit.
In addition, knowing the original names (hashes) of blobs can allow
filtering by blob-id without requiring re-hashing the content of the
blob, and is thus useful as a small optimization.
Once we add original ids for both commits and blobs, we may as well
add them for tags too for completeness. Perhaps someone will have a
use for them.
This commit teaches a new --show-original-ids option to fast-export
which will make it add a 'original-oid <hash>' line to blob, commits,
and tags. It also teaches fast-import to parse (and ignore) such
lines.
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import.c has started with a comment for nine and a half years
re-directing the reader to Documentation/git-fast-import.txt for
maintained documentation. Instead of leaving the unmaintained
documentation in place, just excise it.
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git filter-branch has a nifty feature allowing you to rewrite, e.g. just
the last 8 commits of a linear history
git filter-branch $OPTIONS HEAD~8..HEAD
If you try the same with git fast-export, you instead get a history of
only 8 commits, with HEAD~7 being rewritten into a root commit. There
are two alternatives:
1) Don't use the negative revision specification, and when you're
filtering the output to make modifications to the last 8 commits,
just be careful to not modify any earlier commits somehow.
2) First run 'git fast-export --export-marks=somefile HEAD~8', then
run 'git fast-export --import-marks=somefile HEAD~8..HEAD'.
Both are more error prone than I'd like (the first for obvious reasons;
with the second option I have sometimes accidentally included too many
revisions in the first command and then found that the corresponding
extra revisions were not exported by the second command and thus were
not modified as I expected). Also, both are poor from a performance
perspective.
Add a new --reference-excluded-parents option which will cause
fast-export to refer to commits outside the specified rev-list-args
range by their sha1sum. Such a stream will only be useful in a
repository which already contains the necessary commits (much like the
restriction imposed when using --no-data).
Note from Peff:
I think we might be able to do a little more optimization here. If
we're exporting HEAD^..HEAD and there's an object in HEAD^ which is
unchanged in HEAD, I think we'd still print it (because it would not
be marked SHOWN), but we could omit it (by walking the tree of the
boundary commits and marking them shown). I don't think it's a
blocker for what you're doing here, but just a possible future
optimization.
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If file paths are specified to fast-export and a ref points to a commit
that does not touch any of the relevant paths, then that ref would
sometimes fail to be exported. (This depends on whether any ancestors
of the commit which do touch the relevant paths would be exported with
that same ref name or a different ref name.) To avoid this problem,
put *all* specified refs into extra_refs to start, and then as we export
each commit, remove the refname used in the 'commit $REFNAME' directive
from extra_refs. Then, in handle_tags_and_duplicates() we know which
refs actually do need a manual reset directive in order to be included.
This means that we do need some special handling for excluded refs; e.g.
if someone runs
git fast-export ^master master
then they've asked for master to be exported, but they have also asked
for the commit which master points to and all of its history to be
excluded. That logically means ref deletion. Previously, such refs
were just silently omitted from being exported despite having been
explicitly requested for export.
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-export: when using paths, avoid corrupt stream with non-existent mark
If file paths are specified to fast-export and multiple refs point to a
commit that does not touch any of the relevant file paths, then
fast-export can hit problems. fast-export has a list of additional refs
that it needs to explicitly set after exporting all blobs and commits,
and when it tries to get_object_mark() on the relevant commit, it can
get a mark of 0, i.e. "not found", because the commit in question did
not touch the relevant paths and thus was not exported. Trying to
import a stream with a mark corresponding to an unexported object will
cause fast-import to crash.
Avoid this problem by taking the commit the ref points to and finding an
ancestor of it that was exported, and make the ref point to that commit
instead.
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-export: avoid dying when filtering by paths and old tags exist
If --tag-of-filtered-object=rewrite is specified along with a set of
paths to limit what is exported, then any tags pointing to old commits
that do not contain any of those specified paths cause problems. Since
the old tagged commit is not exported, fast-export attempts to rewrite
such tags to an ancestor commit which was exported. If no such commit
exists, then fast-export currently die()s. Five years after the tag
rewriting logic was added to fast-export (see commit 2d8ad4691921,
"fast-export: Add a --tag-of-filtered-object option for newly dangling
tags", 2009-06-25), fast-import gained the ability to delete refs (see
commit 4ee1b225b99f, "fast-import: add support to delete refs",
2014-04-20), so now we do have a valid option to rewrite the tag to.
Delete these tags instead of dying.
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ABORT and ERROR happen to have the same value, but come from differnt
enums. Use the one from the correct enum, and while at it, rename the
values to avoid such problems.
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>