[PATCH] fix scalability problems with git-deltafy-script
The current version would spin forever and exhaust memory while attempting
to sort all files from all revisions at once, and would die before even
doing any real work. This is especially noticeable when used on a big
repository like the imported bkcvs repo for the Linux kernel.
This patch batches the sort to put a bound on the resources needed
and to make progress early. It also includes some small
cleanups.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
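As a rough illustration of the batching idea in the message above (not the
script's actual code), the Python sketch below sorts an input stream in
bounded batches instead of all at once. The function name and the batch_size
parameter are hypothetical.

```python
from itertools import islice


def sorted_batches(items, batch_size, key=None):
    """Yield sorted lists of at most batch_size items taken from an iterable.

    Only one batch is held in memory at a time, so the first results are
    available before the whole input has been read.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        batch.sort(key=key)
        yield batch
```

Each batch is sorted only relative to itself, so this trades a globally
optimal ordering for bounded memory use and early progress.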
Let "git commit" take arguments for files to commit.
It does a "git-update-cache" on the arguments, meaning that you can
commit files without doing a separate "git-update-cache". This commit
was done with
[PATCH] git-resolve-script: Add LAST_MERGE and use git-rev-parse
Make git-resolve-script only write MERGE_HEAD if a merge actually
occurred. All merge failures leave ORIG_HEAD and LAST_MERGE
behind (instead of ORIG_HEAD and MERGE_HEAD).
Use git-rev-parse to expand arguments (and check for bad ones).
Signed-off-by: Dan Holmsand <holmsand@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Relaxes error checking in epoch.c to allow duplicate parents
Given that real trees in the wild include commits with duplicate parents, I have relaxed
the over-zealous error checking in epoch.c and dealt with the problem a different way: duplicate
parents are now silently ignored.
Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
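A minimal Python sketch of "silently ignore duplicate parents", assuming a
commit's parents are given as a simple list of ids. This is an illustration
only, not the code in epoch.c, and the function name is hypothetical.

```python
def unique_parents(parents):
    """Return the parent ids with repeats dropped, keeping first-seen order."""
    seen = set()
    result = []
    for parent in parents:
        if parent in seen:
            continue  # duplicate parent: skip it without raising an error
        seen.add(parent)
        result.append(parent)
    return result
```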
3. if one of the specified heads is reachable from the other, the
head gets printed twice and this causes problems for upcoming
versions of gitk. This is true for both --merge-order and non
--merge-order style of invocations.
* FAIL 24: one specified head reachable from another a4, c3, --merge-order
* FAIL 26: one specified head reachable from another a4, c3, no --merge-order
* FAIL 27: one specified head reachable from another c3, a4, no --merge-order
4. --merge-order aborts with commits that list the same parent twice...it should handle it more gracefully.
* no longer unit testable
5. broken interaction between --merge-order and --max-age
previously posted as:
"[PATCH 1/2] Test case that demonstrates problem with --merge-order, --max-age interaction"
* FAIL 23: --max-age=c3, --merge-order
Later patches in this patch set fix these problems.
Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The patch for a completely rewritten file detected by the -B flag
was shown as a pair of creation followed by deletion in earlier
versions. This was a misguided attempt to make reviewing such
a complete rewrite easier, and it needlessly ended up confusing
git-apply. Instead, show the entire contents of the old version
prefixed with '-', followed by the entire contents of the new
version prefixed with '+'. This is just as easy to review
for a human consumer, while keeping it a single, regular
modification patch for machine consumption, something that even
GNU patch can grok.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Like diff-tree, this patch makes the -C option of the other diff-* brothers
use only the pre-images of modified files as rename/copy detection sources
by default. Give --find-copies-harder to use unmodified files
as copy sources as well.
This also fixes the "diff-files -C" problem noticed earlier by
Linus. It was feeding the null sha1 even when the file in the
work tree was known to match what is in the index file. This
resulted in diff-files showing everything in the project.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I wanted to be able to track CVS repositories in a GIT repository. The
cvs2git program worked fine with the initial import but needed a tiny
modification to enable me to resync the GIT repository with the updated
CVS tree.
[ The original version of this patch failed to track the correct
branch on the first new commit. Fixed and tested by Sven. ]
But warn about them. If somebody really ends up later wanting to
explicitly add a note that something has the same parent twice (who
knows, there are strange people around), we can add a flag to say that
it's expected and ok.
This was brought on by a commit in the kernel tree, where a repeated
merge caused a duplicate parent.
Parent duplicates aren't "wrong" per se, they're just in practice not
something you are ever interested in.
This is (imho) more readable, and is also a lot faster. The expense of
looking up sub-directory beginnings was killing us on things like
"git-diff-cache", even though that one didn't even care at all about the
file vs directory conflicts.
We really only care when somebody tries to add a conflicting name to
stage 0.
We should go through the conflict rules more carefully some day.
git-rev-list: add "--bisect" flag to find the "halfway" point
This is useful for doing binary searching for problems. You start with
a known good and known bad point, and you then test the "halfway" point
in between:
git-rev-list --bisect bad ^good
and you test that. If that one tests good, you now still have a known
bad case, but two known good points, and you can bisect again:
git-rev-list --bisect bad ^good1 ^good2
and test that point. If that point is bad, you now use that as your
known-bad starting point:
git-rev-list --bisect newbad ^good1 ^good2
and basically at every iteration you shrink your list of commits by
half: you're binary searching for the point where the troubles started,
even though there isn't a nice linear ordering.
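To make the "halfway" idea concrete, here is a small Python sketch over a
toy commit graph. It assumes the history is a dict mapping each commit id to
its list of parents, and it picks the candidate whose ancestor count (within
the candidate set) is closest to half. The function names are hypothetical,
and this simple version is quadratic, so treat it as an illustration rather
than as what git-rev-list does internally.

```python
def ancestors(parents, tips):
    """Return every commit reachable from tips, including the tips themselves."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def bisect_point(parents, bad, goods):
    """Pick a commit that splits the 'bad ^good...' set roughly in half."""
    candidates = ancestors(parents, [bad]) - ancestors(parents, goods)
    total = len(candidates)
    best, best_score = None, -1
    for commit in sorted(candidates):  # sorted only to make ties deterministic
        reach = len(ancestors(parents, [commit]) & candidates)
        score = min(reach, total - reach)  # how evenly this commit splits the set
        if score > best_score:
            best, best_score = commit, score
    return best


# Toy linear history c1 <- c2 <- ... <- c9 (hypothetical commit names).
history = {"c1": []}
for i in range(2, 10):
    history["c%d" % i] = ["c%d" % (i - 1)]

print(bisect_point(history, "c9", ["c1"]))        # first halfway point
print(bisect_point(history, "c9", ["c1", "c5"]))  # after c5 tested good
```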
Use "-M" instead of "-C" for "git diff" and "git status"
The "C" in "-C" may stand for "Cool", but it's also pretty slow, since
right now it leaves all unmodified files to be tested even if there are
no new files at all. That just ends up being unacceptably slow for big
projects, especially if it's not all in the cache.
Jens was the second person who hadn't heard of the "merge" program, and
didn't have it installed. So document as many dependency and install
issues as I can think of.
Sometimes we only want to output revisions, and sometimes we want to
only see the stuff that wasn't revisions. Teach git-rev-parse to
understand the "--revs-only" and "--no-revs" flags.
It's an incredibly cheesy helper that changes human-readable revision
arguments into the git-rev-list argument format.
You can use it to do something like this:
git-rev-list --pretty $(git-rev-parse --default HEAD "$@")
which is what git-log-script will become. Here git-rev-parse will
then allow you to use arguments like "v2.6.12-rc5.." or similar
human-readable ranges.
It's really quite stupid: "a..b" will be converted into "b" and "^a" if
"a" and "b" are valid object pointers. And the "--default" case will be
used if nothing but flags have been seen, so that you can default to a
certain argument if there are no other ranges.
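The "--default" rule can be sketched in a few lines of Python. This is only
an illustration: the function name is hypothetical, and treating every
argument that starts with "-" as a flag is an assumption of the sketch.

```python
def with_default(args, default):
    """Append default when args contains nothing but flags."""
    # Assumption: anything starting with "-" counts as a flag.
    if all(arg.startswith("-") for arg in args):
        return list(args) + [default]
    return list(args)
```

With this rule, a bare invocation or one that passes only flags falls back to
the default revision, while any explicit revision or range suppresses it.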
Add "-z fuzz" argument, passed to cvsps, and clean up argument
processing. Also, use "cvsps --cvs-direct", which is somewhat
faster.
Give the user the option of specifying the timestamp fuzz passed to
cvsps. Looking at the other arguments to it, I can't see anything else
that would be sane to play with. Also, use --cvs-direct, which speeds
up cvsps for remote repositories and doesn't seem to do anything bad to
local repositories.
Signed-off-by: Tommy McGuire <mcguire@crsr.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds some sanity checking to git-cvsimport-script,
specifically forcing the use of cvsps -x (to get the latest information
from the repository, rather than whatever is in the cache) and aborting
early if cvsps does not produce any output.
I debated removing the $MODULE directory following an abort, but I
eventually decided leaving stuff behind would make debugging easier. On
the other hand, this patch should help with the "cvsimport left me with
an empty repository" complaints.
Call cvsps with the -x flag, to get the current state of the repository,
and abort the cvs import early if cvsps does not produce any output.
Signed-off-by: Tommy McGuire <mcguire@crsr.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] diff-stages: unuglify the too big main() function.
Split the core of the program, diff_stage, out of the one big "main()"
function that did it all, leaving only the parameter parsing,
setup and finalization in main().
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] read-tree: loosen too strict index requirements
This patch teaches read-tree 3-way merge that, when only "the
other tree" changed a path, and if the index file already has
the same change, we are not in a situation that would clobber
the index and the work tree, and lets the merge succeed; this is
case #14ALT in t1000 test. It does not change the result of the
merge, but prevents it from failing when it does not have to.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Finish making --emu23 equivalent to pure 2-way merge.
This adds #3ALT rule (and #2ALT rule for symmetry) to the
read-tree 3-way merge logic that collapses paths that are added
only in one branch and not in the other internally.
This makes --emu23 succeed in the last remaining case where
the pure 2-way merge succeeded and earlier one failed. Running
diff between t1001 and t1005 test scripts shows that the only
difference between the two is that --emu23 can leave the states
into separate stages so that the user can use usual 3-way merge
resolution techniques to carry forward the local changes when
pure 2-way merge would have refused to run.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] read-tree: fix too strong index requirement #5ALT
This fixes a too strong index requirement that the 3-way merge enforces in
one case: the same file is added in both branches.
In this case, the original code insisted that if the index file
has that path, it must match our branch and be up-to-date.
However in this particular case, it only has to match it, and
can be dirty. We just need to make sure that we keep the
work-tree copy instead of checking out the merge result.
The resolution of such a path, however, cannot be left to
an outside script, because we will not keep the original stage0
entries for unmerged paths when read-tree finishes, and at that
point, the knowledge of "if we resolve it to match the new file
added in both branches, the merge succeeds and the work tree
would not lose information, but we should _not_ update the work
tree from the resulting index file" is lost. For this reason,
the new code needs to resolve this case (#5ALT) internally.
This affects some existing tests in the test suite, but all in
positive ways. In t1000 (3-way test), this #5ALT case now gets
one stage0 entry, instead of an identical stage2 and stage3
entry pair, for such a path, and one test that checked for merge
failure (because the test assumed the "stricter-than-necessary"
behaviour) does not have to fail anymore. In t1005 (emu23
test), two tests that involve a case where the work tree
already had a change introduced in the upstream (aka "merged
head") now succeed instead of failing.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This new flag causes two-way fast forward to internally use the
three-way merge mechanism. This behaviour is intended to offer
better fast-forward semantics when used in a dirty work tree.
The new test t1005 is parallel to the existing t1001 "pure
2-way" tests, but some parts that are commented out would fail.
These failures are due to three-way merge enforcing too strict
index requirements for cases that could succeed. This problem
will be addressed by later patches.
Even without changing the three-way mechanism, the --emu23 two-way
fast forward already gives the user an easier-to-handle merge
result when a file that "merged head" updates has local
modifications. This is demonstrated as "case 16" test in t1005.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is in preparation for "2-way fast-forward emulated with
3-way mechanism" series. It does not change what the tests for
pure 2-way do. It only changes how it tests things, to make
reviewing of differences of the two tests easier in later steps.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Add --diff-filter= output restriction to diff-* family.
This is halfway between a debugging aid and a helper for writing
ultra-smart merge scripts. The new option takes a string that
consists of a list of "status" letters, and limits the diff
output to only those classes of changes, with two exceptions:
- A broken pair (aka "complete rewrite"), does not match D
(deleted) or N (created). Use B to look for them.
- The letter "A" in the diff-filter string does not match
anything itself, but causes the entire diff that contains
selected patches to be output (this behaviour is similar to
that of --pickaxe-all for the -S option).
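The filtering rules above can be sketched in Python as follows. This is an
illustration under assumptions, not the diffcore code: each change is
modelled as a (status, path) tuple, the function name is hypothetical, and
only the status letters named above (D, N, B) appear in the example.

```python
def apply_diff_filter(entries, diff_filter):
    """Keep only the entries whose status letter is listed in diff_filter.

    "A" matches nothing by itself; if it is present and at least one entry
    is selected by the other letters, the entire diff is returned.
    """
    wanted = set(diff_filter) - {"A"}
    selected = [entry for entry in entries if entry[0] in wanted]
    if "A" in diff_filter:
        return list(entries) if selected else []
    return selected


# A broken pair carries its own status "B", so "D" and "N" never match it.
changes = [("N", "new.c"), ("D", "old.c"), ("B", "rewritten.c")]
print(apply_diff_filter(changes, "D"))
print(apply_diff_filter(changes, "AD"))
```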
Normally, diff-tree does not feed unchanged filepair to diffcore
for performance reasons, so copies are detected only when the
source file of the copy happens to be modified in the same
changeset. This adds --find-copies-harder flag to tell
diff-tree to sacrifice the performance in order to find copies
the same way as other commands in diff-* family.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Fix rename/copy when dealing with temporarily broken pairs.
When rename/copy uses a file that was broken by diffcore-break
as the source, and the broken filepair gets merged back later,
the output was mislabeled as a rename. In this case, the source
file ends up staying in the output, so we should label it as a
copy instead.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Re-Fix SIGSEGV on unmerged files in git-diff-files -p
When an unmerged path was fed via diff_unmerged() into diffcore,
it eventually called run_diff() with the "one" and "two" parameters
set to NULL, but run_diff() was not written carefully enough to
notice this situation.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A meaningful (ie non-empty) git patch always has more information in the
header than just the "diff --git" line itself: it needs to have either a
patch associated with it (which implies "---" and "+++" lines in the
header) or it needs to have rename/copy/delete/create information in it.
Just ignore git patches which have no change information. Otherwise we'll
end up with a patch that doesn't have filenames etc filled in, and we'll
be unhappy.
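The check described above might look roughly like this in Python. It is a
sketch, not the git-apply source: the function name is hypothetical, and the
header prefixes are assumed to follow git's extended patch header wording.

```python
# Assumed header prefixes that carry change information.
CHANGE_PREFIXES = (
    "--- ", "+++ ",
    "rename from ", "rename to ",
    "copy from ", "copy to ",
    "deleted file mode ", "new file mode ",
)


def has_change_info(header_lines):
    """Return True if the header holds more than the bare "diff --git" line."""
    return any(
        line.startswith(CHANGE_PREFIXES)
        for line in header_lines
        if not line.startswith("diff --git ")
    )
```

A patch whose header fails this check would simply be skipped.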
The diff-* brothers acquired a sibling, git-diff-stages. With
an unmerged index file, you specify two stage numbers and it
shows the differences between them.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is the same as "-m", but it will silently ignore any unmerged
entries, which makes it useful for efficiently forcing a new position
regardless of the state of the current index file.
IOW, to reset to a previous HEAD (in case you have had a failed
merge, for example), you'd just do
git-read-tree -u --reset HEAD
which will also update your working tree to the right state.
NOTE! The "update" will not remove files that may have been added by the
merge. Yet.
One more time.. Clean up git-merge-one-file-script
This uses git-checkout-file to make sure that the full pathname is
created, instead of the script having to verify it by hand. Also,
simplify the 3-way merge case by just writing to the right file and
setting the initial index contents early.
[PATCH] git-merge-one-file-script cleanups from Cogito
Chain the resolving sequences (e.g. git-cat-file - chmod -
git-update-cache) through &&s so we stop right away in case one of the
commands fails, and report the error code to the script caller.
Also add a copyright notice, some blank lines, ;; on a separate line,
and nicer error messages.
Signed-off-by: Petr Baudis <pasky@ucw.cz> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
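The effect of chaining commands through && can be mimicked in Python as
below: stop at the first failing step and hand its exit code back to the
caller. The helper name and the two stand-in commands are hypothetical; the
script itself uses plain shell.

```python
import subprocess
import sys


def run_chain(commands):
    """Run commands in order; stop at the first failure and return its exit code."""
    for argv in commands:
        code = subprocess.run(argv).returncode
        if code != 0:
            return code  # later commands are never started
    return 0


# Hypothetical stand-ins for real commands: one succeeds, one exits with 3.
OK = [sys.executable, "-c", "pass"]
FAIL = [sys.executable, "-c", "import sys; sys.exit(3)"]
```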
This adds a set of tests to make sure that requirements on
existing cache entries are checked when a read-tree -m 3-way
merge is run with an already populated index file.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This changes how we handle merges: if an automated merge
fails, we will leave the index as a clean entry pointing
to the original branch, and leave the actual file _dirty_
the way the "merge" program left it.
You can then just do "git-diff-files -p" to see what the
merge conflicts did, fix them up, and commit the end result.
NOTE NOTE NOTE! Do _not_ use "git commit" to commit such
a merge. It won't set the parents right. I'll need to fix
that. In the meantime, you'd need to merge using
git-commit-tree $(git-write-tree) -p HEAD -p MERGE_HEAD
This updates t1000 (basic 3-way merge test) to check the merge
results for both successful cases (earlier one checked the
result for only one of them). Also fixes typos in t1002 that
broke '&&' chain, potentially missing a test failure before the
chain got broken.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes three bugs in --merge-order support
* mark_ancestors_uninteresting was unnecessarily exponential which
caused a problem when a commit with no parents was merged near the
head of something like the linux kernel
* removed a spurious statement from find_base which apparently wasn't
causing problems, but wasn't correct either.
* removed an unnecessarily strict check from find_base_for_list
that causes a problem if git-rev-list commit ^parent-of-commit
is specified.
* added some unit tests which were accidentally omitted from
original merge-order patch
The fix to mark_ancestors_uninteresting isn't an optimal fix - a full
graph scan will still be performed in this case even though it is
not strictly required. However, a full graph scan is linear
and still no worse than git-rev-list HEAD which runs in less than 2
seconds on a warm cache.
Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
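For illustration, here is why remembering visited commits keeps an ancestor
walk linear. Without the "marked" check, a history with many merges is
re-walked once per path, which grows exponentially. This Python sketch is not
the code from epoch.c; the function name and the dict-of-parents graph are
assumptions.

```python
def mark_ancestors(parents, start):
    """Return the set of all ancestors of start, visiting each commit once."""
    marked = set()
    stack = list(parents.get(start, []))
    while stack:
        commit = stack.pop()
        if commit in marked:
            continue  # reached again through another path: do not re-walk it
        marked.add(commit)
        stack.extend(parents.get(commit, []))
    return marked


# Hypothetical worst case: 40 layers in which every commit merges both
# commits of the layer below, giving an exponential number of paths.
layered = {"a0": [], "b0": []}
for i in range(1, 40):
    below = ["a%d" % (i - 1), "b%d" % (i - 1)]
    layered["a%d" % i] = below
    layered["b%d" % i] = below

print(len(mark_ancestors(layered, "a39")))
```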
Talk about "git cvsimport" in the cvs migration docs
We should add a lot more information about how you copy repositories,
pulling and pushing, merging etc. Oh, well. I'm not exactly known for
my documentation skills. Maybe somebody else will help me..