The "score" calculation for diffcore-rename was totally broken.
It scaled "score" as
score = src_copied * MAX_SCORE / dst->size;
which means that you got a 100% similarity score even if src and dest were
different, if just every byte of dst was copied from src, even if source
was much larger than dst (eg we had copied 85% of the bytes, but _deleted_
the remaining 15%).
That's clearly bogus. We should do the score calculation relative not to
the destination size, but to the max size of the two.
This seems to fix it.
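
A minimal sketch of the corrected scaling, for illustration only. The
helper name, the integer types, the MAX_SCORE value, and the handling of
two empty files are assumptions of this sketch, not the actual
diffcore-rename code:

```c
/* Assumed value for illustration; the real constant may differ. */
#define MAX_SCORE 60000UL

/*
 * Hypothetical helper: scale the number of copied bytes against the
 * larger of the two sizes, so that deleting part of the source lowers
 * the score even when every byte of dst came from src.
 */
static unsigned long similarity_score(unsigned long src_copied,
                                      unsigned long src_size,
                                      unsigned long dst_size)
{
	unsigned long max_size = src_size > dst_size ? src_size : dst_size;

	if (!max_size)
		return MAX_SCORE; /* assumption: two empty files count as identical */
	return (unsigned long)((unsigned long long)src_copied * MAX_SCORE / max_size);
}
```

With a 100-byte source of which 85 bytes survive into an 85-byte
destination, this yields 85% of MAX_SCORE instead of 100%.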
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
This tweaks the maximum hashvalue we use to hash the string into
without letting the maximum size of the hashtable grow beyond
the current limit. With this, the renames detected become a
bit more precise without incurring additional paging cost.
This changes diffcore-rename to reuse statistics information
gathered during similarity estimation, and updates the hashtable
implementation used to keep track of the statistics to be
denser. This seems to give better performance.
diffcore-delta: make the change counter byte oriented again.
The textual line oriented change counter was fun but was not
very effective. It tended to overcount the changes. This one
changes it to a simple N-letter substring based implementation.
This is a companion patch to e29e1147e485654d90a0ea0fd5fb7151bb194265
which made diffcore similarity estimator independent from the packfile
deltifier. There is no reason for us to be counting the xdelta anymore.
* lt/rev-list:
setup_revisions(): handle -n<n> and -<n> internally.
git-log (internal): more options.
git-log (internal): add approxidate.
Rip out merge-order and make "git log <paths>..." work again.
Tie it all together: "git log"
Introduce trivial new pager.c helper infrastructure
git-rev-list libification: rev-list walking
Splitting rev-list into revisions lib, end of beginning.
rev-list split: minimum fixup.
First cut at libifying revlist generation
(On some inetd implementations you may have to put the pserver parameter twice.)
Commits are blocked. Naively, git-cvsserver assumes non-malicious users. Please
review the code before setting this up on an internet-accessible server.
NOTE: the <nobody> user above will need write access to the .git directory
to maintain the sqlite database. Updating of the sqlite database should be
put in an update hook to avoid this problem, so that it is maintained by
users with write access.
cvsserver: nested directory creation fixups for Eclipse clients
To create nested directories without (or before) sending file entries
is rather tricky. Most clients just work. Eclipse, however, expects
a very specific sequence of events. With this patch, cvsserver meets
those expectations.
Note: we may want to reuse prepdir() in req_update -- should move it
outside of req_co. Right now prepdir() is tied to how req_co() works.
contrib/git-svn: fix a copied-tree bug in an overzealous assertion
I thought passing --stop-on-copy to svn would save us from all
the trouble svn-arch-mirror had with directory (project) copies.
I was wrong, there was one thing I overlooked.
If a tree was moved from /foo/trunk to /bar/foo/trunk with no
other changes in r10, but the last change was done in r5, the
Last Changed Rev (from svn info) in /bar/foo/trunk will still be
r5, even though the copy in the repository didn't exist until
r10.
Now, if we ever detect that the Last Changed Rev isn't what
we're expecting, we'll run svn diff and only croak if there are
differences between them.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
show-branch --topics: omit more uninteresting commits.
When inspecting contents of topic branches for yet-to-be-merged
commits, a commit that is in the release/master branch is
uninteresting. The previous round still showed them, especially
the ones before a topic branch that was forked from
release/master later than the other topic branches.
git-mv needs to be run from the base directory so that
the check whether a file is under revision control also covers
files outside of the current subdirectory. Previously, e.g. in the git repo,
cd Documentation; git-mv ../README .
produced the error
Error: '../README' not under version control
The test is extended for this case; it previously only tested
one direction.
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
contrib/git-svn: avoid re-reading the repository uuid, it never changes
If it does change, we're screwed anyways as SVN will refuse to
commit or update. We also never access more than one SVN
repository per-invocation, so we can store it as a global, too.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
contrib/git-svn: create a more recent master if one does not exist
In a new repository, the initial fetch creates a master branch
if one does not exist so HEAD has something to point to.
It now creates a master at the end of the initial fetch run,
pointing to the latest revision. Previously it pointed to the
first revision imported, which is generally less useful.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
contrib/git-svn: strip 'git-svn-id:' when committing to SVN
We regenerate and use git-svn-id: whenever we fetch or otherwise
commit to remotes/git-svn. We don't actually know what revision
number we'll commit to SVN at commit time, so this is useless.
It won't throw off things like 'rebuild', though, which knows to
only use the last instance of git-svn-id: in a log message.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
contrib/git-svn: several small bug fixes and changes
* Fixed manually-edited commit messages not going to
remotes/git-svn on sequential commits after the sequential
commit optimization.
* format help correctly after adding 'show-ignore'
* sha1_short regexp matches down to 4 hex characters
(from git-rev-parse --short documentation)
* Print the first line of the commit message when we commit to
SVN next to the sha1.
* Document 'T' (type change) in the comments
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
contrib/git-svn: add -b/--branch switch for branch detection
I've said I don't like branches in Subversion, and I still don't.
This is a bit more flexible, though, as the argument for -b is any
arbitrary git head/tag reference.
This makes some things easier:
* Importing git history into a brand new SVN branch.
* Tracking multiple SVN branches via GIT_SVN_ID, even from multiple
repositories.
* Adding tags from SVN (still need to use GIT_SVN_ID, though).
* Even merge tracking is supported, if and only if the heads end up with
100% equivalent tree objects. This is stricter but more robust
and foolproof than parsing commit messages, imho.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
If git-update-index --index-info -z is used only the first
record given to the process will actually be updated as
the -z option is ignored until after all index records
have been read and processed. This meant that multiple
null terminated records were seen as a single record which
was lacking a trailing LF, however since the first record
ended in a null the C string handling functions ignored the
trailing garbage. So --index-info should be required to be
the last command line option, much as --stdin is required
to be the last command line option. Because --index-info
implies --stdin this isn't an issue as the user shouldn't
be passing --stdin when also passing --index-info.
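
A small sketch of why the terminator has to be known before the records
are read. The function name is hypothetical and this is not the
update-index code; it only counts records in a buffer for a given
terminator byte:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the records in buf[0..len) when each
 * record ends with the byte `term`. An unterminated tail counts as
 * one more record.
 */
static size_t count_records(const char *buf, size_t len, char term)
{
	const char *p = buf;
	const char *end = buf + len;
	size_t n = 0;

	while (p < end) {
		const char *t = memchr(p, term, (size_t)(end - p));

		n++;
		if (!t)
			break; /* last record has no terminator */
		p = t + 1;
	}
	return n;
}
```

Two NUL-terminated records are seen as two records when the terminator
is NUL, but as a single unterminated record when the reader still
expects LF.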
Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
manpages: insert two missing [verse] markers for multi-line SYNOPSIS
Found with:
for i in *.txt; do
    grep -A 2 "SYNOPSIS" "$i" | grep -q "^\[verse\]$" && continue
    multiline=$(grep -A 3 "SYNOPSIS" "$i" | tail -n 1)
    test -n "$multiline" && echo "$i: $multiline"
done
Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Junio C Hamano <junkio@cox.net>
A recent Eclipse compat fix broke checkouts with -d. Fix it so that the server
sends the correct module name instead of the destination directory name.
cvsserver: checkout faster by sending files in a sensible order
Just by sending the files in an ordered fashion, clients can process them
much faster. And we can optimize our check of whether we created this
directory already -- faster.
Timings for a checkout on a commandline cvs client for a project with
~13K files totalling ~100MB:
The "similarity" logic was giving added material way too much
negative weight. What we wanted to see was how similar the
post-change image was compared to the pre-change image, so the
natural definition of similarity is how much common things are
there, relative to the post-change image's size.
An earlier commit 8098a178b26dc7a158d129a092a5b78da6d12b72
accidentally lost race protection from git-commit command.
This commit reinstates it. When something else updates HEAD
pointer while you were editing your commit message, the command
would notice and abort the commit.
The new flag is used to amend the tip of the current branch. Prepare
the tree object you would want to replace the latest commit as usual
(this includes the usual -i/-o and explicit paths), and the commit log
editor is seeded with the commit message from the tip of the current
branch. The commit you create replaces the current tip -- if it was a
merge, it will have the parents of the current tip as parents -- so the
current top commit is discarded.
It is a rough equivalent for:
$ git reset --soft HEAD^
$ ... do something else to come up with the right tree ...
$ git commit -c ORIG_HEAD
This adds a new flag, --topics, to help manage topic
branches. When you have topic branches forked some time ago
from your primary line of development, show-branch would show
many "uninteresting" things that happened on the primary line of
development when you are trying to see what is still not merged
from the topic branches.
With this flag, the first ref given to show-branch is taken as
the primary branch, and the rest are taken as the topic
branches. Output from the command is modified so that commits
only on the primary branch are not shown. In other words,
GIT-VERSION-GEN: squelch unneeded error from "cat version"
Now this is really a corner case, but if you have the git source
tree from somewhere other than the official tarball, you do not
have the version file. And if git-describe does not work for you
(maybe you do not have git yet), we spilled an error message
from "cat version".
setup_revisions(): handle -n<n> and -<n> internally.
This moves the handling of max-count shorthand from the internal
implementation of "git log" to setup_revisions() so other users
of setup_revisions() can use it.
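
A rough sketch of recognizing the two shorthands. The function name and
its return convention are made up for illustration; this is not the
setup_revisions() code itself:

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return 1 and store the count in *max_count if
 * arg is "-<n>" or "-n<n>", otherwise return 0 and leave it untouched.
 */
static int parse_max_count(const char *arg, int *max_count)
{
	const char *digits;
	char *end;
	long value;

	if (arg[0] != '-')
		return 0;
	digits = (arg[1] == 'n') ? arg + 2 : arg + 1;
	if (!isdigit((unsigned char)digits[0]))
		return 0; /* e.g. "-n" alone, or a long option like "--pretty" */
	value = strtol(digits, &end, 10);
	if (*end != '\0')
		return 0; /* trailing junk such as "-3x" */
	*max_count = (int)value;
	return 1;
}
```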
Here is an updated version of git-blame. The main changes compared to
the first version are:
* Use the new revision.h interface to do the revision walking
* Do the right thing in a lot more cases than before. In particular
parallel development tracks are hopefully handled sanely.
* Lots of clean-up
I think git-blame is correct in this case. This pattern occurs in
several other places: git-annotate seems to sometimes assign lines to
merge commits when the lines actually changed in some other commit
which precedes the merge.
[jc: I have conned Ryan into doing test cases, so that it would
help development and fixes on both implementations. Let the
battle begin! ;-) ]
Signed-off-by: Fredrik Kuivinen <freku045@student.liu.se> Signed-off-by: Junio C Hamano <junkio@cox.net>
Now that blame depends on the new revision walker infrastructure,
we need to make it depend on earlier parts of Linus' rev-list
topic branch, hence this merge.
contrib/git-svn: use refs/remotes/git-svn instead of git-svn-HEAD
After reading a lengthy discussion on the list, I've come to the
conclusion that creating a 'remotes' directory in refs isn't
such a bad idea.
You can still branch from it by specifying remotes/git-svn (not
needing the leading 'refs/'), and the documentation has been
updated to reflect that.
The 'git-svn' part of the ref can of course be set to whatever
you want by using the GIT_SVN_ID environment variable, as
before.
I'm using refs/remotes/git-svn, and not going with something
like refs/remotes/git-svn/HEAD as it's redundant for Subversion
where there's zero distinction between branches and directories.
Run git-svn rebuild --upgrade to upgrade your repository to use
the new head. git-svn-HEAD must be manually deleted for safety
reasons.
Side note: if you ever (and I hope you never) want to run
git-update-refs on a 'remotes/' ref, make sure you have the
'refs/' prefix as you don't want to be clobbering your
'remotes/' in $GIT_DIR (where remote URLs are stored).
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
cvsserver: Eclipse compat -- now "compare with latest from HEAD" works
The Eclipse client uses cvs update when that menu option is triggered.
And doesn't like the standard cvs update response. Give it *exactly* what
it wants.
And hope the other clients don't lose the plot too badly.
gitview: Use horizontal scroll bar in the tree view
Earlier we set up the window to never scroll
horizontally, which made it harder to use on a narrow screen.
This patch allows the scrollbar to be used as needed by Gtk.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
Initial checkouts were failing to create Entries files under Eclipse.
Eclipse was waiting for two non-standard directory-resets to prepare for a new
directory from the server.
This patch is tricky, because the same directory resets tend to confuse other
clients. It's taken a bit of fiddling to get the commandline cvs client and
Eclipse to get a good, clean checkout.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz> Signed-off-by: Junio C Hamano <junkio@cox.net>
Commit 8fcf1ad9c68e15d881194c8544e7c11d33529c2b has a
combination of double cast and Andreas' switch to using
unsigned long ... just the latter is sufficient (and a lot less
ugly than using the double cast).
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
We can show commit objects with human readable dates using
various --pretty options, but there was no way to do so with
tags. This introduces two such ways:
$ git-cat-file -p v1.2.3
shows the tag object with tagger dates in human readable format.
$ git-verify-tag --verbose v1.2.3
uses it to show the contents of the tag object as well as doing
GPG verification.
* lt/fix-apply:
git-am: --whitespace=x option.
git-apply: war on whitespace -- finishing touches.
git-apply --whitespace=nowarn
apply --whitespace: configuration option.
apply: squelch excessive errors and --whitespace=error-all
apply --whitespace fixes and enhancements.
The war on trailing whitespace
* lt/apply:
git-am: --whitespace=x option.
git-apply: war on whitespace -- finishing touches.
git-apply --whitespace=nowarn
apply --whitespace: configuration option.
apply: squelch excessive errors and --whitespace=error-all
apply --whitespace fixes and enhancements.
The war on trailing whitespace
Moving a directory ending in a slash was not working as the
destination was not calculated correctly.
E.g. in the git repo,
git-mv t/ Documentation
gave the error
Error: destination 'Documentation' already exists
To get rid of this problem, strip trailing slashes from all arguments.
The comment in cg-mv made me curious about this issue; Pasky, thanks!
As a result, the workaround in cg-mv is not needed any more.
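
The trailing-slash stripping can be sketched as follows. This is a C
illustration with a hypothetical helper name, not the git-mv code
itself; keeping a lone "/" intact is an assumption of the sketch:

```c
#include <string.h>

/*
 * Hypothetical helper: remove trailing slashes from path in place,
 * so that "t/" and "t" name the same source. A lone "/" is kept.
 */
static void strip_trailing_slashes(char *path)
{
	size_t len = strlen(path);

	while (len > 1 && path[len - 1] == '/')
		path[--len] = '\0';
}
```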
Also, another bug was shown by cg-mv. When moving files outside of
a subdirectory, it typically calls git-mv with something like
which triggers the following error from git-update-index:
Ignoring path Documentation/../git-mv.txt
The result is a moved file, removed from git revisioning, but not
added again. To fix this, the paths have to be normalized so that they
do not have ".." in the middle. This was already done in git-mv, but only for
a better visual appearance :(
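
One way to do this kind of normalization, sketched in C for
illustration. The function name is hypothetical, only relative paths
are handled, and a path that collapses completely becomes the empty
string; none of this is taken from git-mv itself:

```c
#include <string.h>

/*
 * Hypothetical helper: collapse "dir/.." pairs and drop "." and empty
 * components of a relative path, in place. Leading ".." components
 * that have nothing to cancel are kept.
 */
static void normalize_path(char *path)
{
	const char *in = path; /* read position */
	char *out = path;      /* write position, never ahead of in */

	while (*in) {
		const char *slash = strchr(in, '/');
		size_t len = slash ? (size_t)(slash - in) : strlen(in);
		const char *next = slash ? slash + 1 : in + len;
		char *last = out;  /* start of the last component written */

		while (last > path && last[-1] != '/')
			last--;

		if (len == 0 || (len == 1 && in[0] == '.')) {
			/* empty or "." component: nothing to copy */
		} else if (len == 2 && in[0] == '.' && in[1] == '.' &&
			   out > path &&
			   !(out - last == 2 && last[0] == '.' && last[1] == '.')) {
			/* ".." cancels the previous real component */
			out = last;
			if (out > path)
				out--; /* also drop the separating slash */
		} else {
			if (out > path)
				*out++ = '/';
			memmove(out, in, len);
			out += len;
		}
		in = next;
	}
	*out = '\0';
}
```

With this, "Documentation/../git-mv.txt" becomes "git-mv.txt", while
"../README" is left alone.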
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes "git-mv -h" to output the usage without the need
to be in a git repository.
Additionally:
- fix confusing error message when only one arg was given
- fix typo in error message
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
Combined diffs don't null terminate things in the same way as standard
diffs. This is presumably wrong.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk> Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from 6baf0484efcd29bb5e58ccd5ea0379481d4a83f4 commit)
For some reason, combined diffs don't honour the --full-index flag when
emitting patches. Fix this.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk> Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from e70c6b35749c316f6e97099bd6bdac895c9d6f68 commit)
diffcore-break: micro-optimize by avoiding delta between identical files.
We did not check if we have the same file on both sides when
computing break score. This is usually not a problem, but if
the user said --find-copies-harder with -B, we ended up trying a
delta between the same data even when we know the SHA1 hash of
both sides match.
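
The shortcut amounts to a check like the one below. The struct and
function names are hypothetical stand-ins, not the diffcore data
structures:

```c
#include <string.h>

/* Hypothetical stand-in for one side of a file pair. */
struct file_side {
	unsigned char sha1[20];
	int sha1_valid; /* nonzero when sha1 is known */
};

/*
 * Return 1 when both hashes are known and equal, in which case the
 * contents are identical and no delta needs to be computed.
 */
static int same_content(const struct file_side *a, const struct file_side *b)
{
	return a->sha1_valid && b->sha1_valid &&
	       !memcmp(a->sha1, b->sha1, sizeof(a->sha1));
}
```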
This ports the following options from rev-list based git-log
implementation:
* -<n>, -n<n>, and -n <n>. I am still wondering if we want
this natively supported by setup_revisions(), which already
takes --max-count. We may want to move them in the next
round. Also I am not sure if we can get away with not
setting revs->limited when we set max-count. The latest
rev-list.c and revision.c in this series do not, so I left
them as they are.
* --pretty and --pretty=<fmt>.
* --abbrev=<n> and --no-abbrev.
The previous commit already handles time-based limiters
(--since, --until and friends). The remaining things that
rev-list based git-log happens to do are not useful for pure
log-viewing purposes, and are not ported:
* --bisect (obviously).
* --header. I am actually in favor of doing the NUL
terminated record format, but rev-list based one always
passed --pretty, which defeated this option. Maybe next
round.
* --parents. I cannot think of a reason a log viewer wants
this. The flag is primarily for feeding squashed history
via pipe to downstream tools.
cvsserver: Eclipse compat - browsing 'modules' (heads in our case) works
Eclipse CVS clients have an odd way of perusing the top level of
the repository, by calling update on module "". So reproduce cvs'
odd behaviour in the interest of compatibility.
It makes it much easier to get a checkout when using Eclipse.
This switches the change estimation logic used by break, rename
and copy detection from delta packing code to a more line
oriented one. This way, the performance-density tradeoff by
delta packing code can be made without worrying about breaking
the rename detection.
On Darwin platforms, don't include Fink or DarwinPorts
into the link path unless the related library directory
is actually present. The linker on MacOS 10.4 complains
if it is given a directory which does not exist.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
git-apply: war on whitespace -- finishing touches.
This changes the default --whitespace policy to nowarn when we
are only getting --stat, --summary etc. IOW when not applying
the patch. When applying the patch, the default is warn (spit
out warning message but apply the patch).
Andrew insists --whitespace=warn should be the default, and I
tend to agree. This introduces --whitespace=warn, so if your
project policy is more lenient, you can squelch them by having
apply.whitespace=nowarn in your configuration file.