cvsserver: checkout faster by sending files in a sensible order
Just by sending the files in an ordered fashion, clients can process them
much faster. And we can optimize our check of whether we created this
directory already -- faster.
Timings for a checkout on a commandline cvs client for a project with
~13K files totalling ~100MB:
* master:
contrib/git-svn: use refs/remotes/git-svn instead of git-svn-HEAD
Merge branch 'maint'
read-tree --aggressive: remove deleted entry from the working tree.
Merge branch 'jc/tag'
Merge part of 'jc/diff'
contrib/git-svn: use refs/remotes/git-svn instead of git-svn-HEAD
After reading a lengthy discussion on the list, I've come to the
conclusion that creating a 'remotes' directory in refs isn't
such a bad idea.
You can still branch from it by specifying remotes/git-svn (not
needing the leading 'refs/'), and the documentation has been
updated to reflect that.
The 'git-svn' part of the ref can of course be set to whatever
you want by using the GIT_SVN_ID environment variable, as
before.
I'm using refs/remotes/git-svn, and not going with something
like refs/remotes/git-svn/HEAD as it's redundant for Subversion
where there's zero distinction between branches and directories.
Run git-svn rebuild --upgrade to upgrade your repository to use
the new head. git-svn-HEAD must be manually deleted for safety
reasons.
Side note: if you ever (and I hope you never) want to run
git-update-refs on a 'remotes/' ref, make sure you have the
'refs/' prefix as you don't want to be clobbering your
'remotes/' in $GIT_DIR (where remote URLs are stored).
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
* np/delta:
diff-delta: allow reusing of the reference buffer index
diff-delta: bound hash list length to avoid O(m*n) behavior
diff-delta: produce optimal pack data
Merge branch 'kh/svnimport'
Merge branch 'js/refs'
annotate: fix -S parameter to take a string
annotate: Add a basic set of test cases.
annotate: handle \No newline at end of file.
gitview: Use horizontal scroll bar in the tree view
diff-delta: allow reusing of the reference buffer index
When a reference buffer is used multiple times then its index can be
computed only once and reused multiple times. This patch adds an extra
pointer to a pointer argument (from_index) to diff_delta() for this.
If from_index is NULL then everything is like before.
If from_index is non NULL and *from_index is NULL then the index is
created and its location stored to *from_index. In this case the caller
has the responsibility to free the memory pointed to by *from_index.
If from_index and *from_index are non NULL then the index is reused as
is.
This currently saves about 10% of CPU time to repack the git archive.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
diff-delta: bound hash list length to avoid O(m*n) behavior
The diff-delta code can exhibit O(m*n) behavior with some patological
data set where most hash entries end up in the same hash bucket.
The latest code rework reduced the block size making it particularly
vulnerable to this issue, but the issue was always there and can be
triggered regardless of the block size.
This patch does two things:
1) the hashing has been reworked to offer a better distribution to
atenuate the problem a bit, and
2) a limit is imposed to the number of entries that can exist in the
same hash bucket.
Because of the above the code is a bit more expensive on average, but
the problematic samples used to diagnoze the issue are now orders of
magnitude less expensive to process with only a slight loss in
compression.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Indexing based on adler32 has a match precision based on the block size
(currently 16). Lowering the block size would produce smaller deltas
but the indexing memory and computing cost increases significantly.
For optimal delta result the indexing block size should be 3 with an
increment of 1 (instead of 16 and 16). With such low params the adler32
becomes a clear overhead increasing the time for git-repack by a factor
of 3. And with such small blocks the adler 32 is not very useful as the
whole of the block bits can be used directly.
This patch replaces the adler32 with an open coded index value based on
3 characters directly. This gives sufficient bits for hashing and
allows for optimal delta with reasonable CPU cycles.
The resulting packs are 6% smaller on average. The increase in CPU time
is about 25%. But this cost is now hidden by the delta reuse patch
while the saving on data transfers is always there.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
cvsserver: Eclipse compat -- now "compare with latest from HEAD" works
The Eclipse client uses cvs update when that menu option is triggered.
And doesn't like the standard cvs update response. Give it *exactly* what
it wants.
And hope the other clients don't lose the plot too badly.
gitview: Use horizontal scroll bar in the tree view
Earlier we set up the window to never scroll
horizontally, which made it harder to use on a narrow screen.
This patch allows scrollbar to be used as needed by Gtk
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
Initial checkouts were failing to create Entries files under Eclipse.
Eclipse was waiting for two non-standard directory-resets to prepare for a new
directory from the server.
This patch is tricky, because the same directory resets tend to confuse other
clients. It's taken a bit of fiddling to get the commandline cvs client and
Eclipse to get a good, clean checkout.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz> Signed-off-by: Junio C Hamano <junkio@cox.net>
Commit 8fcf1ad9c68e15d881194c8544e7c11d33529c2b has a
combination of double cast and Andreas' switch to using
unsigned long ... just the latter is sufficient (and a lot less
ugly than using the double cast).
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
We can show commit objects with human readable dates using
various --pretty options, but there was no way to do so with
tags. This introduces two such ways:
$ git-cat-file -p v1.2.3
shows the tag object with tagger dates in human readable format.
$ git-verify-tag --verbose v1.2.3
uses it to show the contents of the tag object as well as doing
GPG verification.
* lt/fix-apply:
git-am: --whitespace=x option.
git-apply: war on whitespace -- finishing touches.
git-apply --whitespace=nowarn
apply --whitespace: configuration option.
apply: squelch excessive errors and --whitespace=error-all
apply --whitespace fixes and enhancements.
The war on trailing whitespace
* lt/apply:
git-am: --whitespace=x option.
git-apply: war on whitespace -- finishing touches.
git-apply --whitespace=nowarn
apply --whitespace: configuration option.
apply: squelch excessive errors and --whitespace=error-all
apply --whitespace fixes and enhancements.
The war on trailing whitespace
Moving a directory ending in a slash was not working as the
destination was not calculated correctly.
E.g. in the git repo,
git-mv t/ Documentation
gave the error
Error: destination 'Documentation' already exists
To get rid of this problem, strip trailing slashes from all arguments.
The comment in cg-mv made me curious about this issue; Pasky, thanks!
As result, the workaround in cg-mv is not needed any more.
Also, another bug was shown by cg-mv. When moving files outside of
a subdirectory, it typically calls git-mv with something like
which triggers the following error from git-update-index:
Ignoring path Documentation/../git-mv.txt
The result is a moved file, removed from git revisioning, but not
added again. To fix this, the paths have to be normalized not have ".."
in the middle. This was already done in git-mv, but only for
a better visual appearance :(
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes "git-mv -h" to output the usage without the need
to be in a git repository.
Additionally:
- fix confusing error message when only one arg was given
- fix typo in error message
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
Combined diffs don't null terminate things in the same way as standard
diffs. This is presumably wrong.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk> Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from 6baf0484efcd29bb5e58ccd5ea0379481d4a83f4 commit)
For some reason, combined diffs don't honour the --full-index flag when
emitting patches. Fix this.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk> Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from e70c6b35749c316f6e97099bd6bdac895c9d6f68 commit)
diffcore-break: micro-optimize by avoiding delta between identical files.
We did not check if we have the same file on both sides when
computing break score. This is usually not a problem, but if
the user said --find-copies-harde with -B, we ended up trying a
delta between the same data even when we know the SHA1 hash of
both sides match.
This ports the following options from rev-list based git-log
implementation:
* -<n>, -n<n>, and -n <n>. I am still wondering if we want
this natively supported by setup_revisions(), which already
takes --max-count. We may want to move them in the next
round. Also I am not sure if we can get away with not
setting revs->limited when we set max-count. The latest
rev-list.c and revision.c in this series do not, so I left
them as they are.
* --pretty and --pretty=<fmt>.
* --abbrev=<n> and --no-abbrev.
The previous commit already handles time-based limiters
(--since, --until and friends). The remaining things that
rev-list based git-log happens to do are not useful in a pure
log-viewing purposes, and not ported:
* --bisect (obviously).
* --header. I am actually in favor of doing the NUL
terminated record format, but rev-list based one always
passed --pretty, which defeated this option. Maybe next
round.
* --parents. I do not think of a reason a log viewer wants
this. The flag is primarily for feeding squashed history
via pipe to downstream tools.
* lt/rev-list:
Rip out merge-order and make "git log <paths>..." work again.
Tie it all together: "git log"
Introduce trivial new pager.c helper infrastructure
git-rev-list libification: rev-list walking
blame.c #include's epoch.h; it needed to be killed.
Originally Martin's tree was based on "next", which meant that all
the other things that I am not ready to push out to "master" were
contained in it. His changes looked good, and I wanted to have them
in "master".
So, here is what I did:
- fetch Martin's tree into a temporary topic branch.
$ git fetch $URL $remote:ml/cvsserver
$ git checkout ml/cvsserver
- rebase it on top of "master".
$ git rebase --onto master next
- pull that master into "next", recording Martin's head as well.
$ git pull --append . master
Since I have apply.whitespace=strip in my configuration file, the
rebased cvsserver changes have trailing whitespaces introduced by
Martin's tree cleansed out. Hence the above conflicts.
The reason I made this octopus is to make sure that next time Martin
pulls from my "next" branch, it results in a fast forward. There is
no reason to force him do the same conflict resolution I did with this
merge.
Teach git-checkout-index to read filenames from stdin.
Since git-checkout-index is often used from scripts which
may have a stream of filenames they wish to checkout it is
more convenient to use --stdin than xargs. On platforms
where fork performance is currently sub-optimal and
the length of a command line is limited (*cough* Cygwin
*cough*) running a single git-checkout-index process for
a large number of files beats spawning it multiple times
from xargs.
File names are still accepted on the command line if
--stdin is not supplied. Nothing is performed if no files
are supplied on the command line or by stdin.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
cvsserver: Eclipse compat - browsing 'modules' (heads in our case) works
Eclipse CVS clients have an odd way of perusing the top level of
the repository, by calling update on module "". So reproduce cvs'
odd behaviour in the interest of compatibility.
It makes it much easier to get a checkout when using Eclipse.
cvsserver: Eclipse compat - browsing 'modules' (heads in our case) works
Eclipse CVS clients have an odd way of perusing the top level of
the repository, by calling update on module "". So reproduce cvs'
odd behaviour in the interest of compatibility.
It makes it much easier to get a checkout when using Eclipse.
This switches the change estimation logic used by break, rename
and copy detection from delta packing code to a more line
oriented one. This way, thee performance-density tradeoff by
delta packing code can be made without worrying about breaking
the rename detection.
diffcore-break: micro-optimize by avoiding delta between identical files.
We did not check if we have the same file on both sides when
computing break score. This is usually not a problem, but if
the user said --find-copies-harde with -B, we ended up trying a
delta between the same data even when we know the SHA1 hash of
both sides match.
When on Darwin platforms don't include Fink or DarwinPorts
into the link path unless the related library directory
is actually present. The linker on MacOS 10.4 complains
if it is given a directory which does not exist.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
git-apply: war on whitespace -- finishing touches.
This changes the default --whitespace policy to nowarn when we
are only getting --stat, --summary etc. IOW when not applying
the patch. When applying the patch, the default is warn (spit
out warning message but apply the patch).
git-apply: war on whitespace -- finishing touches.
This changes the default --whitespace policy to nowarn when we
are only getting --stat, --summary etc. IOW when not applying
the patch. When applying the patch, the default is warn (spit
out warning message but apply the patch).
diff-delta: bound hash list length to avoid O(m*n) behavior
The diff-delta code can exhibit O(m*n) behavior with some patological
data set where most hash entries end up in the same hash bucket.
The latest code rework reduced the block size making it particularly
vulnerable to this issue, but the issue was always there and can be
triggered regardless of the block size.
This patch does two things:
1) the hashing has been reworked to offer a better distribution to
atenuate the problem a bit, and
2) a limit is imposed to the number of entries that can exist in the
same hash bucket.
Because of the above the code is a bit more expensive on average, but
the problematic samples used to diagnoze the issue are now orders of
magnitude less expensive to process with only a slight loss in
compression.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Andrew insists --whitespace=warn should be the default, and I
tend to agree. This introduces --whitespace=warn, so if your
project policy is more lenient, you can squelch them by having
apply.whitespace=nowarn in your configuration file.
The new configuration option apply.whitespace can take one of
"warn", "error", "error-all", or "strip". When git-apply is run
to apply the patch to the index, they are used as the default
value if there is no command line --whitespace option.
Andrew can now tell people who feed him git trees to update to
this version and say:
apply: squelch excessive errors and --whitespace=error-all
This by default makes --whitespace=warn, error, and strip to
warn only the first 5 additions of trailing whitespaces. A new
option --whitespace=error-all can be used to view all of them
before applying.
In addition to fixing obvious command line parsing bugs in the
previous round, this changes the following:
* Adds "--whitespace=strip". This applies after stripping the
new trailing whitespaces introduced to the patch.
* The output error message format is changed to say
"patch-filename:linenumber:contents of the line". This makes
it similar to typical compiler error message format, and
helps C-x ` (next-error) in Emacs compilation buffer.
* --whitespace=error and --whitespace=warn do not stop at the
first error. We might want to limit the output to say first
20 such lines to prevent cluttering, but on the other hand if
you are willing to hand-fix after inspecting them, getting
everything with a single run might be easier to work with.
After all, somebody has to do the clean-up work somewhere.
On Sat, 25 Feb 2006, Andrew Morton wrote:
>
> I'd suggest a) git will simply refuse to apply such a patch unless given a
> special `forcing' flag, b) even when thus forced, it will still warn and c)
> with a different flag, it will strip-then-apply, without generating a
> warning.
This doesn't do the "strip-then-apply" thing, but it allows you to make
git-apply generate a warning or error on extraneous whitespace.
Use --whitespace=warn to warn, and (surprise, surprise) --whitespace=error
to make it a fatal error to have whitespace at the end.
Totally untested, of course. But it compiles, so it must be fine.
HOWEVER! Note that this literally will check every single patch-line with
"+" at the beginning. Which means that if you fix a simple typo, and the
line had a space at the end before, and you didn't remove it, that's still
considered a "new line with whitespace at the end", even though obviously
the line wasn't really new.
I assume this is what you wanted, and there isn't really any sane
alternatives (you could make the warning activate only for _pure_
additions with no deletions at all in that hunk, but that sounds a bit
insane).
Andrew insists --whitespace=warn should be the default, and I
tend to agree. This introduces --whitespace=warn, so if your
project policy is more lenient, you can squelch them by having
apply.whitespace=nowarn in your configuration file.
* master:
Merge part of kh/svnimport branch into master
contrib/git-svn: correct commit example in manpage
contrib/git-svn: tell the user to not modify git-svn-HEAD directly
gitview: Remove trailing white space
gitview: Fix the encoding related bug
git-format-patch: Always add a blank line between headers and body.
combine-diff: Honour -z option correctly.
combine-diff: Honour --full-index.
Save username -> Full Name <email@addr.es> map file
When the user specifies a username -> Full Name <email@addr.es> map
file with the -A option, save a copy of that file as
$git_dir/svn-authors. When running git-svnimport with an existing GIT
directory, use $git_dir/svn-authors (if it exists) unless a file was
explicitly specified with -A.
Signed-off-by: Karl Hasselström <kha@treskal.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
The new configuration option apply.whitespace can take one of
"warn", "error", "error-all", or "strip". When git-apply is run
to apply the patch to the index, they are used as the default
value if there is no command line --whitespace option.
Andrew can now tell people who feed him git trees to update to
this version and say:
apply: squelch excessive errors and --whitespace=error-all
This by default makes --whitespace=warn, error, and strip to
warn only the first 5 additions of trailing whitespaces. A new
option --whitespace=error-all can be used to view all of them
before applying.
contrib/git-svn: tell the user to not modify git-svn-HEAD directly
As a rule, interface branches to different SCMs should never be modified
directly by the user. They are used exclusively for talking to the
foreign SCM.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
Get the encoding information from repository and convert it to utf-8 before
passing to gtk.TextBuffer.set_text. gtk.TextBuffer.set_text work only with utf-8
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
git-format-patch: Always add a blank line between headers and body.
If the second line of the commit message isn't empty, git-format-patch
needs to add an empty line in order to generate a properly formatted
mail. Otherwise git-rebase drops the rest of the commit message.
Signed-off-by: Alexandre Julliard <julliard@winehq.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
* lt/apply:
apply --whitespace fixes and enhancements.
The war on trailing whitespace
svnimport: Convert the svn:ignore property
svnimport: Convert executable flag
svnimport: Mention -r in usage summary
Make git diff-generation use a simpler spawn-like interface
In addition to fixing obvious command line parsing bugs in the
previous round, this changes the following:
* Adds "--whitespace=strip". This applies after stripping the
new trailing whitespaces introduced to the patch.
* The output error message format is changed to say
"patch-filename:linenumber:contents of the line". This makes
it similar to typical compiler error message format, and
helps C-x ` (next-error) in Emacs compilation buffer.
* --whitespace=error and --whitespace=warn do not stop at the
first error. We might want to limit the output to say first
20 such lines to prevent cluttering, but on the other hand if
you are willing to hand-fix after inspecting them, getting
everything with a single run might be easier to work with.
After all, somebody has to do the clean-up work somewhere.
On Sat, 25 Feb 2006, Andrew Morton wrote:
>
> I'd suggest a) git will simply refuse to apply such a patch unless given a
> special `forcing' flag, b) even when thus forced, it will still warn and c)
> with a different flag, it will strip-then-apply, without generating a
> warning.
This doesn't do the "strip-then-apply" thing, but it allows you to make
git-apply generate a warning or error on extraneous whitespace.
Use --whitespace=warn to warn, and (surprise, surprise) --whitespace=error
to make it a fatal error to have whitespace at the end.
Totally untested, of course. But it compiles, so it must be fine.
HOWEVER! Note that this literally will check every single patch-line with
"+" at the beginning. Which means that if you fix a simple typo, and the
line had a space at the end before, and you didn't remove it, that's still
considered a "new line with whitespace at the end", even though obviously
the line wasn't really new.
I assume this is what you wanted, and there isn't really any sane
alternatives (you could make the warning activate only for _pure_
additions with no deletions at all in that hunk, but that sounds a bit
insane).