git-apply: war on whitespace -- finishing touches.
This changes the default --whitespace policy to nowarn when we
are only getting --stat, --summary etc., IOW when not applying
the patch. When applying the patch, the default is warn (spit
out a warning message but apply the patch).
diff-delta: bound hash list length to avoid O(m*n) behavior
The diff-delta code can exhibit O(m*n) behavior with some pathological
data sets where most hash entries end up in the same hash bucket.
The latest code rework reduced the block size making it particularly
vulnerable to this issue, but the issue was always there and can be
triggered regardless of the block size.
This patch does two things:
1) the hashing has been reworked to offer a better distribution to
attenuate the problem a bit, and
2) a limit is imposed on the number of entries that can exist in the
same hash bucket.
Because of the above the code is a bit more expensive on average, but
the problematic samples used to diagnose the issue are now orders of
magnitude less expensive to process with only a slight loss in
compression.
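To illustrate point 2, here is a minimal sketch of an index whose buckets are not allowed to grow past a fixed length. It is not the diff-delta code: the names build_index, BLOCK and MAX_BUCKET, the use of Python's built-in hash, and the limit value are all assumptions made for illustration.

```python
BLOCK = 4        # hypothetical block size, not the value used by diff-delta
MAX_BUCKET = 3   # hypothetical limit on entries per hash bucket


def build_index(src, block=BLOCK, max_bucket=MAX_BUCKET):
    """Map block hash -> list of source offsets, capped at max_bucket each."""
    index = {}
    for off in range(0, len(src) - block + 1, block):
        key = hash(src[off:off + block])
        bucket = index.setdefault(key, [])
        # Extra entries are dropped rather than chained, so a lookup never
        # walks more than max_bucket candidates, even for degenerate input.
        if len(bucket) < max_bucket:
            bucket.append(off)
    return index
```

With a source made of one repeated block, every entry lands in the same bucket; the cap keeps that bucket short at the cost of forgetting some possible match positions, which is where the slight loss in compression comes from.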
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Andrew insists --whitespace=warn should be the default, and I
tend to agree. This introduces --whitespace=nowarn, so if your
project policy is more lenient, you can squelch the warnings by having
apply.whitespace=nowarn in your configuration file.
The new configuration option apply.whitespace can take one of
"warn", "error", "error-all", or "strip". When git-apply is run
to apply the patch to the index, its value is used as the default
if there is no command line --whitespace option.
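The precedence described here can be sketched as follows. This is an illustration under assumptions, not git-apply's code: the function name and its parameters are hypothetical, and it assumes an explicit command line option always wins, that the configuration is consulted only when the patch is actually applied, and that "nowarn" is also an accepted value.

```python
VALID = {"nowarn", "warn", "error", "error-all", "strip"}


def whitespace_policy(cmdline=None, config=None, applying=True):
    """Pick the effective policy: command line, then config, then default."""
    if cmdline is not None:
        choice = cmdline
    elif applying and config is not None:
        choice = config                      # apply.whitespace from the config
    else:
        # Built-in default: warn when applying, stay quiet for --stat etc.
        choice = "warn" if applying else "nowarn"
    if choice not in VALID:
        raise ValueError("unknown whitespace policy: %s" % choice)
    return choice
```

For example, whitespace_policy(config="strip") yields "strip", while whitespace_policy(config="strip", applying=False) falls back to "nowarn".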
Andrew can now tell people who feed him git trees to update to
this version and say:
apply: squelch excessive errors and --whitespace=error-all
By default, this makes --whitespace=warn, error, and strip
warn about only the first 5 additions of trailing whitespace. A new
option --whitespace=error-all can be used to view all of them
before applying.
In addition to fixing obvious command line parsing bugs in the
previous round, this changes the following:
* Adds "--whitespace=strip". This applies the patch after stripping
the new trailing whitespace it introduces.
* The output error message format is changed to say
"patch-filename:linenumber:contents of the line". This makes
it similar to typical compiler error message format, and
helps C-x ` (next-error) in Emacs compilation buffer.
* --whitespace=error and --whitespace=warn do not stop at the
first error. We might want to limit the output to say first
20 such lines to prevent cluttering, but on the other hand if
you are willing to hand-fix after inspecting them, getting
everything with a single run might be easier to work with.
After all, somebody has to do the clean-up work somewhere.
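The behaviour listed above (the compiler-style message, stripping, and a cap on how many offenders are reported) can be sketched like this. It is a simplification, not the git-apply implementation: check_patch and its parameters are hypothetical, and it treats every line starting with "+" except "+++" headers as an added line instead of parsing hunks.

```python
import re


def check_patch(patch_name, patch_text, mode="warn", limit=5):
    """Return (messages, fixed_text) for trailing whitespace on added lines."""
    messages, out, found = [], [], 0
    for lineno, line in enumerate(patch_text.split("\n"), start=1):
        added = line.startswith("+") and not line.startswith("+++")
        if added and re.search(r"[ \t]+$", line):
            found += 1
            # Report as patch-filename:linenumber:contents of the line,
            # showing only the first `limit` offenders unless mode is error-all.
            if mode != "nowarn" and (mode == "error-all" or found <= limit):
                messages.append("%s:%d:%s" % (patch_name, lineno, line))
            if mode == "strip":
                line = line.rstrip(" \t")
        out.append(line)
    return messages, "\n".join(out)
```

Context lines are left alone even when they end in whitespace, since only lines added by the patch are checked.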
On Sat, 25 Feb 2006, Andrew Morton wrote:
>
> I'd suggest a) git will simply refuse to apply such a patch unless given a
> special `forcing' flag, b) even when thus forced, it will still warn and c)
> with a different flag, it will strip-then-apply, without generating a
> warning.
This doesn't do the "strip-then-apply" thing, but it allows you to make
git-apply generate a warning or error on extraneous whitespace.
Use --whitespace=warn to warn, and (surprise, surprise) --whitespace=error
to make it a fatal error to have whitespace at the end.
Totally untested, of course. But it compiles, so it must be fine.
HOWEVER! Note that this literally will check every single patch-line with
"+" at the beginning. Which means that if you fix a simple typo, and the
line had a space at the end before, and you didn't remove it, that's still
considered a "new line with whitespace at the end", even though obviously
the line wasn't really new.
I assume this is what you wanted, and there aren't really any sane
alternatives (you could make the warning activate only for _pure_
additions with no deletions at all in that hunk, but that sounds a bit
insane).
* master:
Merge part of kh/svnimport branch into master
contrib/git-svn: correct commit example in manpage
contrib/git-svn: tell the user to not modify git-svn-HEAD directly
gitview: Remove trailing white space
gitview: Fix the encoding related bug
git-format-patch: Always add a blank line between headers and body.
combine-diff: Honour -z option correctly.
combine-diff: Honour --full-index.
Save username -> Full Name <email@addr.es> map file
When the user specifies a username -> Full Name <email@addr.es> map
file with the -A option, save a copy of that file as
$git_dir/svn-authors. When running git-svnimport with an existing GIT
directory, use $git_dir/svn-authors (if it exists) unless a file was
explicitly specified with -A.
Signed-off-by: Karl Hasselström <kha@treskal.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
contrib/git-svn: tell the user to not modify git-svn-HEAD directly
As a rule, interface branches to different SCMs should never be modified
directly by the user. They are used exclusively for talking to the
foreign SCM.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Get the encoding information from the repository and convert the text to
UTF-8 before passing it to gtk.TextBuffer.set_text, which works only with
UTF-8.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-format-patch: Always add a blank line between headers and body.
If the second line of the commit message isn't empty, git-format-patch
needs to add an empty line in order to generate a properly formatted
mail. Otherwise git-rebase drops the rest of the commit message.
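A minimal sketch of the rule, assuming a much simplified mail layout: format_mail and its header set are hypothetical and stand in for what git-format-patch actually emits.

```python
def format_mail(author, message):
    """Build a mail with exactly one blank line between headers and body."""
    lines = message.split("\n")
    subject, rest = lines[0], lines[1:]
    if rest and rest[0] == "":
        rest = rest[1:]        # separator already present; do not double it
    headers = ["From: " + author, "Subject: [PATCH] " + subject]
    # The empty string is the separator; it is added even when the second
    # line of the commit message was not empty.
    return "\n".join(headers + [""] + rest)
```

Both "title\nsecond line" and "title\n\nsecond line" come out with the same header/body separation, so nothing after the subject is mistaken for a header.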
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* lt/apply:
apply --whitespace fixes and enhancements.
The war on trailing whitespace
svnimport: Convert the svn:ignore property
svnimport: Convert executable flag
svnimport: Mention -r in usage summary
Make git diff-generation use a simpler spawn-like interface
svnimport: Read author names and emails from a file
Read a file with lines of the form
username User's Full Name <email@addres.org>
and use "User's Full Name <email@addres.org>" as the GIT author and
committer for Subversion commits made by "username". If encountering a
commit made by a user not in the list, abort.
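The lookup can be sketched as below. This is not the svnimport code (which is not Python): the function names, the regular expression, and the choice of KeyError to stand for "abort" are assumptions, and the line format is taken literally from the description above.

```python
import re

# username, then the full name, then the address in angle brackets
AUTHOR_LINE = re.compile(r"^(\S+)\s+(.+?)\s*<([^>]*)>\s*$")


def parse_authors(text):
    """Return {username: (full name, email)} from the authors file text."""
    authors = {}
    for line in text.splitlines():
        m = AUTHOR_LINE.match(line)
        if m:
            user, name, email = m.groups()
            authors[user] = (name, email)
    return authors


def lookup_author(authors, user):
    """Return 'Full Name <email>' for user; unknown users are an error."""
    if user not in authors:
        raise KeyError("no author mapping for " + user)
    name, email = authors[user]
    return "%s <%s>" % (name, email)
```

A line such as "jdoe John O'Doe <jdoe@example.org>" maps the Subversion user "jdoe" to "John O'Doe <jdoe@example.org>".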
Signed-off-by: Karl Hasselström <kha@treskal.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Put the value of the svn:ignore property in a regular file when
converting a Subversion repository to GIT. The Subversion and GIT
ignore syntaxes are similar enough that it often just works to set the
filename to .gitignore and do nothing else.
Signed-off-by: Karl Hasselström <kha@treskal.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes "the other end has commit X but since then we tagged
that commit with tag T, and he says he wants T -- what is the
list of objects we need to send him?" question:
git-rev-list --objects ^X T
We ended up sending everything since the beginning of time X-<.
Make git diff-generation use a simpler spawn-like interface
Instead of depending on fork() and execve() and doing things in between
the two, make the git diff functions do everything up front, and then do
a single "spawn_prog()" invocation to run the actual external diff
program (if any is even needed).
This actually ends up simplifying the code, and should make it much
easier to make it efficient under broken operating systems (read: Windows).
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* lt/rev-list:
First cut at libifying revlist generation
Merge branch 'maint'
sample hooks template.
Teach the "git" command to handle some commands internally
Use setenv(), fix warnings
contrib/git-svn: version 0.10.0
contrib/git-svn: optimize sequential commits to svn
contrib/git-svn: add show-ignore command
annotate: Use qx{} for pipes on activestate.
annotate: Convert all -| calls to use a helper open_pipe().
annotate: Handle dirty state and arbitrary revisions.
git-fetch: print the new and old ref when fast-forwarding
This really just splits things up partially, and creates the
interface to set things up by parsing the command line.
No real code changes so far, although the parsing of filenames is a bit
stricter. In particular, if there is a "--", then we do not accept any
filenames before it, and if there isn't any "--", then we check that _all_
paths listed are valid, not just the first one.
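The stricter rule can be sketched as follows. The helper split_args and the two predicates passed to it are hypothetical stand-ins for "is this a revision?" and "does this path exist?"; the real parsing is done in C and handles many more options.

```python
def split_args(args, is_rev, path_exists):
    """Separate revision arguments from paths, with or without '--'."""
    if "--" in args:
        i = args.index("--")
        revs, paths = args[:i], args[i + 1:]
        for r in revs:
            if not is_rev(r):      # no filenames are accepted before "--"
                raise ValueError("not a revision before '--': " + r)
        return revs, paths
    revs, paths = [], []
    for a in args:
        if not paths and is_rev(a):
            revs.append(a)
        else:
            if not path_exists(a):  # every listed path is checked, not just the first
                raise ValueError("bad revision or path: " + a)
            paths.append(a)
    return revs, paths
```

With "--" present, whatever follows is taken as a path without checking; without it, a misspelled path anywhere in the list is reported instead of being silently accepted.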
The new argument parsing automatically also gives us "--default" and
"--not" handling as in git-rev-parse.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
These two sample hooks try to detect and use the corresponding
commit hook from the same repository. However, they forgot to
set up GIT_DIR for their own use, so this was not in effect.
Teach the "git" command to handle some commands internally
This is another patch in the "prepare to do more in C" series, where the
git wrapper command is taught about the notion of handling some
functionality internally.
Right now, the only internal commands are "version" and "help", but the
point being that we can now easily extend it to handle some of the trivial
scripts internally. Things like "git log" and "git diff" wouldn't need
separate external scripts any more.
This also implies that to support the old "git-log" and "git-diff" syntax,
the "git" wrapper now automatically looks at the name it was executed as,
and if it is "git-xxxx", it will assume that it is to internally do what
"git xxxx" would do.
In other words, you can (once you implement an internal command) soft- or
hard-link that command to the "git" wrapper command, and it will do the
right thing, whether you use the "git xxxx" or the "git-xxxx" format.
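The dispatch idea can be sketched as below. The wrapper itself is C; this Python sketch only shows the argv[0] handling, and the INTERNAL table, the handler outputs, and the function names are invented for illustration.

```python
import os

# Hypothetical table of commands handled inside the wrapper itself.
INTERNAL = {
    "version": lambda args: "git version (sketch)",
    "help": lambda args: "usage: git COMMAND [ARGS]",
}


def resolve_command(argv):
    """Return (command, args) for both 'git xxxx' and 'git-xxxx' invocations."""
    prog = os.path.basename(argv[0])
    if prog.startswith("git-"):
        return prog[len("git-"):], argv[1:]   # invoked through a link named git-xxxx
    if len(argv) > 1:
        return argv[1], argv[2:]
    return "help", []


def run(argv):
    cmd, args = resolve_command(argv)
    handler = INTERNAL.get(cmd)
    if handler is None:
        return "would exec external git-" + cmd   # fall back to an external program
    return handler(args)
```

Here run(["git-version"]) and run(["git", "version"]) take the same path, which is what makes soft- or hard-linking the wrapper under the old names work.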
There's one other change: the search order for external programs is
modified slightly, so that the first entry remains GIT_EXEC_DIR, but the
second entry is the same directory as the git wrapper itself was executed
out of - if we can figure it out from argv[0], of course.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When compiling on ia64 I get this warning (from gcc 3.4.3):
gcc -o pack-objects.o -c -g -O2 -Wall -DSHA1_HEADER='<openssl/sha.h>' pack-objects.c
pack-objects.c: In function `pack_revindex_ix':
pack-objects.c:94: warning: cast from pointer to integer of different size
A double cast (first to long, then to int) shuts gcc up, but is there
a better way?
[jc: Andreas Ericsson suggests to use ulong instead. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/rev-list:
rev-list --objects: use full pathname to help hashing.
rev-list --objects-edge: remove duplicated edge commit output.
rev-list --objects-edge
* jc/pack-thin:
pack-objects: hash basename and direname a bit differently.
pack-objects: allow "thin" packs to exceed depth limits
pack-objects: use full pathname to help hashing with "thin" pack.
pack-objects: thin pack micro-optimization.
Use thin pack transfer in "git fetch".
Add git-push --thin.
send-pack --thin: use "thin pack" delta transfer.
Thin pack - create packfile with missing delta base.
* master:
Merge branches 'jc/rev-list' and 'jc/pack-thin'
gitview: Code cleanup
Add missing programs to ignore list
git ls files recursively show ignored files
Build and install git-mailinfo.
gitview: Bump the rev
gitview: Fix DeprecationWarning
* jc/rev-list:
rev-list --objects: use full pathname to help hashing.
rev-list --objects-edge: remove duplicated edge commit output.
rev-list --objects-edge
* jc/pack-thin:
pack-objects: hash basename and direname a bit differently.
pack-objects: allow "thin" packs to exceed depth limits
pack-objects: use full pathname to help hashing with "thin" pack.
pack-objects: thin pack micro-optimization.
Use thin pack transfer in "git fetch".
Add git-push --thin.
send-pack --thin: use "thin pack" delta transfer.
Thin pack - create packfile with missing delta base.
Conflicts:
pack-objects.c (manual adjustment for thin pack needed)
send-pack.c
This fixes all the known issues with the graph display.
The bug needs to be explained graphically:
                                  |
                                  a
This line need not be there ---->| \
                                  b  |
                                  | /
                                  c
c is the parent of a; a, b and c are all placed on the same line, and b
is a child of c.
With my last checkin I added a separate line to indicate that a is
connected to c. But then we had the line connecting a and b, which should
not be there. This change fixes that bug.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Make git-ls-files --others --ignored recurse into non-excluded
subdirectories.
Typically when asking git-ls-files to display all files which are
ignored by one or more exclude patterns one would want it to recurse
into subdirectories which are not themselves excluded to see if
there are any excluded files contained within those subdirectories.
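The traversal can be sketched on an in-memory tree. This is only an illustration: ignored_files is a hypothetical helper, directories are modelled as nested dicts with None for files, and fnmatch on the entry name stands in for git's exclude pattern matching.

```python
from fnmatch import fnmatch


def ignored_files(tree, patterns, prefix=""):
    """Return paths of ignored files, descending into non-excluded directories."""
    result = []
    for name, node in sorted(tree.items()):
        path = prefix + name
        excluded = any(fnmatch(name, p) for p in patterns)
        if isinstance(node, dict):
            if not excluded:       # recurse only into directories that are not excluded
                result += ignored_files(node, patterns, path + "/")
        elif excluded:
            result.append(path)
    return result
```

With patterns "*.o" and "build", a tree containing a.o, src/b.o and build/c.o reports a.o and src/b.o; the sketch does not descend into the excluded build directory.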
Merge branches 'jc/rev-list' and 'jc/pack-thin' into next
* jc/rev-list:
rev-list --objects: use full pathname to help hashing.
* jc/pack-thin:
pack-objects: hash basename and direname a bit differently.
pack-objects: allow "thin" packs to exceed depth limits
pack-objects: use full pathname to help hashing with "thin" pack.
* np/delta:
Revert "diff-delta: produce optimal pack data"
Tweak break/merge score to adjust to the new delta generation code.
count-delta: fix counting of copied source.
It turns out that the new algorithm has a really bad corner
case that literally spends minutes on inputs that take less
than a quarter of a second to delta with the old algorithm. The
resulting delta is 50% smaller, which is admirable, but the
performance degradation is simply unacceptable for unconditional
use.
Some example cases are these blobs in the Linux 2.6 repository:
rev-list --objects: use full pathname to help hashing.
This helps to group the same files from different revs together,
while spreading files with the same basename in different
directories, to help pack-object.
pack-objects: allow "thin" packs to exceed depth limits
When creating a new pack to be used in .git/objects/pack/
directory, we carefully count the depth of deltified objects to
be reused, so that the generated pack does not exceed the
specified depth limit for runtime efficiency. However, when we
are generating a thin pack that does not contain base objects,
such a pack can only be used during network transfer that is
expanded on the other end upon reception, so being careful and
artificially cutting the delta chain does not buy us anything
except increased bandwidth requirement. This patch disables the
delta chain depth limit check when reusing an existing delta.
gitview: Display the lines joining commit nodes clearly.
Since I wanted to limit the graph box size, I was resetting
the window after an index of 5. This resulted in lines joining
commit nodes passing over nodes which are not related. This
change fixes that.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Tweak break/merge score to adjust to the new delta generation code.
This lowers the default merge threshold score to 75% from
earlier 80%. The break threshold stays the same at 50% for now,
but we might want to revisit it (and the rename detection limit
as well).
* break score: this much edit (both insertion of new material
and deletion of old material) needs to be there in the file
before we consider this _might_ be a rewrite and break the
filepair.
* merge score: after a filepair is broken by the above criteria
and goes through rename detection, if their pieces did not
match with other files as rename/copy, we merge them back
into one as if nothing happened. If the filepair had at
least this much deletion of old material, however, we say
this is completely rewritten with dissimilarity index X% when
we do so.
The updated delta code by Nico is so good that what we earlier
thought to be complete rewrite now reuses a lot more from the
source material (reducing the counted "delete"), so this
adjustment is needed to keep the perceived behaviour similar to
what we had earlier.
count-delta: tweak counting of copied source material.
With the finer grained delta algorithm, count-delta algorithm
started overcounting copied source material, since the new delta
output tends to reuse the same source range more than once and
more aggressively. This broke an earlier assumption that the
number of bytes copied out from the source buffer is a good
approximation of how much source material is actually remaining in
the result.
This uses a fairly inefficient algorithm to keep track of ranges
of source material that are actually copied out to the
destination buffer. With this tweak, the obvious rename/break
detection tests in the testsuite start to work again.
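The difference between summing copy lengths and counting distinct source ranges can be sketched as follows. This is not the algorithm used in count-delta: copied_source_bytes is a hypothetical helper that takes (offset, length) pairs for the copy instructions and measures the union of the ranges after sorting them.

```python
def copied_source_bytes(copies):
    """Count source bytes covered by at least one (offset, length) copy."""
    total, covered_end = 0, 0
    for start, length in sorted(copies):
        end = start + length
        if end > covered_end:
            # Only the part beyond what is already covered is new material.
            total += end - max(start, covered_end)
            covered_end = end
    return total
```

For the copies (0, 10), (5, 10), (0, 10) and (30, 5), a plain sum of lengths gives 35 bytes, while only 20 distinct source bytes are actually reused.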
pack-objects: use full pathname to help hashing with "thin" pack.
This uses the same hashing algorithm for the "preferred base
tree" objects and the incoming pathnames, to group the same
files from different revs together, while spreading files with
the same basename in different directories.
Since we sort objects by type, hash, preferredness and then
size, after we have a delta against a preferred base, there is no
point trying a delta with a non-preferred base. This seems to
save expensive calls to diff-delta, and it also seems to save
output space.
This implements "eye candy" similar to the pack-object/unpack-object
to entertain users while a large tree is being checked out after
a clone or a pull.