* kh/commit:
Export rerere() and launch_editor().
Introduce entry point add_interactive and add_files_to_cache
Enable wt-status to run against non-standard index file.
Enable wt-status output to a given FILE pointer.
* maint:
Update GIT 1.5.3.5 Release Notes
git-rebase--interactive.sh: Make 3-way merge strategies work for -p.
git-rebase--interactive.sh: Don't pass a strategy to git-cherry-pick.
Fix --strategy parsing in git-rebase--interactive.sh
Make merge-recursive honor diff.renamelimit
cherry-pick/revert: more compact user direction message
core-tutorial: Use new syntax for git-merge.
git-merge: document but discourage the historical syntax
Prevent send-pack from segfaulting (backport from 'master')
Documentation/git-cvsexportcommit.txt: s/mgs/msg/ in example
* cc/skip:
Bisect: add "skip" to the short usage string.
Bisect run: "skip" current commit if script exit code is 125.
Bisect: add a "bisect replay" test case.
Bisect: add "bisect skip" to the documentation.
Bisect: refactor "bisect_{bad,good,skip}" into "bisect_state".
Bisect: refactor some logging into "bisect_write".
Bisect: refactor "bisect_write_*" functions.
Bisect: implement "bisect skip" to mark untestable revisions.
Bisect: fix some white spaces and empty lines breakages.
rev-list documentation: add "--bisect-all".
rev-list: implement --bisect-all
* lt/rename:
Do the fuzzy rename detection limits with the exact renames removed
Fix ugly magic special case in exact rename detection
Do exact rename detection regardless of rename limits
Do linear-time/space rename logic for exact renames
copy vs rename detection: avoid unnecessary O(n*m) loops
Ref-count the filespecs used by diffcore
Split out "exact content match" phase of rename detection
Add 'diffcore.h' to LIB_H
* ds/gitweb:
gitweb: Use chop_and_escape_str in more places.
gitweb: Refactor abbreviation-with-title-attribute code.
gitweb: Provide title attributes for abbreviated author names.
git-rebase--interactive.sh: Make 3-way merge strategies work for -p.
git-rebase--interactive.sh used to pass all parents of a merge commit to
git-merge, which means that we have at least 3 heads to merge: HEAD,
first parent and second parent. So 3-way merge strategies like recursive
wouldn't work.
Fortunately, we have checked out the first parent right before the merge
anyway, so that is HEAD. Therefore we can drop simply it from the list
of parents, making 3-way strategies work for merge commits with only
two parents.
Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-rebase--interactive.sh: Don't pass a strategy to git-cherry-pick.
git-cherry-pick doesn't support a strategy paramter, so don't pass one.
This means that --strategy for interactive rebases is a no-op for
anything but merge commits, but that's still better than being broken. A
correct fix would probably need to port the --merge behaviour from plain
git-rebase.sh, but I have no clue how to integrate that cleanly.
Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix --strategy parsing in git-rebase--interactive.sh
For the --strategy/-s option, git-rebase--interactive.sh dropped the
parameter which it was trying to parse.
Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
It might be a sign of source code management gone bad, but when two branches
has diverged almost beyond recognition and time has come for the branches to
merge, the user is going to need all the help his tool can give him. Honoring
diff.renamelimit has great potential as a painkiller in such situations.
The painkiller effect could have been achieved by e.g. 'merge.renamelimit',
but the flexibility gained by a separate option is questionable: our user
would probably expect git to detect renames equally good when merging as
when diffing (I known I did).
Signed-off-by: Lars Hjemli <hjemli@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-merge: document but discourage the historical syntax
Historically "git merge" took its command line arguments in a
rather strange order. Document the historical syntax, and also
document clearly that it is not encouraged in new scripts.
There is no reason to deprecate the historical syntax, as the
current code can sanely tell which syntax the caller is using,
and existing scripts by people do use the historical syntax.
If we can't find a source match, and we have no destination, we
need to abort the match function early before we try to match
the destination against the remote.
There are some cases when one line from "raw" git-diff output (raw
format) corresponds to more than one patch in the patchset git-diff
output; we call this situation "split patch". Old code misdetected
subsequent patches (for different files) with the same pre-image and
post-image as fragments of "split patch", leading to mislabeled
from-file/to-file diff header etc.
Old code used pre-image and post-image SHA-1 identifier ('from_id' and
'to_id') to check if current patch corresponds to old raw diff format
line, to find if one difftree raw line coresponds to more than one
patch in the patch format. Now we use post-image filename for that.
This assumes that post-image filename alone can be used to identify
difftree raw line. In the case this changes (which is unlikely
considering current diff engine) we can add 'from_id' and 'to_id'
to detect "patch splitting" together with 'to_file'.
Because old code got pre-image and post-image SHA-1 identifier for the
patch from the "index" line in extended diff header, diff header had
to be buffered. New code takes post-image filename from "git diff"
header, which is first line of a patch; this allows to simplify
git_patchset_body code. A side effect of resigning diff header
buffering is that there is always "diff extended_header" div, even
if extended diff header is empty.
Alternate solution would be to check when git splits patches, and do
not check if parsed info from current patch corresponds to current or
next raw diff format output line. Git splits patches only for 'T'
(typechange) status filepair, and there always two patches
corresponding to one raw diff line. It was not used because it would
tie gitweb code to minute details of git diff output.
While at it, use newly introduced parsed_difftree_line wrapper
subroutine in git_difftree_body.
Noticed-by: Yann Dirson <ydirson@altern.org> Diagnosed-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct handling of upload-pack in builtin-fetch-pack
The field in the args was being ignored in favor of a static constant
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Thanked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Try to avoid a lot of work scanning for excluded files,
by caching some more information when setting up the exclusion
data structure.
Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and
reduces the amount of instructions executed (as measured by valgrind) by a
factor of 2.
* maint:
RelNotes-1.5.3.5: describe recent fixes
merge-recursive.c: mrtree in merge() is not used before set
sha1_file.c: avoid gcc signed overflow warnings
Fix a small memory leak in builtin-add
honor the http.sslVerify option in shell scripts
merge-recursive.c: mrtree in merge() is not used before set
The called function merge_trees() sets its *result, to which the
address of the variable mrtree in merge() function is passed,
only when index_only is set. But that is Ok as the function
uses the value in the variable only under index_only iteration.
However, recent gcc does not realize this. Work it around by
adding a fake initializer.
sha1_file.c: In check_packed_git_:
sha1_file.c:527: warning: assuming signed overflow does not
occur when assuming that (X + c) < X is always false
sha1_file.c:527: warning: assuming signed overflow does not
occur when assuming that (X + c) < X is always false
for a piece of code that tries to make sure that off_t is large
enough to hold more than 2^32 offset. The test tried to make
sure these do not wrap-around:
/* make sure we can deal with large pack offsets */
off_t x = 0x7fffffffUL, y = 0xffffffffUL;
if (x > (x + 1) || y > (y + 1)) {
but gcc assumes it can do whatever optimization it wants for a
signed overflow (undefined behaviour) and warns about this
construct.
Follow Linus's suggestion to check sizeof(off_t) instead to work
around the problem.
No longer talk about Cogito since it's deprecated. Some scripts (such as
git-reset or git-branch) have undergone builtinification so adjust the text
to reflect this.
Fix a typo in the description of git-show-branch (merges are indicated by a
`-', not by a `.').
git-pull/git-push do not seem to use the dumb git-ssh-fetch/git-ssh-upload
(the text was probably missing a word).
Adjust a link that wasn't rendered properly because it was wrapped.
Signed-off-by: Benoit Sigoure <tsuna@lrde.epita.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-fetch: do not fail when remote branch disappears
When the branch named with branch.$name.merge is not covered by
the fetch configuration for the remote repository named with
branch.$name.remote, we automatically add that branch to the set
of branches to be fetched. However, if the remote repository
does not have that branch (e.g. it used to exist, but got
removed), this is not a reason to fail the git-fetch itself.
The situation however will be noticed if git-fetch was called by
git-pull, as the resulting FETCH_HEAD would not have any entry
that is marked for merging.
Acked-By: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of git://git.kernel.org/pub/scm/gitk/gitk: (34 commits)
gitk: Use the UI font for the diff/old version/new version radio buttons
gitk: Simplify the code for finding commits
gitk: Fix a couple more bugs in the path limiting
gitk: Fix some bugs with path limiting in the diff display
gitk: Use the status window for other functions
gitk: Integrate the reset progress bar in the main frame
gitk: Ensure tabstop setting gets restored by Cancel button
gitk: Limit diff display to listed paths by default
gitk: Fix Tcl error: can't unset findcurline
gitk: Get rid of the diffopts variable
gitk: Fix bug where the last few commits would sometimes not be visible
gitk: Add a font chooser
gitk: Keep track of font attributes ourselves instead of using font actual
gitk: Use named fonts instead of the font specification
gitk: Fix bug causing Tcl error when changing find match type
gitk: Fix the tab setting in the diff display window
gitk: Add progress bars for reading in stuff and for finding
gitk: Fix a couple of bugs
gitk: Simplify highlighting interface and combine with Find function
gitk: Fix bug in generating patches
...
gitk: Use the UI font for the diff/old version/new version radio buttons
This makes the radio buttons for selecting whether to see the full diff,
the old version or the new version use the same font as the other user
interface elements.
This unifies findmore and findmorerev, and adds the ability to do
a search with or without wrap around from the end of the list of
commits to the beginning (or vice versa for reverse searches).
findnext and findprev are gone, and the buttons and keys for searching
all call dofind now. dofind doesn't unmark the matches to start with.
Shift-up and shift-down are back by popular request, and the searches
they do don't wrap around. The other keys that do searches (/, ?,
return, M-f) do wrapping searches except for M-g.
Bisect run: "skip" current commit if script exit code is 125.
This is incompatible with previous versions because an exit code
of 125 used to mark current commit as "bad". But hopefully this exit
code is not much used by test scripts or other programs. (126 and 127
are used by POSIX compliant shells to mean "found but not
executable" and "command not found", respectively.)
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also fix "bisect bad" and "bisect good" short usage description.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bisect: refactor "bisect_{bad,good,skip}" into "bisect_state".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bisect: refactor some logging into "bisect_write".
Also use "die" instead of "echo >&2 something ; exit 1".
And simplify "bisect_replay".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bisect: implement "bisect skip" to mark untestable revisions.
When there are some "skip"ped revisions, we add the '--bisect-all'
option to "git rev-list --bisect-vars". Then we filter out the
"skip"ped revisions from the result of the rev-list command, and we
modify the "bisect_rev" var accordingly.
We don't always use "--bisect-all" because it is slower
than "--bisect-vars" or "--bisect".
When we cannot find for sure the first bad commit because of
"skip"ped commits, we print the hash of each possible first bad
commit and then we exit with code 2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Do the fuzzy rename detection limits with the exact renames removed
When we do the fuzzy rename detection, we don't care about the
destinations that we already handled with the exact rename detector.
And, in fact, the code already knew that - but the rename limiter, which
used to run *before* exact renames were detected, did not.
This fixes it so that the rename detection limiter now bases its
decisions on the *remaining* rename counts, rather than the original
ones.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix ugly magic special case in exact rename detection
For historical reasons, the exact rename detection had populated the
filespecs for the entries it compared, and the rest of the similarity
analysis depended on that. I hadn't even bothered to debug why that was
the case when I re-did the rename detection, I just made the new one
have the same broken behaviour, with a note about this special case.
This fixes that fixme. The reason the exact rename detector needed to
fill in the file sizes of the files it checked was that the _inexact_
rename detector was broken, and started comparing file sizes before it
filled them in.
Fixing that allows the exact phase to do the sane thing of never even
caring (since all *it* cares about is really just the SHA1 itself, not
the size nor the contents).
It turns out that this also indirectly fixes a bug: trying to populate
all the filespecs will run out of virtual memory if there is tons and
tons of possible rename options. The fuzzy similarity analysis does the
right thing in this regard, and free's the blob info after it has
generated the hash tables, so the special case code caused more trouble
than just some extra illogical code.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Do exact rename detection regardless of rename limits
Now that the exact rename detection is linear-time (with a very small
constant factor to boot), there is no longer any reason to limit it by
the number of files involved.
In some trivial testing, I created a repository with a directory that
had a hundred thousand files in it (all with different contents), and
then moved that directory to show the effects of renaming 100,000 files.
With the new code, that resulted in
[torvalds@woody big-rename]$ time ~/git/git show -C | wc -l
400006
real 0m2.071s
user 0m1.520s
sys 0m0.576s
ie the code can correctly detect the hundred thousand renames in about 2
seconds (the number "400006" comes from four lines for each rename:
diff --git a/really-big-dir/file-1-1-1-1-1 b/moved-big-dir/file-1-1-1-1-1
similarity index 100%
rename from really-big-dir/file-1-1-1-1-1
rename to moved-big-dir/file-1-1-1-1-1
and the extra six lines is from a one-liner commit message and all the
commit information and spacing).
Most of those two seconds weren't even really the rename detection, it's
really all the other stuff needed to get there.
With the old code, this wouldn't have been practically possible. Doing
a pairwise check of the ten billion possible pairs would have been
prohibitively expensive. In fact, even with the rename limiter in
place, the old code would waste a lot of time just on the diff_filespec
checks, and despite not even trying to find renames, it used to look
like:
[torvalds@woody big-rename]$ time git show -C | wc -l 1400006
real 0m12.337s
user 0m12.285s
sys 0m0.192s
ie we used to take 12 seconds for this load and not even do any rename
detection! (The number 1400006 comes from fourteen lines per file moved:
seven lines each for the delete and the create of a one-liner file, and
the same extra six lines of commit information).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Do linear-time/space rename logic for exact renames
This implements a smarter rename detector for exact renames, which
rather than doing a pairwise comparison (time O(m*n)) will just hash the
files into a hash-table (size O(n+m)), and only do pairwise comparisons
to renames that have the same hash (time O(n+m) except for unrealistic
hash collissions, which we just cull aggressively).
Admittedly the exact rename case is not nearly as interesting as the
generic case, but it's an important case none-the-less. A similar general
approach should work for the generic case too, but even then you do need
to handle the exact renames/copies separately (to avoid the inevitable
added cost factor that comes from the _size_ of the file), so this is
worth doing.
In the expectation that we will indeed do the same hashing trick for the
general rename case, this code uses a generic hash-table implementation
that can be used for other things too. In fact, we might be able to
consolidate some of our existing hash tables with the new generic code
in hash.[ch].
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
copy vs rename detection: avoid unnecessary O(n*m) loops
The core rename detection had some rather stupid code to check if a
pathname was used by a later modification or rename, which basically
walked the whole pathname space for all renames for each rename, in
order to tell whether it was a pure rename (no remaining users) or
should be considered a copy (other users of the source file remaining).
That's really silly, since we can just keep a count of users around, and
replace all those complex and expensive loops with just testing that
simple counter (but this all depends on the previous commit that shared
the diff_filespec data structure by using a separate reference count).
Note that the reference count is not the same as the rename count: they
behave otherwise rather similarly, but the reference count is tied to
the allocation (and decremented at de-allocation, so that when it turns
zero we can get rid of the memory), while the rename count is tied to
the renames and is decremented when we find a rename (so that when it
turns zero we know that it was a rename, not a copy).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than copy the filespecs when introducing new versions of them
(for rename or copy detection), use a refcount and increment the count
when reusing the diff_filespec.
This avoids unnecessary allocations, but the real reason behind this is
a future enhancement: we will want to track shared data across the
copy/rename detection. In order to efficiently notice when a filespec
is used by a rename, the rename machinery wants to keep track of a
rename usage count which is shared across all different users of the
filespec.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split out "exact content match" phase of rename detection
This makes the exact content match a separate function of its own.
Partly to cut down a bit on the size of the diffcore_rename() function
(which is too complex as it is), and partly because there are smarter
ways to do this than an O(m*n) loop over it all, and that function
should be rewritten to take that into account.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code generating perl/Makefile from Makefile.PL was causing trouble
because it didn't considered NO_PERL_MAKEMAKER and ran makemaker
unconditionally, rewriting perl.mak. Makemaker is FUBAR in ActiveState Perl,
and perl/Makefile has a replacement for it.
Besides, a changed Git.pm is *NOT* a reason to rebuild all the perl scripts,
so remove the dependency too.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* db/fetch-pack: (60 commits)
Define compat version of mkdtemp for systems lacking it
Avoid scary errors about tagged trees/blobs during git-fetch
fetch: if not fetching from default remote, ignore default merge
Support 'push --dry-run' for http transport
Support 'push --dry-run' for rsync transport
Fix 'push --all branch...' error handling
Fix compilation when NO_CURL is defined
Added a test for fetching remote tags when there is not tags.
Fix a crash in ls-remote when refspec expands into nothing
Remove duplicate ref matches in fetch
Restore default verbosity for http fetches.
fetch/push: readd rsync support
Introduce remove_dir_recursively()
bundle transport: fix an alloc_ref() call
Allow abbreviations in the first refspec to be merged
Prevent send-pack from segfaulting when a branch doesn't match
Cleanup unnecessary break in remote.c
Cleanup style nit of 'x == NULL' in remote.c
Fix memory leaks when disconnecting transport instances
Ensure builtin-fetch honors {fetch,transfer}.unpackLimit
...
git-remote: fix "Use of uninitialized value in string ne"
martin f krafft <madduck@madduck.net> writes:
> piper:~> git remote show origin
> * remote origin
> URL: ssh://git.madduck.net/~/git/etc/mailplate.git
> Use of uninitialized value in string ne at /usr/local/stow/git/bin/git-remote line 248.
This is because there might not be branch.<name>.remote defined but
the code unconditionally dereferences $branch->{$name}{'REMOTE'} and
compares with another string.
Tested-by: Martin F Krafft <madduck@madduck.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
First, paths ending in a slash were not matching anything. This fixes
path_filter to handle paths ending in a slash (such entries have to
match a directory, and can't match a file, e.g., foo/bar/ can't match
a plain file called foo/bar).
Secondly, clicking in the file list pane (bottom right) was broken
because $treediffs($ids) contained all the files modified by the
commit, not just those within the file list. This fixes that too.
gitk: Fix some bugs with path limiting in the diff display
First, we weren't putting "--" between the ids and the paths in the
git diff-tree/diff-index/diff-files command, so if there was a tag
and a file with the same name, we could get an ambiguity in the
command. This puts the "--" in to make it clear that the paths are
paths.
Secondly, this implements the path limiting for merge diffs as well
as the normal 2-way diffs.
gitk: Integrate the reset progress bar in the main frame
This makes the reset function use a progress bar in the same location
as the progress bars for reading in commits and for finding commits,
instead of a progress bar in a separate detached window. The progress
bar for resetting is red.
This also puts "Resetting" in the status window while the reset is in
progress. The setting of the status window is done through an
extension of the interface used for setting the watch cursor.
gitk: Ensure tabstop setting gets restored by Cancel button
We weren't restoring the tabstop setting if the user pressed the
Cancel button in the Edit/Preferences window. Also improved the
label for the checkbox (made it "Tab spacing" rather than the laconic
"tabstop") and moved it above the "Display nearby tags" checkbox.
gitk: Limit diff display to listed paths by default
When the user has specified a list of paths, either on the command line
or when creating a view, gitk currently displays the diffs for all files
that a commit has modified, not just the ones that match the path list.
This is different from other git commands such as git log. This change
makes gitk behave the same as these other git commands by default, that
is, gitk only displays the diffs for files that match the path list.
There is now a checkbox labelled "Limit diffs to listed paths" in the
Edit/Preferences pane. If that is unchecked, gitk will display the
diffs for all files as before.
When gitk is run with the --merge flag, it will get the list of unmerged
files at startup, intersect that with the paths listed on the command line
(if any), and use that as the list of paths.
Use PRIuMAX instead of 'unsigned long long' in show-index
Elsewhere in Git we already use PRIuMAX and cast to uintmax_t when
we need to display a value that is 'very big' and we're not exactly
sure what the largest display size is for this platform.
This particular fix is needed so we can do the incredibly crazy
temporary hack of:
allowing us to more easily look for locations where we are passing
a pointer to an 8 byte value to a function that expects a 4 byte
value. This can occur on some platforms where sizeof(long) == 8
and sizeof(size_t) == 4.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* maint:
Describe more 1.5.3.5 fixes in release notes
Fix diffcore-break total breakage
Fix directory scanner to correctly ignore files without d_type
Improve receive-pack error message about funny ref creation
fast-import: Fix argument order to die in file_change_m
git-gui: Don't display CR within console windows
git-gui: Handle progress bars from newer gits
git-gui: Correctly report failures from git-write-tree
gitk.txt: Fix markup.
send-pack: respect '+' on wildcard refspecs
git-gui: accept versions containing text annotations, like 1.5.3.mingw.1
git-gui: Don't crash when starting gitk from a browser session
git-gui: Allow gitk to be started on Cygwin with native Tcl/Tk
git-gui: Ensure .git/info/exclude is honored in Cygwin workdirs
git-gui: Handle starting on mapped shares under Cygwin
git-gui: Display message box when we cannot find git in $PATH
git-gui: Avoid using bold text in entire gui for some fonts
Ok, so on the kernel list, some people noticed that "git log --follow"
doesn't work too well with some files in the x86 merge, because a lot of
files got renamed in very special ways.
In particular, there was a pattern of doing single commits with renames
that looked basically like
- rename "filename.h" -> "filename_64.h"
- create new "filename.c" that includes "filename_32.h" or
"filename_64.h" depending on whether we're 32-bit or 64-bit.
which was preparatory for smushing the two trees together.
Now, there's two issues here:
- "filename.c" *remained*. Yes, it was a rename, but there was a new file
created with the old name in the same commit. This was important,
because we wanted each commit to compile properly, so that it was
bisectable, so splitting the rename into one commit and the "create
helper file" into another was *not* an option.
So we need to break associations where the contents change too much.
Fine. We have the -B flag for that. When we break things up, then the
rename detection will be able to figure out whether there are better
alternatives.
- "git log --follow" didn't with with -B.
Now, the second case was really simple: we use a different "diffopt"
structure for the rename detection than the basic one (which we use for
showing the diffs). So that second case is trivially fixed by a trivial
one-liner that just copies the break_opt values from the "real" diffopts
to the one used for rename following. So now "git log -B --follow" works
fine:
however, the end result does *not* work. Because our diffcore-break.c
logic is totally bogus!
In particular:
- it used to do
if (base_size < MINIMUM_BREAK_SIZE)
return 0; /* we do not break too small filepair */
which basically says "don't bother to break small files". But that
"base_size" is the *smaller* of the two sizes, which means that if some
large file was rewritten into one that just includes another file, we
would look at the (small) result, and decide that it's smaller than the
break size, so it cannot be worth it to break it up! Even if the other
side was ten times bigger and looked *nothing* like the samell file!
That's clearly bogus. I replaced "base_size" with "max_size", so that
we compare the *bigger* of the filepair with the break size.
- It calculated a "merge_score", which was the score needed to merge it
back together if nothing else wanted it. But even if it was *so*
different that we would never want to merge it back, we wouldn't
consider it a break! That makes no sense. So I added
if (*merge_score_p > break_score)
return 1;
to make it clear that if we wouldn't want to merge it at the end, it
was *definitely* a break.
- It compared the whole "extent of damage", counting all inserts and
deletes, but it based this score on the "base_size", and generated the
damage score with
but that makes no sense either, since quite often, this will result in
a number that is *bigger* than MAX_SCORE! Why? Because base_size is
(again) the smaller of the two files we compare, and when you start out
from a small file and add a lot (or start out from a large file and
remove a lot), the base_size is going to be much smaller than the
damage!
Again, the fix was to replace "base_size" with "max_size", at which
point the damage actually becomes a sane percentage of the whole.
With these changes in place, not only does "git log -B --follow" work for
the case that triggered this in the first place, ie now
actually gives reasonable results. But I also wanted to verify it in
general, by doing a full-history
git log --stat -B -C
on my kernel tree with the old code and the new code.
There's some tweaking to be done, but generally, the new code generates
much better results wrt breaking up files (and then finding better rename
candidates). Here's a few examples of the "--stat" output:
Now, in some other cases, it does actually turn a rename into a real
"delete+create" pair, and then the diff is usually bigger, so truth in
advertizing: it doesn't always generate a nicer diff. But for what -B was
meant for, I think this is a big improvement, and I suspect those cases
where it generates a bigger diff are tweakable.
So I think this diff fixes a real bug, but we might still want to tweak
the default values and perhaps the exact rules for when a break happens.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Fix directory scanner to correctly ignore files without d_type
On Fri, 19 Oct 2007, Todd T. Fries wrote:
> If DT_UNKNOWN exists, then we have to do a stat() of some form to
> find out the right type.
That happened in the case of a pathname that was ignored, and we did
not ask for "dir->show_ignored". That test used to be *together*
with the "DTYPE(de) != DT_DIR", but splitting the two tests up
means that we can do that (common) test before we even bother to
calculate the real dtype.
Of course, that optimization only matters for systems that don't
have, or don't fill in DTYPE properly.
I also clarified the real relationship between "exclude" and
"dir->show_ignored". It used to do
if (exclude != dir->show_ignored) {
..
which wasn't exactly obvious, because it triggers for two different
cases:
- the path is marked excluded, but we are not interested in ignored
files: ignore it
- the path is *not* excluded, but we *are* interested in ignored
files: ignore it unless it's a directory, in which case we might
have ignored files inside the directory and need to recurse
into it).
so this splits them into those two cases, since the first case
doesn't even care about the type.
I also made a the DT_UNKNOWN case a separate helper function,
and added some commentary to the cases.
Linus
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* git://git.kernel.org/pub/scm/gitk/gitk:
gitk: Fix "can't unset prevlines(...)" Tcl error
gitk: Avoid an error when cherry-picking if HEAD has moved on
gitk: Check that we are running on at least Tcl/Tk 8.4
gitk: Do not pick up file names of "copy from" lines
gitk: Add support for OS X mouse wheel
gitk: disable colours when calling git log
Merge branch 'maint' of git://repo.or.cz/git-gui into maint
* 'maint' of git://repo.or.cz/git-gui:
git-gui: Don't display CR within console windows
git-gui: Handle progress bars from newer gits
git-gui: Correctly report failures from git-write-tree
git-gui: accept versions containing text annotations, like 1.5.3.mingw.1
git-gui: Don't crash when starting gitk from a browser session
git-gui: Allow gitk to be started on Cygwin with native Tcl/Tk
git-gui: Ensure .git/info/exclude is honored in Cygwin workdirs
git-gui: Handle starting on mapped shares under Cygwin
git-gui: Display message box when we cannot find git in $PATH
git-gui: Avoid using bold text in entire gui for some fonts
This fixes the error reported by Michele Ballabio, where gitk will
throw a Tcl error "can't unset prevlines(...)" when displaying a
commit that has a parent commit listed more than once, and the commit
is the first child of that parent.
The problem was basically that we had two variables, prevlines and
lineends, and were relying on the invariant that prevlines($id) was
set iff $id was in the lineends($r) list for some $r. But having
a duplicate parent breaks that invariant since we end up with the
parent listed twice in lineends.
This fixes it by simplifying the logic to use only a single variable,
lineend. It also rearranges things a little so that we don't try to
draw the line for the duplicated parent twice.