Invalidate cache-tree entries for touched paths in git-apply.
This updates git-apply to maintain cache-tree information. With
this and the previous write-tree patch, repeated "apply --index"
followed by "write-tree" on a huge tree will hopefully become
faster.
The updated write-tree reads from $GIT_DIR/index.aux to pick up
subtree objects information, updates the cache-tree with the
index, and updates index.aux file after writing a tree out of
the index file.
Until update-index and other programs that modify the index are
updated to maintain index.aux file, the index.aux file written
by the last write-tree will become stale immediately after they
update the index, and write-tree will then recompute the whole
tree, just like the original write-tree did.
The idea is to convert those commands to invalidate cache-tree
whenever they touch the index entries, and write updated
index.aux out. After the index is updated with them, write-tree
will be able to reuse the parts of the cache-tree that have not
been touched.
The cache_tree data structure is to cache tree object names that
would result from the current index file.
The idea is to have an optional file to record each tree object
name that corresponds to a directory path in the cache when we
run write_cache(), and read it back when we run read_cache().
During various index manipulations, we selectively invalidate
the parts so that the next write-tree can bypass regenerating
tree objects for unchanged parts of the directory hierarchy.
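
The invalidation idea can be sketched roughly as below. This is not
the actual cache-tree code; struct ct_node, ct_find_or_add() and
ct_invalidate_path() are hypothetical names made up for the
illustration. Each directory node remembers whether its cached tree
object name is still usable, and touching a path invalidates every
directory leading to it while leaving sibling directories alone.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cache-tree node: one per directory. */
struct ct_node {
	char name[64];          /* directory name; "" for the root */
	int valid;              /* nonzero if the cached tree name is usable */
	struct ct_node *child;  /* first subdirectory */
	struct ct_node *next;   /* next sibling */
};

/* Find the subdirectory "name" (length len) of dir, creating it if needed. */
struct ct_node *ct_find_or_add(struct ct_node *dir,
			       const char *name, size_t len)
{
	struct ct_node *n;

	for (n = dir->child; n; n = n->next)
		if (strlen(n->name) == len && !memcmp(n->name, name, len))
			return n;
	if (len >= sizeof(n->name))
		return NULL;
	n = calloc(1, sizeof(*n));  /* zero-filled: invalid, NUL-terminated name */
	if (!n)
		return NULL;
	memcpy(n->name, name, len);
	n->next = dir->child;
	dir->child = n;
	return n;
}

/*
 * Mark every directory leading to "path" as needing recomputation.
 * Sibling directories are not visited, so their cached names stay usable.
 */
void ct_invalidate_path(struct ct_node *root, const char *path)
{
	struct ct_node *dir = root;
	const char *slash;

	while (dir) {
		dir->valid = 0;
		slash = strchr(path, '/');
		if (!slash)
			break;  /* what remains is the file name itself */
		dir = ct_find_or_add(dir, path, (size_t)(slash - path));
		path = slash + 1;
	}
}
```

After invalidating "a/file.c", the root and "a" need to be rewritten,
while a sibling directory "b" keeps its valid flag and can be reused.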
We could perhaps make the cache-tree data an optional part of
the index file, but that would involve the index format updates,
so unless we need it for performance reasons, the current plan
is to use a separate file, $GIT_DIR/index.aux to store this
information and link it with the index file with the checksum
that is already used for index file integrity check.
make update-index --chmod work with multiple files and --stdin
The patch makes "--chmod=-x" and "--chmod=+x" act like "--add"
and "--remove" to affect the behaviour of the command for the
rest of the path parameters, not just the following one.
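
A rough sketch of the "sticky" option handling, assuming a simplified
argument loop; apply_chmod_args() and its flips[] output are
hypothetical and only stand in for the real update-index code. The
point is that the option sets a piece of state that stays in effect
for all later paths until another --chmod overrides it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Walk the arguments once.  "--chmod=+x" or "--chmod=-x" changes the
 * mode applied to every path that follows, until another --chmod
 * option overrides it.  Returns the number of paths that got a mode
 * change; flips[i] receives '+' or '-' for the i-th such path.
 */
int apply_chmod_args(int argc, const char **argv, char *flips, int max)
{
	char set_executable = 0;  /* 0: no change requested yet */
	int i, n = 0;

	for (i = 0; i < argc; i++) {
		const char *arg = argv[i];

		if (!strcmp(arg, "--chmod=+x")) {
			set_executable = '+';
			continue;
		}
		if (!strcmp(arg, "--chmod=-x")) {
			set_executable = '-';
			continue;
		}
		if (set_executable && n < max) {
			/* illustrative report only; not the command's real output */
			printf("chmod %cx '%s'\n", set_executable, arg);
			flips[n++] = set_executable;
		}
	}
	return n;
}
```

With the arguments "a --chmod=+x b c --chmod=-x d", path "a" is left
alone, "b" and "c" become executable, and "d" loses its executable bit.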
The second installment to libify diff brothers. The pathname
arguments are checked more strictly than before because we now
use the revision.c::setup_revisions() infrastructure.
This is the first installment to libify diff brothers.
The updated diff-files uses revision.c::setup_revisions()
infrastructure to parse its command line arguments, which means
the pathname arguments are checked more strictly than before.
The tests are adjusted to separate possibly missing paths from
the rest of arguments with double-dashes, to show the kosher
way.
As Linus pointed out, renaming diff.c to diff-lib.c was simply
stupid, so I am renaming it back. The new diff-lib.c is to
contain pieces extracted from diff brothers.
I hacked it up to teach it the git extended diff headers, and
made it not read the whole patch into the array.
Also, the original program, when arguments are given, ran "diff"
with the given arguments and showed the output from it. Of
course, I changed it to run "git diff" ;-).
* master:
Split up builtin commands into separate files from git.c
git-log produces no output
fix pack-object buffer size
mailinfo: decode underscore used in "Q" encoding properly.
Reintroduce svn pools to solve the memory leak.
pack-objects: do not stop at object that is "too small"
git-commit --amend: two fixes.
get_tree_entry(): make it available from tree-walk
sha1_name.c: no need to include diff.h; tree-walk.h will do.
sha1_name.c: prepare to make get_tree_entry() reusable from others.
get_sha1() shorthands for blob/tree objects
pre-commit hook: complain about conflict markers.
git-merge: a bit more readable user guidance.
diff: move diff.c to diff-lib.c to make room.
git log: don't do merge diffs by default
Allow "git repack" users to specify repacking window/depth
Document git-clone --reference
Fix filename scaling for binary files
Fix uninteresting tags in new revision parsing
Conflicts:
Adjusted the addition of fmt-patch to match the recent split
from git.c to builtin-log.c.
Split up builtin commands into separate files from git.c
Right now it splits things into "builtin-log.c" for log-related commands
("log", "show" and "whatchanged"), and "builtin-help.c" for the
informational commands (usage printing and "help" and "version").
This just makes things easier to read, I find.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* master:
fix pack-object buffer size
mailinfo: decode underscore used in "Q" encoding properly.
Reintroduce svn pools to solve the memory leak.
pack-objects: do not stop at object that is "too small"
* fix:
fix pack-object buffer size
mailinfo: decode underscore used in "Q" encoding properly.
Reintroduce svn pools to solve the memory leak.
pack-objects: do not stop at object that is "too small"
mailinfo: decode underscore used in "Q" encoding properly.
Quoted-Printable (RFC 2045) and the "Q" encoding (RFC 2047) are
subtly different; the latter is used on the mail header and an
underscore needs to be decoded to 0x20.
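
The difference can be sketched as follows. decode_q() and hexval()
here are hypothetical helpers written for illustration, not the
mailinfo code, and soft line breaks in quoted-printable bodies are
deliberately not handled.

```c
#include <stddef.h>

/* Value of a hex digit, or -1 if c is not one. */
int hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode "in" into "out", which must hold at least strlen(in) + 1
 * bytes.  With rfc2047 set, '_' decodes to a space as in the header
 * "Q" encoding; otherwise it is copied through unchanged as in body
 * quoted-printable.  Returns the decoded length.
 */
size_t decode_q(const char *in, char *out, int rfc2047)
{
	size_t n = 0;

	while (*in) {
		int hi, lo;

		if (*in == '=' &&
		    (hi = hexval((unsigned char)in[1])) >= 0 &&
		    (lo = hexval((unsigned char)in[2])) >= 0) {
			out[n++] = (char)((hi << 4) | lo);
			in += 3;
		} else if (rfc2047 && *in == '_') {
			out[n++] = 0x20;
			in++;
		} else {
			out[n++] = *in++;
		}
	}
	out[n] = '\0';
	return n;
}
```

Decoding "Hello_World=21" as a header yields "Hello World!", while
the same underscore in a message body is kept as an underscore.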
pack-objects: do not stop at object that is "too small"
Because we sort the delta window by name-hash and then size,
hitting an object that is too small to consider as a delta base
for the current object does not mean we do not have a better
candidate in the window beyond it.
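
A simplified sketch of why stopping early loses candidates. The
struct candidate layout, pick_base(), and the "less than half the
target size" threshold are all assumptions made for illustration and
do not reflect the real pack-objects heuristics.

```c
#include <stddef.h>

struct candidate {
	unsigned hash;  /* name hash, the primary sort key */
	size_t size;    /* object size, the secondary sort key */
};

/*
 * Return the index of the first usable delta base for an object of
 * "target" bytes, or -1 if there is none.  A candidate counts as
 * "too small" here when it is under half the target size (a made-up
 * threshold).  If stop_early is set, the scan gives up at the first
 * too-small candidate; otherwise it skips it and keeps looking.
 */
int pick_base(const struct candidate *win, int n, size_t target,
	      int stop_early)
{
	int i;

	for (i = 0; i < n; i++) {
		if (win[i].size < target / 2) {
			if (stop_early)
				break;
			continue;
		}
		return i;
	}
	return -1;
}
```

Because the window is ordered by name hash first, a small object with
one hash can sit in front of a perfectly good large object with the
next hash. Stopping at the small one never reaches the large one.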
Noticed by Shawn Pearce, analyzed by Nico, Linus and me.
When running "git commit --amend" only to fix the commit log
message without any content change, we mistakenly showed the
git-status output that says "nothing to commit" without
commenting it out.
If you have already run update-index but you want to amend the
top commit, "git commit --amend --only" without any paths should
have worked, because --only means "starting from the base
commit, update-index these paths only to prepare the index to
commit, and perform the commit". However, we refused -o without
paths.
I tried the code with pack-objects.c::try_delta(), and was
somewhat disappointed. The current type-path based heuristics
already limit the delta attempts to similar objects anyway, so
it is not a good place to apply it.
The Net never forgets, so we can resurrect it if we wanted to
later.
When a verbatim rename or copy is detected, we did not show
anything on the "diff --stat" for the filepair. This makes it
show the rename information.
* jc/unresolve:
Add git-unresolve <paths>...
get_tree_entry(): make it available from tree-walk
sha1_name.c: no need to include diff.h; tree-walk.h will do.
sha1_name.c: prepare to make get_tree_entry() reusable from others.
pre-commit hook: complain about conflict markers.
git-merge: a bit more readable user guidance.
diff: move diff.c to diff-lib.c to make room.
git log: don't do merge diffs by default
Allow "git repack" users to specify repacking window/depth
This is an attempt to address the issue raised on #git channel
recently by Carl Worth.
After a conflicted automerge, "git diff" shows a combined diff
to give you how the tentative automerge result differs from
what came from each branch. During a complex merge, it is
tempting to be able to resolve a few paths at a time, mark
them "I've dealt with them" with git-update-index to unclutter
the next "git diff" output, and keep going. However, when the
final result does not compile or is otherwise found to be a
mismerge, the workflow to fix the mismerged paths suddenly
changes to "git diff HEAD -- path" (to get a diff from our
HEAD before merging) and "git diff MERGE_HEAD -- path" (to get
a diff from theirs), and it cannot show the combined anymore.
With git-unresolve <paths>..., the versions from our branch and
their branch for specified blobs are placed in stage #2 and
stage #3, without touching the working tree files. This gives
you the combined diff back for easier review, along with
"diff --ours" and "diff --theirs".
One thing it does not do is to place the base in stage #1; this
means "diff --base" would behave differently between the run
immediately after a conflicted three-way merge, and the run
after an update-index by mistake followed by a git-unresolve.
We could theoretically run merge-base between HEAD and
MERGE_HEAD to find which tree to place in stage #1, but
reviewing "diff --base" is not that useful so....
* lt/xsha1:
get_tree_entry(): make it available from tree-walk
sha1_name.c: no need to include diff.h; tree-walk.h will do.
sha1_name.c: prepare to make get_tree_entry() reusable from others.
get_sha1() shorthands for blob/tree objects
Several <<< or === or >>> characters at the beginning of a line
are very likely to be leftover conflict markers from a failed
automerge the user resolved incorrectly, so detect them.
As usual, this can be defeated with "git commit --no-verify" if
you really do want to have those files, just like changes that
introduce trailing whitespaces.
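
The check amounts to something like the sketch below. It is written
in C for illustration only; looks_like_conflict_marker() is a
hypothetical name, and the marker width of 7 is an assumption, since
the text above only says "several".

```c
#include <string.h>

#define MARKER_LEN 7  /* assumed marker width; the text only says "several" */

/* Return 1 if line starts with MARKER_LEN copies of '<', '=' or '>'. */
int looks_like_conflict_marker(const char *line)
{
	int i;

	/* test for NUL first: strchr() would match the terminator */
	if (!line[0] || !strchr("<=>", line[0]))
		return 0;
	for (i = 1; i < MARKER_LEN; i++)
		if (line[i] != line[0])
			return 0;
	return 1;
}
```

Only markers at the very beginning of a line count, so an indented
run of the same characters is not flagged.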
We said "fix up by hand" after failed automerge, which was a big
"Huh? Now what?". Be a bit more explicit without being too
verbose. Suggested by Carl Worth.
I personally prefer "ignore_merges" to be on by default, because quite
often the merge diff is distracting and not interesting. That's true both
with "-p" and with "--stat" output.
If you want output from merges, you can trivially use the "-m", "-c" or
"--cc" flags to tell that you're interested in merges, which also tells
the diff generator what kind of diff to do (for --stat, any of the three
will do, of course, but they differ for plain patches or for
--patch-with-stat).
This trivial patch just removes the two lines that tell "git log" not to
ignore merges. It will still show the commit log message, of course, due
to the "always_show_header" part.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Allow "git repack" users to specify repacking window/depth
.. but don't even bother documenting it. I don't think any normal person
is supposed to ever really care, but it simplifies testing when you want
to use the "git repack" wrapper rather than forcing you to use the core
programs (which already do support the window/depth arguments, of course).
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a fairly straightforward patch to allow "get_sha1()" to also have
shorthands for tree and blob objects.
The syntax is very simple and intuitive: you can specify a tree or a blob
by simply specifying <revision>:<path>, and get_sha1() will do the SHA1
lookup from the tree for you.
You can currently do it with "git ls-tree <rev> <path>" and parsing the
output, but that's actually pretty awkward.
With this, you can do something like
	git cat-file blob v1.2.4:Makefile
to get the contents of "Makefile" at revision v1.2.4.
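
As a rough sketch of the first step only, splitting such an argument
at the colon could look like this. split_rev_path() is a hypothetical
helper, not part of get_sha1(); resolving the revision and looking
the path up in its tree are not shown.

```c
#include <string.h>

/*
 * Split "spec" at its first ':' into a revision and a path.
 * Returns 0 on success, -1 if there is no ':' or a buffer is too small.
 */
int split_rev_path(const char *spec, char *rev, size_t revsz,
		   char *path, size_t pathsz)
{
	const char *colon = strchr(spec, ':');
	size_t revlen;

	if (!colon)
		return -1;
	revlen = (size_t)(colon - spec);
	if (revlen >= revsz || strlen(colon + 1) >= pathsz)
		return -1;
	memcpy(rev, spec, revlen);
	rev[revlen] = '\0';
	strcpy(path, colon + 1);  /* fits: length was checked above */
	return 0;
}
```

For "v1.2.4:Makefile" this gives the revision "v1.2.4" and the path
"Makefile"; a plain "v1.2.4" has no colon and is rejected here.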
Now, this isn't necessarily something you really need all that often, but
the concept itself is actually pretty powerful. We could, for example,
allow things like
to see the difference between two arbitrary files in two arbitrary
revisions. To do that, the only thing we'd have to do is to make
git-diff-tree accept two blobs to diff, in addition to the two trees it
now expects.
When I unified the revision argument parsing, I introduced a simple bug
wrt tags that had been marked uninteresting. When it was preparing for the
revision walk, it would mark all the parent commits of an uninteresting
tag correctly uninteresting, but it would forget about the commit itself.
This means that when I just did my 2.6.17-rc2 release, and my scripts
generated the log for "v2.6.17-rc1..v2.6.17-rc2", everything was fine,
except the commit pointed to by 2.6.17-rc1 (which shouldn't have been
there) was included. Even though it should obviously have been marked as
being uninteresting.
Not a huge deal, and the fix is trivial.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The set_reuse_addr() error case was the only error case in
socklist() where we returned rather than continued. Not sure
why. Either we must free the socklist, or continue. This patch
continues on error.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* lt/logopt:
Fix "git log --stat": make sure to set recursive with --stat.
combine-diff: show diffstat with the first parent.
git.c: LOGSIZE is unused after log printing cleanup.
Log message printout cleanups (#3): fix --pretty=oneline
Log message printout cleanups (#2)
Log message printout cleanups
rev-list --header: output format fix
Fixes for option parsing
log/whatchanged/show - log formatting cleanup.
Simplify common default options setup for built-in log family.
Tentative built-in "git show"
Built-in git-whatchanged.
rev-list option parser fix.
Split init_revisions() out of setup_revisions()
Fix up rev-list option parsing.
Fix up default abbrev in setup_revisions() argument parser.
Common option parsing for "git log --diff" and friends
* master:
packed_object_info_detail(): check for corrupt packfile.
cleanups: remove unused variable from exec_cmd.c
cleanups: prevent leak of two strduped strings in config.c
cleanups: Remove impossible case in quote.c
cleanups: Remove unused vars from combine-diff.c
cleanups: Fix potential bugs in connect.c
Allow empty lines in info/grafts
combine-diff: show diffstat with the first parent.
Asking for stat (either with --stat or --patch-with-stat) gives
you diffstat for the first parent, even under combine-diff.
While the combined patch is useful to highlight the complexity
and interaction of the parts touched by all branches when
reviewing a merge commit, diffstat is a tool to assess the
extent of damage the merge brings in, and showing stat with the
first parent is more sensible than clever per-parent diffstat.
This option is very special, since pretty_print_commit() will _remove_
the newline at the end of it, so we want to have an extra separator
between the things.
I added a honking big comment this time, so that (a) I don't forget this
_again_ (I broke "oneline" several times during this printout cleanup),
and so that people can understand _why_ the code does what it does.
Now, arguably the alternate fix is to always have the '\n' at the end in
pretty-print-commit, but git-rev-list depends on the current behaviour
(but we could have git-rev-list remove it, whatever).
With the big comment, the code hopefully doesn't get broken again. And now
things like
	git log --pretty=oneline --cc --patch-with-stat
works (even if that is admittedly a totally insane combination: if you
want the patch, having the "oneline" log format is just crazy, but hey,
it _works_. Even insane people are people).
Here's a further patch on top of the previous one with cosmetic
improvements (no "real" code changes, just trivial updates):
- it gets the "---" before a diffstat right, including for the combined
merge case. Right now the logic is that we always use "---" when we have
a diffstat, and an empty line otherwise. That's how I visually prefer
it, but hey, it can be tweaked later.
- I made "diff --cc/combined" add the "---/+++" header lines too. The
thing won't be mistaken for a valid diff, since the "@@" lines have too
many "@" characters (three or more), but it just makes it visually
match a real diff, which at least to me makes a big difference in
readability. Without them, it just looks very "wrong".
I guess I should have taken the filename from each individual entry
(and had one "---" file per parent), but I didn't even bother to try to
see how that works, so this was the simple thing.
With this, doing a
	git log --cc --patch-with-stat
looks quite readable, I think. The only nagging issue - as far as I'm
concerned - is that diffstats for merges are pretty questionable the way
they are done now. I suspect it would be better to just have the _first_
diffstat, and always make the merge diffstat be the one for "result
against first parent".
On Sun, 16 Apr 2006, Junio C Hamano wrote:
>
> In the mid-term, I am hoping we can drop the generate_header()
> callchain _and_ the custom code that formats commit log in-core,
> found in cmd_log_wc().
Ok, this was nastier than expected, just because the dependencies between
the different log-printing stuff were absolutely _everywhere_, but here's
a patch that does exactly that.
The patch is not very easy to read, and the "--patch-with-stat" thing is
still broken (it does not call the "show_log()" thing properly for
merges). That's not a new bug. In the new world order it _should_ do
something like
	if (rev->logopt)
		show_log(rev, rev->logopt, "---\n");
but it doesn't. I haven't looked at the --with-stat logic, so I left it
alone.
That said, this patch removes more lines than it adds, and in particular,
the "cmd_log_wc()" loop is now a very clean:
so it doesn't get much prettier than this. All the complexity is entirely
hidden in log-tree.c, and any code that needs to flush the log literally
just needs to do the "if (rev->logopt) show_log(...)" incantation.
I had to make the combined_diff() logic take a "struct rev_info" instead
of just a "struct diff_options", but that part is pretty clean.
This does change "git whatchanged" from using "diff-tree" as the commit
descriptor to "commit", and I changed one of the tests to reflect that new
reality. Otherwise everything still passes, and my other tests look fine
too.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The switch is inside an if statement which is false if
the character is ' '. Either the if should be <=' '
instead of <' ', or the case should be removed as it could
be misleading.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The strncmp for "ACK " compared only 3 characters, so it did not
include the final space. Presumably we should either remove the
trailing space, or compare 4 chars (as this patch does).
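
The difference between the two comparison lengths can be seen in a
small sketch. is_ack_3() and is_ack_4() are hypothetical helpers
written for illustration, not the connect.c code.

```c
#include <string.h>

/* Compares only "ACK": the trailing space in the literal is never looked at. */
int is_ack_3(const char *line)
{
	return !strncmp(line, "ACK ", 3);
}

/* Compares all four characters, so the space must be present. */
int is_ack_4(const char *line)
{
	return !strncmp(line, "ACK ", 4);
}
```

A line such as "ACKxyz" passes the 3-character check but fails the
4-character one, while "ACK 1234" passes both.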
'path' is sometimes strdup'ed, but never freed.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/boundary:
rev-list --boundary: show boundary commits even when limited otherwise.
Makefile fixups.
gitk: Fix bug caused by missing commitlisted elements
rev-list --boundary: show boundary commits even when limited otherwise.
The boundary commits are shown for UI like gitk to draw them as
soon as topo-order sorting allows, and should not be omitted by
get_revision() filtering logic. As long as their immediate
child commits are shown, we should not filter them out.
gitk: Fix bug caused by missing commitlisted elements
This bug was reported by Yann Dirson, and results in an 'Error:
expected boolean value but got ""' dialog when scrolling to the bottom
of the graph under some circumstances. The issue is that git-rev-list
isn't outputting all the boundary commits when it is asked for commits
affecting only certain files. We already cope with that by adding the
missing boundary commits in addextraid, but there we weren't adding a
0 to the end of the commitlisted list when we added the extra id to
the end of the displayorder list.
This fixes it by appending 0 to commitlisted in addextraid, thus keeping
commitlisted and displayorder in sync.