Improved XML parsing - replace specialized doc parser callbacks with generic
functions that track the parser context and use document-specific callbacks
to process that data.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Rename object request functions and data to make it more clear which type
of request is being processed - this is a response to the introduction of
slot callbacks and the definition of different types of requests such as
alternates_request.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Move shared HTTP request functionality out of http-fetch and http-push,
and replace the two fwrite_buffer/fwrite_buffer_dynamic functions with
one fwrite_buffer function that does dynamic buffering. Use slot
callbacks to process responses to fetch object transfer requests and
push transfer requests, and put all of http-push into an #ifdef check
for curl multi support.
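A minimal sketch of what a single dynamically buffering write callback
could look like, written in Python for brevity. Only the name
fwrite_buffer comes from this message; the Buffer class and the
(chunk, size, nmemb, buffer) argument layout are assumptions modelled
on the shape of a curl write callback, not the actual C code.

```python
class Buffer:
    """Growable byte buffer (hypothetical stand-in for a C struct)."""

    def __init__(self):
        self.data = bytearray()


def fwrite_buffer(chunk, size, nmemb, buffer):
    """Append size * nmemb bytes of chunk and report how many were consumed."""
    length = size * nmemb
    # bytearray grows on demand, so no fixed-size variant is needed.
    buffer.data.extend(chunk[:length])
    return length
```

Because the buffer grows as data arrives, the same callback can serve
both short and arbitrarily long responses.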
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Johannes suggested this earlier but I did not take it so
seriously because this command is not that important. But this
probably matters on Cygwin which does not seem to come with
precompiled dc. It is a mystery to me that anything that
mimics UNIX does not offer a dc, though.
I did the detection for the lack of dc command a bit differently
from the version Johannes did.
Rewrite rebase to use git-format-patch piped to git-am.
The current rebase implementation finds commits in our tree but
not in the upstream tree using git-cherry, and tries to apply
them using git-cherry-pick (i.e. always use 3-way) one by one.
Which is fine, but when some of the changes do not apply
cleanly, it punts, and punts badly.
Suppose you have commits A-B-C-D-E since you forked from the
upstream and submitted the changes for inclusion. You fetch
from upstream head U and find that B has been picked up. You
run git-rebase to update your branch, which tries to apply
changes contained in A-C-D-E, in this order, but replaying of C
fails, because the upstream got changes that touch the same area
from elsewhere.
Now what?
It notes that fact, and goes ahead to apply D and E, and at the
very end tells you to deal with C by hand. Even if you somehow
managed to replay C on top of the result, you would now end up
with ...-B-...-U-A-D-E-C.
Breaking the order between B and others was the conscious
decision made by the upstream, so we would not worry about it,
and even if it were worrisome, it is too late for us to fix now.
What D and E do may well depend on having C applied before them,
which is a problem for us.
This rewrites rebase to use git-format-patch piped to git-am,
and when the patch does not apply, have git-am fall back on
3-way merge. The updated diff/patch pair knows how to apply
trivial binary patches as long as the pre- and post-images are
locally available, so this should work on a repository with
binary files as well.
The primary benefit of this change is that it makes rebase
easier to use when some of the changes do not replay cleanly.
In the "unapplicable patch in the middle" case, this "rebase"
works like this:
 - A series of patches in e-mail form is created that records
   what A-C-D-E do, and is fed to git-am. This is stored in
   the .dotest/ directory, just as if you had tried to apply
   them from your mailbox. Your branch is rewound to the tip of
   upstream U, and the original head is kept in .git/ORIG_HEAD,
   so you could "git reset --hard ORIG_HEAD" in case the end
   result is really messy.
 - Patch A applies cleanly. This could either be a clean patch
   application on top of the rewound head (i.e. same as upstream
   head), or git-am might have internally fallen back on 3-way
   (i.e. it would have done the same thing as git-cherry-pick).
   In either case, a rebased commit A is made on top of U.
 - Patch C does not apply. git-am stops here, with conflicts to
   be resolved in the working tree. Yet-to-be-applied D and E
   are still kept in the .dotest/ directory at this point. What the
   user does is exactly the same as fixing up an inapplicable patch
   when running git-am:
   - Resolve the conflict just like any merge conflict.
   - "git am --resolved --3way" to continue applying the patches.
 - This applies the fixed-up patch so by definition it had
   better apply. "git am" knows the patch after the fixed-up
   one is D and then E; it applies them, and you will get the
   changes from A-C-D-E commits on top of U, in this order.
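The difference in ordering can be illustrated with a toy simulation.
This is only a sketch: old_rebase, new_rebase and the applies_cleanly
predicate are hypothetical names, and no real patch application takes
place.

```python
def old_rebase(patches, applies_cleanly):
    """Old behaviour: skip failures, keep going, leave them for the end."""
    applied, failed = [], []
    for patch in patches:
        if applies_cleanly(patch):
            applied.append(patch)
        else:
            failed.append(patch)
    return applied, failed


def new_rebase(patches, applies_cleanly):
    """New behaviour: stop at the first failure, keep the rest queued."""
    applied = []
    for index, patch in enumerate(patches):
        if not applies_cleanly(patch):
            # The caller resolves `patch` by hand, then resumes with the queue.
            return applied, patch, patches[index + 1:]
        applied.append(patch)
    return applied, None, []


series = ["A", "C", "D", "E"]
clean = lambda patch: patch != "C"   # pretend only C conflicts

print(old_rebase(series, clean))     # (['A', 'D', 'E'], ['C'])
print(new_rebase(series, clean))     # (['A'], 'C', ['D', 'E'])
```

In the old scheme C ends up after D and E; in the new one D and E wait
until C has been dealt with, so the original order is preserved.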
I've been using this without noticing any problem, and as people
may know I do a lot of rebases.
A new usage, 'git-branch -f branch [start]', resets the branch head at
start (or current head). This should be considered a dangerous operation,
but it is handy if, like me, you keep rewinding branches.
When fetching/pulling from a remote repository the "--tags" option
can be used to pull tags too. Document that it will limit the pull
to only commits reachable from the tags.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When HPA added Cygwin target, it ran just fine without NO_MMAP for him,
but recently we are getting reports that for some people things break
without it. For now, just suggest it in the Makefile without actually
updating the default.
The excessively verbose output of git fetch makes the result totally
unreadable. It's impossible to tell if it actually fetched anything new or
not, since the screen will fill up with an endless supply of
    ...
    * committish: 9165ec17fde255a1770886189359897dbb541012
      tag 'v0.99.7c' of master.kernel.org:/pub/scm/git/git
    * refs/tags/v0.99.7c: same as tag 'v0.99.7c' of master.kernel.org:/pub/scm/git/git
    ...
and any new tags that got fetched will be totally hidden.
So add a new "--verbose" flag to "git fetch" to enable this verbose mode,
but make the default be quiet.
NOTE! The quiet mode will still report about new or changed heads, so if
you are really fetching a new head, you'll see something like this:
    [torvalds@g5 git]$ git fetch --tags parent
    Packing 6 objects
    Unpacking 6 objects
     100% (6/6) done
    * refs/tags/v1.0rc2: storing tag 'v1.0rc2' of master.kernel.org:/pub/scm/git/git
    * refs/tags/v1.0rc3: storing tag 'v1.0rc3' of master.kernel.org:/pub/scm/git/git
    * refs/tags/v1.0rc1: storing tag 'v1.0rc1' of master.kernel.org:/pub/scm/git/git
which actually tells you something useful that isn't hidden by all the
useless crud that you already had.
Extensively tested (hey, for me, this _is_ extensive) by doing a
rm .git/refs/tags/v1.0rc*
and re-fetching with both --verbose and without.
NOTE! This means that if the fetch didn't actually fetch anything at all,
git fetch will be totally quiet. I think that's much better than being so
verbose that you can't even tell whether something was fetched or not, but
some people might prefer to get a "nothing to fetch" message in that case.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Now that git-apply can grok binary replacement patches, give a
--binary flag to git-am. As a safety measure, this is not enabled
by default, so that you do not let a malicious e-mailed patch
replace an arbitrary path with just a couple of lines (diff
index lines, the filename and the string "Binary files "...) by
accident.
This allows people to use syntax like "last thursday" for the approxidate.
(Or, indeed, more complex "three thursdays ago", but I suspect that would
be pretty unusual).
NOTE! The parsing is strictly sequential, so if you do
"one day before last thursday"
it will _not_ do what you think it does. It will take the current time,
subtract one day, and then go back to the thursday before that. So to get
what you want, you'd have to write it the other way around:
"last thursday and one day before"
which is insane (it's usually the same as "last wednesday" _except_ if
today is Thursday, in which case "last wednesday" is yesterday, and "last
thursday and one day before" is eight days ago).
Similarly,
"last thursday one month ago"
will first go back to last thursday, and then go back one month from
there, not the other way around.
I doubt anybody would ever use insane dates like that, but I thought I'd
point out that the approxidate parsing is not exactly "standard English".
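The ordering effect can be reproduced with plain date arithmetic. This
is a sketch, not the approxidate code: last_weekday and one_day_before
are hypothetical helpers, and the value of "today" is an arbitrary
Thursday picked for illustration.

```python
from datetime import date, timedelta

WEDNESDAY, THURSDAY = 2, 3   # date.weekday() numbers Monday as 0


def last_weekday(day, weekday):
    """Most recent `weekday` strictly before `day`."""
    delta = (day.weekday() - weekday) % 7 or 7
    return day - timedelta(days=delta)


def one_day_before(day):
    return day - timedelta(days=1)


today = date(2005, 11, 17)   # assumed "today"; this date is a Thursday

# "one day before last thursday": subtract a day first, then find Thursday.
sequential = last_weekday(one_day_before(today), THURSDAY)

# "last thursday and one day before": find Thursday first, then subtract.
intended = one_day_before(last_weekday(today, THURSDAY))

print(sequential)                        # 2005-11-10
print(intended)                          # 2005-11-09
print(last_weekday(today, WEDNESDAY))    # 2005-11-16
```

With today being a Thursday, "last thursday and one day before" lands
eight days back while "last wednesday" is yesterday, matching the
exception described above.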
Side note 2: if you want to avoid spaces (because of quoting issues), you
can use any non-alphanumeric character instead. So
git log --since=2.days.ago
works without any quotes.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
archimport: allow for old style branch and public tag names
This patch adds the -o switch, which lets old trees tracked by
git-archmirror continue working with their old branch and tag names
to make life easier for people tracking your tree.
Private tags that are only used internally by git-archimport continue to be
new-style, and automatically converted upon first run.
[ ml: rebased to skip import overhaul ]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
git's rev-parse.c function show_datestring presumes gnu date
Ok. This is the insane patch to do this.
It really isn't very careful, and the reason I call it "approxidate()"
will become obvious when you look at the code. It is very liberal in what
it accepts, to the point where sometimes the results may not make a whole
lot of sense.
It accepts "last week" as a date string, by virtue of "last" parsing as
the number 1, and it totally ignoring superfluous fluff like "ago", so
"last week" ends up being exactly the same thing as "1 week ago". Fine so
far.
It has strange side effects: "last december" will actually parse as "Dec
1", which actually _does_ turn out right, because it will then notice that
it's not December yet, so it will decide that you must be talking about a
date last year. So it actually gets it right, but it's kind of for the
"wrong" reasons.
It also accepts the numbers 1..10 in string format ("one" .. "ten"), so
you can do "ten weeks ago" or "ten hours ago" and it will do the right
thing.
But it will do some really strange things too: the string "this will last
forever", will not recognize anything but "last", which is recognized as
"1", which since it doesn't understand anything else it will think is the
day of the month. So if you do
gitk --since="this will last forever"
the date will actually parse as the first day of the current month.
And it will parse the string "now" as "now", but only because it doesn't
understand it at all, and it makes everything relative to "now".
Similarly, it doesn't actually parse the "ago" or "from now", so "2 weeks
ago" is exactly the same as "2 weeks from now". It's the current date
minus 14 days.
But hey, it's probably better (and certainly faster) than depending on GNU
date. So now you can portably do things like
    gitk --since="two weeks and three days ago"
    git log --since="July 5"
    git-whatchanged --since="10 hours ago"
    git log --since="last october"
and it will actually do exactly what you thought it would do (I think). It
will count 17 days backwards, and it will do so even if you don't have GNU
date installed.
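To make the "number, then unit, ignore the fluff" idea concrete, here
is a deliberately tiny toy model. It is not the real approxidate():
toy_approxidate, NUMBER_WORDS and UNIT_SECONDS are hypothetical names,
and month names, weekdays and the day-of-month fallback described
above are left out.

```python
import re
from datetime import datetime, timedelta

# "last" counts as 1, as described above; "one" .. "ten" are spelled out.
NUMBER_WORDS = {"last": 1, "one": 1, "two": 2, "three": 3, "four": 4,
                "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
                "ten": 10}
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600,
                "day": 86400, "week": 604800}


def toy_approxidate(text, now):
    """Walk the tokens left to right, subtracting each "<number> <unit>"."""
    result = now
    pending = None
    # Any non-alphanumeric character separates tokens, so "2.days.ago" works.
    for token in re.findall(r"[a-z]+|\d+", text.lower()):
        if token.isdigit():
            pending = int(token)
        elif token in NUMBER_WORDS:
            pending = NUMBER_WORDS[token]
        else:
            unit = token[:-1] if token.endswith("s") else token
            if unit in UNIT_SECONDS and pending is not None:
                result -= timedelta(seconds=pending * UNIT_SECONDS[unit])
                pending = None
            # Everything else ("ago", "and", "from", "now") is ignored.
    return result


now = datetime(2005, 11, 17, 12, 0, 0)   # arbitrary reference time
print(now - toy_approxidate("two weeks and three days ago", now))  # 17 days, 0:00:00
```

Because "ago" and "from now" are both ignored, the toy gives the same
answer for "2 weeks ago" and "2 weeks from now", just like the
behaviour described in this message.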
(I don't do "last monday" or similar yet, but I can extend it to that too
if people want).
It was kind of fun trying to write code that uses such totally relaxed
"understanding" of dates yet tries to get it right for the trivial cases.
The result should be mixed with a few strange preprocessor tricks, and be
submitted for the IOCCC ;)
Feel free to try it out, and see how many strange dates it gets right. Or
wrong.
And if you find some interesting (and valid - not "interesting" as in
"strange", but "interesting" as in "I'd be interested in actually doing
this") thing it gets wrong - usually by not understanding it and silently
just doing some strange things - please holler.
Now, as usual this certainly hasn't been getting a lot of testing. But my
code always works, no?
A new option, --full-index, is introduced to diff family. This
causes the full object name of pre- and post-images to appear on
the index line of patch formatted output, to be used in
conjunction with --allow-binary-replacement option of git-apply.
A new option, --allow-binary-replacement, is introduced.
When you feed a diff that records full SHA1 name of pre- and
post-image blob on its index line to git-apply with this option,
the post-image blob replaces the path if what you have in the
working tree matches the pre-image _and_ post-image blob is
already available in the object directory.
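The two conditions can be written down as a small check. This is a
sketch rather than the git-apply code: can_replace is a hypothetical
helper and object_store is a plain dict standing in for the object
directory. The blob_sha1 helper uses the usual blob naming (SHA-1 over
a "blob <size>" header, a NUL byte, and the contents).

```python
import hashlib


def blob_sha1(data):
    """Object name of a blob holding `data`."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def can_replace(worktree_data, pre_sha1, post_sha1, object_store):
    """True only if the working tree matches the pre-image _and_ the
    post-image blob is already available locally."""
    matches_pre_image = blob_sha1(worktree_data) == pre_sha1
    have_post_image = post_sha1 in object_store
    return matches_pre_image and have_post_image


old, new = b"old binary contents", b"new binary contents"
store = {blob_sha1(new): new}    # the post-image is locally available

print(can_replace(old, blob_sha1(old), blob_sha1(new), store))        # True
print(can_replace(b"edited", blob_sha1(old), blob_sha1(new), store))  # False
print(can_replace(old, blob_sha1(old), blob_sha1(new), {}))           # False
```

Both names on the index line have to be complete for this to work,
since an abbreviated name cannot be compared against a computed hash
this way.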
Later we _might_ want to enhance the diff output to also include
the full binary data of the post-image, to make this more
useful, but this is good enough for local rebasing application.
After failed patch application, you can manually apply the patch
(this includes resolving the conflicted merge after git-am falls
back to 3-way merge) and run git-update-index on necessary paths
to prepare the index file in a shape a successful patch
application should have produced. Then re-running git-am --resolved
would record the resulting index file along with the commit log
information taken from the patch e-mail.
Recently we fixed 'git-apply --stat' not to barf on binary
differences. But it accidentally broke the error detection when
we actually attempt to apply them.
This commit fixes the problem and adds test cases.
For some reason I've done a "git grep" twice with no pattern, which is
really irritating, since it just greps everything. If I actually wanted
that, I could do "git grep ^" or something.
So add a "usage" message if the pattern is empty.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
python 2.2.1 is perfectly capable of executing git-merge-recursive,
provided that it finds heapq and sets. All you have to do is to steal
heapq.py and sets.py from python 2.3 or newer, and drop them in your
GIT_PYTHON_PATH.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch provides a C implementation of the 'git' program and
introduces support for putting the git-* commands in a directory
of their own. It also saves some time on executing those commands
in a tight loop and it prints the currently available git commands
in a nicely formatted list.
The location of the GIT_EXEC_PATH (name discussion's closed, thank gods)
can be obtained by running
git --exec-path
which will hopefully give porcelainistas ample time to adapt their
heavy-duty loops to call the core programs directly and thus save
the extra fork() / execve() overhead, although that's not really
necessary any more.
The --exec-path value is prepended to $PATH, so the git-* programs
should Just Work without ever requiring any changes to how they call
other programs in the suite.
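The PATH handling amounts to something like the following sketch. The
with_exec_path helper is a hypothetical name, and the real program is
written in C; this only shows the prepend step on a copy of the
environment.

```python
import os


def with_exec_path(environ, exec_path):
    """Return a copy of `environ` with `exec_path` put first on PATH."""
    env = dict(environ)
    old_path = env.get("PATH", "")
    # Prepending means the git-* programs in exec_path win over any
    # same-named programs found later on PATH.
    env["PATH"] = exec_path + os.pathsep + old_path if old_path else exec_path
    return env


env = with_exec_path({"PATH": "/usr/bin"}, "/some/experimental/exec-path")
print(env["PATH"].split(os.pathsep)[0])   # /some/experimental/exec-path
```

A child process started with this environment resolves git-* names
from the exec-path directory first, which is why callers need no
changes.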
Some timing values for 10000 invocations of git-var >&/dev/null:
    git.sh:  24.194s
    git.c:    9.044s
    git-var:  7.377s
The git-<tab><tab> behaviour can, along with the someday-to-be-deprecated
git-<command> form of invocation, be indefinitely retained by adding
the following line to one's .bash_profile or equivalent:
PATH=$PATH:$(git --exec-path)
Experimental libraries can be used by either setting the environment variable
GIT_EXEC_PATH, or by using
git --exec-path=/some/experimental/exec-path
Relative paths are properly grok'ed as exec-path values.
Signed-off-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
where gitweb was found to be using a lot of time and memory to
detect renames on huge commits. git-diff family takes -l<num>
flag, and if the number of paths that are rename destination
candidates (i.e. new paths with -M, or modified paths with -C)
are larger than that number, skips rename/copy detection even
when -M or -C is specified on the command line.
This commit makes the rename detection limit easier to use. You
can have:
    [diff]
        renamelimit = 30
in your .git/config file to specify the default rename detection
limit. You can override this from the command line; giving 0
means 'unlimited':
git diff -M -l0
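The decision logic described here fits in a few lines. This is a
sketch with hypothetical names (effective_limit and
rename_detection_enabled), not the diff code itself; it assumes that a
missing configuration means no limit, as the current default does.

```python
def effective_limit(config_value, cli_value):
    """The command line overrides the configuration; 0 means unlimited."""
    if cli_value is not None:
        return cli_value
    return config_value if config_value is not None else 0


def rename_detection_enabled(num_candidates, limit):
    """Skip detection only when a positive limit is exceeded."""
    if limit <= 0:
        return True
    return num_candidates <= limit


limit = effective_limit(config_value=30, cli_value=None)
print(rename_detection_enabled(31, limit))                       # False
print(rename_detection_enabled(31, effective_limit(30, 0)))      # True
```

The second call models "git diff -M -l0": the explicit 0 wins over the
configured 30, so detection runs no matter how many candidates there
are.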
We might want to change the default behaviour, when you do not
have the configuration, to limit it to say 20 paths or so. This
would also help the diffstat generation after a big 'git pull'.
Rework object refs tracking to reduce memory usage
Store pointers to referenced objects in a variable sized array instead
of linked list. This cuts down memory usage of utilities which use
object references; e.g., git-fsck-objects --full on the git.git
repository consumes about 2 MB of memory tracked by Massif instead of
7 MB before the change. Object refs are still the biggest consumer of
memory (57%), but the malloc overhead for a single block instead of a
linked list is substantially smaller.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The Massif tool of Valgrind revealed that parsed tree entries occupy
more than 60% of memory allocated by git-fsck-objects. These entries
can be freed immediately after use, which significantly decreases
memory consumption.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Documentation: do not blindly run 'cat' .git/HEAD, or echo into it.
In many places the documentation still talked about reading
what commit is recorded in .git/HEAD or writing the new head
information into it, both assuming .git/HEAD is a symlink. That
is not necessarily so.
The current http-fetch is rather careless about fd leakage, causing
problems while fetching large repositories. This patch does not claim
to be exhaustive, but I covered everything I spotted. I also left some
safeguards in place in case I missed something, so that we get to know,
sooner or later.
Reported by Becky Bruce <becky.bruce@freescale.com>.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch renames the tarball "git" rather than "git-core", and changes
the names of various packages from git-core-foo to git-foo. git-core is
still the true core package; an empty RPM package named "git" pulls in
ALL the git packages -- this makes updates work correctly, and allows
"yum install git" to do the obvious thing.
It also renames the git-(core-)tk package to gitk.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The -r flag means "rev-list order", i.e. just display the commits
in the order they come from git-rev-list.
The speedups include:
- don't process the whole commit line-by-line, only the header
- don't convert dates when reading the commits, rather do it when
needed
- don't do the $canv delete lines.$id in drawlines when drawing the
graph initially (it was taking a lot of the total time)
- cache the date conversion for each hour (more important with tk8.5,
since [clock format] is a lot slower in 8.5 than in 8.4).
This fixes git-rev-list so that when there are multiple branches, we still
sort the heads in proper approximate date order even when sorting the
output topologically.
This makes things like
gitk --all -d
work sanely and show the branches in date order (where "date order" is
obviously modified by the parent-child dependency requirements of the
topological sort).
The trivial fix is to just build the "work" list in date order rather than
inserting the new work entries at the beginning.
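In other words, something along these lines. This is a sketch: the
work list is modelled as a Python list of (date, name) tuples and
insert_by_date is a hypothetical helper, not the rev-list code.

```python
def insert_by_date(work, commit):
    """Insert a (date, name) entry so that `work` stays newest-first."""
    date = commit[0]
    position = 0
    # Walk past every entry that is at least as recent as the new one.
    while position < len(work) and work[position][0] >= date:
        position += 1
    work.insert(position, commit)


heads = [(10, "old-branch"), (30, "newest-branch"), (20, "middle-branch")]

by_date = []
for head in heads:
    insert_by_date(by_date, head)

prepended = []
for head in heads:
    prepended.insert(0, head)    # the previous behaviour

print([name for _, name in by_date])    # ['newest-branch', 'middle-branch', 'old-branch']
print([name for _, name in prepended])  # ['middle-branch', 'newest-branch', 'old-branch']
```

Prepending leaves the list in reverse arrival order, which is only
date order by accident; inserting by date makes it hold for any input
order.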
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Because we use "lost-found" as the directory name to hold
dangling object names, it is confusing to call the command
git-lost+found, although it makes sense and is even cute ;-).
Fix for multiple alternates requests in http-fetch
Stop additional alternates requests from starting if one is already in
progress. This adds an optional callback which is processed after a slot
has finished running.
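The guard and the post-run callback could be pictured like this. All
names here (Slot, AlternatesFetcher, request_alternates) are
hypothetical and the model is single-threaded; it is a sketch of the
idea, not the http-fetch code.

```python
class Slot:
    """A request slot with an optional callback run after it finishes."""

    def __init__(self, callback=None):
        self.callback = callback

    def finish(self):
        if self.callback is not None:
            self.callback()


class AlternatesFetcher:
    def __init__(self):
        self.in_progress = False
        self.started = 0

    def request_alternates(self):
        """Start a request unless one is already running."""
        if self.in_progress:
            return None
        self.in_progress = True
        self.started += 1
        # The callback clears the flag once the slot has finished running.
        return Slot(callback=self._done)

    def _done(self):
        self.in_progress = False


fetcher = AlternatesFetcher()
slot = fetcher.request_alternates()
print(fetcher.request_alternates())   # None: one request is already in progress
slot.finish()
print(fetcher.started)                # 1
```

Once the first slot finishes, the flag is cleared and a later call may
start a new request.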
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Just to avoid confusion that scripts poorly written by somebody
else ;-) might mistake this as a mount point, or backup tools
ignoring the directory. The latter is probably not a big loss,
however, considering that this directory's contents are to be
used while fresh anyway.
This patch renames git-pack-intersect to git-pack-redundant
as suggested by Petr Baudis. The new name reflects what the
program does, rather than how it does it.
Also fix a small argument parsing bug.
Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>