Bunch of cleanups with a few notable enhancements since
1.3.0-rc1:
- revision traversal infrastructure is updated so that
existence of path limiters and/or --max-age does not cause
it to call limit_list(). This helps the latency working with
the command quite a bit.
- comes with updated gitk.
One notable fix is to make sure that the IO is restarted upon
signal even on platforms whose default signal semantics is not
to do so. This is the fix for the notorious "clone is broken
since 1.2.2 on Solaris" problem.
git-diff-* --pickaxe-regex will change the -S pickaxe to match
POSIX extended regular expressions instead of fixed strings.
The regex.h library is a rather stupid interface and I like pcre too, but
with any luck it will be everywhere we will want to run Git on, it being
POSIX.2 and all. I'm not sure if we can expect platforms like AIX to
conform to POSIX.2 or if win32 has regex.h. We might add a flag to
Makefile if there is a portability trouble potential.
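To make the difference between the two pickaxe modes concrete, here is a
minimal sketch in C. It is not the diffcore code; contains_fixed() and
contains_regex() are hypothetical names, and the only library calls used are
strstr() and the POSIX regcomp()/regexec()/regfree() from <regex.h>.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Fixed-string search: the needle is taken literally. */
static int contains_fixed(const char *text, const char *needle)
{
    assert(text && needle);
    return strstr(text, needle) != NULL;
}

/*
 * POSIX extended regular expression search.
 * Returns 1 on a match, 0 on no match, -1 if the pattern does not compile.
 */
static int contains_regex(const char *text, const char *pattern)
{
    regex_t re;
    int status;

    assert(text && pattern);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB | REG_NEWLINE) != 0)
        return -1;
    status = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return status == 0;
}
```

With a pattern such as "fro[a-z]+\(" the fixed-string search finds nothing in
"x = frotz(1);" while the regex search matches it.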
* lt/fix-sol-pack:
Use sigaction and SA_RESTART in read-tree.c; add option in Makefile.
safe_fgets() - even more anal fgets()
pack-objects: be incredibly anal about stdio semantics
Fix Solaris stdio signal handling stupidities
Use blob_, commit_, tag_, and tree_type throughout.
This replaces occurrences of "blob", "commit", "tag", and "tree",
where they're really used as type specifiers, which we already
have defined global constants for.
Signed-off-by: Peter Eriksen <s022018@student.dtu.dk> Signed-off-by: Junio C Hamano <junkio@cox.net>
Use sigaction and SA_RESTART in read-tree.c; add option in Makefile.
Might as well ape the sigaction change in read-tree.c to avoid
the same potential problems. The fprintf status output will
be overwritten in a second, so don't bother guarding it. Do
move the fputc after disabling SIGALRM to ensure we go to the
next line, though.
Also add a NO_SA_RESTART option in the Makefile in case someone
doesn't have SA_RESTART but does restart (maybe older HP/UX?).
We want the builder to choose this specifically in case the
system both lacks SA_RESTART and does not restart stdio calls;
a compat #define in git-compat-util.h would silently allow
broken systems.
Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net>
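As an illustration of the sigaction change described above, the following is a
minimal sketch and not the code from read-tree.c. The helper name
install_restarting_handler() and the handler on_alarm() are hypothetical, and
mapping NO_SA_RESTART to a zero flag is only one assumed way of wiring up the
Makefile option.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>

/*
 * Assumption: the builder defines NO_SA_RESTART only on a platform that
 * lacks the flag but restarts interrupted calls anyway.
 */
#if defined(NO_SA_RESTART) && !defined(SA_RESTART)
#define SA_RESTART 0
#endif

static volatile sig_atomic_t progress_update;

/* The handler only sets a flag; the main loop does the printing. */
static void on_alarm(int signo)
{
    (void)signo;
    progress_update = 1;
}

/* Install a handler that asks for interrupted slow calls to be restarted. */
static int install_restarting_handler(int signo, void (*handler)(int))
{
    struct sigaction sa;

    assert(handler != NULL);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}
```

A platform that has neither SA_RESTART nor NO_SA_RESTART set fails to compile
here, which matches the intent of making the builder choose explicitly.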
* jc/clone:
git-clone: fix handling of upstream whose HEAD does not point at master.
fix repacking with lots of tags
Documentation: revise top of git man page
git-clone: fix handling of upstream whose HEAD does not point at master.
When cloning from a remote repository that has master, main, and
origin branches _and_ with the HEAD pointing at main branch, we
did quite confused things during clone. So this cleans things
up. The behaviour is a bit different between separate remotes/
layout and the mixed branches layout.
In the newer layout with $GIT_DIR/refs/remotes/$origin/, things are
simpler and more transparent:
- remote branches are copied to refs/remotes/$origin/.
- HEAD points at the branch with the same name as the remote
HEAD points at, and starts at where the remote HEAD points at.
- $GIT_DIR/remotes/$origin file is set up to fetch all remote
branches, and merge the branch HEAD pointed at at the time of
the cloning.
Everything-in-refs/heads layout was the more confused one, but
cleaned up like this:
- remote branches are copied to refs/heads, but the branch
"$origin" is not copied, instead a copy of the branch the
remote HEAD points at is created there.
- HEAD points at the branch with the same name as the remote
HEAD points at, and starts at where the remote HEAD points at.
- $GIT_DIR/remotes/$origin file is set up to fetch all remote
branches except "$origin", and merge the branch HEAD pointed
at at the time of the cloning.
With this, if the remote has master, main and origin, and its HEAD
points at main, you could:
git clone $URL --origin upstream
to use refs/heads/upstream as the tracking branch for remote
"main", and your primary working branch will also be "main".
"master" and "origin" are used to track the corresponding remote
branches and with this setup they do not have any special meaning.
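The copying rules of the two layouts above can be summarized in a small C
sketch. This is not taken from git-clone (which is a shell script);
local_ref_for() is a hypothetical helper that only encodes the rules as they
are stated in this message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Where does the remote branch `branch` get copied to?
 * Returns 1 and fills `out`, or 0 when the branch is not copied.
 */
static int local_ref_for(int separate_remotes, const char *origin,
                         const char *branch, char *out, size_t outlen)
{
    assert(origin && branch && out && outlen > 0);
    if (separate_remotes) {
        /* separate remotes/ layout: every branch is copied */
        snprintf(out, outlen, "refs/remotes/%s/%s", origin, branch);
        return 1;
    }
    /* mixed layout: refs/heads/$origin is kept for the copy of remote HEAD */
    if (!strcmp(branch, origin))
        return 0;
    snprintf(out, outlen, "refs/heads/%s", branch);
    return 1;
}
```

With --origin upstream in the mixed layout, the remote branch "origin" maps to
refs/heads/origin like any other branch, because only a branch named
"upstream" would collide.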
I'm afraid I'll be accused of trying to suck all the jokes and the
personality out of the git documentation. I'm not! Really!
That said, "man git" is one of the first things a new user is likely to try,
and it seems a little cruel to start off with a somewhat obscure joke
about the architecture of git.
So instead I'm trying for a relatively straightforward description of what
git does, and what features distinguish it from other systems, together
with immediate links to introductory documentation.
I also did some minor reorganization in an attempt to clarify the
classification of commands. And revised a bit for conciseness (as is
obvious from the diffstat--hopefully I didn't cut anything important).
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Junio C Hamano <junkio@cox.net>
pack-objects: be incredibly anal about stdio semantics
This is the "letter of the law" version of using fgets() properly in the
face of incredibly broken stdio implementations. We can work around the
Solaris breakage with SA_RESTART, but in case anybody else is ever that
stupid, here's the "safe" (read: "insanely anal") way to use fgets.
It probably goes without saying that I'm not terribly impressed by
Solaris libc.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
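For readers who want to see what a "letter of the law" fgets() loop can look
like, here is a minimal sketch. It is not the safe_fgets() from
pack-objects.c; careful_fgets() is a hypothetical name, and unlike the real
code it returns NULL on a hard error instead of dying.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Returns buf on success and NULL on end of file or on a real error
 * (the caller can tell them apart with feof() and ferror()).
 * A NULL caused by an interrupted read (EINTR) is retried.
 */
static char *careful_fgets(char *buf, int size, FILE *stream)
{
    assert(buf && size > 0 && stream);
    for (;;) {
        errno = 0;
        if (fgets(buf, size, stream))
            return buf;
        if (feof(stream))
            return NULL;        /* genuine end of input */
        if (!ferror(stream) || errno != EINTR)
            return NULL;        /* real failure; let the caller decide */
        clearerr(stream);       /* interrupted: reset the error and retry */
    }
}
```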
contrib/git-svn/git-svn.txt:
added git-repo-config key names for options
fixed quoting of "git-svn-HEAD" in the manpage
use preformatted text for examples
contrib/git-svn/Makefile:
add target to generate HTML:
http://git-svn.yhbt.net/git-svn.html
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
revision: Fix --topo-order and --max-age with reachability limiting.
What ends up not working very well at all is the combination of
"--topo-order" and the output filter in get_revision. It will
return NULL when we see the first commit out of date-order, even
if we have other commits coming.
So we really should do the "past the date order" thing in
get_revision() only if we have _not_ done it already in
limit_list().
Something like this.
The easiest way to test this is with just
gitk --since=3.days.ago
on the kernel tree. Without this patch, it tends to be pretty obviously
broken.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes git-rev-list able to do path-limiting without having to parse
all of history before it starts showing the results.
This makes things like "git log -- pathname" much more pleasant to use.
This is actually a pretty small patch, and the biggest part of it is
purely cleanups (turning the "goto next" statements into "continue"), but
it's conceptually a lot bigger than it looks.
What it does is that if you do a path-limited revision list, and you do
_not_ ask for pseudo-parenthood information, it won't do all the
path-limiting up-front, but instead do it incrementally in
"get_revision()".
This is an absolutely huge deal for anything like "git log -- <pathname>",
but also for some things that we don't do yet - like the "find where
things changed" logic I've described elsewhere, where we want to find the
previous revision that changed a file.
The reason I put "RFC" in the subject line is that while I've validated it
various ways, like doing
git-rev-list HEAD -- drivers/char/ | md5sum
before-and-after on the kernel archive, it's "git-rev-list" after all. In
other words, it's that really really subtle and complex central piece of
software. So while I think this is important and should go in asap, I also
think it should get lots of testing and eyeballs looking at the code.
Btw, don't even bother testing this with the git archive. git itself is so
small that parsing the whole revision history for it takes about a second
even with path limiting. The thing that _really_ shows this off is doing
git log drivers/
on the kernel archive, or even better, on the _historic_ kernel archive.
With this change, the response is instantaneous (although seeking to the
end of the result will obviously take as long as it ever did). Before this
change, the command would think about the result for tens of seconds - or
even minutes, in the case of the bigger old kernel archive - before
starting to output the results.
NOTE NOTE NOTE! Using path limiting with things like "gitk", which uses
the "--parents" flag to actually generate a pseudo-history of the
resulting commits won't actually see the improvement in interactivity,
since that forces git-rev-list to do the whole-history thing after all.
MAYBE we can fix that too at some point, but I won't promise anything.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
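The idea of limiting incrementally instead of up-front can be shown with a toy
walker. This is only a sketch under simplifying assumptions: history is a flat
array, "touches_path" stands in for the real tree comparison, and the
toy_commit, toy_walk and toy_get_revision() names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct toy_commit {
    int id;
    int touches_path;   /* stand-in for "changes the limited paths" */
};

struct toy_walk {
    const struct toy_commit *commits;   /* newest first */
    size_t nr, pos;
    size_t examined;    /* how many commits had to be looked at so far */
};

/* Return the next commit that touches the path, or NULL when done. */
static const struct toy_commit *toy_get_revision(struct toy_walk *w)
{
    assert(w && (w->commits || w->nr == 0));
    while (w->pos < w->nr) {
        const struct toy_commit *c = &w->commits[w->pos++];

        w->examined++;
        if (!c->touches_path)
            continue;   /* filtered out on the fly */
        return c;
    }
    return NULL;
}
```

The first result is available after examining a single commit; an up-front
filter would have to walk the whole array before returning anything.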
Move "--parent" parsing into generic revision.c library code
Not only do we do it in both rev-list.c and git.c, the revision walking
code will soon want to know whether we should rewrite parenthood
information or not.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Indeed, we can see that gitk shows these two commits at the
bottom, because the --boundary code failed to output them.
The code did not check to avoid pushing the same uninteresting
commit twice to the result list. I am not sure why this fixes
the reported problem, but this seems to fix it.
* git://git.kernel.org/pub/scm/gitk/gitk:
gitk: Better workaround for arrows on diagonal line segments
gitk: Allow top panes to scroll horizontally with mouse button 2
gitk: Prevent parent link from overwriting commit headline
gitk: Show diffs for boundary commits
gitk: Use the new --boundary flag to git-rev-list
gitk: Better workaround for arrows on diagonal line segments
Instead of adding extra padding to create a vertical line segment at
the lower end of a line that has an arrow, this now just draws a very
short vertical line segment at the lower end. This alternative
workaround for the Tk8.4 behaviour (not drawing arrows on diagonal
line segments) doesn't have the problem of making the graph very wide
when people do a lot of merges in a row (hi Junio :).
Make git-clone to take long double-dashed origin option (--origin)
git-clone currently takes the option '-o' to specify origin. This patch
makes git-clone take the double-dashed option '--origin' and other
abbreviations in addition to the current single-dashed option.
[jc: with minor fixups]
Signed-off-by: Yasushi SHOJI <yashi@atmark-techno.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
gitk: Prevent parent link from overwriting commit headline
When I made drawlineseg responsible for drawing the link to the first
child rather than drawparentlinks, that meant that the right-most X
value computed by drawparentlinks didn't include those first-child
links, and thus the first-child link could go over the top of the
commit headline. This fixes it.
With this we run git-diff-tree on a commit even if we think it has
no parents, either because it really has no parents or because it
is a boundary commit. This means that gitk shows the diff for a
boundary commit when it is selected.
Introduce tree-walk.[ch] and move "struct tree_desc" and
associated functions from various places.
Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and
move it to cache.h. This macro returns the canonicalized
st_mode value in the host byte order for files, symlinks and
directories -- to be compared with a tree_desc entry.
create_ce_mode(mode) in cache.h is similar but is intended to be
used for index entries (so it does not work for directories) and
returns the value in the network byte order.
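The difference between the two helpers can be sketched as follows. These are
not the macros from cache.h; the sketch_ names are hypothetical, and the
sketch assumes that canonicalizing a regular file means collapsing its
permission bits to either 0755 or 0644.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <arpa/inet.h>
#include <sys/stat.h>

/* Regular files keep only "executable or not": 0755 or 0644. */
static unsigned int sketch_permissions(unsigned int mode)
{
    return (mode & 0100) ? 0755 : 0644;
}

/* Host byte order; handles regular files, symlinks and directories. */
static unsigned int sketch_canon_mode(unsigned int mode)
{
    if (S_ISREG(mode))
        return S_IFREG | sketch_permissions(mode);
    if (S_ISLNK(mode))
        return S_IFLNK;
    assert(S_ISDIR(mode));
    return S_IFDIR;
}

/* Network byte order; for index entries only, so no directories. */
static unsigned int sketch_create_ce_mode(unsigned int mode)
{
    assert(!S_ISDIR(mode));
    if (S_ISLNK(mode))
        return htonl(S_IFLNK);
    return htonl(S_IFREG | sketch_permissions(mode));
}
```

Comparing a value from the second helper with a tree entry requires ntohl()
first, which is the kind of mix-up that goes unnoticed on big endian machines.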
When the executable bit is untrustworthy and when we are
comparing the tree with the working tree, we tried to reuse the
mode bits recorded in the index incorrectly (the computation was
bogus on little endian architectures). Just use mode from index
when it is a regular file.
With this, we can show the boundary (open-circle) commits immediately
after their last child, which looks much better than putting all the
boundary commits at the bottom of the graph.
revision arguments: ..B means HEAD..B, just like A.. means A..HEAD
For consistency reasons, we should probably allow that to be written as
just "..branch", the same way we can write "branch.." to mean "everything
in HEAD but not in "branch"".
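The defaulting rule is simple enough to sketch in a few lines of C. This is
not the parser from revision.c; split_range() is a hypothetical helper that
only handles the two-dot form and ignores every other revision syntax.

```c
#include <assert.h>
#include <string.h>

/*
 * Split "A..B" into its two sides; an empty side defaults to "HEAD".
 * Returns 0 on success, -1 if there is no ".." or a side does not fit.
 */
static int split_range(const char *spec, char *from, char *to, size_t len)
{
    const char *dots, *right;
    size_t left;

    assert(spec && from && to && len > 4);
    dots = strstr(spec, "..");
    if (!dots)
        return -1;
    left = (size_t)(dots - spec);
    right = dots + 2;
    if (left >= len || strlen(right) >= len)
        return -1;
    if (left) {
        memcpy(from, spec, left);
        from[left] = '\0';
    } else {
        strcpy(from, "HEAD");       /* "..B" means "HEAD..B" */
    }
    strcpy(to, *right ? right : "HEAD");    /* "A.." means "A..HEAD" */
    return 0;
}
```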
With the new --boundary flag, the output from rev-list includes
the UNINTERESTING commits at the boundary, which are usually not
shown. Their object names are prefixed with '-'.
For example, with this graph:
C side
/
A---B---D master
You would get something like this:
$ git rev-list --boundary --header --parents side..master
D B
tree D^{tree}
parent B
... log message for commit D here ...
\0-B A
tree B^{tree}
parent A
... log message for commit B here ...
\0
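A consumer of this output only has to look at the first character of each
record to recognize a boundary commit. The following is a minimal sketch of
that check, not code from gitk or rev-list; parse_rev_line() is a hypothetical
helper that reads the first line of a record such as "D B" or "-B A".

```c
#include <assert.h>
#include <string.h>

/*
 * Parse "[-]<name> <parents...>". Sets *is_boundary when the name carries
 * the '-' prefix and copies the name without it.
 * Returns 0 on success, -1 on an empty or oversized name.
 */
static int parse_rev_line(const char *line, int *is_boundary,
                          char *name, size_t len)
{
    size_t n;

    assert(line && is_boundary && name && len > 0);
    *is_boundary = (*line == '-');
    if (*is_boundary)
        line++;
    n = strcspn(line, " \n");
    if (n == 0 || n >= len)
        return -1;
    memcpy(name, line, n);
    name[n] = '\0';
    return 0;
}
```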
We do not need to track object refs, nor do we need to save the commit
unless we are doing verbose header. A lot of traversal happens
inside prepare_revision_walk() these days so setting things up before
calling that function is necessary.
Signed-off-by: Junio C Hamano <junkio@cox.net> Acked-by: Linus Torvalds <torvalds@osdl.org>
The speed of the built-in diff generator is nice; but the function names
shown by `diff -p' are /really/ nice. And I hate having to choose. So,
we hack xdiff to find the function names and print them.
xdiff has grown a flag to say whether to dig up the function names. The
builtin_diff function passes this flag unconditionally. I suppose it
could parse GIT_DIFF_OPTS, but it doesn't at the moment. I've also
reintroduced the `function name' into the test suite, from which it was
removed in commit 3ce8f089.
The function names are parsed by a particularly stupid algorithm at the
moment: it just tries to find a line in the `old' file, from before the
start of the hunk, whose first character looks plausible. Still, it's
most definitely a start.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk> Signed-off-by: Junio C Hamano <junkio@cox.net>
For some reason, I need ALL_LDFLAGS in the git target only on
AIX. Once it builds, only one test "fails" on AIX 5.1 with
1.3.0.rc1, t5500-fetch-pack.sh, but it looks like it's some
odd tool problem in the tester + my setup and not a real bug.
Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net>
All of the things that were not in the "master" branch were
either cooked long enough in "next" without causing problems
(e.g. insanely fast rename detector or true built-in diff) or
isolated in a specific subsystem (e.g. tar-tree and svnimport).
So I am clearing the deck to prepare for a 1.3.0. Remaining
wrinkles, if any, will be ironed out in the "master" branch.
* master:
Optionally do not list empty directories in git-ls-files --others
Document git-rebase behavior on conflicts.
Fix error handling for nonexistent names
Optionally do not list empty directories in git-ls-files --others
Without the --directory flag, git-ls-files wouldn't ever list directories,
producing no output for empty directories, which is good since they cannot
be added and they bear no content, even untracked one (if Git ever starts
tracking directories on their own, this should obviously change since the
content notion will change).
With the --directory flag however, git-ls-files would list even empty
directories. This may be good in some situations but sometimes you want to
prevent that. This patch adds a --no-empty-directory option which makes
git-ls-files omit empty directories.
When passing in a pathname pattern without the "--" separator on the
command line, we verify that the pathnames in question exist. However,
there were two bugs in that verification:
- git-rev-parse would only check the first pathname, and silently allow
any invalid subsequent pathname, whether it existed or not (which
defeats the purpose of the check, and is also inconsistent with what
git-rev-list actually does)
- git-rev-list (and "git log" etc) would check each filename, but if the
check failed, it would print the error using the first one, i.e.:
[torvalds@g5 git]$ git log Makefile bad-file
fatal: 'Makefile': No such file or directory
instead of saying that it's 'bad-file' that doesn't exist.
This fixes both bugs.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
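The corrected behaviour amounts to checking every pathname and reporting the
one that actually failed. A minimal sketch, assuming lstat() is an acceptable
existence check; first_missing_path() is a hypothetical helper and not the
function used by git-rev-parse or git-rev-list.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <sys/stat.h>

/* Return the index of the first path that does not exist, or -1 if all do. */
static int first_missing_path(const char *const *paths, int nr)
{
    struct stat st;
    int i;

    assert(paths || nr == 0);
    for (i = 0; i < nr; i++) {
        if (lstat(paths[i], &st) < 0)
            return i;   /* the error should name paths[i], not paths[0] */
    }
    return -1;
}
```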
* jc/thin:
git-push: make --thin pack transfer the default.
gitk: Fix two bugs reported by users
gitk: Improve appearance of first child links
gitk: Make downward-pointing arrows end in vertical line segment
gitk: Don't change cursor at end of layout if find in progress
gitk: Make commitdata an array rather than a list
gitk: Fix display of diff lines beginning with --- or +++
[PATCH] gitk: Make error_popup react to Return
gitk: Fix a bug in drawing the selected line as a thick line
gitk: Further speedups
gitk: Various speed improvements
gitk: Fix Update menu item
gitk: Fix clicks on arrows on line ends
gitk: New improved gitk
contrib/git-svn: stabilize memory usage for big fetches
* jc/clone:
git-clone: typofix.
clone: record the remote primary branch with remotes/$origin/HEAD
revamp git-clone (take #2).
revamp git-clone.
fetch,parse-remote,fmt-merge-msg: refs/remotes/* support
* jc/name:
sha1_name: make core.warnambiguousrefs the default.
sha1_name: warning ambiguous refs.
get_sha1_basic(): try refs/... and finally refs/remotes/$foo/HEAD
core.warnambiguousrefs: warns when "name" is used and both "name" branch and tag exists.
git-svnimport: if a limit is specified, respect it
git-svnimport will import the same revision over and over again if a
limit (-l <rev>) has been specified. Instead if that revision has already
been processed, exit with an up-to-date message.
Signed-off-by: Anand Kumria <wildfire@progsoc.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
* git://git.kernel.org/pub/scm/gitk/gitk:
gitk: Fix two bugs reported by users
gitk: Improve appearance of first child links
gitk: Make downward-pointing arrows end in vertical line segment
gitk: Don't change cursor at end of layout if find in progress
gitk: Make commitdata an array rather than a list
gitk: Fix display of diff lines beginning with --- or +++
[PATCH] gitk: Make error_popup react to Return
gitk: Fix a bug in drawing the selected line as a thick line
gitk: Further speedups
gitk: Various speed improvements
gitk: Fix Update menu item
gitk: Fix clicks on arrows on line ends
gitk: New improved gitk
contrib/git-svn: stabilize memory usage for big fetches
We should be safely able to import histories with thousands
of revisions without hogging up lots of memory.
With this, we lose the ability to autocorrect mistakes when
people specify revisions in reverse, but it's probably no longer
a problem since we only have one method of log parsing nowadays.
I've added an extra check to ensure that revision numbers do
increment.
Also, increment the version number to 0.11.0. I really should
just call it 1.0 soon...
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
* ew/email:
send-email: lazy-load Email::Valid and make it optional
send-email: try to order messages in email clients more correctly
send-email: Change from Mail::Sendmail to Net::SMTP
send-email: use built-in time() instead of /bin/date '+%s'
* rs/tar-tree:
tar-tree: Use the prefix field of a tar header
tar-tree: Remove obsolete code
tar-tree: Use write_entry() to write the archive contents
tar-tree: Introduce write_entry()
tar-tree: Use SHA1 of root tree for the basedir
git-apply: safety fixes
Removed bogus "<snap>" identifier.
Clarify and expand some hook documentation.
commit-tree: check return value from write_sha1_file()
send-email: Identify author at the top when sending e-mail
Format tweaks for asciidoc.
send-email: lazy-load Email::Valid and make it optional
It's not installed on enough machines, and is overkill most of
the time. We'll fall back to a very basic regexp just in case,
but nothing like the monster regexp Email::Valid has to offer :)
Small cleanup from Merlyn.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
send-email: try to order messages in email clients more correctly
If --no-chain-reply-to is set, patches may not always be ordered
correctly in email clients. This patch makes sure each email
is sent from a different second.
I chose to start with a time (slightly) in the past because
those are probably more likely in real-world usage and spam
filters might be more tolerant of them.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
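The timestamp scheme can be sketched in C, although git-send-email itself is
a Perl script. This is only an illustration under the assumption that the
first message is dated one second into the past per message in the series;
assign_send_times() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Give each of the nr messages its own second, starting slightly in the
 * past so that the last message is still dated before `now`.
 */
static void assign_send_times(time_t now, size_t nr, time_t *out)
{
    time_t t;
    size_t i;

    assert(out || nr == 0);
    t = now - (time_t)nr;
    for (i = 0; i < nr; i++)
        out[i] = t++;
}
```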
send-email: Change from Mail::Sendmail to Net::SMTP
Net::SMTP is in the base Perl distribution, so users are more
likely to have it. Net::SMTP also allows reusing the SMTP
connection, so sending multiple emails is faster.
[jc: tweaked X-Mailer further while we are at it.]
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
This uses a simplified libxdiff setup to generate unified diffs _without_
doing fork/execve of GNU "diff".
This has several huge advantages, for example:
Before:
[torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null
real 0m24.818s
user 0m13.332s
sys 0m8.664s
After:
[torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null
real 0m4.563s
user 0m2.944s
sys 0m1.580s
and the fact that this should be a lot more portable (ie we can ignore all
the issues with doing fork/execve under Windows).
Perhaps even more importantly, this allows us to do diffs without actually
ever writing out the git file contents to a temporary file (and without
any of the shell quoting issues on filenames etc etc).
NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the
current "diff-core" code actually will always write the temp-files,
because it used to be something that you simply had to do. So this current
one actually writes a temp-file like before, and then reads it into memory
again just to do the diff. Stupid.
But if this basic infrastructure is accepted, we can start switching over
diff-core to not write temp-files, which should speed things up even
further, especially when doing big tree-to-tree diffs.
Now, in the interest of full disclosure, I should also point out a few
downsides:
- the libxdiff algorithm is different, and I bet GNU diff has gotten a
lot more testing. And the thing is, generating a diff is not an exact
science - you can get two different diffs (and you will), and they can
both be perfectly valid. So it's not possible to "validate" the
libxdiff output by just comparing it against GNU diff.
- GNU diff does some nice eye-candy, like trying to figure out what the
last function was, and adding that information to the "@@ .." line.
libxdiff doesn't do that.
- The libxdiff thing has some known deficiencies. In particular, it gets
the "\No newline at end of file" case wrong. So this is currently for
the experimental branch only. I hope Davide will help fix it.
That said, I think the huge performance advantage, and the fact that it
integrates better is definitely worth it. But it should go into a
development branch at least due to the missing newline issue.
Technical note: this is based on libxdiff-0.17, but I did some surgery to
get rid of the extraneous fat - stuff that git doesn't need, and seriously
cutting down on mmfile_t, which had much more capabilities than the diff
algorithm either needed or used. In this version, "mmfile_t" is just a
trivial <pointer,length> tuple.
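To illustrate how little is left of the structure, here is a minimal sketch of
a <pointer,length> pair and a helper that fills it from a stdio stream. The
sketch_mmfile and sketch_fill_mmfile() names are hypothetical; this is neither
the cut-down xdiff header nor the fill_mmfile() mentioned in this message.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct sketch_mmfile {
    char *ptr;
    long size;
};

/*
 * Read the whole stream into memory.
 * Returns 0 on success, -1 on failure; the caller frees mf->ptr.
 */
static int sketch_fill_mmfile(struct sketch_mmfile *mf, FILE *in)
{
    size_t cap = 4096, len = 0, got;
    char *buf, *bigger;

    assert(mf && in);
    buf = malloc(cap);
    if (!buf)
        return -1;
    while ((got = fread(buf + len, 1, cap - len, in)) > 0) {
        len += got;
        if (len == cap) {
            /* buffer is full: double it before reading more */
            bigger = realloc(buf, cap * 2);
            if (!bigger) {
                free(buf);
                return -1;
            }
            buf = bigger;
            cap *= 2;
        }
    }
    if (ferror(in)) {
        free(buf);
        return -1;
    }
    mf->ptr = buf;
    mf->size = (long)len;
    return 0;
}
```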
That said, I tried to keep the differences to simple removals, so that you
can do a diff between this and the libxdiff origin, and you'll basically
see just things getting deleted. Even the mmfile_t simplifications are
left in a state where the diffs should be readable.
Apologies to Davide, whom I'd love to get feedback on this all from (I
wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old
complex format had a helper function for that, but I did my surgery with
the goal in mind that eventually we _should_ just do
which was really a nightmare with the old "helpful" mmfile_t, and really
is that easy with the new cut-down interfaces).
[ Btw, as any hawk-eye can see from the diff, this was actually generated
with itself, so it is "self-hosting". That's about all the testing it
has gotten, along with the above kernel diff, which eye-balls correctly,
but shows the newline issue when you double-check it with "git-apply" ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
... to store parts of the path, if possible. This allows us to avoid
writing extended headers in certain cases (long paths can only be
split at '/' chars).
Also adds a file to the test repo with a 100 chars long directory name.
Even old versions of tar that don't understand POSIX extended headers
should be able to handle this testcase.
Btw.: The longest path in the kernel tree currently has 70 chars.
Together with a 30 chars long prefix this would already cross the
field limit of 100 chars.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>
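The splitting rule can be sketched as below. This is not the tar-tree.c code;
ustar_split() is a hypothetical helper. The 100-byte name field is the limit
mentioned above, and the 155-byte prefix field size comes from the ustar
header format.

```c
#include <assert.h>
#include <string.h>

#define USTAR_NAME_MAX 100      /* size of the ustar "name" field */
#define USTAR_PREFIX_MAX 155    /* size of the ustar "prefix" field */

/*
 * Decide where to split `path` between the prefix and name fields.
 * Returns 1 if the path fits; *prefix_len is then 0 (no split needed) or
 * the index of the '/' to split at. Returns 0 if an extended header is
 * still required.
 */
static int ustar_split(const char *path, size_t *prefix_len)
{
    size_t len, i;

    assert(path && prefix_len);
    len = strlen(path);
    *prefix_len = 0;
    if (len <= USTAR_NAME_MAX)
        return 1;   /* fits in the name field alone */
    i = len - 1 < USTAR_PREFIX_MAX ? len - 1 : USTAR_PREFIX_MAX;
    for (; i > 0; i--) {
        size_t name_len = len - i - 1;

        if (path[i] != '/')
            continue;   /* paths can only be split at '/' */
        if (name_len == 0)
            continue;   /* do not leave an empty name */
        if (name_len > USTAR_NAME_MAX)
            break;      /* moving left only makes the name longer */
        *prefix_len = i;
        return 1;
    }
    return 0;
}
```

A 100 chars long directory name followed by "/file" splits at the slash, so
the directory goes into the prefix field and "file" into the name field.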
... and use it initially to write global extended header records.
Improvements compared to the old write_header():
- Uses a struct ustar_header instead of hardcoded offsets.
- Takes one struct strbuf as path argument instead of a (basedir,
prefix, name) tuple.
- Not only writes the tar header, but also the contents of the
file, if any.
- Does not write directly into the ring buffer. This allows the
code to be laid out more naturally, because there is no more
ordering constraint. Before we had to first finish writing the
extended header, now we can construct the extended and normal
headers in parallel.
- The typeflag parameter has been replaced by (reasonable) magic
values. path == NULL indicates an extended header, additionally
sha1 == NULL means it is a global extended header.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>
This was triggered by me testing the "@@" numbering shorthand by GNU
patch, which not only showed that git-apply thought it meant the number
was duplicated (when it means that the second number is 1), but my tests
showed that when git-apply mis-understood the number, it would then not
raise an alarm about it if the patch ended early.
Now, this doesn't actually _matter_, since with a three-line context, the
only case that "x,1" will be shorthanded to "x" is when x itself is 1 (in
which case git-apply got it right), but the fact that git-apply would also
silently accept truncated patches was a missed opportunity for additional
sanity-checking.
So make git-apply refuse to look at a patch fragment that ends early.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
send-email: Identify author at the top when sending e-mail
git-send-email did not check if the sender is the same as the
patch author. Follow the "From: at the beginning" convention to
propagate the patch author correctly.