Here's some changes to the cvs-migration.txt. As usual, in my attempt
to make things clearer someone may have found I've made them less so, or
I may have just gotten something wrong; so any review is welcomed.
I can break up this sort of thing into smaller steps if preferred, the
monolothic patch is just a bit simpler for me for this sort of
thing.
I moved the material describing shared repository management from
core-tutorial.txt to cvs-migration.txt, where it seems more appropriate,
and combined two sections to eliminate some redundancy.
I also revised the earlier sections of cvs-migration.txt, mainly trying
to make it more concise.
I've left the last section of cvs-migration.txt (on CVS annotate
alternatives) alone for now.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Junio C Hamano <junkio@cox.net>
[PATCH] pre-commit sample hook: do not barf on the initial import
The example hook barfs on the initial import. Ideally it should
produce a diff from an empty tree, but for now let's stop at
squelching the bogus error message. Often an initial import
involves tons of badly formatted files from foreign SCM, so not
complaining about them like this patch does might actually be a
better idea than enforcing the "Perfect Patch" format on them.
diff-files -c/--cc: combine only when both ours and theirs exist.
The previous round forgot to make sure there actually are two
versions to compare against the working tree version. Otherwise
using -c/--cc would not make much sense.
rev-parse: make "whatchanged -- git-fetch-script" work again.
The latest update to avoid misspelled revs interfered when we
were not interested in parsing non flags or arguments not meant
for rev-list. This makes these two forms work again:
git whatchanged -- git-fetch-script
We could enable "!def" in the part this change touches to make
the above work without '--', but then it would cause misspelled
v2.6.14..v2.6.16 to be given to diff-tree and defeats the whole
point of the previous fix.
Earier specifying an abbreviation shorter than minimum fell back
to full 40 letters, which was nonsense. Make it to fall back to
the minimum number (currently 4).
When displaying Merge: lines, we used to take the real commit
parents from the commit objects. Use the parsed parents from
the commit object instead, so that we honor fake parent
information from info/grafts.
The usage of rev-parse to serve as a flag/option parser
for git-whatchanged and other commands have serious limitation
that the flags cannot be something that is supported by
rev-parse itself, and it cannot worked around easily. Since
this is rarely used "poor-man's describe", rename the option for
now as an easier workaround.
The minimum length of abbreviated object name was hardcoded in
different places to be 4, risking inconsistencies in the future.
Also there were three different "default abbreviation
precision". Use two C preprocessor symbols to clean up this
mess.
The one thing I've considered doing (I really should) is to add a "stop
when you don't find the file" option to "git-rev-list". This patch does
some of the work towards that: it removes the "parent" thing when the
file disappears, so a "git annotate" could do do something like
git-rev-list --remove-empty --parents HEAD -- "$filename"
and it would get a good graph that stops when the filename disappears
(it's not perfect though: it won't remove all the unintersting commits).
It also simplifies the logic of finding tree differences a bit, at the
cost of making it a tad less efficient.
The old logic was two-phase: it would first simplify _only_ merges tree as
it traversed the tree, and then simplify the linear parts of the remainder
independently. That was pretty optimal from an efficiency standpoint
because it avoids doing any comparisons that we can see are unnecessary,
but it made it much harder to understand than it really needed to be.
The new logic is a lot more straightforward, and compares the trees as it
traverses the graph (ie everything is a single phase). That makes it much
easier to stop graph traversal at any point where a file disappears.
As an example, let's say that you have a git repository that has had a
file called "A" some time in the past. That file gets renamed to B, and
then gets renamed back again to A. The old "git-rev-list" would show two
commits: the commit that renames B to A (because it changes A) _and_ as
its parent the commit that renames A to B (because it changes A).
With the new --remove-empty flag, git-rev-list will show just the commit
that renames B to A as the "root" commit, and stop traversal there
(because that's what you want for "annotate" - you want to stop there, and
for every "root" commit you then separately see if it really is a new
file, or if the paths history disappeared because it was renamed from some
other file).
With this patch, you should be able to basically do a "poor mans 'git
annotate'" with a fairly simple loop:
push("HEAD", "$filename")
while (revision,filename = pop()) {
for each i in $(git-rev-list --parents --remove-empty $revision -- "$filename")
pseudo-parents($i) = git-rev-list parents for that line
if (pseudo-parents($i) is non-empty) {
show diff of $i against pseudo-parents
continue
}
/* See if the _real_ parents of $i had a rename */
parent($i) = real-parent($i)
if (find-rename in $parent($i)->$i)
push $parent($i), "old-name"
}
which should be doable in perl or something (doing stacks in shell is just
too painful to be worth it, so I'm not going to do this).
This ports the "combined diff" to diff-files so that differences
to the working tree files since stage 2 and stage 3 are shown
the same way as combined diff output from diff-tree for the
merge commit would be shown if the current working tree files
are committed.
It considered an otherwise unchanged line that had line removals
in front of it an interesting line, which caused hunks to have
one extra the trailing context line.
diff-tree --cc: squelch header generation on empty patch.
Earlier round showed the commit log header and "diff --combined"
header even for paths that had no interesting hunk under --cc
flag. Move the header display logic around to squelch them.
With this, a merge that does not have any interesting merges
will not be shown with --cc option, unless -m is used at the
same time.
Santi Bejar points out that a hunk that changes from all the
same common parents except one is uninteresting. The earlier
round marked changes from only one parent uninteresting, but
this also marks hunks that have the same change from all but one
parent uninteresting, which is a natural extension of the
original idea to Octopus merges.
Remove extra whitespace between the change indicators and the
body text. That is more in line with the uncombined unified
diff output (pointed out by Santi Bejar).
When showing --cc, say so instead of saying just --combined.
combine-diff: fix appending at the tail of a list.
... and use the established pattern of tail initialized to point
at the head pointer for an empty list, and updated to point at
the next pointer field of the item at the tail when appending.
diff-tree --cc: denser combined diff output for a merge commit.
Building on the previous '-c' (combined) option, '--cc' option
squelches the output further by omitting hunks that consist of
difference with solely one parent.
diff-tree -c: show a merge commit a bit more sensibly.
A new option '-c' to diff-tree changes the way a merge commit is
displayed when generating a patch output. It shows a "combined
diff" (hence the option letter 'c'), which looks like this:
$ git-diff-tree --pretty -c -p fec9ebf1 | head -n 18
diff-tree fec9ebf... (from parents)
Merge: 0620db3... 8a263ae...
Author: Junio C Hamano <junkio@cox.net>
Date: Sun Jan 15 22:25:35 2006 -0800
There are a few things to note about this feature:
- The '-c' option implies '-p'. It also implies '-m' halfway
in the sense that "interesting" merges are shown, but not all
merges.
- When a blob matches one of the parents, we do not show a diff
for that path at all. For a merge commit, this option shows
paths with real file-level merge (aka "interesting things").
- As a concequence of the above, an "uninteresting" merge is
not shown at all. You can use '-m' in addition to '-c' to
show the commit log for such a merge, but there will be no
combined diff output.
- Unlike "gitk", the output is monochrome.
A '-' character in the nth column means the line is from the nth
parent and does not appear in the merge result (i.e. removed
from that parent's version).
A '+' character in the nth column means the line appears in the
merge result, and the nth parent does not have that line
(i.e. added by the merge itself or inherited from another
parent).
The above example output shows that the function signature was
changed from either parents (hence two "-" lines and a "++"
line), and "unsigned char sha1[20]", prefixed by a " +", was
inherited from the first parent.
The code as sent to the list was buggy in few corner cases,
which I have fixed since then.
It does not bother to keep track of and show the line numbers
from parent commits, which it probably should.
merge: seed the commit message with list of conflicted files.
The files with conflicts need to be hand resolved, and it is a
good discipline for the committer to explain which branch was
taken and why. Pre-fill the merge message template with the
list of conflicted paths to encourage it.
Johannes noticed the recent addition of this new flag
inadvertently took over existing --update-head-ok (-u). Require
longer abbreviation to this new option which would be needed in
a rare setup.
This makes read_tree_recursive and read_tree take a struct tree
instead of a buffer. It also move the declaration of read_tree into
tree.h (where struct tree is defined), and updates ls-tree and
diff-index (the only places that presently use read_tree*()) to use
the new versions.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Make git-rev-list and git-rev-parse argument parsing stricter
If you pass it a filename without the "--" marker to separate it from
revision information and flags, we now require that the file in question
actually exists. This makes mis-typed revision information not be silently
just considered a strange filename.
With the "--" marker, you can continue to pass in filenames that do not
actually exists - useful for querying what happened to a file that you
no longer have in the repository.
[ All scripts should use the "--" format regardless, to make things
unambiguous. So this change should not affect any existing tools ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
remove environment variables relating to the current repository
before execing the 'remote' half of a local push or pull operation
[jc: the original from Matt spelled out the environment variable
names, which I changed to the preprocessor symbols defined in
cache.h. Also it missed GRAFT_ENVIRONMENT.]
Without this, there is no way to specify a remote executable when
invoking git-pull/git-fetch as there is for git-clone.
[jc: I have a mild suspicion that this is a broken environment (aka
sysadmin disservice). It may be legal to configure your sshd to
spawn named program without involving shell at all, and if your
sysadmin does so and you have your git programs under your home
directory, you would need something like this, but then I suspect
you would need such workaround everywhere, not just git. But we
have these options we can use to work around the issue, so there
is no strong reason not to reject this patch, either. ]
Signed-off-by: Michal Ostrowski <mostrows@watson.ibm.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
sample update-hook: sanely handle a new branch head.
Instead of showing all the history since the beginning of time
leading to the the branch head, show only the changes this new
branch brings to the world.
This originally came from Linus and tested by Andreas Ericsson.
update-hook: Major overhaul (handling tags, mainly).
This is the update hook we use in all our git-repos.
It has some improvements over the original version, namely:
* Don't send every commit since dawn of time when adding a new tag.
* When updating an annotated tag, just send the diffs since the last tag.
* Add diffstat output for 'normal' commits (top) and annotated tags (bottom).
* Block un-annotated tags in shared repos.
I'm a bit uncertain about that last one, but it demonstrates how to
disallow updates of a ref which we use, so I kept it.
Note that git-describe is needed for the "changes since last annotated tag"
thing to work.
Signed-off-by: Andreas Ericsson <ae@op5.se> Signed-off-by: Junio C Hamano <junkio@cox.net>
Recommend to remove unused `origin` in a shared repository.
It is a common mistake to leave an unsed `origin` branch behind
if a shared public repository was created by first cloning from
somewhere else. Subsequent `git push` into it with the default
"push all the matching ref" would push the `origin` branch from
the developer repository uselessly.
The current Documentation/tutorial.txt concentrates on the lower-level
git interfaces. So it's useful to people developing alternative
porcelains, to advanced users, etc., but not so much to beginning users.
I think it makes sense for the main tutorial to address those
beginnning users, so with this patch I'm proposing that we move
Documentation/tutorial.txt to Documentation/core-tutorial.txt and
replace it by a new tutorial.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Junio C Hamano <junkio@cox.net>
We forgot to make sure that there is no more than one pattern
parameter. Also when looking for files in a directory called
'--others', it passed that path limiter without preceding the
end-of-options marker '--' to underlying git-ls-files, which
misunderstood it as one of its options instead.
$ git grep --others -e Meta/Make Meta
$ git grep -o -e Meta/Make Meta
$ git grep -o Meta/Make Meta
look for a string "Meta/Make" from untracked files in Meta/
directory.
$ git grep Meta/Make --others
looks for the same string from tracked files in ./--others
directory.
On the other hand,
$ git grep -e Meta/Make --others
does not have a freestanding pattern, so everybody is parameter
and there is no path specifier. It looks for the string in all
the untracked files without any path limiter.
[jc: updated with usability enhancements and documentation
cleanups from Sean.]
When overriding DT_* macro detection with NO_D_TYPE_IN_DIRENT (recent
Cygwin build problem, which hopefully is already fixed in their CVS
snapshot version), we define DTYPE() macro to return just "we do not
know", but still needed to use DT_* macro to avoid ifdef in the code
we use them. If the platform defines DT_* macro but with unusable
d_type, this would have resulted in us redefining these preprocessor
symbols.
Admittedly, that would be just a couple of compilation warnings, and
on Cygwin at least this particular problem is transitory (the problem
is already fixed in their CVS snapshot version), so this is a low
priority fix.
This test depended on "sleep 1" to be enough to dirty the index
entry for a symlink. Alex noticed that on his Cygwin installation
"sleep 1" was sometimes not enough, and after further discussion with
Christopher Faylor, it was brought up that on FAT filesystem timestamp
granularity is 2 seconds so sleeping 1 second is not enough.
For now this patch takes an easy workaround of sleeping for 3 seconds.
Very strictly speaking, POSIX requires lstat to fill only S_IFMT part
of st_mode and st_size for symlinks, and depending on timestamp might
be considered a bug, but we depend on that anyway, so it is better to
test that.
DT_UNKNOWN: do not fully trust existence of DT_UNKNOWN
The recent Cygwin defines DT_UNKNOWN although it does not have d_type
in struct dirent. Give an option to tell us not to use d_type on such
platforms. Hopefully this problem will be transient.
fsck-objects: support platforms without d_ino in struct dirent.
The d_ino field is only used for performance reasons in
fsck-objects. On a typical filesystem, i-number tends to have a
strong correlation with where the actual bits sit on the disk
platter, and we sort the entries to allow us scan things that
ought to be close together together.
If the platform lacks support for it, it is not a big deal.
Just do not use d_ino for sorting, and scan them unsorted.
I think most people will want to install the man pages as well.
[jc: incorporated Pasky's comment on not building them as root.
Some people may not want to install asciidoc/xmlto toolchain, so
redirect them to the man and html branches of the git.git
repository as well.]
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Junio C Hamano <junkio@cox.net>
Revert "git-push: avoid falling back on pushing "matching" refs."
This reverts 9e9b26751a5ca7a257b3e1cfb319fe3e4efc663c commit partially.
When no refspec is specified on the command line and there is no
default refspec to push specified in remotes/ file, just let
send-pack to do its default "matching refs" updates.
If git-fetch-pack was called with out any refspec, it would ask the server
for funny refs. That cannot work, since the funny refs are not marked
as OUR_REF by upload-pack, which just exits with an error.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
Revert "check_packed_git_idx(): check integrity of the idx file itself."
This reverts c5ced64578a82b9d172aceb2f67c6fb9e639f6d9 commit.
It turns out that doing this check every time we map the idx file
is quite expensive. A corrupt idx file is caught by git-fsck-objects,
so this check is not strictly necessary.
In one unscientific test, 0.99.9m spent 10 seconds usertime for
the same task 1.1.3 takes 37 seconds usertime. Reverting this gives
us the performance of 0.99.9 back.
By popular demand. If you build and install such binary RPMs,
the version numbering will lose monotonicity, so you may have to
later override downgrade warnings from your packaging manager,
but as long as you are aware of that and know how to deal with it,
there is no reason for us to forbid it.
Previously 'git-push --tags dst', used information from
remotes/dst to determine which refs to push; this patch corrects
it, and also documents the --tags option.
When describing more than one, we need to clear the commit marks
before handling the next one, but most of the time we are
running it for only one commit, and in such a case this clearing
phase is totally unnecessary.
cvsimport: ease migration from CVSROOT/users format
This fixes a minor bug, which caused the author email to be
doubly enclosed in a <> pair (the code gave enclosing <> to
GIT_AUTHOR_EMAIL and GIT_COMMITTER_EMAIL environment variable).
The read_author_info() subroutine is taught to also understand
the user list in CVSROOT/users format. This is primarily done
to ease migration for CVS users, who can use the -A option
to read from existing CVSROOT/users file. write_author_info()
always writes in the git-cvsimport's native format ('='
delimited and value without quotes).
This patch adds the option to specify an author name/email conversion
file in the format
exon=Andreas Ericsson <ae@op5.se>
spawn=Simon Pawn <spawn@frog-pond.org>
which will translate the ugly cvs authornames to the more informative
git style.
The info is saved in $GIT_DIR/cvs-authors, so that subsequent
incremental imports will use the same author-info even if no -A
option is specified. If an -A option *is* specified, the info in
$GIT_DIR/cvs-authors is appended/updated appropriately.
Docs updated accordingly.
Signed-off-by: Andreas Ericsson <ae@op5.se> Signed-off-by: Junio C Hamano <junkio@cox.net>
While reviewing the end user tutorial rewrite by J. Bruce
Fields, I noticed that "git-diff-tree -B -C" did not correctly
break the total rewrite of Documentation/tutorial.txt. It turns
out that we had integer overflow during the break score
computations.
Cop out by using floating point. This is not a kernel.
show-branch: --current includes the current branch.
With this, the command includes the current branch to the list
of revs to be shown when it is not given on the command line.
This is handy to use in the configuration file like this:
[showbranch]
default = --current
default = heads/* ; primary branches, not topics under
; subdirectories
show-branch: make the current branch and merge commits stand out.
This changes the character used to mark the commits that is on the
branch from '+' to '*' for the current branch, to make it stand out.
Also we show '-' for merge commits.
When you have a handful branches with relatively long diversion, it
is easier to see which one is the current branch this way.