Update diff engine for symlinks stored in the cache.
This patch updates the external diff interface engine for the change
to store the symbolic links in the cache, recently done by Kay
Sievers.
The main thing it does is when comparing with the work tree, it
prepares the counterpart to the blob being compared by doing a
readlink followed by sending that result to a temporary file to
be diffed.
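The readlink-to-temporary-file step can be sketched roughly as follows. This is only an illustration in Python, not the C code from the patch; the helper name prepare_symlink_counterpart and the ".diff_" prefix are made up for the example.

```python
import os
import tempfile


def prepare_symlink_counterpart(path):
    """Write the target of the symlink at `path` to a temporary file.

    Returns the temporary file's name; the caller removes it when done.
    (Hypothetical helper, not a function from the patch.)
    """
    target = os.readlink(path)  # the link text itself, not the file it points to
    fd, tmp_name = tempfile.mkstemp(prefix=".diff_")
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(os.fsencode(target))
    return tmp_name
```

The temporary file then plays the same role as an inflated blob would, so the diff compares link targets as plain text.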
Separate out the merge resolve from the actual getting of the
data. Also, update the resolve phase to take advantage of the
fact that we don't need to do the commit->tree object lookup
by hand, since all the actors involved happily just act on a
commit object these days.
Allow storing and tracking symlinks in the repository. A symlink is stored
the same way as a regular file, only with the appropriate mode bits set.
The symlink target is therefore stored in a blob object.
This will hopefully make our udev repository fully functional. :)
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Following up from my fix to rpull, please also apply this, which fixes
rpush.c to call git-rpull rather than rpull which no longer exists after
the Big Rename(TM)...
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When the patch tries to create a new file and the file exists, abort.
This fixes an error introduced to git-apply-patch-script in the previous
round. We do not invoke patch for create/delete case, so we need to
be a bit careful about detecting conflicts like this.
This patch optimizes "diff-cache -p --cached" by not inflating
blobs into temporary files when the blob recorded in the
cache matches the corresponding file in the work tree. The file
in the work tree is passed as the comparison source in such a
case instead.
This optimization kicks in only when we have already read the
cache, and this is deliberate. In particular, diff-tree does not
use this code, because changes are contained in a small number of
files relative to the project size most of the time, and reading
the cache is so expensive for a large project that the cost of
reading it outweighs the savings from not inflating blobs.
Also this patch cleans up the structure passed from diff clients
by removing one unused structure member.
Terminate diff-* on non-zero exit from GIT_EXTERNAL_DIFF
(slightly updated from the version posted to the GIT mailing list
with small bugfixes).
This patch changes the git-apply-patch-script to exit non-zero when
the patch cannot be applied. Previously, the external diff driver
deliberately ignored the exit status of GIT_EXTERNAL_DIFF command,
which was a design mistake. It now stops the processing when
GIT_EXTERNAL_DIFF exits non-zero, so the damage from running
git-diff-* with git-apply-patch-script between two wrong trees can be
contained.
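The stop-on-failure behaviour amounts to something like the sketch below. It is a simplification in Python: the real driver passes more arguments to GIT_EXTERNAL_DIFF than a single path, and the function name and the injectable run parameter are assumptions made for the example.

```python
import subprocess


def run_external_diff(command, paths, run=subprocess.call):
    """Run `command` once per path and stop at the first non-zero exit.

    `command` is an argv list; `run` is injectable so the loop can be
    exercised without spawning processes (hypothetical interface).
    Returns (exit_status, paths_processed).
    """
    processed = []
    for path in paths:
        status = run(command + [path])
        processed.append(path)
        if status != 0:
            # Do not touch the remaining paths once one invocation fails.
            return status, processed
    return 0, processed
```

With this shape, a failed patch application on one path leaves all later paths untouched instead of piling further changes onto a wrong tree.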
The "diff" command line generated by the built-in driver is changed to
always exit 0 in order to match this new behaviour. I know Pasky does
not use GIT_EXTERNAL_DIFF yet, so this change should not break Cogito,
either.
Git-prune-script loses blobs referenced from an uncommitted cache.
(updated from the version posted to the GIT mailing list).
When a new blob is registered with update-cache, and before the cache
is written as a tree and committed, git-fsck-cache will find the blob
unreachable. This patch adds a new flag, "--cache" to git-fsck-cache,
with which it keeps such blobs from being considered "unreachable".
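In set terms the idea is roughly the following. This is a sketch only, with object names as plain strings; the function and parameter names are invented for the illustration and do not appear in fsck-cache.c.

```python
def unreachable_objects(all_objects, reachable_from_heads, cache_entries=None):
    """Return the objects a prune would consider unreachable.

    Passing `cache_entries` models the "--cache" flag: blobs named by the
    cache count as reachable even though no tree refers to them yet.
    """
    reachable = set(reachable_from_heads)
    if cache_entries is not None:
        reachable.update(cache_entries)
    return set(all_objects) - reachable
```

A blob that was only registered with update-cache shows up in the result without the cache entries and drops out of it with them.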
The git-prune-script is updated to use this new flag. At the same time
it adds .git/refs/*/* to the set of default locations to look for heads,
which should be consistent with expectations from Cogito users.
Without this fix, "diff-cache -p --cached" after git-prune-script has
pruned the blob object will fail mysteriously and git-write-tree would
also fail.
When git-local-pull with -l option gets ENOENT attempting to create
a hard link, there is no point falling back to other copy methods.
With this patch, git-local-pull detects such a case and gives up
copying the file early.
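A minimal sketch of the early give-up, in Python rather than the C of local-pull.c. The function name and the True/False return convention are assumptions for the example; only the ENOENT handling reflects the description above.

```python
import errno
import os
import shutil


def copy_object(src, dst, try_hardlink=True):
    """Copy `src` to `dst`, preferring a hard link.

    Returns False when the link attempt reports ENOENT, since the other
    copy methods would fail the same way; returns True on success.
    """
    if try_hardlink:
        try:
            os.link(src, dst)
            return True
        except OSError as e:
            if e.errno == errno.ENOENT:
                return False  # nothing to copy, so do not fall back
            # Other errors (for example EXDEV) fall through to a plain copy.
    shutil.copyfile(src, dst)
    return True
```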
Make git-*-pull say who wants them for missing objects.
This patch updates pull.c, the engine that decides which objects are
needed, given a commit to traverse from, to report which commit was
calling for the object that cannot be retrieved from the remote side.
This complements git-fsck-cache in that it checks the consistency of
the remote repository for reachability.
Make it much safer: we write to a temporary file, and then link that
temporary file to the final destination. This avoids all the nasty
races if several people write the same object at the same time.
It should also result in nicer on-disk layout, since it means that
objects all get created in the same subdirectory. That makes a lot
of block allocation algorithms happier, since the objects will now
be allocated from the same zone.
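The write-then-link pattern looks roughly like this. It is a sketch in Python; the function name and parameters are hypothetical, and it assumes the temporary directory and the destination live on the same filesystem so that link() can work.

```python
import os
import tempfile


def write_object_file(tmp_dir, final_path, data):
    """Write `data` to a temporary file in `tmp_dir`, then link it into place.

    If `final_path` already exists, another writer got there first and the
    existing file is kept untouched.
    """
    fd, tmp_path = tempfile.mkstemp(prefix="obj_", dir=tmp_dir)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        try:
            os.link(tmp_path, final_path)
        except FileExistsError:
            pass  # same object written concurrently; nothing more to do
    finally:
        os.unlink(tmp_path)  # the temporary name is never left behind
    return final_path
```

Because link() either creates the final name pointing at a complete file or fails, readers never observe a half-written object.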
We check the ordering of the entries, and we verify that none
of the entries has a slash in it (this allows us to remove the
hacky "has_full_path" member from the tree structure, since we
now just test it by walking the tree entries instead).
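The two checks can be pictured with the sketch below. It assumes a plain bytewise, strictly increasing ordering of entry names, which may differ in detail from the comparison the C code performs; the function name is invented for the example.

```python
def verify_tree_entries(names):
    """Return True if `names` (bytes) are strictly sorted and slash-free."""
    previous = None
    for name in names:
        if b"/" in name:
            return False  # a tree entry names one path component only
        if previous is not None and not previous < name:
            return False  # out of order, or a duplicate entry
        previous = name
    return True
```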
gcc 3.4.3 kicks out this warning:
convert-cache.c: In function `write_subdirectory':
convert-cache.c:102: warning: field precision is not type int (arg 4)
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This improves the cold-cache behaviour on most filesystems,
since it makes the fsck access patterns more regular on
the disk, rather than seeking back and forth.
Note the "most". Not all filesystems have any relationship
between inode number and location on disk.
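Visiting directory entries in inode order can be sketched as below. This is a Python illustration with a made-up function name, not the fsck-cache code itself.

```python
import os


def entries_in_inode_order(directory):
    """Return the names in `directory`, sorted by inode number."""
    with os.scandir(directory) as it:
        entries = [(entry.inode(), entry.name) for entry in it]
    entries.sort()  # sorts by inode first, name as a tie-breaker
    return [name for _, name in entries]
```

Whether this helps depends on the filesystem, as noted above: it only pays off where inode numbers correlate with on-disk location.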
With this change, git-merge-one-file-script ceases to smudge
files in the work tree when recording the trivial merge results
(as before, the conflicting auto-merge failure case does not
touch the work tree file).
This new flag tells git-update-cache to remove the named path even
when the work tree still happens to have the file. It is used to
update git-merge-one-file-script not to smudge the work tree.
A new command, git-write-blob, is introduced. This registers
the contents of any file on the filesystem as a blob in the
object database and reports its SHA1 to the standard output.
To implement it, the patch promotes index_fd() from a static
function in update-cache.c to extern and moves it to a library
source, sha1_file.c.
This command is used to update git-merge-one-file-script so that
it does not smudge the work tree.
It's silly, and it shouldn't matter, but every time I look at
the diffs, I end up wondering why "l/" and "k/" are the
prefixes.
Junio says it's a tribute to linux-kernel, but graciously also
said I can change it to something else. So make it "a/" and "b/"
until somebody else complains ;)
This is to be applied on top of the previous patch to add
git-local-pull command. In addition to the '-l' (attempt
hardlink before anything else) and the '-s' (then attempt
symlink) flags, it adds '-n' (do not fall back to file copy)
flag. Also it updates the comments.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When git-apply-patch-script creates a new file without
executable mode set, a typo caused it not to report that
activity to the user. Also it was mistakenly running
git-update-cache twice for newly created or deleted paths. This
patch fixes these problems.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently pull() calls fetch() without checking whether we have
the wanted object, but all of the existing fetch()
implementations perform this check and return success
themselves. This patch moves the check to the caller.
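Schematically, the check moves as shown in this sketch. The has_object and fetch parameters stand in for the real C functions and are only placeholders for the illustration.

```python
def pull(sha1, has_object, fetch):
    """Fetch `sha1` only when it is not already available locally.

    `has_object` and `fetch` are callables supplied by the caller;
    `fetch` no longer needs its own "do we already have it?" test.
    """
    if has_object(sha1):
        return True
    return fetch(sha1)
```

Each fetch() implementation then only has to deal with objects that are actually missing.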
I will be sending a trivial git-local-pull which depends on
this in the next message.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Make git-update-cache --refresh fail if update/merge needed.
Scripts may find it useful if they do not have to parse the
output from the command but just can rely on its exit status.
Earlier both Linus and I thought this would be necessary to
make git-prune-script safer but it turns out that the issue was
somewhere else and not related to what this patch addresses.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If somebody wants it later, we can re-do it, but for now we consider
it an experiment that wasn't worth it. Git will still honor symbolic
names, it just won't look up parents for you.
Of course, you can always do it by hand if you want to.
It uses the jit syntax, at least for now. 0-xxxx is the first parent of xxxx,
while 1-xxxx is the second, and so on. You can use just "-xxxx" for the first
parent, but a lot of commands will think that the initial '-' implies a
command line flag.
And be a bit more careful about matching: if we don't recognize a word
or a number, we skip the whole thing, rather than trying the next character
in that word/number.
Finally: since ctime() adds the final '\n', don't add another one in test-date.
I found this during a conflict merge testing. The original did
not have either DF (a file) or DF/DF (a file DF/DF under a
directory DF). One side created DF, the other created DF/DF. I
first resolved DF as a new file by taking what the first side
did. After that, the entry DF/DF cannot be resolved by running
git-update-cache --remove although it does not exist on the
filesystem.
[PATCH] Really fix git-merge-one-file-script this time.
The merge-cache program was updated to pass executable bits when
calling git-merge-one-file-script, but the script supplied as an
example was not using them carefully.
This patch fixes the following problems in the script:
* When a new file is created in a directory that is a file in
the work tree, it tried to create the leading directory but did
not check for failure from the "mkdir -p" command.
* The script did not check the exit status from the
git-update-cache command at all.
* The parameter "$4" to the script is a file name that can
contain almost any characters, so it must be quoted with
double quotes and also needs to be preceded with -- to mark
it as a non-option when passed to certain commands.
* The chmod command was used with parameter "$6" or "$7" to set
the mode bits. This contradicts the strategy taken by
checkout-cache, where we honor user's umask and force only
the executable bits. With this patch, it creates a new file
by redirecting into it (thus honoring user's default umask),
and then uses "chmod +x" if we want the resulting file
executable. Without this fix, the merge result becomes 0644
or 0755 even for users whose umask is 002, for whom it should
become 0664 or 0775.
* When "$1 -> $2 -> $3" case was not handled, the script did
not say which path it was working on, which was not so useful
when used with the -a option of git-merge-cache.
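The umask-honoring creation described in the chmod item above can be sketched as follows. It is a Python rendering of "redirect into the file, then chmod +x", not the shell from the script; the function name is made up, and reading the umask by setting and restoring it is not thread-safe.

```python
import os


def write_merge_result(path, data, executable):
    """Create `path` honoring the user's umask; force only the exec bits."""
    # 0o666 is only a request: the kernel clears the bits set in the umask,
    # just as it does for a file created by shell redirection.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    if executable:
        umask = os.umask(0)  # read the current umask ...
        os.umask(umask)      # ... and put it straight back
        mode = os.stat(path).st_mode & 0o777
        # Like "chmod +x" with no "who": add exec bits the umask allows.
        os.chmod(path, mode | (0o111 & ~umask))
```

With a umask of 002 this yields 0664 for a plain file and 0775 for an executable one, which are the modes the message above says such users should get.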
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Stop attempting to be compatible with cg-patch, and drop
(mode:XXXXXX) bits from the diff.
- Do keep the /dev/null change for created and deleted case.
- No "Index:" line, no "Mode change:" line, anywhere in the
output. Anything that wants the mode bits and sha1 hash can
do things from GIT_EXTERNAL_DIFF mechanism. Maybe document
suggested usage better.
This adds an example script git-apply-patch-script, that can be
used as the GIT_EXTERNAL_DIFF to apply changes between two trees
directly on the current work tree, like this:
The "take two" diff-tree-helper patch inadvertently dropped
support for the -R option, which is necessary to produce a reverse diff
based on diff-cache and diff-files output (diff-tree does not
matter since you can feed two trees in reverse order). This
patch restores it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
date.c: split up dst information in the timezone table
This still doesn't actually really _use_ it properly, nor make any
distinction between different DST rules, but at least we could (if
we wanted to) fake it a bit better.
Right now the code actually still says "it's always summer". I'm
from Finland, I don't like winter.
Make git-fsck-cache error printouts a bit more informative.
Show the types of objects involved in broken links, and don't bother
warning about unreachable tag files (if somebody cares about tags,
they'll use the --tags flag to see them).
...since everything out there is either strange (libc mktime has issues
with timezones) or introduces unnecessary dependencies for people (libcurl).
This goes back to the old date parsing, but moves it out into a file of
its own, and does the "struct tm" to "seconds since epoch" handling by
hand.
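Doing the "struct tm" to "seconds since epoch" conversion by hand boils down to counting days. The sketch below is an independent Python illustration, not the code in date.c; it treats the fields as UTC and assumes years 1970 through 2099, where every fourth year is a leap year.

```python
# Days before the first of each month in a non-leap year.
CUMULATIVE_DAYS = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334)


def tm_to_epoch(year, month, day, hour=0, minute=0, second=0):
    """Convert a UTC calendar time to seconds since 1970-01-01 00:00:00."""
    if not (1970 <= year <= 2099 and 1 <= month <= 12):
        raise ValueError("date out of supported range")
    years = year - 1970
    leap_days = (years + 1) // 4  # leap days in completed years: 1972, 1976, ...
    days = years * 365 + leap_days + CUMULATIVE_DAYS[month - 1] + (day - 1)
    if year % 4 == 0 and month > 2:
        days += 1  # this year's Feb 29 has already passed
    return ((days * 24 + hour) * 60 + minute) * 60 + second
```

No timezone database is consulted, which sidesteps the mktime() local-time problem; any timezone offset has to be applied separately.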
I grepped through the tz-database and it seems there's one "country"
left that has non-60-minute DST: Lord Howe Island. All others dropped
that before 1970.
This switches git-commit-tree to using curl_getdate() for the
AUTHOR_DATE, and thus fixes the problem with "mktime()" parsing dates in
the local timezone. It also ends up being more permissive about the
format of the date.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] GIT: Create tar archives of tree on the fly
Write commit ID to global extended pax header at the beginning of the tar
file, if possible. get-tar-commit-id.c is an example program to get the
ID back out of such a tar archive.
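A pax extended header record has the form "<length> <keyword>=<value>\n", where the length counts the whole record including its own digits. The sketch below builds one such record in Python; using the "comment" keyword to carry the commit ID is an assumption of this example, not something the message above states.

```python
def pax_record(keyword, value):
    """Build one pax extended header record from bytes `keyword`/`value`."""
    body = b" " + keyword + b"=" + value + b"\n"
    # The length field includes its own digits, so iterate to a fixed point.
    length = len(body) + 1
    while True:
        candidate = len(body) + len(str(length))
        if candidate == length:
            break
        length = candidate
    return str(length).encode("ascii") + body
```

For a 40-character hex ID under the "comment" keyword the record comes out at exactly 52 bytes, so a reader can locate the ID at a fixed offset inside the record.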
Rename git core commands to be "git-xxxx" to avoid name clashes.
This also regularizes the make. The source files themselves don't get
the "git-" prefix, because that's just inconvenient. So instead we just
make the rule that "git-xxxx" depends on "xxxx.c", and do that for
all the core programs (ie the old "git-mktag.c" got renamed to just
"mktag.c" to match everything else).
And "show-diff" got renamed to "git-diff-files" while at it, since
that's what it really should be to match the other git-diff-xxx cases.
[PATCH] GIT: Honour SHA1_FILE_DIRECTORY env var in git-pull-script
If you set SHA1_FILE_DIRECTORY to something other than .git/objects,
git-pull-script will store the fetched files in a location the rest of
the tools do not expect.
git-prune-script also ignores this setting, but I think this is good,
because pruning a shared tree to fit a single project means throwing
away a lot of useful data. :-)
[PATCH] Use read_object_with_reference() in tar-tree
This patch replaces the usage of read_tree_with_tree_or_commit_sha1()
with read_object_with_reference() in tar-tree. As a result the code
that tries to figure out the commit time doesn't need to open the commit
object 'by hand' any more.
[PATCH] Rename and extend read_tree_with_tree_or_commit_sha1
This patch renames read_tree_with_tree_or_commit_sha1() to
read_object_with_reference() and extends it to automatically
dereference not just "commit" objects but "tag" objects. With
this patch, you can say e.g.:
This is an improved version of tar-tree, a streaming archive creator for
GIT. The major added feature is blocking; all write(2) calls now have a
size of 10240, just as GNU tar (and tape drives) likes them. The
buffering overhead does not seem to degrade performance because most
files in the repositories I tested this with are smaller than 10KB, so
we need fewer system calls.
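The blocking can be pictured as a small buffering layer in front of write(2). This is a Python sketch with invented names; padding the final block with NUL bytes is an assumption of the example rather than a statement about tar-tree.c.

```python
BLOCK_SIZE = 10240


class BlockedWriter:
    """Collect data and hand it on in chunks of exactly BLOCK_SIZE bytes."""

    def __init__(self, write_block):
        self.write_block = write_block  # callable taking one bytes object
        self.buffer = bytearray()

    def write(self, data):
        self.buffer += data
        while len(self.buffer) >= BLOCK_SIZE:
            self.write_block(bytes(self.buffer[:BLOCK_SIZE]))
            del self.buffer[:BLOCK_SIZE]

    def close(self):
        # Flush what is left, padded with NULs to a full block.
        if self.buffer:
            padding = BLOCK_SIZE - len(self.buffer)
            self.write_block(bytes(self.buffer) + b"\0" * padding)
            self.buffer.clear()
```

Many small files end up sharing one output call, which is where the saving in system calls comes from.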
File names are still restricted to 500 bytes and the archive format
currently only allows for files up to 8GB. Both restrictions can be
lifted if need be with more pax extended headers.
The archive format used is the pax interchange format, i.e. POSIX tar
format. It can be read by (and created with) GNU tar. If I read the
specs correctly tar-tree should now be standards compliant (modulo
bugs).
Because it streams the archive (think ls-tree merged with cat-file),
tar-tree doesn't need to create any temporary files. That makes it
quite fast.
It accepts tree IDs and commit IDs as first parameter. In the latter
case tar-tree tries to get the commit date out of the committer line.
Else all files in the archive are time-stamped with the current time.
An optional second parameter is used as a path prefix for all files in
the archive. Example:
When diff-cache -p and friends are interrupted, they can leave
their temporary files behind. Also when the external diff
program is killed instead of exiting (this usually happens when
piping the output to a pager, which can cause SIGPIPE when the
user quits viewing the diff early), they incorrectly died
without cleaning their temporary file.
This fixes these problems.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] diff-tree-helper: do not report unmerged path outside specification.
My bad. diff-tree-helper reports all unmerged paths even when
the command line asks it to filter the paths. This patch
fixes it. Also, the reverse-diff option was left out during the last
round, which this patch restores as well.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Make diff-cache and friends output more cg-patch friendly.
This changes the way the default arguments to diff are built when
diff-cache and friends are invoked with -p and there is no
GIT_EXTERNAL_DIFF environment variable. It attempts to be more cg-patch
friendly by:
- Showing diffs against /dev/null to denote added or removed
files;
- Showing file modes for existing files as a comment after the
diff label.
Unfortunately with this change GIT_DIFF_CMD customization cannot
be supported easily anymore, so it has been dropped.
GIT_DIFF_OPTS customization to change diffs from unified to
context is still there, though.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Add function to parse an object of unspecified type (take 2)
This adds a function that parses an object from the database when we have
to look up its actual type. It also checks the hash of the file, due to
its heritage as part of fsck-cache.
Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Mark blobs as parsed when they're actually parsed
This eliminates the special case for blobs versus other types of
objects. Now the scheme is entirely regular and I won't introduce stupid
bugs. (And fsck-cache doesn't have to do the do-nothing parse)
Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With the recent "no-patch-by-default" change, the -s flag to the
show-diff command (and silent variable in the show-diff.c) became
meaningless. This deprecates it.
Cogito uses "show-diff -s" for the purpose of "I do not want the patch
text. I just want to know if something has potentially changed, in
which case I know you will have some output. I'll run update-cache
--refresh if you say something", so we cannot barf on seeing -s on our
command line yet.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The "base" string already contains any trailing "/", so the way
to get the full pathname is to just concatenate the base and
path directly, with no extra slashes in between.
Junio pointed out that diff-cache didn't handle the case of a new file
that was different from its index entry correctly. It needs to check
the working copy the same way the modified file case did.