[PATCH] don't load and decompress objects twice with parse_object()
It turns out that parse_object() is loading and decompressing given
object to free it just before calling the specific object parsing
function which does mmap and decompress the same object again. This
patch introduces the ability to parse specific objects directly from a
memory buffer.
Without this patch, running git-fsck-cache on the kernel repositorytake:
real 0m13.006s
user 0m11.421s
sys 0m1.218s
With this patch applied:
real 0m8.060s
user 0m7.071s
sys 0m0.710s
The performance increase is significant, and this is kind of a
prerequisite for sane delta object support with fsck.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This only shows the tree headers when something actually changed. Also,
add a "silent" mode, which doesn't actually show the changes at all,
just the commit information.
gitweb.cgi's default view is the log of the last day and git-rev-list
can stop crawling the whole repo if we have all our data to display in the
browser. Also the rss-feed query needs only the last 20 items. This
will speeds up these queries dramatically.
Implement -v (verbose) option for pull methods other than local transport.
This moves the private "say()" function to pull.c, renames it to
"pull_say()", and introduces a global variable "get_verbosely" that
makes the pull backends report what they fetch. The -v option is
added to git-rpull and git-http-pull to match git-local-pull.
The documentation is updated to describe these pull commands.
The tree object parsing used to get the executable bit wrong,
and didn't know about symlinks. Also, fsck really wants the
full mode value so that it can verify the other bits for sanity,
so save it all in struct tree_entry.
Update git-apply-patch-script for symbolic links.
Make git-prune-script executable again.
Do not write out new index if nothing has changed.
diff-cache shows differences for unmerged paths without --cache.
Update diff engine for symlinks stored in the cache.
This patch updates the git-apply-patch-script for the symbolic links
in the cache, recently added by Kay Sievers.
It currently is very anal about symbolic link changes. It refuses to
change between a regular file and a symbolic link, and only allows
symbolic link changes if the patch is based on the same original.
Update diff engine for symlinks stored in the cache.
This patch updates the external diff interface engine for the change
to store the symbolic links in the cache, recently done by Kay
Sievers.
The main thing it does is when comparing with the work tree, it
prepares the counterpart to the blob being compared by doing a
readlink followed by sending that result to a temporary file to
be diffed.
diff-cache shows differences for unmerged paths without --cache.
While manually resolving a merge conflict, being able to run
diff-cache without --cache option between files in the work tree
and either of the ancestor trees is helpful to verify the hand
merged result. However, diff-cache refuses to handle unmerged
paths, even when run without --cache option.
This changes the behaviour so that the above use case will
report the differences between the compared tree and the magic
0{40} SHA1 (i.e. "look at the work tree"). When there is no
corresponding file in the work tree, or when the command is run
with "--cache" option, it continues to report "unmerged".
Separate out the merge resolve from the actual getting of the
data. Also, update the resolve phase to take advantage of the
fact that we don't need to do the commit->tree object lookup
by hand, since all the actors involved happily just act on a
commit object these days.
Allow to store and track symlink in the repository. A symlink is stored
the same way as a regular file, only with the appropriate mode bits set.
The symlink target is therefore stored in a blob object.
This will hopefully make our udev repository fully functional. :)
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Following up from my fix to rpull, please also apply this, which fixes
rpush.c to call git-rpull rather than rpull which no longer exists after
the Big Rename(TM)...
Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When the patch tries to create a new file and the file exists, abort.
This fixes an error introduced to git-apply-patch-script in the previous
round. We do not invoke patch for create/delete case, so we need to
be a bit careful about detecting conflicts like this.
This patch optimizes "diff-cache -p --cached" by avoiding to
inflate blobs into temporary files when the blob recorded in the
cache matches the corresponding file in the work tree. The file
in the work tree is passed as the comparison source in such a
case instead.
This optimization kicks in only when we have already read the
cache this optimization and this is deliberate. Especially,
diff-tree does not use this code, because changes are contained
in small number of files relative to the project size most of
the time, and reading cache is so expensive for a large project
that the cost of reading it outweighs the savings by not
inflating blobs.
Also this patch cleans up the structure passed from diff clients
by removing one unused structure member.
Terminate diff-* on non-zero exit from GIT_EXTERNAL_DIFF
(slightly updated from the version posted to the GIT mailing list
with small bugfixes).
This patch changes the git-apply-patch-script to exit non-zero when
the patch cannot be applied. Previously, the external diff driver
deliberately ignored the exit status of GIT_EXTERNAL_DIFF command,
which was a design mistake. It now stops the processing when
GIT_EXTERNAL_DIFF exits non-zero, so the damages from running
git-diff-* with git-apply-patch-script between two wrong trees can be
contained.
The "diff" command line generated by the built-in driver is changed to
always exit 0 in order to match this new behaviour. I know Pasky does
not use GIT_EXTERNAL_DIFF yet, so this change should not break Cogito,
either.
Git-prune-script loses blobs referenced from an uncommitted cache.
(updated from the version posted to GIT mailing list).
When a new blob is registered with update-cache, and before the cache
is written as a tree and committed, git-fsck-cache will find the blob
unreachable. This patch adds a new flag, "--cache" to git-fsck-cache,
with which it keeps such blobs from considered "unreachable".
The git-prune-script is updated to use this new flag. At the same time
it adds .git/refs/*/* to the set of default locations to look for heads,
which should be consistent with expectations from Cogito users.
Without this fix, "diff-cache -p --cached" after git-prune-script has
pruned the blob object will fail mysteriously and git-write-tree would
also fail.
When git-local-pull with -l option gets ENOENT attempting to create
a hard link, there is no point falling back to other copy methods.
With this patch, git-local-pull detects such a case and gives up
copying the file early.
Make git-*-pull say who wants them for missing objects.
This patch updates pull.c, the engine that decides which objects are
needed, given a commit to traverse from, to report which commit was
calling for the object that cannot be retrieved from the remote side.
This complements git-fsck-cache in that it checks the consistency of
the remote repository for reachability.
Make it much safer: we write to a temporary file, and then link that
temporary file to the final destination. This avoids all the nasty
races if several people write the same object at the same time.
It should also result in nicer on-disk layout, since it means that
objects all get created in the same subdirectory. That makes a lot
of block allocation algorithms happier, since the objects will now
be allocated from the same zone.
We check the ordering of the entries, and we verify that none
of the entries has a slash in it (this allows us to remove the
hacky "has_full_path" member from the tree structure, since we
now just test it by walking the tree entries instead).
gcc 3.4.3 kicks out this warning:
convert-cache.c: In function `write_subdirectory':
convert-cache.c:102: warning: field precision is not type int (arg 4)
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This improves the cold-cache behaviour on most filesystems,
since it makes the fsck access patterns more regular on
the disk, rather than seeking back and forth.
Note the "most". Not all filesystems have any relationship
between inode number and location on disk.
With this change, git-merge-one-file-script ceases to smudge
files in the work tree when recording the trivial merge results
(conflicting auto-merge failure case does not touch the work
tree file as before).
This new flag tells git-update-cache to remove the named path even
when the work tree still happens to have the file. It is used to
update git-merge-one-file-script not to smudge the work tree.
A new command, git-write-blob, is introduced. This registers
the contents of any file on the filesystem as a blob in the
object database and reports its SHA1 to the standard output.
To implement it, the patch promotes index_fd() from a static
function in update-cache.c to extern and moves it to a library
source, sha1_file.c.
This command is used to update git-merge-one-file-script so that
it does not smudge the work tree.
It's silly, and it shouldn't matter, but every time I look at
the diffs, I ended up just worrying why "l/" and "k/" as the
prefixes.
Junio says it's a tribute to linux-kernel, but graciously also
said I can change it to something else. So make it "a/" and "b/"
until somebody else complains ;)
This is to be applied on top of the previous patch to add
git-local-pull command. In addition to the '-l' (attempt
hardlink before anything else) and the '-s' (then attempt
symlink) flags, it adds '-n' (do not fall back to file copy)
flag. Also it updates the comments.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When git-apply-patch-script creates a new file without
executable mode set, a typo caused it not to report that
activity to the user. Also it was mistakenly running
git-update-cache twice for newly created or deleted paths. This
patch fixes these problems.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently pull() calls fetch() without checking whether we have
the wanted object but all of the existing fetch()
implementations perform this check and return success
themselves. This patch moves the check to the caller.
I will be sending a trivial git-local-pull which depends on
this in the next message.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Make git-update-cache --refresh fail if update/merge needed.
Scripts may find it useful if they do not have to parse the
output from the command but just can rely on its exit status.
Earlier both Linus and myself thought this would be necessary to
make git-prune-script safer but it turns out that the issue was
somewhere else and not related to what this patch addresses.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If somebody wants it later, we can re-do it, but for now we consider
it an experiment that wasn't worth it. Git will still honor symbolic
names, it just won't look up parents for you.
Of course, you can always do it by hand if you want to.
It uses the jit syntax, at least for now. 0-xxxx is the first parent of xxxx,
while 1-xxxx is the second, and so on. You can use just "-xxxx" for the first
parent, but a lot of commands will think that the initial '-' implies a
command line flag.
And be a bitmore careful about matching: if we don't recognize a word
or a number, we skip the whole thing, rather than trying the next character
in that word/number.
Finally: since ctime() adds the final '\n', don't add another one in test-date.
I found this during a conflict merge testing. The original did
not have either DF (a file) or DF/DF (a file DF/DF under a
directory DF). One side created DF, the other created DF/DF. I
first resolved DF as a new file by taking what the first side
did. After that, the entry DF/DF cannot be resolved by running
git-update-cache --remove although it does not exist on the
filesystem.
[PATCH] Really fix git-merge-one-file-script this time.
The merge-cache program was updated to pass executable bits when
calling git-merge-one-file-script, but the called script
supplied as an example were not using them carefully.
This patch fixes the following problems in the script:
* When a new file is created in a directory, which is a file in
the work tree, it tried to create leading directory but did
not check for failure from the "mkdir -p" command.
* The script did not check the exit status from the
git-update-cache command at all.
* The parameter "$4" to the script is a file name that can
contain almost any characters, so it must be quoted with
double quotes and also needs to be preceded with -- to mark
it as a non-option when passed to certain commands.
* The chmod command was used with parameter "$6" or "$7" to set
the mode bits. This contradicts with the strategy taken by
checkout-cache, where we honor user's umask and force only
the executable bits. With this patch, it creates a new file
by redirecting into it (thus honoring user's default umask),
and then uses "chmod +x" if we want the resulting file
executable. Without this fix, the merge result becomes 0644
or 0755 for users whose umask is 002 for whom it should
become 0664 or 0775.
* When "$1 -> $2 -> $3" case was not handled, the script did
not say which path it was working on, which was not so useful
when used with the -a option of git-merge-cache.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Stop attempting to be compatible with cg-patch, and drop
(mode:XXXXXX) bits from the diff.
- Do keep the /dev/null change for created and deleted case.
- No "Index:" line, no "Mode change:" line, anywhere in the
output. Anything that wants the mode bits and sha1 hash can
do things from GIT_EXTERNAL_DIFF mechanism. Maybe document
suggested usage better.
This adds an example script git-apply-patch-script, that can be
used as the GIT_EXTERNAL_DIFF to apply changes between two trees
directly on the current work tree, like this:
Diff-tree-helper take two patch inadvertently dropped the
support of -R option, which is necessary to produce reverse diff
based on diff-cache and diff-files output (diff-tree does not
matter since you can feed two trees in reverse order). This
patch restores it.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
date.c: split up dst information in the timezone table
This still doesn't actually really _use_ it properly, nor make any
distinction between different DST rules, but at least we could (if
we wanted to) fake it a bit better.
Right now the code actually still says "it's always summer". I'm
from Finland, I don't like winter.
Make git-fsck-cache error printouts a bit more informative.
Show the types of objects involved in broken links, and don't bother
warning about unreachable tag files (if somebody cares about tags,
they'll use the --tags flag to see them).
...since everything out there is either strange (libc mktime has issues
with timezones) or introduces unnecessary dependencies for people (libcurl).
This goes back to the old date parsing, but moves it out into a file of
its own, and does the "struct tm" to "seconds since epoch" handling by
hand.
I grepped through the tz-database and it seems there's one "country"
left that has non-60-minute DST: Lord Howe Island. All others dropped
that before 1970.
This switches git-commit-tree to using curl_getdate() for the
AUTHOR_DATE, and thus fixes the problem with "mktime()" parsing dates in
the local timezone. It also ends up being more permissive about the
format of the date.
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] GIT: Create tar archives of tree on the fly
Write commit ID to global extended pax header at the beginning of the tar
file, if possible. get-tar-commit-id.c is an example program to get the
ID back out of such a tar archive.
Rename git core commands to be "git-xxxx" to avoid name clashes.
This also regularizes the make. The source files themselves don't get
the "git-" prefix, because that's just inconvenient. So instead we just
make the rule that "git-xxxx" depends on "xxxx.c", and do that for
all the core programs (ie the old "git-mktag.c" got renamed to just
"mktag.c" to match everything else).
And "show-diff" got renamed to "git-diff-files" while at it, since
that's what it really should be to match the other git-diff-xxx cases.
[PATCH] GIT: Honour SHA1_FILE_DIRECTORY env var in git-pull-script
If you set SHA1_FILE_DIRECTORY to something else than .git/objects
git-pull-script will store the fetched files in a location the rest of
the tools does not expect.
git-prune-script also ignores this setting, but I think this is good,
because pruning a shared tree to fit a single project means throwing
away a lot of useful data. :-)
[PATCH] Use read_object_with_reference() in tar-tree
This patch replaces the usage of read_tree_with_tree_or_commit_sha1()
with read_object_with_reference() in tar-tree. As a result the code
that tries to figure out the commit time doesn't need to open the commit
object 'by hand' any more.