Clarify dual license status of subprocess.py file.
The author of the file we stole from Python 2.4 distribution, Peter
Astrand <astrand@lysator.liu.se>, OK'ed to add this at the end of the
licensing terms section of the file:
Use of this file within git is permitted under GPLv2.
[PATCH] Teach "git-rev-parse" about date-based cut-offs
This adds the options "--since=date" and "--before=date" to git-rev-parse,
which knows how to translate them into seconds since the epoch for
git-rev-list.
With this, you can do
git log --since="2 weeks ago"
or
git log --until=yesterday
to show the commits that have happened in the last two weeks or are
older than 24 hours, respectively.
The flags "--after=" and "--before" are synonyms for --since and --until,
and you can combine them, so
git log --after="Aug 5" --before="Aug 10"
is a valid (but strange) thing to do.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
This is my ARM assembly SHA1 implementation for GIT. It is approximately
50% faster than the generic C version. On an XScale processor running at
400MHz:
generic C version: 9.8 MB/s
my version: 14.5 MB/s
It's not that I expect a lot of big GIT users on ARM, but I stillknow
about one important ARM user that might benefit from it, and writing
that code was fun.
I also reworked the makefile a bit so any optimized SHA1 implementations
is used regardless of whether NO_OPENSSL is defined or not.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
[PATCH] Make the git-fsck-objects diagnostics more useful
Actually report what exactly is wrong with the object, instead of an
ambiguous 'bad sha1 file' or such. In places where we already do, unify
the format and clean the messages up.
Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net>
[PATCH] Return proper error valud from "parse_date()"
Right now we don't return any error value at all from parse_date(), and if
we can't parse it, we just silently leave the result buffer unchanged.
That's fine for the current user, which will always default to the current
date, but it's a crappy interface, and we might well be better off with an
error message rather than just the default date.
So let's change the thing to return a negative value if an error occurs,
and the length of the result otherwise (snprintf behaviour: if the buffer
is too small, it returns how big it _would_ have been).
[ I started looking at this in case we could support date-based revision
names. Looks ugly. Would have to parse relative dates.. ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Add -m/--modified to show files that have been modified wrt. the index.
[jc: The original came from Brian Gerst on Sep 1st but it only checked
if the paths were cache dirty without actually checking the files were
modified. I also added the usage string and a new test.]
If the length in the stat information does not match what is recorded
in the index, there is no point rehashing the contents to see if the
index entry can be refreshed.
We need to be a bit careful. Immediately after read-tree or
checkout-index without -u, ce_size is set to zero and does not match
the length of the blob that is recorded, and we need to actually look
at the contents to see if it has been changed.
Use GECOS field a bit better to produce default human readable name.
This updates the default human readable name we generate from GECOS
field. We assume the "full-name, followed by additional information
separated by commas" format, with an & expanding to the capitalized
login name.
Somehow I missed it when we updated read-tree to support the recursive
merge strategy. Also -i should require -m as well, which the command
did not check.
[PATCH] Documentation: Add asciidoc.conf file and gitlink: macro
Introduce an asciidoc.conf file with the purpose of adding a gitlink:
macro which will improve the manpage output.
Original cogito patch by Jonas Fonseca <fonseca@diku.dk>;
asciidoc.conf from that patch was further enhanced to use the proper
DocBook tag <citerefentry> for references to man pages.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Junio C Hamano <junkio@cox.net>
The base target directory for the templates copying was initialized
to git_dir, but git_dir[len] is not zero but / at the time we do the
initialization. This is not what we want for our target directory string
since we pass it to mkdir(), so make it zero-terminated manually.
Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net>
[PATCH] Remove total confusion from "git checkout"
The target to check out does not need to be a branch. The _result_ of the
checkout needs to be a branch. Don't confuse the two, and then insult the
user.
Insulting is ok, but I personally get really pissed off is a tool is both
confused and insulting. At least be _correct_ and insulting.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes git-update-index error reporting much less confusing. The
user will know what went wrong with better precision, and will be given
a hopefully less confusing advice.
Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes everybodys favourite complaint about "git add", namely that it
doesn't take directories.
We use "git-ls-files --others" to generate an arbitrary list of filenames,
and thus also automatically honor ignore-files while we're at it.
Side note: there's a lot of room for improvement here. In particular, if
we have a long list of filenames (importing a big archive), this will just
do a big stupid for-loop and add them one at a time. Maybe it should use
generate-list | xargs -0 git-update-idex --add --
instead.
Also, I think we should have a default ignore list if we don't find a
.git/info/exclude file. Ignoring "*.o" and ".*" by default would probably
be the right thing to do.
But I think this is a good first step.
Use the "-n" flag to just show the list of files to be added without
adding them.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
[PATCH] Support alternates and http-alternates in http-fetch
This allows the remote repository to refer to additional repositories
in a file objects/info/http-alternates or
objects/info/alternates. Each line may be:
a relative path, starting with ../, to get from the objects directory
of the starting repository to the objects directory of the added
repository.
an absolute path of the objects directory of the added repository (on
the same server).
(only in http-alternates) a full URL of the objects directory of the
added repository.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
The recent safety check to trust only the commits we have made
things impossibly slow and turn out to waste a lot of memory.
This commit fixes it with the following improvements:
- mark already scanned objects and avoid rescanning the same
object again;
- free the tree entries when we have scanned the tree entries;
this is the same as b0d8923ec01fd91b75ab079034f89ced91500157
which reduced memory usage by rev-list;
- plug memory leak from the object_list dequeuing code;
- use the process_queue not just for fetching but for scanning,
to make things tail recursive to avoid deep recursion; the
deep recursion was especially prominent when we cloned a big
pack.
- avoid has_sha1_file() call when we already know we do not have
that object.
Using "git repack -a -d" can destroy your git archive if you use it
twice in succession, because the new pack can be called the same as
the old pack. Found by Linus.
As a convenience measure, 'bisect bad' or 'bisect good' automatically
does 'bisect next' when it knows it can, but the result of that test
to see if it can was leaking through as the exit code from the whole
thing, which was bad. Noticed by Anton Blanchard.
This tries .../objects/info/http-alternates and then
.../objects/info/alternates, looking for a file which specifies where
else to download objects and packs from.
It currently only supports absolute paths, and doesn't support full URLs.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a nicer fix for git-shortlog being unable to handle the raw log
format. Just use a more permissive regexp instead of doing two nearly
identical ones.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
There were two bugs in there:
- if the range didn't end up working, we restored the '.' character in
the wrong place.
- an empty end-of-range should be interpreted as HEAD.
See rev-parse.c for the reference implementation of this.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
For local operations and downloading and uploading via git aware protocols,
use of $GIT_OBJECT_DIRECTORY/info/alternates is recommended on the server
side for big projects that are derived from another one (like Linux kernel).
However, dumb protocols and rsync transport needs to resolve this on the
client end, which we did not bother doing until this week.
I noticed we use "rsync -z" but most of our payload is already compressed,
which was not quite right. This commit also fixes it.
and note how the number of pages touched by git-rev-list for this
particular object list has shrunk from 26,718 (104 MB) to 18,509 (72 MB).
Calculating the total object difference between two git revisions is still
clearly the most expensive git operation (both in memory and CPU time),
but it's now less than 40% of what it used to be.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
This avoids keeping tree entries around, and free's them as it traverses
the list. This avoids building up a huge memory footprint just for these
small but very common allocations.
Note how the minor fault numbers - which ends up being how many pages we
needed to map - go down from 42934 (167 MB) to 26718 (104 MB). That is:
Before:
42934 minor pagefaults
After:
26718 minor pagefaults
This is all in _addition_ to the previous fixes. It used to be
~48,000 pagefaults.
That's still a honking big memory footprint, but it's about half of what
it was just a day or two ago (and this is the object list for a pretty big
update - almost 60,000 objects. Smaller updates need less memory).
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Update git-core spec file based on feedback from Fedora Extras review.
- update BuildRoot to be more specific
- eliminate Requires that must be satisfied for base system install
- drop Vendor
- use dist tag to differentiate between branches
- own %{_datadir}/git-core/
- use RPM_OPT_FLAGS in spec file
Signed-off-by: Chris Wright <chrisw@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
An earlier commit causes a mismatch in <emphasis> and <superscript>
tags, one way of fixing it is having no more than one caret symbol per
line, which is the only solution I found in the asciidoc
documentation. Ugly, but it works.
[jc: ugly indeed but that is not Peter's fault.]
Signed-off-by: Peter Hagervall <hager@cs.umu.se> Signed-off-by: Junio C Hamano <junkio@cox.net>
The logic to calculate the full object list used to be very inter-twined
with the logic that looked up the commits.
For no good reason - it's actually a lot simpler to just do that logic
as a separate pass.
This improves performance a bit, and uses slightly less memory in my
tests, but more importantly it makes the code simpler to work with and
follow what it does.
The performance win is less than I had hoped for, but I get:
ie it improved by 2 seconds, and took a 5000+ fewer pages (hey, that's
20MB out of 174MB to go). And got the same number of objects (in theory,
the more expensive one might find some more shared objects to avoid. In
practice it obviously doesn't).
I know how to make it use _lots_ less memory, which will probably speed it
up. But that's for another time, and I'd prefer to see this go in first.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
As pointed out on the list, git-rev-list can use a lot of memory.
One low-hanging fruit is to free the commit buffer for commits that we
parse. By default, parse_commit() will save away the buffer, since a lot
of cases do want it, and re-reading it continually would be unnecessary.
However, in many cases the buffer isn't actually necessary and saving it
just wastes memory.
We could just free the buffer ourselves, but especially in git-rev-list,
we actually end up using the helper functions that automatically add
parent commits to the commit lists, so we don't actually control the
commit parsing directly.
Instead, just make this behaviour of "parse_commit()" a global flag.
Maybe this is a bit tasteless, but it's very simple, and it makes a
noticable difference in memory usage.
note how the minor faults have decreased from 3714 pages to 2433 pages.
That's all due to the fewer anonymous pages allocated to hold the comment
buffers and their metadata.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Be more backward compatible with git-ssh-{push,pull}.
HPA reminded me that these programs knows about the name of the
counterpart on the other end and simply symlinking the old name to
new name locally would not be enough.
This patch does proper quoting, and uses "env" to be compatible with
tcsh. As a side benefit, I believe the code is a lot cleaner to read.
[jc: I am accepting this not because I necessarily agree with the
quoting approach taken by it, but because (1) the code is only used
by ssh-fetch/ssh-upload pair which I do not care much about (if you
have ssh account on the remote end you should be using git-send-pack
git-fetch-pack pair over ssh anyway), and (2) HPA is one of the more
important customers belonging to the Linux kernel community and I
want to help his workflow -- which includes not wasting his time by
asking him to switch to git-send-pack/git-fetch-pack pair, nor to use
a better shell ;-). I might not have taken this patch if it mucked
with git_connect in connect.c in its current form.]
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
The code did not catch the case where you removed an existing ref
without changing anything else. We are not talking about hundreds of
refs anyway, so remove that optimization.
[PATCH] git-http-fetch: Allow caching of retrieved objects by proxy servers
By default the curl library adds "Pragma: no-cache" header to all
requests, which disables caching by proxy servers. However, most
files in a GIT repository are immutable, and caching them is safe and
could be useful.
This patch removes the "Pragma: no-cache" header from requests for all
files except the pack list (objects/info/packs) and references
(refs/*), which are really mutable and should not be cached.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from 3b2a4c46fd5093ec79fb60e1b14b8d4a58c74612 commit)
The new flag '-d' lets you delete a branch. For safety, it does not
lets you delete the branch you are currently on, nor a branch that
has been fully merged into your current branch.
The credit for the safety check idea goes to Daniel Barkalow.
An entry in the alternates file can name a directory relative to
the object store it describes. A typical linux-2.6 maintainer
repository would have "../../../torvalds/linux-2.6.git/objects" there,
because the subsystem maintainer object store would live in
This unfortunately is different from GIT_ALTERNATE_OBJECT_DIRECTORIES
which is relative to the cwd of the running process, but there is no
way to make it consistent with the behaviour of the environment
variable. The process typically is run in $system.git/ directory for
a naked repository, or one level up for a repository with a working
tree, so we just define it to be relative to the objects/ directory
to be different from either ;-).
Later, the dumb transport could be updated to read from info/alternates
and make requests for the repository the repository borrows from.
* It takes scripts to unexpected places (somebody had
CDPATH=..:../..:$HOME and the "cd" in git-clone.sh:get_repo_base
took him to $HOME/.git when he said "clone foo bar" to clone a
repository in "foo" which had "foo/.git"). CDPATH mechanism does
not implicitly give "." at the beginning of CDPATH, which is
the most irritating part.
* The extra echo when it does its thing confuses scripts further.
Most of our scripts that use "cd" includes git-sh-setup so the problem
is primarily fixed there. git-clone starts without a repository, and
it needs its own fix.
[PATCH] Make 'git checkout' a bit more forgiving when switching branches.
If you make a commit on a path, and then make the path
cache-dirty afterwards without changing its contents, 'git
checkout' to switch to another branch is prevented because
switching the branches done with 'read-tree -m -u $current
$next' detects that the path is cache-dirty, but it does not
bother noticing that the contents of the path has not been
actualy changed.
Since switching branches would involve checking out paths
different in the two branches, hence it is reasonably expensive
operation, we can afford to run update-index before running
read-tree to reduce this kind of false change from triggering
the check needlessly.
[PATCH] Omit patches that have already been merged from format-patch output.
This switches the logic to pick which commits to include in the output
from git-rev-list to git-cherry; as a side effect, 'format-patch ^up mine'
would stop working although up..mine would continue to work.
[PATCH] There are several undocumented dependencies
There are several undocumented dependencies in the .spec and in the
INSTALL files. The following is from Fedora, perhaps other RPM
distributions call the packages differently.
Also, the manpages aren't always installed gzipped.
Updates to git-core.spec.in file:
- Some git scripts use Perl
- gitk needs wish, which is part of TCL/Tk.
- curl is used all over
- Need the ssh program from openssh-clients
Updates to INSTALL:
- Mention wish
- Mention ssh
Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
This allows any arbitrary flags to "grep", and knows about the few
special grep flags that take an argument too.
It also allows some flags for git-ls-files, although their usefulness
is questionable.
With this, something line
git grep -w -1 pattern
works, without the script enumerating every possible flag.
[jc: this is the version Linus sent out after I showed him a
barf-o-meter test version that avoids shell arrays. He must
have typed this version blindly, since he said:
I'm not barfing, but that's probably because my brain just shut
down and is desperately trying to gouge my eyes out with a spoon.
I slightly fixed it to catch the remaining arguments meant to be
given git-ls-files.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
This should work around the compilation problem Johannes Schindelin
and others had on Mac OS/X.
Quoting Linus:
Any operating system where socklen_t is anything else than "int" is
terminally broken. The people who introduced that typedef were confused,
and I actually had to argue with them that it was fundamentally wrong:
there is no other valid type than "int" that makes sense for it.
When the git diff status 'N' was changed to 'A', diff-helper.c was
not updated accordingly. This means that it no longer shows the
diff for newly added files.
This patch makes that change in diff-helper.c.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Junio C Hamano <junkio@cox.net>