Now three types of path based URLs are supported:
gitweb.cgi/project.git
gitweb.cgi/project.git/branch
gitweb.cgi/project.git/branch/filename
The first one (show project summary) was already supported for a long time
now. The other two are new: they show the shortlog of a branch or
the plain file contents of some file contained in the repository.
This is especially useful to support project web pages for small
projects: just create an html branch and then use an URL like
gitweb.cgi/project.git/html/index.html.
Signed-off-by: Martin Waitz <tali@admingilde.org> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
apply --unidiff-zero: loosen sanity checks for --unidiff=0 patches
In "git-apply", we have a few sanity checks and heuristics that
expects that the patch fed to us is a unified diff with at least
one line of context.
* When there is no leading context line in a hunk, the hunk
must apply at the beginning of the preimage. Similarly, no
trailing context means that the hunk is anchored at the end.
* We learn a patch deletes the file from a hunk that has no
resulting line (i.e. all lines are prefixed with '-') if it
has not otherwise been known if the patch deletes the file.
Similarly, no old line means the file is being created.
And we declare an error condition when the file created by a
creation patch already exists, and/or when a deletion patch
still leaves content in the file.
These sanity checks are good safety measures, but breaks down
when people feed a diff generated with --unified=0. This was
recently noticed first by Matthew Wilcox and Gerrit Pape.
This adds a new flag, --unified-zero, to allow bypassing these
checks. If you are in control of the patch generation process,
you should not use --unified=0 patch and fix it up with this
flag; rather you should try work with a patch with context. But
if all you have to work with is a patch without context, this
flag may come handy as the last resort.
I had a hard time figuring out why this test was failing with
the packed-refs update without running it under "sh -x". This
makes output from "sh t1400-update-ref.sh -v" more descriptive.
Updating other tests would be a good janitorial task.
http-fetch.c: consolidate code to detect missing fetch target
At a handful places we check two error codes from curl library
to see if the file we asked was missing from the remote (e.g.
we asked for a loose object when it is in a pack) to decide what
to do next. This consolidates the check into a single function.
NOTE: the original did not check for HTTP_RETURNED_ERROR when
error code is 404, but this version does to make sure 404 is
from HTTP and not some other protcol.
An earlier commit cbd64af added a check that prevents "git-am"
to run without its standard input connected to a terminal while
resuming operation. This was to catch a user error to try
feeding a new patch from its standard input while recovery.
The assumption of the check was that it is an indication that a
new patch is being fed if the standard input is not connected to
a terminal. It is however not quite correct (the standard input
can be /dev/null if the user knows the operation does not need
any input, for example). This broke t3403 when the test was run
with its standard input connected to /dev/null.
When git-am is given an explicit command such as --skip, there
is no reason to insist that the standard input is a terminal; we
are not going to read a new patch anyway.
Credit goes to Gerrit Pape for noticing and reporting the
problem with t3403-rebase-skip test.
This allows you to maintain a few filesystem pathnames concurrently, by
simply replacing the single static "pathname" buffer with a LRU of four
buffers.
We did exactly the same thing with sha1_to_hex(), for pretty much exactly
the same reason. Sometimes you want to use two pathnames, and while it's
easy enough to xstrdup() them, why not just do the LU buffer thing.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
gitweb: Allow for href() to be used for links without project param
Make it possible to use href() subroutine to generate link with
query string which does not include project ('p') parameter.
href() used to add project=$project to its parameters, if it
was not set (to be more exact if $params{'project'} was false).
Now you can pass "project => undef" if you don't want for href()
to add project parameter to query string in the generated link.
Links to "project_list", "project_index" and "opml" (all related
to list of all projects/all git repositories) doesn't need project
parameter. Moreover "project_list" is default view (action) if
project ('p') parameter is not set, just like "summary" is default
view (action) if project is set; project list served as a kind
of "home" page for gitweb instalation, and links to "project_list"
view were done without specyfying it as an action.
Convert remaining links (except $home_link and anchor links)
to use href(); this required adding 'order => "o"' to @mapping
in href(). This finishes consolidation of URL generation.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
gitweb: Add git_project_index for generating index.aux
Add git_project_index, which generates index.aux file that can be used
as a source of projects list, instead of generating projects list from
a directory. Using file as a source of projects list allows for some
projects to be not present in gitweb main (project_list) page, and/or
correct project owner info. And is probably faster.
Additionally it can be used to get the list of all available repositories
for scripts (in easily parseable form).
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
gitweb: Do not parse refs by hand, use git-peek-remote instead
This is in response to Linus's work on packed refs. Additionally it
makes gitweb work with symrefs, too.
Do not parse refs by hand, using File::Find and reading individual
heads to get hash of reference, but use git-peek-remote output
instead. Assume that the hash for deref (with ^{}) always follows hash
for ref, and that we have derefs only for tag objects; this removes
call to git_get_type (and git-cat-file -t invocation) for tags, which
speeds "summary" and "tags" views generation, but might slow generation
of "heads" view a bit. For now, we do not save and use the deref hash.
Remove git_get_hash_by_ref while at it, as git_get_refs_list was the
only place it was used.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
gitweb: Use File::Find::find in git_get_projects_list
Earlier code to get list of projects when $projects_list is a
directory (e.g. when it is equal to $projectroot) had a hardcoded flat
(one level) list of directories. Allow for projects to be in
subdirectories also for $projects_list being a directory by using
File::Find.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
It turns out that I actually wanted to avoid the filenames (because I
didn't care - I just wanted to see the context in which something was
used) when doing a grep. But since "git grep" didn't take the "-h"
parameter, I ended up having to do "grep -5 -h *.c" instead.
So here's a trivial patch that adds "-h" (and thus has to enable -H too)
to "git grep" parsing.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Fetch over http from a repository that uses alternates to borrow
from neighbouring repositories were quite broken, apparently for
some time now.
We parse input and count bytes to allocate the new buffer, and
when we copy into that buffer we know exactly how many bytes we
want to copy from where. Using strlcpy for it was simply
stupid, and the code forgot to take it into account that strlcpy
terminated the string with NUL.
Fetch over http from a repository that uses alternates to borrow
from neighbouring repositories were quite broken, apparently for
some time now.
We parse input and count bytes to allocate the new buffer, and
when we copy into that buffer we know exactly how many bytes we
want to copy from where. Using strlcpy for it was simply
stupid, and the code forgot to take it into account that strlcpy
terminated the string with NUL.
Actually, teach runstatus what to do if it is not passed; it should not list
the contents of completely untracked directories, but only the name of that
directory (plus a trailing '/').
[jc: with comments by Jeff King to match hide-empty-directories
behaviour of the original.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
upload-archive: monitor child communication more carefully.
Franck noticed that the code around polling and relaying messages
from the child process was quite bogus. Here is an attempt to
clean it up a bit, based on his patch:
- When POLLHUP is set, it goes ahead and reads the file
descriptor. Worse yet, it does not check the return value of
read() for errors when it does.
- When we processed one POLLIN, we should just go back and see
if any more data is available. We can check if the child is
still there when poll gave control back at us but without any
actual input.
git_connect() can return 0 if we use git protocol for example.
Users of this function don't know and don't care if a process
had been created or not, and to avoid them to check it before
calling finish_connect() this patch allows finish_connect() to
take a null pid. And in that case return 0.
[jc: updated function signature of git_connect() with a comment on
its return value. ]
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
Fix a memory leak in "connect.c" and die if command too long.
Use "add_to_string" instead of "sq_quote" and "snprintf", so
that there is no memory allocation and no memory leak.
Also check if the command is too long to fit into the buffer
and die if this is the case, instead of truncating it to the
buffer size.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
git_history output is now divided into pages, like git_shortlog,
git_tags and git_heads output. As whole git-rev-list output is now
read into array before writing anything, it allows for better
signaling of errors.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
As pickaxe search (selected using undocumented 'pickaxe:' operator in
search query) is resource consuming, allow to turn it on/off using
feature meachanism. Turned on by default, for historical reasons.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
Add sideband status report to git-archive protocol
Using the refactored sideband code from existing upload-pack protocol,
this lets the error condition and status output sent from the remote
process to be shown locally.
* jc/sideband:
Prepare larger packet buffer for upload-pack protocol.
Move sideband server side support into reusable form.
Move sideband client side support into reusable form.
get_sha1_hex() micro-optimization
Prepare larger packet buffer for upload-pack protocol.
The original side-band support added to the upload-pack protocol used the
default 1000-byte packet length. The pkt-line format allows up to 64k, so
prepare the receiver for the maximum size, and have the uploader and
downloader negotiate if larger packet length is allowed.
Some people needed --exec to specify the location of the upload-pack
executable, because their default SSH log-in does not include the
directory they have their own private copy of git on the $PATH.
These people need to be able to say --exec to git-archive --remote
for the same reason.
Use xstrdup instead of strdup in builtin-{tar,zip}-tree.c
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from 5d2aea4cb383a43e40d47ab69d8ad7a495df6ea2 commit)
archive: allow remote to have more formats than we understand.
This fixes git-archive --remote not to parse archiver arguments;
otherwise if the remote end implements formats other than the
one known locally we will not be able to access that format.
This command implements the git archive protocol on the server
side. This command is not intended to be used by the end user.
Underlying git-archive command line options are sent over the
protocol from "git-archive --remote=...", just like upload-tar
currently does with "git-tar-tree=...".
As for "git-archive" command implementation, this new command
does not execute any existing "git-{tar,zip}-tree" but rely
on the archive API defined by "git-archive" patch. Hence we
get 2 good points:
- "git-archive" and "git-upload-archive" share all option
parsing code.
- All kind of git-upload-{tar,zip} can be deprecated.
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
git-archive is a command to make TAR and ZIP archives of a git tree.
It helps prevent a proliferation of git-{format}-tree commands.
Instead of directly calling git-{tar,zip}-tree command, it defines
a very simple API, that archiver should implement and register in
"git-archive.c". This API is made up by 2 functions whose prototype
is defined in "archive.h" file.
- The first one is used to parse 'extra' parameters which have
signification only for the specific archiver. That would allow
different archive backends to have different kind of options.
- The second one is used to ask to an archive backend to build
the archive given some already resolved parameters.
The main reason for making this API is to avoid using
git-{tar,zip}-tree commands, hence making them useless. Maybe it's
time for them to die ?
It also implements remote operations by defining a very simple
protocol: it first sends the name of the specific uploader followed
the repository name (git-upload-tar git://example.org/repo.git).
Then it sends options. It's done by sending a sequence of one
argument per packet, with prefix "argument ", followed by a flush.
The remote protocol is implemented in "git-archive.c" for client
side and is triggered by "--remote=<repo>" option. For example,
to fetch a TAR archive in a remote repo, you can issue:
$ git archive --format=tar --remote=git://xxx/yyy/zzz.git HEAD
We choose to not make a new command "git-fetch-archive" for example,
avoind one more GIT command which should be nice for users (less
commands to remember, keeps existing --remote option).
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com> Acked-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>
This creates a new git-runstatus which should do roughly the same thing
as the run_status function from git-commit.sh. Except for color support,
the main focus has been to keep the output identical, so that it can be
verified as correct and then used as a C platform for other improvements to
the status printing code.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
autoconf: Add support for setting NO_ICONV and ICONVDIR
Add support for ./configure options --without-iconv (if neither libc
nor libiconv properly support iconv), and for --with-iconv=PATH (to
set prefix to libiconv library and headers, used only when
NEED_LIBICONV is set). While at it, make ./configure set or unset
NO_ICONV always (it is not autodetected in Makefile).
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
which picks up all loose objects that are still live and creates
a new pack.
This implements --unpacked=<existing pack> option to tell the
revision walking machinery to pretend as if objects in such a
pack are unpacked for the purpose of object listing. With this,
we could say:
instead, to mean "all live loose objects but pretend as if
objects that are in this pack are also unpacked". The newly
created pack would be perfect for updating $active_pack by
replacing it.
Since pack-objects now knows how to do the rev-list's work
itself internally, you can also write the above example by:
Historically we did not allow binary patch applied without an
explicit permission from the user, and this flag was the way to
do so. This makes the flag a no-op by always allowing binary
patch application.
When we are generating packs to update remote repositories we
want to supply as much information as possible about the revisions
that already exist to rev-list in order optimise the pack as much
as possible. We need to pass two revisions for each branch we are
updating in the remote repository and one for each additional branch.
Where the remote repository has numerous branches we can run out
of command line space to pass them.
Utilise the git-rev-list --stdin mode to allow unlimited numbers
of revision constraints. This allows us to move back to the much
simpler unordered revision selection code.
[jc: added some comments in the code to describe the pipe flow
a bit.]
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
git-repack: create new packs inside $GIT_DIR, not cwd
Avoid failing when cwd is !writable by writing the
packfiles in $GIT_DIR, which is more in line with other commands.
Without this, git-repack was failing when run from crontab
by non-root user accounts. For large repositories, this
also makes the mv operation a lot cheaper, and avoids leaving
temp packfiles around the fs upon failure.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz> Signed-off-by: Junio C Hamano <junkio@cox.net>
Teach rev-list an option to read revs from the standard input.
When --stdin option is given, in addition to the <rev>s listed
on the command line, the command can read one rev parameter per
line from the standard input. The list of revs ends at the
first empty line or EOF.
Note that you still have to give all the flags from the command
line; only rev arguments (including A..B, A...B, and A^@ notations)
can be give from the standard input.
revision.c: allow injecting revision parameters after setup_revisions().
setup_revisions() wants to get all the parameters at once and
then postprocesses the resulting revs structure after it is done
with them. This code structure is a bit cumbersome to deal with
efficiently when we want to inject revision parameters from the
side (e.g. read from standard input).
Fortunately, the nature of this postprocessing is not affected by
revision parameters; they are affected only by flags. So it is
Ok to do add_object() after the it returns.
This splits out the code that deals with the revision parameter
out of the main loop of setup_revisions(), so that we can later
call it from elsewhere after it returns.
When build a pack for a push we query the remote copy for existant
heads. These are used to prune unnecessary objects from the pack.
As we receive the remote references in get_remote_heads() we validate
the reference names via check_ref() which includes a length check;
rejecting those >45 characters in size.
This is a miss converted change, it was originally designed to reject
messages which were less than 45 characters in length (a 40 character
sha1 and refs/) to prevent comparing unitialised memory. check_ref()
now gets the raw length so check for at least 5 characters.
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
diff-index --cc shows a 3-way diff between HEAD, index and working tree.
This implements a 3-way diff between the HEAD commit, the state in the
index, and the working directory. This is like the n-way diff for a
merge, and uses much of the same code. It is invoked with the -c flag
to git-diff-index, which it already accepted and did nothing with.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/pack:
more lightweight revalidation while reusing deflated stream in packing
pack-objects: fix thinko in revalidate code
pack-objects: re-validate data we copy from elsewhere.
gitweb: Divide page path into directories -- path's "breadcrumbs"
Divide page path into directories, so that each part of path links to
the "tree" view of the $hash_base (or HEAD, if $hash_base is not set)
version of the directory.
If the entity is blob, final part (basename) links to $hash_base or
HEAD revision of the "raw" blob ("blob_plain" view). If the entity is
tree, link to the "tree" view.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
If you try to fsck a repository that isn't entirely empty, but that has no
inter-object references (ie all the objects are blobs, and don't refer to
anything else), git-fsck-objects currently fails.
This probably cannot happen in practice, but can be tested with something
like
where the fsck will die by a divide-by-zero when it tries to look up the
references from the one object it found (hash_obj() will do a modulus by
refs_hash_size).
On some other archiectures (ppc, sparc) the divide-by-zero will go
unnoticed, and we'll instead SIGSEGV when we hit the "refs_hash[j]"
access.
So move the test that should protect against this from mark_reachable()
into lookup_object_refs(), which incidentally in the process also fixes
mark_reachable() itself (it used to not mark the one object that _was_
reachable, because it decided that it had no refs too early).
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
unpack-objects desperately salvages objects from a corrupt pack
The command unpack-objects dies upon the first error. This is
probably considered a feature -- if a pack is corrupt, instead
of trying to extract from it and possibly risking to contaminate
a good repository with objects whose validity is dubious, we
should seek a good copy of the pack and retry. However, we may
not have any good copy anywhere. This implements the last
resort effort to extract what are salvageable from such a
corrupt pack.
This flag might have helped Sergio when recovering from a
corrupt pack. In my test, it managed to salvage 247 objects out
of a pack that had 251 objects but without it the command
stopped after extracting 73 objects.
more lightweight revalidation while reusing deflated stream in packing
When copying from an existing pack and when copying from a loose
object with new style header, the code makes sure that the piece
we are going to copy out inflates well and inflate() consumes
the data in full while doing so.
The check to see if the xdelta really apply is quite expensive
as you described, because you would need to have the image of
the base object which can be represented as a delta against
something else.
gitweb: Change the name of diff to parent link in "commit" view to "diff
Change the name of diff to parent (current commit to one of parents)
link in "commit" view (git_commit subroutine) from "commitdiff" to
"diff". Let's leave "commitdiff" for equivalent of git-show, or
git-diff-tree with one revision, i.e. diff for a given commit to its
parent (parents).
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
When revalidating an entry from an existing pack entry->size and
entry->type are not necessarily the size of the final object
when the entry is deltified, but for base objects they must
match.
* master:
Trace into a file or an open fd and refactor tracing code.
Replace uses of strdup with xstrdup.
consolidate two copies of new style object header parsing code.
Documentation: Fix howto/revert-branch-rebase.html generation
fmt-merge-msg: fix off-by-one bug
git-rev-list(1): group options; reformat; document more options
Constness tightening for move/link_temp_to_file()
gitweb: Fix git_blame
Include config.mak.autogen in the doc Makefile
Use xmalloc instead of malloc
git(7): move gitk(1) to the list of porcelain commands
gitk: Fix some bugs in the new cherry-picking code
gitk: Improve responsiveness while reading and layout out the graph
gitk: Update preceding/following tag info when creating a tag
gitk: Add a menu item for cherry-picking commits
gitk: Fix a couple of buglets in the branch head menu items
gitk: Add a context menu for heads
gitk: Add a row context-menu item for creating a new branch
gitk: Recompute ancestor/descendent heads/tags when rereading refs
gitk: Minor cleanups
pack-objects: re-validate data we copy from elsewhere.
When reusing data from an existing pack and from a new style
loose objects, we used to just copy it staight into the
resulting pack. Instead make sure they are not corrupt, but
do so only when we are not streaming to stdout, in which case
the receiving end will do the validation either by unpacking
the stream or by constructing the .idx file.
Like xmalloc and xrealloc xstrdup dies with a useful message if
the native strdup() implementation returns NULL rather than a
valid pointer.
I just tried to use xstrdup in new code and found it to be missing.
However I expected it to be present as xmalloc and xrealloc are
already commonly used throughout the code.
[jc: removed the part that deals with last_XXX, which I am
finding more and more dubious these days.]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
consolidate two copies of new style object header parsing code.
Also while we are at it, remove redundant typename[] array from
unpack_sha1_header. The only reason it is different from the
type_names[] array in object.c module is that this code cares
about the subset of object types that are valid in a loose
object, so prepare a separate array of boolean that tells us
which types are valid, and share the name translation with the
others.