Fetch over http from a repository that uses alternates to borrow
from neighbouring repositories were quite broken, apparently for
some time now.
We parse input and count bytes to allocate the new buffer, and
when we copy into that buffer we know exactly how many bytes we
want to copy from where. Using strlcpy for it was simply
stupid, and the code forgot to take it into account that strlcpy
terminated the string with NUL.
Fetch over http from a repository that uses alternates to borrow
from neighbouring repositories were quite broken, apparently for
some time now.
We parse input and count bytes to allocate the new buffer, and
when we copy into that buffer we know exactly how many bytes we
want to copy from where. Using strlcpy for it was simply
stupid, and the code forgot to take it into account that strlcpy
terminated the string with NUL.
Actually, teach runstatus what to do if it is not passed; it should not list
the contents of completely untracked directories, but only the name of that
directory (plus a trailing '/').
[jc: with comments by Jeff King to match hide-empty-directories
behaviour of the original.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
upload-archive: monitor child communication more carefully.
Franck noticed that the code around polling and relaying messages
from the child process was quite bogus. Here is an attempt to
clean it up a bit, based on his patch:
- When POLLHUP is set, it goes ahead and reads the file
descriptor. Worse yet, it does not check the return value of
read() for errors when it does.
- When we processed one POLLIN, we should just go back and see
if any more data is available. We can check if the child is
still there when poll gave control back at us but without any
actual input.
git_connect() can return 0 if we use git protocol for example.
Users of this function don't know and don't care if a process
had been created or not, and to avoid them to check it before
calling finish_connect() this patch allows finish_connect() to
take a null pid. And in that case return 0.
[jc: updated function signature of git_connect() with a comment on
its return value. ]
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
Fix a memory leak in "connect.c" and die if command too long.
Use "add_to_string" instead of "sq_quote" and "snprintf", so
that there is no memory allocation and no memory leak.
Also check if the command is too long to fit into the buffer
and die if this is the case, instead of truncating it to the
buffer size.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
git_history output is now divided into pages, like git_shortlog,
git_tags and git_heads output. As whole git-rev-list output is now
read into array before writing anything, it allows for better
signaling of errors.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
As pickaxe search (selected using undocumented 'pickaxe:' operator in
search query) is resource consuming, allow to turn it on/off using
feature meachanism. Turned on by default, for historical reasons.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
Add sideband status report to git-archive protocol
Using the refactored sideband code from existing upload-pack protocol,
this lets the error condition and status output sent from the remote
process to be shown locally.
* jc/sideband:
Prepare larger packet buffer for upload-pack protocol.
Move sideband server side support into reusable form.
Move sideband client side support into reusable form.
get_sha1_hex() micro-optimization
Prepare larger packet buffer for upload-pack protocol.
The original side-band support added to the upload-pack protocol used the
default 1000-byte packet length. The pkt-line format allows up to 64k, so
prepare the receiver for the maximum size, and have the uploader and
downloader negotiate if larger packet length is allowed.
Some people needed --exec to specify the location of the upload-pack
executable, because their default SSH log-in does not include the
directory they have their own private copy of git on the $PATH.
These people need to be able to say --exec to git-archive --remote
for the same reason.
Use xstrdup instead of strdup in builtin-{tar,zip}-tree.c
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from 5d2aea4cb383a43e40d47ab69d8ad7a495df6ea2 commit)
archive: allow remote to have more formats than we understand.
This fixes git-archive --remote not to parse archiver arguments;
otherwise if the remote end implements formats other than the
one known locally we will not be able to access that format.
This command implements the git archive protocol on the server
side. This command is not intended to be used by the end user.
Underlying git-archive command line options are sent over the
protocol from "git-archive --remote=...", just like upload-tar
currently does with "git-tar-tree=...".
As for "git-archive" command implementation, this new command
does not execute any existing "git-{tar,zip}-tree" but rely
on the archive API defined by "git-archive" patch. Hence we
get 2 good points:
- "git-archive" and "git-upload-archive" share all option
parsing code.
- All kind of git-upload-{tar,zip} can be deprecated.
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
git-archive is a command to make TAR and ZIP archives of a git tree.
It helps prevent a proliferation of git-{format}-tree commands.
Instead of directly calling git-{tar,zip}-tree command, it defines
a very simple API, that archiver should implement and register in
"git-archive.c". This API is made up by 2 functions whose prototype
is defined in "archive.h" file.
- The first one is used to parse 'extra' parameters which have
signification only for the specific archiver. That would allow
different archive backends to have different kind of options.
- The second one is used to ask to an archive backend to build
the archive given some already resolved parameters.
The main reason for making this API is to avoid using
git-{tar,zip}-tree commands, hence making them useless. Maybe it's
time for them to die ?
It also implements remote operations by defining a very simple
protocol: it first sends the name of the specific uploader followed
the repository name (git-upload-tar git://example.org/repo.git).
Then it sends options. It's done by sending a sequence of one
argument per packet, with prefix "argument ", followed by a flush.
The remote protocol is implemented in "git-archive.c" for client
side and is triggered by "--remote=<repo>" option. For example,
to fetch a TAR archive in a remote repo, you can issue:
$ git archive --format=tar --remote=git://xxx/yyy/zzz.git HEAD
We choose to not make a new command "git-fetch-archive" for example,
avoind one more GIT command which should be nice for users (less
commands to remember, keeps existing --remote option).
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com> Acked-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>
This creates a new git-runstatus which should do roughly the same thing
as the run_status function from git-commit.sh. Except for color support,
the main focus has been to keep the output identical, so that it can be
verified as correct and then used as a C platform for other improvements to
the status printing code.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
autoconf: Add support for setting NO_ICONV and ICONVDIR
Add support for ./configure options --without-iconv (if neither libc
nor libiconv properly support iconv), and for --with-iconv=PATH (to
set prefix to libiconv library and headers, used only when
NEED_LIBICONV is set). While at it, make ./configure set or unset
NO_ICONV always (it is not autodetected in Makefile).
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
which picks up all loose objects that are still live and creates
a new pack.
This implements --unpacked=<existing pack> option to tell the
revision walking machinery to pretend as if objects in such a
pack are unpacked for the purpose of object listing. With this,
we could say:
instead, to mean "all live loose objects but pretend as if
objects that are in this pack are also unpacked". The newly
created pack would be perfect for updating $active_pack by
replacing it.
Since pack-objects now knows how to do the rev-list's work
itself internally, you can also write the above example by:
Historically we did not allow binary patch applied without an
explicit permission from the user, and this flag was the way to
do so. This makes the flag a no-op by always allowing binary
patch application.
When we are generating packs to update remote repositories we
want to supply as much information as possible about the revisions
that already exist to rev-list in order optimise the pack as much
as possible. We need to pass two revisions for each branch we are
updating in the remote repository and one for each additional branch.
Where the remote repository has numerous branches we can run out
of command line space to pass them.
Utilise the git-rev-list --stdin mode to allow unlimited numbers
of revision constraints. This allows us to move back to the much
simpler unordered revision selection code.
[jc: added some comments in the code to describe the pipe flow
a bit.]
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
git-repack: create new packs inside $GIT_DIR, not cwd
Avoid failing when cwd is !writable by writing the
packfiles in $GIT_DIR, which is more in line with other commands.
Without this, git-repack was failing when run from crontab
by non-root user accounts. For large repositories, this
also makes the mv operation a lot cheaper, and avoids leaving
temp packfiles around the fs upon failure.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz> Signed-off-by: Junio C Hamano <junkio@cox.net>
Teach rev-list an option to read revs from the standard input.
When --stdin option is given, in addition to the <rev>s listed
on the command line, the command can read one rev parameter per
line from the standard input. The list of revs ends at the
first empty line or EOF.
Note that you still have to give all the flags from the command
line; only rev arguments (including A..B, A...B, and A^@ notations)
can be give from the standard input.
revision.c: allow injecting revision parameters after setup_revisions().
setup_revisions() wants to get all the parameters at once and
then postprocesses the resulting revs structure after it is done
with them. This code structure is a bit cumbersome to deal with
efficiently when we want to inject revision parameters from the
side (e.g. read from standard input).
Fortunately, the nature of this postprocessing is not affected by
revision parameters; they are affected only by flags. So it is
Ok to do add_object() after the it returns.
This splits out the code that deals with the revision parameter
out of the main loop of setup_revisions(), so that we can later
call it from elsewhere after it returns.
When build a pack for a push we query the remote copy for existant
heads. These are used to prune unnecessary objects from the pack.
As we receive the remote references in get_remote_heads() we validate
the reference names via check_ref() which includes a length check;
rejecting those >45 characters in size.
This is a miss converted change, it was originally designed to reject
messages which were less than 45 characters in length (a 40 character
sha1 and refs/) to prevent comparing unitialised memory. check_ref()
now gets the raw length so check for at least 5 characters.
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
diff-index --cc shows a 3-way diff between HEAD, index and working tree.
This implements a 3-way diff between the HEAD commit, the state in the
index, and the working directory. This is like the n-way diff for a
merge, and uses much of the same code. It is invoked with the -c flag
to git-diff-index, which it already accepted and did nothing with.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/pack:
more lightweight revalidation while reusing deflated stream in packing
pack-objects: fix thinko in revalidate code
pack-objects: re-validate data we copy from elsewhere.
gitweb: Divide page path into directories -- path's "breadcrumbs"
Divide page path into directories, so that each part of path links to
the "tree" view of the $hash_base (or HEAD, if $hash_base is not set)
version of the directory.
If the entity is blob, final part (basename) links to $hash_base or
HEAD revision of the "raw" blob ("blob_plain" view). If the entity is
tree, link to the "tree" view.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
If you try to fsck a repository that isn't entirely empty, but that has no
inter-object references (ie all the objects are blobs, and don't refer to
anything else), git-fsck-objects currently fails.
This probably cannot happen in practice, but can be tested with something
like
where the fsck will die by a divide-by-zero when it tries to look up the
references from the one object it found (hash_obj() will do a modulus by
refs_hash_size).
On some other archiectures (ppc, sparc) the divide-by-zero will go
unnoticed, and we'll instead SIGSEGV when we hit the "refs_hash[j]"
access.
So move the test that should protect against this from mark_reachable()
into lookup_object_refs(), which incidentally in the process also fixes
mark_reachable() itself (it used to not mark the one object that _was_
reachable, because it decided that it had no refs too early).
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
unpack-objects desperately salvages objects from a corrupt pack
The command unpack-objects dies upon the first error. This is
probably considered a feature -- if a pack is corrupt, instead
of trying to extract from it and possibly risking to contaminate
a good repository with objects whose validity is dubious, we
should seek a good copy of the pack and retry. However, we may
not have any good copy anywhere. This implements the last
resort effort to extract what are salvageable from such a
corrupt pack.
This flag might have helped Sergio when recovering from a
corrupt pack. In my test, it managed to salvage 247 objects out
of a pack that had 251 objects but without it the command
stopped after extracting 73 objects.
more lightweight revalidation while reusing deflated stream in packing
When copying from an existing pack and when copying from a loose
object with new style header, the code makes sure that the piece
we are going to copy out inflates well and inflate() consumes
the data in full while doing so.
The check to see if the xdelta really apply is quite expensive
as you described, because you would need to have the image of
the base object which can be represented as a delta against
something else.
gitweb: Change the name of diff to parent link in "commit" view to "diff
Change the name of diff to parent (current commit to one of parents)
link in "commit" view (git_commit subroutine) from "commitdiff" to
"diff". Let's leave "commitdiff" for equivalent of git-show, or
git-diff-tree with one revision, i.e. diff for a given commit to its
parent (parents).
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
When revalidating an entry from an existing pack entry->size and
entry->type are not necessarily the size of the final object
when the entry is deltified, but for base objects they must
match.
* master:
Trace into a file or an open fd and refactor tracing code.
Replace uses of strdup with xstrdup.
consolidate two copies of new style object header parsing code.
Documentation: Fix howto/revert-branch-rebase.html generation
fmt-merge-msg: fix off-by-one bug
git-rev-list(1): group options; reformat; document more options
Constness tightening for move/link_temp_to_file()
gitweb: Fix git_blame
Include config.mak.autogen in the doc Makefile
Use xmalloc instead of malloc
git(7): move gitk(1) to the list of porcelain commands
gitk: Fix some bugs in the new cherry-picking code
gitk: Improve responsiveness while reading and layout out the graph
gitk: Update preceding/following tag info when creating a tag
gitk: Add a menu item for cherry-picking commits
gitk: Fix a couple of buglets in the branch head menu items
gitk: Add a context menu for heads
gitk: Add a row context-menu item for creating a new branch
gitk: Recompute ancestor/descendent heads/tags when rereading refs
gitk: Minor cleanups
pack-objects: re-validate data we copy from elsewhere.
When reusing data from an existing pack and from a new style
loose objects, we used to just copy it staight into the
resulting pack. Instead make sure they are not corrupt, but
do so only when we are not streaming to stdout, in which case
the receiving end will do the validation either by unpacking
the stream or by constructing the .idx file.
Like xmalloc and xrealloc xstrdup dies with a useful message if
the native strdup() implementation returns NULL rather than a
valid pointer.
I just tried to use xstrdup in new code and found it to be missing.
However I expected it to be present as xmalloc and xrealloc are
already commonly used throughout the code.
[jc: removed the part that deals with last_XXX, which I am
finding more and more dubious these days.]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
consolidate two copies of new style object header parsing code.
Also while we are at it, remove redundant typename[] array from
unpack_sha1_header. The only reason it is different from the
type_names[] array in object.c module is that this code cares
about the subset of object types that are valid in a loose
object, so prepare a separate array of boolean that tells us
which types are valid, and share the name translation with the
others.
The rule for howto/*.html used "$?", which expands to the list of all
newer prerequisites, including asciidoc.conf added by another rule.
"$<" should be used instead.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Junio C Hamano <junkio@cox.net>
* git://git.kernel.org/pub/scm/gitk/gitk:
gitk: Fix some bugs in the new cherry-picking code
gitk: Improve responsiveness while reading and layout out the graph
gitk: Update preceding/following tag info when creating a tag
gitk: Add a menu item for cherry-picking commits
gitk: Fix a couple of buglets in the branch head menu items
gitk: Add a context menu for heads
gitk: Add a row context-menu item for creating a new branch
gitk: Recompute ancestor/descendent heads/tags when rereading refs
gitk: Minor cleanups
Now if GIT_TRACE is set to an integer value greater than 1
and lower than 10, we interpret this as an open fd value
and we trace into it. Note that this behavior is not
compatible with the previous one.
We also trace whole messages using one write(2) call to
make sure messages from processes do net get mixed up in
the middle.
It's now possible to run the tests like this:
GIT_TRACE=9 make test 9>/var/tmp/trace.log
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
The intention of the test seems to be to build a long chain of
clones that locally borrow objects from their parents and see the
system give up dereferencing long chains. There were two problems:
(1) it did not test the right repository;
(2) it did not build a chain long enough to trigger the limitation.
I do not think it is a good test to make sure the limitation the
current implementation happens to have still exists, but that is
a topic at a totally different level.
gitweb: Extend parse_difftree_raw_line to save commit info
Extend parse_difftree_raw_line to save commit info from when
git-diff-tree is given only one <tree-ish>, for example when fed
from git-rev-list using --stdin option.
git-diff-tree outputs a line with the commit ID when applicable.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
use do() instead of require() to include configuration
When run under mod_perl, require() will read and execute the configuration
file on the first invocation only. On every subsequent invocation, all
configuration variables will be reset to their default values. do() reads
and executes the configuration file unconditionally.
Signed-off-by: Dennis Stosberg <dennis@stosberg.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
On Aug 27th, Jakub Narebski sent a patch which removed the git_to_hash()
function and this call to it. The patch did not apply cleanly and had to
be applied manually. Removing the last chunk has obviously been forgotten.
This patch clean up append_signoff() by moving specific code that
looks up for "^[-A-Za-z]+: [^@]+@" pattern into a function.
It also stops the primary search when the cursor oversteps
'buf + at' limit.
This patch changes slightly append_signoff() behaviour too. If we
detect any Signed-off-by pattern during the primary search, we
needn't to do a pattern research after.
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
git-fsck-objects: lacking default references should not be fatal
The comment added says it all: if we have lost all references in a git
archive, git-fsck-objects should still work, so instead of dying it should
just notify the user about that condition.
This change was triggered by me just doing a "git-init-db" and then
populating that empty git archive with a pack/index file to look at it.
Having git-fsck-objects not work just because I didn't have any references
handy was rather irritating, since part of the reason for running
git-fsck-objects in the first place was to _find_ the missing references.
However, "--unreachable" really doesn't make sense in that situation, and
we want to turn it off to protect anybody who uses the old "git prune"
shell-script (rather than the modern built-in). The old pruning script
used to remove all objects that were reported as unreachable, and without
any refs, that obviously means everything - not worth it.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>