implement some resilience against pack corruptions
We should be able to fall back to loose objects or alternative packs when
a pack becomes corrupted. This is especially true when an object exists
in one pack only as a delta but its base object is corrupted. Currently
there is no way to retrieve the former object even if the latter is
available in another pack or loose.
This patch allows for a delta to be resolved (with a performance cost)
using a base object from a source other than the pack where that delta
is located. The same applies to non-delta objects: rather than failing
outright, git searches other packs and loose objects when the currently
active pack has the object but in corrupted form.
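The fallback described above can be pictured with a toy object store. This is only a sketch under stated assumptions: `Source`, `object_name` and `read_object` are hypothetical stand-ins rather than git's pack code, objects are plain in-memory bytes, and corruption is detected simply by re-hashing the content.

```python
import hashlib
import sys


def object_name(data: bytes) -> str:
    """Name an object by the SHA-1 of its content (simplified: no type header)."""
    return hashlib.sha1(data).hexdigest()


class Source:
    """Hypothetical stand-in for one pack or for the loose-object directory."""

    def __init__(self, label, objects):
        self.label = label
        self.objects = dict(objects)  # object name -> raw bytes

    def get(self, name):
        return self.objects.get(name)


def read_object(name, sources):
    """Return the first intact copy of `name`, skipping corrupted copies."""
    for source in sources:
        data = source.get(name)
        if data is None:
            continue  # this source does not have the object at all
        if object_name(data) == name:
            return data
        # Noisy, but not fatal: keep looking in the remaining sources.
        print(f"error: {name} is corrupted in {source.label}", file=sys.stderr)
    raise KeyError(f"no intact copy of {name} found")
```

With a damaged copy in the first source and a good copy in a later one, `read_object` complains on stderr and still returns the good bytes, which is the property a later repack relies on.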
Of course git will become extremely noisy with error messages when that
happens. However, if the operation succeeds nevertheless, a simple
'git repack -a -f -d' will "fix" the corrupted repository given that all
corrupted objects have a good duplicate somewhere in the object store,
possibly manually copied from another source.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Suppose someone fetches git-svn-ified commits from another repo and then
attempts to use 'git-svn init --rewrite-root=foo bar'. Using git svn rebase
after that will fail badly:
* For each commit tried by working_head_info, rebuild is called indirectly.
* rebuild will iterate over all commits and skip all of them because the
URL does not match. Because of that no rev_map file is generated at all.
* Thus, rebuild will run once for every commit. This takes ages.
* In the end there still isn't any rev_map file and thus working_head_info
fails.
Addressing this behaviour fixes an apparently not too uncommon problem with
providing git-svn mirrors of Subversion repositories. Some repositories are
accessed using different URLs depending on whether the user has push
privileges or not. In the latter case, an anonymous URL is often used that
differs from the push URL. Providing a mirror that is usable in both cases
becomes much more feasible with this change.
Signed-off-by: Jan Krüger <jk@jk.gs> Acked-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, we have spawnv() variants to run a child process instead of
fork()/exec(). In order to attach pipe ends to stdin, stdout, and stderr,
we have to use this idiom:
i.e. the child process closes both pipe ends after duplicating one
to the file descriptor where it is needed.
On Windows, which does not have fork(), we never have an opportunity to
(1) duplicate a pipe end in the child, (2) close unused pipe ends. Instead,
we must use this idiom:
i.e. save away the descriptor at the destination slot, replace it with the
pipe end, spawn the process, restore the saved descriptor.
But there is a problem: Notice that the child did not only inherit the
dup2()ed descriptor, but also *both* original pipe ends. Although the one
end that was dup()ed could be closed before the spawn(), we cannot close
the other end - the child inherits it, no matter what.
The solution is to generate non-inheritable pipes. At first glance,
this looks strange: The purpose of pipes is usually to be inherited by
child processes. But notice that in the course of actions as outlined
above, the pipe descriptor that we want the child to inherit is
dup2()ed, and as it so happens, Windows's dup2() creates inheritable
duplicates.
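The save/replace/spawn/restore sequence with a non-inheritable pipe can be sketched with Python's `os` module, where `os.pipe()` and `os.dup()` return non-inheritable descriptors while `os.dup2()` creates an inheritable duplicate by default. This is an illustration meant to run on a POSIX system, not the MinGW code from this patch, and `spawn_with_stdout_pipe` is a hypothetical helper name.

```python
import os
import subprocess
import sys


def spawn_with_stdout_pipe(argv):
    """Run argv with its stdout attached to a pipe and return what it wrote."""
    read_end, write_end = os.pipe()   # both ends are non-inheritable
    sys.stdout.flush()                # do not let buffered output leak into the pipe
    saved_stdout = os.dup(1)          # save the descriptor at the destination slot
    try:
        os.dup2(write_end, 1)         # the duplicate in slot 1 is inheritable
        os.close(write_end)           # this end can be closed before the spawn
        # close_fds=False: the child gets exactly the inheritable descriptors,
        # i.e. the duplicate in slot 1 but neither original pipe end.
        child = subprocess.Popen(argv, close_fds=False)
    except BaseException:
        os.close(read_end)
        raise
    finally:
        os.dup2(saved_stdout, 1)      # restore the saved descriptor
        os.close(saved_stdout)
    with os.fdopen(read_end, "rb") as reader:
        data = reader.read()          # EOF arrives once the child exits
    child.wait()
    return data
```

Because the child never holds the read end or a second copy of the write end, the parent sees end-of-file as soon as the child exits, which is the problem the non-inheritable pipes are meant to solve.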
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Windows: Wrap execve so that shell scripts can be invoked.
When an external git command is invoked, it can be a Bourne shell script.
This patch looks into the command file to see whether it is one.
In this case, the command line is rearranged to invoke the shell
with the proper arguments.
With this change, scripted git commands work. Command line arguments
to those scripts cannot be complex (contain spaces or double-quotes), yet.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Windows's rename() is based on the MoveFile() API, which fails if the
destination exists. Here we work around the problem by using MoveFileEx().
Furthermore, the posixly correct error is returned if the destination is
a directory.
The implementation is still slightly incomplete, however, because of the
missing error code translation: We assume that the failure is due to
permissions.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
getpwuid() is implemented just enough that GIT does not issue errors.
Since the information that it returns is not very useful, users are
required to set up user.name and user.email configuration.
All uses of getpwuid() are like getpwuid(getuid()), hence, the return value
of getuid() is irrelevant and the uid parameter is not even looked at.
Side note: getpwnam() is only used to resolve '~' and '~username' paths,
which is an idiom not known on Windows, hence, we don't implement it.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Windows: Implement a wrapper of the open() function.
The wrapper does two things:
- Requests to open /dev/null are redirected to open the nul pseudo file.
- A request to open a file that currently exists as a directory on
Windows fails with EACCES; this is changed to EISDIR.
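A rough Python rendering of the two behaviours listed above, for illustration only. `open_compat` is a hypothetical name and `os.devnull` stands in for the nul pseudo file; the actual wrapper in this patch is C code and may differ in detail.

```python
import errno
import os


def open_compat(path, flags, mode=0o666):
    """Open `path` like os.open(), applying the two adjustments described."""
    if path == "/dev/null":
        path = os.devnull  # "nul" on Windows, "/dev/null" elsewhere
    try:
        return os.open(path, flags, mode)
    except PermissionError:
        # Windows reports EACCES when the path is a directory; callers expect
        # EISDIR, so translate that one case and re-raise everything else.
        if os.path.isdir(path):
            raise IsADirectoryError(
                errno.EISDIR, os.strerror(errno.EISDIR), path
            ) from None
        raise
```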
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
GIT's guts work with a forward slash as the path separator. We do not change
that. Rather we make sure that only "normalized" paths enter the depths
of the machinery.
We have to translate backslashes to forward slashes in the prefix and in
command line arguments. Fortunately, all of them are passed through
functions in setup.c.
A macro has_dos_drive_path() is defined that checks whether a path begins
with a drive letter+colon combination. This predicate is always false on
Unix. Another macro is_dir_sep() abstracts that a backslash is also a
directory separator on Windows.
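The two predicates and the separator translation could look roughly like this. It is a Python sketch of the idea, not the C macros themselves; the `on_windows` parameter and the `normalize_separators` helper are assumptions added so that both platform behaviours can be shown side by side.

```python
import string


def has_dos_drive_path(path, on_windows=True):
    """True if `path` begins with a drive letter followed by a colon."""
    if not on_windows:
        return False  # the predicate is always false on Unix
    return len(path) >= 2 and path[0] in string.ascii_letters and path[1] == ":"


def is_dir_sep(ch, on_windows=True):
    """True if `ch` separates directories on the given platform."""
    return ch == "/" or (on_windows and ch == "\\")


def normalize_separators(path, on_windows=True):
    """Translate backslashes to forward slashes before the path goes deeper."""
    return path.replace("\\", "/") if on_windows else path
```

For example, `normalize_separators("C:\\src\\git")` yields `C:/src/git`, and `has_dos_drive_path` accepts that result on Windows while rejecting every path on Unix.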
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Shrink the git binary a bit by avoiding unnecessary inline functions
So I was looking at the disgusting size of the git binary, and even with
the debugging removed, and using -Os instead of -O2, the size of the text
section was pretty high. In this day and age I guess almost a megabyte of
text isn't really all that surprising, but it still doesn't exactly make
me think "lean and mean".
With -Os, a surprising amount of text space is wasted on inline functions
that end up just being replicated multiple times, and where performance
really isn't a valid reason to inline them. In particular, the trivial
wrapper functions like "xmalloc()" are used _everywhere_, and making them
inline just duplicates the text (and the string we use to 'die()' on
failure) unnecessarily.
So this just moves them into a "wrapper.c" file, getting rid of a tiny bit
of unnecessary bloat. The following numbers are both with "CFLAGS=-Os":
Before:
    [torvalds@woody git]$ size git
       text    data     bss     dec     hex filename
     700460   15160  292184 1007804   f60bc git
After:
    [torvalds@woody git]$ size git
       text    data     bss     dec     hex filename
     670540   15160  292184  977884   eebdc git
so it saves almost 30k of text-space (it actually saves more than that
with the default -O2, but I don't think that's necessarily a very relevant
number from a "try to shrink git" standpoint).
It might conceivably have a performance impact, but none of this should be
_that_ performance critical. The real cost is not generally in the wrapper
anyway, but in the code it wraps (ie the cost of "xread()" is all in the
read itself, not in the trivial wrapping of it).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Extend parse-options test suite
api-parse-options.txt: Introduce documentation for parse options API
parse-options.c: fix documentation syntax of optional arguments
api-builtin.txt: update and fix typo
This patch serves two purposes:
1. test-parse-option.c should be a more complete
example for the parse-options API, and
2. there have been no tests for OPT_CALLBACK,
OPT_DATE, OPT_BIT, OPT_SET_INT and OPT_SET_PTR
before.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse-options.c: fix documentation syntax of optional arguments
When an argument for an option is optional, short options don't need a
space between the option and the argument, and long options need a "=".
Otherwise, arguments are misinterpreted.
Signed-off-by: Michele Ballabio <barra_cuda@katamail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/test:
enable whitespace checking of test scripts
avoid trailing whitespace in zero-change diffstat lines
avoid whitespace on empty line in automatic usage message
mask necessary whitespace policy violations in test scripts
fix whitespace violations in test scripts
With this change GIT can be compiled and linked using MinGW. Builtins
that only read the repository such as the log family and grep already
work.
Simple stubs are provided for a number of functions that the Windows C
runtime does not offer. They will be completed in later patches.
However, a fix for the snprintf/vsnprintf replacement is applied here
to avoid buffer overflows.
Dmitry Kakurin pointed out that access(..., X_OK) would always fail on
Vista and suggested the -D__USE_MINGW_ACCESS workaround.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
We don't have fnmatch and regular expressions on Windows. We borrow
fnmatch.[ch] from the GNU C library (license is LGPL 2 or later) and
GNU regexp (regexp.[ch], license is GPL 2 or later). Note that regexp.c
was changed slightly to avoid warnings with gcc.
We make the addition of these files an extra commit so as not to clutter
the next commits.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
The test used "diff-files -q" which is not about reporting if there is
a difference at all. Instead, make sure that the path remains as
conflicting in the index after rerere autoresolves it, as we will be
adding rerere.autoupdate configuration with the next patch.
It is dubious whether it is cheaper to shift entries repeatedly using memmove()
to collect the entries that need to be written out at the front of an array than
simply marking the entries to be skipped. In addition, the label called this
"tail optimization", but this obviously is not what people usually mean
by that name.
rerere: rerere_created_at() and has_resolution() abstraction
Too many places in the code knew what an entry in the rerere database
looks like, and the garbage_collect() function that iterates over
subdirectories of the rr-cache directory was the worst offender.
Introduce two helper functions, rerere_created_at() and has_resolution(),
to abstract out the logic a bit better.
Incidentally this fixes a small memory leak in the garbage_collect()
function. The path list to collect the entries to be pruned was defined
to strdup the paths but the caller was feeding a path after doing an extra
copy. Because the list does not have to be sorted by conflict signature
hash, we use path_list_append() instead of path_list_insert().
While we are at it, make the conflicted hunk comparison in handle_file() a
bit easier to read.
so we should document it to be more clear about that.
Suggested-by: Marek Zawirski <marek.zawirski@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-gui: Fix accidental staged state toggle when clicking top pixel row
If a text widget is asked the index at x,y with y == 0 or y == 1 it will
always return 1.0 as the nearest index, regardless of the x position.
This means that clicking the top 2 pixels of the Unstaged/Staged Changes
lists caused the state of the file there to be toggled. This patch
checks that the pixel clicked is greater than 1, so there is less chance
of accidentally staging or unstaging changes.
Signed-off-by: Richard Quirk <richard.quirk@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Many error status codes simply default to 403 Forbidden, which is not
correct in most cases. This patch makes gitweb return semantically
correct status codes.
For convenience the die_error function now only takes the status code
without reason as first parameter (e.g. 404 instead of "404 Not
Found"), and it now defaults to 500 (Internal Server Error), even
though the default is not used anywhere.
Also documented status code conventions in die_error.
Signed-off-by: Lea Wiemann <LeWiemann@gmail.com> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make git_dir a path relative to work_tree in setup_work_tree()
Once we find the absolute paths for git_dir and work_tree, we can make
git_dir a relative path since we know pwd will be work_tree. This should
save the kernel some time traversing the path to work_tree all the time
if git_dir is inside work_tree.
Daniel's patch didn't apply for me as-is, so I recreated it with some
differences, and here are the numbers from ten runs each.
There is some IO for me - probably due to more-or-less random flushing of
the journal - so the variation is bigger than I'd like, but whatever:
Before:
real 0m8.135s
real 0m7.933s
real 0m8.080s
real 0m7.954s
real 0m7.949s
real 0m8.112s
real 0m7.934s
real 0m8.059s
real 0m7.979s
real 0m8.038s
After:
real 0m7.685s
real 0m7.968s
real 0m7.703s
real 0m7.850s
real 0m7.995s
real 0m7.817s
real 0m7.963s
real 0m7.955s
real 0m7.848s
real 0m7.969s
Now, going by "best of ten" (on the assumption that the longer numbers
are all due to IO), I'm saying a 7.933s -> 7.685s reduction, and it does
seem to be outside of the noise (ie the "after" case never broke 8s, while
the "before" case did so half the time).
So looks like about 3% to me.
Doing it for a slightly smaller test-case (just the "arch" subdirectory)
gets more stable numbers probably due to not filling the journal with
metadata updates, so we have:
Before:
real 0m1.633s
real 0m1.633s
real 0m1.633s
real 0m1.632s
real 0m1.632s
real 0m1.630s
real 0m1.634s
real 0m1.631s
real 0m1.632s
real 0m1.632s
After:
real 0m1.610s
real 0m1.609s
real 0m1.610s
real 0m1.608s
real 0m1.607s
real 0m1.610s
real 0m1.609s
real 0m1.611s
real 0m1.608s
real 0m1.611s
where I'd just take the averages and say 1.632 vs 1.610, which is just
over 1% performance improvement.
So it's not in the noise, but it's not as big as I initially thought and
measured.
(That said, it obviously depends on how deep the working directory path is
too, and whether it is behind NFS or something else that might need to
cause more work to look up).
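The two percentages quoted above can be rechecked from the listed timings. The snippet below is only an arithmetic aid; the numbers are copied from the runs in this message and `improvement_percent` is a hypothetical helper name.

```python
from statistics import mean

# Timings in seconds, copied from the runs listed above.
full_before = [8.135, 7.933, 8.080, 7.954, 7.949, 8.112, 7.934, 8.059, 7.979, 8.038]
full_after = [7.685, 7.968, 7.703, 7.850, 7.995, 7.817, 7.963, 7.955, 7.848, 7.969]
arch_before = [1.633, 1.633, 1.633, 1.632, 1.632, 1.630, 1.634, 1.631, 1.632, 1.632]
arch_after = [1.610, 1.609, 1.610, 1.608, 1.607, 1.610, 1.609, 1.611, 1.608, 1.611]


def improvement_percent(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Full tree: best of ten, assuming the slower runs are dominated by IO.
best_of_ten = improvement_percent(min(full_before), min(full_after))
# "arch" subdirectory: the runs are stable, so compare the averages.
average = improvement_percent(mean(arch_before), mean(arch_after))

print(f"full tree, best of ten: {best_of_ten:.1f}%")
print(f"arch only, averages:    {average:.1f}%")
```

This gives about 3.1% for the full tree and about 1.4% for the "arch" subdirectory, in line with the "about 3%" and "just over 1%" figures above.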
Due to a misplaced list block separator, general hints about the config
file options got indented at the same level as the description of the last
option, making it easy to miss them.
Signed-off-by: Jan Krüger <jk@jk.gs> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t7502-commit.sh: test_must_fail doesn't work with inline environment variables
When the arguments to test_must_fail() begin with a variable assignment,
test_must_fail() attempts to execute the variable assignment as a command.
This fails, and so test_must_fail returns with a successful status value
without running the command it was intended to test.
For example, the following script:
    #!/bin/sh
    test_must_fail () {
        "$@"
        test $? -gt 0 -a $? -le 129
    }
    foo='wo adrian'
    test_must_fail foo='yo adrian' sh -c 'echo foo: $foo'
always exits zero and prints the message:
test.sh: line 3: foo=yo adrian: command not found
Test 16 calls test_must_fail in such a way and therefore has not been
testing whether git 'do[es] not fire editor in the presence of conflicts'.
A workaround is to set and export the variable in a normal way, not
using one-shot notation. Because this would affect the remainder of
the process, the test is done inside a subshell.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In repos with many refs, it is unlikely that most refs will ever change.
This fact is already exploited by "git gc" by executing "git pack-refs"
to consolidate all refs into a single file.
When cloning a repo with many refs, it does not make sense to create the
loose refs in the first place, just to have the next "git gc" consolidate
them into one file. Instead, make "git clone" create the packed refs file
immediately, and forgo the loose refs completely.
Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare testsuite for a "git clone" that packs refs
t5515-fetch-merge-logic removes many, but not all, refs between each test.
This is done by removing the corresponding refs/foo/* files in the .git/refs
hierarchy. However, once "git clone" starts producing packed refs, these refs
will no longer be in the .git/refs hierarchy, but rather listed in
.git/packed-refs. This patch teaches t5515-fetch-merge-logic to remove the
refs using "git update-ref -d" which properly handles packed refs.
Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This moves pack_refs() and underlying functionality into the library,
to make pack-refs functionality easily available to all git programs.
Most of builtin-pack-refs.c has been moved verbatim into a new file
pack-refs.c that is compiled into libgit.a. A corresponding header
file, pack-refs.h, has also been added, declaring pack_refs() and
the #defines associated with the flags parameter to pack_refs().
This patch introduces no other changes in functionality.
Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Incorporate fetched packs in future object traversal
Immediately after fetching a pack, we should call reprepare_packed_git() to
make sure the objects in the pack are reachable. Otherwise, we will fail to
look up objects that are present only in the fetched pack.
Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin-fast-export: Add importing and exporting of revision marks
This adds the --import-marks and --export-marks to fast-export. These import
and export the marks used for all revisions exported in a similar fashion
to what fast-import does. The format is the same as fast-import, so you can
create a bidirectional importer / exporter by using the same marks file on
both sides.
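For readers unfamiliar with the marks file, here is a small sketch of reading and writing it, assuming the one-mark-per-line `:<mark> <sha1>` layout that fast-import documents for its marks files. `dump_marks` and `load_marks` are hypothetical helper names, not functions from this patch.

```python
def dump_marks(marks):
    """Serialize {mark number: object name} as ':<mark> <sha1>' lines."""
    return "".join(f":{mark} {sha1}\n" for mark, sha1 in sorted(marks.items()))


def load_marks(text):
    """Parse the same layout back into a dictionary."""
    marks = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        mark, sha1 = line.split()
        if not mark.startswith(":"):
            raise ValueError(f"bad mark line: {line!r}")
        marks[int(mark[1:])] = sha1
    return marks
```

Since both directions share one layout, a file written by one side can be fed to the other unchanged, which is what makes the bidirectional setup possible.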
Signed-off-by: Pieter de Bie <pdebie@ai.rug.nl> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/test-lib.sh: add test_external and test_external_without_stderr
This is for running external test scripts in other programming
languages that provide continuous output about their tests. Using
test_expect_success (like "test_expect_success 'description' 'perl
test-script.pl'") doesn't suffice here because test_expect_success
eats stdout in non-verbose mode, which is not fixable without major
file descriptor trickery.
Signed-off-by: Lea Wiemann <LeWiemann@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a --long-tests option to test-lib.sh, which enables tests to
selectively run more exhaustive (longer running, potentially
brute-force) tests. Such exhaustive tests would only be useful if one
works on the specific module that is being tested -- for a general "cd
t/; make" to check whether everything is OK, such exhaustive tests
shouldn't be run by default since the longer it takes to run the
tests, the less often they are actually run.
Signed-off-by: Lea Wiemann <LeWiemann@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use size=0 as the magic token to say the entry is known to be racily
clean, but a sequence that does:
- update the path with a non-empty blob and write the index;
- update an unrelated path and write the index -- this smudges
the above entry;
- truncate the path to size zero.
would make both the size field for the path in the index and the size on
the filesystem zero. We should not mistake it for a clean index entry.
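One way to express the extra check, as a sketch rather than the actual index code: a zero size in the index only proves the entry is clean when the recorded object really is the empty blob. `IndexEntry` and `looks_changed` are hypothetical names; the empty blob name is computed the way git hashes a zero-length blob.

```python
import hashlib
from dataclasses import dataclass

# Object name of a zero-length blob: SHA-1 over the header "blob 0\0".
EMPTY_BLOB_SHA1 = hashlib.sha1(b"blob 0\0").hexdigest()


@dataclass
class IndexEntry:
    size: int  # size recorded in the index; 0 doubles as the "racily clean" token
    sha1: str  # name of the blob recorded for the path


def looks_changed(entry, size_on_disk):
    """True if the size check flags the path as possibly modified."""
    if entry.size != size_on_disk:
        return True
    if entry.size == 0 and entry.sha1 != EMPTY_BLOB_SHA1:
        # Smudged entry for a non-empty blob whose file was truncated to
        # zero bytes: the sizes agree, yet the content cannot match.
        return True
    return False
```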
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff -c/--cc: do not include uninteresting deletion before leading context
When we include a few uninteresting lines before the interesting ones as
context, we are only interested in seeing the surviving lines themselves
and not the deleted lines that are before them. Mark the added leading
context lines in give_context() and do not show deleted lines from them.
Add config option to enable 'fsync()' of object files
As explained in the documentation[*] this is totally useless on
filesystems that do ordered/journalled data writes, but it can be a
useful safety feature on filesystems like HFS+ that only journal the
metadata, not the actual file contents.
It defaults to off, although we could presumably in theory some day
auto-enable it on a per-filesystem basis.
[*] Yes, I updated the docs for the thing. Hell really _has_ frozen
over, and the four horsemen are probably just beyond the horizon.
EVERYBODY PANIC!
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split up default "core" config parsing into helper routine
It makes the code a bit easier to read, and in theory a bit faster too
(no need to compare all the different "core.*" strings against non-core
config options).
The config system really should get something of a complete overhaul,
but in the absence of that, this at least improves on it a tiny bit.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
clean up error conventions of remote.c:match_explicit
match_explicit is called for each push refspec to try to
fully resolve the source and destination sides of the
refspec. Currently, we look at each refspec and report
errors on both the source and the dest side before aborting.
It makes sense to report errors for each refspec, since an
error in one is independent of an error in the other.
However, reporting errors on the 'dst' side of a refspec if
there has been an error on the 'src' side does not
necessarily make sense, since the interpretation of the
'dst' side depends on the 'src' side (for example, when
creating a new unqualified remote ref, we use the same type
as the src ref).
This patch lets match_explicit return early when the src
side of the refspec is bogus. We still look at all of the
refspecs before aborting the push, though.
At the same time, we clean up the call signature, which
previously took an extra "errs" flag. This was pointless, as
we didn't act on that flag, but rather just passed it back
to the caller. Instead, we now use the more traditional
"return -1" to signal an error, and the caller aggregates
the error count.
This change fixes two bugs, as well:
- the early return avoids a segfault when passing a NULL
matched_src to guess_ref()
- the check for multiple sources pointing to a single dest
aborted if the "err" flag was set. Presumably the intent
was not to bother with the check if we had no
matched_src. However, since the err flag was passed in
from the caller, we might abort the check just because a
previous refspec had a problem, which doesn't make
sense.
In practice, this didn't matter, since due to the error
flag we end up aborting the push anyway.
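The control flow described above, reduced to a sketch. Python has no "return -1" convention, so an exception (the hypothetical `RefspecError`) plays that role here; the ref lookup is simplified to set membership and an empty destination simply reuses the source name, neither of which is claimed to match remote.c.

```python
class RefspecError(Exception):
    """Raised when one refspec cannot be resolved (stands in for 'return -1')."""


def match_explicit(refspec, local_refs):
    """Resolve one push refspec into a (src, dst) pair."""
    src, _, dst = refspec.partition(":")
    if src not in local_refs:
        # Return early: how dst is read depends on src, so checking it is moot.
        raise RefspecError(f"src refspec {src} does not match any")
    return src, dst or src


def match_refs(refspecs, local_refs):
    """Look at every refspec, count the failures, and report them together."""
    matched, errs = [], 0
    for refspec in refspecs:
        try:
            matched.append(match_explicit(refspec, local_refs))
        except RefspecError as err:
            print(f"error: {err}")
            errs += 1  # the caller, not match_explicit, aggregates the count
    return matched, errs
```

A caller would abort the push when `errs` is non-zero, but only after every refspec has been examined.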
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit af66366a9feb0194ed04b1f538998021ece268a8 introduced the keyword
"never" to be used with approxidate() but defined it with a fixed date
without taking care of timezone. As a result approxidate() will return
a timestamp in the future with a negative timezone.
With this patch, approxidate("never") always returns 0 whatever your
timezone is.
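Why a fixed date shifts with the timezone can be shown in a few lines. This is an illustration of the failure mode only, not the code from date.c; the function names are hypothetical, and the fixed date is assumed to be the Unix epoch.

```python
import calendar
import time

# struct_time fields for 1970-01-01 00:00:00 (a Thursday, day 1 of the year).
EPOCH_FIELDS = (1970, 1, 1, 0, 0, 0, 3, 1, 0)


def never_as_local_time():
    """Read the fixed date in the local timezone (the timezone-dependent way)."""
    return int(time.mktime(EPOCH_FIELDS))


def never_as_utc():
    """Read the fixed date as UTC, which does not depend on the timezone."""
    return calendar.timegm(EPOCH_FIELDS)
```

With the timezone set to five hours west of UTC, `never_as_local_time()` yields 18000 rather than 0, while `never_as_utc()` stays at 0 everywhere.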
Signed-off-by: Olivier Marin <dkr@freesurf.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-am: head -1 is obsolete and doesn't work on some new systems
head -<n> was deprecated by POSIX, and modern versions of the coreutils
package don't support it unless one exports _POSIX2_VERSION=199209, so
it fails on some systems.
head -n<n> is portable, but sed <n>q is even more so.
Signed-off-by: Alejandro Mery <amery@geeks.cl> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The data read from MERGE_RR file is kept in path-list by hanging the textual
40-byte conflict signature on the path of the blob that contains the
conflict. The signature is strdup'ed twice, and the second copy is given
to the path-list, leaking the first copy.
Signed-off-by: Junio C Hamano <junio@pobox.com> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
gitweb: quote commands properly when calling the shell
This eliminates the function git_cmd_str, which was used for composing
command lines, and adds a quote_command function, which quotes all of
its arguments (as in quote.c).
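The quoting rule can be sketched as follows, assuming the quote.c convention of wrapping each argument in single quotes and escaping `'` and `!` outside of them. This is a Python illustration, not the Perl code added to gitweb.

```python
import re


def quote_command(*args):
    """Join arguments into one shell command line, single-quoting each one."""

    def quote(arg):
        # For each ' or !: close the quote, emit the character escaped with
        # a backslash, then reopen the quote.
        return "'" + re.sub(r"(['!])", r"'\\\1'", arg) + "'"

    return " ".join(quote(arg) for arg in args)
```

For instance, `quote_command("git", "it's")` produces `'git' 'it'\''s'`, which a POSIX shell splits back into the two original words.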
Signed-off-by: Lea Wiemann <LeWiemann@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was implemented as a thin wrapper around an otherwise unused
helper function parse_pack_index_file(). The code becomes simpler
and easier to read by consolidating the two.
write_loose_object: don't bother trying to read an old object
Before even calling this, all callers have done a "has_sha1_file(sha1)"
or "has_loose_object(sha1)" check, so there is no point in doing a
second check.
If something races with us on object creation, we handle that in the
final link() that moves it to the right place.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When initializing the struct async and struct child_process structures,
the documentation suggested "clearing" the structure with '0' instead of
'\0'. It is enough to use integer zero here.
Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
It worked that way since commit 50f575fc (Tweak diff colors,
2006-06-22), but commit c1795bb0 (Unify whitespace checking, 2007-12-13)
changed it. This patch restores the old behaviour.
Besides Linus' arguments in the log message of 50f575fc, resetting color
before printing newline is also important to keep 'git add --patch'
happy. If the last line(s) of a file are removed, then that hunk will
end with a colored line. However, if the newline comes before the color
reset, then the diff output will have an additional line at the end
containing only the reset sequence. This causes trouble in
git-add--interactive.perl's parse_diff function, because @colored will
have one more element than @diff, and that last element will contain the
color reset. The elements of these arrays will then be copied to @hunk,
but only as many as the number of elements in @diff. As a result the
last color reset is lost and all subsequent terminal output will be
printed in color.
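The line-count mismatch is easy to reproduce. In this sketch the color (red) and the helper name `render` are arbitrary choices for illustration; it only shows how the position of the reset sequence changes what a line-oriented reader sees.

```python
RED = "\033[31m"
RESET = "\033[m"


def render(lines, reset_before_newline):
    """Color every line, placing the reset before or after the newline."""
    if reset_before_newline:
        return "".join(f"{RED}{line}{RESET}\n" for line in lines)
    return "".join(f"{RED}{line}\n{RESET}" for line in lines)


removed = ["-last line but one", "-last line"]
good = render(removed, reset_before_newline=True).splitlines()
bad = render(removed, reset_before_newline=False).splitlines()

print(len(good))  # one element per diff line
print(len(bad))   # one extra element that holds only the reset sequence
```

With the reset after the newline, the colored output has one more element than the plain diff, and that extra element is exactly the trailing reset that gets dropped.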
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>