upload-pack: start pack-objects before async rev-list
In a pthread-enabled version of upload-pack, there's a race condition
that can cause a deadlock on the fflush(NULL) we call from run-command.
What happens is this:
1. Upload-pack is informed we are doing a shallow clone.
2. We call start_async() to spawn a thread that will generate rev-list
results to feed to pack-objects. It gets a file descriptor to a
pipe which will eventually hook to pack-objects.
3. The rev-list thread uses fdopen to create a new output stream
around the fd we gave it, called pack_pipe.
4. The thread writes results to pack_pipe. Outside of our control,
libc is doing locking on the stream. We keep writing until the OS
pipe buffer is full, and then we block in write(), still holding
the lock.
5. The main thread now uses start_command to spawn pack-objects.
Before forking, it calls fflush(NULL) to flush every stdio output
buffer. It blocks trying to get the lock on pack_pipe.
And we have a deadlock. The thread will block until somebody starts
reading from the pipe. But nobody will read from the pipe until we
finish flushing to the pipe.
To fix this, we swap the start order: we start the
pack-objects reader first, and then the rev-list writer
after. Thus the problematic fflush(NULL) happens before we
even open the new file descriptor (and even if it didn't,
flushing should no longer block, as the reader at the end of
the pipe is now active).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitweb: Fix parsing of negative fractional timezones in JavaScript
Extract converting numerical timezone in the form of '(+|-)HHMM' to
timezoneOffset function, and fix parsing of negative fractional
timezones.
This is used to format timestamps in 'blame_incremental' view; this
complements commit 2b1e172 (gitweb: Fix handling of fractional
timezones in parse_date, 2011-03-25).
pull: do not clobber untracked files on initial pull
For a pull into an unborn branch, we do not use "git merge"
at all. Instead, we call read-tree directly. However, we
used the --reset parameter instead of "-m", which turns off
the safety features.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'jk/format-patch-multiline-header' into maint
* jk/format-patch-multiline-header:
format-patch: rfc2047-encode newlines in headers
format-patch: wrap long header lines
strbuf: add fixed-length version of add_wrapped_text
Starting with commit c793430 (Limit file descriptors used by packs,
2011-02-28), git uses getrlimit to tell how many file descriptors it
can use. Unfortunately it does not include the header declaring that
function, resulting in compilation errors:
sha1_file.c: In function 'open_packed_git_1':
sha1_file.c:718: error: storage size of 'lim' isn't known
sha1_file.c:721: warning: implicit declaration of function 'getrlimit'
sha1_file.c:721: error: 'RLIMIT_NOFILE' undeclared (first use in this function)
sha1_file.c:718: warning: unused variable 'lim'
The standard header to include for this is <sys/resource.h> (which on
some systems itself requires declarations from <sys/types.h> or
<sys/time.h>). Probably the problem was missed until now because in
current glibc sys/resource.h happens to be included by sys/wait.h.
MinGW does not provide sys/resource.h (and compat/mingw takes care of
providing getrlimit some other way), so add the missing #include to
the "#ifndef __MINGW32__" block in git-compat-util.h.
Reported-by: Stefan Sperling <stsp@stsp.name> Tested-by: Stefan Sperling <stsp@stsp.name> [on OpenBSD] Tested-by: Arnaud Lacombe <lacombar@gmail.com> [on FreeBSD 8] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Doc: mention --delta-base-offset is the default for Porcelain commands
The underlying pack-objects plumbing command still needs an explicit
option from the command line, but these days Porcelain passes the
option, so there is no need for end users to worry about it anymore.
docs: fix filter-branch subdir example for exotic repo names
The GIT_INDEX_FILE variable we get from git has the full
path to the repo, which may contain spaces. When we use it
in our shell snippet, it needs to be quoted.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
HOME must be set before calling git-init when creating test repositories
Otherwise the created test repositories will be affected by users ~/.gitconfig.
For example, setting core.logAllrefupdates in users config will make all
calls to "git config --unset core.logAllrefupdates" fail which will break
the first test which uses the statement and expects it to succeed.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitweb: Fix handling of fractional timezones in parse_date
Fractional timezones, like -0330 (NST used in Canada) or +0430
(Afghanistan, Iran DST), were not handled properly in parse_date; this
means values such as 'minute_local' and 'iso-tz' were not generated
correctly.
This was caused by two mistakes:
* sign of timezone was applied only to hour part of offset, and not
as it should be also to minutes part (this affected only negative
fractional timezones).
* 'int $h + $m/60' is 'int($h + $m/60)' and not 'int($h) + $m/60',
so fractional part was discarded altogether ($h is hours, $m is
minutes, which is always less than 60).
Note that positive fractional timezones +0430, +0530 and +1030 can be
found as authortime in git.git repository itself.
For example http://repo.or.cz/w/git.git/commit/88d50e7 had authortime
of "Fri, 8 Jan 2010 18:48:07 +0000 (23:48 +0530)", which is not marked
with 'atnight', when "git show 88d50e7" gives correct author date of
"Sat Jan 9 00:18:07 2010 +0530".
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
doc: technical details about the index file format
* Clarify "string of unsigned bytes";
* Blob has two variants (regular file vs symlink), not (blob vs symlink);
* Clarify permission mode bits;
* Clarify ce_namelen() "too long to fit in the length field" case;
* Clarify "." etc are forbidden as path components;
* Match the description with the internal wording "cache-tree";
* All types of extension begin with signature and length as explained in
the first part. Don't repeat the "length" part in the description of
each extension (can be mistaken as if there is a separate 32-bit size
field inside the extension), but state what the signature for each
extension is.
* Don't say "Extension tag", as we have said "Extension signature" in the
first part---be consistent;
* Clarify the invalidation of cache-tree entries;
* Correct description on subtree_nr field in the cache-tree;
git-am.txt: advertise 'git am --abort' instead of 'rm .git/rebase-apply'
'git am --abort' is around for quite a long time now, and users should
normally not poke around inside the .git directory, yet the
documentation of 'git am' still recommends the following:
... if you decide to start over from scratch,
run `rm -f -r .git/rebase-apply` ...
Suggest 'git am --abort' instead.
It's not quite the same as the original, because 'git am --abort' will
restore the original branch, while simply removing '.git/rebase-apply'
won't, but that's rather a thinko in the original wording, because
that won't actually "start over _from scratch_".
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
update $GIT_INDEX_FILE when there are racily clean entries
Traditional "opportunistic index update" done by read-only "diff" and
"status" was about updating cached lstat(2) information in the index for
the next round. We missed another obvious optimization opportunity: when
there are racily clean entries that will cease to be racily clean by
updating $GIT_INDEX_FILE. Detect that case and write $GIT_INDEX_FILE out
to give it a newer timestamp.
Noticed by Lasse Makholm by stracing "git status" in a fresh checkout and
counting the number of open(2) calls.
When we had to refresh the index internally before running diff or status,
we opportunistically updated the $GIT_INDEX_FILE so that later invocation
of git can use the lstat(2) we already did in this invocation.
bisect: visualize with git-log if gitk is unavailable
If gitk is not available in the PATH, bisect ends up
exiting with the shell's 127 error code, confusing the git
wrapper into thinking that bisect is not a git command.
We already fallback to git-log if there doesn't seem to be a
graphical display available. We should do the same if gitk
is not available in our PATH at all. This not only fixes the
ugly error message, but is a much more sensible default than
failing to show the user anything.
Reported by Maxin John.
Tested-by: Maxin B. John <maxin@maxinbjohn.info> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sp/maint-fd-limit:
sha1_file.c: Don't retain open fds on small packs
mingw: add minimum getrlimit() compatibility stub
Limit file descriptors used by packs
Merge branch 'so/submodule-no-update-first-time' into maint
* so/submodule-no-update-first-time:
t7406: "git submodule update {--merge|--rebase]" with new submodules
submodule: no [--merge|--rebase] when newly cloned
The test setup in t8006-blame-textconv.sh uses "ln -sf" to
overwrite an existing symlink. Unfortunately, both /usr/bin/ln
and /usr/xpg4/bin/ln on solaris 9 don't properly handle -f and -s
used at the same time. This caused the test setup and subsequent
checks to fail.
Instead, remove the symlink and then create a new one in the
setup code.
The upstream Solaris bug (fixed in 10, but not 9) is documented
here:
stash: copy the index using --index-output instead of cp -p
'git stash create' must operate with a temporary index. For this purpose,
it used 'cp -p' to create a copy. -p is needed to preserve the timestamp
of the index file. Now Jakob Pfender reported a certain combination of
a Linux NFS client, OpenBSD NFS server, and cp implementation where this
operation failed.
Luckily, the first operation in git-stash after copying the index is to
call 'git read-tree'. Therefore, use --index-output instead of 'cp -p'
to write the copy of the index.
--index-output requires that the specified file is on the same volume as
the source index, so that the lock file can be rename()d. For this reason,
the name of the temporary index is constructed in a way different from the
other temporary files. The code path of 'stash -p' also needs a temporary
index, but we do not use the new name because it does not depend on the
same precondition as --index-output.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
stash: fix incorrect quoting in cleanup of temporary files
The * was inside the quotes, so that the pattern was never expanded and the
temporary files were never removed. As a consequence, 'stash -p' left a
.git-stash-*-patch file in $GIT_DIR. Other code paths did not leave files
behind because they removed the temporary file themselves, at least in
non-error paths.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'lt/rename-no-extra-copy-detection' into maint
* lt/rename-no-extra-copy-detection:
diffcore-rename: improve estimate_similarity() heuristics
diffcore-rename: properly honor the difference between -M and -C
for_each_hash: allow passing a 'void *data' pointer to callback
Merge branch 'mg/placeholders-are-lowercase' into maint
* mg/placeholders-are-lowercase:
Make <identifier> lowercase in Documentation
Make <identifier> lowercase as per CodingGuidelines
Make <identifier> lowercase as per CodingGuidelines
Make <identifier> lowercase as per CodingGuidelines
CodingGuidelines: downcase placeholders in usage messages
I tried both Firefox and Opera and saw the same behavior.
The desired output is:
1 /*
2 * test
3 */
This can be accomplished by specifying "--replace-tabs=8" on the
highlight command line.
Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Acked-by: John 'Warthog9' Hawley <warthog9@eaglescrag.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff --quiet: disable optimization when --diff-filter=X is used
The code notices that the caller does not want any detail of the changes
and only wants to know if there is a change or not by specifying --quiet.
And it breaks out of the loop when it knows it already found any change.
When you have a post-process filter (e.g. --diff-filter), however, the
path we found to be different in the previous round and set HAS_CHANGES
bit may end up being uninteresting, and there may be no output at the end.
The optimization needs to be disabled for such case.
Note that the f245194 (diff: change semantics of "ignore whitespace"
options, 2009-05-22) already disables this optimization by refraining
from setting HAS_CHANGES when post-process filters that need to inspect
the contents of the files (e.g. -S, -w) in diff_change() function.
Some versions of strlen use SSE to speed up the calculation and load 4
bytes at a time, even if it means reading past the end of the
allocated memory. This read is safe and when the strlen function is
inlined, it is not replaced by valgrind, which reports a
false-possitive.
Tell valgrind to ignore this particular error, as the read is, in
fact, safe. Current upstream-released version 3.6.1 is affected. Some
distributions have this fixed in their latest versions.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
- prepare_submodule_summary prepares the revision walker
to list changes in a submodule. That is, it:
* finds merge bases between the commits pointed to this
path from before ("left") and after ("right") the change;
* checks whether this is a fast-forward or fast-backward;
* prepares a revision walk to list commits in the symmetric
difference between the commits at each endpoint.
It returns nonzero on error.
- print_submodule_summary runs the revision walk and saves
the result to a strbuf in --left-right format.
The goal is just readability. No functional change intended.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
branch: split off function that writes tracking info and commit subject
Introduce a add_verbose_info function that takes care of adding
- an abbreviated object name;
- a summary of the form [ahead x, behind y] of the relationship
to the corresponding upstream branch;
- a one line commit subject
for the tip commit of a branch, for use in "git branch -v" output.
No functional change intended. This just unindents the code a little
and makes it easier to skip on first reading.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a struct definitions, unlike functions, the prevailing style is for
the opening brace to go on the same line as the struct name, like so:
struct foo {
int bar;
char *baz;
};
Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many
matches as 'struct [a-z_]*$'.
Linus sayeth:
Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since v1.7.2-rc0~23^2~2 (Add per-repository eol normalization,
2010-05-19), building with gcc -std=gnu89 -pedantic produces warnings
like the following:
convert.c:21:11: warning: comma at end of enumerator list [-pedantic]
gcc is right to complain --- these commas are not permitted in C89.
In the spirit of v1.7.2-rc0~32^2~16 (2010-05-14), remove them.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
list-objects.c: don't add an unparsed NULL as a pending tree
"git rev-list --first-parent --boundary $commit^..$commit" segfaults on a
merge commit since 8d2dfc4 (process_{tree,blob}: show objects without
buffering, 2009-04-10), as it tried to dereference a commit that was
discarded as UNINTERESTING without being parsed (hence lacking "tree").
git stash: show status relative to current directory
git status shows modified paths relative to current directory, so it's
possible to copy&paste them directly, even if you're in a subdirectory.
But "git stash apply" always shows status from root of git repository.
This is misleading because you can't use the paths without modifications.
This is caused by changing directory to root of repository at the
beginning of git stash.
This patch makes git stash show status relative to current directory.
Instead of removing the "cd to toplevel", which would affect whole
script and might have other side-effects, the fix is to change directory
temporarily back to original dir just before displaying status.
Signed-off-by: Piotr Krukowiecki <piotr.krukowiecki@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default of 7 comes from fairly early in git development, when
seven hex digits was a lot (it covers about 250+ million hash
values). Back then I thought that 65k revisions was a lot (it was what
we were about to hit in BK), and each revision tends to be about 5-10
new objects or so, so a million objects was a big number.
These days, the kernel isn't even the largest git project, and even
the kernel has about 220k revisions (_much_ bigger than the BK tree
ever was) and we are approaching two million objects. At that point,
seven hex digits is still unique for a lot of them, but when we're
talking about just two orders of magnitude difference between number
of objects and the hash size, there _will_ be collisions in truncated
hash values. It's no longer even close to unrealistic - it happens all
the time.
We should both increase the default abbrev that was unrealistically
small, _and_ add a way for people to set their own default per-project
in the git config file.
This is the first step to first make it configurable; the default of 7
is not raised yet.
Revert "core.abbrevguard: Ensure short object names stay unique a bit longer"
This reverts commit 72a5b561fc1c4286bc7c5b0693afc076af261e1f, as adding
fixed number of hexdigits more than necessary to make one object name
locally unique does not help in futureproofing the uniqueness of names
we generate today.
log: fix --max-count when used together with -S or -G
The --max-count limit is implemented by counting revisions in
get_revision(), but the -S and -G take effect later when running diff.
Hence "--max-count=10 -Sfoo" meant "examine the 10 first revisions, and
out of them, show only those changing the occurences of foo", not "show 10
revisions changing the occurences of foo".
In case the commit isn't actually shown, cancel the decrement of
max_count.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SubmittingPatches: clarify the expected commit log description
Earlier, 47afed5 (SubmittingPatches: itemize and reflect upon well written
changes, 2009-04-28) added a discussion on the contents of the commit log
message, but the last part of the new paragraph didn't make much sense.
Reword it slightly to make it more readable.
Update the "quicklist" to clarify what we mean by "motivation" and
"contrast". Also mildly discourage external references.
The description was unclear if -c or --cc was the default (--cc is for
some commands), and incorrectly implied that the default applies to
all the diff generating commands.
Most importantly, "log" does not default to "--cc" (it defaults to
"--no-merges") and "log -p" obeys the user's wish to see non-combined
format. Only "diff" (during merge and three-blob comparison) and
"show" use --cc as the default.
Signed-off-by: Adam Monsen <haircut@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-compat-util.h: Honor HP C's noreturn attribute
HP C for Integrity servers (Itanium) gained support for noreturn
attribute sometime in 2006. It was released in Compiler Version
A.06.10 and made available in July 2006.
The __HP_cc define detects the HP C compiler version. Precede the
__GNUC__ check so it works well when compiling with HP C using -Agcc
option that enables partial support for the GNU C dialect. The -Agcc
defines the __GNUC__ too.
Signed-off-by: Michal Rokos <michal.rokos@nextsoft.cz> Signed-off-by: Junio C Hamano <gitster@pobox.com>
apply -v: show offset count when patch did not apply exactly
When the line number the patch intended to touch does not match
the line in the version being patched, GNU patch reports that
it applied the hunk at a different line number, with how big an
offset.
Teach "git apply" to do the same under --verbose option.
apply: do not patch lines that were already patched
When looking for a place to apply a hunk, we used to check lines that
match the preimage of it, starting from the line that the patch wants to
apply the hunk at, looking forward and backward with increasing offsets
until we find a match.
Colin Guthrie found an interesting case where this misapplied a patch that
wanted to touch a preimage that consists of
}
}
return 0;
}
which is a rather unfortunately common pattern.
The target version of the file originally had only one such location, but
the hunk immediately before that created another instance of such block of
lines, and find_pos() happily reported that the preimage of the hunk
matched what it wanted to modify.
Oops.
By marking the lines application of earlier hunks touched and preventing
match_fragment() from considering them as a match with preimage of other
hunks, we can reduce such an accident.
I also considered to teach apply_one_fragment() to take the offset we have
found while applying the previous hunk into account when looking for a
match with find_pos(), but dismissed that approach, because it would
sometimes work better but sometimes worse, depending on the difference
between the version the patch was created against and the version the
patch is being applied.
This does _not_ prevent misapplication of patches to a file that has many
similar looking blocks of lines and a preimage cannot identify which one
of them should be applied. For that, we would need to scan beyond the
first match in find_pos(), and issue a warning (or error out). That will
be a separate topic.