When "git fsck" reports a broken link (e.g. a tree object contains
a blob that does not exist), both containing object and the object
that is referred to were reported with their 40-hex object names.
The command learned the "--name-objects" option to show the path to
the containing object from existing refs (e.g. "HEAD~24^2:file.txt").
* js/fsck-name-object:
fsck: optionally show more helpful info for broken links
fsck: give the error function a chance to see the fsck_options
fsck_walk(): optionally name objects on the go
fsck: refactor how to describe objects
"git add -N dir/file && git write-tree" produced an incorrect tree
when there are other paths in the same directory that sorts after
"file".
* nd/cache-tree-ita:
cache-tree: do not generate empty trees as a result of all i-t-a subentries
cache-tree.c: fix i-t-a entry skipping directory updates sometimes
test-lib.sh: introduce and use $EMPTY_BLOB
test-lib.sh: introduce and use $EMPTY_TREE
* nd/test-helpers:
t/test-lib.sh: fix running tests with --valgrind
Makefile: use VCSSVN_LIB to refer to svn library
Makefile: drop extra dependencies for test helpers
Makefile assumed that -lrt is always available on platforms that
want to use clock_gettime() and CLOCK_MONOTONIC, which is not a
case for recent Mac OS X. The necessary symbols are often found in
libc on many modern systems and having -lrt on the command line, as
long as the library exists, had no effect, but when the platform
removes librt.a that is a different matter--having -lrt will break
the linkage.
This change could be seen as a regression for those who do need to
specify -lrt, as they now specifically ask for NEEDS_LIBRT when
building. Hopefully they are in the minority these days.
* rw/make-needs-librt:
config.mak.uname: define NEEDS_LIBRT under Linux, for now
Makefile: add NEEDS_LIBRT to optionally link with librt
The API to iterate over all the refs (i.e. for_each_ref(), etc.)
has been revamped.
* mh/ref-iterators:
for_each_reflog(): reimplement using iterators
dir_iterator: new API for iterating over a directory tree
for_each_reflog(): don't abort for bad references
do_for_each_ref(): reimplement using reference iteration
refs: introduce an iterator interface
ref_resolves_to_object(): new function
entry_resolves_to_object(): rename function from ref_resolves_to_object()
get_ref_cache(): only create an instance if there is a submodule
remote rm: handle symbolic refs correctly
delete_refs(): add a flags argument
refs: use name "prefix" consistently
do_for_each_ref(): move docstring to the header file
refs: remove unnecessary "extern" keywords
Error handling in the codepaths that updates refs has been
improved.
* mh/update-ref-errors:
lock_ref_for_update(): avoid a symref resolution
lock_ref_for_update(): make error handling more uniform
t1404: add more tests of update-ref error handling
t1404: document function test_update_rejected
t1404: remove "prefix" argument to test_update_rejected
t1404: rename file to t1404-update-ref-errors.sh
This allows us to use common test infrastructure and parallelize
the tests. For now, GIT_SVN_TEST_HTTPD=true needs to be set to
enable the SVN HTTP tests because we reuse the same test cases
for both file:// and http:// SVN repositories. SVN_HTTPD_PORT
is no longer honored.
Tested under Apache 2.2 and 2.4 on Debian 7.x (wheezy) and
8.x (jessie), respectively.
Cc: Clemens Buchacher <drizzd@aon.at> Cc: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When c5c31d33 (grep: move pattern-type bits support to top-level
grep.[ch], 2012-10-03) introduced grep_commit_pattern_type() helper
function, the intention was to allow the users of grep API to having
to fiddle only with .pattern_type_option (which can be set to "fixed",
"basic", "extended", and "pcre"), and then immediately before compiling
the pattern strings for use, call grep_commit_pattern_type() to have
it prepare various bits in the grep_opt structure (like .fixed,
.regflags, etc.).
However, grep_set_pattern_type_option() helper function the grep API
internally uses were left as an external function by mistake. This
function shouldn't have been made callable by the users of the API.
Later when the grep API was used in revision traversal machinery,
the caller then mistakenly started calling the function around 34a4ae55 (log --grep: use the same helper to set -E/-F options as
"git grep", 2012-10-03), instead of setting the .pattern_type_option
field and letting the grep_commit_pattern_type() to take care of the
details.
This caused an unnecessary bug that made a configured
grep.patternType take precedence over the command line options
(e.g. --basic-regexp, --fixed-strings) in "git log" family of
commands.
The actual shortening rules aren't that interesting and
probably not worth getting into (I gloss over them here as
"shortened for human readability"). But the fact that %gD
shows whatever you gave on the command line is subtle and
worth mentioning. Since most people will feed a shortened
refname in the first place, it otherwise makes it hard to
understand the difference between the two.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
doc/pretty-formats: describe index/time formats for %gd
The "reflog selector" format changes based on a series of
heuristics, and that applies equally to both stock "log -g"
output, as well as "--format=%gd". The documentation for
"%gd" doesn't cover this. Let's mention the multiple formats
and refer the user back to the "-g" section for the complete
rules.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We document that asking for HEAD@{now} will switch the
output to show HEAD@{timestamp}, but not that specifying
`--date` has a similar effect, or that it can be overridden
with HEAD@{0}. Let's do so.
These rules come from 794151e (reflog-walk: always make
HEAD@{0} show indexed selectors, 2012-05-04), though that is
simply the culmination of years of these heuristics growing
organically.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
doc/rev-list-options: clarify "commit@{Nth}" for "-g" option
When "log -g" shows "HEAD@{1}", "HEAD@{2}", etc, calling
that "commit@{Nth}" is not really accurate. The "HEAD" part
is really the refname. By saying "commit", a reader may
misunderstand that to mean something related to the specific
commit we are showing, not the ref whose reflog we are
traversing.
While we're here, let's also switch these instances to use
literal backticks, as our style guide recommends. As a
bonus, that lets us drop some asciidoc quoting.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule-helper: fix indexing in clone retry error reporting path
'git submodule--helper update-clone' has logic to retry failed clones
a second time. For this purpose, there is a list of submodules to clone,
and a second list that is filled with the submodules to retry. Within
these lists, the submodules are identified by an index as if both lists
were just appended.
This works nicely except when the second clone attempt fails as well. To
report an error, the identifying index must be adjusted by an offset so
that it can be used as an index into the second list. However, the
calculation uses the logical total length of the lists so that the result
always points one past the end of the second list.
Pick the correct index.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Acked-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-submodule: forward exit code of git-submodule--helper more faithfully
git-submodule--helper is invoked as the upstream of a pipe in several
places. Usually, the failure of a program in this position is not
detected by the shell. For this reason, the code inserts a token in the
output stream when git-submodule--helper fails that is detected
downstream, where the shell script is quit with exit code 1.
There happens to be a bug in git-submodule--helper that leads to a
segmentation fault. The test suite triggers the crash in several places,
all of which are protected by 'test_must_fail'. But due to the inspecific
exit code 1, the crash remains undiagnosed.
Extend the failure protocol such that git-submodule--helper's exit code
is passed downstream (only in the case of failure). This enables the
downstream to use it as its own exit code, and 'test_must_fail' to
identify the segmentation fault as an unexpected failure.
The bug itself is fixed in the next commit.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Acked-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the transport protocol we use NAK to signal the non existence of a
common base, so fix the documentation. This helps readers of the document,
as they don't have to wonder about the difference between NAK and NACK.
As NACK is used in git archive and upload-archive, this is easy to get
wrong.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you have whitespace errors in lines you've introduced, it
can be convenient to be able to jump directly to them for
fixing. You can't quite use "git jump diff" for this,
because though it passes arbitrary options to "git diff", it
expects to see an actual unified diff in the output.
Whereas "git diff --check" actually produces lines that look
like compiler quickfix lines already, meaning we just need
to run it and feed the output directly to the editor.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
contrib/git-jump: fix greedy regex when matching hunks
The hunk-header regex looks for "\+\d+" to find the
post-image line numbers, but it skips the pre-image line
numbers with a simple ".*". That means we may greedily eat
the post-image numbers and match a "\+\d" further on, in the
funcname text.
diff: do not reuse worktree files that need "clean" conversion
When accessing a blob for a diff, we may try to reuse file
contents in the working tree, under the theory that it is
faster to mmap those file contents than it would be to
extract the content from the object database.
When we have to filter those contents, though, that
assumption does not hold. Even for our internal conversions
like CRLF, we have to allocate and fill a new buffer anyway.
But much worse, for external clean filters we have to exec
an arbitrary script, and we have no idea how expensive it
may be to run.
So let's skip this optimization when conversion into git's
"clean" form is required. This applies whenever the
"want_file" flag is false. When it's true, the caller
actually wants the smudged worktree contents, which the
reused file by definition already has (in fact, this is a
key optimization going the other direction, since reusing
the worktree file there lets us skip smudge filters).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit introduced the first use of ENOTSOCK. This macro is
not available on Windows. Define it as WSAENOTSOCK because that is the
corresponding error value reported by the Windows versions of socket
functions.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
contrib/persistent-https: use Git version for build label
The previous method simply used the UNIX timestamp of when the binary was
built as its build label.
$ make && ./git-remote-persistent-http -print_label 1469061546
This patch aims to align the label for this binary with the Git version
contained in the GIT-VERSION-FILE. This gives a better sense of the version
of the binary as it can be mapped to a particular revision or release of
Git itself. For example:
$ make && ./git-remote-persistent-http -print_label
2.9.1.275.g75676c8
Discussion of this patch is available on a related thread in the mailing
list surrounding this package called "contrib/persistent-https: update
ldflags syntax for Go 1.7+". The gmane.org link is:
http://article.gmane.org/gmane.comp.version-control.git/299653/
Signed-off-by: Parker Moore <parkrmoore@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
contrib/persistent-https: update ldflags syntax for Go 1.7+
Running `make all` in `contrib/persistent-https` results in a
failure on Go 1.7 and above.
Specifically, the error is:
go build -o git-remote-persistent-https \
-ldflags "-X main._BUILD_EMBED_LABEL 1468613136"
# _/Users/parkr/github/git/contrib/persistent-https
/usr/local/Cellar/go/1.7rc1/libexec/pkg/tool/darwin_amd64/link: -X
flag requires argument of the form importpath.name=value
make: *** [git-remote-persistent-https] Error 2
This `name=value` syntax for the -X flag was introduced in Go v1.5
(released Aug 19, 2015):
strbuf: avoid calling strbuf_grow() twice in strbuf_addbuf()
Implement strbuf_addbuf() as a normal function in order to avoid calling
strbuf_grow() twice, with the second callinside strbud_add() being a
no-op. This is slightly faster and also reduces the text size a bit.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the GPG prereq is not set, we do not run test 34. That
test changes the directory of the test script as a side
effect (something we usually frown on, but which matches the
style of the rest of this script). When test 35 (the
url-scrubbing test) runs, it expects to be in the directory
from test 34. If it's not, the test fails; we are in a
different sub-repo, our test-commit is built on a different
history, and the push becomes a non-fast-forward.
We can fix this by unconditionally moving to the directory
we expect (again, against our usual style but matching how
the rest of the script operates).
As an additional protection, let's also switch from "make a
new commit and push to master" to just "push to a new
branch". We don't care about the branch name; we just want
_some_ ref update to trigger the status output. Pushing to a
new branch is less likely to run into problems with
force-updates, changing the checked-out branch, etc.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
receive-pack: send keepalives during quiet periods
After a client has sent us the complete pack, we may spend
some time processing the data and running hooks. If the
client asked us to be quiet, receive-pack won't send any
progress data during the index-pack or connectivity-check
steps. And hooks may or may not produce their own progress
output. In these cases, the network connection is totally
silent from both ends.
Git itself doesn't care about this (it will wait forever),
but other parts of the system (e.g., firewalls,
load-balancers, etc) might hang up the connection. So we'd
like to send some sort of keepalive to let the network and
the client side know that we're still alive and processing.
We can use the same trick we did in 05e9515 (upload-pack:
send keepalive packets during pack computation, 2013-09-08).
Namely, we will send an empty sideband data packet every `N`
seconds that we do not relay any stderr data over the
sideband channel. As with 05e9515, this means that we won't
bother sending keepalives when there's actual progress data,
but will kick in when it has been disabled (or if there is a
lull in the progress data).
The concept is simple, but the details are subtle enough
that they need discussing here.
Before the client sends us the pack, we don't want to do any
keepalives. We'll have sent our ref advertisement, and we're
waiting for them to send us the pack (and tell us that they
support sidebands at all).
While we're receiving the pack from the client (or waiting
for it to start), there's no need for keepalives; it's up to
them to keep the connection active by sending data.
Moreover, it would be wrong for us to do so. When we are the
server in the smart-http protocol, we must treat our
connection as half-duplex. So any keepalives we send while
receiving the pack would potentially be buffered by the
webserver. Not only does this make them useless (since they
would not be delivered in a timely manner), but it could
actually cause a deadlock if we fill up the buffer with
keepalives. (It wouldn't be wrong to send keepalives in this
phase for a full-duplex connection like ssh; it's simply
pointless, as it is the client's responsibility to speak).
As soon as we've gotten all of the pack data, then the
client is waiting for us to speak, and we should start
keepalives immediately. From here until the end of the
connection, we send one any time we are not otherwise
sending data.
But there's a catch. Receive-pack doesn't know the moment
we've gotten all the data. It passes the descriptor to
index-pack, who reads all of the data, and then starts
resolving the deltas. We have to communicate that back.
To make this work, we instruct the sideband muxer to enable
keepalives in three phases:
1. In the beginning, not at all.
2. While reading from index-pack, wait for a signal
indicating end-of-input, and then start them.
3. Afterwards, always.
The signal from index-pack in phase 2 has to come over the
stderr channel which the muxer is reading. We can't use an
extra pipe because the portable run-command interface only
gives us stderr and stdout.
Stdout is already used to pass the .keep filename back to
receive-pack. We could also send a signal there, but then we
would find out about it in the main thread. And the
keepalive needs to be done by the async muxer thread (since
it's the one writing sideband data back to the client). And
we can't reliably signal the async thread from the main
thread, because the async code sometimes uses threads and
sometimes uses forked processes.
Therefore the signal must come over the stderr channel,
where it may be interspersed with other random
human-readable messages from index-pack. This patch makes
the signal a single NUL byte. This is easy to parse, should
not appear in any normal stderr output, and we don't have to
worry about any timing issues (like seeing half the signal
bytes in one read(), and half in a subsequent one).
This is a bit ugly, but it's simple to code and should work
reliably.
Another option would be to stop using an async thread for
muxing entirely, and just poll() both stderr and stdout of
index-pack from the main thread. This would work for
index-pack (because we aren't doing anything useful in the
main thread while it runs anyway). But it would make the
connectivity check and the hook muxers much more
complicated, as they need to simultaneously feed the
sub-programs while reading their stderr.
The index-pack phase is the only one that needs this
signaling, so it could simply behave differently than the
other two. That would mean having two separate
implementations of copy_to_sideband (and the keepalive
code), though. And it still doesn't get rid of the
signaling; it just means we can write a nicer message like
"END_OF_INPUT" or something on stdout, since we don't have
to worry about separating it from the stderr cruft.
One final note: this signaling trick is only done with
index-pack, not with unpack-objects. There's no point in
doing it for the latter, because by definition it only kicks
in for a small number of objects, where keepalives are not
as useful (and this conveniently lets us avoid duplicating
the implementation).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we receive a large push, the server side of the
connection may spend a lot of time (30s or more for a full
push of linux.git) walking the object graph without
producing any output. Let's give the user some indication
that we're actually working.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
receive-pack: relay connectivity errors to sideband
If the connectivity check encounters a problem when
receiving a push, the error output goes to receive-pack's
stderr, whose destination depends on the protocol used
(ssh tends to send it to the user, though without a "remote"
prefix; http will generally eat it in the server's error
log).
The information should consistently go back to the user, as
there is a reasonable chance their client is buggy and
generating a bad pack.
We can do so by muxing it over the sideband as we do with
other sub-process stderr.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
receive-pack: turn on index-pack resolving progress
When we receive a large push, the server side may have to
spend a lot of CPU processing the incoming packfile.
During the "receiving" phase, we are typically network
bound, and the client is writing its own progress to the
user. But during the delta resolution phase, we may spend
minutes (e.g., for a full push of linux.git) without
making any indication to the user that the connection has
not hung.
Let's ask index-pack to produce progress output for this
phase (unless the client asked us to be quiet, of course).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
index-pack: add flag for showing delta-resolution progress
The index-pack command has two progress meters: one for
"receiving objects", and one for "resolving deltas". You get
neither by default, or both with "-v".
But for a push through receive-pack, we would want only the
"resolving deltas" phase, _not_ the "receiving objects"
progress. There are two reasons for this.
One is simply that existing clients are already printing
"writing objects" progress at the same time. Arguably
"receiving" from the far end is more useful, because it
tells you what has actually gotten there, as opposed to what
might be stuck in a buffer somewhere between the client and
server. But that would require a protocol extension to tell
clients not to print their progress. Possible, but
complexity for little gain.
The second reason is much more important. In a full-duplex
connection like git-over-ssh, we can print progress while
the pack is incoming, and it will immediately get to the
client. But for a half-duplex connection like git-over-http,
we should not say anything until we have received the full
request. Anything we write is subject to being stuck in a
buffer by the webserver. Worse, we can end up in a deadlock
if that buffer fills up.
So our best bet is to avoid writing anything that isn't a
small fixed size until we've received the full pack.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
clone: use a real progress meter for connectivity check
Because the initial connectivity check for a cloned
repository can be slow, 0781aa4 (clone: let the user know
when check_everything_connected is run, 2013-05-03) added a
"fake" progress meter; we simply say "Checking connectivity"
when it starts, and "done" at the end, with nothing between.
Since check_connected() now knows how to do a real progress
meter, we can drop our fake one and use that one instead.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Connectivity checks have to traverse the entire object graph
in the worst case (e.g., a full clone or a full push). For
large repositories like linux.git, this can take 30-60
seconds, during which time git may produce little or no
output.
Let's add the option of showing progress, which is taken
care of by rev-list.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
check_connected: relay errors to alternate descriptor
Unless the "quiet" flag is given, check_connected sends any
errors to the stderr of the caller (because the child
rev-list inherits that descriptor). However, server-side
callers may want to send these over a sideband channel
instead. Let's make that possible.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
check_everything_connected: use a struct with named options
The number of variants of check_everything_connected has
grown over the years, so that the "real" function takes
several possibly-zero, possibly-NULL arguments. We hid the
complexity behind some wrapper functions, but this doesn't
scale well when we want to add new options.
If we add more wrapper variants to handle the new options,
then we can get a combinatorial explosion when those options
might be used together (right now nobody wants to use both
"shallow" and "transport" together, so we get by with just a
few wrappers).
If instead we add new parameters to each function, each of
which can have a default value, then callers who want the
defaults end up with confusing invocations like:
where it is unclear which parameter is which (and every
caller needs updated when we add new options).
Instead, let's add a struct to hold all of the optional
parameters. This is a little more verbose for the callers
(who have to declare the struct and fill it in), but it
makes their code much easier to follow, because every option
is named as it is set (and unused options do not have to be
mentioned at all).
Note that we could also stick the iteration function and its
callback data into the option struct, too. But since those
are required for each call, by avoiding doing so, we can let
very simple callers just pass "NULL" for the options and not
worry about the struct at all.
While we're touching each site, let's also rename the
function to check_connected(). The existing name was quite
long, and not all of the wrappers even used the full name.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's easy to ask rev-list to do a traversal that may takes
many seconds (e.g., by calling "--objects --all"). In theory
you can monitor its progress by the output you get to
stdout, but this isn't always easy.
Some operations, like "--count", don't make any output until
the end.
And some callers, like check_everything_connected(), are
using it just for the error-checking of the traversal, and
throw away stdout entirely.
This patch adds a "--progress" option which can be used to
give some eye-candy for a user waiting for a long traversal.
This is just a rev-list option and not a regular traversal
option, because it needs cooperation from the callbacks in
builtin/rev-list.c to do the actual count.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
check_everything_connected: always pass --quiet to rev-list
The check_everything_connected function takes a "quiet"
parameter which does two things if non-zero:
1. redirect rev-list's stderr to /dev/null to avoid
showing errors to the user
2. pass "--quiet" to rev-list
Item (1) is obviously useful. But item (2) is
surprisingly not. For rev-list, "--quiet" does not have
anything to do with chattiness on stderr; it tells rev-list
not to bother writing the list of traversed objects to
stdout, for efficiency. And since we always redirect
rev-list's stdout to /dev/null in this function, there is no
point in asking it to ever write anything to stdout.
The efficiency gains are modest; a best-of-five run of "git
rev-list --objects --all" on linux.git dropped from 32.013s
to 30.502s when adding "--quiet". That's only about 5%, but
given how easy it is, it's worth doing.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch-pack: grow stateless RPC windows exponentially
When updating large repositories, the LARGE_FLUSH limit (that is, the
limit at which the window growth strategy switches from exponential to
linear) is reached quite quickly. Use a conservative exponential growth
strategy when that limit is reached instead (and increase LARGE_FLUSH so
that there is no regression in window size).
This optimization is only applied during stateless RPCs to avoid the
issue raised and fixed in commit 44d8dc54 (Fix potential local
deadlock during fetch-pack, 2011-03-29).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
One part of "git am" had an oddball helper function that called
stuff from outside "his" as opposed to calling what we have "ours",
which was not gender-neutral and also inconsistent with the rest of
the system where outside stuff is usuall called "theirs" in
contrast to "ours".
General code clean-up around a helper function to write a
single-liner to a file.
* jk/write-file:
branch: use write_file_buf instead of write_file
use write_file_buf where applicable
write_file: add format attribute
write_file: add pointer+len variant
write_file: use xopen
write_file: drop "gently" form
branch: use non-gentle write_file for branch description
am: ignore return value of write_file()
config: fix bogus fd check when setting up default config
Improve the look of the way "git fetch" reports what happened to
each ref that was fetched.
* nd/fetch-ref-summary:
fetch: reduce duplicate in ref update status lines with placeholder
fetch: align all "remote -> local" output
fetch: change flag code for displaying tag update and deleted ref
fetch: refactor ref update status formatting code
git-fetch.txt: document fetch output
The test framework learned a new helper test_match_signal to
check an exit code from getting killed by an expected signal.
* jk/test-match-signal:
t/lib-git-daemon: use test_match_signal
test_must_fail: use test_match_signal
t0005: use test_match_signal as appropriate
tests: factor portable signal check out of t0005
There are certain house-keeping tasks that need to be performed at
the very beginning of any Git program, and programs that are not
built-in commands had to do them exactly the same way as "git"
potty does. It was easy to make mistakes in one-off standalone
programs (like test helpers). A common "main()" function that
calls cmd_main() of individual program has been introduced to
make it harder to make mistakes.
* jk/common-main:
mingw: declare main()'s argv as const
common-main: call git_setup_gettext()
common-main: call restore_sigpipe_to_default()
common-main: call sanitize_stdfds()
common-main: call git_extract_argv0_path()
add an extra level of indirection to main()
"git grep -i" has been taught to fold case in non-ascii locales
correctly.
* nd/icase:
grep.c: reuse "icase" variable
diffcore-pickaxe: support case insensitive match on non-ascii
diffcore-pickaxe: Add regcomp_or_die()
grep/pcre: support utf-8
gettext: add is_utf8_locale()
grep/pcre: prepare locale-dependent tables for icase matching
grep: rewrite an if/else condition to avoid duplicate expression
grep/icase: avoid kwsset when -F is specified
grep/icase: avoid kwsset on literal non-ascii strings
test-regex: expose full regcomp() to the command line
test-regex: isolate the bug test code
grep: break down an "if" stmt in preparation for next changes
The commands in the "log/diff" family have had an FILE* pointer in the
data structure they pass around for a long time, but some codepaths
used to always write to the standard output. As a preparatory step
to make "git format-patch" available to the internal callers, these
codepaths have been updated to consistently write into that FILE*
instead.
* js/log-to-diffopt-file:
mingw: fix the shortlog --output=<file> test
diff: do not color output when --color=auto and --output=<file> is given
t4211: ensure that log respects --output=<file>
shortlog: respect the --output=<file> setting
format-patch: use stdout directly
format-patch: avoid freopen()
format-patch: explicitly switch off color when writing to files
shortlog: support outputting to streams other than stdout
graph: respect the diffopt.file setting
line-log: respect diffopt's configured output file stream
log-tree: respect diffopt's configured output file stream
log: prepare log/log-tree to reuse the diffopt.close_file attribute
Fix recently introduced codepaths that are involved in parallel
submodule operations, which gave up on reading too early, and
could have wasted CPU while attempting to write under a corner
case condition.
* sb/submodule-parallel-fetch:
hoist out handle_nonblock function for xread and xwrite
xwrite: poll on non-blocking FDs
xread: retry after poll on EAGAIN/EWOULDBLOCK
When in a subdirectory of a repository, path arguments should be
interpreted relative to the current directory not the root of the
working tree.
The Git::repository object passed into setup_dir_diff() is configured to
handle this correctly but we create a new Git::repository here without
setting the WorkingSubdir argument. By simply using the existing
repository, path arguments are handled relative to the current
directory.
Reported-by: Bernhard Kirchen <bernhard.kirchen@rwth-aachen.de> Signed-off-by: John Keeping <john@keeping.me.uk> Acked-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This has always been supported since we read config variables
based on the command-line option parser. Document it explicitly
since users usually want to maintain the same program across
invocations.
fsck: optionally show more helpful info for broken links
When reporting broken links between commits/trees/blobs, it would be
quite helpful at times if the user would be told how the object is
supposed to be reachable.
With the new --name-objects option, git-fsck will try to do exactly
that: name the objects in a way that shows how they are reachable.
For example, when some reflog got corrupted and a blob is missing that
should not be, the user might want to remove the corresponding reflog
entry. This option helps them find that entry: `git fsck` will now
report something like this:
broken link from tree b5eb6ff... (refs/stash@{<date>}~37:)
to blob ec5cf80...
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
blame: allow to blame paths freshly added to the index
When blaming files, changes in the work tree are taken into account
and displayed as being "Not Committed Yet".
However, when blaming a file that is not known to the current HEAD,
git blame fails with `no such path 'foo' in HEAD`, even when the file
was git add'ed.
Allowing such a blame is useful when the new file added to the index
(not yet committed) was created by renaming an existing file. It
also is useful when the new file was created from pieces already in
HEAD, moved or copied from other files and blaming with copy
detection (i.e. "-C").
Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
cache-tree: do not generate empty trees as a result of all i-t-a subentries
If a subdirectory contains nothing but i-t-a entries, we generate an
empty tree object and add it to its parent tree. Which is wrong. Such
a subdirectory should not be added.
Note that this has a cascading effect. If subdir 'a/b/c' contains
nothing but i-t-a entries, we ignore it. But then if 'a/b' contains
only (the non-existing) 'a/b/c', then we should ignore 'a/b' while
building 'a' too. And it goes all the way up to top directory.
Noticed-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
cache-tree.c: fix i-t-a entry skipping directory updates sometimes
Commit 3cf773e (cache-tree: fix writing cache-tree when CE_REMOVE is
present - 2012-12-16) skips i-t-a entries when building trees objects
from the index. Unfortunately it may skip too much.
The code in question checks if an entry is an i-t-a one, then no tree
entry will be written. But it does not take into account that
directories can also be written with the same code. Suppose we have
this in the index.
We write an entry for a-file as normal and move on to subdir/file1,
where we realize the entry name for this level is simply just
"subdir", write down an entry for "subdir" then jump three items ahead
to the-last-file.
That is what happens normally when the first file in subdir is not an
i-t-a entry. If subdir/file1 is an i-t-a, because of the broken
condition in this code, we still think "subdir" is an i-t-a file and
not writing "subdir" down and jump to the-last-file. The result tree
now only has two items: a-file and the-last-file. subdir should be
there too (even though it only records two sub-entries, file2 and
file3).
If the i-t-a entry is subdir/file2 or subdir/file3, this is not a
problem because we jump over them anyway. Which may explain why the
bug is hidden for nearly four years.
Fix it by making sure we only skip i-t-a entries when the entry in
question is actual an index entry, not a directory.
Reported-by: Yuri Kanivetsky <yuri.kanivetsky@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If fsck_options->name_objects is initialized, and if it already has
name(s) for the object(s) that are to be the starting point(s) for
fsck_walk(), then that function will now add names for the objects
that were walked.
This will be highly useful for teaching git-fsck to identify root causes
for broken links, which is the task for the next patch in this series.
Note that this patch opts for decorating the objects with plain strings
instead of full-blown structs (à la `struct rev_name` in the code of
the `git name-rev` command), for several reasons:
- the code is much simpler than if it had to work with structs that
describe arbitrarily long names such as "master~14^2~5:builtin/am.c",
- the string processing is actually quite light-weight compared to the
rest of fsck's operation,
- the caller of fsck_walk() is expected to provide names for the
starting points, and using plain and simple strings is just the
easiest way to do that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In many places, we refer to objects via their SHA-1s. Let's abstract
that into a function.
For the moment, it does nothing else than what we did previously: print
out the 40-digit hex string. But that will change over the course of the
next patches.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We need to test linkage of pthread_create and pthread_join,
as pthread_mutex_* and pthread_key_* functions do not need
extra linkage under FreeBSD 10.3, leading to a false-positive
of the empty case.
Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The OS X build pulls in sys/queue.h, which pollutes the preprocessor
namespace with a macro generically named LIST_HEAD, and clashes with
the name we use here.
archive-tar: huge offset and future timestamps would not work on 32-bit
As we are not yet moving everything to size_t but still using ulong
internally when talking about the size of object, platforms with
32-bit long will not be able to produce tar archive with 4GB+ file,
and cannot grok 077777777777UL as a constant. Disable the extended
header feature and do not test it on them.
t0006: skip "far in the future" test when unsigned long is not long enough
Git's source code refers to timestamps as unsigned longs. On 32-bit
platforms, as well as on Windows, unsigned long is not large enough
to capture dates that are "absurdly far in the future".
While we can fix this issue properly by replacing unsigned long with
a larger type, we want to be a bit more conservative and just skip
those tests on the maint track.
Signed-off-by: Jeff King <peff@peff.net> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
receive-pack: implement advertising and receiving push options
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.
Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
push options: {pre,post}-receive hook learns about push options
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of
push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted
option.
The code is not executed as the push options are set to NULL, nor is the
new capability advertised.
There was some discussion back and forth how to present these push options
to the user as there are some ways to do it:
Keep all options in one environment variable
============================================
+ easiest way to implement in Git
- This would make things hard to parse correctly in the hook.
Put the options in files instead,
filenames are in GIT_PUSH_OPTION_FILES
======================================
+ After a discussion about environment variables and shells, we may not
want to put user data into an environment variable (see [1] for example).
+ We could transmit binaries, i.e. we're not bound to C strings as
we are when using environment variables to the user.
+ Maybe easier to parse than constructing environment variable names
GIT_PUSH_OPTION_{0,1,..} yourself
- cleanup of the temporary files is hard to do reliably
- we have race conditions with multiple clients pushing, hence we'd need
to use mkstemp. That's not too bad, but still.
Use environment variables, but restrict to key/value pairs
==========================================================
(When the user pushes a push option `foo=bar`, we'd
GIT_PUSH_OPTION_foo=bar)
+ very easy to parse for a simple model of push options
- it's not sufficient for more elaborate models, e.g.
it doesn't allow doubles (e.g. cc=reviewer@email)
Present the options in different environment variables
======================================================
(This is implemented)
* harder to parse as a user, but we have a sample hook for that.
- doesn't allow binary files
+ allows the same option twice, i.e. is not restrictive about
options, except for binary files.
+ doesn't clutter a remote directory with (possibly stale)
temporary files
As we first want to focus on getting simple strings to work
reliably, we go with the last option for now. If we want to
do transmission of binaries later, we can just attach a
'side-channel', e.g. "any push option that contains a '\0' is
put into a file instead of the environment variable and we'd
have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..}
environment variables".
[1] 'Shellshock' https://lwn.net/Articles/614218/
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In v1.8.5 days, 7f2ea5f0 (diff: allow lowercase letter to specify
what change class to exclude, 2013-07-17) taught the "--diff-filter"
mechanism to take lowercase letters as exclusion, but we forgot to
document it.
When we tried to fix in 58461bd (t1308: do not get fooled by symbolic
links to the source tree, 2016-06-02) an obscure case where the user
cd's into Git's source code via a symbolic link, a regression was
introduced that affects all test runs on Windows.
The original patch introducing the test case in question was careful to
use `$(pwd)` instead of `$PWD`.
This was done to account for the fact that Git's test suite uses shell
scripting even on Windows, where the shell's Unix-y paths are
incompatible with the main Git executable's idea of paths: it only
accepts Windows paths.
It is an awkward but necessary thing, then, to use `$(pwd)` (which gives
us a Windows path) when interacting with the Git executable and `$PWD`
(which gives the shell's idea of the current working directory in Unix-y
form) for shell scripts, including the test suite itself.
Obviously this broke the use case of the Git maintainer when changing
the working directory into Git's source code directory via a symlink,
i.e. when `$(pwd)` does not agree with `$PWD`.
However, we must not fix that use case at the expense of regressing
another use case.
Let's special-case Windows here, even if it is ugly, for lack of a more
elegant solution.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 47abd85 (fetch: Strip usernames from url's before
storing them, 2009-04-17) taught fetch to anonymize URLs.
The primary purpose there was to avoid sticking passwords in
merge-commit messages, but as a side effect, we also avoid
printing them to stderr.
The push side does not have the merge-commit problem, but it
probably should avoid printing them to stderr. We can reuse
the same anonymizing function.
Note that for this to come up, the credentials would have to
appear either on the command line or in a git config file,
neither of which is particularly secure. So people _should_
be switching to using credential helpers instead, which
makes this problem go away. But that's no excuse not to
improve the situation for people who for whatever reason end
up using credentials embedded in the URL.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git archive" learned to handle files that are larger than 8GB and
commits far in the future than expressible by the traditional US-TAR
format.
* jk/big-and-future-archive-tar:
archive-tar: drop return value
archive-tar: write extended headers for far-future mtime
archive-tar: write extended headers for file sizes >= 8GB
t5000: test tar files that overflow ustar headers
t9300: factor out portable "head -c" replacement
Git does not know what the contents in the index should be for a
path added with "git add -N" yet, so "git grep --cached" should not
show hits (or show lack of hits, with -L) in such a path, but that
logic does not apply to "git grep", i.e. searching in the working
tree files. But we did so by mistake, which has been corrected.
* nd/ita-cleanup:
grep: fix grepping for "intent to add" files
t7810-grep.sh: fix a whitespace inconsistency
t7810-grep.sh: fix duplicated test name