lock_ref_sha1_basic: handle REF_NODEREF with invalid refs
We sometimes call lock_ref_sha1_basic with REF_NODEREF
to operate directly on a symbolic ref. This is used, for
example, to move to a detached HEAD, or when updating
the contents of HEAD via checkout or symbolic-ref.
However, the first step of the function is to resolve the
refname to get the "old" sha1, and we do so without telling
resolve_ref_unsafe() that we are only interested in the
symref. As a result, we may detect a problem there not with
the symref itself, but with something it points to.
The real-world example I found (and what is used in the test
suite) is a HEAD pointing to a ref that cannot exist,
because it would cause a directory/file conflict with other
existing refs. This situation is somewhat broken, of
course, as trying to _commit_ on that HEAD would fail. But
it's not explicitly forbidden, and we should be able to move
away from it. However, neither "git checkout" nor "git
symbolic-ref" can do so. We try to take the lock on HEAD,
which is pointing to a non-existent ref. We bail from
resolve_ref_unsafe() with errno set to EISDIR, and the lock
code thinks we are attempting to create a d/f conflict.
Of course we're not. The problem is that the lock code has
no idea what level we were at when we got EISDIR, so trying
to diagnose or remove empty directories for HEAD is not
useful.
To make things even more complicated, we only get EISDIR in
the loose-ref case. If the refs are packed, the resolution
may "succeed", giving us the pointed-to ref in "refname",
but a null oid. Later, we say "ah, the null oid means we are
creating; let's make sure there is room for it", but
mistakenly check against the _resolved_ refname, not the
original.
We can fix this by making two tweaks:
1. Call resolve_ref_unsafe() with RESOLVE_REF_NO_RECURSE
when REF_NODEREF is set. This means any errors
we get will be from the orig_refname, and we can act
accordingly.
We already do this in the REF_DELETING case, but we
should do it for update, too.
2. If we do get a "refname" return from
resolve_ref_unsafe(), even with RESOLVE_REF_NO_RECURSE
it may be the name of the ref pointed-to by a symref.
We already normalize this back to orig_refname before
taking the lockfile, but we need to do so before the
null_oid check.
While we're rearranging the REF_NODEREF handling, we can
also bump the initialization of lflags to the top of the
function, where we are setting up other flags. This saves us
from having yet another conditional block on REF_NODEREF
just to set it later.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
lock_ref_sha1_basic: always fill old_oid while holding lock
Our basic strategy for taking a ref lock is:
1. Create $ref.lock to take the lock
2. Read the ref again while holding the lock (during which
time we know that nobody else can be updating it).
3. Compare the value we read to the expected "old_sha1"
The value we read in step (2) is returned to the caller via
the lock->old_oid field, who may use it for other purposes
(such as writing a reflog).
If we have no "old_sha1" (i.e., we are unconditionally
taking the lock), then we obviously must omit step 3. But we
_also_ omit step 2. This seems like a nice optimization, but
it means that the caller sees only whatever was left in
lock->old_oid from previous calls to resolve_ref_unsafe(),
which happened outside of the lock.
We can demonstrate this race pretty easily. Imagine you have
three commits, $one, $two, and $three. One script just flips
between $one and $two, without providing an old-sha1:
while true; do
git update-ref -m one refs/heads/foo $one
git update-ref -m two refs/heads/foo $two
done
Meanwhile, another script tries to set the value to $three,
also not using an old-sha1:
while true; do
git update-ref -m three refs/heads/foo $three
done
If these run simultaneously, we'll see a lot of lock
contention, but each of the writes will succeed some of the
time. The reflog may record movements between any of the
three refs, but we would expect it to provide a consistent
log: the "from" field of each log entry should be the same
as the "to" field of the previous one.
But if we check this:
perl -alne '
print "mismatch on line $."
if defined $last && $F[0] ne $last;
$last = $F[1];
' .git/logs/refs/heads/foo
we'll see many mismatches. Why?
Because sometimes, in the time between lock_ref_sha1_basic
filling lock->old_oid via resolve_ref_unsafe() and it taking
the lock, there may be a complete write by another process.
And the "from" field in our reflog entry will be wrong, and
will refer to an older value.
This is probably quite rare in practice. It requires writers
which do not provide an old-sha1 value, and it is a very
quick race. However, it is easy to fix: we simply perform
step (2), the read-under-lock, whether we have an old-sha1
or not. Then the value we hand back to the caller is always
atomic.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
checkout,clone: check return value of create_symref
It's unlikely that we would fail to create or update a
symbolic ref (especially HEAD), but if we do, we should
notice and complain. Note that there's no need to give more
details in our error message; create_symref will already
have done so.
While we're here, let's also fix a minor memory leak in
clone.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We generally hold a lock on the matching ref while writing
to its reflog; this prevents two simultaneous writers from
clobbering each other's reflog lines (it does not even have
to be two symref updates; because we don't hold the lock, we
could race with somebody writing to the pointed-to ref via
HEAD, for example).
We can fix this by writing the reflog before we commit the
lockfile. This runs the risk of writing the reflog but
failing the final rename(), but at least we now err on the
same side as the rest of the ref code.
Noticed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The create_symref() function predates the existence of
"struct lock_file", let alone the more recent "struct
ref_lock". Instead, it just does its own manual dot-locking.
Besides being more code, this has a few downsides:
- if git is interrupted while holding the lock, we don't
clean up the lockfile
- we don't do the usual directory/filename conflict check.
So you can sometimes create a symref "refs/heads/foo/bar",
even if "refs/heads/foo" exists (namely, if the refs are
packed and we do not hit the d/f conflict in the
filesystem).
This patch refactors create_symref() to use the "struct
ref_lock" interface, which handles both of these things.
There are a few bonus cleanups that come along with it:
- we leaked ref_path in some error cases
- the symref contents were stored in a fixed-size buffer,
putting an artificial (albeit large) limitation on the
length of the refname. We now write through fprintf, and
handle refnames of any size.
- we called adjust_shared_perm only after the file was
renamed into place, creating a potential race with
readers in a shared repository. The lockfile code now
handles this when creating the lockfile, making it
atomic.
- the legacy prefer_symlink_refs path did not do any
locking at all. Admittedly, it is not atomic from a
reader's perspective (as it unlinks and re-creates the
symlink to overwrite), but at least it cannot conflict
with other writers now.
- the result of this patch is hopefully more readable. It
eliminates three goto labels. Two were for error checking
that is now simplified, and the third was to reach shared
code that has been pulled into its own function.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time, create_symref() was used only to point
HEAD at a branch name, and the variable names reflect that
(e.g., calling the path git_HEAD). However, it is much more
generic these days (and has been for some time). Let's
update the variable names to make it easier to follow:
- `ref_target` is now just `refname`. This is closer to
the `ref` that is already in `cache.h`, but with the
extra twist that "name" makes it clear this is the name
and not a ref struct. Dropping "target" hopefully makes
it clear that we are talking about the symref itself,
not what it points to.
- `git_HEAD` is now `ref_path`; the on-disk path
corresponding to `ref`.
- `refs_heads_master` is now just `target`; i.e., what the
symref points at. This term also matches what is in
the symlink(2) manpage (at least on Linux).
- the buffer to hold the symref file's contents was simply
called `ref`. It's now `buf` (admittedly also generic,
but at least not actively introducing confusion with the
other variable holding the refname).
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current code writes a reflog entry whenever we update a
symbolic ref, but we never test that this is so. Let's add a
test to make sure upcoming refactoring doesn't cause a
regression.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
symbolic-ref: propagate error code from create_symref()
If create_symref() fails, git-symbolic-ref will still exit
with code 0, and our caller has no idea that the command did
nothing.
This appears to have been broken since the beginning of time
(e.g., it is not a regression where create_symref() stopped
calling die() or something similar).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"format-patch" has learned a new option to zero-out the commit
object name on the mbox "From " line.
* bc/format-patch-null-from-line:
format-patch: check that header line has expected format
format-patch: add an option to suppress commit hash
sha1_file.c: introduce a null_oid constant
When getpwuid() on the system returned NULL (e.g. the user is not
in the /etc/passwd file or other uid-to-name mappings), the
codepath to find who the user is to record it in the reflog barfed
and died. Loosen the check in this codepath, which already accepts
questionable ident string (e.g. host part of the e-mail address is
obviously bogus), and in general when we operate fmt_ident() function
in non-strict mode.
* jk/ident-loosen-getpwuid:
ident: loosen getpwuid error in non-strict mode
ident: keep a flag for bogus default_email
ident: make xgetpwuid_self() a static local helper
Add new config to avoid typing "--recurse-submodules" on each push.
* mc/push-recurse-submodules-config:
push: follow the "last one wins" convention for --recurse-submodules
push: test that --recurse-submodules on command line overrides config
push: add recurseSubmodules config option
On Windows, when writing to a pipe fails, errno is always
EINVAL. However, Git expects it to be EPIPE.
According to the documentation, there are two cases in which write()
triggers EINVAL: the buffer is NULL, or the length is odd but the mode
is 16-bit Unicode (the broken pipe is not mentioned as possible cause).
Git never sets the file mode to anything but binary, therefore we know
that errno should actually be EPIPE if it is EINVAL and the buffer is
not NULL.
See https://msdn.microsoft.com/en-us/library/1570wh78.aspx for more
details.
This works around t5571.11 failing with v2.6.4 on Windows.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git://ozlabs.org/~paulus/gitk:
gitk: sv.po: Update Swedish translation (311t)
gitk: Let .bleft.mid widgets 'breathe'
gitk: Match ttk fonts to gitk fonts
gitk: Update revision date in Japanese PO file
gitk: Update "Language:" header
gitk: Improve translation message
gitk: Remove unused line
gitk: Update year
gitk: Change last translator line
gitk: Update fuzzy messages
gitk: Update Japanese translation
gitk: Fix translation around copyright sign
gitk: Update Japanese translation
gitk: Fix wrong translation
gitk: Translate Japanese catalog
gitk: Translate more to Japanese catalog
gitk: Update Japanese message catalog
gitk: Re-sync line number in Japanese message catalogue
gitk: Color name update
format-patch: check that header line has expected format
The format of the "From " header line is very specific to allow
utilities to detect Git-style patches. Add a test that the patches
created are in the expected format.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
format-patch: add an option to suppress commit hash
Oftentimes, patches created by git format-patch will be stored in
version control or compared with diff. In these cases, two otherwise
identical patches can have different commit hashes, leading to diff
noise. Teach git format-patch a --zero-commit option that instead
produces an all-zero hash to avoid this diff noise.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'ls/p4-keep-empty-commits' into maint
"git p4" used to import Perforce CLs that touch only paths outside
the client spec as empty commits. It has been corrected to ignore
them instead, with a new configuration git-p4.keepEmptyCommits as a
backward compatibility knob.
* ls/p4-keep-empty-commits:
git-p4: add option to keep empty commits
The helper used to iterate over loose object directories to prune
stale objects did not closedir() immediately when it is done with a
directory--a callback such as the one used for "git prune" may want
to do rmdir(), but it would fail on open directory on platforms
such as WinXP.
* jk/prune-mtime:
prune: close directory earlier during loose-object directory traversal
"git p4" used to import Perforce CLs that touch only paths outside
the client spec as empty commits. It has been corrected to ignore
them instead, with a new configuration git-p4.keepEmptyCommits as a
backward compatibility knob.
* ls/p4-keep-empty-commits:
git-p4: add option to keep empty commits
The helper used to iterate over loose object directories to prune
stale objects did not closedir() immediately when it is done with a
directory--a callback such as the one used for "git prune" may want
to do rmdir(), but it would fail on open directory on platforms
such as WinXP.
* jk/prune-mtime:
prune: close directory earlier during loose-object directory traversal
completion: fix completing unstuck email alias arguments
Completing unstuck form of email aliases doesn't quite work:
$ git send-email --to <TAB>
alice bob cecil
$ git send-email --to a<TAB>
alice bob cecil
While listing email aliases works as expected, the second case should
just complete to 'alice', but it keeps offering all email aliases
instead.
The cause for this behavior is that in this case we mistakenly tell
__gitcomp() explicitly that the current word to be completed is empty,
while in reality it is not. As a result __gitcomp() doesn't filter
out non-matching aliases, so all aliases end up being offered over and
over again.
Fix this by not passing the current word to be completed to
__gitcomp() and letting it go the default route and grab it from the
'$cur' variable. Don't pass empty prefix either, because it's assumed
to be empty when unspecified, so it's not necessary.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 00bce77 (ident.c: add support for IPv6, 2015-11-27)
moved the "gethostbyname" call out of "add_domainname" and
into the helper function "canonical_name". But when moving
the code, it forgot that the "buf" variable is passed as
"host" in the helper.
Reported-by: johan defries <johandefries@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user has not specified an identity and we have to
turn to getpwuid() to find the username or gecos field, we
die immediately when getpwuid fails (e.g., because the user
does not exist). This is OK for making a commit, where we
have set IDENT_STRICT and would want to bail on bogus input.
But for something like a reflog, where the ident is "best
effort", it can be pain. For instance, even running "git
clone" with a UID that is not in /etc/passwd will result in
git barfing, just because we can't find an ident to put in
the reflog.
Instead of dying in xgetpwuid_self, we can instead return a
fallback value, and set a "bogus" flag. For the username in
an email, we already have a "default_email_is_bogus" flag.
For the name field, we introduce (and check) a matching
"default_name_is_bogus" flag. As a bonus, this means you now
get the usual "tell me who you are" advice instead of just a
"no such user" error.
No tests, as this is dependent on configuration outside of
git's control. However, I did confirm that it behaves
sensibly when I delete myself from the local /etc/passwd
(reflogs get written, and commits complain).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fonts set in setoptions aren't consistently picked up by ttk, which
uses its own predefined fonts. This is noticeable when switching
between using and not using ttk with custom fonts or in HiDPI settings
(where the default TTK fonts do _not_ respect tk sclaing).
Fix by mapping the ttk fontset to the one used by gitk internally.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
rebase -i: remember merge options beyond continue actions
If the user explicitly specified a merge strategy or strategy
options, continue to use that strategy/option after
"rebase --continue". Add a test of the corrected behavior.
If --merge is specified or implied by -s or -X, then "strategy and
"strategy_opts" are set to values from which "strategy_args" can be
derived; otherwise they are set to empty strings. Either way,
their values are propagated from one step of an interactive rebase
to the next via state files.
"do_merge", on the other hand, is *not* propagated to later steps of
an interactive rebase. Therefore, making the initialization of
"strategy_args" conditional on "do_merge" being set prevents later
steps of an interactive rebase from setting it correctly.
Luckily, we don't need the "do_merge" guard at all. If the rebase
was started without --merge, then "strategy" and "strategy_opts"
are both the empty string, which results in "strategy_args" also
being set to the empty string, which is just what we want in that
situation. So remove the "do_merge" guard and derive
"strategy_args" from "strategy" and "strategy_opts" every time.
Reported-by: Diogo de Campos <campos@esss.com.br> Signed-off-by: Fabian Ruch <bafain@gmail.com> Helped-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'sn/null-pointer-arith-in-mark-tree-uninteresting' into maint
mark_tree_uninteresting() has code to handle the case where it gets
passed a NULL pointer in its 'tree' parameter, but the function had
'object = &tree->object' assignment before checking if tree is
NULL. This gives a compiler an excuse to declare that tree will
never be NULL and apply a wrong optimization. Avoid it.
* sn/null-pointer-arith-in-mark-tree-uninteresting:
revision.c: fix possible null pointer arithmetic
Update "git subtree" (in contrib/) so that it can take whitespaces
in the pathnames, not only in the in-tree pathname but the name of
the directory that the repository is in.
* as/subtree-with-spaces:
contrib/subtree: respect spaces in a repository path
t7900-subtree: test the "space in a subdirectory name" case
Merge branch 'jk/test-lint-forbid-when-finished-in-subshell' into maint
Because "test_when_finished" in our test framework queues the
clean-up tasks to be done in a shell variable, it should not be
used inside a subshell. Add a mechanism to allow 'bash' to catch
such uses, and fix the ones that were found.
* jk/test-lint-forbid-when-finished-in-subshell:
test-lib-functions: detect test_when_finished in subshell
t7800: don't use test_config in a subshell
test-lib-functions: support "test_config -C <dir> ..."
t5801: don't use test_when_finished in a subshell
t7610: don't use test_config in a subshell
mark_tree_uninteresting() has code to handle the case where it gets
passed a NULL pointer in its 'tree' parameter, but the function had
'object = &tree->object' assignment before checking if tree is
NULL. This gives a compiler an excuse to declare that tree will
never be NULL and apply a wrong optimization. Avoid it.
* sn/null-pointer-arith-in-mark-tree-uninteresting:
revision.c: fix possible null pointer arithmetic
IO::Socket::SSL defines level 1 debug as "print out errors from
IO::Socket::SSL and ciphers from Net::SSLeay". In fact, it aliases
Net::SSLeay::trace which is defined to guarantee silence at level 0 and
only emit error messages at level 1, so let's enable it by default.
The modification of warnings is needed to avoid a warning about:
Name "IO::Socket::SSL::DEBUG" used only once: possible typo
Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we have to deduce the user's email address and can't come
up with something plausible for the hostname, we simply
write "(none)" or ".(none)" in the hostname.
Later, our strict-check is forced to use strstr to look for
this magic string. This is probably not a problem in
practice, but it's rather ugly. Let's keep an extra flag
that tells us the email is bogus, and check that instead.
We could get away with simply setting the global in
add_domainname(); it only gets called to write into
git_default_email. However, let's make the code a little
more obvious to future readers by actually passing a pointer
to our "bogus" flag down the call-chain.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ident: make xgetpwuid_self() a static local helper
This function is defined in wrapper.c, but nobody besides
ident.c uses it. And nobody is likely to in the future,
either, as anything that cares about the user's name should
be going through the ident code.
Moving it here is a cleanup of the global namespace, but it
will also enable further cleanups inside ident.c.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
More transition from "unsigned char[40]" to "struct object_id".
This needed a few merge fixups, but is mostly disentangled from other
topics.
* bc/object-id:
remote: convert functions to struct object_id
Remove get_object_hash.
Convert struct object to object_id
Add several uses of get_object_hash.
object: introduce get_object_hash macro.
ref_newer: convert to use struct object_id
push_refs_with_export: convert to struct object_id
get_remote_heads: convert to struct object_id
parse_fetch: convert to use struct object_id
add_sought_entry_mem: convert to struct object_id
Convert struct ref to use object_id.
sha1_file: introduce has_object_file helper.
The necessary infrastructure to build topics using the free Travis
CI has been added. Developers forking from this topic (and enabling
Travis) can do their own builds, and we can turn on auto-builds for
git/git (including build-status for pull requests that people
open).
A changelist that contains only excluded files due to a client spec was
imported as an empty commit. Fix that issue by ignoring these commits.
Add option "git-p4.keepEmptyCommits" to make the previous behavior
available.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Helped-by: Pete Harlan Acked-by: Luke Diamand <luke@diamand.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
A build without NO_IPv6 used to use gethostbyname() when guessing
user's hostname, instead of getaddrinfo() that is used in other
codepaths in such a build.
* ep/ident-with-getaddrinfo:
ident.c: add support for IPv6