Comparing the result of read_in_full() using less-than is
potentially dangerous, as a negative return value may be
converted to an unsigned type and be considered a success.
This is discussed further in 561598cfcf (read_pack_header:
handle signed/unsigned comparison in read result,
2017-09-13).
Each of these instances is actually fine in practice:
- in get-tar-commit-id, the HEADERSIZE macro expands to a
signed integer. If it were switched to an unsigned type
(e.g., a size_t), then it would be a bug.
- the other two callers check for a short read only after
handling a negative return separately. This is a fine
practice, but we'd prefer to model "!=" as a general
rule.
So all of these cases can be considered cleanups and not
actual bugfixes.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We call write_in_full() with a size that we know is greater
than zero. The return value can never be zero, then, since
write_in_full() converts such a failed write() into ENOSPC
and returns -1. We can just drop this branch of the error
handling entirely.
Suggested-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
But during the conflict resolution in c50424a6f0 (Merge
branch 'jk/write-in-full-fix', 2017-09-25), this morphed
into
write_in_full(...) < 1
This behaves as we want, but we prefer to avoid modeling the
"less than length" error-check which can be subtly buggy, as
shown in efacf609c8 (config: avoid "write_in_full(fd, buf,
len) < len" pattern, 2017-09-13).
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.
* jk/write-in-full-fix:
read_pack_header: handle signed/unsigned comparison in read result
config: flip return value of store_write_*()
notes-merge: use ssize_t for write_in_full() return value
pkt-line: check write_in_full() errors against "< 0"
convert less-trivial versions of "write_in_full() != len"
avoid "write_in_full(fd, buf, len) != len" pattern
get-tar-commit-id: check write_in_full() return against 0
config: avoid "write_in_full(fd, buf, len) < len" pattern
A handful of tests to demonstrates a recursive implementation of
"name-rev" hurts.
* mg/name-rev-tests-with-short-stack:
t6120: test describe and name-rev with deep repos
t6120: clean up state after breaking repo
t6120: test name-rev --all and --stdin
t7004: move limited stack prereq to test-lib
Many of our programs consider that it is OK to release dynamic
storage that is used throughout the life of the program by simply
exiting, but this makes it harder to leak detection tools to avoid
reporting false positives. Plug many existing leaks and introduce
a mechanism for developers to mark that the region of memory
pointed by a pointer is not lost/leaking to help these tools.
* jk/leak-checkers:
git-compat-util: make UNLEAK less error-prone
Commit 0e5bba5 ("add UNLEAK annotation for reducing leak false
positives", 2017-09-08) introduced an UNLEAK macro to be used as
"UNLEAK(var);", but its existing definitions leave semicolons that act
as empty statements, which will lead to syntax errors, e.g.
if (condition)
UNLEAK(var);
else
something_else(var);
would be broken with two statements between if (condition) and else.
Lose the excess semicolon from the end of the macro replacement text.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unlike "git commit-tree < file", "git commit-tree -F file" did not
pass the contents of the file verbatim and instead completed an
incomplete line at the end, if exists. The latter has been updated
to match the behaviour of the former.
* rk/commit-tree-make-F-verbatim:
commit-tree: do not complete line in -F input
* rs/strbuf-leakfix: (34 commits)
wt-status: release strbuf after use in wt_longstatus_print_tracking()
wt-status: release strbuf after use in read_rebase_todolist()
vcs-svn: release strbuf after use in end_revision()
utf8: release strbuf on error return in strbuf_utf8_replace()
userdiff: release strbuf after use in userdiff_get_textconv()
transport-helper: release strbuf after use in process_connect_service()
sequencer: release strbuf after use in save_head()
shortlog: release strbuf after use in insert_one_record()
sha1_file: release strbuf on error return in index_path()
send-pack: release strbuf on error return in send_pack()
remote: release strbuf after use in set_url()
remote: release strbuf after use in migrate_file()
remote: release strbuf after use in read_remote_branches()
refs: release strbuf on error return in write_pseudoref()
notes: release strbuf after use in notes_copy_from_stdin()
merge: release strbuf after use in write_merge_heads()
merge: release strbuf after use in save_state()
mailinfo: release strbuf on error return in handle_boundary()
mailinfo: release strbuf after use in handle_from()
help: release strbuf on error return in exec_woman_emacs()
...
Implement transactional update to the packed-ref representation of
references.
* mh/packed-ref-transactions:
files_transaction_finish(): delete reflogs before references
packed-backend: rip out some now-unused code
files_ref_store: use a transaction to update packed refs
t1404: demonstrate two problems with reference transactions
files_initial_transaction_commit(): use a transaction for packed refs
prune_refs(): also free the linked list
files_pack_refs(): use a reference transaction to write packed refs
packed_delete_refs(): implement method
packed_ref_store: implement reference transactions
struct ref_transaction: add a place for backends to store data
packed-backend: don't adjust the reference count on lock/unlock
* kw/merge-recursive-cleanup:
merge-recursive: change current file dir string_lists to hashmap
merge-recursive: remove return value from get_files_dirs
merge-recursive: fix memory leak
As "git commit" to conclude a conflicted "git merge" honors the
commit-msg hook, "git merge" that recoreds a merge commit that
cleanly auto-merges should, but it didn't.
* sb/merge-commit-msg-hook:
builtin/merge: honor commit-msg hook for merges
Many of our programs consider that it is OK to release dynamic
storage that is used throughout the life of the program by simply
exiting, but this makes it harder to leak detection tools to avoid
reporting false positives. Plug many existing leaks and introduce
a mechanism for developers to mark that the region of memory
pointed by a pointer is not lost/leaking to help these tools.
* jk/leak-checkers:
add UNLEAK annotation for reducing leak false positives
set_git_dir: handle feeding gitdir to itself
repository: free fields before overwriting them
reset: free allocated tree buffers
reset: make tree counting less confusing
config: plug user_config leak
update-index: fix cache entry leak in add_one_file()
add: free leaked pathspec after add_files_to_cache()
test-lib: set LSAN_OPTIONS to abort by default
test-lib: --valgrind should not override --verbose-log
Our hashmap implementation in hashmap.[ch] is not thread-safe when
adding a new item needs to expand the hashtable by rehashing; add
an API to disable the automatic rehashing to work it around.
* jh/hashmap-disable-counting:
hashmap: add API to disable item counting when threaded
The long-standing rule that an in-core lockfile instance, once it
is used, must not be freed, has been lifted and the lockfile and
tempfile APIs have been updated to reduce the chance of programming
errors.
* jk/incore-lockfile-removal:
stop leaking lock structs in some simple cases
ref_lock: stop leaking lock_files
lockfile: update lifetime requirements in documentation
tempfile: auto-allocate tempfiles on heap
tempfile: remove deactivated list entries
tempfile: use list.h for linked list
tempfile: release deactivated strbufs instead of resetting
tempfile: robustify cleanup handler
tempfile: factor out deactivation
tempfile: factor out activation
tempfile: replace die("BUG") with BUG()
tempfile: handle NULL tempfile pointers gracefully
tempfile: prefer is_tempfile_active to bare access
lockfile: do not rollback lock on failed close
tempfile: do not delete tempfile on failed close
always check return value of close_tempfile
verify_signed_buffer: prefer close_tempfile() to close()
setup_temporary_shallow: move tempfile struct into function
setup_temporary_shallow: avoid using inactive tempfile
write_index_as_tree: cleanup tempfile on error
"git gc" and friends when multiple worktrees are used off of a
single repository did not consider the index and per-worktree refs
of other worktrees as the root for reachability traversal, making
objects that are in use only in other worktrees to be subject to
garbage collection.
* nd/prune-in-worktree:
refs.c: reindent get_submodule_ref_store()
refs.c: remove fallback-to-main-store code get_submodule_ref_store()
rev-list: expose and document --single-worktree
revision.c: --reflog add HEAD reflog from all worktrees
files-backend: make reflog iterator go through per-worktree reflog
revision.c: --all adds HEAD from all worktrees
refs: remove dead for_each_*_submodule()
refs.c: move for_each_remote_ref_submodule() to submodule.c
revision.c: use refs_for_each*() instead of for_each_*_submodule()
refs: add refs_head_ref()
refs: move submodule slash stripping code to get_submodule_ref_store
refs.c: refactor get_submodule_ref_store(), share common free block
revision.c: --indexed-objects add objects from all worktrees
revision.c: refactor add_index_objects_to_pending()
refs.c: use is_dir_sep() in resolve_gitlink_ref()
revision.h: new flag in struct rev_info wrt. worktree-related refs
* ma/split-symref-update-fix:
refs/files-backend: add `refname`, not "HEAD", to list
refs/files-backend: correct return value in lock_ref_for_update
refs/files-backend: fix memory leak in lock_ref_for_update
refs/files-backend: add longer-scoped copy of string to list
* mh/notes-cleanup:
load_subtree(): check that `prefix_len` is in the expected range
load_subtree(): declare some variables to be `size_t`
hex_to_bytes(): simpler replacement for `get_oid_hex_segment()`
get_oid_hex_segment(): don't pad the rest of `oid`
load_subtree(): combine some common code
get_oid_hex_segment(): return 0 on success
load_subtree(): only consider blobs to be potential notes
load_subtree(): check earlier whether an internal node is a tree entry
load_subtree(): separate logic for internal vs. terminal entries
load_subtree(): fix incorrect comment
load_subtree(): reduce the scope of some local variables
load_subtree(): remove unnecessary conditional
notes: make GET_NIBBLE macro more robust
read_pack_header: handle signed/unsigned comparison in read result
The result of read_in_full() may be -1 if we saw an error.
But in comparing it to a sizeof() result, that "-1" will be
promoted to size_t. In fact, the largest possible size_t
which is much bigger than our struct size. This means that
our "< sizeof(header)" error check won't trigger.
In practice, we'd go on to read uninitialized memory and
compare it to the PACK signature, which is likely to fail.
But we shouldn't get there.
We can fix this by making a direct "!=" comparison to the
requested size, rather than "<". This means that errors get
lumped in with short reads, but that's sufficient for our
purposes here. There's no PH_ERROR tp represent our case.
And anyway, this function reads from pipes and network
sockets. A network error may racily appear as EOF to us
anyway if there's data left in the socket buffers.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The store_write_section() and store_write_pairs() functions
are basically high-level wrappers around write(). But their
return values are flipped from our usual convention, using
"1" for success and "0" for failure.
Let's flip them to follow the usual write() conventions and
update all callers. As these are local to config.c, it's
unlikely that we'd have new callers in any topics in flight
(which would be silently broken by our change). But just to
be on the safe side, let's rename them to just
write_section() and write_pairs(). That also accentuates
their relationship with write().
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
notes-merge: use ssize_t for write_in_full() return value
We store the return value of write_in_full() in a long,
though the return is actually an ssize_t. This probably
doesn't matter much in practice (since the buffer size is
alredy an unsigned long), but it might if the size if
between what can be represented in "long" and "unsigned
long", and if your size_t is larger than a "long" (as it is
on 64-bit Windows).
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pkt-line: check write_in_full() errors against "< 0"
As with the previous two commits, we prefer to check
write_in_full()'s return value to see if it is negative,
rather than comparing it to the input length.
These cases actually flip the logic to check for success,
making conversion a little different than in other cases. We
could of course write:
if (write_in_full(...) >= 0)
return 0;
return error(...);
But our usual method of spelling write() error checks is
just "< 0". So let's flip the logic for each of these
conditionals to our usual style.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
convert less-trivial versions of "write_in_full() != len"
The prior commit converted many sites to check the return
value of write_in_full() for negativity, rather than a
mismatch with the input length. This patch covers similar
cases, but where the return value is stored in an
intermediate variable. These should get the same treatment,
but they need to be reviewed more carefully since it would
be a bug if the return value is stored in an unsigned type
(which indeed, it is in one of the cases).
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb95 (write_in_full: really
write in full or return error on disk full., 2007-01-11).
So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:
1. It can do a funny signed/unsigned comparison. If your
"len" is signed (e.g., a size_t) then the compiler will
promote the "-1" to its unsigned variant.
This works out for "!= len" (unless you really were
trying to write the maximum size_t bytes), but is a
bug if you check "< len" (an example of which was fixed
recently in config.c).
We should avoid promoting the mental model that you
need to check the length at all, so that new sites are
not tempted to copy us.
2. Checking for a negative value is shorter to type,
especially when the length is an expression.
3. Linus says so. In d34cf19b89 (Clean up write_in_full()
users, 2007-01-11), right after the write_in_full()
semantics were changed, he wrote:
I really wish every "write_in_full()" user would just
check against "<0" now, but this fixes the nasty and
stupid ones.
Appeals to authority aside, this makes it clear that
writing it this way does not have an intentional
benefit. It's a historical curiosity that we never
bothered to clean up (and which was undoubtedly
cargo-culted into new sites).
So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).
[1] A careful reader may notice there is one way that
write_in_full() can return a different value. If we ask
write() to write N bytes and get a return value that is
_larger_ than N, we could return a larger total. But
besides the fact that this would imply a totally broken
version of write(), it would already invoke undefined
behavior. Our internal remaining counter is an unsigned
size_t, which means that subtracting too many byte will
wrap it around to a very large number. So we'll instantly
begin reading off the end of the buffer, trying to write
gigabytes (or petabytes) of data.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
get-tar-commit-id: check write_in_full() return against 0
We ask to write 41 bytes and make sure that the return value
is at least 41. This is the same "dangerous" pattern that
was fixed in the prior commit (wherein a negative return
value is promoted to unsigned), though it is not dangerous
here because our "41" is a constant, not an unsigned
variable.
But we should convert it anyway to avoid modeling a
dangerous construct.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return type of write_in_full() is a signed ssize_t,
because we may return "-1" on failure (even if we succeeded
in writing some bytes). But "len" itself is may be an
unsigned type (the function takes a size_t, but of course we
may have something else in the calling function). So while
it seems like:
if (write_in_full(fd, buf, len) < len)
die_errno("write error");
would trigger on error, it won't if "len" is unsigned. The
compiler sees a signed/unsigned comparison and promotes the
signed value, resulting in (size_t)-1, the highest possible
size_t (or again, whatever type the caller has). This cannot
possibly be smaller than "len", and so the conditional can
never trigger.
I scoured the code base for cases of this, but it turns out
that these two in git_config_set_multivar_in_file_gently()
are the only ones. Here our "len" is the difference between
two size_t variables, making the result an unsigned size_t.
We can fix this by just checking for a negative return value
directly, as write_in_full() will never return any value
except -1 or the full count.
There's no addition to the test suite here, since you need
to convince write() to fail in order to see the problem. The
simplest reproduction recipe I came up with is to trigger
ENOSPC:
# make a limited-size filesystem
dd if=/dev/zero of=small.disk bs=1M count=1
mke2fs small.disk
mkdir mnt
sudo mount -o loop small.disk mnt
cd mnt
sudo chown $USER:$USER .
# make a config file with some content
git config --file=config one.key value
git config --file=config two.key value
# now fill up the disk
dd if=/dev/zero of=fill
# and try to delete a key, which requires copying the rest
# of the file to config.lock, and will fail on write()
git config --file=config --unset two.key
That final command should (and does after this patch)
produce an error message due to the failed write, and leave
the file intact. Instead, it silently ignores the failure
and renames config.lock into place, leaving you with a
totally empty config file!
Reported-by: demerphq <demerphq@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
subprocess: loudly die when subprocess asks for an unsupported capability
The handshake_capabilities() function first advertises the set of
capabilities it supports, so that the other side can pick and choose
which ones to use and ask us to enable in its response. Then we
read the response that tells us what choice the other side made. If
we saw something that we never advertised, that indicates one of two
things. The other side, i.e. the "upgraded" filter, is not paying
attention of the capabilities advertisement, and asking something
its correct operation relies on, but we are not capable of giving
that unknown feature and operate without it, so after that point the
exchange of data is a garbage-in-garbage-out. Or the other side
wanted to ask for one of the capabilities we advertised, but the
code has typo and their wish to enable a capability that its correct
operation relies on is not understood on this end. The result is
the same garbage-in-garbage-out.
Instead of sweeping such a potential bug under the rug, die loudly
when we see a request for an unsupported capability in order to
force sloppily-written filter scripts to get corrected.
"git branch -M a b" while on a branch that is completely unrelated
to either branch a or branch b misbehaved when multiple worktree
was in use. This has been fixed.
* nd/worktree-kill-parse-ref:
branch: fix branch renaming not updating HEADs correctly
In addition to "cc: <a@dd.re.ss> # cruft", "cc: a@dd.re.ss # cruft"
was taught to "git send-email" as a valid way to tell it that it
needs to also send a carbon copy to <a@dd.re.ss> in the trailer
section.
* mm/send-email-cc-cruft:
send-email: don't use Mail::Address, even if available
send-email: fix garbage removal after address
Merge branch 'rs/archive-excluded-directory' into maint
"git archive" did not work well with pathspecs and the
export-ignore attribute.
We may want to resurrect the "we don't archive an empty directory"
bonus patch, but I do not mind merging the above early to 'next'
and leave it as a separate follow-up enhancement.
cf. <20170820090629.tumvqwzkromcykjf@sigill.intra.peff.net>
* rs/archive-excluded-directory:
archive: don't queue excluded directories
archive: factor out helper functions for handling attributes
t5001: add tests for export-ignore attributes and exclude pathspecs
Killing "git merge --edit" before the editor returns control left
the repository in a state with MERGE_MSG but without MERGE_HEAD,
which incorrectly tells the subsequent "git commit" that there was
a squash merge in progress. This has been fixed.
* mg/killed-merge:
merge: save merge state earlier
merge: split write_merge_state in two
merge: clarify call chain
Documentation/git-merge: explain --continue
"git apply" that is used as a better "patch -p1" failed to apply a
taken from a file with CRLF line endings to a file with CRLF line
endings. The root cause was because it misused convert_to_git()
that tried to do "safe-crlf" processing by looking at the index
entry at the same path, which is a nonsense---in that mode, "apply"
is not working on the data in (or derived from) the index at all.
This has been fixed.
* tb/apply-with-crlf:
apply: file commited with CRLF should roundtrip diff and apply
convert: add SAFE_CRLF_KEEP_CRLF
Merge branch 'cc/subprocess-handshake-missing-capabilities' into maint
When handshake with a subprocess filter notices that the process
asked for an unknown capability, Git did not report what program
the offending subprocess was running. This has been corrected.
We may want a follow-up fix to tighten the error checking, though.
* cc/subprocess-handshake-missing-capabilities:
sub-process: print the cmd when a capability is unsupported
"git svn" used with "--localtime" option did not compute the tz
offset for the timestamp in question and instead always used the
current time, which has been corrected.
* ur/svn-local-zone:
git svn fetch: Create correct commit timestamp when using --localtime
"git am -s" has been taught that some input may end with a trailer
block that is not Signed-off-by: and it should refrain from adding
an extra blank line before adding a new sign-off in such a case.
* pw/am-signoff:
am: fix signoff when other trailers are present
Merge branch 'ma/pager-per-subcommand-action' into maint
The "tag.pager" configuration variable was useless for those who
actually create tag objects, as it interfered with the use of an
editor. A new mechanism has been introduced for commands to enable
pager depending on what operation is being carried out to fix this,
and then "git tag -l" is made to run pager by default.
If this works out OK, I think there are low-hanging fruits in
other commands like "git branch" that outputs long list in one mode
while taking input in another.
* ma/pager-per-subcommand-action:
git.c: ignore pager.* when launching builtin as dashed external
tag: change default of `pager.tag` to "on"
tag: respect `pager.tag` in list-mode only
t7006: add tests for how git tag paginates
git.c: provide setup_auto_pager()
git.c: let builtins opt for handling `pager.foo` themselves
builtin.h: take over documentation from api-builtin.txt
"git log --tag=no-such-tag" showed log starting from HEAD, which
has been fixed---it now shows nothing.
* jk/rev-list-empty-input:
revision: do not fallback to default when rev_input_given is set
rev-list: don't show usage when we see empty ref patterns
revision: add rev_input_given flag
t6018: flesh out empty input/output rev-list tests
Merge branch 'st/lib-gpg-kill-stray-agent' into maint
Some versions of GnuPG fails to kill gpg-agent it auto-spawned
and such a left-over agent can interfere with a test. Work it
around by attempting to kill one before starting a new test.
* st/lib-gpg-kill-stray-agent:
t: lib-gpg: flush gpg agent on startup
refs/files-backend: add `refname`, not "HEAD", to list
An earlier patch rewrote `split_symref_update()` to add a copy of a
string to a string list instead of adding the original string. That was
so that the original string could be freed in a later patch, but it is
also conceptually cleaner, since now all calls to `string_list_insert()`
and `string_list_append()` add `update->refname`. --- Except a literal
"HEAD" is added in `split_head_update()`.
Restructure `split_head_update()` in the same way as the earlier patch
did for `split_symref_update()`. This does not correct any practical
problem, but makes things conceptually cleaner. The downside is a call
to `string_list_has_string()`, which should be relatively cheap.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs/files-backend: correct return value in lock_ref_for_update
In one code path we return a literal -1 and not a symbolic constant. The
value -1 would be interpreted as TRANSACTION_NAME_CONFLICT, which is
wrong. Use TRANSACTION_GENERIC_ERROR instead (that is the only other
return value we have to choose from).
Noticed-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>