Code clean-up to remove hardcoded SHA-1 hash from many places.
* jk/oidhash:
hashmap: convert sha1hash() to oidhash()
hash.h: move object_id definition from cache.h
khash: rename oid helper functions
khash: drop sha1-specific map types
pack-bitmap: convert khash_sha1 maps into kh_oid_map
delta-islands: convert island_marks khash to use oids
khash: rename kh_oid_t to kh_oid_set
khash: drop broken oid_map typedef
object: convert create_object() to use object_id
object: convert internal hash_obj() to object_id
object: convert lookup_object() to use object_id
object: convert lookup_unknown_object() to use object_id
pack-objects: convert locate_object_entry_hash() to object_id
pack-objects: convert packlist_find() to use object_id
pack-bitmap-write: convert some helpers to use object_id
upload-pack: rename a "sha1" variable to "oid"
describe: fix accidental oid/hash type-punning
The code to parse scaled numbers out of configuration files has
been made more robust and also easier to follow.
* rs/config-unit-parsing:
config: simplify parsing of unit factors
config: don't multiply in parse_unit_factor()
config: use unsigned_mult_overflows to check for overflows
* js/gcc-8-and-9:
config: avoid calling `labs()` on too-large data type
winansi: simplify loading the GetCurrentConsoleFontEx() function
kwset: allow building with GCC 8
poll (mingw): allow compiling with GCC 8 and DEVELOPER=1
"git rebase --abort" used to leave refs/rewritten/ when concluding
"git rebase -r", which has been corrected.
* pw/rebase-abort-clean-rewritten:
rebase --abort/--quit: cleanup refs/rewritten
sequencer: return errors from sequencer_remove_state()
rebase: warn if state directory cannot be removed
rebase: fix a memory leak
* am/p4-branches-excludes:
git-p4: respect excluded paths when detecting branches
git-p4: add failing test for "git-p4: respect excluded paths when detecting branches"
git-p4: don't exclude other files with same prefix
git-p4: add failing test for "don't exclude other files with same prefix"
git-p4: don't groom exclude path list on every commit
git-p4: match branches case insensitively if configured
git-p4: add failing test for "git-p4: match branches case insensitively if configured"
git-p4: detect/prevent infinite loop in gitCommitByP4Change()
The commit-graph file is now part of the "files that the runtime
may keep open file descriptors on, all of which would need to be
closed when done with the object store", and the file descriptor to
an existing commit-graph file now is closed before "gc" finalizes a
new instance to replace it.
* ds/close-object-store:
packfile: rename close_all_packs to close_object_store
packfile: close commit-graph in close_all_packs
commit-graph: use raw_object_store when closing
An incorrect list of options was cached after command line
completion failed (e.g. trying to complete a command that requires
a repository outside one), which has been corrected.
* nd/completion-no-cache-failure:
completion: do not cache if --git-completion-helper fails
"git mergetool" and its tests now spawn fewer subprocesses.
* js/mergetool-optim:
mergetool: use shell variable magic instead of `awk`
mergetool: dissect strings with shell variable magic instead of `expr`
t7610-mergetool: use test_cmp instead of test $(cat file) = $txt
t7610-mergetool: do not place pipelines headed by `yes` in subshells
Code restructuring during 2.20 period broke fetching tags via
"import" based transports.
* fc/fetch-with-import-fix:
fetch: fix regression with transport helpers
fetch: make the code more understandable
fetch: trivial cleanup
t5801 (remote-helpers): add test to fetch tags
t5801 (remote-helpers): cleanup refspec stuff
"git branch --list" learned to show branches that are checked out
in other worktrees connected to the same repository prefixed with
'+', similar to the way the currently checked out branch is shown
with '*' in front.
* nb/branch-show-other-worktrees-head:
branch: add worktree info on verbose output
branch: update output to include worktree info
ref-filter: add worktreepath atom
an apparent typo for the environment variable was included with 81567caf87
("trace2: update docs to describe system/global config settings",
2019-04-15), and was missed when renamed variables by e4b75d6a1d
("trace2: rename environment variables to GIT_TRACE2*", 2019-05-19)
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just return the value of the factor or zero for unrecognized strings
instead of using an output reference and a separate return value to
indicate success. This is shorter and simpler.
It basically reverts that function to before c8deb5a146 ("Improve error
messages when int/long cannot be parsed from config", 2007-12-25), while
keeping the better messages, so restore its old name, get_unit_factor(),
as well.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_unit_factor() multiplies the number that is passed to it with the
value of a recognized unit factor (K, M or G for 2^10, 2^20 and 2^30,
respectively). All callers pass in 1 as a number, though, which allows
them to check the actual multiplication for overflow before they are
doing it themselves.
Ignore the passed in number and don't multiply, as this feature of
parse_unit_factor() is not used anymore. Rename the output parameter to
reflect that it's not about the end result anymore, but just about the
unit factor.
Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
config: use unsigned_mult_overflows to check for overflows
parse_unit_factor() checks if a K, M or G is present after a number and
multiplies it by 2^10, 2^20 or 2^30, respectively. One of its callers
checks if the result is smaller than the number alone to detect
overflows. The other one passes 1 as the number and does multiplication
and overflow check itself in a similar manner.
This works, but is inconsistent, and it would break if we added support
for a bigger unit factor. E.g. 16777217T is 2^64 + 2^40, i.e. too big
for a 64-bit number. Modulo 2^64 we get 2^40 == 1TB, which is bigger
than the raw number 16777217 == 2^24 + 1, so the overflow would go
undetected by that method.
Let both callers pass 1 and handle overflow check and multiplication
themselves. Do the check before the multiplication, using
unsigned_mult_overflows, which is simpler and can deal with larger unit
factors.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description about slashes in gitignore patterns (used to
indicate things like "anchored to this level only" and "only
matches directories") has been revamped.
* an/ignore-doc-update:
gitignore.txt: make slash-rules more readable
The filter_data used in the list-objects-filter (which manages a
lazily sparse clone repository) did not use the dynamic array API
correctly---'nr' is supposed to point at one past the last element
of the array in use. This has been corrected.
* md/list-objects-filter-memfix:
list-objects-filter: correct usage of ALLOC_GROW
"git fetch" into a lazy clone forgot to fetch base objects that are
necessary to complete delta in a thin packfile, which has been
corrected.
* jt/partial-clone-missing-ref-delta-base:
t5616: cover case of client having delta base
t5616: use correct flag to check object is missing
index-pack: prefetch missing REF_DELTA bases
t5616: refactor packfile replacement
rev-list: teach --no-object-names to enable piping
Allow easier parsing by cat-file by giving rev-list an option to print
only the OID of a non-commit object without any additional information.
This is a short-term shim; later on, rev-list should be taught how to
print the types of objects it finds in a format similar to cat-file's.
Before this commit, the output from rev-list needed to be massaged
before being piped to cat-file, like so:
There are no callers left of sha1hash() that do not simply pass the
"hash" member of a "struct object_id". Let's get rid of the outdated
sha1-specific function and provide one that operates on the whole struct
(even though the technique, taking the first few bytes of the hash, will
remain the same).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our hashmap.h helpfully defines a sha1hash() function. But it cannot
define a similar oidhash() without including all of cache.h, which
itself wants to include hashmap.h! Let's break this circular dependency
by moving the definition to hash.h, along with the remaining RAWSZ
macros, etc. That will put them with the existing git_hash_algo
definition.
One alternative would be to move oidhash() into cache.h, but it's
already quite bloated. We're better off moving things out than in.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
For use in object_id hash tables, we have oid_hash() and oid_equal().
But these are confusingly similar to the existing oideq() and the
oidhash() we plan to add to replace sha1hash().
The big difference from those functions is that rather than accepting a
const pointer to the "struct object_id", we take the arguments by value
(which is a khash internal convention). So let's make that obvious by
calling them oidhash_by_value() and oideq_by_value().
Those names are fairly horrendous to type, but we rarely need to do so;
they are passed to the khash implementation macro and then only used
internally. Callers get to use the nice kh_put_oid_map(), etc.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the callers of khash_sha1 and khash_sha1_pos have been removed,
in favor of using maps that use "struct object_id" as their keys. Let's
drop these now-obsolete types.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
delta-islands: convert island_marks khash to use oids
All of the users of this map have an actual "struct object_id" rather
than a bare sha1. Let's use the more descriptive type (and get one step
closer to dropping khash_sha1 entirely).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
khash lets us define a hash as either a map or a set (i.e., with no
"value" type). For the oid maps we define, "oid" is the set and
"oid_map" is the map. As the bug in the previous commit shows, it's easy
to pick the wrong one.
So let's make the names more distinct: "oid_set" and "oid_map".
An alternative naming scheme would be to actually name the type after
the key/value types. So e.g., "oid" _would_ be the set, since it has no
value type. And "oid_map" would become "oid_void" or similar (and
"oid_pos" becomes "oid_int"). That's better in some ways: it's more
regular, and a given map type can be more reasily reused in multiple
contexts (e.g., something storing an "int" that isn't a "pos"). But it's
also slightly less descriptive.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 5a8643eff1 (khash: move oid hash table definition, 2019-02-19)
added a khash "oid_map" type to match the existing "oid" type, which is
a simple set (i.e., just keys, no values). But in setting up the
khash_oid_map typedef, it accidentally referred to "kh_oid_t", which is
the set type.
Nobody noticed the breakage because there are not yet any callers; the
type was added just as a match to the existing sha1 types (whose map
type confusingly _is_ called khash_sha1, and it has no matching set
type).
We could easily fix this with s/oid/oid_map/ in the typedef. But let's
take this a step further, and just drop the typedef entirely. These
typedefs were added by 5a8643eff1 to match the khash_sha1 typedefs. But
the actual khash-derived type names are descriptive enough; this is just
adding an extra layer of indirection. The khash names do not quite
follow our usual style (e.g., they end in "_t"), but since we end up
using other khash names (e.g., khiter_t, kh_get_oid()) anyway, just
typedef-ing the struct name is not really helping much.
And there are already many cases where we use the raw khash type names
anyway (e.g., the "set" variant defined just above us does not have such
a typedef!).
So let's drop this typedef, and the matching oid_pos one (which actually
_does_ have a user, but we can easily convert it).
We'll leave the khash_sha1 typedef around. The ultimate fate of its
callers should be conversion to kh_oid_map_t, so there's no point in
going through the noise of changing the names now.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no callers left of create_object() that aren't just passing us
the "hash" member of a "struct object_id". Let's take the whole struct,
which gets us closer to removing all raw sha1 variables.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that lookup_object() has an object_id, we can consistently pass that
around instead of a raw sha1. We still convert to a hash to pass to
sha1hash(), but the goal is for that to go away shortly.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no callers left of lookup_object() that aren't just passing us
the "hash" member of a "struct object_id". Let's take the whole struct,
which gets us closer to removing all raw sha1 variables. It also
matches the existing conversions of lookup_blob(), etc.
The conversions of callers were done by hand, but they're all mechanical
one-liners.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
object: convert lookup_unknown_object() to use object_id
There are no callers left of lookup_unknown_object() that aren't just
passing us the "hash" member of a "struct object_id". Let's take the
whole struct, which gets us closer to removing all raw sha1 variables.
It also matches the existing conversions of lookup_blob(), etc.
The conversions of callers were done by hand, but they're all mechanical
one-liners.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-objects: convert locate_object_entry_hash() to object_id
There are no callers of locate_object_entry_hash() that aren't just
passing us the "hash" member of a "struct object_id". Let's take the
whole struct, which gets us closer to removing all raw sha1 variables.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-objects: convert packlist_find() to use object_id
We take a raw hash pointer, but most of our callers have a "struct
object_id" already. Let's switch to taking the full struct, which will
let us continue removing uses of raw sha1 buffers.
There are two callers that do need special attention:
- in rebuild_existing_bitmaps(), we need to switch to
nth_packed_object_oid(). This incurs an extra hash copy over
pointing straight to the mmap'd sha1, but it shouldn't be measurable
compared to the rest of the operation.
- in can_reuse_delta() we already spent the effort to copy the sha1
into a "struct object_id", but now we just have to do so a little
earlier in the function (we can't easily convert that function's
callers because they may be pointing at mmap'd REF_DELTA blocks).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-bitmap-write: convert some helpers to use object_id
A few functions take raw hash pointers, but all of their callers
actually have a "struct object_id". Let's retain that extra type as long
as possible (which will let future patches extend that further, and so
on).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This variable is a "struct object_id", but uses the old-style name
"sha1". Let's call it oid to match more modern code (and make it clear
that it can handle any algorithm, since it uses parse_oid_hex()
properly).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The find_commit_name() function passes an object_id.hash as the key of a
hashmap. That ends up in commit_name_neq(), which then feeds it to
oideq(). Which means we should actually be the whole "struct object_id".
It works anyway because pointers to the two are interchangeable. And
because we're going through a layer of void pointers, the compiler
doesn't notice the type mismatch.
But it's worth cleaning up (especially since once we switch away from
sha1hash() on the same line, accessing the hash member will look doubly
out of place).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch: only run 'gc' once when fetching multiple remotes
In multiple remotes mode, git-fetch is launched for n-1 remotes and the
last remote is handled by the current process. Each of these processes
will in turn run 'gc' at the end.
This is not really a problem because even if multiple 'gc --auto' is run
at the same time we still handle it correctly. It does show multiple
"auto packing in the background" messages though. And we may waste some
resources when gc actually runs because we still do some stuff before
checking the lock and moving it to background.
So let's try to avoid that. We should only need one 'gc' run after all
objects and references are added anyway. Add a new option --no-auto-gc
that will be used by those n-1 processes. 'gc --auto' will always run on
the main fetch process (*).
(*) even if we fetch remotes in parallel at some point in future, this
should still be fine because we should "join" all those processes
before this step.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the conversion of 'stash show' to C in dc7bd382b1 ("stash: convert
show to builtin", 2019-02-25), 'git stash show <n>', where n is the
index of a stash got broken, if n is not a file or a valid revision by
itself.
'stash show' accepts any flag 'git diff' accepts for changing the
output format. Internally we use 'setup_revisions()' to parse these
command line flags. Currently we pass the whole argv through to
'setup_revisions()', which includes the stash index.
As the stash index is not a valid revision or a file in the working
tree in most cases however, this 'setup_revisions()' call (and thus
the whole command) ends up failing if we use this form of 'git stash
show'.
Instead of passing the whole argv to 'setup_revisions()', only pass
the flags (and the command name) through, while excluding the stash
reference. The stash reference is parsed (and validated) in
'get_stash_info()' already.
This separate parsing also means that we currently do produce the
correct output if the command succeeds.
Reported-by: Mike Hommey <mh@glandium.org> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before this patch, "git branch" would put "(HEAD detached...)" and "(no
branch, rebasing...)" lines before all the other branches *in most
cases* except for when using Chinese-language messages. zh_CN generally
uses a full-width "(" symbol (codepoint FF08) to match the full-width
proportions of Chinese characters, and the translated strings we had did
use them. This meant that the detached HEAD line would appear after all
local refs and even after the remote refs if there were any.
AFAIK, it is sometimes not jarring to see the half-width parenthesis in
"full-width" text as in the CJK languages, for instance when there are
no characters preceding or following the parenthesized text fragment. By
removing the parenthesis from the localizable text, we can share strings
with wt-status.c and remove a cautionary comment to translators.
Remove the ( from the localizable portion of messages so the sorting
happens properly regardless of locale.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
0620b39b3b ("compat: add a mkstemps() compatibility function", 2009-05-31)
included a function based on code from libiberty which would result in
undefined behaviour in platforms where timeval's tv_usec is a 32-bit signed
type as shown by:
wrapper.c:505:31: runtime error: left shift of 594546 by 16 places cannot be represented in type '__darwin_suseconds_t' (aka 'int')
interestingly the version of this code from gcc never had this bug and the
code had a cast that would had prevented the issue (at least in 64-bit
platforms) but was misapplied.
change the cast to uint64_t so it also works in 32-bit platforms.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interpret-trailers program does not do the usual loading of config
via git_default_config(), and thus does not respect many of the usual
options. In particular, we will not load core.commentChar, even though
the underlying trailer code uses its value.
This can be seen in the accompanying test, where setting
core.commentChar to anything besides "#" results in a failure to treat
the comments correctly.
Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current semantic patch for COPY_ARRAY transforms memcpy(3) calls on
pointers, but Coccinelle distinguishes them from arrays. It already
contains three rules to handle the options for sizeof (i.e. source,
destination and type), and handling arrays as source and destination
would require four times as many rules if we enumerated all cases.
We also don't handle array subscripts, and supporting that would
increase the number of rules by another factor of four. (An isomorphism
telling Coccinelle that "sizeof x[...]" is equivalent to "sizeof *x"
would be nice..)
Support arrays and array subscripts, but keep the number of rules down
by adding normalization steps: First turn array subscripts into
derefences, then determine the types of expressions used with sizeof and
replace them with these types, and then convert the different possible
combinations of arrays and pointers with memcpy(3) to COPY_ARRAY.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fsmonitor: avoid signed integer overflow / infinite loop
883e248b8a ("fsmonitor: teach git to optionally utilize a file system
monitor to speed up detecting new or changed files.", 2017-09-22) uses
an int in a loop that would wrap if index_state->cache_nr (unsigned)
is bigger than INT_MAX
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating a partial clone, the object filtering criteria is
recorded for the origin of the clone, but this incorrectly used a
hardcoded name "origin" to name that remote; it has been corrected
to honor the "--origin <name>" option.
* xl/record-partial-clone-origin:
clone: respect user supplied origin name when setting up partial clone
"git request-pull" learned to warn when the ref we ask them to pull
from in the local repository and in the published repository are
different.
* pb/request-pull-verify-remote-ref:
request-pull: warn if the remote object is not the same as the local one
request-pull: quote regex metacharacters in local ref
The command line to invoke a "git cat-file" command from inside
"git p4" was not properly quoted to protect a caret and running a
broken command on Windows, which has been corrected.
* mm/p4-unshelve-windows-fix:
p4 unshelve: fix "Not a valid object name HEAD0" on Windows
A new tutorial targetting specifically aspiring git-core
developers.
* es/first-contrib-tutorial:
doc: add some nit fixes to MyFirstContribution
documentation: add anchors to MyFirstContribution
documentation: add tutorial for first contribution
"git clone --recurse-submodules" learned to set up the submodules
to ignore commit object names recorded in the superproject gitlink
and instead use the commits that happen to be at the tip of the
remote-tracking branches from the get-go, by passing the new
"--remote-submodules" option.
* ba/clone-remote-submodules:
clone: add `--remote-submodules` flag
"git merge --squash" is designed to update the working tree and the
index without creating the commit, and this cannot be countermanded
by adding the "--commit" option; the command now refuses to work
when both options are given.
* vv/merge-squash-with-explicit-commit:
merge: refuse --commit with --squash
"git bundle verify" needs to see if prerequisite objects exist in
the receiving repository, but the command did not check if we are
in a repository upfront, which has been corrected.
* js/bundle-verify-require-object-store:
bundle verify: error out if called without an object database
"git am -i --resolved" segfaulted after trying to see a commit as
if it were a tree, which has been corrected.
* jk/am-i-resolved-fix:
am: fix --interactive HEAD tree resolution
am: drop tty requirement for --interactive
am: read interactive input from stdin
am: simplify prompt response handling
The server side support for "git fetch" used to show incorrect
value for the HEAD symbolic ref when the namespace feature is in
use, which has been corrected.
* jk/HEAD-symref-in-xfer-namespaces:
upload-pack: strip namespace from symref data
A "merge -c" instruction during "git rebase --rebase-merges" should
give the user a chance to edit the log message, even when there is
otherwise no need to create a new merge and replace the existing
one (i.e. fast-forward instead), but did not. Which has been
corrected.
* pw/rebase-edit-message-for-replayed-merge:
rebase -r: always reword merge -c
The way of specifying the path to find dynamic libraries at runtime
has been simplified. The old default to pass -R/path/to/dir has been
replaced with the new default to pass -Wl,-rpath,/path/to/dir,
which is the more recent GCC uses. Those who need to build with an
old GCC can still use "CC_LD_DYNPATH=-R"
* ab/deprecate-R-for-dynpath:
Makefile: remove the NO_R_TO_GCC_LINKER flag
The ownership rule for the file descriptor to fast-import remote
backend was mixed up, leading to unrelated file descriptor getting
closed, which has been fixed.
* mh/import-transport-fd-fix:
Use xmmap_gently instead of xmmap in use_pack
dup() the input fd for fast-import used for remote helpers