* jk/snprintf-cleanups:
daemon: use an argv_array to exec children
gc: replace local buffer with git_path
transport-helper: replace checked snprintf with xsnprintf
convert unchecked snprintf into xsnprintf
combine-diff: replace malloc/snprintf with xstrfmt
replace unchecked snprintf calls with heap buffers
receive-pack: print --pack-header directly into argv array
name-rev: replace static buffer with strbuf
create_branch: use xstrfmt for reflog message
create_branch: move msg setup closer to point of use
avoid using mksnpath for refs
avoid using fixed PATH_MAX buffers for refs
fetch: use heap buffer to format reflog
tag: use strbuf to format tag header
diff: avoid fixed-size buffer for patch-ids
odb_mkstemp: use git_path_buf
odb_mkstemp: write filename into strbuf
do not check odb_mkstemp return value for errors
Use reference iteration rather than `do_for_each_entry_in_dir()` in
the definition of `files_pack_refs()`. This makes the code shorter and
easier to follow, because the logic can be inline rather than spread
between the main function and a callback function, and it removes the
need to use `pack_refs_cb_data` to preserve intermediate state.
This removes the last callers of `entry_resolves_to_object()` and
`get_loose_ref_dir()`, so delete those functions.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use reference iteration rather than do_for_each_entry_in_dir() in the
definition of commit_packed_refs().
Note that an internal consistency check that was previously done in
`write_packed_entry_fn()` is not there anymore. This is actually an
improvement:
The old error message was emitted when there is an entry in the
packed-ref cache that is not `REF_KNOWS_PEELED`, and when we attempted
to peel the reference, the result was `PEEL_INVALID`,
`PEEL_IS_SYMREF`, or `PEEL_BROKEN`. Since a packed ref cannot be a
symref, `PEEL_IS_SYMREF` and `PEEL_BROKEN` can be ruled out. So we're
left with `PEEL_INVALID`.
An entry without `REF_KNOWS_PEELED` can get into the packed-refs cache
in the following two ways:
* The reference was read from a `packed-refs` file that didn't have
the `fully-peeled` attribute. In that case, we *don't want* to emit
an error, because the broken value is presumably a stale value of
the reference that is now masked by a loose version of the same
reference (which we just don't happen to be packing this time). This
is a perfectly legitimate situation and doesn't indicate that the
repository is corrupt. The old code incorrectly emits an error
message in this case. (It was probably never reported as a bug
because this scenario is rare.)
* The reference was a loose reference that was just added to the
packed ref cache by `files_packed_refs()` via
`pack_if_possible_fn()` in preparation for being packed. The latter
function refuses to pack a reference for which
`entry_resolves_to_object()` returns false, and otherwise calls
`peel_entry()` itself and checks the return value. So an entry added
this way should always have `REF_KNOWS_PEELED` and shouldn't trigger
the error message in either the old code or the new.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change `cache_ref_iterator_begin()` to take two new arguments:
* `prefix` -- to iterate only over references with the specified
prefix.
* `prime_dir` -- to "prime" (i.e., pre-load) the cache before starting
the iteration.
The new functionality makes it possible for
`files_ref_iterator_begin()` to be made more ignorant of the internals
of `ref_cache`, and `find_containing_dir()` and `prime_ref_dir()` to
be made private.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extract a new function, `get_loose_ref_cache()`, from
get_loose_ref_dir(). The function returns the `ref_cache` for the
loose refs of a `files_ref_store`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: handle "refs/bisect/" in `loose_fill_ref_dir()`
That "refs/bisect/" has to be handled specially when filling the
ref_cache for loose references is a peculiarity of the files backend,
and the ref-cache code shouldn't need to know about it. So move this
code to the callback function, `loose_fill_ref_dir()`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ref-cache: use a callback function to fill the cache
It is a leveling violation for `ref_cache` to know about
`files_ref_store` or that it should call `read_loose_refs()` to lazily
fill cache directories. So instead, have its constructor take as an
argument a callback function that it should use for lazy-filling, and
change `files_ref_store` to supply a pointer to function
`read_loose_refs` (renamed to `loose_fill_ref_dir`) when creating the
ref cache for its loose refs.
This means that we can generify the type of the back-pointer in
`struct ref_cache` from `files_ref_store` to `ref_store`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: record the ref_store in ref_cache, not ref_dir
Instead of keeping a pointer to the `ref_store` in every `ref_dir`
entry, store it once in `struct ref_cache`, and change `struct
ref_dir` to include a pointer to its containing `ref_cache` instead.
This makes it easier to add to the information that is accessible from
a `ref_dir` without increasing the size of every `ref_dir` instance.
Note that previously, every `ref_dir` pointed at the containing
`files_ref_store` regardless of whether it was a part of the loose or
packed reference cache. Now we have to be sure to initialize the
instances to point at the correct containing `ref_cache`. So change
`create_dir_entry()` to take a `ref_cache` parameter, and change its
callers to pass the correct `ref_cache` depending on the purpose of
the new `dir_entry`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `ref_cache` code is currently too tightly coupled to
`files-backend`, making the code harder to understand and making it
awkward for new code to use `ref_cache` (as we indeed have planned).
Start loosening that coupling by splitting `ref_cache` into a separate
module.
This commit moves code, adds declarations, and changes the visibility
of some functions, but doesn't change any code.
The modules are still too tightly coupled, but the situation will be
improved in subsequent commits.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs_verify_refname_available(): use function in more places
Change `lock_raw_ref()` and `lock_ref_sha1_basic()` to use
`refs_verify_refname_available()` instead of
`verify_refname_available_dir()`. This means that those callsites now
check for conflicts with all references rather than just packed refs,
but the performance cost shouldn't be significant (and will be
regained later).
These were the last callers of `verify_refname_available_dir()`, so
also delete that (very complicated) function.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs_verify_refname_available(): implement once for all backends
It turns out that we can now implement
`refs_verify_refname_available()` based on the other virtual
functions, so there is no need for it to be defined at the backend
level. Instead, define it once in `refs.c` and remove the
`files_backend` definition.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/am: fold am_signoff() into am_append_signoff()
There are no more direct calls to am_signoff(), so we can fold its
logic in am_append_signoff().
(This is done in a separate commit rather than in the previous one, to
make it easier to revert this specific change if additional calls are
ever introduced.)
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signoff is handled in parse_mail(), but not in parse_mail_rebasing(),
since the latter is only used when git-rebase calls git-am with the
--rebasing option, and --signoff is never passed in this case.
In order to introduce (in the upcoming commits) support for
`git-rebase --signoff`, we must make git-am pay attention to it also
in the rebase case. This can be done by moving the conditional
addition of the signoff from parse_mail() to the caller am_run(),
after either of the parse_mail*() functions were called.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-p4: add failing test for name-rev rather than symbolic-ref
Using name-rev to find the current git branch means that git-p4
does not correctly get the current branch name if there are
multiple branches pointing at HEAD, or a tag.
This change adds a test case which demonstrates the problem.
Configuring which branches are allowed to be submitted from goes
wrong, as git-p4 gets confused about which branch is in use.
This appears to be the only place that git-p4 actually cares
about the current branch.
Signed-off-by: Luke Diamand <luke@diamand.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule: prevent backslash expantion in submodule names
When attempting to add a submodule with backslashes in its name 'git
submodule' fails in a funny way. We can see that some of the
backslashes are expanded resulting in a bogus path:
git -C main submodule add ../sub\\with\\backslash
fatal: repository '/tmp/test/sub\witackslash' does not exist
fatal: clone of '/tmp/test/sub\witackslash' into submodule path
To solve this, convert calls to 'read' to 'read -r' in git-submodule.sh
in order to prevent backslash expantion in submodule names.
Reported-by: Joachim Durchholz <jo@durchholz.org> Signed-off-by: Brandon Williams <bmwill@google.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
b1ef400e (setup_git_env: avoid blind fall-back to ".git") made programs
that tried to access a repository without initializing properly die with
a diagnostic message. One offender is test-read-cache, which is used in
p0002. Fix it by calling setup_git_directory() before accessing the
index.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: reject ref updates while GIT_QUARANTINE_PATH is set
As documented in git-receive-pack(1), updating a ref from
within the pre-receive hook is dangerous and can corrupt
your repo. This patch forbids ref updates entirely during
the hook to make it harder for adventurous hook writers to
shoot themselves in the foot.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 722ff7f87 (receive-pack: quarantine objects until
pre-receive accepts, 2016-10-03) changed the underlying
details of how we take in objects. This is mostly
transparent to the user, but there are a few things they
might notice. Let's document them.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
receive-pack: drop tmp_objdir_env from run_update_hook
Since 722ff7f87 (receive-pack: quarantine objects until
pre-receive accepts, 2016-10-03), we have to feed the
pre-receive hook the tmp_objdir environment, so that git
programs run from the hook know where to find the objects.
That commit modified run_update_hook() to do the same, but
there it is a noop. By the time we get to the update hooks,
we have already migrated the objects from quarantine, and so
tmp_objdir_env() will always return NULL. We can drop this
useless call.
Note that the ordering here and the lack of support for the
update hook is intentional. The update hook calls are
interspersed with actual ref updates, and we must migrate
the objects before any refs are updated (since otherwise
those refs would appear broken to outside processes). So the
only other options are:
- remain in quarantine for the _first_ ref, but not the
others. This is sufficiently confusing that it can be
rejected outright.
- run all the individual update hooks first, then migrate,
then update all the refs. But this changes the repository
state that the update hooks see (i.e., whether or not
refs from the same push are updated yet or not).
So the functionality is fine and remains unchanged with this
patch; we're just cleaning up a useless and confusing line
of code.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t6500: wait for detached auto gc at the end of the test script
The last test in 't6500-gc', 'background auto gc does not run if
gc.log is present and recent but does if it is old', added in a831c06a2 (gc: ignore old gc.log files, 2017-02-10), may sporadically
trigger an error message from the test harness:
rm: cannot remove 'trash directory.t6500-gc/.git/objects': Directory not empty
The test in question ends with executing an auto gc in the backround,
which occasionally takes so long that it's still running when
'test_done' is about to remove the trash directory. This 'rm -rf
$trash' in the foreground might race with the detached auto gc to
create and delete files and directories, and gc might (re-)create a
path that 'rm' already visited and removed, triggering the above error
message when 'rm' attempts to remove its parent directory.
Commit bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01)
fixed the same problem in a different test script by simply
disallowing background gc. Unfortunately, what worked there is not
applicable here, because the purpose of this test is to check the
behavior of a detached auto gc.
Make sure that the test doesn't continue before the gc is finished in
the background with a clever bit of shell trickery:
- Open fd 9 in the shell, to be inherited by the background gc
process, because our daemonize() only closes the standard fds 0,
1 and 2.
- Duplicate this fd 9 to stdout.
- Read 'git gc's stdout, and thus fd 9, through a command
substitution. We don't actually care about gc's output, but this
construct has two useful properties:
- This read blocks until stdout or fd 9 are open. While stdout is
closed after the main gc process creates the background process
and exits, fd 9 remains open until the backround process exits.
- The variable assignment from the command substitution gets its
exit status from the command executed within the command
substitution, i.e. a failing main gc process will cause the test
to fail.
Note, that this fd trickery doesn't work on Windows, because due to
MSYS limitations the git process only inherits the standard fds 0, 1
and 2 from the shell. Luckily, it doesn't matter in this case,
because on Windows daemonize() is basically a noop, thus 'git gc
--auto' always runs in the foreground.
And since we can now continue the test reliably after the detached gc
finished, check that there is only a single packfile left at the end,
i.e. that the detached gc actually did what it was supposed to do.
Also add a comment at the end of the test script to warn developers of
future tests about this issue of long running detached gc processes.
Helped-by: Jeff King <peff@peff.net> Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 'clear_pathspec()' the incorrect index parameter is used to bound an
inner-loop which is used to free a 'struct attr_match' value field.
Using the incorrect index parameter (in addition to being incorrect)
occasionally causes segmentation faults when attempting to free an
invalid pointer. Fix this by using the correct index parameter 'i'.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the cleanup phase for the grep command to free the pathspec
struct that's allocated earlier in the same block, and used just a few
lines earlier.
With "grep hi README.md" valgrind reports a loss of 239 bytes now,
down from 351.
The relevant --num-callers=40 --leak-check=full --show-leak-kinds=all
backtrace is:
[...] 187 (112 direct, 75 indirect) bytes in 1 blocks are definitely lost in loss record 70 of 110
[...] at 0x4C2BBAF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
[...] by 0x60B339: do_xmalloc (wrapper.c:59)
[...] by 0x60B2F6: xmalloc (wrapper.c:86)
[...] by 0x576B37: parse_pathspec (pathspec.c:652)
[...] by 0x4519F0: cmd_grep (grep.c:1215)
[...] by 0x4062EF: run_builtin (git.c:371)
[...] by 0x40544D: handle_builtin (git.c:572)
[...] by 0x4060A2: run_argv (git.c:624)
[...] by 0x4051C6: cmd_main (git.c:701)
[...] by 0x4C5901: main (common-main.c:43)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit e9d9a8a4d (connect: handle putty/plink also in
GIT_SSH_COMMAND, 2017-01-02) added a call to
split_cmdline(), but checks only for a non-zero return to
see if we got any output. Since the function returns
negative values (and a NULL argv) on error, we end up
dereferencing NULL and segfaulting.
Arguably we could report on the parsing error here, but it's
probably not worth it. This is a best-effort attempt to see
if we are using plink. So we can simply return here with
"no, it wasn't plink" and let the shell actually complain
about the bogus quoting.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
travis-ci: add static analysis build job to run coccicheck
Add a dedicated build job for static analysis. As a starter we only run
coccicheck but in the future we could run Clang Static Analyzer or
similar tools, too.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
unpack-trees: avoid duplicate ODB lookups during checkout
Teach traverse_trees_recursive() to not do redundant ODB
lookups when both directories refer to the same OID.
In operations such as read-tree and checkout, there will
likely be many peer directories that have the same OID when
the differences between the commits are relatively small.
In these cases we can avoid hitting the ODB multiple times
for the same OID.
This patch handles n=2 and n=3 cases and simply copies the
data rather than repeating the fill_tree_descriptor().
================
On the Windows repo (500K trees, 3.1M files, 450MB index),
this reduced the overall time by 0.75 seconds when cycling
between 2 commits with a single file difference.
string-list: use ALLOC_GROW macro when reallocing string_list
Use ALLOC_GROW() macro when reallocing a string_list array
rather than simply increasing it by 32. This is a performance
optimization.
During status on a very large repo and there are many changes,
a significant percentage of the total run time is spent
reallocing the wt_status.changes array.
This change decreases the time in wt_status_collect_changes_worktree()
from 125 seconds to 45 seconds on my very large repository.
This produced a modest gain on my 1M file artificial repo, but
broke even on linux.git.
Test HEAD^^ HEAD
---------------------------------------------------------------------------------------
0005.2: read-tree status br_ballast (1000001) 8.29(5.62+2.62) 8.22(5.57+2.63) -0.8%
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git to skip verification of the SHA1-1 checksum at the end of
the index file in verify_hdr() which is called from read_index()
unless the "force_verify_index_checksum" global variable is set.
Teach fsck to force this verification.
The checksum verification is for detecting disk corruption, and for
small projects, the time it takes to compute SHA-1 is not that
significant, but for gigantic repositories this calculation adds
significant time to every command.
These effect can be seen using t/perf/p0002-read-cache.sh:
Test HEAD~1 HEAD
--------------------------------------------------------------------------------------
0002.1: read_cache/discard_cache 1000 times 0.66(0.44+0.20) 0.30(0.27+0.02) -54.5%
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pathspec: honor `PATHSPEC_PREFIX_ORIGIN` with empty prefix
Previous to commit 5d8f084a5 (pathspec: simpler logic to prefix original
pathspec elements, 2017-01-04), we were always using the computed
`match` variable to perform pathspec matching whenever
`PATHSPEC_PREFIX_ORIGIN` is set. This is for example useful when passing
the parsed pathspecs to other commands, as the computed `match` may
contain a pathspec relative to the repository root. The commit changed
this logic to only do so when we do have an actual prefix and when
literal pathspecs are deactivated.
But this change may actually break some commands which expect passed
pathspecs to be relative to the repository root. One such case is `git
add --patch`, which now fails when using relative paths from a
subdirectory. For example if executing "git add -p ../foo.c" in a
subdirectory, the `git-add--interactive` command will directly pass
"../foo.c" to `git-ls-files`. As ls-files is executed at the
repository's root, the command will notice that "../foo.c" is outside
the repository and fail.
Fix the issue by again using the computed `match` variable when
`PATHSPEC_PREFIX_ORIGIN` is set and global literal pathspecs are
deactivated. Note that in contrast to previous behavior, we will now
always call `prefix_magic` regardless of whether a prefix is actually
set. But this is the right thing to do: when the `match` variable has
been resolved to the repository's root, it will be set to an empty
string. When passing the empty string directly to other commands, it
will result in a warning regarding deprecated empty pathspecs. By always
adding the prefix magic, we will end up with at least the string
":(prefix:0)" and thus avoid the warning.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Brandon Williams <bmwill@google.com> Reviewed-by: Duy Nguyen <pclouds@gmail.com>
config: resolve symlinks in conditional include's patterns
$GIT_DIR returned by get_git_dir() is normalized, with all symlinks
resolved (see setup_work_tree function). In order to match paths (or
patterns) against $GIT_DIR char-by-char, they have to be normalized
too. There is a note in config.txt about this, that the user need to
resolve symlinks by themselves if needed.
The problem is, we allow certain path expansion, '~/' and './', for
convenience and can't ask the user to resolve symlinks in these
expansions. Make sure the expanded paths have all symlinks resolved.
PS. The strbuf_realpath(&text, get_git_dir(), 1) is still needed because
get_git_dir() may return relative path.
Noticed-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
path.c: and an option to call real_path() in expand_user_path()
In the next patch we need the ability to expand '~' to
real_path($HOME). But we can't do that from outside because '~' is part
of a pattern, not a true path. Add an option to expand_user_path() to do
so.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever a git command is present in the upstream of a pipe, its failure
gets masked by piping. Hence we should avoid it for testing the
upstream git command. By writing out the output of the git command to
a file, we can test the exit codes of both the commands as a failure exit
code in any command is able to stop the && chain.
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_ref_dir(): don't call read_loose_refs() for "refs/bisect"
Since references under "refs/bisect/" are per-worktree, they have to
be sought in the worktree rather than in the main repository. But
since loose references are found by traversing directories, the
reference iterator won't even get the idea to look for a
"refs/bisect/" directory in the worktree if there is not a directory
with that name in the main repository. Thus `get_ref_dir()` manually
inserts a dir_entry for "refs/bisect/" whenever it reads the entry for
"refs/".
The current code then immediately calls `read_loose_refs()` on that
directory. But since the dir_entry is created with its `incomplete`
flag set, any traversal that gets to this point will read the
directory automatically. So there is no need to call
`read_loose_refs()` explicitly; the lazy mechanism suffices.
And in fact, the attempt to `read_loose_refs()` was broken anyway.
That function needs its `dirname` argument to have a trailing `/`
character, but the invocation here was passing it "refs/bisect"
without a trailing slash. So `read_loose_refs()` would read
`$GIT_DIR/refs/bisect" correctly, but if it found an entry "foo" in
that directory, it would try to read "$GIT_DIR/refs/bisectfoo".
Normally it wouldn't find anything at that path, but the failure was
canceled out because `get_ref_dir()` *also* forgot to reset the
`REF_INCOMPLETE` bit on the dir_entry. So the read was attempted again
when it was accessed, via the lazy mechanism, and this time the read
was done correctly.
This code has been broken since it was first introduced.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
files-backend: avoid ref api targeting main ref store
A small step towards making files-backend work as a non-main ref store
using the newly added store-aware API.
For the record, `join` and `nm` on refs.o and files-backend.o tell me
that files-backend no longer uses functions that default to
get_main_ref_store().
I'm not yet comfortable at the idea of removing
files_assert_main_repository() (or converting REF_STORE_MAIN to
REF_STORE_WRITE). More staring and testing is required before that can
happen. Well, except peel_ref(). I'm pretty sure that function is safe.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is not meant to cover all existing API. It adds enough to test ref
stores with the new test program test-ref-store, coming soon and to be
used by files-backend.c.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: rename get_ref_store() to get_submodule_ref_store() and make it public
This function is intended to replace *_submodule() refs API. It provides
a ref store for a specific submodule, which can be operated on by a new
set of refs API.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
files-backend: replace submodule_allowed check in files_downcast()
files-backend.c is unlearning submodules. Instead of having a specific
check for submodules to see what operation is allowed, files backend
now takes a set of flags at init. Each operation will check if the
required flags is present before performing.
For now we have four flags: read, write and odb access. Main ref store
has all flags, obviously, while submodule stores are read-only and have
access to odb (*).
The "main" flag stays because many functions in the backend calls
frontend ones without a ref store, so these functions always target the
main ref store. Ideally the flag should be gone after ref-store-aware
api is in place and used by backends.
(*) Submodule code needs for_each_ref. Try take REF_STORE_ODB flag
out. At least t3404 would fail. The "have access to odb" in submodule is
a bit hacky since we don't know from he whether add_submodule_odb() has
been called.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
has_sha1_file: don't bother if we are not in a repository
Most callers to this function already require that they are in a
git repository, but there is an exception: "git apply" uses
has_sha1_file to avoid work if the result of applying a binary
patch is already present in the repository. When run outside any
repository, this produces an error:
fatal: BUG: setup_git_env called without repository
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
http.postbuffer: allow full range of ssize_t values
Unfortunately, in order to push some large repos where a server does
not support chunked encoding, the http postbuffer must sometimes
exceed two gigabytes. On a 64-bit system, this is OK: we just malloc
a larger buffer.
This means that we need to use CURLOPT_POSTFIELDSIZE_LARGE to set the
buffer size.
Signed-off-by: David Turner <dturner@twosigma.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The left and right base directories were pointed to the buf field of
two strbufs, which were subject to change.
A contrived test case shows the problem where a file with a long enough
name to force the strbuf to grow is up-to-date (hence the code path is
used where the work tree's version of the file is reused), and then a
file that is not up-to-date needs to be written (hence the code path is
used where checkout_entry() uses the previously recorded base_dir that
is invalid by now).
Let's just copy the base_dir strings for use with checkout_entry(),
never touch them until the end, and release them then. This is an easily
verifiable fix (as opposed to the next-obvious alternative: to re-set
base_dir after every loop iteration).
This fixes https://github.com/git-for-windows/git/issues/1124
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-diff understands "--ours", "--theirs" and "--base" for files with
conflicts. But so far they were not documented for the central diff
command but only for diff-files.
Signed-off-by: Andreas Heiduk <asheiduk@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitattributes.txt: document how to normalize the line endings
The instructions how to normalize the line endings should have been updated
as part of commit 6523728499e 'convert: unify the "auto" handling of CRLF',
(but that part never made it into the commit).
Update the documentation in Documentation/gitattributes.txt and add
a test case in t0025.
Reported by Kristian Adrup
https://github.com/git-for-windows/git/issues/954
Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
http: fix the silent ignoring of proxy misconfiguraion
Earlier, the whole http.proxy option string was passed to curl without
any preprocessing so curl could complain about the invalid proxy
configuration.
After the commit 372370f167 ("http: use credential API to handle proxy
authentication", 2016-01-26), if the user specified an invalid HTTP
proxy option in the configuration, then the option parsing silently
fails and NULL will be passed to curl as a proxy. This forces curl to
fall back to detecting the proxy configuration from the environment,
causing the http.proxy option ignoring.
Fix this issue by checking the proxy option parsing result. If parsing
failed then print an error message and die. Such behaviour allows the
user to quickly figure the proxy misconfiguration and correct it.
Helped-by: Jeff King <peff@peff.net> Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
http: honor empty http.proxy option to bypass proxy
Curl distinguishes between an empty proxy address and a NULL proxy
address. In the first case it completely disables proxy usage, but if
the proxy address option is NULL then curl attempts to determine the
proxy address from the http_proxy environment variable.
According to the documentation, if the http.proxy option is set to an
empty string, git should bypass proxy and connect to the server
directly:
export http_proxy=http://network-proxy/
cd ~/foobar-project
git config remote.origin.proxy ""
git fetch
Previously, proxy host was configured by one line:
Commit 372370f167 ("http: use credential API to handle proxy
authentication", 2016-01-26) parses the proxy option, then extracts the
proxy host address and updates the curl configuration, making the
previous call a noop:
But if the proxy option is empty then the proxy host field becomes NULL.
This forces curl to fall back to detecting the proxy configuration from
the environment, causing the http.proxy option to not work anymore.
Fix this issue by explicitly handling http.proxy being set the empty
string. This also makes the code a bit more clear and should help us
avoid such regressions in the future.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The lazy-init codepath will not be exercised uniless threaded. Skip
the entire test on a single-core box. Also replace a hard-coded
constant of 2000 (number of cache entries to manifacture for tests)
with a variable with a human readable name.
Signed-off-by: Kevin Willford <kewillf@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
An empty line should stop any pending in-body headers, and start the
actual body parsing.
This also modifies the original test for the in-body headers to actually
have a real commit body that starts with spaces, and changes the test to
check that the long line matches _exactly_, and doesn't get extra data
from the body.
Fixes:6b4b013f1884 ("mailinfo: handle in-body header continuations") Cc: Jonathan Tan <jonathantanmy@google.com> Cc: Jeff King <peff@peff.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
push: propagate remote and refspec with --recurse-submodules
Teach "push --recurse-submodules" to propagate, if given a name as remote, the
provided remote and refspec recursively to the pushes performed in the
submodules. The push will therefore only succeed if all submodules have a
remote with such a name configured.
Note that "push --recurse-submodules" with a path or URL as remote will not
propagate the remote or refspec and instead use the default remote and refspec
configured in the submodule, preserving the current behavior.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the 'push-check' subcommand to submodule--helper which is used to
check if the provided remote and refspec can be used as part of a push
operation in the submodule.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git tag/branch/for-each-ref" family of commands long allowed to
filter the refs by "--contains X" (show only the refs that are
descendants of X), "--merged X" (show only the refs that are
ancestors of X), "--no-merged X" (show only the refs that are not
ancestors of X). One curious omission, "--no-contains X" (show
only the refs that are not descendants of X) has been added to
them.
* ab/ref-filter-no-contains:
tag: add tests for --with and --without
ref-filter: reflow recently changed branch/tag/for-each-ref docs
ref-filter: add --no-contains option to tag/branch/for-each-ref
tag: change --point-at to default to HEAD
tag: implicitly supply --list given another list-like option
tag: change misleading --list <pattern> documentation
parse-options: add OPT_NONEG to the "contains" option
tag: add more incompatibles mode tests
for-each-ref: partly change <object> to <commit> in help
tag tests: fix a typo in a test description
tag: remove a TODO item from the test suite
ref-filter: add test for --contains on a non-commit
ref-filter: make combining --merged & --no-merged an error
tag doc: reword --[no-]merged to talk about commits, not tips
tag doc: split up the --[no-]merged documentation
tag doc: move the description of --[no-]merged earlier
diff: submodule inline diff to initialize env array.
David reported:
> When I try to run `git diff --submodule=diff` in a submodule which has
> it's own submodules that have changes I get the error: fatal: bad
> object.
This happens, because we do not properly initialize the environment
in which the diff is run in the submodule. That means we inherit the
environment from the main process, which sets environment variables.
(Apparently we do set environment variables which we do not set
when not in a submodules, i.e. the .git directory is linked)
This commit, just like fd47ae6a5b (diff: teach diff to display
submodule difference with an inline diff, 2016-08-31) introduces bad
test code (i.e. hard coded hash values), which will be cleanup up in
a later patch.
Reported-by: David Parrish <daveparrish@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There isn't any obvious reason for the 'struct string_list push_options'
and 'struct string_list_item *item' to be marked as static, so unmark
them as being static. Also, clear the push_options string_list to
prevent memory leaking.
Signed-off-by: Brandon Williams <bmwill@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
index-pack: detect local corruption in collision check
When we notice that we have a local copy of an incoming
object, we compare the two objects to make sure we haven't
found a collision. Before we get to the actual object
bytes, though, we compare the type and size from
sha1_object_info().
If our local object is corrupted, then the type will be
OBJ_BAD, which obviously will not match the incoming type,
and we'll report "SHA1 COLLISION FOUND" (with capital
letters and everything). This is confusing, as the problem
is not a collision but rather local corruption. We should
report that instead (just like we do if reading the rest of
the object content fails a few lines later).
Note that we _could_ just ignore the error and mark it as a
non-collision. That would let you "git fetch" to replace a
corrupted object. But it's not a very reliable method for
repairing a repository. The earlier want/have negotiation
tries to get the other side to omit objects we already have,
and it would not realize that we are "missing" this
corrupted object. So we're better off complaining loudly
when we see corruption, and letting the user take more
drastic measures to repair (like making a full clone
elsewhere and copying the pack into place).
Note that the test sets transfer.unpackLimit in the
receiving repository so that we use index-pack (which is
what does the collision check). Normally for such a small
push we'd use unpack-objects, which would simply try to
write the loose object, and discard the new one when we see
that there's already an old one.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
sha1_loose_object_info: return error for corrupted objects
When sha1_loose_object_info() finds that a loose object file
cannot be stat(2)ed or mmap(2)ed, it returns -1 to signal an
error to the caller. However, if it found that the loose
object file is corrupt and the object data cannot be used
from it, it stuffs OBJ_BAD into "type" field of the
object_info, but returns zero (i.e., success), which can
confuse callers.
This is due to 052fe5eac (sha1_loose_object_info: make type
lookup optional, 2013-07-12), which switched the return to a
strict success/error, rather than returning the type (but
botched the return).
Callers of regular sha1_object_info() don't notice the
difference, as that function returns the type (which is
OBJ_BAD in this case). However, direct callers of
sha1_object_info_extended() see the function return success,
but without setting any meaningful values in the object_info
struct, leading them to access potentially uninitialized
memory.
The easiest way to see the bug is via "cat-file -s", which
will happily ignore the corruption and report whatever
value happened to be in the "size" variable.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
contrib/git-resurrect.sh: do not write \t for HT in sed scripts
Just like we did in 0d1d6e50 ("t/t7003: replace \t with literal tab
in sed expression", 2010-08-12), avoid writing "\t" for HT in sed
scripts, which is not portable.
Add check for the end of the entries for the thread partition.
Add test for lazy init name hash with specific directory structure
The lazy init hash name was causing a buffer overflow when the last
entry in the index was multiple folder deep with parent folders that
did not have any files in them.
This adds a test for the boundary condition of the thread partitions
with the folder structure that was triggering the buffer overflow.
The fix was to check if it is the last entry for the thread partition
in the handle_range_dir and not try to use the next entry in the cache.
Signed-off-by: Kevin Willford <kewillf@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: update and rename api-sha1-array.txt
Since the structure and functions have changed names, update the code
examples and the documentation. Rename the file to match the new name
of the API.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this structure handles an array of object IDs, rename it to struct
oid_array. Also rename the accessor functions and the initialization
constant.
This commit was produced mechanically by providing non-Documentation
files to the following Perl one-liners:
Convert sha1_array_for_each_unique and for_each_abbrev to object_id
Make sha1_array_for_each_unique take a callback using struct object_id.
Since one of these callbacks is an argument to for_each_abbrev, convert
those as well. Rename various functions, replacing "sha1" with "oid".
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our struct child_process already has its own argv_array.
Let's use that to avoid having to format options into
separate buffers.
Note that we'll need to declare the child process outside of
the run_service_command() helper to do this. But that opens
up a further simplification, which is that the helper can
append to our argument list, saving each caller from
specifying "." manually.
We probe the "17/" loose object directory for auto-gc, and
use a local buffer to format the path. We can just use
git_path() for this. It handles paths of any length
(reducing our error handling). And because we feed the
result straight to a system call, we can just use the static
variant.
Note that git_path also knows the string "objects/" is
special, and will replace it with git_object_directory()
when necessary.
Another alternative would be to use sha1_file_name() for the
pretend object "170000...", but that ends up being more
hassle for no gain, as we have to truncate the final path
component.