Convert the sha1 member of struct cache_tree to struct object_id by
changing the definition and applying the following semantic patch, plus
the standard object_id transforms:
Fix up one reference to active_cache_tree which was not automatically
caught by Coccinelle. These changes are prerequisites for converting
parse_object.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The semantic patch for standard object_id transforms found two
outstanding places where we could make a transformation automatically.
Apply these changes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert all uses of unsigned char [20] to struct object_id. Switch one
use of get_sha1_hex to parse_oid_hex to avoid the need for a constant.
This change is necessary in order to convert parse_object.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule: refactor logic to determine changed submodules
There are currently two instances (fetch and push) where we want to
determine if submodules have changed given some revision specification.
These two instances don't use the same logic to generate a list of
changed submodules and as a result there is a fair amount of code
duplication.
This patch refactors these two code paths such that they both use the
same logic to generate a list of changed submodules. This also makes it
easier for future callers to be able to reuse this logic as they only
need to create an argv_array with the revision specification to be using
during the revision walk.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Completion for "git checkout <branch>" that auto-creates the branch
out of a remote tracking branch can now be disabled, as this
completion often gets in the way when completing to checkout an
existing local branch that happens to share the same prefix with
bunch of remote tracking branches.
The function 'add_oid_to_argv()' provides the same functionality as
'append_oid_to_argv()'. Remove this duplicate function and instead use
'append_oid_to_argv()' where 'add_oid_to_argv()' was previously used.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename 'free_submodules_sha1s()' to 'free_submodules_oids()' since the
function frees a 'struct string_list' which has a 'struct oid_array'
stored in the 'util' field.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename 'add_sha1_to_array()' to 'append_oid_to_array()' to more
accurately describe what the function does, since it handles
'struct object_id' and not sha1 character arrays.
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
tests: rename a test having to do with shallow submodules
Rename the t5614-clone-submodules.sh test to
t5614-clone-submodules-shallow.sh. It's not a general test of
submodules, but of shallow cloning in relation to submodules. Move it
to create another similar t56*-clone-submodules-*.sh test.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
clone: add a --no-tags option to clone without tags
Add a --no-tags option to clone without fetching any tags.
Without this change there's no easy way to clone a repository without
also fetching its tags.
When supplying --single-branch the primary remote branch will be
cloned, but in addition tags will be followed & retrieved. Now
--no-tags can be added --single-branch to clone a repository without
tags, and which only tracks a single upstream branch.
This option works without --single-branch as well, and will do a
normal clone but not fetch any tags.
Many git commands pay some fixed overhead as a function of the number
of references. E.g. creating ~40k tags in linux.git will cause a
command like `git log -1 >/dev/null` to run in over a second instead
of in a matter of milliseconds, in addition numerous other things will
slow down, e.g. "git log <TAB>" with the bash completion will slowly
show ~40k references instead of 1.
The user might want to avoid all of that overhead to simply use a
repository like that to browse the "master" branch, or something like
a CI tool might want to keep that one branch up-to-date without caring
about any other references.
Without this change the only way of accomplishing this was either by
manually tweaking the config in a fresh repository:
Which requires hardcoding the "master" name, which may not be the main
--single-branch would have retrieved, or alternatively by setting
tagOpt=--no-tags right after cloning & deleting any existing tags:
git clone --single-branch git@github.com:git/git.git &&
cd git &&
git config remote.origin.tagOpt --no-tags &&
git tag -l | xargs git tag -d
Which of course was also subtly buggy if --branch was pointed at a
tag, leaving the user in a detached head:
git clone --single-branch --branch v2.12.0 git@github.com:git/git.git &&
cd git &&
git config remote.origin.tagOpt --no-tags &&
git tag -l | xargs git tag -d
Change occurrences "cd" followed by "fetch" on a single line to be on
two lines.
This is purely a stylistic change pointed out in code review for an
unrelated patch. Change the these tests use so new tests added later
using the more common style don't look out of place.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Peter Krefting <peter@softwolves.pp.se> Signed-off-by: Jean-Noel Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The building of the reflog message is using strbuf, which is not
friendly with internationalization frameworks. No other reflog
messages are translated right now and switching all the messages to
i18n would require a major rework of the way the messages are built.
Signed-off-by: Jean-Noel Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t7400: add !CYGWIN prerequisite to 'add with \\ in path'
Commit cf9e55f494 ("submodule: prevent backslash expantion in submodule
names", 07-04-2017) added a test which creates a git repository with
some backslash characters in the name. On windows, where the backslash
character is a directory separator, it is not possible to create a
repository with the name 'sub\with\backslash'. (The NTFS filesystem would
probably allow it, but the win32 api does not). The MinGW and Git for
Windows versions of git actually create a repository called 'backslash'
in the sub-directory 'sub/with'.
On cygwin, however, due to the slightly schizophrenic treatment of the
backslash character by cygwin-git, this test fails at the 'git init'
stage. The git-init command does not recognise the directory separators
in the input path (eg. is_dir_sep('\\') is false), so it does not
attempt to create the leading directories 'sub/with'. (The call to
mkdir('sub\\with\\backslash') actually does recognise the directory
separators, but fails because the 'sub/with' directory doesn't exist).
In order to suppress the test failure (for now), add the !CYGWIN test
prerequisite.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
githooks.txt: clarify push hooks are always executed in $GIT_DIR
Listing the specific hooks might feel verbose but without it the
reader is left to wonder which hooks are triggered during the
push. Something which is not immediately obvious when only trying
to find out where the hook is executed.
Signed-off-by: Simon Ruderich <simon@ruderich.org> Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check if unzip supports the ZIP64 format and skip the tests that create
big archives otherwise. Also skip the test that archives a big file on
32-bit platforms because the git object systems can't unpack files
bigger than 4GB there.
Reported-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
status: fix missing newline when comment chars are disabled
When git-status shows tracking data for the current branch
in the long format, we try to end the stanza with a blank
line. When status.displayCommentPrefix is true, we call
color_fprintf_ln() to do so. But when it's false, we call
the enigmatic:
fputs("", s->fp);
which does nothing at all! This is a bug from 7d7d68022
(silence a bunch of format-zero-length warnings,
2014-05-04). Prior to that, we called fprintf_ln() with an
empty string. Switching to fputs() meant we needed to
include the "newline in the string, but we didn't.
So you see:
On branch jk/status-tracking-newline
Your branch is ahead of 'origin/master' by 1 commit.
Changes not staged for commit:
modified: foo
Untracked files:
bar
whereas there should be a blank line before the "Changes not
staged" line.
The fix itself is a one-liner. But we never noticed this
bug because t7508 doesn't exercise the ahead/behind code at
all. So let's configure an upstream during the initial
setup, which means that the code will be exercised as part
of all of the various invocations in that script. This makes
the diff rather noisy, but should give us good coverage.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
archive-zip: set version field for big files correctly
Signal that extractors need to implement spec version 4.5 (or higher)
for files with sizes of 4GB and more. Older unzippers might produce
truncated results otherwise; they should rather refuse to extract.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
am: shorten ident_split variable name in get_commit_info()
The local ident_split variable is often mentioned three
times per line when dealing with its begin/end pointer
pairs. Let's use a shorter name which lets us get rid of
some long lines. Since this is a short self-contained
function, readability doesn't suffer.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
After we call split_ident_line(), we have several begin/end
pairs for various parts of the ident. We then copy each into
a strbuf to create a single string, and then detach that
string. We can instead skip the strbuf entirely and just
duplicate the strings directly.
This is shorter, and it makes it more obvious that we are
not leaking the strbuf (we were not before, because every
code path either died or hit a strbuf_detach).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Calling logmsg_reencode() may allocate a buffer for the
commit message (because we need to load it from disk, or
because it needs re-encoded). We must "unuse" it afterwards
to free it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, we used `unsigned long` for timestamps. This was only a good
choice on Linux, where we know implicitly that `unsigned long` is what is
used for `time_t`.
However, we want to use a different data type for timestamps for two
reasons:
- there is nothing that says that `unsigned long` should be the same data
type as `time_t`, and indeed, on 64-bit Windows for example, it is not:
`unsigned long` is 32-bit but `time_t` is 64-bit.
- even on 32-bit Linux, where `unsigned long` (and thereby `time_t`) is
32-bit, we *want* to be able to encode timestamps in Git that are
currently absurdly far in the future, *even if* the system library is
not able to format those timestamps into date strings.
So let's just switch to the maximal integer type available, which should
be at least 64-bit for all practical purposes these days. It certainly
cannot be worse than `unsigned long`, so...
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).
So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.
By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.
As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
rebase -i: reread the todo list if `exec` touched it
In the scripted version of the interactive rebase, there was no internal
representation of the todo list; it was re-read before every command.
That allowed the hack that an `exec` command could append (or even
completely rewrite) the todo list.
This hack was broken by the partial conversion of the interactive rebase
to C, and this patch reinstates it.
We also add a small test to verify that this fix does not regress in the
future.
Signed-off-by: Stephen Hicks <sdh@google.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Linux32 build was not build with our strict compiler settings (e.g.
warnings as errors). Fix this by passing the DEVELOPER environment
variable to the docker container.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the $STATUS variable contains a "%" character then printf will
interpret that as invalid format string. Fix this by formatting $STATUS
as string.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
`make` does not necessarily fail with an error code if
Asciidoc/AsciiDoctor encounters problems. Anything written to stderr
might be a better indicator for problems.
Ensure that nothing is written to stderr during a documentation build.
The redirects do not work in `sh`, therefore the script uses `bash`.
This shouldn't be a problem as the script is only executed on TravisCI.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When encountering a commit message that does not end in a newline,
sequencer does not complete the line before determining if a blank line
should be added. This causes the "(cherry picked..." and sign-off lines
to sometimes appear on the same line as the last line of the commit
message.
This behavior was introduced by commit 967dfd4 ("sequencer: use
trailer's trailer layout", 2016-11-29). However, a revert of that commit
would not resolve this issue completely: prior to that commit, a
conforming footer was deemed to be non-conforming by
has_conforming_footer() if there was no terminating newline, resulting
in both conforming and non-conforming footers being treated the same
when they should not be.
Resolve this issue, both for conforming and non-conforming footers, and
in both do_pick_commit() and append_signoff(), by always adding a
newline to the commit message if it does not end in one before checking
the footer for conformity.
Reported-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t1450: avoid use of "sed" on the index, which is a binary file
The previous step added a path zzzzzzzz to the index, and then used
"sed" to replace this string to yyyyyyyy to create a test case where
the checksum at the end of the file does not match the contents.
Unfortunately, use of "sed" on a non-text file is not portable.
Instead, use a Perl script that seeks to the end and modifies the
last byte of the file (where we _know_ stores the trailing
checksum).
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internals of the refs API around the cached refs has been
streamlined.
* mh/separate-ref-cache:
do_for_each_entry_in_dir(): delete function
files_pack_refs(): use reference iteration
commit_packed_refs(): use reference iteration
cache_ref_iterator_begin(): make function smarter
get_loose_ref_cache(): new function
get_loose_ref_dir(): function renamed from get_loose_refs()
do_for_each_entry_in_dir(): eliminate `offset` argument
refs: handle "refs/bisect/" in `loose_fill_ref_dir()`
ref-cache: use a callback function to fill the cache
refs: record the ref_store in ref_cache, not ref_dir
ref-cache: introduce a new type, ref_cache
refs: split `ref_cache` code into separate files
ref-cache: rename `remove_entry()` to `remove_entry_from_dir()`
ref-cache: rename `find_ref()` to `find_ref_entry()`
ref-cache: rename `add_ref()` to `add_ref_entry()`
refs_verify_refname_available(): use function in more places
refs_verify_refname_available(): implement once for all backends
refs_ref_iterator_begin(): new function
refs_read_raw_ref(): new function
get_ref_dir(): don't call read_loose_refs() for "refs/bisect"
Allow to lock a worktree immediately after it's created. This helps
prevent a race between "git worktree add; git worktree lock" and
"git worktree prune".
While handy, "git_path()" is a dangerous function to use as a
callsite that uses it safely one day can be broken by changes
to other code that calls it. Reduction of its use continues.
* jk/war-on-git-path:
am: drop "dir" parameter from am_state_init
replace strbuf_addstr(git_path()) with git_path_buf()
replace xstrdup(git_path(...)) with git_pathdup(...)
use git_path_* helper functions
branch: add edit_description() helper
bisect: add git_path_bisect_terms helper
"git checkout" that handles a lot of paths has been optimized by
reducing the number of unnecessary checks of paths in the
has_dir_name() function.
* jh/add-index-entry-optim:
read-cache: speed up has_dir_name (part 2)
read-cache: speed up has_dir_name (part 1)
read-cache: speed up add_index_entry during checkout
p0006-read-tree-checkout: perf test to time read-tree
read-cache: add strcmp_offset function
The recently introduced conditional inclusion of configuration did
not work well when early-config mechanism was involved.
* nd/conditional-config-in-early-config:
config: correct file reading order in read_early_config()
config: handle conditional include when $GIT_DIR is not set up
config: prepare to pass more info in git_config_with_options()
Having a git command on the upstream side of a pipe in a test
script will hide the exit status from the command, which may cause
us to fail to notice a breakage; rewrite tests in a script to avoid
this issue.
* pc/t2027-git-to-pipe-cleanup:
t2027: avoid using pipes
* gb/rebase-signoff:
rebase: pass --[no-]signoff option to git am
builtin/am: fold am_signoff() into am_append_signoff()
builtin/am: honor --signoff also when --rebasing
The convention "$remove_trash is set to the trash directory that is
used during the test, so that it will be removed at the end, but
under --debug option we set the varilable to empty string to
preserve the directory" made sense back when it was introduced, as
there was no $TRASH_DIRECTORY variable. These days, since no tests
looks at the variable, it is obscure and even risks that by mistake
the variable gets used for something else (e.g. remove_trash=yes)
and cause us misbehave. Worse yet, remove_trash was not initialized
to an empty string at the beginning, so a stray environment variable
the user has could have affected the logic when "--debug" is in use.
Rewrite the clean-up sequence in test_done helper to explicitly
check the $debug condition and remove the trash directory using
the $TRASH_DIRECTORY variable.
Note that "go to the directory one level above the trash and then
remove it" is kept and this is deliverate; test_at_end_hook_ will
keep running from the expected location, and also some platforms may
not like a directory that is serving as the $cwd of a still-active
process removed.
test-lib.sh: do not barf under --debug at the end of the test
The original did "does $remove_trash exist? Then go one level above
and remove it". There was no problem under "--debug", where
the variable is left empty, as the first "test -d $remove_trash" would
have said "No, it doesn't".
With the check implemented in the previous step, we'd always get an
error under "--debug".
Noticed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Write a zip64 extended information extra field for big files as part of
their local headers and as part of their central directory headers.
Also write a zip64 version of the data descriptor in that case.
If we're streaming then we don't know the compressed size at the time we
write the header. Deflate can end up making a file bigger instead of
smaller if we're unlucky. Write a local zip64 header already for files
with a size of 2GB or more in this case to be on the safe side.
Both sizes need to be included in the local zip64 header, but the extra
field for the directory must only contain 64-bit equivalents for 32-bit
values of 0xffffffff.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a zip64 extended information extra field to the central directory
and emit the zip64 end of central directory records as well as locator
if the offset of an entry within the archive exceeds 4GB.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
archive-zip: write ZIP dir entry directly to strbuf
Write all fields of the ZIP directory record for an archive entry
in the right order directly into the strbuf instead of taking a detour
through a struct. Do that at end, when we have all necessary data like
checksum and compressed size. The fields are documented just as well,
the code becomes shorter and we save an extra copy.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Keep the ZIP central directory, which is written after all archive
entries, in a strbuf instead of a custom-managed buffer. It contains
binary data, so we can't (and don't want to) use the full range of
strbuf functions and we don't need the terminating NUL, but the result
is shorter and simpler code.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test the creation of ZIP archives bigger than 4GB and containing files
bigger than 4GB. They are marked as EXPENSIVE because they take quite a
while and because the first one needs a bit more than 4GB of disk space
to store the resulting archive.
The big archive in the first test is made up of a tree containing
thousands of copies of a small file. Yet the test has to write out the
full archive because unzip doesn't offer a way to read from stdin.
The big file in the second test is provided as a zipped pack file to
avoid writing another 4GB file to disk and then adding it.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
70999e9cec (branch -m: update all per-worktree HEADs - 2016-03-27)
added this function in order to update HEADs of all relevant
worktrees, when a branch is renamed.
It, as a public ref api, kind of breaks abstraction when it uses
internal functions of files backend. With the introduction of
refs_create_symref(), we can move back pretty close to the code before 70999e9cec, where create_symref() was used for updating HEAD.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
worktree.c: kill parse_ref() in favor of refs_resolve_ref_unsafe()
The manual parsing code is replaced with a call to refs_resolve_ref_unsafe().
The manual parsing code must die because only refs/files-backend.c
should do that.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
files-backend at this point is still aware of the per-repo/worktree
separation in refs, so it can handle a linked worktree.
Some refs operations are known not working when current files-backend is
used in a linked worktree (e.g. reflog). Tests will be written when
refs_* functions start to be called with worktree backend to verify that
they work as expected.
Note: accessing a worktree of a submodule remains unaddressed. Perhaps
after get_worktrees() can access submodule (or rather a new function
get_submodule_worktrees(), that lists worktrees of a submodule), we can
update this function to work with submodules as well.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
prio_queue_reverse: don't swap elements with themselves
Our array-reverse algorithm does the usual "walk from both
ends, swapping elements". We can quit when the two indices
are equal, since:
1. Swapping an element with itself is a noop.
2. If i and j are equal, then in the next iteration i is
guaranteed to be bigge than j, and we will exit the
loop.
So exiting the loop on equality is slightly more efficient.
And more importantly, the new SWAP() macro does not expect
to handle noop swaps; it will call memcpy() with the same src
and dst pointers in this case. It's unclear whether that
causes a problem on any platforms by violating the
"overlapping memory" constraint of memcpy, but it does cause
valgrind to complain.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gethostname(2) may not NUL terminate the buffer if hostname does
not fit; unfortunately there is no easy way to see if our buffer
was too small, but at least this will make sure we will not end up
using garbage past the end of the buffer.
* dt/xgethostname-nul-termination:
xgethostname: handle long hostnames
use HOST_NAME_MAX to size buffers for gethostname(2)
* rs/misc-cppcheck-fixes:
server-info: avoid calling fclose(3) twice in update_info_file()
files_for_each_reflog_ent_reverse(): close stream and free strbuf on error
am: close stream on error, but not stdin
"git fetch-pack" was not prepared to accept ERR packet that the
upload-pack can send with a human-readable error message. It
showed the packet contents with ERR prefix, so there was no data
loss, but it was redundant to say "ERR" in an error message.
* jt/fetch-pack-error-reporting:
fetch-pack: show clearer error message upon ERR
* jk/quarantine-received-objects:
refs: reject ref updates while GIT_QUARANTINE_PATH is set
receive-pack: document user-visible quarantine effects
receive-pack: drop tmp_objdir_env from run_update_hook
The index file has a trailing SHA-1 checksum to detect file
corruption, and historically we checked it every time the index
file is used. Omit the validation during normal use, and instead
verify only in "git fsck".
In a 2- and 3-way merge of trees, more than one source trees often
end up sharing an identical subtree; optimize by not reading the
same tree multiple times in such a case.
* jh/unpack-trees-micro-optim:
unpack-trees: avoid duplicate ODB lookups during checkout
The string-list API used a custom reallocation strategy that was
very inefficient, instead of using the usual ALLOC_GROW() macro,
which has been fixed.
* jh/string-list-micro-optim:
string-list: use ALLOC_GROW macro when reallocing string_list
$GIT_DIR may in some cases be normalized with all symlinks resolved
while "gitdir" path expansion in the pattern does not receive the
same treatment, leading to incorrect mismatch. This has been fixed.
* nd/conditional-config-include:
config: resolve symlinks in conditional include's patterns
path.c: and an option to call real_path() in expand_user_path()
Allow the http.postbuffer configuration variable to be set to a
size that can be expressed in size_t, which can be larger than
ulong on some platforms.
* dt/http-postbuffer-can-be-large:
http.postbuffer: allow full range of ssize_t values
t/perf: correctly align non-ASCII descriptions in output
Change the test descriptions from being treated as binary blobs by
perl to being treated as UTF-8. This ensures that e.g. a test
description like "æ" is counted as 1 character, not 2.
I have WIP performance tests for non-ASCII grep patterns on another
topic that are affected by this.
Now instead of:
$ ./run p0000-perf-lib-sanity.sh
[...]
0000.4: export a weird var 0.00(0.00+0.00)
0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00)
0000.7: important variables available in subshells 0.00(0.00+0.00)
[...]
We emit:
[...]
0000.4: export a weird var 0.00(0.00+0.00)
0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00)
0000.7: important variables available in subshells 0.00(0.00+0.00)
[...]
Fixes code originally added in 342e9ef2d9 ("Introduce a performance
testing framework", 2012-02-17).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
PRItime: introduce a new "printf format" for timestamps
Currently, Git's source code treats all timestamps as if they were
unsigned longs. Therefore, it is okay to write "%lu" when printing them.
There is a substantial problem with that, though: at least on Windows,
time_t is *larger* than unsigned long, and hence we will want to switch
away from the ill-specified `unsigned long` data type.
So let's introduce the pseudo format "PRItime" (currently simply being
defined to "lu") to make it easier to change the data type used for
timestamps.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_timestamp(): specify explicitly where we parse timestamps
Currently, Git's source code represents all timestamps as `unsigned
long`. In preparation for using a more appropriate data type, let's
introduce a symbol `parse_timestamp` (currently being defined to
`strtoul`) where appropriate, so that we can later easily switch to,
say, use `strtoull()` instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We generally disallow null sha1s from entering the index,
due to 4337b5856 (do not write null sha1s to on-disk index,
2012-07-28). However, we loosened that in 83bd7437c
(write_index: optionally allow broken null sha1s,
2013-08-27) so that tools like filter-branch could be used
to repair broken history.
However, we should make sure that these broken entries do
not get propagated into new trees. For most entries, we'd
catch them with the missing-object check (since presumably
the null sha1 does not exist in our object database). But
gitlink entries do not need reachability, so we may blindly
copy the entry into a bogus tree.
This patch rejects all null sha1s (with the same "invalid
entry" message that missing objects get) when building trees
from the index. It does so even for non-gitlinks, and even
when "write-tree" is given the --missing-ok flag. The null
sha1 is a special sentinel value that is already rejected in
trees by fsck; whether the object exists or not, it is an
error to put it in a tree.
Note that for this to work, we must also avoid reusing an
existing cache-tree that contains the null sha1. This patch
does so by just refusing to write out any cache tree when
the index contains a null sha1. This is blunter than we need
to be; we could just reject the subtree that contains the
offending entry. But it's not worth the complexity. The
behavior is unchanged unless you have a broken index entry,
and even then we'd refuse the whole index write unless the
emergency GIT_ALLOW_NULL_SHA1 is in use. And even then the
end result is only a performance drop (any write-tree will
have to generate the whole cache-tree from scratch).
The tests bear some explanation.
The existing test in t7009 doesn't catch this problem,
because our index-filter runs "git rm --cached", which will
try to rewrite the updated index and barf on the bogus
entry. So we never even make it to write-tree. The new test
there adds a noop index-filter, which does show the problem.
The new tests in t1601 are slightly redundant with what
filter-branch is doing under the hood in t7009. But as
they're much more direct, they're easier to reason about.
And should filter-branch ever change or go away, we'd want
to make sure that these plumbing commands behave sanely.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>