gitweb.git
completion: simplify prefix path component handling... SZEDER Gábor Mon, 16 Apr 2018 22:41:07 +0000 (00:41 +0200)

completion: simplify prefix path component handling during path completion

Once upon a time 'git -C "" cmd' errored out with "Cannot change to
'': No such file or directory", therefore the completion script took
extra steps to run 'git -C "." cmd' instead; see fca416a41e
(completion: use "git -C $there" instead of (cd $there && git ...),
2014-10-09).

Those extra steps are not needed since 6a536e2076 (git: treat "git -C
'<path>'" as a no-op when <path> is empty, 2015-03-06), so remove
them.

While at it, also simplify how the trailing '/' is appended to the
variable holding the prefix path components.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

completion: move __git_complete_index_file() next to... SZEDER Gábor Mon, 16 Apr 2018 22:41:06 +0000 (00:41 +0200)

completion: move __git_complete_index_file() next to its helpers

It's much easier to read, understand and modify the functions related
to git-aware path completion when they are right next to each other.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t9902-completion: add tests demonstrating issues with... SZEDER Gábor Mon, 16 Apr 2018 22:41:05 +0000 (00:41 +0200)

t9902-completion: add tests demonstrating issues with quoted pathnames

Completion functions see all words on the command line verbatim,
including any backslash-escapes, single and double quotes that might
be there. Furthermore, git commands quote pathnames if they contain
certain special characters. All these create various issues when
doing git-aware path completion.

Add a couple of failing tests to demonstrate these issues.

Later patches in this series will discuss these issues in detail as
they fix them.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

glossary: substitute "ancestor" for "direct ancestor... Sergey Organov Mon, 16 Apr 2018 05:43:16 +0000 (08:43 +0300)

glossary: substitute "ancestor" for "direct ancestor" in 'push' description.

Even though "direct ancestor" is not defined in the glossary, the
common meaning of the term is simply "parent", parents being the only
direct ancestors, and the rest of ancestors being indirect ancestors.

As "parent" is obviously wrong in this place in the description, we
should simply say "ancestor", as everywhere else.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t1510-repo-setup.sh: remove useless mkdir Tao Qingyun Sun, 15 Apr 2018 02:45:04 +0000 (10:45 +0800)

t1510-repo-setup.sh: remove useless mkdir

Signed-off-by: Tao Qingyun <845767657@qq.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

git{,-blame}.el: remove old bitrotting Emacs code Ævar Arnfjörð Bjarmason Wed, 11 Apr 2018 20:42:05 +0000 (20:42 +0000)

git{,-blame}.el: remove old bitrotting Emacs code

The git-blame.el mode has been superseded by Emacs's own
vc-annotate (invoked by C-x v g). Users of the git.el mode are now
much better off using either Magit or the Git backend for Emacs's own
VC mode.

These modes were added over 10 years ago when Emacs's own Git support
was much less mature, and there weren't other mature modes in the wild
or shipped with Emacs itself.

These days these modes have few if any users, and users of git aren't
well served by us shipping these (some OS's install them alongside git
by default, which is confusing and leads users astray).

So let's remove these per Alexandre Julliard's message to the
ML[1]. If someone still wants these for some reason they're better
served by hosting these elsewhere (e.g. on ELPA), instead of us
distributing them with git.

However, since downstream packagers such as Debian are packaging this
as git-el it's less disruptive to still carry these files as Elisp
code that'll error out with a message suggesting alternatives, rather
than drop the files entirely[2].

Then, rather than receiving a cryptic load error when they upgrade,
existing users will get an error directing them to the README file, or
to just stop requiring these modes. I think it makes sense to link to
GitHub's hosting of contrib/emacs/README (which'll be updated by the
time users see this) so they don't have to hunt down the packaged
README on their local system.

1. "Re: [PATCH] git.el: handle default excludesfile
properly" (87muzlwhb0.fsf@winehq.org) --
https://public-inbox.org/git/87muzlwhb0.fsf@winehq.org/

2. "Re: [PATCH v3] git{,-blame}.el: remove old bitrotting Emacs
code" (20180327165751.GA4343@aiede.svl.corp.google.com) --
https://public-inbox.org/git/20180327165751.GA4343@aiede.svl.corp.google.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gpg-interface: find the last gpg signature line Jeff King Fri, 13 Apr 2018 21:18:35 +0000 (15:18 -0600)

gpg-interface: find the last gpg signature line

A signed tag has a detached signature like this:

    object ...
    [...more header...]

    This is the tag body.

    -----BEGIN PGP SIGNATURE-----
    [opaque gpg data]
    -----END PGP SIGNATURE-----

Our parser finds the _first_ line that appears to start a
PGP signature block, meaning we may be confused by a
signature (or a signature-like line) in the actual body.
Let's keep parsing and always find the final block, which
should be the detached signature over all of the preceding
content.
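
The "keep parsing and remember the last match" idea can be sketched as follows. This is a minimal illustration, not the code from gpg-interface.c: the names is_signature_line() and find_last_signature() are made up for this example, and only the PGP "BEGIN" marker shown above is recognized.

```c
#include <stddef.h>
#include <string.h>

#define PGP_SIGNATURE "-----BEGIN PGP SIGNATURE-----"

/* Hypothetical helper: does this line start a signature block? */
static int is_signature_line(const char *line, size_t len)
{
    size_t n = strlen(PGP_SIGNATURE);

    return len >= n && !memcmp(line, PGP_SIGNATURE, n);
}

/*
 * Hypothetical helper: return the offset of the LAST line that starts a
 * signature block, or `size` when the buffer contains none.
 */
static size_t find_last_signature(const char *buf, size_t size)
{
    size_t pos = 0;
    size_t match = size;

    while (pos < size) {
        /* keep the pointer const: memchr() hands back a plain void * */
        const char *eol = memchr(buf + pos, '\n', size - pos);
        size_t line_len = eol ? (size_t)(eol - (buf + pos)) + 1 : size - pos;

        if (is_signature_line(buf + pos, line_len))
            match = pos; /* do not stop: a later block wins */
        pos += line_len;
    }
    return match;
}
```

A signature-like line inside the tag body is simply overwritten by the later match, so the returned offset points at the final block.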

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gpg-interface: extract gpg line matching helper Jeff King Fri, 13 Apr 2018 21:18:34 +0000 (15:18 -0600)

gpg-interface: extract gpg line matching helper

Let's separate the actual line-by-line parsing of signatures
from the notion of "is this a gpg signature line". That will
make it easier to do more refactoring of this loop in future
patches.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gpg-interface: fix const-correctness of "eol" pointer Jeff King Fri, 13 Apr 2018 21:18:33 +0000 (15:18 -0600)

gpg-interface: fix const-correctness of "eol" pointer

We accidentally shed the "const" of our buffer by passing it
through memchr. Let's fix that, and while we're at it, move
our variable declaration inside the loop, which is the only
place that uses it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gpg-interface: use size_t for signature buffer size Jeff King Fri, 13 Apr 2018 21:18:32 +0000 (15:18 -0600)

gpg-interface: use size_t for signature buffer size

Even though our object sizes (from which these buffers would
come) are typically "unsigned long", this is something we'd
like to eventually fix (since it's only 32 bits even on
64-bit Windows). It makes more sense to use size_t when
taking an in-memory buffer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gpg-interface: modernize function declarations Jeff King Fri, 13 Apr 2018 21:18:31 +0000 (15:18 -0600)

gpg-interface: modernize function declarations

Let's drop "extern" from our declarations, which brings us
in line with our modern style guidelines. While we're
here, let's wrap some of the overly long lines, and move
docstrings for public functions to their declarations, since
they document the interface.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gpg-interface: handle bool user.signingkey Jeff King Fri, 13 Apr 2018 21:18:30 +0000 (15:18 -0600)

gpg-interface: handle bool user.signingkey

The config handler for user.signingkey does not check for a
boolean value, and thus:

    git -c user.signingkey tag

will segfault. We could fix this and even shorten the code
by using git_config_string(). But our set_signing_key()
helper is used by other code outside of gpg-interface.c, so
we must keep it (and we may as well use it, because unlike
git_config_string() it does not leak when we overwrite an
old value).
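
To illustrate the failure mode, here is a small sketch of a config callback that rejects the valueless form before using the value. It assumes that a key given without "=value" reaches the callback with a NULL value; handle_config() and the signing_key variable are hypothetical stand-ins, not the real functions in gpg-interface.c.

```c
#include <stdlib.h>
#include <string.h>

static char *signing_key; /* hypothetical stand-in for the stored key */

/*
 * Hypothetical config callback. `value` is assumed to be NULL when the
 * key was given without "=value". Returns 0 on success, -1 on error.
 */
static int handle_config(const char *var, const char *value)
{
    if (!strcmp(var, "user.signingkey")) {
        char *copy;
        size_t len;

        if (!value)
            return -1; /* report an error instead of dereferencing NULL */
        len = strlen(value) + 1;
        copy = malloc(len);
        if (!copy)
            return -1;
        memcpy(copy, value, len);
        free(signing_key); /* do not leak a previously stored value */
        signing_key = copy;
        return 0;
    }
    return 0; /* not a key this handler cares about */
}
```

Freeing the old value before storing the new one mirrors the "does not leak when we overwrite an old value" property mentioned above.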

Ironically, the handler for gpg.program just below _could_
use git_config_string() but doesn't. But since we're going
to touch that in a future patch, we'll leave it alone for
now. We will add some whitespace and returns in preparation
for adding more config keys, though.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t7004: fix mistaken tag name Jeff King Fri, 13 Apr 2018 21:18:29 +0000 (15:18 -0600)

t7004: fix mistaken tag name

We have a series of tests which create signed tags with
various properties, but one test accidentally verifies a tag
from much earlier in the series.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Makefile: add a DEVOPTS to get all of -Wextra Ævar Arnfjörð Bjarmason Sat, 14 Apr 2018 19:19:46 +0000 (19:19 +0000)

Makefile: add a DEVOPTS to get all of -Wextra

Change DEVOPTS to understand an "extra-all" option. When the DEVELOPER
flag is enabled we turn on -Wextra, but manually switch off some of the
warnings it turns on.

This is because we have many existing occurrences of them in the code
base. This mode stops the suppression and lets the developer see them
and decide whether to fix them.

This change is a slight alteration of Nguyễn Thái Ngọc Duy's
EAGER_DEVELOPER mode patch[1].

1. "[PATCH v3 3/3] Makefile: add EAGER_DEVELOPER
mode" (<20180329150322.10722-4-pclouds@gmail.com>;
https://public-inbox.org/git/20180329150322.10722-4-pclouds@gmail.com/)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Makefile: add a DEVOPTS to suppress -Werror under DEVELOPER Ævar Arnfjörð Bjarmason Sat, 14 Apr 2018 19:19:45 +0000 (19:19 +0000)

Makefile: add a DEVOPTS to suppress -Werror under DEVELOPER

Add a DEVOPTS variable that'll be used to tweak the behavior of
DEVELOPER.

I've long wanted to use DEVELOPER=1 in my production builds, but on
some old systems I still get warnings, and thus the build would
fail. However if the build/tests fail for some other reason, it would
still be useful to scroll up and see what the relevant code is warning
about.

This change allows for that. Now setting DEVELOPER will set -Werror as
before, but if DEVOPTS=no-error is provided you'll get the same
warnings, but without -Werror.

Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Makefile: detect compiler and enable more warnings... Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 19:19:44 +0000 (19:19 +0000)

Makefile: detect compiler and enable more warnings in DEVELOPER=1

The set of extra warnings we enable when DEVELOPER is set has to be
conservative because we can't assume any compiler version the
developer may use. Detect the compiler version so we know when it's
safe to enable -Wextra and maybe more.

These warning settings are mostly from my custom config.mak a long
time ago when I tried to enable as many warnings as possible that can
still build without showing warnings. Some of those warnings are
probably worth fixing instead of just suppressing in future.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

connect.c: mark die_initial_contact() NORETURN Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 19:19:43 +0000 (19:19 +0000)

connect.c: mark die_initial_contact() NORETURN

There is a series running in parallel with this one that adds code
like this

    switch (...) {
    case ...:
            die_initial_contact();
    case ...:

There is nothing wrong with this. There is no actual falling
through. But since gcc is not that smart and gcc 7.x introduces
-Wimplicit-fallthrough, it raises a false alarm in this case.

This class of warnings may be useful elsewhere, so instead of
suppressing the whole class, let's try to fix just this code. gcc is
smart enough to realize that no execution can continue after a
NORETURN function call and no longer raises the warning.
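
A self-contained sketch of the pattern, assuming a GCC-compatible compiler: the NORETURN macro is defined locally here to mirror the name used above, and die_example() and handle_state() are hypothetical functions written only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Local definition for this sketch; expands to nothing on other compilers. */
#if defined(__GNUC__)
#define NORETURN __attribute__((__noreturn__))
#else
#define NORETURN
#endif

/* Hypothetical "die" function: prints a message and never returns. */
static NORETURN void die_example(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(128);
}

/* Hypothetical caller with a case that ends in a NORETURN call. */
static int handle_state(int state)
{
    int result;

    switch (state) {
    case 0:
        die_example("initial contact failed");
        /* no "break" needed: control cannot reach the next label */
    case 1:
        result = 10;
        break;
    default:
        result = -1;
        break;
    }
    return result;
}
```

Without the attribute on die_example(), a compiler with -Wimplicit-fallthrough enabled may flag the transition from case 0 to case 1.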

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: show some progress when counting kept... Nguyễn Thái Ngọc Duy Sun, 15 Apr 2018 15:36:18 +0000 (17:36 +0200)

pack-objects: show some progress when counting kept objects

We only show progress when there are new objects to be packed. But
when --keep-pack is specified on the base pack, we will exclude most
of the objects. This makes 'pack-objects' stay silent for a long time
while the counting phase is going.

Let's show some progress whenever we visit an object instead. The old
"Counting objects" is renamed to "Enumerating objects" and a new
progress "Counting objects" line is added.

This new "Counting objects" line should progress pretty quickly when the
system is beefy. But when the system is under pressure, reading the
object headers in this phase could be slow, and showing progress is
an improvement over staying silent as the current code does.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc --auto: exclude base pack if not enough mem to ... Nguyễn Thái Ngọc Duy Sun, 15 Apr 2018 15:36:17 +0000 (17:36 +0200)

gc --auto: exclude base pack if not enough mem to "repack -ad"

pack-objects could be a big memory hog especially on large repos,
everybody knows that. The suggestion to stick a .keep file on the
giant base pack to avoid this problem has also been known for a long time.

Recent patches add an option to do just this, but it has to be either
configured or activated manually. This patch lets `git gc --auto`
activate this mode automatically when it thinks `repack -ad` will use
a lot of memory and start affecting the system due to swapping or
flushing OS cache.

gc --auto decides to do this based on an estimation of pack-objects
memory usage, which is quite accurate at least for the heap part, and
whether that fits in half of system memory (the assumption here is for
desktop environment where there are many other applications running).
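
The decision described above could look roughly like this. It is only a sketch: should_keep_base_pack() and its parameters are hypothetical, and how the memory estimate and the total system memory are obtained is left out.

```c
#include <stdint.h>

/*
 * Hypothetical decision helper: keep the base pack out of the repack when
 * the estimated pack-objects memory use does not fit in half of system
 * memory. An explicitly configured threshold disables this heuristic.
 */
static int should_keep_base_pack(uint64_t estimated_mem,
                                 uint64_t total_ram,
                                 int threshold_configured)
{
    if (threshold_configured)
        return 0; /* the user's own setting decides instead */
    if (!total_ram)
        return 0; /* memory size unknown: do not guess */
    return estimated_mem > total_ram / 2;
}
```

For example, an estimate of 3GB on a 4GB machine would trigger the mode, while 1GB on the same machine would not.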

This mechanism only kicks in if gc.bigPackThreshold is not configured.
If it is, it is assumed that the user already knows what they want.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc: handle a corner case in gc.bigPackThreshold Nguyễn Thái Ngọc Duy Sun, 15 Apr 2018 15:36:16 +0000 (17:36 +0200)

gc: handle a corner case in gc.bigPackThreshold

This config allows us to keep <N> packs back if their size is larger
than a limit. But if this N >= gc.autoPackLimit, we may have a
problem. We are supposed to reduce the number of packs after a
threshold because it affects performance.

We could tell the user that they have incompatible gc.bigPackThreshold
and gc.autoPackLimit, but it's kinda hard when 'git gc --auto' runs in
background. Instead let's fall back to the next best strategy: try to
reduce the number of packs anyway, but keep the base pack out. This
reduces the number of packs to two and hopefully won't take up too
much resources to repack (the assumption still is the base pack takes
most resources to handle).
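
The selection logic and its fallback can be sketched as below. This is an illustration under assumptions, not the code in builtin/gc.c: select_kept_packs() is a hypothetical name, pack sizes are passed in as a plain array, and "larger than a limit" is read as a strict comparison.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: set keep[i] to 1 for every pack to leave alone and
 * return how many packs are kept.
 */
static size_t select_kept_packs(const uint64_t *sizes, size_t nr,
                                uint64_t threshold, size_t auto_pack_limit,
                                int *keep)
{
    size_t i, kept = 0, largest = 0;

    for (i = 0; i < nr; i++) {
        keep[i] = sizes[i] > threshold; /* every big pack is kept */
        kept += (size_t)keep[i];
        if (sizes[i] > sizes[largest])
            largest = i;
    }

    /*
     * Corner case: keeping that many packs would leave us at or above
     * the pack-count limit, so keep only the base (largest) pack.
     */
    if (nr && auto_pack_limit && kept >= auto_pack_limit) {
        for (i = 0; i < nr; i++)
            keep[i] = 0;
        keep[largest] = 1;
        kept = 1;
    }
    return kept;
}
```

After the fallback, repacking everything else into one new pack leaves two packs in total, as described above.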

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc: add gc.bigPackThreshold config Nguyễn Thái Ngọc Duy Sun, 15 Apr 2018 15:36:15 +0000 (17:36 +0200)

gc: add gc.bigPackThreshold config

The --keep-largest-pack option is not very convenient to use because
you need to tell gc to do this explicitly (and probably on just a few
large repos).

Add a config key that enables this mode when packs larger than a limit
are found. Note that there's a slight behavior difference compared to
--keep-largest-pack: all packs larger than the threshold are kept, not
just the largest one.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc: add --keep-largest-pack option Nguyễn Thái Ngọc Duy Sun, 15 Apr 2018 15:36:14 +0000 (17:36 +0200)

gc: add --keep-largest-pack option

This adds a new repack mode that combines everything into a secondary
pack, leaving the largest pack alone.

This could help reduce memory pressure. On linux-2.6.git, valgrind
massif reports 1.6GB heap in "pack all" case, and 535MB in "pack
all except the base pack" case. We save roughly 1GB memory by
excluding the base pack.

This should also lower I/O because we don't have to rewrite a giant
pack every time (e.g. for linux-2.6.git that's a 1.4GB pack file).

PS. The use of string_list here seems overkill, but we'll need it in
the next patch...

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repack: add --keep-pack option Nguyễn Thái Ngọc Duy Sun, 15 Apr 2018 15:36:13 +0000 (17:36 +0200)

repack: add --keep-pack option

We allow keeping existing packs by having companion .keep files. This
is helpful when a pack is permanently kept. In the next patch, git-gc
just wants to keep a pack temporarily, for one pack-objects
run. git-gc can use --keep-pack for this use case.

A note about why the pack_keep field cannot be reused and
pack_keep_in_core has to be added. This is about the case when
--keep-pack is specified together with either --keep-unreachable or
--unpack-unreachable, but --honor-pack-keep is NOT specified.

In this case, we want to exclude objects from the packs specified on
command line, not from ones with .keep files. If only one bit flag is
used, we have to clear pack_keep on pack files with the .keep file.

But we can't make any assumption about unreachable objects in .keep
packs. If "pack_keep" field is false for .keep packs, we could
potentially pull lots of unreachable objects into the new pack, or
unpack them loose. The safer approach is to ignore all packs with either
a .keep file or --keep-pack.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t7700: have closing quote of a test at the beginning... Nguyễn Thái Ngọc Duy Sun, 15 Apr 2018 15:36:12 +0000 (17:36 +0200)

t7700: have closing quote of a test at the beginning of line

The closing quote of a test body by convention is always at the start
of line.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ci: exercise the whole test suite with uncommon code... Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:13 +0000 (17:35 +0200)

ci: exercise the whole test suite with uncommon code in pack-objects

Some recent optimizations have been added to pack-objects to reduce
memory usage and some code paths are split into two: one for common
use cases and one for rare ones. Make sure the rare cases are tested
with Travis since it requires manual test configuration that is
unlikely to be done by developers.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: reorder members to shrink struct object_entry Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:12 +0000 (17:35 +0200)

pack-objects: reorder members to shrink struct object_entry

Previous patches leave lots of holes and padding in this struct. This
patch reorders the members and shrinks the struct down to 80 bytes
(from 136 bytes on 64-bit systems, before any field shrinking is done)
with 16 bits to spare (and a couple more in in_pack_header_size when
we really run out of bits).
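
Why reordering alone saves space can be seen with a toy struct. The two layouts below are invented for illustration and have nothing to do with the actual fields of struct object_entry; exact sizes depend on the platform ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Small members interleaved with large ones: the compiler inserts padding. */
struct entry_padded {
    uint8_t  type;
    uint64_t offset;
    uint8_t  flag;
    uint32_t hash;
};

/* Same members, sorted from largest to smallest alignment. */
struct entry_reordered {
    uint64_t offset;
    uint32_t hash;
    uint8_t  type;
    uint8_t  flag;
};

/* Bytes saved per entry by reordering (platform-dependent). */
static size_t bytes_saved_per_entry(void)
{
    return sizeof(struct entry_padded) - sizeof(struct entry_reordered);
}
```

On a typical 64-bit ABI the first layout occupies 24 bytes and the second 16, even though both hold the same 14 bytes of data.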

This is the last in a series of memory reduction patches (see
"pack-objects: a bit of document about struct object_entry" for the
first one).

Overall they've reduced repack memory size on linux-2.6.git from
3.747G to 3.424G, or by around 320M, a decrease of 8.5%. The runtime
of repack has stayed the same throughout this series. Ævar's testing
on a big monorepo he has access to (bigger than linux-2.6.git) has
shown a 7.9% reduction, so the overall expected improvement should be
somewhere around 8%.

See 87po42cwql.fsf@evledraar.gmail.com on-list
(https://public-inbox.org/git/87po42cwql.fsf@evledraar.gmail.com/) for
more detailed numbers and a test script used to produce the numbers
cited above.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: shrink delta_size field in struct object_... Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:11 +0000 (17:35 +0200)

pack-objects: shrink delta_size field in struct object_entry

Allowing a delta size of 64 bits is crazy. Shrink this field down to
20 bits with one overflow bit.

If we find an existing delta larger than 1MB, we do not cache
delta_size at all and will get the value from oe_size(), potentially
from disk if it's larger than 4GB.

Note, since DELTA_SIZE() is used in try_delta() code, it must be
thread-safe. Luckily oe_size() does guarantee this, so it is
thread-safe.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: shrink size field in struct object_entry Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:10 +0000 (17:35 +0200)

pack-objects: shrink size field in struct object_entry

It's very very rare that an uncompressed object is larger than 4GB
(partly because Git does not handle those large files very well to
begin with). Let's optimize it for the common case where object size
is smaller than this limit.

Shrink size field down to 31 bits and one overflow bit. If the size is
too large, we read it back from disk. As noted in the previous patch,
we need to return the delta size instead of canonical size when the
to-be-reused object entry type is a delta instead of a canonical one.

Add two compare helpers that can take advantage of the overflow
bit (e.g. if the file is 4GB+, chances are it's already larger than
core.bigFileThreshold and there's no point in comparing the actual
value).

Another note about oe_get_size_slow(). This function MUST be thread
safe because SIZE() macro is used inside try_delta() which may run in
parallel. Outside parallel code, no-contention locking should be dirt
cheap (or insignificant compared to i/o access anyway). To exercise
this code, it's best to run the test suite with something like

    make test GIT_TEST_OE_SIZE=4

which forces this code on all objects larger than 3 bytes.
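
The "small cached field plus validity bit" scheme, including a compare helper that avoids the slow path, might be sketched like this. Everything here is hypothetical: struct entry, slow_size() and the true_size field (which stands in for re-reading the size from disk) exist only for this example, and no locking is shown.

```c
#include <stdint.h>

#define SIZE_BITS 31
#define SIZE_LIMIT ((uint64_t)1 << SIZE_BITS)

struct entry {
    unsigned size_ : SIZE_BITS; /* cached size, meaningful if size_valid */
    unsigned size_valid : 1;    /* 0 means the size did not fit */
    uint64_t true_size;         /* stand-in for data that lives on disk */
};

static unsigned slow_calls; /* counts simulated disk reads */

/* Simulated slow path: in real code this would re-read the object header. */
static uint64_t slow_size(const struct entry *e)
{
    slow_calls++;
    return e->true_size;
}

static void entry_set_size(struct entry *e, uint64_t size)
{
    e->true_size = size;
    e->size_valid = size < SIZE_LIMIT;
    e->size_ = e->size_valid ? (unsigned)size : 0;
}

static uint64_t entry_size(const struct entry *e)
{
    return e->size_valid ? e->size_ : slow_size(e);
}

/* Compare helper: an overflowed size is known to be >= SIZE_LIMIT. */
static int entry_size_greater_than(const struct entry *e, uint64_t limit)
{
    if (e->size_valid)
        return e->size_ > limit;
    if (limit < SIZE_LIMIT)
        return 1; /* answered without touching the slow path */
    return slow_size(e) > limit;
}
```

The helper only falls back to slow_size() when the limit itself is at least as large as what the cached field can hold.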

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: clarify the use of object_entry::size Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:09 +0000 (17:35 +0200)

pack-objects: clarify the use of object_entry::size

While this field most of the time contains the canonical object size,
there is one case it does not: when we have found that the base object
of the delta in question is also to be packed, we will very happily
reuse the delta by copying it over instead of regenerating the new
delta.

"size" in this case will record the delta size, not canonical object
size. Later on in write_reuse_object(), we reconstruct the delta
header and "size" is used for this purpose. When this happens, the
"type" field contains a delta type instead of a canonical type.
Highlight this in the code since it could be tricky to see.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: don't check size when the object is bad Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:08 +0000 (17:35 +0200)

pack-objects: don't check size when the object is bad

sha1_object_info() in check_objects() may fail to locate an object in
the pack and return type OBJ_BAD. In that case, it will likely leave
the "size" field untouched. We delay error handling until later in
prepare_pack() though. Until then, do not touch "size" field.

This field should contain the default value zero, but we can't say
sha1_object_info() cannot damage it. This becomes more important later
when the object size may have to be retrieved back from the
(non-existing) pack.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: shrink z_delta_size field in struct objec... Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:07 +0000 (17:35 +0200)

pack-objects: shrink z_delta_size field in struct object_entry

We only cache a delta when it's smaller than a certain limit. This limit
defaults to 1000, but we save its compressed length in a 64-bit field.
Shrink that field down to 20 bits, so you can only cache 1MB deltas.
Larger deltas must be recomputed when the pack is written down.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: refer to delta objects by index instead... Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:06 +0000 (17:35 +0200)

pack-objects: refer to delta objects by index instead of pointer

These delta pointers always point to elements in the objects[] array
in packing_data struct. We can only hold maximum 4G of those objects
because the array size in nr_objects is uint32_t. We could use
uint32_t indexes to address these elements instead of pointers. On
64-bit architecture (8 bytes per pointer) this would save 4 bytes per
pointer.

Convert these delta pointers to indexes. Since we need to handle NULL
pointers as well, the index is shifted by one [1].

[1] This means we can only index 2^32-2 objects even though nr_objects
could contain 2^32-1 objects. It should not be a problem in
practice because when we grow objects[], nr_alloc would probably
blow up long before nr_objects hits the wall.
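
A minimal sketch of the shifted-index encoding follows. The struct and function names (struct obj, struct pack, get_delta(), set_delta()) are simplified stand-ins invented for this example, not the definitions from pack-objects.h.

```c
#include <stddef.h>
#include <stdint.h>

struct obj {
    uint32_t delta_idx; /* 0 means "no delta base", otherwise index + 1 */
    int payload;
};

struct pack {
    struct obj *objects;
    uint32_t nr_objects;
};

/* Translate the stored index back into a pointer (or NULL). */
static struct obj *get_delta(const struct pack *p, const struct obj *e)
{
    return e->delta_idx ? &p->objects[e->delta_idx - 1] : NULL;
}

/* Store a pointer into objects[] as a 1-based index; NULL becomes 0. */
static void set_delta(const struct pack *p, struct obj *e, struct obj *base)
{
    e->delta_idx = base ? (uint32_t)(base - p->objects) + 1 : 0;
}
```

Because 0 is reserved for NULL, the largest index value addresses element 2^32-2, which is the limitation noted in [1].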

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: move in_pack out of struct object_entry Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:05 +0000 (17:35 +0200)

pack-objects: move in_pack out of struct object_entry

Instead of using 8 bytes (on 64-bit arch) to store a pointer to a
pack, use an index, since the number of packs should be
relatively small.

This limits the number of packs we can handle to 1k. Since we can't be
sure people can never run into the situation where they have more than
1k pack files, provide a fallback route for it.

If we find out they have too many packs, the new in_pack_by_idx[]
array (which has at most 1k elements) will not be used. Instead we
allocate in_pack[] array that holds nr_objects elements. This is
similar to how the optional in_pack_pos field is handled.

The new simple test is just to make sure the too-many-packs code path
is at least executed. The true test is running

    make test GIT_TEST_FULL_IN_PACK_ARRAY=1

to take advantage of other special case tests.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: move in_pack_pos out of struct object_entry Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:04 +0000 (17:35 +0200)

pack-objects: move in_pack_pos out of struct object_entry

This field is only needed for pack-bitmap, which is an optional
feature. Move it to a separate array that is only allocated when
pack-bitmap is used (like objects[], it is not freed, since we need it
until the end of the process).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: use bitfield for object_entry::depth Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:03 +0000 (17:35 +0200)

pack-objects: use bitfield for object_entry::depth

Because of struct packing, from now on we can only handle max depth
4095 (or even lower when new booleans are added in this struct). This
should be ok since long delta chains will cause significant slowdown
anyway.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: use bitfield for object_entry::dfs_state Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:02 +0000 (17:35 +0200)

pack-objects: use bitfield for object_entry::dfs_state

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: turn type and in_pack_type to bitfields Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:01 +0000 (17:35 +0200)

pack-objects: turn type and in_pack_type to bitfields

An extra field type_valid is added to carry the equivalent of OBJ_BAD
in the original "type" field. in_pack_type always contains a valid
type so we only need 3 bits for it.

A note about accepting OBJ_NONE as "valid" type. The function
read_object_list_from_stdin() can pass this value [1] and it
eventually calls create_object_entry() where current code skips setting
"type" field if the incoming type is zero. This does not have any bad
side effects because "type" field should be memset()'d anyway.

But since we also need to set type_valid now, skipping oe_set_type()
leaves type_valid zero/false, which will make oe_type() return
OBJ_BAD, not OBJ_NONE anymore. Apparently we do care about OBJ_NONE in
prepare_pack(). This switch from OBJ_NONE to OBJ_BAD may trigger

    fatal: unable to get type of object ...

Accepting OBJ_NONE [2] does sound wrong, but this is how it has
been for a very long time and I haven't had time to dig in further.

[1] See 5c49c11686 (pack-objects: better check_object() performances -
2007-04-16)

[2] 21666f1aae (convert object type handling from a string to a number
- 2007-02-26)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-objects: a bit of document about struct object_entry Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:35:00 +0000 (17:35 +0200)

pack-objects: a bit of document about struct object_entry

The role of this comment block becomes more important after we shuffle
fields around to shrink this struct. It will be much harder to see what
field is related to what.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

read-cache.c: make $GIT_TEST_SPLIT_INDEX boolean Nguyễn Thái Ngọc Duy Sat, 14 Apr 2018 15:34:59 +0000 (17:34 +0200)

read-cache.c: make $GIT_TEST_SPLIT_INDEX boolean

While at it, document this special mode for running the test
suite.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

convert: add round trip check based on 'core.checkRound... Lars Schneider Sun, 15 Apr 2018 18:16:10 +0000 (20:16 +0200)

convert: add round trip check based on 'core.checkRoundtripEncoding'

UTF supports lossless conversion round tripping and conversions between
UTF and other encodings are mostly round trip safe as Unicode aims to be
a superset of all other character encodings. However, certain encodings
(e.g. SHIFT-JIS) are known to have round trip issues [1].

Add 'core.checkRoundtripEncoding', which contains a comma separated
list of encodings, to define for what encodings Git should check the
conversion round trip if they are used in the 'working-tree-encoding'
attribute.
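
Matching an encoding name against such a comma separated list could be done as in the sketch below. The function names are hypothetical, and the case-insensitive comparison and the skipping of spaces around commas are assumptions made for this example rather than documented behavior.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of a length-delimited token with a name. */
static int same_name(const char *tok, size_t tok_len, const char *name)
{
    size_t i;

    if (strlen(name) != tok_len)
        return 0;
    for (i = 0; i < tok_len; i++)
        if (tolower((unsigned char)tok[i]) != tolower((unsigned char)name[i]))
            return 0;
    return 1;
}

/* Hypothetical helper: is `enc` one of the entries in `list`? */
static int encoding_in_list(const char *list, const char *enc)
{
    const char *p = list;

    while (*p) {
        const char *end;

        while (*p == ',' || *p == ' ')
            p++; /* skip separators */
        end = p;
        while (*end && *end != ',' && *end != ' ')
            end++; /* find the end of this entry */
        if (end != p && same_name(p, (size_t)(end - p), enc))
            return 1;
        p = end;
    }
    return 0;
}
```

Whole entries are compared, so "SHIFT" does not match a list that only contains "SHIFT-JIS".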

Set SHIFT-JIS as default value for 'core.checkRoundtripEncoding'.

[1] https://support.microsoft.com/en-us/help/170559/prb-conversion-problem-between-shift-jis-and-unicode

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

convert: add tracing for 'working-tree-encoding' attribute Lars Schneider Sun, 15 Apr 2018 18:16:09 +0000 (20:16 +0200)

convert: add tracing for 'working-tree-encoding' attribute

Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable
tracing for content that is reencoded with the 'working-tree-encoding'
attribute. This is useful to debug encoding issues.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

convert: check for detectable errors in UTF encodings Lars Schneider Sun, 15 Apr 2018 18:16:08 +0000 (20:16 +0200)

convert: check for detectable errors in UTF encodings

Check that new content is valid with respect to the user defined
'working-tree-encoding' attribute.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

convert: add 'working-tree-encoding' attribute Lars Schneider Sun, 15 Apr 2018 18:16:07 +0000 (20:16 +0200)

convert: add 'working-tree-encoding' attribute

Git recognizes files encoded with ASCII or one of its supersets (e.g.
UTF-8 or ISO-8859-1) as text files. All other encodings are usually
interpreted as binary and consequently built-in Git text processing
tools (e.g. 'git diff') as well as most Git web front ends do not
visualize the content.

Add an attribute to tell Git what encoding the user has defined for a
given file. If the content is added to the index, then Git reencodes
the content to a canonical UTF-8 representation. On checkout Git will
reverse this operation.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
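
The conversion described in the commit above (reencode to UTF-8 when content
is added to the index, reverse on checkout) amounts to something like the
following. A minimal Python sketch with hypothetical function names; Git
performs the conversion in C and the details differ.

```python
def to_index(worktree_bytes, working_tree_encoding):
    """Reencode working tree content to UTF-8, as when adding to the index."""
    return worktree_bytes.decode(working_tree_encoding).encode("utf-8")


def to_worktree(index_bytes, working_tree_encoding):
    """Reverse the operation on checkout: UTF-8 back to the declared encoding."""
    return index_bytes.decode("utf-8").encode(working_tree_encoding)
```

With a file declared as UTF-16LE, to_index() yields bytes that text tools
such as 'git diff' can display, and to_worktree() restores the original
bytes.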

utf8: add function to detect a missing UTF-16/32 BOM Lars Schneider Sun, 15 Apr 2018 18:16:06 +0000 (20:16 +0200)

utf8: add function to detect a missing UTF-16/32 BOM

If the endianness is not defined in the encoding name, then let's
be strict and require a BOM to avoid any encoding confusion. The
is_missing_required_utf_bom() function returns true if a required BOM
is missing.

The Unicode standard instructs to assume big-endian if there is no BOM
for UTF-16/32 [1][2]. However, the W3C/WHATWG encoding standard used
in HTML5 recommends to assume little-endian to "deal with deployed
content" [3]. Strictly requiring a BOM seems to be the safest option
for content in Git.

This function is used in a subsequent commit.

[1] http://unicode.org/faq/utf_bom.html#gen6
[2] http://www.unicode.org/versions/Unicode10.0.0/ch03.pdf
Section 3.10, D98, page 132
[3] https://encoding.spec.whatwg.org/#utf-16le

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
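
The strict rule described in the commit above (no endianness in the encoding
name means a BOM is required) could look roughly like this. The Python below
is an illustrative sketch, not the C function is_missing_required_utf_bom();
the name normalization is an assumption.

```python
import codecs

_UTF16_BOMS = (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)
_UTF32_BOMS = (codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE)


def is_missing_required_bom(encoding, data):
    """True if `encoding` leaves the byte order open and `data` has no BOM."""
    name = encoding.upper().replace("-", "")  # treat "utf16" and "UTF-16" alike
    if name == "UTF16":
        return not data.startswith(_UTF16_BOMS)
    if name == "UTF32":
        return not data.startswith(_UTF32_BOMS)
    # Names such as UTF-16LE state the byte order, so no BOM is required.
    return False
```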

utf8: add function to detect prohibited UTF-16/32 BOM Lars Schneider Sun, 15 Apr 2018 18:16:05 +0000 (20:16 +0200)

utf8: add function to detect prohibited UTF-16/32 BOM

Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE
or UTF-32LE a BOM must not be used [1]. The function returns true if
this is the case.

This function is used in a subsequent commit.

[1] http://unicode.org/faq/utf_bom.html#bom10

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
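
As an illustration of the rule in the commit above, the following Python
sketch reports a BOM on data whose encoding name already fixes the byte
order. The function name and the table are hypothetical; flagging a BOM of
either byte order is an assumption of this sketch.

```python
import codecs

# Encoding names that state the byte order, mapped to the BOMs that must
# not appear at the start of such data.
_PROHIBITED_BOMS = {
    "UTF16BE": (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE),
    "UTF16LE": (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE),
    "UTF32BE": (codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE),
    "UTF32LE": (codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE),
}


def has_prohibited_bom(encoding, data):
    """True if `data` starts with a BOM although `encoding` forbids one."""
    boms = _PROHIBITED_BOMS.get(encoding.upper().replace("-", ""))
    return boms is not None and data.startswith(boms)
```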

utf8: teach same_encoding() alternative UTF encoding... Lars Schneider Sun, 15 Apr 2018 18:16:04 +0000 (20:16 +0200)

utf8: teach same_encoding() alternative UTF encoding names

The function same_encoding() could only recognize alternative names for
UTF-8 encodings. Teach it to recognize all kinds of alternative UTF
encoding names (e.g. utf16).

While we are at it, fix a crash that would occur if same_encoding() was
called with a NULL argument and a non-NULL argument.

This function is used in a subsequent commit.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
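
The behaviour described in the commit above might be sketched as follows.
This is a Python illustration, not the C code; treating a missing name as
UTF-8 is an assumption of the sketch, and the helper name is hypothetical.

```python
def _utf_suffix(name):
    """Return the part after "utf" or "utf-" (lowercased), or None."""
    lowered = name.lower()
    if not lowered.startswith("utf"):
        return None
    rest = lowered[3:]
    return rest[1:] if rest.startswith("-") else rest


def same_encoding(src, dst):
    """Compare two encoding names, accepting spellings like "utf16"."""
    # Assumption: a missing name means UTF-8, so None never reaches the
    # string comparison below (the analogue of the NULL crash).
    src = "UTF-8" if src is None else src
    dst = "UTF-8" if dst is None else dst
    src_suffix, dst_suffix = _utf_suffix(src), _utf_suffix(dst)
    if src_suffix is not None and dst_suffix is not None:
        return src_suffix == dst_suffix
    return src.lower() == dst.lower()
```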

Merge branch 'master' of https://github.com/ralfth... Jiang Xin Sun, 15 Apr 2018 14:25:48 +0000 (22:25 +0800)

Merge branch 'master' of https://github.com/ralfth/git-po-de into maint

* 'master' of https://github.com/ralfth/git-po-de:
l10n: de.po: fix typos

l10n: TEAMS: remove inactive de team members Ralf Thielow Tue, 3 Apr 2018 17:30:14 +0000 (19:30 +0200)

l10n: TEAMS: remove inactive de team members

Thanks for your contributions!

Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>

mem-pool: move reusable parts of memory pool into its... Jameson Miller Wed, 11 Apr 2018 18:37:55 +0000 (18:37 +0000)

mem-pool: move reusable parts of memory pool into its own file

This moves the reusable parts of the memory pool logic used by
fast-import.c into its own file for use by other components.

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

replace-object: allow lookup_replace_object to handle... Stefan Beller Thu, 12 Apr 2018 00:21:18 +0000 (17:21 -0700)

replace-object: allow lookup_replace_object to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

replace-object: allow do_lookup_replace_object to handl... Stefan Beller Thu, 12 Apr 2018 00:21:17 +0000 (17:21 -0700)

replace-object: allow do_lookup_replace_object to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

replace-object: allow prepare_replace_object to handle... Stefan Beller Thu, 12 Apr 2018 00:21:16 +0000 (17:21 -0700)

replace-object: allow prepare_replace_object to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: allow for_each_replace_ref to handle arbitrary... Stefan Beller Thu, 12 Apr 2018 00:21:15 +0000 (17:21 -0700)

refs: allow for_each_replace_ref to handle arbitrary repositories

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: store the main ref store inside the repository... Stefan Beller Thu, 12 Apr 2018 00:21:14 +0000 (17:21 -0700)

refs: store the main ref store inside the repository struct

This moves the 'main_ref_store', which was a global variable in refs.c
into the repository struct.

This patch does not deal with the parts in the refs subsystem which deal
with the submodules there. A later patch needs to get rid of the submodule
exposure in the refs API, such as 'get_submodule_ref_store(path)'.

Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

replace-object: add repository argument to lookup_repla... Stefan Beller Thu, 12 Apr 2018 00:21:13 +0000 (17:21 -0700)

replace-object: add repository argument to lookup_replace_object

Add a repository argument to allow callers of lookup_replace_object
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

replace-object: add repository argument to do_lookup_re... Stefan Beller Thu, 12 Apr 2018 00:21:12 +0000 (17:21 -0700)

replace-object: add repository argument to do_lookup_replace_object

Add a repository argument to allow the do_lookup_replace_object caller
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

replace-object: add repository argument to prepare_repl... Stefan Beller Thu, 12 Apr 2018 00:21:11 +0000 (17:21 -0700)

replace-object: add repository argument to prepare_replace_object

Add a repository argument to allow the prepare_replace_object caller
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: add repository argument to for_each_replace_ref Stefan Beller Thu, 12 Apr 2018 00:21:10 +0000 (17:21 -0700)

refs: add repository argument to for_each_replace_ref

Add a repository argument to allow for_each_replace_ref callers to be
more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: add repository argument to get_main_ref_store Stefan Beller Thu, 12 Apr 2018 00:21:09 +0000 (17:21 -0700)

refs: add repository argument to get_main_ref_store

Add a repository argument to allow the get_main_ref_store caller
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

replace-object: check_replace_refs is safe in multi... Stefan Beller Thu, 12 Apr 2018 00:21:08 +0000 (17:21 -0700)

replace-object: check_replace_refs is safe in multi repo environment

In e1111cef23 (inline lookup_replace_object() calls, 2011-05-15) a shortcut
for checking the object replacement was added by setting check_replace_refs
to 0 once the replacements were evaluated to not exist. This works fine
under the assumption of only one repository in existence.

The assumption won't hold true any more when we work on multiple instances
of the repository struct (e.g. one struct per submodule), as the first
repository to be inspected may have no replacements and would set the
global variable. Other repositories would then completely omit their
evaluation of replacements.

This reverts the meaning of the flag `check_replace_refs` from
"Do we need to check with the lookup table?" back to "Do we need to read
the replacement definition?", adding the bypassing logic to
lookup_replace_object after the replacement definition was read.
As with the original patch, delay the renaming of the global variable.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
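
The multi-repository problem described in the commit above can be modelled
in a few lines. The Python below is a simplified sketch with hypothetical
data structures: object names are plain strings, the replace refs are a
dict, and replacement chains are not followed.

```python
# Global flag with the restored meaning: "do we need to read the
# replacement definitions at all?"
check_replace_refs = True


class Repository:
    def __init__(self, replace_refs=None):
        self.replace_refs = dict(replace_refs or {})  # stand-in for refs/replace/*
        self.replace_map = None  # None means "definitions not read yet"


def prepare_replace_object(repo):
    """Read the replacement definitions once per repository."""
    if repo.replace_map is None:
        repo.replace_map = dict(repo.replace_refs)


def lookup_replace_object(repo, oid):
    if not check_replace_refs:
        return oid
    prepare_replace_object(repo)
    if not repo.replace_map:
        # Per-repository bypass: the global flag is left untouched, so a
        # repository without replacements cannot disable lookups elsewhere.
        return oid
    return repo.replace_map.get(oid, oid)
```

Looking up an object in a repository without replacements first no longer
affects a later lookup in a submodule that does define some.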

replace-object: eliminate replace objects prepared... Stefan Beller Thu, 12 Apr 2018 00:21:07 +0000 (17:21 -0700)

replace-object: eliminate replace objects prepared flag

Make the oidmap a pointer.

That way we eliminate the need for the global boolean
variable 'replace_object_prepared' as we can put this information
into the pointer being NULL or not.

Another advantage of this is that we would more quickly catch
code that tries to access replace-map without initializing it.

This also allows the '#include "oidmap.h"' introduced in a previous
patch to be replaced by the forward declaration of 'struct oidmap;'.
Keeping the type opaque discourages circumventing accessor functions;
not dragging in other headers avoids some compile time overhead.

One disadvantage of this change is performance as we need to
pay the overhead for a malloc. The alternative of moving the
global variable into the object store is less modular code.

Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-store: move lookup_replace_object to replace... Stefan Beller Thu, 12 Apr 2018 00:21:06 +0000 (17:21 -0700)

object-store: move lookup_replace_object to replace-object.h

lookup_replace_object is a low-level function that most users of the
object store do not need to use directly.

Move it to replace-object.h to avoid a dependency loop in an upcoming
change to its inline definition that will make use of repository.h.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

replace-object: move replace_map to object store Stefan Beller Thu, 12 Apr 2018 00:21:05 +0000 (17:21 -0700)

replace-object: move replace_map to object store

The relationship between an object X and another object Y that
replaces the object X is defined only within the scope of a
single repository.

The exception in reachability rule around these replacement objects
is also local to a repository (i.e. if traversal from refs reaches
X, then both X and Y are reachable and need to be kept from gc).

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

replace_object: use oidmap René Scharfe Thu, 12 Apr 2018 00:21:04 +0000 (17:21 -0700)

replace_object: use oidmap

Load the replace objects into an oidmap to allow for easy lookups in
constant time.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

SubmittingPatches: mention the git contacts command Thomas Gummerer Wed, 11 Apr 2018 20:20:00 +0000 (21:20 +0100)

SubmittingPatches: mention the git contacts command

Instead of just mentioning 'git blame' and 'git shortlog', which make it
quite hard for new contributors to pick out the appropriate list of
people to cc on their patch series, mention the 'git contacts' utility,
which makes it much easier to get a reasonable list of contacts for a
change.

This should help new contributors pick out a reasonable cc list by
simply using a single command.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fast-import: introduce mem_pool type Jameson Miller Wed, 11 Apr 2018 18:37:54 +0000 (18:37 +0000)

fast-import: introduce mem_pool type

Introduce the mem_pool type which encapsulates all the information necessary to
manage a pool of memory. This change moves the existing variables in
fast-import used to support the global memory pool to use this structure. It
also renames variables that are no longer used by memory pools to reflect their
more scoped usage.

These changes allow multiple instances of a memory pool to
exist and be reused outside of fast-import. In a future commit the
mem_pool type will be moved to its own file.

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
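
To illustrate the idea of a pool object that owns a list of memory blocks,
here is a rough Python model. It is not the fast-import code: the class and
field names only loosely follow the commit message, the 8-byte rounding and
the block sizing policy are assumptions, and a bytearray stands in for raw
memory.

```python
class MpBlock:
    """One contiguous block of memory handed out front to back."""

    def __init__(self, size):
        self.space = bytearray(size)
        self.next_free = 0

    def remaining(self):
        return len(self.space) - self.next_free


class MemPool:
    """All state needed to manage one pool, so several pools can coexist."""

    def __init__(self, block_alloc=4096):
        self.block_alloc = block_alloc  # default size of a new block
        self.blocks = []
        self.pool_alloc = 0             # total bytes reserved by this pool

    def alloc(self, length):
        length = (length + 7) & ~7      # round up to a multiple of 8
        block = next((b for b in self.blocks if b.remaining() >= length), None)
        if block is None:
            block = MpBlock(max(length, self.block_alloc))
            self.blocks.append(block)
            self.pool_alloc += len(block.space)
        start = block.next_free
        block.next_free += length
        return memoryview(block.space)[start:start + length]
```

Because the bookkeeping lives in the MemPool instance rather than in global
variables, two pools never share blocks.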

fast-import: rename mem_pool type to mp_block Jameson Miller Wed, 11 Apr 2018 18:37:53 +0000 (18:37 +0000)

fast-import: rename mem_pool type to mp_block

This is part of a patch series to extract the memory pool logic in
fast-import into a more generalized version. The existing mem_pool type
maps more closely to a "block of memory" (mp_block) in the more
generalized memory pool. This commit renames the mem_pool to mp_block to
reduce churn in future patches.

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'svn/authors-prog-2' of git://bogomips... Junio C Hamano Wed, 11 Apr 2018 23:05:28 +0000 (08:05 +0900)

Merge branch 'svn/authors-prog-2' of git://bogomips.org/git-svn

* 'svn/authors-prog-2' of git://bogomips.org/git-svn:
git-svn: allow empty email-address using authors-prog and authors-file
git-svn: search --authors-prog in PATH too

l10n: de.po: fix typos Andre Hinrichs Tue, 3 Apr 2018 05:12:12 +0000 (07:12 +0200)

l10n: de.po: fix typos

Signed-off-by: Andre Hinrichs <andre.hinrichs@gmx.de>

replace_object.c: rename to use dash in file name Stefan Beller Tue, 10 Apr 2018 21:26:21 +0000 (14:26 -0700)

replace_object.c: rename to use dash in file name

This is more consistent with the project style. The majority of
Git's source files use dashes in preference to underscores in their file
names.

Noticed while adding a header corresponding to this file.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>

sha1_file.c: rename to use dash in file name Stefan Beller Tue, 10 Apr 2018 21:26:20 +0000 (14:26 -0700)

sha1_file.c: rename to use dash in file name

This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.

Signed-off-by: Stefan Beller <sbeller@google.com>

sha1_name.c: rename to use dash in file name Stefan Beller Tue, 10 Apr 2018 21:26:19 +0000 (14:26 -0700)

sha1_name.c: rename to use dash in file name

This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.

Signed-off-by: Stefan Beller <sbeller@google.com>

exec_cmd: rename to use dash in file name Stefan Beller Tue, 10 Apr 2018 21:26:18 +0000 (14:26 -0700)

exec_cmd: rename to use dash in file name

This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.

Signed-off-by: Stefan Beller <sbeller@google.com>

unicode_width.h: rename to use dash in file name Stefan Beller Tue, 10 Apr 2018 21:26:17 +0000 (14:26 -0700)

unicode_width.h: rename to use dash in file name

This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.

Also adjust contrib/update-unicode.

Signed-off-by: Stefan Beller <sbeller@google.com>

write_or_die.c: rename to use dashes in file name Stefan Beller Tue, 10 Apr 2018 21:26:16 +0000 (14:26 -0700)

write_or_die.c: rename to use dashes in file name

This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.

Signed-off-by: Stefan Beller <sbeller@google.com>

mingw/msvc: use the new-style RUNTIME_PREFIX helper Johannes Schindelin Tue, 10 Apr 2018 15:05:46 +0000 (11:05 -0400)

mingw/msvc: use the new-style RUNTIME_PREFIX helper

This change also allows us to stop overriding argv[0] with the absolute
path of the executable, allowing us to preserve e.g. the case of the
executable's file name.

This fixes https://github.com/git-for-windows/git/issues/1496 partially.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

exec_cmd: provide a new-style RUNTIME_PREFIX helper... Johannes Schindelin Tue, 10 Apr 2018 15:05:45 +0000 (11:05 -0400)

exec_cmd: provide a new-style RUNTIME_PREFIX helper for Windows

The RUNTIME_PREFIX feature comes from Git for Windows, but it was
enhanced to allow support for other platforms. While changing the
original idea, the concept was also improved by not forcing argv[0] to
be adjusted.

Let's allow the same for Windows by implementing a helper just as for
the other platforms.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

exec_cmd: RUNTIME_PREFIX on some POSIX systems Dan Jacques Tue, 10 Apr 2018 15:05:44 +0000 (11:05 -0400)

exec_cmd: RUNTIME_PREFIX on some POSIX systems

Enable Git to resolve its own binary location using a variety of
OS-specific and generic methods, including:

- procfs via "/proc/self/exe" (Linux)
- _NSGetExecutablePath (Darwin)
- KERN_PROC_PATHNAME sysctl on BSDs.
- argv0, if absolute (all, including Windows).

This is used to enable RUNTIME_PREFIX support for non-Windows systems,
notably Linux and Darwin. When configured with RUNTIME_PREFIX, Git will
do a best-effort resolution of its executable path and automatically use
this as its "exec_path" for relative helper and data lookups, unless
explicitly overridden.

Small incidental formatting cleanup of "exec_cmd.c".

Signed-off-by: Dan Jacques <dnj@google.com>
Thanks-to: Robbie Iannucci <iannucci@google.com>
Thanks-to: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
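
The best-effort resolution listed in the commit above could be modelled as
a chain of fallbacks. The Python sketch below covers only the procfs and
absolute-argv0 methods; the function name and the `proc_link` parameter are
hypothetical, and the Darwin and BSD methods are omitted.

```python
import os


def resolve_executable_path(argv0, proc_link="/proc/self/exe"):
    """Best-effort lookup of the running binary's absolute path, or None."""
    # Method 1: procfs (Linux). The link target is the executable itself.
    try:
        path = os.readlink(proc_link)
        if os.path.isabs(path):
            return path
    except OSError:
        pass  # no procfs here, or the link is unreadable; try the next method

    # Last resort: trust argv0, but only if it is already absolute.
    if os.path.isabs(argv0):
        return argv0
    return None
```

When no method yields a path, the caller would fall back to the paths
configured at build time.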

Makefile: add Perl runtime prefix support Dan Jacques Tue, 10 Apr 2018 15:05:43 +0000 (11:05 -0400)

Makefile: add Perl runtime prefix support

Broaden the RUNTIME_PREFIX flag to configure Git's Perl scripts to
locate the Git installation's Perl support libraries by resolving
against the script's path, rather than hard-coding that path at
build-time. Hard-coding at build time worked on previous
RUNTIME_PREFIX configurations (i.e., Windows) because the Perl
scripts were run within a virtual filesystem whose paths were
consistent regardless of the location of the actual installation.
This will no longer be the case for non-Windows RUNTIME_PREFIX users.

When enabled, RUNTIME_PREFIX now requires Perl's system paths to be
expressed relative to a common installation directory in the Makefile,
and uses that relationship to locate support files based on the known
starting point of the script being executed, much like RUNTIME_PREFIX
does for the Git binary.

This change enables Git's Perl scripts to work when their Git installation
is relocated or moved to another system, even when they are not in a
virtual filesystem environment.

Signed-off-by: Dan Jacques <dnj@google.com>
Thanks-to: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Thanks-to: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
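
The relative-path lookup described in the commit above can be pictured as
follows. This Python sketch only models the path arithmetic; the function
name and the directory values in the usage note are hypothetical and do not
come from the Makefile.

```python
import posixpath


def locate_perl_lib(script_path, gitexecdir_rel, perllibdir_rel):
    """Derive the Perl library directory from the running script's location.

    Both *_rel arguments are paths relative to a common installation
    directory, as the runtime prefix support requires.
    """
    exec_dir = posixpath.dirname(script_path)
    suffix = "/" + gitexecdir_rel.strip("/")
    if not exec_dir.endswith(suffix):
        raise ValueError("script is not inside the expected exec directory")
    prefix = exec_dir[: -len(suffix)] or "/"
    return posixpath.join(prefix, perllibdir_rel)
```

For a script at /opt/git/libexec/git-core/git-svn with relative directories
libexec/git-core and share/perl5, the result is /opt/git/share/perl5, and it
follows the installation if the whole tree is moved.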

Makefile: generate Perl header from template file Dan Jacques Tue, 10 Apr 2018 15:05:42 +0000 (11:05 -0400)

Makefile: generate Perl header from template file

Currently, the generated Perl script headers are emitted by commands in
the Makefile. This mechanism restricts options to introduce alternative
header content, needed by Perl runtime prefix support, and obscures the
origin of the Perl script header.

Change the Makefile to generate a header by processing a template file and
move the header content into the "perl/" subdirectory. The generated
header content will now be stored in the "GIT-PERL-HEADER" file. This
allows the content of the Perl header to be controlled by changing the path
of the template in the Makefile.

Signed-off-by: Dan Jacques <dnj@google.com>
Thanks-to: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Thanks-to: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fsmonitor: force index write after full scan Ben Peart Tue, 10 Apr 2018 18:14:31 +0000 (14:14 -0400)

fsmonitor: force index write after full scan

fsmonitor currently only flags the index as dirty if the extension is being
added or removed. This is a performance optimization that recognizes you can
stat() a lot of files in less time than it takes to write out an updated index.

This patch makes a small enhancement and flags the index dirty if we end up
having to stat() all files and scan the entire working directory. The assumption
is that such a scan must be expensive, or you would not have turned on the feature.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Revert "Merge branch 'en/rename-directory-detection'" Junio C Hamano Wed, 11 Apr 2018 09:07:11 +0000 (18:07 +0900)

Revert "Merge branch 'en/rename-directory-detection'"

This reverts commit e4bb62fa1eeee689744b413e29a50b4d1dae6886, reversing
changes made to 468165c1d8a442994a825f3684528361727cd8c0.

The topic appears to inflict severe regression in renaming merges,
even though the promise of it was that it would improve them.

We do not yet know which exact change in the topic was wrong, but in
the meantime, let's play it safe and revert it out of 'master'
before real Git-using projects are harmed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

fsmonitor: fix incorrect buffer size when printing... Ben Peart Tue, 10 Apr 2018 18:43:43 +0000 (18:43 +0000)

fsmonitor: fix incorrect buffer size when printing version number

This is a trivial bug fix for passing the incorrect size to snprintf() when
outputting the version. It should be passing the size of the destination buffer
rather than the size of the value being printed.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t/perf: add scripts to bisect performance regressions Christian Couder Sun, 8 Apr 2018 09:35:13 +0000 (11:35 +0200)

t/perf: add scripts to bisect performance regressions

The new bisect_regression script can be used to automatically bisect
performance regressions. It will pass the new bisect_run_script to
`git bisect run`.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

perf/run: add --subsection option Christian Couder Sun, 8 Apr 2018 09:35:12 +0000 (11:35 +0200)

perf/run: add --subsection option

This new option makes it possible to run perf tests as defined
in only one subsection of a config file.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The third batch for 2.18 Junio C Hamano Wed, 11 Apr 2018 04:13:49 +0000 (13:13 +0900)

The third batch for 2.18

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'eb/cred-helper-ignore-sigpipe' Junio C Hamano Wed, 11 Apr 2018 04:09:57 +0000 (13:09 +0900)

Merge branch 'eb/cred-helper-ignore-sigpipe'

When a credential helper exits very quickly without reading its
input, it used to cause Git to die with SIGPIPE, which has been
fixed.

* eb/cred-helper-ignore-sigpipe:
credential: ignore SIGPIPE when writing to credential helpers

Merge branch 'lv/tls-1.3' Junio C Hamano Wed, 11 Apr 2018 04:09:57 +0000 (13:09 +0900)

Merge branch 'lv/tls-1.3'

When built with more recent cURL, GIT_SSL_VERSION can now specify
"tlsv1.3" as its value.

* lv/tls-1.3:
http: allow use of TLS 1.3

Merge branch 'pk/test-avoid-pipe-hiding-exit-status' Junio C Hamano Wed, 11 Apr 2018 04:09:56 +0000 (13:09 +0900)

Merge branch 'pk/test-avoid-pipe-hiding-exit-status'

Test cleanup.

* pk/test-avoid-pipe-hiding-exit-status:
test: avoid pipes in git related commands for test

Merge branch 'rs/status-with-removed-submodule' Junio C Hamano Wed, 11 Apr 2018 04:09:56 +0000 (13:09 +0900)

Merge branch 'rs/status-with-removed-submodule'

"git submodule status" misbehaved on a submodule that has been
removed from the working tree.

* rs/status-with-removed-submodule:
submodule: check for NULL return of get_submodule_ref_store()

Merge branch 'nd/combined-test-helper' Junio C Hamano Wed, 11 Apr 2018 04:09:56 +0000 (13:09 +0900)

Merge branch 'nd/combined-test-helper'

Small test-helper programs have been consolidated into a single
binary.

* nd/combined-test-helper: (36 commits)
t/helper: merge test-write-cache into test-tool
t/helper: merge test-wildmatch into test-tool
t/helper: merge test-urlmatch-normalization into test-tool
t/helper: merge test-subprocess into test-tool
t/helper: merge test-submodule-config into test-tool
t/helper: merge test-string-list into test-tool
t/helper: merge test-strcmp-offset into test-tool
t/helper: merge test-sigchain into test-tool
t/helper: merge test-sha1-array into test-tool
t/helper: merge test-scrap-cache-tree into test-tool
t/helper: merge test-run-command into test-tool
t/helper: merge test-revision-walking into test-tool
t/helper: merge test-regex into test-tool
t/helper: merge test-ref-store into test-tool
t/helper: merge test-read-cache into test-tool
t/helper: merge test-prio-queue into test-tool
t/helper: merge test-path-utils into test-tool
t/helper: merge test-online-cpus into test-tool
t/helper: merge test-mktemp into test-tool
t/helper: merge (unused) test-mergesort into test-tool
...

Merge branch 'sb/packfiles-in-repository' Junio C Hamano Wed, 11 Apr 2018 04:09:55 +0000 (13:09 +0900)

Merge branch 'sb/packfiles-in-repository'

Refactoring of the internal global data structure continues.

* sb/packfiles-in-repository:
packfile: keep prepare_packed_git() private
packfile: allow find_pack_entry to handle arbitrary repositories
packfile: add repository argument to find_pack_entry
packfile: allow reprepare_packed_git to handle arbitrary repositories
packfile: allow prepare_packed_git to handle arbitrary repositories
packfile: allow prepare_packed_git_one to handle arbitrary repositories
packfile: add repository argument to reprepare_packed_git
packfile: add repository argument to prepare_packed_git
packfile: add repository argument to prepare_packed_git_one
packfile: allow install_packed_git to handle arbitrary repositories
packfile: allow rearrange_packed_git to handle arbitrary repositories
packfile: allow prepare_packed_git_mru to handle arbitrary repositories

Merge branch 'sb/object-store' Junio C Hamano Wed, 11 Apr 2018 04:09:55 +0000 (13:09 +0900)

Merge branch 'sb/object-store'

Refactoring the internal global data structure to make it possible
to open multiple repositories, work with and then close them.

Rerolled by Duy on top of a separate preliminary clean-up topic.
The resulting structure of the topics looked very sensible.

* sb/object-store: (27 commits)
sha1_file: allow sha1_loose_object_info to handle arbitrary repositories
sha1_file: allow map_sha1_file to handle arbitrary repositories
sha1_file: allow map_sha1_file_1 to handle arbitrary repositories
sha1_file: allow open_sha1_file to handle arbitrary repositories
sha1_file: allow stat_sha1_file to handle arbitrary repositories
sha1_file: allow sha1_file_name to handle arbitrary repositories
sha1_file: add repository argument to sha1_loose_object_info
sha1_file: add repository argument to map_sha1_file
sha1_file: add repository argument to map_sha1_file_1
sha1_file: add repository argument to open_sha1_file
sha1_file: add repository argument to stat_sha1_file
sha1_file: add repository argument to sha1_file_name
sha1_file: allow prepare_alt_odb to handle arbitrary repositories
sha1_file: allow link_alt_odb_entries to handle arbitrary repositories
sha1_file: add repository argument to prepare_alt_odb
sha1_file: add repository argument to link_alt_odb_entries
sha1_file: add repository argument to read_info_alternates
sha1_file: add repository argument to link_alt_odb_entry
sha1_file: add raw_object_store argument to alt_odb_usable
pack: move approximate object count to object store
...

Merge branch 'jc/test-must-be-empty' Junio C Hamano Wed, 11 Apr 2018 04:09:54 +0000 (13:09 +0900)

Merge branch 'jc/test-must-be-empty'

Test helper update.

* jc/test-must-be-empty:
test_must_be_empty: simplify file existence check

Merge branch 'cc/perf-aggregate-sort' Junio C Hamano Wed, 11 Apr 2018 04:09:54 +0000 (13:09 +0900)

Merge branch 'cc/perf-aggregate-sort'

Perf-test update.

* cc/perf-aggregate-sort:
perf/aggregate: add --sort-by=regression option
perf/aggregate: add display_dir()

Merge branch 'ab/doc-hash-brokenness' Junio C Hamano Wed, 11 Apr 2018 04:09:54 +0000 (13:09 +0900)

Merge branch 'ab/doc-hash-brokenness'

Doc updates.

* ab/doc-hash-brokenness:
doc hash-function-transition: clarify what SHAttered means
doc hash-function-transition: clarify how older gits die on NewHash

Merge branch 'bc/hash-independent-tests' Junio C Hamano Wed, 11 Apr 2018 04:09:54 +0000 (13:09 +0900)

Merge branch 'bc/hash-independent-tests'

Tests that rely on the exact hardcoded values of object names have
been updated in preparation for hash function migration.

* bc/hash-independent-tests:
t2107: abstract away SHA-1-specific constants
t2101: abstract away SHA-1-specific constants
t2101: modernize test style
t2020: abstract away SHA-1 specific constants
t1507: abstract away SHA-1-specific constants
t1411: abstract away SHA-1-specific constants
t1405: sort reflog entries in a hash-independent way
t1300: abstract away SHA-1-specific constants
t1304: abstract away SHA-1-specific constants
t1011: abstract away SHA-1-specific constants

Merge branch 'ab/drop-contrib-examples' Junio C Hamano Wed, 11 Apr 2018 04:09:54 +0000 (13:09 +0900)

Merge branch 'ab/drop-contrib-examples'

* ab/drop-contrib-examples:
Remove contrib/examples/*

commit-graph: lazy-load trees for commits Derrick Stolee Fri, 6 Apr 2018 19:09:46 +0000 (19:09 +0000)

commit-graph: lazy-load trees for commits

The commit-graph file provides quick access to commit data, including
the OID of the root tree for each commit in the graph. When performing
a deep commit-graph walk, we may not need to load most of the trees
for these commits.

Delay loading the tree object for a commit loaded from the graph
until requested via get_commit_tree(). Do not lazy-load trees for
commits not in the graph, since that requires duplicate parsing
and the relative performance improvement when trees are not needed
is small.
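
The delayed load described above can be illustrated with a minimal C
sketch. It is not Git's actual code: only the names get_commit_tree()
and maybe_tree come from these commit messages, while the struct
layouts, the integer tree_id standing in for an OID, the from_graph
flag, load_tree(), the two fill_commit_*() helpers and the
trees_loaded counter are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified types; not Git's real definitions. */
struct tree {
	int id;
};

struct commit {
	int tree_id;             /* stands in for the root tree OID */
	int from_graph;          /* nonzero if filled from the commit-graph */
	struct tree *maybe_tree; /* NULL until the tree has been loaded */
};

struct tree tree_pool[16];
int trees_loaded; /* counts tree loads, to make the laziness visible */

/* Hypothetical stand-in for looking up and loading a tree object. */
struct tree *load_tree(int id)
{
	assert(id >= 0 && id < 16);
	tree_pool[id].id = id;
	trees_loaded++;
	return &tree_pool[id];
}

/* Commit found in the graph: remember the tree id, load nothing yet. */
void fill_commit_from_graph(struct commit *c, int tree_id)
{
	c->tree_id = tree_id;
	c->from_graph = 1;
	c->maybe_tree = NULL;
}

/* Commit not in the graph: the tree is loaded eagerly while parsing. */
void fill_commit_from_object(struct commit *c, int tree_id)
{
	c->tree_id = tree_id;
	c->from_graph = 0;
	c->maybe_tree = load_tree(tree_id);
}

/* Accessor: load the tree on first request for graph commits only. */
struct tree *get_commit_tree(struct commit *c)
{
	assert(c != NULL);
	if (!c->maybe_tree && c->from_graph)
		c->maybe_tree = load_tree(c->tree_id);
	return c->maybe_tree;
}
```

In this sketch a walk that never calls get_commit_tree() on a graph
commit never pays for a tree load, and a second call on the same
commit reuses the pointer cached by the first.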

On the Linux repository, performance tests were run for the following
command:

git log --graph --oneline -1000

Before: 0.92s
After: 0.66s
Rel %: -28.3%

Adding '-- kernel/' to the command requires loading the root tree
for every commit that is walked. There was no measurable performance
change as a result of this patch.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

treewide: replace maybe_tree with accessor methods Derrick Stolee Fri, 6 Apr 2018 19:09:38 +0000 (19:09 +0000)

treewide: replace maybe_tree with accessor methods

In anticipation of making trees load lazily, create a Coccinelle
script (contrib/coccinelle/commit.cocci) to ensure that all
references to the 'maybe_tree' member of struct commit are either
mutations or accesses through get_commit_tree() or
get_commit_tree_oid().
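
As a rough illustration of what call sites look like after such a
conversion, here is a small C sketch. The accessor names
get_commit_tree() and get_commit_tree_oid() and the member name
maybe_tree come from this commit message; the struct layouts and the
function bodies are simplified assumptions, not Git's real
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical layouts for illustration only. */
struct object_id {
	unsigned char hash[20];
};

struct tree {
	struct object_id oid;
};

struct commit {
	struct tree *maybe_tree; /* callers should not read this directly */
};

/* Single place where the member is read. */
struct tree *get_commit_tree(const struct commit *c)
{
	assert(c != NULL);
	return c->maybe_tree;
}

/* Convenience accessor for callers that only need the tree's OID. */
const struct object_id *get_commit_tree_oid(const struct commit *c)
{
	struct tree *t = get_commit_tree(c);

	return t ? &t->oid : NULL;
}
```

With every read funneled through these two functions, the way the
tree is obtained can later change inside get_commit_tree() without
touching any caller.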

Apply the Coccinelle script to create the rest of the patch.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>