From: Junio C Hamano Date: Mon, 17 Sep 2018 20:53:50 +0000 (-0700) Subject: Merge branch 'ds/multi-pack-index' X-Git-Tag: v2.20.0-rc0~249 X-Git-Url: https://git.lorimer.id.au/gitweb.git/diff_plain/49f210fd5279eeb0106cd7e4383a1c4454d30428?ds=inline;hp=-c Merge branch 'ds/multi-pack-index' When there are too many packfiles in a repository (which is not recommended), looking up an object in these would require consulting many pack .idx files; a new mechanism to have a single file that consolidates all of these .idx files is introduced. * ds/multi-pack-index: (32 commits) pack-objects: consider packs in multi-pack-index midx: test a few commands that use get_all_packs treewide: use get_all_packs packfile: add all_packs list midx: fix bug that skips midx with alternates midx: stop reporting garbage midx: mark bad packed objects multi-pack-index: store local property multi-pack-index: provide more helpful usage info midx: clear midx on repack packfile: skip loading index if in multi-pack-index midx: prevent duplicate packfile loads midx: use midx in approximate_object_count midx: use existing midx when writing new one midx: use midx in abbreviation calculations midx: read objects from multi-pack-index config: create core.multiPackIndex setting midx: write object offsets midx: write object id fanout chunk midx: write object ids in a chunk ... --- 49f210fd5279eeb0106cd7e4383a1c4454d30428 diff --combined Documentation/config.txt index 69a27eb688,8283443c97..6ecd70df0a --- a/Documentation/config.txt +++ b/Documentation/config.txt @@@ -462,20 -462,10 +462,20 @@@ core.untrackedCache: See linkgit:git-update-index[1]. `keep` by default. core.checkStat:: - Determines which stat fields to match between the index - and work tree. The user can set this to 'default' or - 'minimal'. Default (or explicitly 'default'), is to check - all fields, including the sub-second part of mtime and ctime. 
+ When missing or is set to `default`, many fields in the stat + structure are checked to detect if a file has been modified + since Git looked at it. When this configuration variable is + set to `minimal`, sub-second part of mtime and ctime, the + uid and gid of the owner of the file, the inode number (and + the device number, if Git was compiled to use it), are + excluded from the check among these fields, leaving only the + whole-second part of mtime (and ctime, if `core.trustCtime` + is set) and the filesize to be checked. ++ +There are implementations of Git that do not leave usable values in +some fields (e.g. JGit); by excluding these fields from the +comparison, the `minimal` mode may help interoperability when the +same repository is used by these other systems at the same time. core.quotePath:: Commands that output paths (e.g. 'ls-files', 'diff'), will @@@ -927,16 -917,23 +927,21 @@@ core.notesRef: This setting defaults to "refs/notes/commits", and it can be overridden by the `GIT_NOTES_REF` environment variable. See linkgit:git-notes[1]. -gc.commitGraph:: - If true, then gc will rewrite the commit-graph file when - linkgit:git-gc[1] is run. When using linkgit:git-gc[1] - '--auto' the commit-graph will be updated if housekeeping is - required. Default is false. See linkgit:git-commit-graph[1] - for details. +core.commitGraph:: + If true, then git will read the commit-graph file (if it exists) + to parse the graph structure of commits. Defaults to false. See + linkgit:git-commit-graph[1] for more information. core.useReplaceRefs:: If set to `false`, behave as if the `--no-replace-objects` option was given on the command line. See linkgit:git[1] and linkgit:git-replace[1] for more information. + core.multiPackIndex:: + Use the multi-pack-index file to track multiple packfiles using a + single index. See link:technical/multi-pack-index.html[the + multi-pack-index design document]. + core.sparseCheckout:: Enable "sparse checkout" feature. 
See section "Sparse checkout" in linkgit:git-read-tree[1] for more information. @@@ -1052,12 -1049,6 +1057,12 @@@ branch.autoSetupRebase: branch to track another branch. This option defaults to never. +branch.sort:: + This variable controls the sort ordering of branches when displayed by + linkgit:git-branch[1]. Without the "--sort=<value>" option provided, the + value of this variable will be used as the default. + See linkgit:git-for-each-ref[1] field names for valid values. + branch.<name>.remote:: When on branch <name>, it tells 'git fetch' and 'git push' which remote to fetch from/push to. The remote to push to @@@ -1154,14 -1145,6 +1159,14 @@@ and by linkgit:git-worktree[1] when 'gi remote branch. This setting might be used for other checkout-like commands or functionality in the future. +checkout.optimizeNewBranch + Optimizes the performance of "git checkout -b <new_branch>" when + using sparse-checkout. When set to true, git will not update the + repo based on the current sparse-checkout settings. This means it + will not update the skip-worktree bit in the index nor add/remove + files in the working directory to reflect the current sparse checkout + settings nor will it show the local changes. + clean.requireForce:: A boolean to make git-clean do nothing unless given -f, -i or -n. Defaults to true. @@@ -1225,6 -1208,18 +1230,6 @@@ This does not affect linkgit:git-format 'git-diff-{asterisk}' plumbing commands. Can be overridden on the command line with the `--color[=<when>]` option. -diff.colorMoved:: - If set to either a valid `<mode>` or a true value, moved lines - in a diff are colored differently, for details of valid modes - see '--color-moved' in linkgit:git-diff[1]. If simply set to - true the default color mode will be used. When set to false, - moved lines are not colored. - -diff.colorMovedWS:: - When moved lines are colored using e.g. 
the `diff.colorMoved` setting, - this option controls the `<mode>` how spaces are treated - for details of valid modes see '--color-moved-ws' in linkgit:git-diff[1]. - color.diff.<slot>:: Use customized color for diff colorization. `<slot>` specifies which part of the patch to use the specified color, and is one @@@ -1773,13 -1768,6 +1778,13 @@@ this configuration variable is ignored will be repacked. After this the number of packs should go below gc.autoPackLimit and gc.bigPackThreshold should be respected again. +gc.writeCommitGraph:: + If true, then gc will rewrite the commit-graph file when + linkgit:git-gc[1] is run. When using linkgit:git-gc[1] + '--auto' the commit-graph will be updated if housekeeping is + required. Default is false. See linkgit:git-commit-graph[1] + for details. + gc.logExpiry:: If the file gc.log exists, then `git gc --auto` won't run unless that file is more than 'gc.logExpiry' old. Default is diff --combined Makefile index 5a969f5830,377379fcc0..bd83683a87 --- a/Makefile +++ b/Makefile @@@ -723,6 -723,7 +723,7 @@@ TEST_BUILTINS_OBJS += test-online-cpus. TEST_BUILTINS_OBJS += test-path-utils.o TEST_BUILTINS_OBJS += test-prio-queue.o TEST_BUILTINS_OBJS += test-read-cache.o + TEST_BUILTINS_OBJS += test-read-midx.o TEST_BUILTINS_OBJS += test-ref-store.o TEST_BUILTINS_OBJS += test-regex.o TEST_BUILTINS_OBJS += test-repository.o @@@ -900,6 -901,7 +901,7 @@@ LIB_OBJS += merge. 
LIB_OBJS += merge-blobs.o LIB_OBJS += merge-recursive.o LIB_OBJS += mergesort.o + LIB_OBJS += midx.o LIB_OBJS += name-hash.o LIB_OBJS += negotiator/default.o LIB_OBJS += negotiator/skipping.o @@@ -1060,6 -1062,7 +1062,7 @@@ BUILTIN_OBJS += builtin/merge-recursive BUILTIN_OBJS += builtin/merge-tree.o BUILTIN_OBJS += builtin/mktag.o BUILTIN_OBJS += builtin/mktree.o + BUILTIN_OBJS += builtin/multi-pack-index.o BUILTIN_OBJS += builtin/mv.o BUILTIN_OBJS += builtin/name-rev.o BUILTIN_OBJS += builtin/notes.o @@@ -2047,7 -2050,7 +2050,7 @@@ $(BUILT_INS): git$ command-list.h: generate-cmdlist.sh command-list.txt -command-list.h: $(wildcard Documentation/git*.txt) Documentation/config.txt +command-list.h: $(wildcard Documentation/git*.txt) Documentation/*config.txt $(QUIET_GEN)$(SHELL_PATH) ./generate-cmdlist.sh command-list.txt >$@+ && mv $@+ $@ SCRIPT_DEFINES = $(SHELL_PATH_SQ):$(DIFF_SQ):$(GIT_VERSION):\ diff --combined builtin/pack-objects.c index d1144a8f7e,807f034365..caa4cd0211 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@@ -31,6 -31,7 +31,7 @@@ #include "packfile.h" #include "object-store.h" #include "dir.h" + #include "midx.h" #define IN_PACK(obj) oe_in_pack(&to_pack, obj) #define SIZE(obj) oe_size(&to_pack, obj) @@@ -1040,6 -1041,7 +1041,7 @@@ static int want_object_in_pack(const st { int want; struct list_head *pos; + struct multi_pack_index *m; if (!exclude && local && has_loose_object_nonlocal(oid)) return 0; @@@ -1054,6 -1056,32 +1056,32 @@@ if (want != -1) return want; } + + for (m = get_multi_pack_index(the_repository); m; m = m->next) { + struct pack_entry e; + if (fill_midx_entry(oid, &e, m)) { + struct packed_git *p = e.p; + off_t offset; + + if (p == *found_pack) + offset = *found_offset; + else + offset = find_pack_entry_one(oid->hash, p); + + if (offset) { + if (!*found_pack) { + if (!is_pack_valid(p)) + continue; + *found_offset = offset; + *found_pack = p; + } + want = want_found_object(exclude, p); + if (want != -1) + return want; + } 
+ } + } + list_for_each(pos, get_packed_git_mru(the_repository)) { struct packed_git *p = list_entry(pos, struct packed_git, mru); off_t offset; @@@ -2041,6 -2069,10 +2069,6 @@@ static int try_delta(struct unpacked *t delta_buf = create_delta(src->index, trg->data, trg_size, &delta_size, max_size); if (!delta_buf) return 0; - if (delta_size >= (1U << OE_DELTA_SIZE_BITS)) { - free(delta_buf); - return 0; - } if (DELTA(trg_entry)) { /* Prefer only shallower same-sized deltas. */ @@@ -2299,7 -2331,6 +2327,7 @@@ static void init_threaded_search(void pthread_mutex_init(&cache_mutex, NULL); pthread_mutex_init(&progress_mutex, NULL); pthread_cond_init(&progress_cond, NULL); + pthread_mutex_init(&to_pack.lock, NULL); old_try_to_free_routine = set_try_to_free_routine(try_to_free_from_threads); } @@@ -2806,7 -2837,7 +2834,7 @@@ static void add_objects_in_unpacked_pac memset(&in_pack, 0, sizeof(in_pack)); - for (p = get_packed_git(the_repository); p; p = p->next) { + for (p = get_all_packs(the_repository); p; p = p->next) { struct object_id oid; struct object *o; @@@ -2870,7 -2901,7 +2898,7 @@@ static int has_sha1_pack_kept_or_nonloc struct packed_git *p; p = (last_found != (void *)1) ? 
last_found : - get_packed_git(the_repository); + get_all_packs(the_repository); while (p) { if ((!p->pack_local || p->pack_keep || @@@ -2880,7 -2911,7 +2908,7 @@@ return 1; } if (p == last_found) - p = get_packed_git(the_repository); + p = get_all_packs(the_repository); else p = p->next; if (p == last_found) @@@ -2916,7 -2947,7 +2944,7 @@@ static void loosen_unused_packed_object uint32_t i; struct object_id oid; - for (p = get_packed_git(the_repository); p; p = p->next) { + for (p = get_all_packs(the_repository); p; p = p->next) { if (!p->pack_local || p->pack_keep || p->pack_keep_in_core) continue; @@@ -3063,7 -3094,7 +3091,7 @@@ static void add_extra_kept_packs(const if (!names->nr) return; - for (p = get_packed_git(the_repository); p; p = p->next) { + for (p = get_all_packs(the_repository); p; p = p->next) { const char *name = basename(p->pack_name); int i; @@@ -3336,7 -3367,7 +3364,7 @@@ int cmd_pack_objects(int argc, const ch add_extra_kept_packs(&keep_pack_list); if (ignore_packed_keep_on_disk) { struct packed_git *p; - for (p = get_packed_git(the_repository); p; p = p->next) + for (p = get_all_packs(the_repository); p; p = p->next) if (p->pack_local && p->pack_keep) break; if (!p) /* no keep-able packs found */ @@@ -3349,7 -3380,7 +3377,7 @@@ * it also covers non-local objects */ struct packed_git *p; - for (p = get_packed_git(the_repository); p; p = p->next) { + for (p = get_all_packs(the_repository); p; p = p->next) { if (!p->pack_local) { have_non_local_packs = 1; break; diff --combined http-backend.c index 458642ef72,809ba7d2c4..9e894f197f --- a/http-backend.c +++ b/http-backend.c @@@ -353,7 -353,7 +353,7 @@@ static ssize_t get_content_length(void ssize_t val = -1; const char *str = getenv("CONTENT_LENGTH"); - if (str && !git_parse_ssize_t(str, &val)) + if (str && *str && !git_parse_ssize_t(str, &val)) die("failed to parse CONTENT_LENGTH: %s", str); return val; } @@@ -595,13 -595,13 +595,13 @@@ static void get_info_packs(struct strbu size_t cnt = 0; 
select_getanyfile(hdr); - for (p = get_packed_git(the_repository); p; p = p->next) { + for (p = get_all_packs(the_repository); p; p = p->next) { if (p->pack_local) cnt++; } strbuf_grow(&buf, cnt * 53 + 2); - for (p = get_packed_git(the_repository); p; p = p->next) { + for (p = get_all_packs(the_repository); p; p = p->next) { if (p->pack_local) strbuf_addf(&buf, "P %s\n", p->pack_name + objdirlen + 6); } diff --combined pack-objects.c index 6ef87e5683,832dcf7462..d04cfa8e9f --- a/pack-objects.c +++ b/pack-objects.c @@@ -99,7 -99,7 +99,7 @@@ static void prepare_in_pack_by_idx(stru * (i.e. in_pack_idx also zero) should return NULL. */ mapping[cnt++] = NULL; - for (p = get_packed_git(the_repository); p; p = p->next, cnt++) { + for (p = get_all_packs(the_repository); p; p = p->next, cnt++) { if (cnt == nr) { free(mapping); return; @@@ -146,8 -146,6 +146,8 @@@ void prepare_packing_data(struct packin pdata->oe_size_limit = git_env_ulong("GIT_TEST_OE_SIZE", 1U << OE_SIZE_BITS); + pdata->oe_delta_size_limit = git_env_ulong("GIT_TEST_OE_DELTA_SIZE", + 1UL << OE_DELTA_SIZE_BITS); } struct object_entry *packlist_alloc(struct packing_data *pdata, @@@ -162,8 -160,6 +162,8 @@@ if (!pdata->in_pack_by_idx) REALLOC_ARRAY(pdata->in_pack, pdata->nr_alloc); + if (pdata->delta_size) + REALLOC_ARRAY(pdata->delta_size, pdata->nr_alloc); } new_entry = pdata->objects + pdata->nr_objects++; diff --combined t/helper/test-tool.h index e954e8c522,70fc0285e8..710fb1b286 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@@ -1,8 -1,6 +1,8 @@@ #ifndef __TEST_TOOL_H__ #define __TEST_TOOL_H__ +#include "git-compat-util.h" + int cmd__chmtime(int argc, const char **argv); int cmd__config(int argc, const char **argv); int cmd__ctype(int argc, const char **argv); @@@ -24,6 -22,7 +24,7 @@@ int cmd__online_cpus(int argc, const ch int cmd__path_utils(int argc, const char **argv); int cmd__prio_queue(int argc, const char **argv); int cmd__read_cache(int argc, const char **argv); + int cmd__read_midx(int 
argc, const char **argv); int cmd__ref_store(int argc, const char **argv); int cmd__regex(int argc, const char **argv); int cmd__repository(int argc, const char **argv);
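
Illustration (not part of the patch): the merge message describes the multi-pack-index as a single file that consolidates the per-pack .idx files, so that an object lookup no longer has to consult every pack index in turn. The sketch below shows that idea in its simplest form, assuming a single array of entries sorted by object id and searched with a binary search. All names (`toy_midx`, `toy_midx_entry`, `toy_midx_lookup`, `TOY_OID_LEN`) are hypothetical and exist only for this example; the toy 4-byte object id and the flat array layout are simplifying assumptions and do not reproduce the on-disk format or the functions in midx.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy object-id width chosen for readability; an assumption of this sketch. */
#define TOY_OID_LEN 4

/* Hypothetical entry: which pack holds the object and at what offset. */
struct toy_midx_entry {
	unsigned char oid[TOY_OID_LEN];
	uint32_t pack_id;
	uint64_t offset;
};

/* Hypothetical consolidated index: entries must be sorted by oid. */
struct toy_midx {
	const struct toy_midx_entry *entries;
	size_t nr;
};

/*
 * Binary-search the consolidated table once, instead of probing one
 * index per pack. Returns the matching entry, or NULL if the object
 * id is not present.
 */
const struct toy_midx_entry *toy_midx_lookup(const struct toy_midx *m,
					     const unsigned char *oid)
{
	size_t lo = 0, hi;

	assert(m != NULL && oid != NULL);
	hi = m->nr;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(oid, m->entries[mid].oid, TOY_OID_LEN);

		if (!cmp)
			return &m->entries[mid];
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return NULL;
}
```

With N packs, a per-pack scheme performs up to N separate index searches per object, while a consolidated table needs one search and then reads the pack identifier and offset from the entry it found. This mirrors the order in the `want_object_in_pack()` hunk above, where the multi-pack-index list is consulted before the loop over the MRU packfile list.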