Optimize the check to see if a ref $F can be created by making sure
no existing ref has $F/ as its prefix, which especially matters in
a repository with a large number of existing refs.
* jk/faster-name-conflicts:
refs: speed up is_refname_available
Merge branch 'jn/unpack-trees-checkout-m-carry-deletion' into maint
* jn/unpack-trees-checkout-m-carry-deletion:
checkout -m: attempt merge when deletion of path was staged
unpack-trees: use 'cuddled' style for if-else cascade
unpack-trees: simplify 'all other failures' case
* jc/apply-ws-prefix:
apply: omit ws check for excluded paths
apply: hoist use_patch() helper for path exclusion up
apply: use the right attribute for paths in non-Git patches
* jk/pretty-empty-format:
pretty: make empty userformats truly empty
pretty: treat "--format=" as an empty userformat
revision: drop useless string offset when parsing "--pretty"
* tb/crlf-tests:
MinGW: update tests to handle a native eol of crlf
Makefile: propagate NATIVE_CRLF to C
t0027: Tests for core.eol=native, eol=lf, eol=crlf
* rs/more-uses-of-skip-prefix:
pack-write: simplify index_pack_lockfile using skip_prefix() and xstrfmt()
connect: simplify check_ref() using skip_prefix() and starts_with()
A broken reimplementation of Git could write an invalid index that
records both stage #0 and higher stage entries for the same path.
Notice and reject such an index, as there is no sensible fallback
(we do not know if the broken tool wanted to resolve and forgot to
remove higher stage entries, or if it wanted to unresolve and
forgot to remove the stage#0 entry).
* jp/index-with-corrupt-stages:
read_index_unmerged(): remove unnecessary loop index adjustment
read_index_from(): catch out of order entries when reading an index file
When receiving an invalid pack stream that records the same object
twice, multiple threads got confused due to a race. We should
reject or correct such a stream upon receiving, but that will be a
larger change.
* jk/index-pack-threading-races:
index-pack: fix race condition with duplicate bases
* jk/commit-author-parsing:
determine_author_info(): copy getenv output
determine_author_info(): reuse parsing functions
date: use strbufs in date-formatting functions
record_author_date(): use find_commit_header()
record_author_date(): fix memory leak on malformed commit
commit: provide a function to find a header in a buffer
"log --date=iso" uses a slight variant of ISO 8601 format that is
made more human readable. A new "--date=iso-strict" option gives
datetime output that is more strictly conformant.
* bb/date-iso-strict:
pretty: provide a strict ISO 8601 date format
Sometimes users want to report a bug they experience on their
repository, but they are not at liberty to share the contents of
the repository. "fast-export" was taught an "--anonymize" option
to replace blob contents, names of people and paths and log
messages with bland and simple strings to help them.
* jk/fast-export-anonymize:
docs/fast-export: explain --anonymize more completely
teach fast-export an --anonymize option
The number of refs that can be pushed at once over smart HTTP was
limited by the command line length. The limitation has been lifted
by passing these refs from the standard input of send-pack.
* jk/send-pack-many-refspecs:
send-pack: take refspecs over stdin
Documentation/git-rebase.txt: <upstream> must be given to specify <branch>
Current syntax description makes one wonder if there is any
syntactic way to distinguish between <branch> and <upstream> so that
one can specify <branch> but not <upstream>, but that is not the
case.
Make it explicit that these arguments are positional, i.e. the
earlier ones cannot be omitted if you want to give later ones.
Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
On my Debian 7 system, this fixes annoying warnings when the output
of "git svn" commands are redirected:
Unable to get Terminal Size. The TIOCGWINSZ ioctl didn't work.
The COLUMNS and LINES environment variables didn't work. The
resize program didn't work.
git svn: find-rev allows short switches for near matches
Allow -B and -A to act as short aliases for --before and --after
options respectively. This reduces typing and hopefully allows
reuse of muscle memory for grep(1) users.
Git no longer seems to use these flags or their associated config keys;
when they are present, git-svn outputs a message indicating that they
are being ignored.
Signed-off-by: Lawrence Velázquez <vq@larryv.me> Signed-off-by: Eric Wong <normalperson@yhbt.net>
Calling "git svn info $(pwd)" would hit:
"Reading from filehandle failed at ..."
errors due to improper prefixing and canonicalization.
Strip the toplevel path from absolute filesystem paths to ensure
downstream canonicalization routines are only exposed to paths
tracked in git (or SVN).
v2:
Thanks to Andrej Manduch for originally noticing the issue
and fixing my original version of this to handle
more corner cases such as "/path/to/top/../top" and
"/path/to/top/../top/file" as shown in the new test cases.
v3:
Fix pathname portability problems pointed out by Johannes Sixt
with a hint from brian m. carlson.
Cc: Johannes Sixt <j6t@kdbg.org> Cc: "brian m. carlson" <sandals@crustytoothpaste.net> Signed-off-by: Andrej Manduch <amanduch@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
git-svn: branch: avoid systematic prompt for cert/pass
Commands such as "git svn init/fetch/dcommit" do not prompt for client
certificate/password if they are stored in SVN config file. Make
"git svn branch" consistent with the other commands, as SVN::Client is
capable of building its own authentication baton from information in the
SVN config directory.
Signed-off-by: Monard Vong <travelingsoul86@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
t9300: use test_cmp_bin instead of test_cmp to compare binary files
test_cmp is intended to produce diff output for human consumption. The
input in one instance in t9300-fast-import.sh are binary files, however.
Use test_cmp_bin to compare the files.
This was noticed because on Windows we have a special implementation of
test_cmp in pure bash code (to ignore differences due to intermittent CR
in actual output), and bash runs into an infinite loop due to the binary
nature of the input.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our filesystem ref storage does not allow D/F conflicts; so
if "refs/heads/a/b" exists, we do not allow "refs/heads/a"
to exist (and vice versa). This falls out naturally for
loose refs, where the filesystem enforces the condition. But
for packed-refs, we have to make the check ourselves.
We do so by iterating over the entire packed-refs namespace
and checking whether each name creates a conflict. If you
have a very large number of refs, this is quite inefficient,
as you end up doing a large number of comparisons with
uninteresting bits of the ref tree (e.g., we know that all
of "refs/tags" is uninteresting in the example above, yet we
check each entry in it).
Instead, let's take advantage of the fact that we have the
packed refs stored as a trie of ref_entry structs. We can
find each component of the proposed refname as we walk
through the trie, checking for D/F conflicts as we go. For a
refname of depth N (i.e., 4 in the above example), we only
have to visit N nodes. And at each visit, we can binary
search the M names at that level, for a total complexity of
O(N lg M). ("M" is different at each level, of course, but
we can take the worst-case "M" as a bound).
In a pathological case of fetching 30,000 fresh refs into a
repository with 8.5 million refs, this dropped the time to
run "git fetch" from tens of minutes to ~30s.
This may also help smaller cases in which we check against
loose refs (which we do when renaming a ref), as we may
avoid a disk access for unrelated loose directories.
Note that the tests we add appear at first glance to be
redundant with what is already in t3210. However, the early
tests are not robust; they are run with reflogs turned on,
meaning that we are not actually testing
is_refname_available at all! The operations will still fail
because the reflogs will hit D/F conflicts in the
filesystem. To get a true test, we must turn off reflogs
(but we don't want to do so for the entire script, because
the point of turning them on was to cover some other cases).
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fsck tries hard to detect missing objects, and will complain
(and exit non-zero) about any inter-object links that are
missing. However, it will not exit non-zero for any missing
ref tips, meaning that a severely broken repository may
still pass "git fsck && echo ok".
The problem is that we use for_each_ref to iterate over the
ref tips, which hides broken tips. It does at least print an
error from the refs.c code, but fsck does not ever see the
ref and cannot note the problem in its exit code. We can solve
this by using for_each_rawref and noting the error ourselves.
In addition to adding tests for this case, we add tests for
all types of missing-object links (all of which worked, but
which we were not testing).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce CONFIG_REGEX_NONE as a more explicit sentinel value to say
"we do not want to replace any existing entry" and use it in the
implementation of "git config --add".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: use single-parameter --cacheinfo in example
The single-parameter form is described as the preferred way. Separate
arguments are only supported for backward compatibility. Update the
example to the recommended form.
Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The API to allocate the structure to keep track of commit
decoration was cumbersome to use, inviting lazy code to
overallocate memory.
* jk/name-decoration-alloc:
log-tree: use FLEX_ARRAY in name_decoration
log-tree: make name_decoration hash static
log-tree: make add_name_decoration a public function
"git checkout -m" did not switch to another branch while carrying
the local changes forward when a path was deleted from the index.
* jn/unpack-trees-checkout-m-carry-deletion:
checkout -m: attempt merge when deletion of path was staged
unpack-trees: use 'cuddled' style for if-else cascade
unpack-trees: simplify 'all other failures' case
Teach a few codepaths to punt (instead of dying) when large blobs
that would not fit in core are involved in the operation.
* nd/large-blobs:
diff: shortcut for diff'ing two binary SHA-1 objects
diff --stat: mark any file larger than core.bigfilethreshold binary
diff.c: allow to pass more flags to diff_populate_filespec
sha1_file.c: do not die failing to malloc in unpack_compressed_entry
wrapper.c: introduce gentle xmallocz that does not die()
Add a few more places in "commit" and "checkout" that make sure
that the cache-tree is fully populated in the index.
* dt/cache-tree-repair:
cache-tree: do not try to use an invalidated subtree info to build a tree
cache-tree: Write updated cache-tree after commit
cache-tree: subdirectory tests
test-dump-cache-tree: invalid trees are not errors
cache-tree: create/update cache-tree on checkout
The second batch of the transactional ref update series.
* rs/ref-transaction-1: (22 commits)
update-ref --stdin: pass transaction around explicitly
update-ref --stdin: narrow scope of err strbuf
refs.c: make delete_ref use a transaction
refs.c: make prune_ref use a transaction to delete the ref
refs.c: remove lock_ref_sha1
refs.c: remove the update_ref_write function
refs.c: remove the update_ref_lock function
refs.c: make lock_ref_sha1 static
walker.c: use ref transaction for ref updates
fast-import.c: use a ref transaction when dumping tags
receive-pack.c: use a reference transaction for updating the refs
refs.c: change update_ref to use a transaction
branch.c: use ref transaction for all ref updates
fast-import.c: change update_branch to use ref transactions
sequencer.c: use ref transactions for all ref updates
commit.c: use ref transactions for updates
replace.c: use the ref transaction functions for updates
tag.c: use ref transactions when doing updates
refs.c: add transaction.status and track OPEN/CLOSED
refs.c: make ref_transaction_begin take an err argument
...
* nd/mv-code-cleaning:
mv: no SP between function name and the first opening parenthese
mv: combine two if(s)
mv: unindent one level for directory move code
mv: move index search code out
mv: remove an "if" that's always true
mv: split submodule move preparation code out
mv: flatten error handling code block
mv: mark strings for translations
Admit that keeping LIB_H up-to-date, only for those that do not use
the automatically generated dependencies, is a losing battle, and
make it conservative by making everything depend on anything.
* jk/make-simplify-dependencies:
Makefile: drop CHECK_HEADER_DEPENDENCIES code
Makefile: use `find` to determine static header dependencies
i18n: treat "make pot" as an explicitly-invoked target
Update git_config() users with callback functions for a very narrow
scope with calls to config-set API that lets us query a single
variable.
* ta/config-set-2:
builtin/apply.c: replace `git_config()` with `git_config_get_string_const()`
merge-recursive.c: replace `git_config()` with `git_config_get_int()`
ll-merge.c: refactor `read_merge_config()` to use `git_config_string()`
fast-import.c: replace `git_config()` with `git_config_get_*()` family
branch.c: replace `git_config()` with `git_config_get_string()
alias.c: replace `git_config()` with `git_config_get_string()`
imap-send.c: replace `git_config()` with `git_config_get_*()` family
pager.c: replace `git_config()` with `git_config_get_value()`
builtin/gc.c: replace `git_config()` with `git_config_get_*()` family
rerere.c: replace `git_config()` with `git_config_get_*()` family
fetchpack.c: replace `git_config()` with `git_config_get_*()` family
archive.c: replace `git_config()` with `git_config_get_bool()` family
read-cache.c: replace `git_config()` with `git_config_get_*()` family
http-backend.c: replace `git_config()` with `git_config_get_bool()` family
daemon.c: replace `git_config()` with `git_config_get_bool()` family
Use the new caching config-set API in git_config() calls.
* ta/config-set-1:
add tests for `git_config_get_string_const()`
add a test for semantic errors in config files
rewrite git_config() to use the config-set API
config: add `git_die_config()` to the config-set API
change `git_config()` return value to void
add line number and file name info to `config_set`
config.c: fix accuracy of line number in errors
config.c: mark error and warnings strings for translation
We write each line of a new packed-refs file individually
using a write() syscall (and sometimes 2, if the ref is
peeled). Since each line is only about 50-100 bytes long,
this creates a lot of system call overhead.
We can instead open a stdio handle around our descriptor and
use fprintf to write to it. The extra buffering is not a
problem for us, because nobody will read our new packed-refs
file until we call commit_lock_file (by which point we have
flushed everything).
On a pathological repository with 8.5 million refs, this
dropped the time to run `git pack-refs` from 20s to 6s.
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/config-mak-document-darwin-vs-macosx:
config.mak.uname: add hint on uname_R for MacOS X
config.mak.uname: set NO_APPLE_COMMON_CRYPTO on older systems