The trace API currently rechecks the environment variable and reopens the
trace file on every API call. This has the ugly side effect that errors
(e.g. file cannot be opened, or the user specified a relative path) are
also reported on every call. Performance can be improved by about factor
three by remembering the environment state and keeping the file open.
Replace the 'const char *key' parameter in the API with a pointer to a
'struct trace_key' that bundles the environment variable name with
additional, trace-internal state. Change the call sites of these APIs to
use a static 'struct trace_key' instead of a string constant.
In trace.c::get_trace_fd(), save and reuse the file descriptor in 'struct
trace_key'.
Add a 'trace_disable()' API, so that packet_trace() can cleanly disable
tracing when it encounters packed data (instead of using unsetenv()).
Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We generally want to avoid lookup_unknown_object, because it
results in allocating more memory for the object than may be
strictly necessary.
In this case, it is used to check whether we have an
already-parsed object before calling parse_object, to save
us from reading the object from disk. Using lookup_object
would be fine for that purpose, but we can take it a step
further. Since this code was written, parse_object already
learned the "check lookup_object" optimization, so we can
simply call parse_object directly.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The point of the "index" field of struct commit is that
every allocated commit would have one. It is supposed to be
an invariant that whenever object->type is set to
OBJ_COMMIT, we have a unique index.
Commit 969eba6 (commit: push commit_index update into
alloc_commit_node, 2014-06-10) covered this case for
newly-allocated commits. However, we may also allocate an
"unknown" object via lookup_unknown_object, and only later
convert it to a commit. We must make sure that we set the
commit index when we switch the type field.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We keep a static counter to set the commit index on newly
allocated objects. However, since we also need to set the
index on any_objects which are converted to commits, let's
make the counter available as a public function.
While we're moving it, let's make sure the counter is
allocated as an unsigned integer to match the index field in
"struct commit".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we call lookup_commit, lookup_tree, etc, the logic goes
something like:
1. Look for an existing object struct. If we don't have
one, allocate and return a new one.
2. Double check that any object we have is the expected
type (and complain and return NULL otherwise).
3. Convert an object with type OBJ_NONE (from a prior
call to lookup_unknown_object) to the expected type.
We can encapsulate steps 2 and 3 in a helper function which
checks whether we have the expected object type, converts
OBJ_NONE as appropriate, and returns the object.
Not only does this shorten the code, but it also provides
one central location for converting OBJ_NONE objects into
objects of other types. Future patches will use that to
enforce type-specific invariants.
Since this is a refactoring, we would want it to behave
exactly as the current code. It takes a little reasoning to
see that this is the case:
- for lookup_{commit,tree,etc} functions, we are just
pulling steps 2 and 3 into a function that does the same
thing.
- for the call in peel_object, we currently only do step 3
(but we want to consolidate it with the others, as
mentioned above). However, step 2 is a noop here, as the
surrounding conditional makes sure we have OBJ_NONE
(which we want to keep to avoid an extraneous call to
sha1_object_info).
- for the call in lookup_commit_reference_gently, we are
currently doing step 2 but not step 3. However, step 3
is a noop here. The object we got will have just come
from deref_tag, which must have figured out the type for
each object in order to know when to stop peeling.
Therefore the type will never be OBJ_NONE.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only way that "obj" can be non-NULL is if it came from
one of the lookup_* functions. These functions always ensure
that the object has the expected type (and return NULL
otherwise), so there is no need for us to set the type.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct object" type implements basic object
polymorphism. Individual instances are allocated as
concrete types (or as a union type that can store any
object), and a "struct object *" can be cast into its real
type after examining its "type" enum. This means it is
dangerous to have a type field that does not match the
allocation (e.g., setting the type field of a "struct blob"
to "OBJ_COMMIT" would mean that a reader might read past the
allocated memory).
In most of the current code this is not a problem; the first
thing we do after allocating an object is usually to set its
type field by passing it to create_object. However, the
virtual commits we create in merge-recursive.c do not ever
get their type set. This does not seem to have caused
problems in practice, though (presumably because we always
pass around a "struct commit" pointer and never even look at
the type).
We can fix this oversight and also make it harder for future
code to get it wrong by setting the type directly in the
object allocation functions.
This will also make it easier to fix problems with commit
index allocation, as we know that any object allocated by
alloc_commit_node will meet the invariant that an object
with an OBJ_COMMIT type field will have a unique index
number.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because the allocator functions for tree, blobs, etc are all
very similar, we originally used a macro to avoid repeating
ourselves. Since the prior commit, though, the heavy lifting
is done by an inline helper function. The macro does still
save us a few lines, but at some readability cost. It
obfuscates the function definitions (and makes them hard to
find via grep).
Much worse, though, is the fact that it isn't used
consistently for all allocators. Somebody coming later may
be tempted to modify DEFINE_ALLOCATOR, but they would miss
alloc_commit_node, which is treated specially.
Let's just drop the macro and write everything out
explicitly.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
alloc.c: remove the alloc_raw_commit_node() function
In order to encapsulate the setting of the unique commit index, commit 969eba63 ("commit: push commit_index update into alloc_commit_node",
10-06-2014) introduced a (logically private) intermediary allocator
function. However, this function (alloc_raw_commit_node()) was declared
as a public function, which undermines its entire purpose.
Introduce an inline function, alloc_node(), which implements the main
logic of the allocator used by DEFINE_ALLOCATOR, and redefine the macro
in terms of the new function. In addition, use the new function in the
implementation of the alloc_commit_node() allocator, rather than the
intermediary allocator, which can now be removed.
Noticed by sparse ("symbol 'alloc_raw_commit_node' was not declared.
Should it be static?").
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
http-push.c: make CURLOPT_IOCTLDATA a usable pointer
Fixes a small bug affecting push to remotes which use some sort of
multi-pass authentication. In particular the bug affected SabreDAV as
configured by Box.com [1].
It must be a weird server configuration for the bug to have survived
this long. Someone should write a test for it.
The --sort tests should use the better format for >expect to maintain
indenting and ensure that no substitution is occurring. This makes
parsing and understanding the tests a bit easier.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fsck: simplify fsck_commit_buffer() by using commit_list_count()
fsck_commit_buffer() checks that the number of items in the parents
list of a commit matches the number of parent lines in its buffer or --
if a graft is used -- the number of parents in that graft. Simplify
the code by using commit_list_count() instead of counting by hand.
Also use different variables for the number of lines and the number of
list items, making it easier to compare them.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge: simplify merge_trivial() by using commit_list_append()
Build the commit_list of parents by calling commit_list_append() twice
instead of allocating and linking the items by hand. This makes the
code shorter and simpler. Rename the commit_list from parent to parents
(plural) while at it because there are two of them.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/Makefile: always test all lint targets when running tests
Only the two targets "test-lint-duplicates" and "test-lint-executable" are
currently executed when running the test target. This was done on purpose
when the TEST_LINT variable was added in 81127d74 to avoid twisted shell
scripting by developers only to avoid false positives that might result
from the rather simple minded tests, e.g. test-lint-shell-syntax. But it
looks like it might be better to include all lint tests to help developers
to detect non portable shell constructs before the patch is sent to the
list and reviewed there.
Change the TEST_LINT variable to run all lint test unless the TEST_LINT
variable is overridden. If we hit false positives more often than helping
developers to avoid non-portable code (or add less accurate or slow tests
later) we could still fall back to exclude them like 81127d74 proposed.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/Makefile: check helper scripts for non-portable shell commands too
Currently only the "t[0-9][0-9][0-9][0-9]-*.sh" scripts are tested for
shell incompatibilities using the check-non-portable-shell.pl script. This
makes it easy to miss non-POSIX constructs added to one of the t/*lib*.sh
helper scripts, as they aren't automatically detected.
Fix that by adding a THELPERS variable containing all shell scripts that
aren't tests and add these to the "test-lint-shell-syntax" target too.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add 'verify-commit' to be used in a way similar to 'verify-tag' is
used. Further work on verifying the mergetags might be needed.
* mg/verify-commit:
t7510: test verify-commit
t7510: exit for loop with test result
verify-commit: scriptable commit signature verification
gpg-interface: provide access to the payload
gpg-interface: provide clear helper for struct signature_check
"git clone -b brefs/tags/bar" would have mistakenly thought we were
following a single tag, even though it was a name of the branch,
because it incorrectly used strstr().
* jc/fix-clone-single-starting-at-a-tag:
builtin/clone.c: detect a clone starting at a tag correctly
Merge branch 'jk/repack-pack-keep-objects' into maint
* jk/repack-pack-keep-objects:
repack: s/write_bitmap/&s/ in code
repack: respect pack.writebitmaps
repack: do not accidentally pack kept objects by default
remote-curl: mark helper-protocol errors more clearly
When we encounter an error in remote-curl, we generally just
report it to stderr. There is no need for the user to care
that the "could not connect to server" error was generated
by git-remote-https rather than a function in the parent
git-fetch process.
However, when the error is in the protocol between git and
the helper, it makes sense to clearly identify which side is
complaining. These cases shouldn't ever happen, but when
they do, we can make them less confusing by being more
verbose.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We usually prefix our error messages with "error: ", but
many error messages from remote-curl are simply printed with
fprintf. This can make the output a little harder to read
(especially because such message may be intermingled with
errors from the parent git process).
There is no reason to avoid error(), as we are already
calling it many places (in addition to libgit.a functions
which use it).
While we're adjusting the messages, we can also drop the
capitalization which makes them unlike other git error
messages.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote-curl: do not complain on EOF from parent git
The parent git process is supposed to send us an empty line
to indicate that the conversation is over. However, the
parent process may die() if there is a problem with the
operation (e.g., we try to fetch a ref that does not exist).
In this case, it produces a useful message, but then
remote-curl _also_ produces an unhelpful message:
$ git pull origin matser
fatal: couldn't find remote ref matser
Unexpected end of command stream
The "right" way to fix this is to teach the parent git to
always cleanly close the connection to the helper, letting
it know that we are done. Implementing that is rather
clunky, though, as it would involve either replacing die()
operations with returning errors up the stack (until we
disconnect the transport), or adding an atexit handler to
clean up any transport helpers left open.
It's much simpler to just suppress the EOF message in
remote-curl. It was not added to address any real-world
situation in the first place, but rather a "we should
probably report unexpected things" suggestion[1].
It is the parent git which drives the operation, and whose
exit value actually matters. If the parent dies, then the
helper has no need to complain (except as a debugging aid).
In the off chance that the pipe is closed without the parent
dying, it can still notice the non-zero exit code.
* jk/pretty-G-format-fixes:
move "%G" format test from t7510 to t6006
pretty: avoid reading past end-of-string with "%G"
t7510: check %G* pretty-format output
t7510: test a commit signed by an unknown key
t7510: use consistent &&-chains in loop
t7510: stop referring to master in later tests
* jk/xstrfmt:
setup_git_env(): introduce git_path_from_env() helper
unique_path: fix unlikely heap overflow
walker_fetch: fix minor memory leak
merge: use argv_array when spawning merge strategy
sequencer: use argv_array_pushf
setup_git_env: use git_pathdup instead of xmalloc + sprintf
use xstrfmt to replace xmalloc + strcpy/strcat
use xstrfmt to replace xmalloc + sprintf
use xstrdup instead of xmalloc + strcpy
use xstrfmt in favor of manual size calculations
strbuf: add xstrfmt helper
* jk/skip-prefix:
http-push: refactor parsing of remote object names
imap-send: use skip_prefix instead of using magic numbers
use skip_prefix to avoid repeated calculations
git: avoid magic number with skip_prefix
fetch-pack: refactor parsing in get_ack
fast-import: refactor parsing of spaces
stat_opt: check extra strlen call
daemon: use skip_prefix to avoid magic numbers
fast-import: use skip_prefix for parsing input
use skip_prefix to avoid repeating strings
use skip_prefix to avoid magic numbers
transport-helper: avoid reading past end-of-string
fast-import: fix read of uninitialized argv memory
apply: use skip_prefix instead of raw addition
refactor skip_prefix to return a boolean
avoid using skip_prefix as a boolean
daemon: mark some strings as const
parse_diff_color_slot: drop ofs parameter
The git log --graph --show-signature command incorrectly indents the gpg
information about signed commits and merged signed tags. It does not
follow the level of indentation of the current commit.
Example of garbled output:
$ git log --show-signature --graph
* commit 258e0a237cb69aaa587b0a4fb528bb0316b1b776
|\ gpg: Signature made Mon, Jun 30, 2014 13:22:33 EDT using RSA key ID DA08
gpg: Good signature from "Jason Pyeron <jpye...@pdinc.us>"
Merge: 727c3551ca13ed
| | Author: Jason Pyeron <jpye...@pdinc.us>
| | Date: Mon Jun 30 13:22:29 2014 -0400
| |
| | Merge of 1ca13ed2271d60ba9 branch - rebranding
| |
| * commit 1ca13ed2271d60ba93d40bcc8db17ced8545f172
| | gpg: Signature made Mon, Jun 23, 2014 9:45:47 EDT using RSA key ID DD37
gpg: Good signature from "Stephen Robert Guglielmo <s...@guglielmo.us>"
gpg: aka "Stephen Robert Guglielmo <srguglie...@gmail.com>"
Author: Stephen R Guglielmo <s...@guglielmo.us>
| | Date: Mon Jun 23 09:45:27 2014 -0400
| |
| | Minor URL updates
In log-tree.c modify show_sig_lines() function to call graph_show_oneline()
after each line of gpg information it has printed in order to preserve
the level of indentation for the next output line.
Reported-by: Jason Pyeron <jpyeron@pdinc.us> Signed-off-by: Zoltan Klinger <zoltan.klinger@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add all of the ways in which check_refname_format violates valgrind's
expectations to the valgrind suppression file; remove an assumption about
the call chain of check_refname_format from same.
Signed-off-by: David Turner <dturner@twitter.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t0027: combinations of core.autocrlf, core.eol and text
Historically there are 3 different parameters controlling how line endings
are handled by Git:
- core.autocrlf
- core.eol
- the "text" attribute in .gitattributes
There are different types of content:
- (1) Files with only LF
- (2) Files with only CRLF
- (3) Files with mixed LF and CRLF
- (4) Files with LF and/or CRLF with CR not followed by LF
- (5) Files which are binary (e.g. have NUL bytes)
Recently the question came up, how files with mixed EOLs are handled by Git
(and libgit2) when they are checked out and core.autocrlf=true.
See
http://git.661346.n2.nabble.com/The-different-EOL-behavior-between-libgit2-based-software-and-official-Git-td7613670.html#a7613801
Add the EXPENSIVE t0027-auto-crlf.sh to test all combination of files
and parameters for both "git add/commit" and "git checkout".
Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current test files are named one, two and three.
Make it clearer what the tests do and rename them into
LFonly, CRLFonly and LFwithNUL.
After the renaming we can see easier that we may want more test cases
for 2 types of files:
- files which have mixed LF and CRLF line endings,
- files which have mixed LF and CR line endings.
See commit fd6cce9e, "Add per-repository eol normalization" and
"the new safer autocrlf handling" in convert.c
Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix profile feedback with -jN and add profile-fast
Profile feedback always failed for me with -jN. The problem
was that there was no implicit ordering between the profile generate
stage and the profile use stage. So some objects in the later stage
would be linked with profile generate objects, and fail due
to the missing -lgcov.
This adds a new profile target that implicitely enforces the
correct ordering by using submakes. Plus a profile-install target
to also install. This is also nicer to type that PROFILE=...
Plus I always run the performance test suite now for the full
profile run.
In addition I also added a profile-fast / profile-fast-install
target the only runs the performance test suite instead of the
whole test suite. This significantly speeds up the profile build,
which was totally dominated by test suite run time. However
it may have less coverage of course.
Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-dump-cache-tree: invalid trees are not errors
Do not treat known-invalid trees as errors even when their subtree_nr is
incorrect. Because git already knows that these trees are invalid,
an incorrect subtree_nr will not cause problems.
Add a couple of comments.
Signed-off-by: David Turner <dturner@twitter.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the same way as there is for_each_ref() to iterate on refs,
for_each_mergetag() allows the caller to iterate on the mergetags of
a given commit. Use it to rewrite show_mergetag() used in "git log".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the whole directory of test files at once using git add instead of
calling git update-index on each of them and use git commit instead of
the plumbing commands write-tree, update-ref and commit-tree to build
the commit. This simplifies the code considerably.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Profile feedback sets -DNO_NORETURN, which causes the compat
header file to go into a default #else block. That #else
block defines away __attribute__(). Doing so causes all
kinds of problems with the Linux and gcc system headers:
in particular it makes the xmmintrin.h headers error out,
breaking the build.
Don't define away __attribute__ when __GNUC__ is set.
Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Interning short strings with high probability of duplicates can reduce the
memory footprint and speed up comparisons.
Add strintern() and memintern() APIs that use a hashmap to manage the pool
of unique, interned strings.
Note: strintern(getenv()) could be used to sanitize git's use of getenv(),
in case we ever encounter a platform where a call to getenv() invalidates
previous getenv() results (which is allowed by POSIX).
Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
hashmap: add simplified hashmap_get_from_hash() API
Hashmap entries are typically looked up by just a key. The hashmap_get()
API expects an initialized entry structure instead, to support compound
keys. This flexibility is currently only needed by find_dir_entry() in
name-hash.c (and compat/win32/fscache.c in the msysgit fork). All other
(currently five) call sites of hashmap_get() have to set up a near emtpy
entry structure, resulting in duplicate code like this:
hashmap: factor out getting a hash code from a SHA1
Copying the first bytes of a SHA1 is duplicated in six places,
however, the implications (the actual value would depend on the
endianness of the platform) is documented only once.
Add a properly documented API for this.
Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git checkout checks out a branch, create or update the
cache-tree so that subsequent operations are faster.
update_main_cache_tree learned a new flag, WRITE_TREE_REPAIR. When
WRITE_TREE_REPAIR is set, portions of the cache-tree which do not
correspond to existing tree objects are invalidated (and portions which
do are marked as valid). No new tree objects are created.
Signed-off-by: David Turner <dturner@twitter.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git checkout' fails if a directory is longer than PATH_MAX, because the
lstat_cache in symlinks.c checks if the leading directory exists using
PATH_MAX-bounded string operations.
Remove the limitation by using strbuf instead.
Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs.c: handle REFNAME_REFSPEC_PATTERN at end of page
When a ref crosses a memory page boundary, we restart the parsing
at the beginning with the bytewise code. Pass the original flags
to that code, rather than the current flags.
Reported-By: Øyvind A. Holm <sunny@sunbase.org> Signed-off-by: David Turner <dturner@twitter.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move "commit->buffer" out of the in-core commit object and keep
track of their lengths. Use this to optimize the code paths to
validate GPG signatures in commit objects.
* jk/commit-buffer-length:
reuse cached commit buffer when parsing signatures
commit: record buffer length in cache
commit: convert commit->buffer to a slab
commit-slab: provide a static initializer
use get_commit_buffer everywhere
convert logmsg_reencode to get_commit_buffer
use get_commit_buffer to avoid duplicate code
use get_cached_commit_buffer where appropriate
provide helpers to access the commit buffer
provide a helper to set the commit buffer
provide a helper to free commit buffer
sequencer: use logmsg_reencode in get_message
logmsg_reencode: return const buffer
do not create "struct commit" with xcalloc
commit: push commit_index update into alloc_commit_node
alloc: include any-object allocations in alloc_report
replace dangerous uses of strbuf_attach
commit_tree: take a pointer/len pair rather than a const strbuf
If the middle step fails but leaves the directory (e.g., the
bug is that clean does not notice the failure), this
pollutes the test repo with an unremovable directory. Not
only does this cause further tests to fail, but it means
that "rm -rf" fails on the whole trash directory, and the
user has to intervene manually to even re-run the test script.
We can bump the "chmod 755" recovery to a test_when_finished
block to be sure that it always runs.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
sha1_file: avoid overrunning alternate object base string
While checking if a new alternate object database is a duplicate make
sure that old and new base paths have the same length before comparing
them with memcmp. This avoids overrunning the buffer of the existing
entry if the new one is longer and it stops rejecting foobar/ after
foo/ was already added.
Signed-off-by: Rene Scharfe <ls.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When multiple parents of a merge commit get mapped to the same
commit, filter-branch used to pass all instances of the parent
commit to the parent and commit filters and to "git commit-tree" or
"git_commit_non_empty_tree".
This can often happen when extracting a small project from a large
repository; merges can join history with no commits on any branch
which affect the paths being retained. Once the intermediate
commits have been filtered out, all the immediate parents of the
merge commit can end up being mapped to the same commit - either the
original merge-base or an ancestor of it.
"git commit-tree" would display an error but write the commit with
the normalized parents in any case. "git_commit_non_empty_tree"
would fail to notice that the commit being made was in fact a
non-merge commit and would retain it even if a further pass with
"--prune-empty" would discard the commit as empty.
Ensure that duplicate parents are pruned before the parent filter to
make "--prune-empty" idempotent, removing all empty non-merge
commits in a singe pass.
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-merge-file: do not add LF at EOF while applying unrelated change
If 'current-file' does not contain LF at EOF, and change between
'base-file' and 'other-file' does not change any line close to EOF, the
3-way merge should not add LF to EOF. This is what 'diff3 -m' does, and
seems to be a reasonable expectation.
The change which introduced the behavior is cd1d61c44f. It always calls
function xdl_recs_copy() for sides with add_nl == 1. In fact, it looks
like the only case when this is needed is when 2 files are being
union-merged, and they do not have LF at EOF (strictly speaking, the
first of them).
Add tests:
* "merge without conflict (missing LF at EOF, away from change in the
other file)" and "merge does not add LF away of change", to demonstrate
the changed behavior.
* "conflict at EOF without LF resolved by --union", to verify that the
union-merge at the end inerts newline between versions.
* some more tests which I felt like not covering the functionality well
Signed-off-by: Max Kirillov <max@max630.net> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t6023-merge-file.sh: fix and mark as broken invalid tests
Tests "merge without conflict (missing LF at EOF" and "merge result
added missing LF" are meaningless - the first one is identical to
"merge without conflict" and the second compares results of those
identical tests, which are always same.
This has been so since their addition in ba1f5f3537. Probably "new4.txt"
was meant to be used instead of "new2.txt". Unfortunately, the current
merge-file breaks with new4 - conflict is reported. They also fail at
that revision if fixed.
Fix the file reference to "new4.txt" and mark the tests as failing -
they look like legitimate expectations, just not satisfied at time
being.
Signed-off-by: Max Kirillov <max@max630.net> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are reloading the list of packs, we check whether a
particular pack has been loaded. This is slightly tricky,
because we load packs based on the presence of their ".idx"
files, but record the name of the matching ".pack" file.
Therefore we want to compare their bases.
The existing code stripped off ".idx" from a file we found,
then compared that whole base length to strings containing
the ".pack" version. This meant we could end up comparing
bytes past what the ".pack" string contained, if the ".idx"
file name was much longer.
In practice, it worked OK because memcmp would end up seeing
a difference in the two strings and would return early
before hitting the full length. However, memcmp may
sometimes read extra bytes past a difference (e.g., because
it is comparing 64-bit words), or is even free to compare in
reverse order.
Furthermore, our memcmp made no guarantees that we matched
the whole pack name, up to ".pack". So "foo.idx" would match
"foo-bar.pack", which is wrong (but does not typically
happen, because our pack names have a fixed size).
We can fix both issues, avoid magic numbers, and document
that we expect to compare against a string with ".pack" by
using strip_suffix.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this code, we try to convert both "foo.idx" and "foo"
into "foo.pack". By stripping the suffix, we can avoid a
confusing use of strbuf_splice, and make it clear that both
cases are adding ".pack" to the end.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
You can almost get away with just calling "strip_suffix_mem"
on a strbuf's buf and len fields. But we also need to move
the NUL-terminator to satisfy strbuf's invariants. Let's
provide a convenience wrapper that handles this.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
use strip_suffix instead of ends_with in simple cases
When stripping a suffix like:
if (ends_with(str, "foo"))
buf = xmemdupz(str, strlen(str) - 3);
we can instead use strip_suffix to avoid the constant 3,
which must match the literal "foo" (we sometimes use
strlen("foo") instead, but that means we are repeating
ourselves). The example above becomes:
if (strip_suffix(str, "foo", &len))
buf = xmemdupz(str, len);
This also saves a strlen(), since we calculate the string
length when detecting the suffix.
Note that in some cases we also switch from xstrndup to
xmemdupz, which saves a further strlen call.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two are almost the same function, with the exception
that has_extension only matches if there is content before
the suffix. So ends_with(".exe", ".exe") is true, but
has_extension would not be.
This distinction does not matter to any of the callers,
though, and we can just replace uses of has_extension with
ends_with. We prefer the "ends_with" name because it is more
generic, and there is nothing about the function that
requires it to be used for file extensions.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ends_with function is essentially a simplified version
of strip_suffix, in which we throw away the stripped length.
Implementing it as an inline on top of strip_suffix has two
advantages:
1. We save a bit of duplicated code.
2. The suffix is typically a string literal, and we call
strlen on it. By making the function inline, many
compilers can replace the strlen call with a constant.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many callers of ends_with want to not only find out whether
a string has a suffix, but want to also strip it off. Doing
that separately has two minor problems:
1. We often run over the string twice (once to find
the suffix, and then once more to find its length to
subtract the suffix length).
2. We have to specify the suffix length again, which means
either a magic number, or repeating ourselves with
strlen("suffix").
Just as we have skip_prefix to avoid these cases with
starts_with, we can add a strip_suffix to avoid them with
ends_with.
Note that we add two forms of strip_suffix here: one that
takes a string, with the resulting length as an
out-parameter; and one that takes a pointer/length pair, and
reuses the length as an out-parameter. The latter is more
efficient when the caller already has the length (e.g., when
using strbufs), but it can be easy to confuse the two, as
they take the same number and types of parameters.
For that reason, the "mem" form puts its length parameter
next to the buffer (since they are a pair), and the string
form puts it at the end (since it is an out-parameter). The
compiler can notice when you get the order wrong, which
should help prevent writing one when you meant the other.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
sha1_file: replace PATH_MAX buffer with strbuf in prepare_packed_git_one()
Instead of using strbuf to create a message string in case a path is
too long for our fixed-size buffer, replace that buffer with a strbuf
and thus get rid of the limitation.
Helped-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a PATH_MAX buffer, use argv_array for constructing the
environment for git submodule summary. This simplifies the code a bit
and removes the arbitrary length limit.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitk: Add visiblerefs option, which lists always-shown branches
When many branches contain a commit, the branches used to be shown in
the form "A, B and many more", where A, B can be master of current
HEAD. But there are more which might be interesting to always know about.
For example, "origin/master".
The new option, visiblerefs, is stored in ~/.gitk. It contains a list
of references which are always shown before "and many more" if they
contain the commit. By default it is `{"master"}', which is compatible
with previous behavior.
Signed-off-by: Max Kirillov <max@max630.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
105b5d3f ("gitk: Use mktemp -d to avoid predictable temporary
directories") introduced a dependency on mkdtemp, which is not
available on Windows.
Use the original temporary directory behavior when mkdtemp fails.
This makes the code use mkdtemp when available and gracefully
fallback to the existing behavior when it is not available.
Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Merge early parts from git://ozlabs.org/~paulus/gitk.git
* master~2:
gitk: Show staged submodules regardless of ignore config
gitk: Allow displaying time zones from author and commit dates timestamps
gitk: Switch to patch mode when searching for line origin
gitk: Replace SHA1 entry field on keyboard paste
l10n: Init Vietnamese translation
* git://repo.or.cz/git-gui:
git-gui: tolerate major version changes when comparing the git version
git-gui: show staged submodules regardless of ignore config
One of the purposes of "git replace --edit" is to help a
user repair objects which are malformed or corrupted.
Usually we pretty-print trees with "ls-tree", which is much
easier to work with than the raw binary data. However, some
forms of corruption break the tree-walker, in which case our
pretty-printing fails, rendering "--edit" useless for the
user.
This patch introduces a "--raw" option, which lets you edit
the binary data in these instances.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a little more verbose, but will make it easier to
make parts of our command-line conditional (without
resorting to magic numbers or lots of NULLs to get an
appropriately sized argv array).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>