Here is a 'feature' command for streams to use to require support for
the notemodify (N) command.
When the 'feature' facility was introduced (v1.7.0-rc0~95^2~4,
2009-12-04), the notes import feature was old news (v1.6.6-rc0~21^2~8,
2009-10-09) and it was not obvious it deserved to be a named feature.
But now that is clear, since all major non-git fast-import backends
lack support for it.
Details: on git version with this patch applied, any "feature notes"
command in the features/options section at the beginning of a stream
will be treated as a no-op. On fast-import implementations without
the feature (and older git versions), the command instead errors out
with a message like
This version of fast-import does not support feature notes.
So by declaring use of notes at the beginning of a stream, frontends
can avoid wasting time and other resources when the backend does not
support notes. (This would be especially important for backends that
do not support rewinding history after a botched import.)
Improved-by: Thomas Rast <trast@student.ethz.ch> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import: clarify documentation of "feature" command
The "feature" command allows streams to specify options for the import
that must not be ignored. Logically, they are part of the stream,
even though technically most supported features are synonyms to
command-line options.
Make this more obvious by being more explicit about how the analogy
between most "feature" commands and command-line options works. Treat
the feature (import-marks) that does not fit this analogy separately.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
checkout: rearrange update_refs_for_switch for clarity
Take care of simple, exceptional cases before the meat of the "check
out by branch name" code begins. After this change, the function
vaguely follows the following pseudocode:
if (-B or -b)
create branch;
if (plain "git checkout" or "git checkout HEAD")
;
else if (--detach or checking out by non-branch commit name)
detach HEAD;
else if (checking out by branch name)
attach HEAD;
One nice side benefit is to make it possible to remove handling of
the --detach option from outside switch_branches.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
checkout: introduce --detach synonym for "git checkout foo^{commit}"
For example, one might use this when making a temporary merge to
test that two topics work well together.
Patch by Junio, with tests from Jeff King.
[jn: with some extra checks for bogus commandline usage]
Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
checkout: split off a function to peel away branchname arg
The code to parse and consume the tree name and "--" in commands such
as "git checkout @{-1} -- '*.c'" is intimidatingly long. Split it out
into a separate function and make it easier to skip on first reading
by making the data it uses and produces more explicit.
No functional change intended.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, EACCES overrules ENOTEMPTY when calling rmdir(). But if the
directory is busy, we only want to retry deleting the directory if it
is empty, so test specifically for that case and set ENOTEMPTY rather
than EACCES.
Noticed by Greg Hazel.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
mingw: add fallback for rmdir in case directory is in use
The same logic as for unlink and rename also applies to rmdir. For
example in case you have a shell open in a git controlled folder. This
will easily fail. So lets be nice for such cases as well.
Signed-off-by: Heiko Voigt <heiko.voigt@mahr.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
mingw: make failures to unlink or move raise a question
On Windows in case a program is accessing a file unlink or
move operations may fail. To give the user a chance to correct
this we simply wait until the user asks us to retry or fail.
This is useful because of the following use case which seem
to happen rarely but when it does it is a mess:
After making some changes the user realizes that he was on the
incorrect branch. When trying to change the branch some file
is still in use by some other process and git stops in the
middle of changing branches. Now the user has lots of files
with changes mixed with his own. This is especially confusing
on repositories that contain lots of files.
Although the recent implementation of automatic retry makes
this scenario much more unlikely lets provide a fallback as
a last resort.
Thanks to Albert Dvornik for disabling the question if users can't see it.
If the stdout of the command is connected to a terminal but the stderr
has been redirected, the odds are good that the user can't see any
question we print out to stderr. This will result in a "mysterious
hang" while the app is waiting for user input.
It seems better to be conservative, and avoid asking for input
whenever the stderr is not a terminal, just like we do for stdin.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net> Signed-off-by: Albert Dvornik <dvornik+git@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
mingw: work around irregular failures of unlink on windows
If a file is opened by another process (e.g. indexing of an IDE) for
reading it is not allowed to be deleted. So in case unlink fails retry
after waiting for some time. This extends the workaround from 6ac6f878.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pull: Document the "--[no-]recurse-submodules" options
In commits be254a0ea9 and 7dce19d374 the handling of the new fetch options
"--[no-]recurse-submodules" had been added to git-pull.sh. But they were
not documented as the pull options they now are, so let's fix that.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Attempting to include quote.h without first including strbuf.h results
in warnings:
./quote.h:33:33: warning: ‘struct strbuf’ declared inside parameter list
./quote.h:33:33: warning: its scope is only this definition or declaration, which is probably not what you want
./quote.h:34:34: warning: ‘struct strbuf’ declared inside parameter list
...
Add a toplevel declaration for struct strbuf to avoid this.
While at it, stop including system headers from quote.h. git source
files already need to include git-compat-util.h sooner to ensure the
appropriate feature test macros are defined.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cached object store was added in d66b37b (Add pretend_sha1_file()
interface. - 2007-02-04) as a way to temporarily inject some objects
to object store.
But only read_sha1_file() knows about this store. While it will return
an object from this store, sha1_object_info() will happily say
"object not found".
Teach sha1_object_info() about the cached store for consistency.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make hash-object more robust against malformed objects
Commits, trees and tags have structure. Don't let users feed git
with malformed ones. Sooner or later git will die() when
encountering them.
Note that this patch does not check semantics. A tree that points
to non-existent objects is perfectly OK (and should be so, users
may choose to add commit first, then its associated tree for example).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --cached" (without revision) used to mean "git diff --cached
HEAD" (i.e. the user was too lazy to type HEAD). This "correctly"
failed when there was no commit yet. But was that correctness useful?
This patch changes the definition of what particular command means.
It is a request to show what _would_ be committed without further "git
add". The internal implementation is the same "git diff --cached HEAD"
when HEAD exists, but when there is no commit yet, it compares the index
with an empty tree object to achieve the desired result.
Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t4120-apply-popt: help systems with core.filemode=false
A test case verifies that filemode-only patches work as expected. Help
systems where "test -x" does not work by applying the test patch also to
the index, where the effects can be verified even on such systems.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
start_command: flush buffers in the WIN32 code path as well
The POSIX code path did The Right Thing already, but we have to do the same
on Windows.
This bug caused failures in t5526-fetch-submodules, where the output of
'git fetch --recurse-submodules' was in the wrong order.
Debugged-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-bundle first appeared in 2e0afafe ("Add git-bundle") in Feb 2007,
and first shipped in Git 1.5.1.
However, OFS_DELTA is an even earlier invention, coming about in eb32d236 ("introduce delta objects with offset to base") in Sep 2006,
and first shipped in Git 1.4.4.5.
OFS_DELTA is smaller, about 3.2%-5% smaller, and is typically faster
to access than REF_DELTA because the exact location of the delta base
is available after parsing the object header. Since all bundle aware
versions of Git are also OFS_DELTA aware, just make it the default.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add testcases showing how pathspecs are handled with rev-list --objects
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make rev-list --objects work together with pathspecs
When traversing commits, the selection of commits would heed the list of
pathspecs passed, but subsequent walking of the trees of those commits
would not. This resulted in 'rev-list --objects HEAD -- <paths>'
displaying objects at unwanted paths.
Have process_tree() call tree_entry_interesting() to determine which paths
are interesting and should be walked.
Naturally, this change can provide a large speedup when paths are specified
together with --objects, since many tree entries are now correctly ignored.
Interestingly, though, this change also gives me a small (~1%) but
repeatable speedup even when no paths are specified with --objects.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier e10cb0f (tree_entry_interesting(): support wildcard matching,
2010-12-15) and b3d4b34 (tree_entry_interesting(): optimize wildcard
matching when base is matched, 2010-12-15) added tests for globbing
support for diff-tree plumbing. This is a follow-up to update the test
for revision traversal and path pruning machinery for the same topic.
match_pathspec_depth() is a clone of match_pathspec() except that it
can take depth limit. Computation is a bit lighter compared to
match_pathspec() because it's usually precomputed and stored in struct
pathspec.
In long term, match_pathspec() and match_one() should be removed in
favor of this function.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
tree_entry_interesting(): optimize wildcard matching when base is matched
If base is already matched, skip that part when calling
fnmatch(). This happens quite often if users start a command from
worktree's subdirectory and prefix is usually prepended to all
pathspecs.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
tree_entry_interesting(): fix depth limit with overlapping pathspecs
Suppose we have two pathspecs 'a' and 'a/b' (both are dirs) and depth
limit 1. In current code, pathspecs are checked in input order. When
'a/b' is checked against pathspec 'a', it fails depth limit and
therefore is excluded, although it should match 'a/b' pathspec.
This patch reorders all pathspecs alphabetically, then teaches
tree_entry_interesting() to check against the deepest pathspec first,
so depth limit of a shallower pathspec won't affect a deeper one.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is needed to replace pathspec_matches() in builtin/grep.c.
max_depth == -1 means infinite depth. Depth limit is only effective
when pathspec.recursive == 1. When pathspec.recursive == 0, the
behavior depends on match functions: non-recursive for
tree_entry_interesting() and recursive for match_pathspec{,_depth}
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff-tree: convert base+baselen to writable strbuf
In traversing trees, a full path is splitted into two parts: base
directory and entry. They are however quite often concatenated
whenever a full path is needed. Current code allocates a new buffer,
do two memcpy(), use it, then release.
Instead this patch turns "base" to a writable, extendable buffer. When
a concatenation is needed, the callee only needs to append "entry" to
base, use it, then truncate the entry out again. "base" must remain
unchanged before and after entering a function.
This avoids quite a bit of malloc() and memcpy().
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
tree_entry_interesting(): remove dependency on struct diff_options
This function can be potentially used in more places than just
tree-diff.c. "struct diff_options" does not make much sense outside
diff_tree_sha1().
While removing the use of diff_options, it also removes
tree_entry_extract() call, which means S_ISDIR() uses the entry->mode
directly, without being filtered by canon_mode() (called internally
inside tree_entry_extract).
The only use of the mode information in this function is to check the
type of the entry by giving it to S_ISDIR() macro, and the result does
not change with or without canon_mode(), so it is ok to bypass
tree_entry_extract().
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The old pathspec structure remains as pathspec.raw[]. New things are
stored in pathspec.items[]. There's no guarantee that the pathspec
order in raw[] is exactly as in items[].
raw[] is external (source) data and is untouched by pathspec
manipulation functions. It eases migration from old const char ** to
this new struct.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When there is a random garbage file whose name happens to be 38-byte
long in a .git/objects/??/ directory, the loop terminated prematurely
without marking all the other files that it hasn't checked in the
readdir() loop.
Treat such a file just like any other garbage file, and do not break out
of the readdir() loop.
While at it, replace repeated sprintf() calls to a single one outside the
loop.
Don't pass "--xhtml" to hightlight in gitweb.perl script.
The "--xhtml" option is supported only in highlight < 3.0. There is no option
to enforce (X)HTML output format compatible with both highlight < 3.0 and
highlight >= 3.0. However default output format is HTML so we don't need to
explicitly specify it.
Signed-off-by: Adam Tkac <atkac@redhat.com> Helped-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
rebase -i: clarify in-editor documentation of "exec"
tests: sanitize more git environment variables
fast-import: treat filemodify with empty tree as delete
rebase: give a better error message for bogus branch
rebase: use explicit "--" with checkout
rebase -i: clarify in-editor documentation of "exec"
The hints in the current "instruction sheet" template look like so:
# Rebase 3f14246..a1d7e01 onto 3f14246
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
# x <cmd>, exec <cmd> = Run a shell command <cmd>, and stop if it fails
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#
This does not make it clear that the format of each line is
<insn> <commit id> <explanatory text that will be printed>
but the reader will probably infer that from the automatically
generated pick examples above it.
What about the "exec" instruction? By analogy, I might imagine that
the format of that line is "exec <command> <explanatory text>", and
the "x <cmd>" hint does not address that question (at first I read it
as taking an argument <cmd> that is the name of a shell). Meanwhile,
the mention of <cmd> makes the hints harder to scan as a table.
So remove the <cmd> and add some words to remind the reader that
"exec" runs a command named by the rest of the line. To make room, it
is left to the manpage to explain that that command is run using
$SHELL and that nonzero status from that command will pause the
rebase.
Wording from Junio.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import: treat filemodify with empty tree as delete
Normal git processes do not allow one to build a tree with an empty
subtree entry without trying hard at it. This is in keeping with the
general UI philosophy: git tracks content, not empty directories.
v1.7.3-rc0~75^2 (2010-06-30) changed that by making it easy to include
an empty subtree in fast-import's active commit:
One can trigger this by reading an empty tree (for example, the tree
corresponding to an empty root commit) and trying to move it to a
subtree. It is better and more closely analogous to 'git read-tree
--prefix' to treat such commands as requests to remove the subtree.
Noticed-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
rebase: give a better error message for bogus branch
When you give a non-existent branch to git-rebase, it spits
out the usage. This can be confusing, since you may
understand the usage just fine, but simply have made a
mistake in the branch name.
Before:
$ git rebase origin bogus
Usage: git rebase ...
After:
$ git rebase origin bogus
fatal: no such branch: bogus
Usage: git rebase ...
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the case of a ref/pathname conflict, checkout will
already do the right thing and checkout the ref. However,
for a non-existant ref, this has two advantages:
1. If a file with that pathname exists, rebase will
refresh the file from the index and then rebase the
current branch instead of producing an error.
2. If no such file exists, the error message using an
explicit "--" is better:
# before
$ git rebase -i origin bogus
error: pathspec 'bogus' did not match any file(s) known to git.
Could not checkout bogus
# after
$ git rebase -i origin bogus
fatal: invalid reference: bogus
Could not checkout bogus
The problems seem to be trigger-able only through "git
rebase -i", as regular git-rebase checks the validity of the
branch parameter as a ref very early on. However, it doesn't
hurt to be defensive.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/setup-fixes:
t1510: fix typo in the comment of a test
Documentation updates for 'GIT_WORK_TREE without GIT_DIR' historical usecase
Subject: setup: officially support --work-tree without --git-dir
tests: compress the setup tests
tests: cosmetic improvements to the repo-setup test
t/README: hint about using $(pwd) rather than $PWD in tests
Fix expected values of setup tests on Windows
Subject: setup: officially support --work-tree without --git-dir
The original intention of --work-tree was to allow people to work in a
subdirectory of their working tree that does not have an embedded .git
directory. Because their working tree, which their $cwd was in, did not
have an embedded .git, they needed to use $GIT_DIR to specify where it is,
and because this meant there was no way to discover where the root level
of the working tree was, so we needed to add $GIT_WORK_TREE to tell git
where it was.
However, this facility has long been (mis)used by people's scripts to
start git from a working tree _with_ an embedded .git directory, let git
find .git directory, and then pretend as if an unrelated directory were
the associated working tree of the .git directory found by the discovery
process. It happens to work in simple cases, and is not worth causing
"regression" to these scripts.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: do not treat reset --keep as a special case
The current treatment of "git reset --keep" emphasizes how it
differs from --hard (treatment of local changes) and how it breaks
down into plumbing (git read-tree -m -u HEAD <commit> followed by git
update-ref HEAD <commit>). This can discourage people from using
it, since it might seem to be a complex or niche option.
Better to emphasize what the --keep flag is intended for --- moving
the index and worktree from one commit to another, like "git checkout"
would --- so the reader can make a more informed decision about the
appropriate situations in which to use it.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The errno check added in commit 3ba7a06 "A loose object is not corrupt
if it cannot be read due to EMFILE" only checked for whether errno is
not ENOENT and thus incorrectly treated "no error" as an error
condition.
Because of that, it never reached the code path that would report that
the object is corrupted and instead caused funny errors like:
- setup_repo, to initialize a repository or gitfile pointing to a
repository, with core.bare and core.worktree set as specified;
- try_case, to run setup from a given directory and validate the
result, with GIT_DIR and GIT_WORK_TREE set as specified;
- try_repo, to initialize a repository and call "try_case" from the
toplevel and a subdirectory;
- run_wt_tests, to run a battery of tests that check for sane
behavior when GIT_WORK_TREE is set to various positions relative to
the .git dir and cwd.
Use these helpers to make the test shorter, less repetitive, and (one
hopes) easier to understand and modify.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* rj/maint-test-fixes:
t9501-*.sh: Fix a test failure on Cygwin
lib-git-svn.sh: Add check for mis-configured web server variables
lib-git-svn.sh: Avoid setting web server variables unnecessarily
t9142: Move call to start_httpd into the setup test
t3600-rm.sh: Don't pass a non-existent prereq to test #15
* ak/describe-exact:
describe: Delay looking up commits until searching for an inexact match
describe: Store commit_names in a hash table by commit SHA1
describe: Do not use a flex array in struct commit_name
describe: Use for_each_rawref
Documentation/fast-import: put explanation of M 040000 <dataref> "" in context
Omit needless words ("Additionally ... <path> may also" is redundant).
While at it, place the explanation of this special case after the
general rules for paths to provide the reader with some context.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In particular, on systems that define uint32_t as an unsigned long,
gcc complains as follows:
CC vcs-svn/svndump.o
vcs-svn/svndump.c: In function `svndump_read':
vcs-svn/svndump.c:215: warning: int format, uint32_t arg (arg 2)
In order to suppress the warning we use the C99 format specifier
macro PRIu32 from <inttypes.h>.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Acked-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of stripping space characters past the beginning of the
line and overflowing a buffer, stop at the beginning of the line
(mimicking the corresponding fix in remote-fd).
The argument to isspace does not need to be cast explicitly because
git isspace takes care of that already.
Noticed-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
- first data
- then functions
- then test assertions
Mark up inline test vectors as
cat >vector <<-\EOF
data
data
EOF
for visual scannability. Use words like "set up" for tests that set
up for other tests, to make it obvious which tests are safe to skip.
Use repeated function calls instead of a loop for the
language-specific tests, so the invocations can be easily tweaked
individually (for example if one starts to fail).
This means if you add a new subdirectory to t4034/, it will not be
automatically used. I think that's worth it for the added
explicitness.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git's diff-words support has a detail that can be a little dangerous:
any text not matched by a given language's tokenization pattern is
treated as whitespace and changes in such text would go unnoticed.
Therefore each of the built-in regexes allows a special token type
consisting of a single non-whitespace character [^[:space:]].
To make sure UTF-8 sequences remain human readable, the builtin
regexes also have a special token type for runs of bytes with the high
bit set. In English, non-ASCII characters are usually isolated so
this is analogous to the [^[:space:]] pattern, except it matches a
single _multibyte_ character despite use of the C locale.
Unfortunately it is easy to make typos or forget entirely to include
these catch-all token types when adding support for new languages (see
v1.7.3.5~16, userdiff: fix typo in ruby and python word regexes,
2010-12-18). Avoid this by including them automatically within the
PATTERNS and IPATTERN macros.
While at it, change the UTF-8 sequence token type to match exactly one
non-ASCII multi-byte character, rather than an arbitrary run of them.
Suggested-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The builtin word regexes should be tested with some simple examples
against simple issues. Do this in bulk.
Mainly due to a lack of language knowledge and inspiration, most of
the test cases (cpp, csharp, java, objc, pascal, php, python, ruby)
are directly based off a C operator precedence table to verify that
all operators are split correctly. This means that they are probably
incomplete or inaccurate except for 'cpp' itself.
Still, they are good enough to already have uncovered a typo in the
python and ruby patterns.
'fortran' is based on my anecdotal knowledge of the DO10I parsing
rules, and thus probably useless. The rest (bibtex, html, tex) are an
ad-hoc test of what I consider important splits in those languages.
Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a frontend uses a marks file to ensure its state persists between
runs, it may represent "clean slate" when bootstrapping with "no marks
yet". In such a case, feeding the last state with --import-marks and
saving the state after the current run with --export-marks would be a
natural thing to do.
The --import-marks option however errors out when the specified marks file
doesn't exist; this makes bootstrapping a bit difficult. The location of
the marks file becomes backend-dependent when --relative-marks is in
effect, and the frontend cannot check for the existence of the file in
such a case.
The --import-marks-if-exists option does the same thing as --import-marks
but does not flag an error if the named file does not exist yet to help
these frontends.
Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
As long as sizeof(struct ll_merge_options) is small, there is not
much reason not to keep a copy of the default merge options in the BSS
section. In return, we get clearer code and one less stack frame in
the opts == NULL case.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>