git-svn: do not escape certain characters in paths
Subversion 1.7 and newer implement HTTPv2, an extension that should make HTTP
more efficient. Servers with support for this protocol will make the subversion
client library take an alternative code path that checks (with assertions)
whether the URL is "canonical" or not.
This patch fixes an issue I encountered while trying to `git svn dcommit` a
rename action for a file containing a single quote character ("User's Manual"
to "UserMan.tex"). It does not happen for older subversion 1.6 servers nor
non-HTTP(S) protocols such as the native svn protocol, only on an Apache server
shipping SVN 1.7. Trying to `git svn dcommit` under the aforementioned
conditions yields the following error which aborts the commit process:
Committing to http://example.com/svn ...
perl: subversion/libsvn_subr/dirent_uri.c:1520: uri_skip_ancestor:
Assertion `svn_uri_is_canonical(child_uri, ((void *)0))' failed.
error: git-svn died of signal 6
An analysis of the subversion source for the cause:
- The assertion originates from uri_skip_ancestor which calls
svn_uri_is_canonical, which fails when the URL contains percent-encoded values
that do not necessarily have to be encoded (not "canonical" enough). This is
done by a table lookup in libsvn_subr/path.c. Putting some debugging prints
revealed that the character ' is indeed encoded to %27 which is not
considered canonical.
- url_skip_ancestor is called by svn_ra_neon__get_baseline_info with the root
repository URL and path as parameters;
- which is called by copy_resource (libsvn_ra_neon/commit.c) for a copy action
(or in my case, renaming which is actually copy + delete old);
- which is called by commit_add_dir;
- which is assigned as a structure method "add_file" in
svn_ra_neon__get_commit_editor.
In the whole path, the path argument is not modified.
Through some more uninteresting wrapper functions, the Perl bindings gives you
access to the add_file method which will pass the path argument without
modifications to svn.
git-svn calls the "R"(ename) subroutine in Git::SVN::Editor which contains:
326 my $fbat = $self->add_file($self->repo_path($m->{file_b}), $pbat,
327 $self->url_path($m->{file_a}), $self->{r});
"repo_path" basically returns the path as-is, unless the "svn.pathnameencoding"
configuration property is set. "url_path" tries to escape some special
characters, but does not take all special characters into account, thereby
causing the path to contain some escaped characters which do not have to be
escaped.
The list of characters not to be escaped are taken from the
subversion/libsvn_subr/path.c file to fully account for all characters. Tested
with a filename containing all characters in the range 0x20 to 0x78 (inclusive).
Signed-off-by: Peter Wu <lekensteyn@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
* git://bogomips.org/git-svn:
git-svn: teach find-rev to find near matches
git svn: do not overescape URLs (fallback case)
Git::SVN::Editor::T: pass $deletions to ->A and ->D
An internal ls-tree call made by completion code only to probe if
a path exists in the tree recorded in a commit object leaked error
messages when the path is not there. It is not an error at all and
should not be shown to the end user.
* ds/completion-silence-in-tree-path-probe:
git-completion.bash: silence "not a valid object" errors
In the precedence order, the environment variable $EMAIL comes
between the built-in default (i.e. taking value by asking the
system's gethostname() etc.) and the user.email configuration
variable; the documentation implied that it is stronger than the
configuration like $GIT_COMMITTER_EMAIL is, which is wrong.
* pe/doc-email-env-is-trumped-by-config:
git-commit-tree(1): correct description of defaults
When a single SVN repository is split into multiple Git repositories
many SVN revisions will exist in only one of the Git repositories
created. For some projects the only way to build a working artifact is
to check out corresponding versions of various repositories, with no
indication of what those are in the Git world - in the SVN world the
revision numbers are sufficient.
By adding "--before" to "git-svn find-rev" we can say "tell me what this
repository looked like when that other repository looked like this":
Subversion's canonical URLs are intended to make URL comparison easy
and therefore have strict rules about what characters are special
enough to urlencode and what characters should be left alone.
When in the fallback codepath because unable to use libsvn's own
canonicalization function for some reason, escape special characters
in URIs according to the svn_uri__char_validity[] table in
subversion/libsvn_subr/path.c (r935829). The libsvn versions that
trigger this code path are not likely to be strict enough to care, but
it's nicer to be consistent.
Noticed by using SVN 1.6.17 perl bindings, which do not provide
SVN::_Core::svn_uri_canonicalize (triggering the fallback code),
with libsvn 1.7.5, whose do_switch is fussy enough to care:
Committing to file:///home/jrn/src/git/t/trash%20directory.\
t9118-git-svn-funky-branch-names/svnrepo/pr%20ject/branches\
/more%20fun%20plugin%21 ...
svn: E235000: In file '[...]/subversion/libsvn_subr/dirent_uri.c' \
line 2291: assertion failed (svn_uri_is_canonical(url, pool))
error: git-svn died of signal 6
not ok - 3 test dcommit to funky branch
After this change, the '!' in 'more%20fun%20plugin!' is not urlencoded
and t9118 passes again.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
Git::SVN::Editor::T: pass $deletions to ->A and ->D
This shouldn't make a difference because the $deletions hash is
only used when adding a directory (see 379862ec, 2012-02-20) but
it's nice to be consistent to make reading smoother anyway. No
functional change intended.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
Commit 82dce99 (attr: more matching optimizations from .gitignore -
2012-10-15) changed match_attr structure but it did not update
DEBUG_ATTR-specific code. This fixes it.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
pretty: use prefixcmp instead of memcmp on NUL-terminated strings
This conversion avoids the need for magic string length numbers in the
code. And unlike memcmp(), prefixcmp() is careful to not run over the
end of a string.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the longer term, tightening rules is a good thing to do, and
because nobody who has worked in the remote helper area seems to be
interested in reviewing this, I would assume they do not think
such a retroactive tightening will affect their remote helpers. So
let's advance this topic to see what happens.
* fc/remote-testgit-feature-done:
remote-testgit: properly check for errors
When user spells "cc:" in lowercase in the fake "header" in the
trailer part, send-email failed to pick up the addresses from
there. As e-mail headers field names are case insensitive, this
script should follow suit and treat "cc:" and "Cc:" the same way.
* nz/send-email-headers-are-case-insensitive:
git-send-email: treat field names as case-insensitively
Merge branch 'jl/interrupt-clone-remove-separate-git-dir' into maint
When "git clone --separate-git-dir=$over_there" is interrupted, it
failed to remove the real location of the $GIT_DIR it created. This
was most visible when interrupting a submodule update.
* jl/interrupt-clone-remove-separate-git-dir:
clone: support atomic operation with --separate-git-dir
Merge branch 'jc/maint-fmt-merge-msg-no-edit-lose-credit' into maint
"git merge --no-edit" computed who were involved in the work done
on the side branch, even though that information is to be discarded
without getting seen in the editor.
* jc/maint-fmt-merge-msg-no-edit-lose-credit:
merge --no-edit: do not credit people involved in the side branch
The behaviour visible to the end users was confusing, when they
attempt to kill a process spawned in the editor that was in turn
launched by Git with SIGINT (or SIGQUIT), as Git would catch that
signal and die. We ignore these signals now.
* pf/editor-ignore-sigint:
fix compilation with NO_PTHREADS
launch_editor: propagate signals from editor to git
run-command: do not warn about child death from terminal
launch_editor: ignore terminal signals while editor has control
launch_editor: refactor to use start/finish_command
run-command: drop silent_exec_failure arg from wait_or_whine
Improve compatibility of our zip output to fill uncompressed size
in the header, which we can do without seeking back (even though it
should not be necessary).
* rs/zip-with-uncompressed-size-in-the-header:
archive-zip: write uncompressed size into header even with streaming
Update zip tests to skip some that cannot be handled on platform
unzip.
* rs/zip-tests:
t5003: check if unzip supports symlinks
t5000, t5003: move ZIP tests into their own script
t0024, t5000: use test_lazy_prereq for UNZIP
t0024, t5000: clear variable UNZIP, use GIT_UNZIP instead
Update the disused merge-tree proof-of-concept code.
* jc/merge-blobs:
merge-tree: fix d/f conflicts
merge-tree: add comments to clarify what these functions are doing
merge-tree: lose unused "resolve_directories"
merge-tree: lose unused "flags" from merge_list
Which merge_file() function do you mean?
Teach "format-patch" to prefix v4- to its output files for the
fourth iteration of a patch series, to make it easier for the
submitter to keep separate copies for iterations.
* jc/format-patch-reroll:
format-patch: give --reroll-count a short synonym -v
format-patch: document and test --reroll-count
format-patch: add --reroll-count=$N option
get_patch_filename(): split into two functions
get_patch_filename(): drop "just-numbers" hack
get_patch_filename(): simplify function signature
builtin/log.c: stop using global patch_suffix
builtin/log.c: drop redundant "numbered_files" parameter from make_cover_letter()
builtin/log.c: drop unused "numbered" parameter from make_cover_letter()
* jc/submittingpatches:
SubmittingPatches: give list and maintainer addresses
SubmittingPatches: remove overlong checklist
SubmittingPatches: mention subsystems with dedicated repositories
SubmittingPatches: who am I and who cares?
Merge branch 'os/gitweb-highlight-uncaptured' into maint
"gitweb", when sorting by age to show repositories with new
activities first, used to sort repositories with absolutely nothing
in it early, which was not very useful.
* os/gitweb-highlight-uncaptured:
gitweb: fix error in sanitize when highlight is enabled
Merge branch 'jn/warn-on-inaccessible-loosen' into maint
When attempting to read the XDG-style $HOME/.config/git/config and
finding that $HOME/.config/git is a file, we gave a wrong error
message, instead of treating the case as "a custom config file does
not exist there" and moving on.
* jn/warn-on-inaccessible-loosen:
config: exit on error accessing any config file
doc: advertise GIT_CONFIG_NOSYSTEM
config: treat user and xdg config permission problems as errors
config, gitignore: failure to access with ENOTDIR is ok
"git fetch --mirror" and fetch that uses other forms of refspec with
wildcard used to attempt to update a symbolic ref that match the
wildcard on the receiving end, which made little sense (the real ref
that is pointed at by the symbolic ref would be updated anyway).
Symbolic refs no longer are affected by such a fetch.
* jc/fetch-ignore-symref:
fetch: ignore wildcarded refspecs that update local symbolic refs
The way "git svn" asked for password using SSH_ASKPASS and
GIT_ASKPASS was not in line with the rest of the system.
* ss/svn-prompt:
git-svn, perl/Git.pm: extend and use Git->prompt method for querying users
perl/Git.pm: Honor SSH_ASKPASS as fallback if GIT_ASKPASS is not set
git-svn, perl/Git.pm: add central method for prompting passwords
contrib/vim: simplify instructions for old vim support
Rely on the upstream filetype.vim instead of duplicating its rules in
git's instructions for syntax highlighting support on pre-7.2 vim
versions.
The result is a shorter contrib/vim/README. More importantly, it lets
us punt on maintenance of the autocmd rules.
So now when we fix the upstream gitsendemail rule in light of commit eed6ca7, new git users stuck on old vim reading contrib/vim/README can
automagically get the fix without any further changes needed to git.
Once the world has moved on to vim 7.2+ completely, we can get rid of
these instructions, but for now if they are this simple it's
effortless to keep them.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When make is run, the python scripts are created from *.py files that
are changed to use the python given by PYTHON_PATH. And PYTHON_PATH
is set by default to /usr/bin/python on Linux.
However, next time make is run with a different value in PYTHON_PATH,
we failed to regenerate these scripts.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Acked-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'nd/invalidate-i-t-a-cache-tree' into maint
* nd/invalidate-i-t-a-cache-tree:
cache-tree: invalidate i-t-a paths after generating trees
cache-tree: fix writing cache-tree when CE_REMOVE is present
cache-tree: replace "for" loops in update_one with "while" loops
cache-tree: remove dead i-t-a code in verify_cache()
Refactor and generally clean up the directory traversal API
implementation.
* as/dir-c-cleanup:
dir.c: rename free_excludes() to clear_exclude_list()
dir.c: refactor is_path_excluded()
dir.c: refactor is_excluded()
dir.c: refactor is_excluded_from_list()
dir.c: rename excluded() to is_excluded()
dir.c: rename excluded_from_list() to is_excluded_from_list()
dir.c: rename path_excluded() to is_path_excluded()
dir.c: rename cryptic 'which' variable to more consistent name
Improve documentation and comments regarding directory traversal API
api-directory-listing.txt: update to match code
Move the bits to set fallback default based on the platform from
the main Makefile to a separate file, so that it can be included in
Makefiles in subdirectories.
* jk/config-uname:
Makefile: hoist uname autodetection to config.mak.uname
Allows pathname patterns in .gitignore and .gitattributes files
with double-asterisks "foo/**/bar" to match any number of directory
hierarchies.
* nd/wildmatch:
wildmatch: replace variable 'special' with better named ones
compat/fnmatch: respect NO_FNMATCH* even on glibc
wildmatch: fix "**" special case
t3070: Disable some failing fnmatch tests
test-wildmatch: avoid Windows path mangling
Support "**" wildcard in .gitignore and .gitattributes
wildmatch: make /**/ match zero or more directories
wildmatch: adjust "**" behavior
wildmatch: fix case-insensitive matching
wildmatch: remove static variable force_lower_case
wildmatch: make wildmatch's return value compatible with fnmatch
t3070: disable unreliable fnmatch tests
Integrate wildmatch to git
wildmatch: follow Git's coding convention
wildmatch: remove unnecessary functions
Import wildmatch from rsync
ctype: support iscntrl, ispunct, isxdigit and isprint
ctype: make sane_ctype[] const array
git-commit-tree(1): correct description of defaults
The old phrasing indicated that the EMAIL environment variable takes
precedence over the user.email configuration setting, but it is the
other way around.
Signed-off-by: Peter Eisentraut <peter@eisentraut.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The options in git-fast-import(1) are not currently arranged in a
logical order, which has caused the '--done' options to be documented
twice (commit 3266de10).
Rearrange them into logical groups under subheadings.
Suggested-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-fast-import(1): combine documentation of --[no-]relative-marks
The descriptions of '--relative-marks' and '--no-relative-marks' make
more sense when read together instead of as two independent options.
Combine them into a single description block.
Signed-off-by: John Keeping <john@keeping.me.uk> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-shortlog(1): document behaviour of zero-width wrap
Commit 00d3947 (Teach --wrap to only indent without wrapping) added
special behaviour for a width of zero in the '-w' argument to
'git-shortlog' but this was not documented. Fix this.
Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach various forms of "format-patch" command line to identify what
branch the patches are taken from, so that the branch description
is picked up in more cases.
* nd/maint-branch-desc-doc:
format-patch: pick up branch description when no ref is specified
format-patch: pick up correct branch name from symbolic ref
t4014: a few more tests on cover letter using branch description
branch: delete branch description if it's empty
config.txt: a few lines about branch.<name>.description
"git merge" started calling prepare-commit-msg hook like "git
commit" does some time ago, but forgot to pay attention to the exit
status of the hook. t7505 may want a general clean-up but that is
a different topic.
New remote helper for bzr, with minimum fix squashed in.
* fc/remote-bzr:
remote-bzr: detect local repositories
remote-bzr: add support for older versions of bzr
remote-bzr: add support to push special modes
remote-bzr: add support for fecthing special modes
remote-bzr: add simple tests
remote-bzr: update working tree upon pushing
remote-bzr: add support for remote repositories
remote-bzr: add support for pushing
Add new remote-bzr transport helper
Streamline the document and update with a few e-mail addresses the
patches should be sent to.
* jc/submittingpatches:
SubmittingPatches: give list and maintainer addresses
SubmittingPatches: remove overlong checklist
SubmittingPatches: mention subsystems with dedicated repositories
SubmittingPatches: who am I and who cares?