cvsserver: remove unused functions _headrev and gethistory
Remove:
- _headrev() - It uses similar functionality from getmeta() and gethead().
- gethistory() - It uses similar functions gethistorydense() and getlog().
Signed-off-by: Matthew Ogilvie <mmogilvi_git@miniinfo.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
cvsserver: removed unused sha1Or-k mode from kopts_from_path
sha1Or-k was a vestige from an early, never-released
attempt to handle some oddball cases of CRLF conversion (-k option).
Ultimately it wasn't needed, and I should have gotten rid of it
before submitting the CRLF patch in the first place.
See also 90948a42892779 (add ability to guess -kb from contents).
Signed-off-by: Matthew Ogilvie <mmogilvi_git@miniinfo.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
'cvs log' output is arguably deficient in a number of ways
(see the comment added with the test), but add a test for
the current output to detect for accidental regressions.
Signed-off-by: Matthew Ogilvie <mmogilvi_git@miniinfo.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: lock symref that is to be deleted, not its target
When delete_ref is called on a symref then it locks its target and then
either deletes the target or the symref, depending on whether the flag
REF_NODEREF was set in the parameter delopt.
Instead, simply pass the flag to lock_ref_sha1_basic, which will then
either lock the target or the symref, and delete the locked ref.
This reimplements part of eca35a25 (Fix git branch -m for symrefs.).
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
.gitattributes and .gitignore share the same pattern syntax but has
separate matching implementation. Over the years, ignore's
implementation accumulates more optimizations while attr's stays the
same.
This patch reuses the core matching functions that are also used by
excluded_from_list. excluded_from_list and path_matches can't be
merged due to differences in exclude and attr, for example:
* "!pattern" syntax is forbidden in .gitattributes. As an attribute
can be unset (i.e. set to a special value "false") or made back to
unspecified (i.e. not even set to "false"), "!pattern attr" is unclear
which one it means.
* we support attaching attributes to directories, but git-core
internally does not currently make use of attributes on
directories.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitignore: make pattern parsing code a separate function
This function can later be reused by attr.c. Also turn to_exclude
field into a flag.
Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "namelen" becomes zero at this stage, we have matched the fixed
part, but whether it actually matches the pattern still depends on the
pattern in "exclude". As demonstrated in t3001, path "three/a.3"
exists and it matches the "three/a.3" part in pattern "three/a.3[abc]",
but that does not mean a true match.
Don't be too optimistic and let fnmatch() do the job.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
exclude: stricten a length check in EXC_FLAG_ENDSWITH case
This block of code deals with the "basename" part only, which has the
length of "pathlen - (basename - pathname)". Stricten the length check
and remove "pathname" from the main expression to avoid confusion.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we get an http 401, we prompt for credentials and put
them in our global credential struct. We also feed them to
the curl handle that produced the 401, with the intent that
they will be used on a retry.
When the code was originally introduced in commit 42653c0,
this was a necessary step. However, since dfa1725, we always
feed our global credential into every curl handle when we
initialize the slot with get_active_slot. So every further
request already feeds the credential to curl.
Moreover, accessing the slot here is somewhat dubious. After
the slot has produced a response, we don't actually control
it any more. If we are using curl_multi, it may even have
been re-initialized to handle a different request.
It just so happens that we will reuse the curl handle within
the slot in such a case, and that because we only keep one
global credential, it will be the one we want. So the
current code is not buggy, but it is misleading.
By cleaning it up, we can remove the slot argument entirely
from handle_curl_result, making it much more obvious that
slots should not be accessed after they are marked as
finished.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b81401c (http: prompt for credentials on failed POST)
taught post_rpc to call run_slot in a loop in order to retry
a request after asking the user for credentials. However,
after a call to run_slot we will have called
finish_active_slot. This means we have released the slot,
and we should no longer look at it.
As it happens, this does not cause any bugs in the current
code, since we know that we are not using curl_multi in this
code path, and therefore nobody will have taken over our
slot in the meantime. However, it is good form to actually
call get_active_slot again. It also future proofs us against
changes in the http code.
We can do this by jumping back to a retry label at the top
of our function. We just need to reorder a few setup lines
that should not be repeated; everything else within the loop
is either idempotent, needs to be repeated, or in a path we
do not follow (e.g., we do not even try when large_request
is set, because we don't know how much data we might have
streamed from our helper program).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we create an http active_request_slot, we can set its
"results" pointer back to local storage. The http code will
fill in the details of how the request went, and we can
access those details even after the slot has been cleaned
up.
Commit 8809703 (http: factor out http error code handling)
switched us from accessing our local results struct directly
to accessing it via the "results" pointer of the slot. That
means we're accessing the slot after it has been marked as
finished, defeating the whole purpose of keeping the results
storage separate.
Most of the time this doesn't matter, as finishing the slot
does not actually clean up the pointer. However, when using
curl's multi interface with the dumb-http revision walker,
we might actually start a new request before handing control
back to the original caller. In that case, we may reuse the
slot, zeroing its results pointer, and leading the original
caller to segfault while looking for its results inside the
slot.
Instead, we need to pass a pointer to our local results
storage to the handle_curl_result function, rather than
relying on the pointer in the slot struct. This matches what
the original code did before the refactoring (which did not
use a separate function, and therefore just accessed the
results struct directly).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
grep: stop looking at random places for .gitattributes
grep searches for .gitattributes using "name" field in struct
grep_source but that field is not real on-disk path name. For example,
"grep pattern rev" fills the field with "rev:path", and Git looks for
.gitattributes in the (non-existent but exploitable) path "rev:path"
instead of "path".
This patch passes real paths down to grep_source_load_driver() when:
- grep on work tree
- grep on the index
- grep a commit (or a tag if it points to a commit)
so that these cases look up .gitattributes at proper paths.
.gitattributes lookup is disabled in all other cases.
Initial-work-by: Jeff King <peff@peff.net> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
p4merge does not properly handle the case where "/dev/null"
is passed as a filename.
Work it around by creating a temporary file for this purpose.
Reported-by: Jeremy Morton <admin@game-point.net> Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
Needs to be amended with Tested-by when a report comes...
test-lib: Fix say_color () not to interpret \a\b\c in the message
When running with color disabled (e.g. under prove to produce TAP
output), say_color() helper function is defined to use echo to show
the message. With a message that ends with "\c", echo is allowed to
interpret it as "Do not end the line with LF".
That sets up for later tests that fetch the branch and check that git
svn mangles the refname appropriately.
Unfortunately, modern svn versions interpret path arguments with an
@-sign as an example of path@revision syntax (which pegs a path to a
particular revision) and truncate the path or error out with message
"svn: E205000: Syntax error parsing peg revision '{0}reflog'".
When using subversion 1.6.x, escaping the @ sign as %40 avoids trouble
(see 08fd28bb, 2010-07-08). Newer versions are stricter:
The recommended method for escaping a literal @ sign in a path passed
to subversion is to add an empty peg revision at the end of the path
("branches/not-a@{0}reflog@"). Do that.
Pre-1.6.12 versions of Subversion probably treat the trailing @ as
another literal @-sign (svn issue 3651). Luckily ever since
v1.8.0-rc0~155^2~7 (t9118: workaround inconsistency between SVN
versions, 2012-07-28) the test can survive that.
Tested with Debian Subversion 1.6.12dfsg-6 and 1.7.5-1 and r1395837
of Subversion trunk (1.8.x).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Tested-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Eric Wong <normalperson@yhbt.net>
git svn: work around SVN 1.7 mishandling of svn:special changes
Subversion represents symlinks as ordinary files with content starting
with "link " and the svn:special property set to "*". Thus a file can
switch between being a symlink and a non-symlink simply by toggling
its svn:special property, and new checkouts will automatically write a
file of the appropriate type. Likewise, in subversion 1.6 and older,
running "svn update" would notice changes in filetype and update the
working copy appropriately.
Starting in subversion 1.7 (issue 4091), changes to the svn:special
property trip an assertion instead:
$ svn up svn-tree
Updating 'svn-tree':
svn: E235000: In file 'subversion/libsvn_wc/update_editor.c' \
line 1583: assertion failed (action == svn_wc_conflict_action_edit \
|| action == svn_wc_conflict_action_delete || action == \
svn_wc_conflict_action_replace)
Revisions prepared with ordinary svn commands ("svn add" and not "svn
propset") don't trip this because they represent these filetype
changes using a replace operation, which is approximately equivalent
to removal followed by adding a new file and works fine. Follow suit.
Noticed using t9100. After this change, git-svn's file-to-symlink
changes are sent in a format that modern "svn update" can handle and
tests t9100.11-13 pass again.
[ew: s,git-svn\.perl,perl/Git/SVN/Editor.pm,g]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
MALLOC_CHECK: Allow checking to be disabled from config.mak
The malloc checks can be disabled using the TEST_NO_MALLOC_CHECK
variable, either from the environment or command line of an
'make test' invocation. In order to allow the malloc checks to be
disabled from the 'config.mak' file, we add TEST_NO_MALLOC_CHECK
to the environment using an export directive.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The introduction email (--compose option) have encoding hardcoded to
UTF-8, but invoked editor may not use UTF-8 encoding.
The encoding used by patches can be changed by the "8bit-encoding"
option, but this option does not have effect on introduction email
and equivalent for introduction email is missing.
Added compose-encoding command line option and sendemail.composeencoding
configuration option specify encoding of introduction email.
Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now the grep_config() callback is reusable from other configuration
callbacks, call it from git_log_config() so that grep.patterntype
and friends can be used with the commands in the "git log" family.
log --grep: accept --basic-regexp and --perl-regexp
When we added the "--perl-regexp" option (or "-P") to "git grep", we
should have done the same for the commands in the "git log" family,
but somehow we forgot to do so. This corrects it, but we will
reserve the short-and-sweet "-P" option for something else for now.
Also introduce the "--basic-regexp" option for completeness, so that
the "last one wins" principle can be used to defeat an earlier -E
option, e.g. "git log -E --basic-regexp --grep='<bre>'". Note that
it cannot have the short "-G" option as the option is to grep in the
patch text in the context of "log" family.
log --grep: use the same helper to set -E/-F options as "git grep"
The command line option parser for "git log -F -E --grep='<ere>'"
did not flip the "fixed" bit, violating the general "last option
wins" principle among conflicting options.
revisions: initialize revs->grep_filter using grep_init()
Instead of using the hand-rolled initialization sequence,
use grep_init() to populate the necessary bits. This opens
the door to allow the calling commands to optionally read
grep.* configuration variables via git_config() if they
want to.
grep: move pattern-type bits support to top-level grep.[ch]
Switching between -E/-G/-P/-F correctly needs a lot more than just
flipping opt->regflags bit these days, and we have a nice helper
function buried in builtin/grep.c for the sole use of "git grep".
Extract it so that "log --grep" family can also use it.
grep: move the configuration parsing logic to grep.[ch]
The configuration handling is a library-ish part of this program,
that is not specific to "git grep" command. It should be reusable
by "log" and others.
builtin/grep.c: make configuration callback more reusable
The grep_config() function takes one instance of grep_opt as its
callback parameter, and populates it by running git_config().
This has three practical implications:
- You have to have an instance of grep_opt already when you call
the configuration, but that is not necessarily always true. You
may be trying to initialize the grep_filter member of rev_info,
but are not ready to call init_revisions() on it yet.
- It is not easy to enhance grep_config() in such a way to make it
cascade to other callback functions to grab other variables in
one call of git_config(); grep_config() can be cascaded into from
other callbacks, but it has to be at the leaf level of a cascade.
- If you ever need to use more than one instance of grep_opt, you
will have to open and read the configuration file(s) every time
you initialize them.
Rearrange the configuration mechanism and model it after how diff
configuration variables are handled. An early call to git_config()
reads and remembers the values taken from the configuration in the
default "template", and a separate call to grep_init() uses this
template to instantiate a grep_opt.
The next step will be to move some of this out of this file so that
the other user of the grep machinery (i.e. "log") can use it.
40bfbde ("build: don't duplicate substitution of make variables",
2012-09-11) by mistake removed a necessary comma at the end of
"CC_LD_DYNPATH=-Wl,rpath," in line 414.
When executing "./configure --with-zlib=PATH", this resulted in
[...]
CC xdiff/xhistogram.o
AR xdiff/lib.a
LINK git-credential-store
/usr/bin/ld: bad -rpath option
collect2: ld returned 1 exit status
make: *** [git-credential-store] Error 1
$
during make.
Signed-off-by: Øyvind A. Holm <sunny@sunbase.org> Acked-by: Stefano Lattarini <stefano.lattarini@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests just want a bit-for-bit identical copy; they do not need
even -H (there is no symbolic link involved) nor -p (there is no
funny permission or ownership issues involved).
Just use "cp -R" instead.
Signed-off-by: Ben Walton <bdwalton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git url doc: mark ftp/ftps as read-only and deprecate them
It is not even worth mentioning their removal; just discourage
people from using them.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fmt-merge-msg" (an internal helper reduce_heads() it uses) had
a severe performance regression; an empty "git pull" took forever to
finish as the result.
* jc/merge-bases-paint-fix:
paint_down_to_common(): parse commit before relying on its timestamp
Merge branch 'jk/receive-pack-unpack-error-to-pusher' into maint
"git receive-pack" (the counterpart to "git push") did not give
progress output while processing objects it received to the puser
when run over the smart-http protocol.
* jk/receive-pack-unpack-error-to-pusher:
receive-pack: drop "n/a" on unpacker errors
receive-pack: send pack-processing stderr over sideband
receive-pack: redirect unpack-objects stdout to /dev/null
A repository created with "git clone --single" had its fetch
refspecs set up just like a clone without "--single", leading the
subsequent "git fetch" to slurp all the other branches, defeating
the whole point of specifying "only this branch".
* rt/maint-clone-single:
clone --single: limit the fetch refspec to fetched branch
gitignore.txt: suggestions how to get literal # or ! at the beginning
We support backslash escape, but we hide the details behind the phrase
"a shell glob suitable for consumption by fnmatch(3)". So it may not
be obvious how one can get literal # or ! at the beginning of pattern.
Add a few lines on how to work around the magic characters.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use svnrdump_sim.py to emulate svnrdump without an svn server.
Tests fetching, incremental fetching, fetching from file://,
and the regeneration of fast-import's marks file.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import mark files are stored outside the object database and are
therefore not fetched and can be lost somehow else. marks provide a
svn revision --> git sha1 mapping, while the notes that are attached
to each commit when it is imported provide a git sha1 --> svn revision
mapping.
If the marks file is not available or not plausible, regenerate it by
walking through the notes tree. , i.e. The plausibility check tests
if the highest revision in the marks file matches the revision of the
top ref. It doesn't ensure that the mark file is completely correct.
This could only be done with an effort equal to unconditional
regeneration.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a svnrdump-simulator replaying a dump file for testing
To ease testing without depending on a reachable svn server, this
compact python script mimics parts of svnrdumps behaviour. It
requires the remote url to start with sim://.
Start and end revisions are evaluated. If the requested revision
doesn't exist, as it is the case with incremental imports, if no new
commit was added, it returns 1 (like svnrdump).
To allow using the same dump file for simulating multiple incremental
imports, the highest revision can be limited by setting the environment
variable SVNRMAX to that value. This simulates the situation where
higher revs don't exist yet.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Search for a note attached to the ref to update and read it's
'Revision-number:'-line. Start import from the next svn revision.
If there is no next revision in the svn repo, svnrdump terminates with
a message on stderr an non-zero return value. This looks a little
weird, but there is no other way to know whether there is a new
revision in the svn repo.
On the start of an incremental import, the parent of the first commit
in the fast-import stream is set to the branch name to update. All
following commits specify their parent by a mark number. Previous mark
files are currently not reused.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote-svn: Activate import/export-marks for fast-import
Enable import and export of a marks file by sending the appropriate
feature commands to fast-import before sending data.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a note for every imported commit containing svn metadata
To provide metadata from svn dumps for further processing, e.g.
branch detection, attach a note to each imported commit that stores
additional information. The notes are currently hard-coded in
refs/notes/svn/revs. Currently the following lines from the svn dump
are directly accumulated in the note. This can be refined as needed.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast_export lacked a method to writes notes to fast-import stream.
Add two new functions fast_export_note which is similar to
fast_export_modify. And also add fast_export_buf_to_data to be able to
write inline blobs that don't come from a line_buffer or from delta
application.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow reading svn dumps from files via file:// urls
For testing as well as for importing large, already available dumps,
it's useful to bypass svnrdump and replay the svndump from a file
directly.
Add support for file:// urls in the remote url, e.g.
svn::file:///path/to/dump
When the remote helper finds an url starting with file:// it tries to
open that file instead of invoking svnrdump.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote-svn, vcs-svn: Enable fetching to private refs
The reference to update by the fast-import stream is hard-coded. When
fetching from a remote the remote-helper shall update refs in a
private namespace, i.e. a private subdir of refs/. This namespace is
defined by the 'refspec' capability, that the remote-helper advertises
as a reply to the 'capabilities' command.
Extend svndump and fast-export to allow passing the target ref.
Update svn-fe to be compatible.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When debug==1, start fast-import with "--stats" instead of "--quiet"
fast-import prints statistics that could be interesting to the
developer of remote helpers.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add documentation for the 'bidi-import' capability of remote-helpers
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Connect fast-import to the remote-helper via pipe, adding 'bidi-import' capability
The fast-import commands 'cat-blob' and 'ls' can be used by
remote-helpers to retrieve information about blobs and trees that
already exist in fast-import's memory. This requires a channel from
fast-import to the remote-helper.
remote-helpers that use these features shall advertise the new
'bidi-import' capability to signal that they require the communication
channel. When forking fast-import in transport-helper.c connect it to
a dup of the remote-helper's stdin-pipe. The additional file
descriptor is passed to fast-import via its command line
(--cat-blob-fd). It follows that git and fast-import are connected to
the remote-helpers's stdin.
Because git can send multiple commands to the remote-helper on it's
stdin, it is required that helpers that advertise 'bidi-import' buffer
all input commands until the batch of 'import' commands is ended by a
newline before sending data to fast-import. This is to prevent mixing
commands and fast-import responses on the helper's stdin.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add argv_array_detach and argv_array_free_detached
Allow detaching of ownership of the argv_array's contents and add a
function to free those detached argv_arrays later.
This makes it possible to use argv_array efficiently with the exiting
struct child_process which only contains a member char **argv.
Add to documentation.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add svndump_init_fd to allow reading dumps from arbitrary FDs
The existing function only allows reading from a filename or from
stdin. Allow passing of a FD and an additional FD for the back report
pipe. This allows us to retrieve the name of the pipe in the caller.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The link-rule is a copy of the standard git$X rule but adds VCSSVN_LIB.
Add executable to .gitignore.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable basic fetching from subversion repositories. When processing
remote URLs starting with testsvn::, git invokes this remote-helper.
It starts svnrdump to extract revisions from the subversion repository
in the 'dump file format', and converts them to a git-fast-import stream
using the functions of vcs-svn/.
Imported refs are created in a private namespace at
refs/svn/<remote-name>/master. The revision history is imported
linearly (no branch detection) and completely, i.e. from revision 0 to
HEAD.
The 'bidi-import' capability is used. The remote-helper expects data
from fast-import on its stdin. It buffers a batch of 'import' command
lines in a string_list before starting to process them.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com> Acked-by: David Michael Barr <b@rr-dav.id.au> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-svn: keep leading slash when canonicalizing paths (fallback case)
Subversion's svn_dirent_canonicalize() and svn_path_canonicalize()
APIs keep a leading slash in the return value if one was present on
the argument, which can be useful since it allows relative and
absolute paths to be distinguished.
When git-svn's canonicalize_path() learned to use these functions if
available, its semantics changed in the corresponding way. Some new
callers rely on the leading slash --- for example, if the slash is
stripped out then _canonicalize_url_ourselves() will transform
"proto://host/path/to/resource" to "proto://hostpath/to/resource".
Unfortunately the fallback _canonicalize_path_ourselves(), used when
the appropriate SVN APIs are not usable, still follows the old
semantics, so if that code path is exercised then it breaks. Fix it
to follow the new convention.
Noticed by forcing the fallback on and running tests. Without this
patch, t9101.4 fails:
Bad URL passed to RA layer: Unable to open an ra_local session to \
URL: Local URL 'file://homejrnsrcgit-scratch/t/trash%20directory.\
t9101-git-svn-props/svnrepo' contains unsupported hostname at \
/home/jrn/src/git-scratch/perl/blib/lib/Git/SVN.pm line 148
With it, the git-svn tests pass again.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
All users of $gs->{path} should have been converted to use the
accessor by now. Check our work by renaming the underlying variable
to break callers that try to use it directly.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
When using the {word,[...]} style of configuration for tags and branches,
it appears the intent is to only match whole path parts, since the words
in the {} pattern are meta-character quoted.
When the pattern word appears in the beginning or middle of the url,
it's matched completely, since the left side, pattern, and (non-empty)
right side are joined together with path separators.
However, when the pattern word appears at the end of the URL, the
right side is an empty pattern, and the resulting regex matches
more than just the specified pattern.
For example, if you specify something along the lines of
branches = branches/project/{release_1,release_2}
and your repository also contains "branches/project/release_1_2", you
will also get the release_1_2 branch. By restricting the match regex
with anchors, this is avoided.
Signed-off-by: Ammon Riley <ammon.riley@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
git-svn.perl: keep processing all commits in parents_exclude
This fixes a bug where git finds the incorrect merge parent. Consider a
repository with trunk, branch1 of trunk, and branch2 of branch1.
Without this change, git interprets a merge of branch2 into trunk as a
merge of branch1 into trunk.
Signed-off-by: Steven Walter <stevenrwalter@gmail.com> Reviewed-by: Sam Vilain <sam@vilain.net> Signed-off-by: Eric Wong <normalperson@yhbt.net>
git-svn.perl: consider all ranges for a given merge, instead of only tip-by-tip
Consider the case where you have trunk, branch1 of trunk, and branch2 of
branch1. trunk is merged back into branch2, and then branch2 is
reintegrated into trunk. The merge of branch2 into trunk will have
svn:mergeinfo property references to both branch1 and branch2. When
git-svn fetches the commit that merges branch2 (check_cherry_pick),
it is necessary to eliminate the merged contents of branch1 as well as
branch2, or else the merge will be incorrectly ignored as a cherry-pick.
Signed-off-by: Steven Walter <stevenrwalter@gmail.com> Reviewed-by: Sam Vilain <sam@vilain.net> Signed-off-by: Eric Wong <normalperson@yhbt.net>
Merge commit 'f9f6e2c' into nd/attr-match-optim-more
* commit 'f9f6e2c':
exclude: do strcmp as much as possible before fnmatch
dir.c: get rid of the wildcard symbol set in no_wildcard()
Unindent excluded_from_list()
When upload-pack advertises refs, we attempt to peel tags
and advertise the peeled version. We currently hand-roll the
tag dereferencing, and use as many optimizations as we can
to avoid loading non-tag objects into memory.
Not only has peel_ref recently learned these optimizations,
too, but it also contains an even more important one: it
has access to the "peeled" data from the pack-refs file.
That means we can avoid not only loading annotated tags
entirely, but also avoid doing any kind of object lookup at
all.
This cut the CPU time to advertise refs by 50% in the
linux-2.6 repo, as measured by:
echo 0000 | git-upload-pack . >/dev/null
best-of-five, warm cache, objects and refs fully packed:
[before] [after]
real 0m0.026s real 0m0.013s
user 0m0.024s user 0m0.008s
sys 0m0.000s sys 0m0.000s
Those numbers are irrelevantly small compared to an actual
fetch. Here's a larger repo (400K refs, of which 12K are
unique, and of which only 107 are unique annotated tags):
[before] [after]
real 0m0.704s real 0m0.596s
user 0m0.600s user 0m0.496s
sys 0m0.096s sys 0m0.092s
This shows only a 15% speedup (mostly because it has fewer
actual tags to parse), but a larger absolute value (100ms,
which isn't a lot compared to a real fetch, but this
advertisement happens on every fetch, even if the client is
just finding out they are completely up to date).
In truly pathological cases, where you have a large number
of unique annotated tags, it can make an even bigger
difference. Here are the numbers for a linux-2.6 repository
that has had every seventh commit tagged (so about 50K
tags):
[before] [after]
real 0m0.443s real 0m0.097s
user 0m0.416s user 0m0.080s
sys 0m0.024s sys 0m0.012s
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The point of peel_ref is to dereference tags; if the base
object is not a tag, then we can return early without even
loading the object into memory.
This patch accomplishes that by checking sha1_object_info
for the type. For a packed object, we can get away with just
looking in the pack index. For a loose object, we only need
to inflate the first couple of header bytes.
This is a bit of a gamble; if we do find a tag object, then
we will end up loading the content anyway, and the extra
lookup will have been wasteful. However, if it is not a tag
object, then we save loading the object entirely. Depending
on the ratio of non-tags to tags in the input, this can be a
minor win or minor loss.
However, it does give us one potential major win: if a ref
points to a large blob (e.g., via an unannotated tag), then
we can avoid looking at it entirely.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The idea of the peel_ref function is to dereference tag
objects recursively until we hit a non-tag, and return the
sha1. Conceptually, it should return 0 if it is successful
(and fill in the sha1), or -1 if there was nothing to peel.
However, the current behavior is much more confusing. For a
regular loose ref, the behavior is as described above. But
there is an optimization to reuse the peeled-ref value for a
ref that came from a packed-refs file. If we have such a
ref, we return its peeled value, even if that peeled value
is null (indicating that we know the ref definitely does
_not_ peel).
It might seem like such information is useful to the caller,
who would then know not to bother loading and trying to peel
the object. Except that they should not bother loading and
trying to peel the object _anyway_, because that fallback is
already handled by peel_ref. In other words, the whole point
of calling this function is that it handles those details
internally, and you either get a sha1, or you know that it
is not peel-able.
This patch catches the null sha1 case internally and
converts it into a -1 return value (i.e., there is nothing
to peel). This simplifies callers, which do not need to
bother checking themselves.
Two callers are worth noting:
- in pack-objects, a comment indicates that there is a
difference between non-peelable tags and unannotated
tags. But that is not the case (before or after this
patch). Whether you get a null sha1 has to do with
internal details of how peel_ref operated.
- in show-ref, if peel_ref returns a failure, the caller
tries to decide whether to try peeling manually based on
whether the REF_ISPACKED flag is set. But this doesn't
make any sense. If the flag is set, that does not
necessarily mean the ref came from a packed-refs file
with the "peeled" extension. But it doesn't matter,
because even if it didn't, there's no point in trying to
peel it ourselves, as peel_ref would already have done
so. In other words, the fallback peeling is guaranteed
to fail.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are asked to peel a ref to a sha1, we internally call
deref_tag, which will recursively parse each tagged object
until we reach a non-tag. This has the benefit that we will
verify our ability to load and parse the pointed-to object.
However, there is a performance downside: we may not need to
load that object at all (e.g., if we are listing peeled
simply listing peeled refs), or it may be a large object
that should follow a streaming code path (e.g., an annotated
tag of a large blob).
It makes more sense for peel_ref to choose the fast thing
rather than performing the extra check, for two reasons:
1. We will already sometimes short-circuit the tag parsing
in favor of a peeled entry from a packed-refs file. So
we are already favoring speed in some cases, and it is
not wise for a caller to rely on peel_ref to detect
corruption.
2. We already silently ignore much larger corruptions,
like a ref that points to a non-existent object, or a
tag object that exists but is corrupted.
2. peel_ref is not the right place to check for such a
database corruption. It is returning only the sha1
anyway, not the actual object. Any callers which use
that sha1 to load an object will soon discover the
corruption anyway, so we are really just pushing back
the discovery to later in the program.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
paint_down_to_common(): parse commit before relying on its timestamp
When refactoring the merge-base computation to reduce the pairwise
O(n*(n-1)) traversals to parallel O(n) traversals, the code forgot
that timestamp based heuristics needs each commit to have been
parsed. This caused an empty "git pull" to spend cycles, traversing
the history all the way down to 0 (because an unparsed commit object
has 0 timestamp, and any other commit object with positive timestamp
will be processed for its parents, all getting parsed), only to come
up with a merge message to be used.
Teach the commands from the "log" family the "--grep-reflog" option
to limit output by string that appears in the reflog entry when the
"--walk-reflogs" option is in effect.
* nd/grep-reflog:
revision: make --grep search in notes too if shown
log --grep-reflog: reject the option without -g
revision: add --grep-reflog to filter commits by reflog messages
grep: prepare for new header field filter