Let's have hdr be a simple char pointer/array when possible, and let's
reduce its storage to 32 bytes. Especially for sha1_loose_object_info()
where 128 bytes is way excessive and wastes extra CPU cycles inflating.
The object type is already restricted to 10 bytes in parse_sha1_header()
and the size, even if it is 64 bits, will fit in 20 decimal numbers. So
32 bytes is plenty.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
* maint:
git-apply: do not fix whitespaces on context lines.
diff --cc: integer overflow given a 2GB-or-larger file
mailinfo: do not get confused with logical lines that are too long.
git-apply: do not fix whitespaces on context lines.
Internal function apply_line() is called to copy both context lines
and added lines to the output buffer, while possibly fixing the
whitespace breakages depending on --whitespace=strip settings.
However, it did its fix-up on both context lines and added lines.
This resulted in two symptoms:
(1) The number of lines reported to have been fixed up included
these context lines.
(2) However, the lines actually shown were limited to the added
lines that had whitespace breakages.
diff --cc: integer overflow given a 2GB-or-larger file
Few of us use git to compare or even version-control 2GB files,
but when we do, we'll want it to work.
Reading a recent patch, I noticed two lines like this:
int len = st.st_size;
Instead of "int", that should be "size_t". Otherwise, in the
non-symlink case, with 64-bit size_t, if the file's size is 2GB,
the following xmalloc will fail:
result = xmalloc(len + 1);
trying to allocate 2^64 - 2^31 + 1 bytes (assuming sign-extension
in the int-to-size_t promotion). And even if it didn't fail, the
subsequent "result[len] = 0;" would be equivalent to an unpleasant
"result[-2147483648] = 0;"
The other nearby "int"-declared size variable, sz, should also be of
type size_t, for the same reason. If sz ever wraps around and becomes
negative, xread will corrupt memory _before_ the "result" buffer.
Signed-off-by: Jim Meyering <jim@meyering.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
mailinfo: do not get confused with logical lines that are too long.
It basically considers all the continuation lines to be lines of their
own, and if the total line is bigger than what we can fit in it, we just
truncate the result rather than stop in the middle and then get confused
when we try to parse the "next" line (which is just the remainder of the
first line).
[jc: added test, and tightened boundary a bit per list discussion.]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
* maint:
GIT 1.5.0.2
git-remote: support remotes with a dot in the name
Documentation: describe "-f/-t/-m" options to "git-remote add"
diff --cc: fix display of symlink conflicts during a merge.
git-remote: support remotes with a dot in the name
[jc: the original from Pavel was limiting the variable names to only
fetch and url, but I loosened it to take valid variable names.]
[jc: cherry-picked from 'master', since people seem to be reinventing
this many times.]
Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
diff --cc: fix display of symlink conflicts during a merge.
"git-diff-files --cc" to show conflicts during merge did not pass
the correct mode information for the working tree down, and showed
bogus combined diff.
merge-recursive: fix longstanding bug in merging symlinks
Commit 3af244ca added unlink(2) before running symlink(2) to
update the working tree with the merge result, but it was
unlinking a wrong path. This resulted in loss of the path
pointed by a symlink.
merge-index: fix longstanding bug in merging symlinks
Ancient commit e2b6a9d0 added code to pass "file modes" from
merge-index to merge-one-file, and then later commit 54dd99a1
wanted to make sure we do not end up creating a nonsense symlink
that points at a path whose name contains conflict markers.
However, nobody noticed that the code in merge-index added by e2b6a9d0 were stripping the S_IFMT bits and the code in 54dd99a1
was meaningless. This fixes it.
diff --cached: give more sensible error message when HEAD is yet to be created.
It is not like the user said 'diff --cached HEAD', so complaining about
HEAD not being a valid commit, while technically might be correct, is
not very helpful.
'+' increments the mtime on the files by <seconds>
'-' decrements the mtime on the files by <seconds>
'=' sets the mtime on the file to exactly <seconds>
'=+' and '=-' sets the mtime on the file to <seconds> after or
before the current time.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
* maint:
Add Release Notes to prepare for 1.5.0.2
Allow arbitrary number of arguments to git-pack-objects
rerere: do not deal with symlinks.
rerere: do not skip two conflicted paths next to each other.
Don't modify CREDITS-FILE if it hasn't changed.
Allow arbitrary number of arguments to git-pack-objects
If a repository ever gets in a situation where there are too many
packs (more than 60 or so), perhaps because of frequent use of
git-fetch -k or incremental git-repack, then it becomes impossible to
fully repack the repository with git-repack -a. That command just
dies with the cryptic message
fatal: too many internal rev-list options
This message comes from git-pack-objects, which is passed one command
line option like --unpacked=pack-<SHA1>.pack for each pack file to be
repacked. However, the current code has a static limit of 64 command
line arguments and just aborts if more arguments are passed to it.
Fix this by dynamically allocating the array of command line
arguments, and doubling the size each time it overflows.
Signed-off-by: Roland Dreier <roland@digitalvampire.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
We should always avoid rewriting a built file during `make install`
if nothing has changed since `make all`. This is to help support
the typical installation process of compiling a package as yourself,
then installing it as root.
Forcing CREDITS-FILE to be always be rebuilt in the Makefile means
that CREDITS-GEN needs to check for a change and only update
CREDITS-FILE if the file content actually differs. After all,
content is king in Git.
Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* maint:
diff-patch: Avoid emitting double-slashes in textual patch.
Reword git-am 3-way fallback failure message.
Limit filename for format-patch
core.legacyheaders: Use the description used in RelNotes-1.5.0
git-show-ref --verify: Fail if called without a reference
When the blobs recorded on the index lines in the patch as pre-image
blobs are not found in the repository, "git-am" punted saying
that the index line does not record anything useful. This was not
clear enough -- the index line does have something useful but the
problem was that it was not useful in _that_ repository.
Badly formatted commits may have very long comments. This causes
git-format-patch to fail. To avoid that, truncate the filename
to a value we believe will always work.
Err out if the patch file cannot be created.
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
git-svn: fix some potential bugs with --follow-parent
When using do_switch:
We only need to ensure the index is clean and set to that of the
parent tree) we rely on being able to reconstruct full files
with deltas transferred over the network.
When using do_update:
We may safely unlink the index if we are fetching an entire
new tree with do_update. Having an old index (from a
previously deleted/abandoned directory) around can cause
irrelevant files to be mistakenly kept.
git-svn: fix reconnections to different paths of svn:// repositories
Clearing the pool of the previous SVN::Ra connection we have
seems to to fix mysterious connection dropping errors when
reconnecting to different paths of svn:// repositories hosted by
rubyforge.org.
Note: I'm not sure *why* this fixes things things,
but it does for me.
git-svn: don't consider SVN URL usernames significant when comparing
http://foo@blah.com/path is the same as http://blah.com/path, so
remove usernames from URLs before storing them in commits, and when
reading them from commits.
git-svn: ensure we're at the top-level and can access $GIT_DIR
If we are run inside a subdirectory of a working tree, we'll
chdir to the top first before touching anything. This also
prevents the accidental creation of .git directories inside
subdirectories since they need metadata.
git-svn: give show-ignore HEAD smarts, like dcommit and log
This allows the user to run git-svn show-ignore on there
current HEAD without needing to remember which branch/ref they
branched from with -i. Also, find_by_url should correctly
handle cases where the URL passed to it is not valid.
git-svn: allow metadata options to be specified with 'init' and 'clone'
Since the options that affect the way metadata is handled in
git-svn, should be consistently set/unset throughout history
imported by git-svn; it makes sense to allow the user to set
certain options from the command-line that will write to the
config file when initially creating the repository.
Also, fix some formatting issues while we're updating
documentation.
This documents the 'clone' and 'rebase' commands
of git-svn. Additionaly, examples are updated
to use them instead of the lower-level 'init' and
'fetch' commands.
These tests are very similar as the ones I used for useSvmProps
and expect the same results because both dumps were generated
from the same original repo.
git-svn: fix useSvmProps, hopefully for the last time
svm:mirror is not useful at all for us. Parts of the old unit
test were broken and based on my misunderstanding of the
svm:mirror property.
When we read svm:source; make sure we correctly handle the '!'
in it: it is used to separate the path of the repository root
from the virtual path within the repository. We don't need
to make that distinction, honestly!
We also ensure that subdirectories are also mirrored with the
correct URL if we're using useSvmProps.
We have a new test that uses dumped repo that was really
created using SVN::Mirror to avoid ambiguities and
mis-understandings about the svm: properties.
Note: trailing whitespace in the svm.dump file is unfortunately
a reality and required by SVN; so please ignore it when applying
this patch.
Also, ensure that the -R/--remote/--svn-remote flag is always
in effect if explicitly passed via the command-line. This
allows us to track logically different mirrors sharing the
same URL (probably common with SVN::Mirror/SVK users).
This is similar to useSvmProps, but far simpler in
implementation because svnsync retains a 1:1
between revision numbers and relative paths within
the repository
git-svn: allow overriding of the SVN repo root in metadata
This feature allows users to create repositories from alternate
URLs. For example, an administrator could run git-svn on the
server locally (accessing via file://) but wish to distribute
the repository with a public http:// or svn:// URL in the
metadata so users of it will see the public URL.
This works similarly to 'svn update' or 'git pull' except that
it preserves linear history with 'git rebase' instead of 'git
merge' for ease of dcommit-ing with git-svn.
While we're at it, put the working_head_info() logic
into its own function and allow --fetch-all/--all for
dcommit and rebase (which will fetch all refs in the
current [svn-remote] instead of just the working one).
Note that the '-a' switch (short for --fetch-all/--all) has been
removed as it conflicts with the non-svn 'git fetch'
On newly-created repositories, 'refs/heads/master' does not
point to anything. This can be confusing to new users; so we
update 'master' to point to the last imported ref after fetching
is done.
Once 'master' is valid; we assume HEAD points to it; and if
the repository is not bare, then checkout the files if the
working tree is clean and unused.
git-svn: allow dcommit for those who only fetch from SVM with useSvmProps
This allows users to use SVM (SVN::Mirror) to mirror a remote
repository to use dcommit to commit to the repository that SVM
was mirroring. When dcommit is used in this manner, the automatic
fetch + rebase/reset does not happen; in which case the user will
have to manually invoke svm/svk, run 'git svn fetch', and finally
'git rebase'.
git-svn: allow --log-window-size to be specified, default to 100
The newer default value should should lower memory usage for
large fetches and also help with fetching from less reliable
servers. Previously the value was 1000 and memory usage
got a bit high on some repositories and fetching became
less reliable in some cases.
git-svn: hopefully make 'fetch' more user-friendly
multi-fetch is deprecated, "fetch -a" is easier to type
By default, fetch will fetch everything from its default
[svn-remote]; if fetch [--all|-a] is specified, then it will
fetch from all svn remotes. Refspecs on the command-line
(like git-fetch) are not supported.
Also, enable -r/--revision arguments for fetch so
users can shoot themselves in the foot^W^W^W^W^W
skip some history and do the equivalent of a shallow
clone/fetch they're not interested in.
t910*: s/repo-config/config/g; poke around possible race conditions
Some of the repo-config => config renaming missed the git-svn
tests; so I'm just renaming them to be consisten with the
rest of the modern git.
Also, some of the newer tests didn't have 'poke' in them
to workaround race conditions on fast machines. This adds
places where they can _possibly_ occur; but I don't have
fast enough hardware to trigger them.
git-svn: usability fixes for the 'git svn log' command
Similar in spirit to the recent dcommit change, we now
look at 'HEAD' by default to look for a GIT_SVN_ID
so the user won't have to pass -i <GIT_SVN_ID> argument.
We are also more tolerant of of people passing bare remote names
as a result (just $GIT_SVN_ID without the -i)
* dcommit no longer requires the correct -i/GIT_SVN_ID option
passed to it. Since you're committing from HEAD (or another
commit that is a parent of HEAD), you'll be able to find
a commit with metadata information containing the SVN URL
that your HEAD was descended from anyways.
* I don't think dcommit ever worked for people using the
noMetadata option; so I don't think relying on metadata
is an issue.
* useSvmProps users shouldn't commit to SVN::Mirror created
repositories anyways, right?
* Users of globbing should automatically be able to commit
to paths that are not explicitly set in .git/config
A manual test that sets up a repository that looks like an SVK depot,
and then imports it to check that it looks like we mirrored the
'original' source.
There is also a minor modification to the git-svn test library shell
file which sets a variable for the subversion repository's filesystem
path.
[ew: made some of the tests stricter and more thorough]
git-svn: handle multi-init without --trunk, UseSvmProps fixes
multi-init did not write a svn-remote.<remote>.url config
entry without a --trunk argument.
Also, The svm:mirror property is used by SVN::Mirror to track
the path of the repository that we are mirroring. We need to
append that to the source (which is (presumably) just the URL of
the repository root).
Lastly, we now look harder for svm:(source|mirror|uuid) properties
in sub and parent directories. Since our relative path could
be tweaked.
git-svn: write the highest maxRex out for branches and tags
Even if nothing touched paths we care about in a fetch;
increment the maxRev like we do with rev_db since
we don't like having to run get_log on revisions we've
seen before.
git-svn: use separate, per-repository .rev_db files
We need a separate .rev_db file for each repository we're
tracking. This allows us to track the same logical path off
multiple mirrors. We preserve a symlink to the old .rev_db
(no-UUID) if we're (auto-)migrating from an old version to
preserve backwards compatibility.
Also, get rid of the uuid() wrapper since we cache UUID in our
private config, and the SVN::Ra::get_uuid() function memoizes
the return value per-connection.
git-svn: extra safety for noMetadata and useSvmProps users
Make sure we flush our userspace buffers and and fsync(2)
.rev_db information to disk if we use these options because
we really don't want to lose this information.
Also, disallow --use-svm-props and --no-metadata from the
command-line because history will be inconsistent if they're
only used occasionally. If a user wants to use these options,
they must be set in the config so they're always on.
These boolean switches will override options set globally in
[svn], and even override options set on the command-line (this
should probably change in the future, however).
Note that the noMetadata and useSvmProps options conflict. It's
both technically and logically impossible to use them together.
git-svn: add support for SVN::Mirror/svk using revprops for metadata
Pass --use-svm-props or set the svn.usesvmprops key with git-config
to enable using properties set by SVN::Mirror when it mirrored the
upstream URL.
This is heavily based on work from Sam Vilain:
> From: Sam Vilain <sam@vilain.net>
> Date: Sun, 11 Feb 2007 12:34:45 +1300
> Subject: [PATCH] git-svn: re-map repository URLs and UUIDs on SVK mirror paths
>
> If an SVN revision has a property, "svm:headrev", it is likely that
> the revision was created by SVN::Mirror (a part of SVK). The property
> contains a repository UUID and a revision. We want to make it look
> like we are mirroring the original URL, so introduce a helper function
> that returns the original identity URL and UUID, and use it when
> generating commit messages.
git-svn: remove optimized commit stuff for set-tree
I may resurrect it for dcommit at some point, but nobody really
uses set-tree anymore and I don't feel like introducing more
complexity into the code at this point.
git-svn: correctly handle globs with a right-hand-side path component
Several bugs were found and fixed while getting this to work:
* Remember the 'R'(eplace) case of actions and treat it like we
would an 'A'(dd) case.
* Fix a small case of follow-parent missing a parent if a
subdirectory was modified in the revision where the parent was
copied.
* dirents returned by get_dir sometimes expire if the data
structure is too big and the pool is destroyed, so we
cache get_dir (along with check_path and get_revprops)
temporarily along with its pool.
git-svn: fix buggy regular expression usage in several places
I incorrectly used $path/? and $path/* to strip off leading
directories, but places where $path = 'branches/0.17' would
incorrectly strip changes to 'branches/0.17.1' as well.
For globs, we require that our '*' is its own path component
(surrounded by '/' or nothing). Enforce this when --prefix= is
passed to us, too.
We don't need them anymore, all the rough points of
the --follow-parent implementation have been worked out.
The only improvement in the future will probably be
--follow-parent-harder, which will track subdirectories and
follow individual file history (so annotate/blame can be
complete); but that is still a ways off.
git-svn: remove check_path calls before calling do_update
These checks were needed before git-svn got smarter about
match_paths() and using path information returned by get_log().
We also have extra checking against fetching revisions
out-of-order these days; so we don't have to worry about that as
much. We also check for tree deletions in match_paths() and
skip those as well.
We can have a branch that was deleted, then re-added under the
same name but copied from another path, in which case we'll have
multiple parents (we don't want to break the original ref, nor
lose copypath info).
git-svn: implement auto-discovery of branches/tags
This is similar to the way git proper handles refs, except we
use the keys 'branches' and 'tags' to distinguish when we want
to use wildcards.
The left-hand side of the ':' contains the remote path, and must
have one asterisk ('*') in it for the branch name. The asterisk
may be in any component of the path as long as is it on its own
directory level.
The right-hand side contains the refname and must have the
asterisk as the last path component.
git-svn: avoid extra get_log calls when refspecs are added for fetching
Since fetch_loop_common starts from the lowest revision number
in a group of Git::SVN objects; we want to avoid refetching
get_log for current users for things we've already cut it.
git-svn: don't write to the config file from --follow-parent
Having 'fetch' entries in the config file created from
--follow-parent is wasteful because it can cause *future* of
invocations to follow revisions we were never interested in
in the first place.
Using buffered IO for reading 40-41 bytes at a time isn't very
efficient. Buffering writes for a short duration is alright
since we close() right away and buffers will be flushed.
git-svn: avoid redundant get_log calls between invocations
Prefill .rev_db to the maximum revision we tried to fetch;
and take advantage of that so we can avoid using get_log()
on ranges we've already seen (and have deemed uninteresting).
git-svn: do our best to ensure that our ref and rev_db are consistent
Defer any signals that cause termination while they are
updating; and put the update-ref call as close to the rename()
as possible. Also, make things extra-safe (but slower) for
people using --no-metadata since they can't rely on .rev_db
being rebuilt if it's clobbered (well, I'm calling update-ref
with the -m flag for reflogs, we don't yet have a way to rebuild
.rev_db from reflogs.
git-svn: avoid a huge memory spike with high-numbered revisions
Passing very large strings as arguments is bad for memory usage
as it never seems to get freed in Perl. The .rev_db format is
already not optimized for projects with sparse history.
get_log with explicit paths is the safest way to get revisions
that change a particular path we're interested in.
Unfortunately that means we still have to run get_log multiple
times for each path we're interested in, and even more if
a path gets deleted.
The first argument of get_log() is an array reference, but we
shouldn't use more than one element in that array ref because
the non-existence of _one_ of those paths for a particular range
would cause an error for all paths in that range, so yes, we
need multiple get_log calls to be on the safe side...