* np/delta:
improve diff-delta with sparse and/or repetitive data
tiny optimization to diff-delta
replace adler32 with Rabin's polynomial in diff-delta
use delta index data when finding best delta matches
split the diff-delta interface
apply: fix infinite loop with multiple patches with --index
When multiple patches are passed to git-apply, it will attempt
to open multiple file descriptors to an index, which means
multiple entries will be in the circular cache_file_list.
This change makes git-apply only open the index once and
write the index at exit.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
This code is arguably pretty hot, if you use binary patches of course.
This patch helps gcc generate both smaller and faster code especially in
the error free path.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
When we cut off the front of a filename to make it fit on the line, we add
a "..." in front. However, the way the "git diff" code was written, we
will never reset the prefix back to the empty string, so every single
filename afterwards will have the "..." prefix, whether appropriate or
not.
You can see this with "git diff v2.6.16.." on the current kernel tree,
since there are filenames with long names that changed there:
notice how the two Documentation/firmware** filenames caused the "..." to
be added, but then the later filenames don't want it, and it also screws
up the alignment of the line numbering afterwards.
Trivially fixed by moving the declaration (and initial setting) of the
"prefix" variable into the for-loop where it is used.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
* fix:
repack: honor -d even when no new pack was created
clone: keep --reference even with -l -s
repo-config: document what value_regexp does a bit more clearly.
Release config lock if the regex is invalid
core-tutorial.txt: escape asterisk
Fix users of prefix_path() to free() only when necessary
Unfortunately, prefix_path() sometimes returns a newly xmalloc()ed buffer,
and in other cases it returns a substring!
For example, when calling
git update-index ./hello.txt
prefix_path() returns "hello.txt", but does not allocate a new buffer. The
original code only checked if the result of prefix_path() was different from
what was passed in, and thusly trigger a segmentation fault.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
Fix users of prefix_path() to free() only when necessary
Unfortunately, prefix_path() sometimes returns a newly xmalloc()ed buffer,
and in other cases it returns a substring!
For example, when calling
git update-index ./hello.txt
prefix_path() returns "hello.txt", but does not allocate a new buffer. The
original code only checked if the result of prefix_path() was different from
what was passed in, and thusly trigger a segmentation fault.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
If the variable we need to store should go into a section
that currently only has a single variable (not matching
the one we're trying to insert), we will already be into
the next section before we notice we've bypassed the correct
location to insert the variable.
To handle this case we store the current location as soon
as we find a variable matching the section of our new
variable.
This breakage was brought up by Linus.
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca> Signed-off-by: Junio C Hamano <junkio@cox.net>
prefix_path() sometimes allocates new memory and returns it, and
other times returns the incoming path argument intact. The
callers need to be a bit careful not to leak memory.
checkout-index: plug memory leak from prefix_path()
prefix_path() sometimes allocates new memory and returns it, and
other times returns the incoming path argument intact. The
callers need to be a bit careful not to leak memory.
This updates the user interface and generated diff data format.
* "diff --binary" is used to signal that we want an e-mailable
binary patch. It implies --full-index and -p.
* "apply --allow-binary-replacement" acquired a short synonym
"apply --binary".
* After the "GIT binary patch\n" header line there is a token
to record which binary patch mechanism was used, so that we
can extend it later. Currently there are two mechanisms
defined: "literal" and "delta". The former records the
deflated postimage and the latter records the deflated delta
from the preimage to postimage.
For purely implementation convenience, I added the deflated
length after these "literal/delta" tokens (otherwise the
decoding side needs to guess and reallocate the buffer while
inflating). Improvement patches are very welcomed.
This adds "binary patch" to the diff output and teaches apply
what to do with them.
On the diff generation side, traditionally, we said "Binary
files differ\n" without giving anything other than the preimage
and postimage object name on the index line. This was good
enough for applying a patch generated from your own repository
(very useful while rebasing), because the postimage would be
available in such a case. However, this was not useful when the
recipient of such a patch via e-mail were to apply it, even if
the preimage was available.
This patch allows the diff to generate "binary" patch when
operating under --full-index option. The binary patch follows
the usual extended git diff headers, and looks like this:
Each line is prefixed with a "length-byte", whose value is upper
or lowercase alphabet that encodes number of bytes that the data
on the line decodes to (1..52 -- 'A' means 1, 'B' means 2, ...,
'Z' means 26, 'a' means 27, ...). <data> is 1 or more groups of
5-byte sequence, each of which encodes up to 4 bytes in base85
encoding. Because 52 / 4 * 5 = 65 and we have the length byte,
an output line is capped to 66 characters. The payload is the
same diff-delta as we use in the packfiles.
On the consumption side, git-apply now can decode and apply the
binary patch when --allow-binary-replacement is given, the diff
was generated with --full-index, and the receiving repository
has the preimage blob, which is the same condition as it always
required when accepting an "Binary files differ\n" patch.
core.prefersymlinkrefs: use symlinks for .git/HEAD
When inspecting a project whose build infrastructure used to
assume that .git/HEAD is a symlink ref, core.prefersymlinkrefs
in the config file of such a project would help to bisect its
history.
would yield "world " (i.e. with a trailing space).
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from c1aee1fd8d94da9b3c5d2dc1d4264f7e73a58f80 commit)
Currently, if the target key has a section that matches
the initial substring of another section we mistakenly
believe we've found the correct section. To avoid this
problem, ensure that the section lengths are identical
before comparison.
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca> Signed-off-by: Junio C Hamano <junkio@cox.net>
A bare "--" doesn't show up in man or html pages correctly
as two individual dashes unless backslashed as \--
in the asciidoc source. Note, no backslash is needed
inside a literal block.
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca> Signed-off-by: Junio C Hamano <junkio@cox.net>
Move incorrect asciidoc level 2 titles back to level 1.
Show output of git-name-rev in man page example.
Reword sentences that begin with a period (.) in asciidoc
numbered lists to work around conversion to man page bug.
Mention that git-repack now calls git-prune-packed
when the -d option is passed to it.
[imap] section headers in the config file example need to be
contained in a literal block. imap.pass is the proper config
file variable to use, not imap.password.
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca> Signed-off-by: Junio C Hamano <junkio@cox.net>
* Clarify that 'init' requires an argument
* Remove instances of 'SVN_URL' in the manpage, it's not an
environment variable.
* Refer to 'Additional Fetch Arguments' when documenting 'fetch'
* document --authors-file / -A option
Thanks to Pavel Roskin and Seth Falcon for bringing these issues
to my attention.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
We used to depend on bignum from openssl for rev-list to compute
merge-order, but there is no reason to use different build
recipe from other programs anymore. Just build it with git-%$X
rule like everybody else.
* js/remoteconfig:
Revert "fetch, pull: ask config for remote information"
fetch, pull: ask config for remote information
builtin-push: also ask config for remote information
builtin-push: make it official.
Fix builtin-push to honor Push: lines in remotes file.
builtin-push: resurrect parsing of Push: lines
git builtin "push"
Somebody on the #git channel complained that the sha1_to_hex() thing uses
a static buffer which caused an error message to show the same hex output
twice instead of showing two different ones.
That's pretty easily rectified by making it uses a simple LRU of a few
buffers, which also allows some other users (that were aware of the buffer
re-use) to be written in a more straightforward manner.
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
improve diff-delta with sparse and/or repetitive data
It is useless to preserve multiple hash entries for consecutive blocks
with the same hash. Keeping only the first one will allow for matching
the longest string of identical bytes while subsequent blocks will only
allow for shorter matches. The backward matching code will match the
end of it as necessary.
This improves both performances (no repeated string compare with long
successions of identical bytes, or even small group of bytes), as well
as compression (less likely to need random hash bucket entry culling),
especially with sparse files.
With well behaved data sets this patch doesn't change much.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
This is my assembly freak side looking at generated code again. And
since create_delta() is certainly pretty high on the radar every bits
count. In this case shorter code is generated if hash_mask is not
copied to a local variable.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
core.prefersymlinkrefs: use symlinks for .git/HEAD
When inspecting a project whose build infrastructure used to
assume that .git/HEAD is a symlink ref, core.prefersymlinkrefs
in the config file of such a project would help to bisect its
history.
* git://git.kernel.org/pub/scm/gitk/gitk:
gitk: Allow view to specify arbitrary arguments to git-rev-list
gitk: Fix file list display when files are renamed
gitk: Basic support for highlighting one view within another
gitk: Add a tree-browsing mode
gitk: Use a text widget for the file list
gitk: add menu item for editing the current view
gitk: Implement "permanent" views (stored in ~/.gitk)
gitk: Use git-rev-parse only to identify file/dir names on cmd line
gitk: Remember the view in the history list
gitk: Don't reread git-rev-list output from scratch on view switch
gitk: Fix various bugs in the view support
gitk: Make File->Update work properly again
gitk: Implement multiple views
[PATCH] gitk: Add a visual tag for remote refs
gitk: Allow view to specify arbitrary arguments to git-rev-list
The list of arguments to git-rev-list, including arguments that
select the range of commits, is now a part of the view specification.
If any arguments are given to gitk, they become part of the
"Command line" view, and the non-file arguments become the default
for any new views created.
Getting an error from git-rev-list is no longer fatal; instead the
error window pops up, and when you press OK, the main window just
shows "No commits selected".
The git-rev-list arguments are entered in an entry widget in the
view editor window using shell quoting conventions, not Tcl quoting
conventions.
* fix:
git-send-email: fix version string to be valid perl
Give the user a hint for how to continue in the case that git-am fails because it requires user intervention
gitk: Fix file list display when files are renamed
The conversion of the file list to use a text widget assumed incorrectly
that the list of files from git-diff-tree -r would correspond 1-1 with
the diff sections in the output of git-diff-tree -r -p -C, which is
not true when renames are detected. This fixes it by keeping the
elements in the difffilestart list in the order they appear in the
file list window.
Since this means that the elements of difffilestart are no longer
necessarily in ascending order, it's somewhat hard to do the dynamic
highlighting in the file list as the diff window is scrolled, so I
have taken that out for now.
gitk: Basic support for highlighting one view within another
With this, one view can be used as a highlight for another, so that
the commits that are in the highlight view are displayed in bold.
This required some fairly major changes to how the list of ids,
parents, children, and id to row mapping were stored for each view.
We can now be reading in several views at once; for all except the
current view, we just update the displayorder and the lists of parents
and children for the view.
This also creates a little bit of infrastructure for handling the
watch cursor.
get_sha1(): :path and :[0-3]:path to extract from index.
Earlier patch to say <ent>:<path> by Linus was very useful, and
this extends the same idea to the current index. An sha1
expression :<path> extracts the object name for the named path
from the current index.
Extended SHA1 -- "rev^@" syntax to mean "all parents"
A short-hand "rev^@" is understood to be "all parents of the
named commit" with this patch. So you can do
git show v1.0.0^@
to view the parents of a merge commit,
gitk ^v1.0.0^@ v1.0.4
to view the log between two revs (including the bottom one), and
git diff --cc v1.1.0 v1.0.0^@
to inspect what got changed from the merge parents of v1.0.0 to v1.1.0.
This might be just my shiny new toy that is not very useful in
practice. I needed it to do the multi-tree diff on Len's
infamous 12-way Octopus; typing "diff --cc funmerge funmerge^1
funmerge^2 funmerge^3 ..." was too painful.
[jc: taking suggestions from Linus and Johannes to match expectations
from shell users who are used to see $@ or $* either of which makes
sense. I tend to write "$@" more often so...]
You can now select whether you want to see the patch for a commit
or the whole tree. If you select the tree, gitk will now display
the commit message plus the contents of one file in the bottom-left
pane, when you click on the name of the file in the bottom-right pane.
This adds a builtin "push" command, which is largely just a C'ification of
the "git-push.sh" script.
Now, the reason I did it as a built-in is partly because it's yet another
step on relying less on shell, but it's actually mostly because I've
wanted to be able to push to _multiple_ repositories, and the most obvious
and simplest interface for that would seem be to just have a "remotes"
file that has multiple URL entries.
(For "pull", having multiple entries should either just select the first
one, or you could fall back on the others on failure - your choice).
And quite frankly, it just became too damn messy to do that in shell.
Besides, we actually have a fair amount of infrastructure in C, so it just
wasn't that hard to do.
Of course, this is almost totally untested. It probably doesn't work for
anything but the one trial I threw at it. "Simple" doesn't necessarily
mean "obviously correct".
Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>